Real world social, biological, and technological networks have complex structures. They grow and rewire themselves over time, while maintaining topologies that enable them to be resilient to failure and highly interconnected.
These features are not found in randomly generated graphs. Researchers are attempting to explain how these complex networks form.
- Understanding the structures of complex networks allows us to understand how to look for resilient and interconnected networks
- Understanding their dynamics would enable designing systems to grow complex networks
Describing Network Structure
Graph theory is the mathematical study of networks, also called graphs. Graphs are described as being made of
- nodes: the elements in a graph. Also called vertices or points
- edges: the relationship between two nodes
In a social graph, the nodes might be people and an edge would represent a connection between two people. In a technological network, nodes might be computers and edges might be network connections between them.
Additionally, a path is the combination of edges required to get from one node to another. The average path length of a graph is the average length of paths between any two nodes in a graph.
Real World Network Structure
Real-world social, biological, and technological networks are often described as complex networks.
Complex networks have structures not found in random graphs or simple lattices. Example structures found in complex networks are local clusters and hubs that connect many nodes.
Small World: Short Average Paths
Although complex networks can be very large, studied complex networks have surprisingly short paths between any two nodes.
A common example is the idea of six degrees of separation, the average length of the path between any two people in the living human social graph is only 6. This attribute of having a short average path length is called being a small world network.
Path length statistics are of interest since average shorter paths equate with faster transmissions across a graph. For example, if a computer in a network wants to send a message to another computer in the network, it is better for the message to be handed off 6 times than 100 times.
Scale Free: Really Big Hubs
A scale free network is a network in which the distribution of links among nodes follows a power law distribution. The power law distribution indicates the existence of nodes with many connections that function as hubs. These hub nodes are connected to several magnitudes more nodes than is typical in the graph. These rare, but extremely connected nodes make up the long tail of the power law distribution. Networks without these hubs do not have a long tail or at least a shorter tail than that of a power law distribution.
The scale-free network model was introduced by Albert-László Barabási. He has since proposed that many complex networks such as the internet, social networks, and biological processes are scale-free networks. Proving out that these networks are scale-free is an ongoing effort. Analyzing large, complex networks is computationally expensive.
It has also been proposed that these hubs are what enable the short average path length and resiliency to node loss in complex networks. Losing a random node in the graph has little impact on the average path length.
However, this argument assumes nodes are lost randomly, without any strategic targeting. If hubs are lost in a scale free network, they have outsized effects on disconnection and path length compared to graphs with more evenly distributed links.
Network Dynamics Models
Graph generation models are used to describe the dynamics of how graphs are formed. Their algorithms can generally be boiled down to a few simple rules.
The following models are all incomplete. In simulations, none of them manage to fully recreate the sort of complex topologies seen in real world networks with clustering and short average paths.
However, even with their very simple rules, they produce some of the attributes of real world complex networks.
The Watts-Strogatz model is focused on creating small worlds. Specifically, two attributes of small worlds:
- local clustering: small, tightly connected neighborhoods of related nodes
- triadic closures: If there is a relationship between nodes (A, B) and nodes (B, C), there should be a relationship between (A, C)
- Arrange N nodes in a ring
- Connect each node with an even number of neighbors, K
- K / 2 neighbors on each side (why K needs to be even)
- Rewire the edges with probability ß
- Rewired edges connect to a random, new node
The model does produce a locally clustered network with a short average path length, but not the highly connected hubs of a scale-free network. This can also be described as the tail of the edge distribution being much shorter.
The other unrealistic aspect of the model is that there is no change in the graph over time. The node count never changes and the relationships between them are only rewired once.
Barabási–Albert: Growth with Preferential Attachment
The Barabási–Albert model does produce scale-free networks.
- start with one node
- add a new node, one at a time
- link the new node with two other nodes with a preference towards nodes with more links
Networks grown with this algorithm do not have the strong, local clusters of the Watts-Strogatz model because it does not have the concept of neighbors. Instead, hubs are created since older hubs begin with the most links and then preferentially are attached to by new nodes.
However, nodes rise and fall in popularity in the real world. For example, a website in the real world can be become unpopular. In the BA model, connections only accumulate and are never lost. Overtime, the oldest hubs will always be the most connected. Newer hubs can never catch up to older hubs in terms of connections.
The Bianconi–Barabási model attempts to explain how newer nodes might overtake older nodes in link count. It introduces a new variable, fitness, that modifies the preferential attachment role from the BA model.
Fitness is an attribute of a node that describes how attractive it is to new links. It is a fixed attribute and is independent of how old the node is and how many links it currently has. This makes each nodes different from another.
The total attractiveness, the fitness product, of a node is the product (multiplication) of its fitness and the number of links it currently has.
Overtime, the distribution of fitness among the nodes becomes the most important factor in this model and has the biggest impact on the long term fitness product of a node.
Fitness Distribution Effects
- BA: When all nodes have the same fitness, the dynamics are the same as the BA model
- Fit Get Rich: When fitness is constrained to a finite domain, a scale free network emerges similar to the BA model, but newer nodes are able to surpass older nodes in links
- Winner Take All: When fitness has an infinite domain, winner take all emerges. One node’s fitness product accelerates beyond any other and begins attracting almost all new possible connections
The winner-take-all outcome is interesting since it seems to explain how monopolies can emerge in different industries. The formation of the monopoly is also described as a form of condensation. However, the name is a bit of an exaggeration since:
- the old connections from before a single node became super fit continue to exist
- each new node connects to more than node and thus creates connections outside the monopoly
It is also worth noting that not all monopolies last forever in the real world, but there is no way out of winner-take-all in this model.