Random Networks

When analysing a network and computing measures, it is difficult to gauge whether an outcome is “high” or “low”. For example, is a clustering coefficient of 0.3 high or low? Random networks based on an observed network can be used to create a benchmark. In other words, an observed value (e.g., a clustering coefficient) can be compared to the distribution of clustering coefficients obtained from corresponding random networks.

There are many procedures, or null models, for creating random networks. These procedures vary in how much they randomise an observed network. Traditionally, classical (also known as Bernoulli or Erdős-Rényi) random networks were the most commonly used. These networks are constructed by taking the same number of nodes and assigning each dyad a uniform probability of having a tie, based on the number of observed ties. In these networks, the degree distribution is binomial (approximately Poisson for large networks). However, such a distribution rarely exists in reality. In fact, most real-world degree distributions are skewed.

In response to the lack of comparability between observed networks and classical random networks, a set of randomisation procedures that maintain additional features has been developed. First, by reshuffling ties in an observed network, it is possible to create random networks that maintain each node’s number of ties as well as the overall number of nodes and ties (Molloy and Reed, 1995). Second, for weighted networks, it is also possible to reshuffle only the weights. By maintaining the ties and simply reshuffling tie weights, features based on connectedness are maintained (e.g., the size of the largest interconnected group of nodes; Opsahl et al., 2008).

Classical Random Networks

Classical random networks are created by fixing the number of nodes and giving all possible ties a uniform probability of being formed. Given that each tie is independent of all other ties, it is often possible to approximate the value of a measure mathematically, without the need for simulations. For example, the global clustering coefficient equals the uniform tie probability, and the average shortest path is approximately the ratio of the logarithm of the number of nodes to the logarithm of the average number of ties that nodes have (Watts and Strogatz, 1998). A main limitation of classical random networks is that the distribution of ties across nodes is binomial (approximately Poisson). While this is convenient for deriving expected values, such a distribution rarely exists in reality. In fact, most real-world degree distributions are skewed, which implies that a few nodes are hubs. Hubs particularly affect network measures based on shortest paths, as they reduce path lengths by acting as shortcuts among other nodes.
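As a sketch, a classical random network can be generated in a few lines of base R (a hand-rolled Bernoulli graph, not tnet’s implementation; the values of `n` and `p` are illustrative):

```r
# Sketch: a classical (Bernoulli / Erdos-Renyi) random network in base R.
# Each of the n*(n-1)/2 undirected dyads receives a tie independently
# with probability p.
set.seed(1)
n <- 100
p <- 0.1
adj <- matrix(0, n, n)
upper <- which(upper.tri(adj))
adj[upper] <- rbinom(length(upper), 1, p)
adj <- adj + t(adj)              # symmetrise: undirected network
deg <- rowSums(adj)              # degree of each node
mean(deg)                        # close to p * (n - 1)
log(n) / log(mean(deg))          # approx. average shortest path (Watts-Strogatz)
```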

Link Reshuffling

As nodes’ degrees are not uniformly distributed, a second procedure for creating random networks consists in reshuffling the topology, reaching the maximally random network with the same degree distribution as the observed network (Maslov and Sneppen, 2002; Newman, 2003). It does so by randomly selecting two ties, (i, j) and (k, l). The two ties are then rewired by setting them to (i, l) and (k, j). The weights are automatically redistributed by remaining attached to the reshuffled ties. However, if either of the new ties already exists, this step is reverted, and two new ties are selected. This condition guarantees that multiple ties are not formed between two nodes, which ensures that the weight and degree distributions remain unchanged. If this procedure is repeated enough times, the outcome is a corresponding random network. This model is commonly used in the physics literature; however, the random networks it produces are not sampled with equal probability. For more details, see Snijders (2001) and Rao et al. (1996).
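As an illustration, the rewiring step can be sketched in base R on a directed, weighted edgelist with columns i, j, w (a hypothetical helper, not the tnet implementation):

```r
# Sketch of link reshuffling (Maslov-Sneppen rewiring) on a directed,
# weighted edgelist with columns i, j, w. Weights stay attached to their
# reshuffled ties; a step is reverted if it would duplicate a tie.
reshuffle_links <- function(el, rounds = 10 * nrow(el)) {
  for (r in seq_len(rounds)) {
    picks <- sample(nrow(el), 2)              # two ties: (i1, j1) and (i2, j2)
    i1 <- el[picks[1], 1]; j1 <- el[picks[1], 2]
    i2 <- el[picks[2], 1]; j2 <- el[picks[2], 2]
    if (i1 == j2 || i2 == j1) next            # would create a self-loop
    if (any(el[, 1] == i1 & el[, 2] == j2) ||
        any(el[, 1] == i2 & el[, 2] == j1)) next  # tie already exists: revert
    el[picks[1], 2] <- j2                     # rewire to (i1, j2) and (i2, j1)
    el[picks[2], 2] <- j1
  }
  el
}
```

Because only the receiver entries are swapped pairwise, every node keeps its out-degree and in-degree, and the multiset of weights is untouched.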

One step of the link reshuffling procedure for an undirected network. Note that while a node maintains its out-strength in a directed network, this is not the case in an undirected network.

Weight Reshuffling

While the link reshuffling procedure can also be applied to binary networks, two other procedures are available when dealing with weighted networks. The weight reshuffling procedure consists simply in reshuffling the weights globally in the network (Opsahl et al., 2008). This null model maintains the topology of the observed network. Therefore, the number of ties originating from each node (its degree) does not change. Moreover, other features, such as the size of the giant component, are also preserved.
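Globally reshuffling weights is nearly a one-liner on the same edgelist representation (an illustrative base-R sketch, not tnet’s code):

```r
# Sketch of weight reshuffling: the tie columns (i, j) are left untouched,
# only the weight column is permuted globally across all ties.
reshuffle_weights <- function(el) {
  if (nrow(el) > 1)                 # guard: sample(x) on a length-1 numeric
    el[, 3] <- sample(el[, 3])      # would sample from 1:x instead
  el
}
```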

One step of the weight reshuffling procedure for an undirected network

Local Weight Reshuffling

Since weights are reshuffled globally by the link and weight reshuffling procedures, the random networks they produce do not maintain each node’s strength from the observed network. Local weight reshuffling is a third randomisation procedure, for directed networks, that preserves this quantity by reshuffling weights locally for each node across its outgoing ties (Opsahl et al., 2008). The procedure can be extended to undirected networks by duplicating each undirected tie into two directed ties, one in each direction. It should be noted that this extension breaks the weight symmetry between the two directions of an undirected tie (the topology remains invariant). The appropriateness of this method for undirected networks depends on the research setting and how tie weights are defined. For example, its applicability to undirected transportation networks is justified by the typically directed nature of traffic flows (although the US airport network displays a high degree of symmetry; Barrat et al., 2004).
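For a directed edgelist, the local variant permutes weights only within each node’s set of outgoing ties (again an illustrative base-R sketch):

```r
# Sketch of local weight reshuffling for a directed edgelist with columns
# i, j, w: each node's weights are permuted among its own outgoing ties,
# so the out-strength (sum of outgoing weights) of every node is preserved.
reshuffle_weights_local <- function(el) {
  for (node in unique(el[, 1])) {
    rows <- which(el[, 1] == node)
    if (length(rows) > 1)                    # nothing to shuffle for one tie
      el[rows, 3] <- sample(el[rows, 3])
  }
  el
}
```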

One step of the local weight reshuffling procedure for a directed network

Example

To highlight how the various procedures differ, the diagram below shows an observed network and the four random networks introduced above. To reproduce these networks, see the R code below.

Random networks created using the null models outlined above. The classical random network uses the same number of nodes, a 0.4 uniform probability of ties being present, and tie weights randomly picked between 1 and 4.

Want to test it with your data?

The randomisation procedures are implemented in tnet. First, you need to download and install tnet in R. Then, you need to create an edgelist of your network (see data structures in tnet for weighted one-mode networks). The commands below show how the edgelist for the sample network here can be manually entered, and how to apply the randomisation procedures.
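For illustration, the sample edgelist could be entered and randomised along these lines (function and option names follow the tnet documentation for `rg_reshuffling_w`; the edgelist values here are made up, so check `?rg_reshuffling_w` in your installed version):

```r
library(tnet)

# Manually enter a small weighted edgelist: columns are sender i,
# receiver j, and tie weight w (illustrative values).
net <- cbind(
  i = c(1, 1, 2, 2, 3, 4),
  j = c(2, 3, 1, 3, 4, 1),
  w = c(4, 2, 4, 1, 3, 2))

# Apply the three randomisation procedures.
rnet_links   <- rg_reshuffling_w(net, option = "links")
rnet_weights <- rg_reshuffling_w(net, option = "weights")
rnet_local   <- rg_reshuffling_w(net, option = "weights.local")
```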


I wondered how well the algorithm scales to larger networks. I have a network of over 10k nodes and about 80k edges; I have been running your program for a while so far, but wondered if you had any idea of how long it might take to calculate?

The link reshuffling is not very efficient as it relies on igraph. I used to have a much more efficient script for sparse graphs, but had to provide loads of support for people using it on dense graphs. If you send me an email, we can see if this might be more appropriate for you.

Hi Tore,
Many thanks for your site, I find it very helpful!
I have a question regarding the random networks. How should I compare the metrics in the observed network with those in the random networks? I suppose I should use a kind of significance test (like in the weighted rich-club effect), but how can I implement that in R? Or doesn’t it need a significance test? When can I say my outcome is high or low?
Sorry for the trivial question, but I’m just a beginner in network analysis
Thanks,
Marton

It’s great that you are thinking along these lines. I would recommend the following:
– calculate the metric of interest for the actual network (“observed”)
– in a loop of at least 1,000
{
– create a random network
– calculate the metric on the random network
}
– compare the observed value to the ones found using random networks

If the observed value is in the far tails of the distribution of values from random networks, then you can make a claim about it being different from random.

I would recommend using the reshuffling procedures for generating random networks to ensure the networks are comparable.

Thanks Tore, but what does “loop” mean here? I suppose it’s something that makes me 1000 random networks, but how can I implement it? Can you give me an example command to implement this comparison? Sorry again for my ignorance!
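A minimal sketch of such a loop, assuming `net` is a weighted edgelist in tnet format and that the metric of interest is the global clustering coefficient (tnet’s `clustering_w`):

```r
library(tnet)

# Metric on the observed network.
valueObs <- clustering_w(net)

# Metric on 1,000 corresponding random networks.
valueRdm <- numeric(1000)
for (k in 1:1000) {
  rnet <- rg_reshuffling_w(net, option = "weights")
  valueRdm[k] <- clustering_w(rnet)
}

# Compare the observed value to the random distribution.
hist(valueRdm)
abline(v = valueObs, col = "red")
mean(valueRdm >= valueObs)   # one-sided empirical p-value
```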

Results:
As you can see, the observed value (vertical red line) is much higher than the comparable random values. You could also compute the standard deviations of valueRdm to determine the statistical significance of this.

Now I have a problem with the “rg_reshuffling_w” function. So far I have used this function with the link reshuffling method and it worked properly. But now I can’t use it; when I run the command I get the error message: “Error in igraph::rewire(net.i, niter = (ecount(net.i) * 10)) : unused argument (niter = (ecount(net.i) * 10))”. After this I ran the scripts that had used this function before and got the same error message again (they had worked properly back when I worked with them). I tried the links option with the example given in the official tnet package description, but it doesn’t work either. Do you have any idea what happened with this function? Was there any change to it? Or maybe am I doing something wrong?

Thanks for spotting an error! It seems that igraph has been updated and the tnet code that connected to it has broken. I will look into this and upload a new version of tnet to Cran when I’ve fixed it.

Hi Marton: An updated version has been created, and a new version of tnet (3.0.14) has been uploaded to CRAN. If you run update.packages(), you should see tnet being updated. It might take some time before tnet gets updated on all the mirrors. Good luck! Tore