Weighted Networks

A major limitation of many methods used for studying large-scale networks stems from the fact that the strength of ties is not taken into account. Granovetter (1973) argued that the strength of a social tie is a function of its duration, emotional intensity, intimacy, and exchange of services. For non-social networks, the strength often reflects the function performed by the ties, e.g. carbon flow (mg/m²/day) between species in food webs (Luczkowich et al., 2003) or the number of synapses and gap junctions in a neural networks (Watts and Strogatz, 1998). In infrastructure and information networks, variations in the strength of a tie depend on the flow of information, energy, people, and goods along that tie (Barrat et al., 2004). The strength of a tie is generally operationalised into a weight that is attached to the tie, thereby creating a weighted network. For more information on the operationalisation of tie weights, see Defining Weighted Networks which highlights various consideration that should be included when collecting and analysing weighted network data.

Exploring the information that weights hold allows us to further our understanding of networks. In social networks, strong ties are often found among socially embedded individuals (Granovetter, 1973; Panzarasa et al., 2009). In fact, Simmel (1950) argued that a strong tie cannot exist without other indirect ties (weak and strong). Strong ties facilitate change in the face of uncertainty (Krackhardt, 1992), reinforce obligations, expectations, and social norms (Coleman, 1988), and promote the transfer of complex and tacit knowledge by sustaining individuals’ motivation to assist one another (Hansen, 1999; Reagans and McEvily, 2003). Strong ties also aid communication through, for example, the development of relationship-specific heuristics (Uzzi, 1997). In the behaviour of non-social networks, such as technological and transportation ones, strong ties also play crucial roles. For instance, in the network of routers on the Internet they form part of the large backbones that provide national or inter-continental connectivity (Pastor-Satorras and Vespignani, 2004), whereas in airport networks they represent major international or trans-oceanic routes (Barrat et al., 2004).

One of Granovetter’s (1973) major findings is the idea that novel and explicit information is more likely to flow to individuals through weak ties than through strong ones. An individual’s friends tend to move in the same circles, and therefore are likely to receive the same information that the individual already possesses. Conversely, acquaintances are likely to know people that the individual does not, and thus receive more novel information. If this information is explicit or codified, it can easily be transferred from the acquaintance to the individual (Levin and Cross, 2004). In light of these findings, the measures that scholars typically apply to study networks should be sensitive to tie strength and capture the difference between strong and weak ties. This will ensure that the full richness of the data is retained.

However, even though ties in many empirical networks have naturally a strength associated with them, most network measures can only be applied to binary networks, i.e. networks where ties are either present or absent (Scott, 2000; Wasserman and Faust, 1994). Therefore, researchers must convert weighted networks into binary ones. A common way of doing this is to dichotomise a weighted network into a series of binary ones by using a set of cutoffs. A tie is set to present if its weight is higher than the cut-off, and removed otherwise. As the figure below shows, a weighted network (a) can create a range of binary networks (b-d) depending on the cut-off. This figure illustrate how the subjectivity of the researcher’s choice of the cut-off value will produce biases that may invalidate subsequent analysis.

A better way to analyse weighted networks would be to redefine and generalise current methods to explicitly take weights into account. Currently, only a small set of measures have been generalised (Freeman et al., 1991; Newman, 2001). For example, Newman (2001) generalised Freeman’s (1978) closeness measure by applying a method from computer science to define the distances among nodes (Dijkstra, 1959). The generalised methods would aid the analysis by removing the bias resulting from the subjective choice of the cut-off.

Shortest Paths in Weighted Networks

Which is the shortest path from the lower-left node to the central node? Directly or via the top-left node?

Shortest paths among nodes has long been a key element of network research. While the shortest paths often are not of interest in themselves, they are the key component of a number of measures First, a popular question has been what is the average distance among people? The question became famous following Milgram’s six-degrees of separation experiment in 1967 that found that people in the US were on average 6-steps from any other person. Second, the shortest distances have been used as an underlying metric in a number of measures, such as centrality ones (Freeman, 1978). One of these measures is betweenness, which is roughly equal to the number of shortest paths between others that pass through a node. This page is aimed at showing a generalisation of shortest path calculations for weighted networks that has the potential of creating more accurate. Read more…

Node Centrality

Centrality is the concept of being “in the thick of things.” In 1978, Freeman reviewed and clarified a growing field of research on centrality of nodes for binary networks in an article published in the first issue of Social Networks. Three measures were formalised: degree, closeness, and betweenness. Degree was the number of ties or neighbours of a node; closeness was the inverse of the sum of all shortest paths to others or the smallest number of ties to go through to reach all others individually; and betweeness was the number of shortest paths on which a node was on. The three measures have been generalised to weighted networks. Barrat et al. (2004) generalised degree to weighted networks by taking the sum of weights instead of the number ties, while Newman (2001) and Brandes (2001) utilised Dijkstra’s (1959) algorithm of shortest paths for generalising closeness and betweenness to weighted networks, respectiviely. Dijkstra’s algorithm defined the length of paths as the sum of cost (e.g., time in GPS calculations), which is generally only defined as the sum of the inversed tie weights. These generalisations focused solely on tie weights and ignored the original feature of the measures: the number of ties. As such, a second set of generalisation were proposed by Opsahl et al. (2010) that incorporates both the number of ties and the tie weights by using a tuning parameter. This page reviews Freeman’s (1978) centrality measures and the generalisations to weighted networks. Read more…

Clustering

A fundamental measure that has long received attention in both theoretical and empirical research is the clustering coefficient. This measure assesses the degree to which nodes tend to cluster together. Evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties (Feld, 1981; Friedkin, 1984; Heider, 1946; Holland and Leinhardt, 1970; Louch, 2000; Opsahl and Panzarasa, 2009; Snijders, 2001; Snijders et al., 2006; Watts and Strogatz, 1998). In real-world networks, this likelihood tends to be greater than the average probability of a tie randomly established between two nodes (Holland and Leinhardt, 1971; Wasserman and Faust, 1994; Watts and Strogatz, 1998). Traditionally, the two versions of the clustering coefficient developed for testing the tendency of nodes to cluster together into tightly knit groups are the global clustering coefficient (Doreian, 1969) and the local clustering coefficient (Watts and Strogatz, 1998). The global version was designed to give an overall indication of the clustering in the network, whereas the local gives an indication of the embeddedness of single nodes. Read more…

Weighted Rich-club Effect

Research has long documented the abundance of complex systems characterised by heterogeneous distribution of resources among their elements. Back in 1897, Pareto noticed the social and economic disparity among people in different societies and countries. This empirical regularity inspired the 80-20 rule of thumb stating that only a select minority (20%) of elements in many real-world settings are responsible for the vast majority (80%) of the observed outcome. While studies have shown that high degree nodes form many ties with each other (Borgatti and Everett, 1999; Colizza et al., 2006; Newman, 2002), the weighted rich-club effect tests whether a select group of nodes share their strongest ties with each other in a weighted network (Opsahl et al., 2008). The nodes in the select group is often referred to as prominent as they are the highest ranking nodes according to a measure, such as degree or node strength. Read more…

Random Networks

When analysing network and computing measures, it is difficult to gauge whether the outcome is “high” or “low”. For example, is a clustering coefficient of 0.3 high or low? Random networks based on an observed network can be used as to create a benchmark value. In other words, an observed value (e.g., a clustering coefficient) can be compared to the distribution of clustering coefficient from corresponding random networks. Read more…

If you use any of the information in this post, please cite: Opsahl, T., Agneessens, F., Skvoretz, J., 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32 (3), 245-251

Like this:

Hi Tore, thanks for the website, it is really helpful. I figured that the function degree_w() doesn’t seem to handle well long ids. I was working with ids of 6 digits and was getting strange results. Once I replaced them by simple ids (1, 2,…), the function works a treat. Is there really any restriction about the ids you can use?

The degree_w-function is not restricted in any way to small node ids; however, the return object includes all nodes from 1 to the maximum integer. The nodes ids without a mentioned in the edgelist will get a degree score of 0 (i.e., isolates). To overcome this issue, you can use the compress_ids-functions before running the degree_w-function. For example: