Ties often have a strength naturally associated with them that differentiates them from one another. Tie strength has been operationalised as weights. A few network measures have been proposed for weighted networks, including three common measures of node centrality: degree, closeness, and betweenness. However, these generalisations have focused solely on tie weights, and not on the number of ties, which was the central component of the original measures. This paper proposes generalisations that combine both these aspects. We illustrate the benefits of this approach by applying one of them to Freeman's EIES dataset.

Motivation

Ego networks of Phipps Arabie (A), John Boyd (B), and Maureen Hallinan (C) from Freeman's third EIES network. The width of a tie corresponds to the number of messages sent from the focal node to their contacts. Adapted from the paper.

Centrality is the concept of being “in the thick of things.” In 1978, Freeman reviewed and clarified a growing field of research on the centrality of nodes in binary networks in an article published in the first issue of Social Networks. Three measures were formalised: degree, closeness, and betweenness. Degree is the number of ties or neighbours of a node; closeness is the inverse of the sum of the shortest paths from a node to all others, i.e., the smallest number of ties to go through to reach all others individually; and betweenness is the number of shortest paths on which a node lies.

The three measures have already been generalised to weighted networks. Barrat et al. (2004) generalised degree by taking the sum of tie weights instead of the number of ties, while Newman (2001) and Brandes (2001) utilised Dijkstra's (1959) shortest-path algorithm to generalise closeness and betweenness, respectively. Dijkstra's algorithm defines the length of a path as the sum of costs (e.g., travel time in GPS calculations), where the cost of a tie is generally defined as the inverted tie weight. All these generalisations fail to take into account the main feature of the original measures formalised by Freeman (1978): the number of ties.
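Schematically, following the paper's notation (where $w$ denotes tie weights and $h$ intermediary nodes on a path), the weighted distance between nodes $i$ and $j$ is the minimum-cost path with costs taken as inverted weights:

$$d^{w}(i,j) = \min\left(\frac{1}{w_{ih}} + \dots + \frac{1}{w_{hj}}\right)$$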

This limitation is highlighted for degree centrality by the three ego networks from Freeman's third EIES network shown above. The three nodes have sent roughly the same number of messages, but to quite different numbers of contacts. If Freeman's (1978) original measure were applied, the centrality score of the node in panel A would be almost five times as high as that attained by the node in panel C. However, when using Barrat et al.'s generalisation, they get roughly the same score.

This article proposes a new generation of node centrality measures for weighted networks. This second generation of measures takes into consideration both the weight of ties and the number of ties. The relative importance of these two aspects is controlled by a tuning parameter.
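For degree, the proposed measure combines the number of ties $k_i$ of node $i$ and the sum of their weights $s_i$ through the tuning parameter $\alpha$:

$$C_D^{w\alpha}(i) = k_i \times \left(\frac{s_i}{k_i}\right)^{\alpha} = k_i^{(1-\alpha)} \times s_i^{\alpha}$$

Setting $\alpha = 0$ recovers Freeman's (1978) degree, $\alpha = 1$ yields Barrat et al.'s (2004) strength, and intermediate values blend the two aspects.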

Want to test it with your data?

The degree_w, closeness_w, and betweenness_w-functions in tnet allow you to calculate the binary measures, the weighted measures, and the measures that combine these two aspects on your own dataset.

For example, to calculate the second generation node centrality measures (alpha = 0.5) on the sample network above, you can run the code below in R. The degree function easily calculates the binary and first generation measures as well; however, this is not the case for the closeness and betweenness-functions. If you would like the binary versions, you can either use the dichotomise function or set alpha=0. If you would like the first generation weighted measures, set alpha=1 (the default value).
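A sketch of that code (the edgelist below reproduces the structure of the six-node sample network with A–F coded as 1–6; the exact weights are illustrative, as the original snippet is not preserved in this copy):

# Load tnet
library(tnet)

# Weighted edgelist (i, j, w); undirected ties are entered in both directions
net <- cbind(
  i = c(1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 5, 6),
  j = c(2, 3, 1, 3, 4, 5, 1, 2, 2, 2, 6, 5),
  w = c(4, 2, 4, 4, 1, 2, 2, 4, 1, 2, 1, 1))

# Second generation measures with the tuning parameter alpha = 0.5
degree_w(net, measure = c("degree", "output", "alpha"), alpha = 0.5)
closeness_w(net, alpha = 0.5)
betweenness_w(net, alpha = 0.5)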

If you use any of the information in this post, please cite: Opsahl, T., Agneessens, F., Skvoretz, J., 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32 (3), 245-251.

Hi Tore, thanks for your answer. Still, I have a doubt: what do the values in the columns correspond to? I mean, why is the vector i made of 1,1,2,2,2,2 etc.? Looking at the network represented by the picture (the one made of 6 nodes, from A to F), I see other weights, so I am pretty lost :)

I'm trying to calculate degree_w for my data, but the output always seems to equal the alpha=1 result. Even for the sample network in this post, I got the same results when varying the alpha parameter…
Please, could you tell me what my mistake is here?
With many thanks

Thanks for taking an interest in tnet. This is an issue with the igraph-package that tnet depends on. You can fix it by copying the iconv.dll from a previous version of igraph (or email me). I have sent them a bug report – hopefully they will have it sorted soon.

The igraph-guys (Gabor Csardi) have just submitted a new version of their R-package that should solve this issue. As soon as it has been approved by CRAN and disseminated across the servers, you should be able to get it by writing update.packages() in R.

Tore

20. Nitesh | August 19, 2010 at 12:35 pm

Hi Tore,

Thanks for your response. As you mentioned before, by copying the iconv.dll file in igraph-libs, it worked.

Can you suggest something about cliques in weighted graphs? I have seen your work related to node centrality and the clustering coefficient.
Just wondering if you have any ideas on finding cliques in weighted graphs.

You can look at my paper called Clustering in Weighted Networks for determining the level of clustering / cliqueness in a network. But if you would like to find the specific cliques or communities, I would suggest you contact some of the community detection people. Also, you might want to have a look at the book Generalized Blockmodeling by Doreian, Batagelj, and Ferligoj.

Tore

22. minjung | August 24, 2010 at 12:55 pm

Hi Tore!
Thanks for sharing your great work. It's really helpful for me in understanding
the concepts around centrality.
I was just wondering if you could give me some advice.
I'm working on a project analyzing the customers of a mobile company and doing SNA. That is, I have to find the communities among the customers and define the leaders, who are influential to other customers.
So I detected the communities (clusters), and I got the degree centrality, closeness centrality, and betweenness centrality of each node within a cluster.
And now, I have to assign roles to each node: leader, sub-leader, follower, and outlier.

Do you think I can just assign roles based on the three centrality measures? For example,
making the one with the highest centralities the leader. Can centrality be a measure of influence within the community?

I am not aware of any general literature on this topic. You could have a look at the Gould and Fernandez paper that does something similar with brokerage measures. They define – based on directionality, which your data should have – various roles based on ego networks.

I have a question. Your article states that one could run a regression analysis to find an optimal level of alpha. For example, if I have a binary variable to explain, how might I proceed with the analysis? Could you provide guidance on comparing different values and finding an optimal alpha? I use SPSS for the analysis.

Thank you for taking an interest in my work, and reading the future research section carefully!

The “optimal” (read: what high performers have) level of the centrality measures can be probed by using a performance variable as the dependent variable and a centrality variable as an independent variable, along with controls. If you run multiple regressions, each with a different alpha for the centrality measure (e.g., 0, 0.1, 0.2, 0.3, etc.), you can then plot the attained z-scores of the centrality variable (y-axis) against alpha (x-axis). This should give you an inverted u-shaped curve. The optimal level is where the maximum z-score is attained.

Great site, and great software :)
One question regarding degree_w and closeness_w: the output I get is an array; however, I am interested in the overall degree (strength) centralization of the weighted network, not the scores of individual nodes. Is there a way to do this with tnet?

Normalisation of node centrality scores is, in my opinion, adding a bias to the data instead of removing one. In fact, I have stayed clear of standardising the measures due to what I believe is misleading about the original measures, let alone the generalised ones. My main concern with the original way of standardising/normalising node centrality measures (i.e., dividing by n-1) is that it scales linearly with the number of nodes, and I believe that none of the three main node centrality measures scales linearly. First, it has been argued that the average degree in networks does not change as a network grows; hence, no scaling (i.e., use the average degree to compare networks). Second, closeness centrality is based on shortest distances. In small-world networks, shortest distances do not scale linearly with the number of nodes, but rather logarithmically (i.e., divide farness scores by log(N) if your network is a “small world”). Third, betweenness is based on n*(n-1) shortest paths, so it could be argued that it scales with n-squared. Given these issues with the original measures, I have not given much thought/effort to normalising the generalised ones. Let me know if you figure out a way of doing it!

Thank you for the quick response. I will have to think about this… :)
Just to make sure that I get your other function right – the clustering_w “gm” would be a transitivity measure for the overall weighted graph, right?

The clustering_w-function is the global clustering coefficient, while the clustering_w_local-function is the local clustering coefficient that produces a score for each node (see Barrat et al., 2004). These functions are just different aggregations of the triplets in the network, with the global one aggregating over all triplets and the local one aggregating over each node's triplets.

The measure-parameter (“am”, “gm”, “ma”, “mi”, “bi”) controls how triplets are valued. This could, for example, be the regular mean (arithmetic mean; “am”) of the two tie weights that make up the triplet, or the geometric mean (“gm”). The main difference between the two is that the geometric mean discounts the triplet value when the weights vary (e.g., tie weights of 2 and 2 give am=2 and gm=2, while tie weights of 3 and 1 give am=2 and gm=sqrt(3)≈1.73).
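For example, comparing the two on a weighted edgelist loaded as net (a minimal sketch):

library(tnet)

# Global weighted clustering coefficient under two triplet-value definitions
clustering_w(net, measure = "am")  # arithmetic mean of the two tie weights
clustering_w(net, measure = "gm")  # geometric mean; discounts unequal weights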

Sorry for ‘spamming’ – I forgot to ask another question that puzzles me: I am using closeness_w, but if I don't use weights (I set them all to 1, just to check), I do not get the same result for node closeness as when using closeness() on an unweighted network. Why is this? Am I doing something wrong?

I am uncertain what you are referring to when you are using inverse weights in the degree_w-function, and also what closeness() is. Please send me an email with the commands and data you are using.

Best,
Tore

34. xingqin | March 16, 2011 at 2:31 am

Hi Tore, thanks for your hard work; it is easy to use your code now, by simply adding \alpha. One question: do you know whether there are any other centrality methods handling weighted networks yet? Eigenvector centrality is another centrality method, but it deals with unweighted graphs. Does the same idea work for weighted ones?
Thanks a lot.

My database captures a communication network, and I want to find what value of alpha helps to explain a continuous variable (for example, predicting the number of words used by each node from weighted out-degree). Could the ridge regression method be a good choice?

Your data sounds ideal for finding the “optimal” value of alpha. My understanding of ridge models is limited, but here is a suggestion for how it can be done using OLS. Note that the example data is not ideal, as there are only 32 observations, the dependent variable is not normally distributed, and there are no control variables, etc., but it shows how the approach can be implemented.
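The original code is not preserved in this copy, but a minimal sketch of the idea might look as follows; net (a weighted edgelist) and performance (a numeric outcome per node, in node-id order) are illustrative placeholders:

library(tnet)

# Sweep alpha, regress performance on the resulting centrality scores,
# and record the z-score (t-value) of the centrality coefficient
alphas <- seq(0, 1.5, by = 0.1)
zscores <- sapply(alphas, function(a) {
  cent <- degree_w(net, measure = c("degree", "output", "alpha"),
                   alpha = a)[, "alpha"]
  fit <- lm(performance ~ cent)
  summary(fit)$coefficients["cent", "t value"]
})

# Plot the z-scores against alpha; the peak indicates the "optimal" alpha
plot(alphas, zscores, type = "b", xlab = "alpha", ylab = "z-score")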

The last command produces the following plot, which suggests that for this limited data, the optimal point is close to 0. Please do not base any conclusions on this data due to the limitations listed above.

I read this post with interest … would the same work if the dependent variable was a categorical measure? In my case, the dependent variable would be animal disease status – either positive or negative. The aim would be to see which value of alpha was most appropriate to infer whether the number of animals you are connected to (an alpha value closer to 0) or the weighting – the amount of time that you spend in contact with those animals (an alpha value closer to 1) – was more important in predicting whether individuals would be infected or not.

I was wondering if you could help me with an error message that I'm getting. I'm doing some simple analysis with a weighted network and have calculated degree and closeness values fine, but the betweenness_w function is throwing up an error message:

It seems that you are getting the warnings because you have disconnected components. All the components are fully connected except for the ones containing nodes 31 and 42. In these two components, the two nodes sit between others, and hence get a betweenness score when alpha is set to 0.

Hope this helps,
Tore

47. Tara | November 27, 2012 at 3:32 pm

Hi Tore,

Is it possible to calculate centrality for valued graphs with UCINET? If yes, should I dichotomise the data or not before starting the calculation?

Great! tnet is built around directed networks, and the degree_w-function calculates the binary out-degree scores, the weighted scores, and the combined scores from Node centrality in weighted networks: Generalizing degree and shortest paths (Social Networks 2010, 245-251). If you would like the corresponding in-degree scores, set the parameter type = “in”.

Hi Tore,
great work! It's good to see someone expanding the domain and methods.
I am doing comparative work across many networks. For this reason, individual node centrality is not useful; what I need is the overall graph centralization. Is it appropriate to use the centralization formula used for unweighted networks, or perhaps to obtain the mean of the alpha scores and compare this value across my various networks? Or should I abandon the weighted approach altogether and calculate unweighted normalization scores?

Normalization is not a straightforward process due to scaling issues with the number of nodes (see comment #28), and with weighted networks, you should also consider the distribution of weights. Given that the distributions are rarely normal and networks are variously sparse, I have yet to come up with a good way to normalize networks. My recommendation is not to give up weighted network analysis for a binary one, but rather to give up the more intricate normalization procedures and – perhaps – only use average scores.

Thank you, Tore, for your reply. I do agree that the weight matters a lot, and if I were to drop it then I would be missing important information. I could probably use both the weighted and unweighted measures for comparison, perhaps using the mean scores.
Michael

54. Leila | May 21, 2013 at 7:31 pm

Hi Tore,
Again, thank you for your research. I use it in the field of international economics and finance. Yet I have a question. I would like to use the closeness centrality measure for a directed network. Besides, my matrix of data is not square (74 columns, 180 rows). When I add “in” to the command, the argument is refused. For instance, closeness_w(net, alpha=0.5, type="in") is not valid. I read the paper you wrote with Agneessens and Skvoretz. You indicate that for directed networks, we need to add a constraint (“a path from one node to another can only follow the direction of present ties”). I don't really understand it, hence I have no idea how to transform my data to satisfy this constraint. Can you help me please?
Thank you very much for your help Tore,
Kindest regards
Leila

It is Rui again. I am reading your paper “Node centrality in weighted networks: Generalizing degree and shortest paths”. I have a few questions about betweenness centrality in the paper. One is that in equation (6b), it seems that betweenness should be the summation of g_jk(i)/g_jk over j and k, shouldn't it? Another is: what is the expression for g^w\alpha_jk in equation (10)?

Betweenness is defined as the number of shortest paths between other nodes that pass through a node (node i in the equations). As there might be multiple shortest paths (different paths of the same, shortest, length), the ratio g_jk(i)/g_jk ensures that double counting does not occur (g_jk is the number of shortest paths between nodes j and k, and g_jk(i) is the number of these that pass through node i). In other words, if there are two paths that share the shortest length, then the nodes on each path get a score of 1/2 assigned to them.

The g^w\alpha is the notation for shortest paths where the weights are incorporated and adjusted by the tuning parameter.

Thank you for the reply. I know what you mean. As in your example, g_jk(i)/g_jk gives the score of 1/2, but this alone is not the betweenness of node i. You should add up g_jk(i)/g_jk over all pairs j and k, where j is not equal to k is not equal to i.

My question is, since g_jk(i) and g_jk are numbers of shortest paths, how have you incorporated weights into g^w\alpha?

Best regards
Rui

59. Rui | September 5, 2013 at 8:55 am

Hi Tore,

I have got my answer. How silly I am… “the weights are incorporated” doesn't mean the weights are explicitly incorporated in g^w\alpha, but rather in the calculation of the shortest paths themselves. Thank you very much.

Best regards
Rui

60. Patrick S. Forscher | December 4, 2013 at 11:24 pm

Hi Tore,

Thanks very much for your work on weighted graphs and for your tnet package! I have two questions about your package.

I’ve been trying to use the tnet package to calculate centrality and centralization measures for a weighted, directed network. I noticed that igraph has implemented an option to calculate closeness for weighted, directed graphs, so I tried to duplicate my results from tnet in igraph to ensure that I was doing everything correctly. Below are my results using a toy network:

Do you have any idea what might be causing the differing results between tnet and igraph?

My second question is a bit tougher. I would like to calculate the centralization of the closeness scores (and more broadly of the other centrality scores I obtain for my network). However, it's not obvious to me how to do this for a weighted network. Do you have any idea how I might calculate the centralization score for, say, the closeness centralities of a network?

Thank you for using tnet! igraph is able to handle weights; however, the distance functions in igraph expect weights that represent ‘costs’ instead of ‘strengths’. In other words, the tie weight is considered the amount of energy needed to cross a tie. See Shortest Paths in Weighted Networks.
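If you want to reconcile the two, inverting the weights before calling igraph's distance-based functions is one way to do it. A sketch, assuming g is an igraph object whose weight attribute encodes strengths:

library(igraph)

# Convert strength-type weights into cost-type weights
E(g)$weight <- 1 / E(g)$weight

# igraph's shortest-path based measures now treat weights as costs
closeness(g, mode = "out")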

To your second question, I am afraid I cannot help. In my opinion, there are no good ways to calculate centralization scores for weighted networks. Even for binary networks, I do not agree with the centralization scores. They often assume that the node-level scores grow with n-squared; however, this does not seem to be the case in real networks. A better metric might be to report the average degree, etc.

Thank you very much for your prompt response! I am relieved to hear that the discrepancy between tnet and igraph is due to a simple scaling problem. However, I tried running the code that you provided and am still seeing a difference in the results from igraph and tnet. Here’s the matrix from tnet:

Regarding centralizations for weighted networks, I’m a little disappointed to hear that there’s no good way to compute these. If I wanted to get a sense of, for example, how “cohesive” my network is based on the network weights, do you know of an analogue to the centralization that I could calculate? Or am I out of luck (at least until such measures are developed)?

Thank you for your terrific work! I find it very helpful. I want to compute node centrality in a weighted two-mode network that takes into account both the weight of ties and the number of ties.
Although you write a lot about two-mode networks, I could unfortunately not find how I can apply “your” node centrality measure to weighted two-mode data.
Your response to the question of Mauricio (August 2010) was already very helpful in this regard. However, when I run this with a sample dataset, I get the response “In as.tnet(net, type = "weighted one-mode tnet") : There were self-loops in the edgelist, these were removed”. I have used the following commands:
# Load tnet
> library(tnet)
>
> ## Read the undirected network
> weighted.net
> ## Run function
> degree_w(weighted.net, measure=c("degree","output","alpha"), alpha=0.5)
node degree output alpha
[1,] 1 1 5 2.236068
[2,] 2 3 9 5.196152
[3,] 3 1 6 2.449490
[4,] 4 0 0 0.000000
[5,] 5 1 3 1.732051
[6,] 6 1 2 1.414214

How can I ensure that this runs as a weighted two-mode tnet? From your manual (p. 6), I understand that a weighted two-mode network needs more than 4 rows and an unequal number of rows and columns. Therefore I have also tried it with 5 columns and more rows, yet I got the same response…

Thank you for your interest. I believe the issue lies in how the network is read. As weighted one-mode and two-mode networks both have a three-column structure, it is not straightforward to know whether to treat the object as a one-mode or a two-mode network. As one-mode networks are more common, as.tnet assumes three-column objects to be one-mode networks. To specifically set an object to be read as a weighted two-mode network, run (assuming your data is loaded in an object called net):

net <- as.tnet(net, type='weighted two-mode tnet')

This should allow you to run all two-mode functions.

The degree measure post you are referring to is only for weighted one-mode networks (i.e., the function name ends with _w). However, the measure can be applied to two-mode data. While I don't recommend the following method for other functions, I believe it can be used in this case:
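The original commands are not preserved in this copy; a reconstructed sketch of the workaround might look like this, where the first command shifts the second mode's ids past the first mode's so that shared integers are no longer mistaken for self-loops, and the second reads the result as a one-mode network:

# Avoid apparent self-loops: make the two modes' id ranges disjoint
net[, 2] <- net[, 2] + max(net[, 1])

# Treat the shifted edgelist as a weighted one-mode network and compute degree
net <- as.tnet(net, type = "weighted one-mode tnet")
degree_w(net, measure = c("degree", "output", "alpha"), alpha = 0.5)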

I understand that your proposed first command is to avoid self-loops, isn't it? With your second command, you are suggesting to set it again as a one-mode network. Does this mean that it is not possible to calculate node centrality (based on number of ties and weight of ties) in a weighted two-mode network without transforming it into one-mode data?

I am unsure why you would get errors with the specification type="weighted two-mode tnet". If this is the case, please send me the code and data that you are using.

To your second point: indeed, the sample code shows how you can apply a one-mode metric to a two-mode network. However, this procedure only works for this measure, as it does not focus on the network structure, but rather on the number and distribution of tie weights.

Dear Tore,
Thank you very much for your work and for tnet, which I use a lot in my research papers. I have a question, please: is it possible to easily get in-closeness and out-closeness scores in a directed weighted network? I know how to get in-degree and out-degree, but I can't find the same distinction for closeness.
Thank you very much for your help.
Best Regards
LA

Hi Tore,
I am interested in using tnet to calculate weighted centrality for a one-mode network. I am having trouble reading the data into tnet. Would you please let me know where I went wrong?
“edge_PP 60.csv” is an edgelist with three columns, i, j, and weight, without a header.

Great that you are using tnet! I would suggest that you do not load the network into igraph before loading it into tnet. If you transform the node identifiers into integers, you should be able to run as.tnet on the regular data.frame object.

Hi Dr. Opsahl, this is Li again :-) I am wondering how we should deal with isolates in a weighted network. Do we include them in the edgelist and assign them a weight of 0, or do we not include them? If we do not include them, the size of the network will change. Does that matter?
Thanks for letting me know.

For most metrics, a change in network size doesn't matter (e.g., degree). However, if you calculate average degree, I would calculate the degree scores, sum them, and divide by the network size you have, instead of simply writing mean(degree_w(net, measure="degree")[,"degree"]).
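In code, the distinction might look like this (a sketch; N stands for the known total number of nodes, including isolates):

library(tnet)

# Degree scores for the nodes that appear in the edgelist
deg <- degree_w(net, measure = "degree")[, "degree"]

# Average degree over the full node set, counting isolates as zeros
sum(deg) / N  # rather than mean(deg), which ignores the isolates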

I am using tnet to compute centrality measures (closeness, betweenness) in weighted directed networks, using datasets from different sources (e.g., http://snap.stanford.edu/data/links.html) and giving probabilities in [0,1] as weights on the links.

Everything seems to work perfectly. First, I would like to ask if I missed anything in the commands or if I did something wrong in this part. Also, I read in a previous post that weights are treated as costs, so a link of 0.4 would be preferred over a link of 0.9. Is that correct?

Your code seems fine. I would suggest that you take the output from the functions (e.g., closeness) and save it to an object, and then write that object to a file. But if the sink function works for you, go ahead!

tnet assumes that a higher weight is better. This is in contrast to the shortest path functions of igraph, which assume that people will invert the tie weights first themselves. From my own experience, we are not always so good at understanding the details of functions before running them, so I have built tnet on the common assumption that higher values = stronger ties = higher transmission. As such, a tie with a weight of 0.9 is preferred over a tie with a weight of 0.4.

I was thinking of inverting the weights so as to have the desired outcome, but since you confirmed that tnet prefers higher values, I do not have to invert the weights, and in both betweenness and closeness 0.9 will be preferred over 0.4, correct?

One more thing: closeness is computed for the largest connected component. Is there a way to compute it for the entire network topology?

Lastly, in my work I am referencing your website – should I reference:

I am using the version of closeness for the complete network topology and disconnected components, as you suggested; however, it seems that there is not enough memory to store the distance array for large networks. The networks I am testing are weighted and directed, with around 83,000 nodes. I have tested R on both 32-bit and 64-bit systems, and I also saw in other posts that it is a memory problem. How can we overcome this problem?

There are two ways of dealing with running out of memory in R: buy more memory or make the code more memory-efficient. I had some time, and below is a suggestion for doing the latter. Note that you might want to use JIT compiling to speed things up.
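That code is not preserved in this copy, but the idea is to compute shortest paths from one source node at a time instead of holding the full n-by-n distance matrix in memory. A rough sketch, assuming g is an igraph object with cost-type weights (compiler::cmpfun can JIT-compile the loop for speed):

library(igraph)

# Closeness one source node at a time; only one row of the distance
# matrix is held in memory at any point
n <- vcount(g)
closeness_scores <- numeric(n)
for (i in seq_len(n)) {
  d <- distances(g, v = i, mode = "out")  # distances from node i to all others
  d <- d[is.finite(d) & d > 0]            # drop self-distance and unreachable nodes
  closeness_scores[i] <- sum(1 / d)       # inverted distances handle disconnected nets
}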

Betweenness is not limited to the giant / main component; however, its distribution might be highly skewed and zero-inflated. Be sure to check before using the raw scores in frameworks that assume a Gaussian distribution.

Actually, the distribution I use is Zipfian, with skewness ranging from 0.1 to 0.9. I get a lot of zero values (I assume this occurs because these nodes are not present on shortest paths). Do you think there is a problem with a Zipfian distribution?

many thanks again,
Pavlos

87. Farnaz | November 17, 2014 at 8:59 am

Hello Dr. Opsahl,
Thank you for your very good work.
I want to use this metric in the industrial engineering discipline, but I am a bit confused about the choice of the alpha parameter. Is there any procedure that I can use to determine the best value of alpha for my network? For example, if I consider degree to be relatively more important, how can I find out which value I should choose?
many thanks.
Farnaz

Good day!
Thank you for this awesome package. It is really useful. However, is there an upper limit to the number of nodes one can analyze using tnet? My data has 416 nodes, and every time I run it I get the error below. Specifically, here is the code I have used:

set2008df = read.table("test2008.txt", header=TRUE) # read the data; this is in the form of i, j, w and is read as a data frame

set2008 closeness_w(set2008)
Error in unique(y[ind]) :
Value of SET_STRING_ELT() must be a ‘CHARSXP’ not a ‘bytecode’

Hi Tore:
The weight columns are numeric, the nodes are ids. I have also sent you the code and data. Let me know if you have more thoughts.
Thanks!

92. Zhitao Zhang | May 14, 2016 at 7:59 am

Hello Dr. Opsahl,
Thank you for your good work. I want to use link betweenness to analyse the intermediary role of certain links in weighted networks. But as you know, tnet only provides a node betweenness function. Could you give me some suggestions for solving this problem?
Many thanks.
Zhitao Zhang

Thank you very much for your excellent work. It would be extremely useful to use your measure of weighted centrality, as I have weighted interaction data. In particular, my work focuses on inter-group interaction, and I would like to ask whether it would be possible to get weighted in-degree and out-degree centrality measures for inter-subgroup relationships.

To get the out- and in-degree centrality measures, you need to first aggregate your network to the subgroup level (i.e., create a new network where the subgroups are the nodes); then you can specify the type parameter as either “out” (the default) or “in”.
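A sketch of such an aggregation, assuming net is a directed weighted edgelist (i, j, w) and membership a vector mapping each node id to its subgroup id (both object names are illustrative):

library(tnet)

# Recode the edgelist to the subgroup level
grp <- data.frame(i = membership[net[, 1]],
                  j = membership[net[, 2]],
                  w = net[, 3])
grp <- grp[grp$i != grp$j, ]                        # drop within-subgroup ties
grp <- aggregate(w ~ i + j, data = grp, FUN = sum)  # sum weights between subgroups

# Out- and in-degree at the subgroup level
degree_w(grp, measure = c("degree", "output", "alpha"), alpha = 0.5, type = "out")
degree_w(grp, measure = c("degree", "output", "alpha"), alpha = 0.5, type = "in")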