Dynamic Network Analysis: 7. Why do network analysis?

When you are asking good questions and trying to use network analysis to answer them, the focus should not only be on nodes and linkages, but also on exploring the consequences of the relationships being examined.

https://pxhere.com/en/photo/1084246

John Terrell

IT SEEMS LIKELY MANY TODAY on hearing the word “network” may assume you must be talking about an Internet outage or service delay. It seems less likely they will think you are referring to something as rarefied as network analysis.

However, people devoted to using social media in creative ways, and those paying attention to the news each day, may have at least a beginner’s understanding of the currently popular uses of this technical approach to the mysteries of the universe. Even they, though, may be only minimally aware of the weaknesses and limitations of such research.

Furthermore, should some inquisitive souls venture as far as reading a textbook on this topic, they may decide that such books, undoubtedly impressive for their mathematics and node-link diagrams, are more about “how to” than about “why?” or “so what?”

Therefore, what are some of the demonstrated strengths of network research? What currently are some of the known weaknesses? Is there anything that can be done to overcome the latter?

Said more simply, why do network analysis?

Data mapping and visualization

Graph representing the metadata of thousands of archive documents, documenting the social network of hundreds of League of Nations personnel. (By Martin Grandjean [CC BY-SA 3.0], via Wikimedia Commons.)

Network diagrams can be visually striking, even intimidating. Come across one or two of these visualizations in a research article, and even if you don’t know all that much about the science involved, it’s hard not to be impressed.

While granting, therefore, that network diagrams can be both pretty and impressive, evidently little is yet known about how successfully people are able to interpret such visualizations. Nor is it clear whether or when network visualizations actually improve human understanding and problem-solving. There is, however, some evidence suggesting that both data tables and network visualizations are more effective ways of communicating information and scientific findings than conventional text descriptions (Welles and Xu 2018).

“The change of community structure after applying the schneider’s method [23] on the dolphin network. Fig. 1(a) presents the community structure of the dolphin network with N = 62 dolphins and M = 159 co-appearance of them. The network can be divided into 5 communities, which are marked as different colors. Fig. 1(b) presents the onion-like structure of the improved network. The community structure of the network is greatly changed. Fig. 1(c) shows the clear community structure of Fig. 1(B), i.e. the improved network. The first two figures are drawn by the Gephi automatically, while the vertices in the last figure are separated by the colors manually. The social network of 62 bottlenose dolphins was observed by Lusseau [30]. The dolphins lived in Doubtful Sound, New Zealand. Lusseau collected the data of dolphins according to his field studies of dolphins for two years. The ties between dolphin pairs are established by the observation of the statistically significant frequent association.” (Yang et al. 2015)

Data mining

Anyone shopping online nowadays at companies like Amazon has firsthand experience with the ability these firms now have, using relational algorithms, to track you, your tastes, where you have been on the Internet, and your prior purchases. What am I referring to? The enterprising phrase “customers who bought this also bought” that is now standard at commercial websites.

Mining what is popularly called “big data” culled off the Internet in this fashion to find correlations and patterns of association is proof enough that relational thinking can have highly profitable payoffs—even when the final product is not an attractive network visualization but instead an enticing and strongly personalized buying recommendation.

If relational data mining like this sounds more than just a little creepy to you, fair enough. The self-serving, some would say malicious, misuse of personal information mined from 87 million Facebook accounts by a company called Cambridge Analytica during the 2016 presidential election in the United States should be more than enough to convince even the most hardened skeptics that data mining using network algorithms can be more than a trivial pursuit.

“Social media analytics is the process of gathering data from stakeholder conversations on digital media and processing into structured insights leading to more information-driven business decisions and increased customer centrality for brands and businesses” (http://www.wikiwand.com/en/Social_media_analytics).

Correlation or cause?

While some purists debate whether statistically speaking “an association” is or isn’t the same thing as “a correlation,” one of the fundamental clichés of statistics is the quip that “a correlation [or association] is not a cause.” The logic behind this observation is straightforward. Saying something (call it A) is correlated with something else (B) just means: if A, then also B with some likelihood or degree of probability. For example: if you like A enough to buy it, then you may also want to purchase B, which the analysis of large amounts of buying data has revealed is something often also bought with it.

Correlations are rarely absolute or perfect. Even when Amazon or some other company lets you know other people who bought this also bought that, buying the latter may be something you’d never ever dream of doing. But given how often Amazon knows for sure that other people have also bought B, it is not unreasonable to think you might be willing to do so, too.
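The “if A, then also B with some likelihood” logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the baskets below are invented, the function name also_bought is hypothetical, and nothing here is meant to describe how Amazon or any other company actually computes its recommendations.

```python
from collections import Counter

# Hypothetical purchase histories: each set is one customer's basket.
baskets = [
    {"air conditioner", "support bracket"},
    {"air conditioner", "support bracket", "extension cord"},
    {"air conditioner"},
    {"support bracket"},
    {"extension cord", "novel"},
]


def also_bought(baskets, item):
    """For each other item, return the share of baskets containing
    `item` that also contain that other item."""
    with_item = [b for b in baskets if item in b]
    if not with_item:
        return {}
    counts = Counter(other for b in with_item for other in b if other != item)
    return {other: n / len(with_item) for other, n in counts.items()}


rates = also_bought(baskets, "air conditioner")
for other, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(f"bought air conditioner -> also bought {other}: {rate:.2f}")
```

The result is a likelihood and nothing more. The code counts how often two things turn up together; it holds no information about why they do.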

Note, however, that nowhere in the logic of “if A, then B” is it stated why the two are likely to be purchased together. Obviously if Amazon also knew what causes the pairing of A and B, the odds of getting someone to buy both might be greatly improved.

In any case, the lesson for us remains unchanged. Just because A and B are correlated doesn’t mean we know why or how they are. Nor do we even know when they are likely to be.

Yes, often comparison of A and B can suggest possible reasons. Both may be similar in character or for a similar purpose. For example, A is a window air conditioner, and B is a steel support bracket for this type of air conditioner. Or A is a book about sexual behavior, and B is something risqué from Victoria’s Secret. Yet while similarity may often be a useful clue, similarity and cause are not one and the same thing.

Constraints and opportunities

Although perhaps not always obvious, a correlation is a kind of pattern. Hence data mining, mapping, and visualization are ways of finding and illustrating not just dyadic correlations between A and B, but also more complex forms of relational patterning within samples of information about this, that, or something else (commonly referred to as data sets). On the face of it, therefore, if it is true that a correlation is not a cause, then so too should it be true that a pattern is not a cause.

Or maybe complex patterns can be, at least sometimes? It is conventional to say that the goal of networks research is to learn how the patterning of ties within social networks—in the jargon commonly used, the structure of such networks—shapes human behavior (Scott 2000: 19).

Given the familiar old saying “it’s not what you know, it’s who you know,” this possibility may hardly seem worth mentioning. Doesn’t everybody know that the patterning of their relationships with others can often make or break the best-laid schemes o’ mice an’ men? Or as Stephen Borgatti and his colleagues recently expressed what would appear to be this sentiment in a less poetic way:

a generic hypothesis of network theory is that an actor’s position in a network determines in part the constraints and opportunities that he or she will encounter, and therefore identifying that position is important for predicting actor outcomes such as performance, behavior or beliefs. (Borgatti et al. 2013: 1)
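To make “identifying that position” a little more concrete, here is a minimal sketch using degree centrality: the share of other actors to whom a given actor is directly tied. The choice of measure is my assumption, since the quotation does not name one, and the actors and ties are invented for illustration.

```python
# Hypothetical undirected ties among five actors (names are illustrative).
ties = [("Ana", "Ben"), ("Ana", "Cy"), ("Ana", "Dee"), ("Dee", "Eli")]


def degree_centrality(ties):
    """Return, for each actor, the fraction of the other actors
    to whom that actor is directly tied."""
    neighbors = {}
    for a, b in ties:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {actor: 0.0 for actor in neighbors}
    return {actor: len(nb) / (n - 1) for actor, nb in neighbors.items()}


centrality = degree_centrality(ties)
for actor, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{actor}: {score:.2f}")
```

A score like this only describes where someone sits in the pattern of ties. Whether that position actually constrains or enables anything is a separate question that the number alone cannot answer.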

If we grant at least for now that the patterning of social ties within a social network can “in part” (as these authors say) determine constraints and opportunities, then doesn’t this mean that network structure can be causal, at least some of the time? If so, then is proving this exception to the general rule that a correlation (or pattern) is not a cause more than adequate justification for pursuing networks research? Or is this justification for this kind of research making too much out of too little?

You guessed it. The question I just asked was a leading question. Surely there must be more to be gained by doing network analysis than just proving over and over again that your success depends on who you know, not what you know. Or maybe not?

Limiting assumptions

For me, one of the surprising limitations of many published applications of social network analysis is the evident focus not on what, when, and why the individuals or groups being studied do what they do in their actual relationships with one another, but instead on what Stanley Wasserman and Katherine Faust (1994: 8) have described as “the characteristics of the network as a whole.” Or as they have also phrased this notion: “To a large extent, the power of network analysis lies in the ability to model the relationships among systems of actors” (1994: 19).

Their commitment to such a research agenda is shared by many others doing social network analysis (Knox et al. 2006). In practice, such an agenda leads one to define the “unit of analysis” not simply as people and their relationships with one another, but instead as that vague collective something popularly called the “group”—which Wasserman and Faust are willing to define for us as “the collection of all actors on which ties are to be measured” (1994: 19).

Such a definition of what social network analysis focuses on might be more understandable if they just came out and told us that the word “group” simply means what statisticians label as a “sample.” But evidently this is not what Wasserman and Faust (and others doing network analysis) have in mind. When they write about “systems of actors,” what they want us to take these words to mean is something far more categorical:

A system consists of ties among members of some (more or less bounded) group. The notion of the group has been given a wide range of definitions by social scientists. For our purposes, a group is the collection of all actors on which ties are to be measured. One must be able to argue on theoretical, empirical, or conceptual criteria that the actors in the group belong together in a more or less bounded set. Indeed, once one decides to gather data on a group, a more concrete meaning of the term is necessary. (Wasserman and Faust 1994: 19)

These authors do go on to tell us that dealing with such groups presents social scientists with “some of the more problematic issues in network analysis, including the specification of network boundaries, sampling, and the definition of group” (1994: 19–20). But here’s the big question. If you want to do network analysis, is it necessary to focus on groups? Why can’t the focus be on nodes and linkages? And on causes and consequences more telling, more interesting, and perhaps more mysterious than just what can be made of who we know?

What’s next?

I am hoping you agree with me that I haven’t as yet put our collective finger on why doing network analysis is a good thing to do. I will offer you in the next post in this series the basics of what I see as a better reason. Before going on to do so, however, I want to plant two thoughts in your mind.

First, it’s a big mistake to believe that network analysis has to be about studying something called “the whole network.” Given even the most basic math (N = 2^n), it is obvious that as the number of nodes increases, the complexity of “the whole” can grow so swiftly that, practically speaking, the best anyone can do is study samples, not groups or wholes however defined.
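How swiftly? A rough sketch follows, taking the formula above to mean 2 raised to the power n, that is, the number of distinct subsets that can be drawn from n nodes (an interpretation, not something spelled out in the text). For comparison, the second function counts the possible undirected networks on n nodes, where each of the n(n-1)/2 pairs is either tied or not. Both function names are hypothetical.

```python
def subsets_of_nodes(n):
    """Number of distinct subsets of n nodes: 2 to the power n."""
    return 2 ** n


def possible_undirected_networks(n):
    """Number of distinct undirected networks on n labeled nodes:
    each of the n*(n-1)/2 possible ties is either present or absent."""
    return 2 ** (n * (n - 1) // 2)


for n in (2, 5, 10, 20):
    print(
        f"n={n:>2}  subsets={subsets_of_nodes(n):,}  "
        f"networks={possible_undirected_networks(n):.3e}"
    )
```

Even at twenty nodes the number of subsets passes a million, and the number of possible networks is far larger still.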

Second, if you are really interested in asking good questions and trying to use network analysis to help you answer them, then the focus should be not only on nodes and linkages “down on the ground,” so to speak, but also on exploring the consequences of the relationships, human or otherwise, under observation.

In particular, as I will discuss in the next post, we need to add to our classification of networks another column and label it “adaptive.”