Mapping the ASF, Part II

In my last post I showed you one view of the Apache Software Foundation: the relationships among projects as revealed by the overlapping membership of their Project Management Committees (PMCs). After I did that post it struck me that I could, with very small modifications to my script, look at the connections at the individual level instead of at the committee level. Initially I attempted this with all Committers in the ASF. This resulted in a graph with over 3,000 nodes and over 2.6 million edges. I’m still working on making sense of that graph. It is very dense, and visualizing it as anything other than a giant blob has proven challenging. So I scaled back the problem slightly and decided to look at the relationships between individual members of the many PMCs, a smaller graph with only 1,577 nodes and 22,399 edges.
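For readers who want to try something similar, here is a minimal sketch of how a person-level graph might be derived from committee rosters. It is not my actual script. The roster data, the names, and the `co_membership_edges` helper are all made up for illustration. The idea is simply that every pair of people who share a PMC gets an edge.

```python
from itertools import combinations

# Hypothetical rosters: PMC name -> set of member ids (not real ASF data).
pmc_members = {
    "alpha": {"ann", "bob", "cat"},
    "beta": {"cat", "dan"},
    "gamma": {"eve", "fay"},
}


def co_membership_edges(rosters):
    """Return the set of undirected edges linking people who share a PMC."""
    edges = set()
    for members in rosters.values():
        # Sorting gives each pair one canonical (a, b) ordering, so a pair
        # that serves together on several PMCs is still stored only once.
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges


edges = co_membership_edges(pmc_members)
nodes = set().union(*pmc_members.values())
print(len(nodes), "nodes,", len(edges), "edges")
```

Because a committee of n people contributes n(n-1)/2 edges, large groups dominate the edge count, which is one reason the all-Committers graph gets so dense so quickly.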

Here’s what I got:

As before I excluded the Apache Incubator, Labs and Attic, but looked at all other PMC members. Each PMC member is a dot in this graph, with a line connecting two people who serve together on a PMC. The layout and colors emphasize communities of strong interconnection. An SVG version of the graph is here.

Each PMC is a “clique”, a group that strongly interacts with itself. But aside from a small number of exceptions, which you can see at the top of the graph, each PMC has one or more members who are also members of other PMCs. In structural terms they are “between” the two communities and help connect them. This could mean various things in social terms: acting as a conduit of information, as a broker, or even as a gatekeeper. The person who introduces you to new people at a party serves the same role as the person who tells the prisoner stories of the outside world. The context is different, of course, but in either case the structural position is one of importance.
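As a rough illustration of that idea, the sketch below lists the people who sit on more than one PMC and the PMCs that share no members with any other. The rosters and the `memberships` helper are hypothetical, invented for this example rather than taken from the ASF data.

```python
from collections import defaultdict

# Hypothetical rosters: PMC name -> set of member ids (not real ASF data).
pmc_members = {
    "alpha": {"ann", "bob", "cat"},
    "beta": {"cat", "dan"},
    "gamma": {"eve", "fay"},
}


def memberships(rosters):
    """Invert the rosters: person -> set of PMCs that person serves on."""
    by_person = defaultdict(set)
    for pmc, members in rosters.items():
        for person in members:
            by_person[person].add(pmc)
    return by_person


by_person = memberships(pmc_members)

# People on two or more PMCs are the ones who sit "between" communities.
bridges = {p: sorted(pmcs) for p, pmcs in by_person.items() if len(pmcs) > 1}

# A PMC with no bridging member is disconnected from the rest of the graph.
connected_pmcs = {pmc for pmcs in bridges.values() for pmc in pmcs}
isolated_pmcs = sorted(set(pmc_members) - connected_pmcs)

print(bridges)
print(isolated_pmcs)
```

In this toy data only "cat" bridges two committees, and "gamma" plays the part of the isolated PMCs that float at the top of the graph.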

A common way of quantifying the importance of the nodes that connect other nodes is a metric called “betweenness centrality”, which you can think of as a measure of how many shortest paths between other nodes pass through that node. If the shortest path is always going through you, then you have high betweenness and you’re helping to connect the disparate parts of the organization.
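To make the definition concrete, here is a small sketch of Brandes' algorithm for unnormalized betweenness on an unweighted, undirected graph. This is an illustration and not necessarily how the numbers in this post were computed. The tiny adjacency list is made up. In practice a graph library would do this for you (NetworkX, for example, provides `betweenness_centrality`).

```python
from collections import deque

# Hypothetical toy graph as an adjacency list (not real ASF data):
# ann, bob and cat share one PMC; cat and dan share another.
adj = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["cat"],
}


def betweenness(adj):
    """Unnormalized betweenness centrality via Brandes' algorithm."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and remembering each node's predecessors on those paths.
        dist = {v: -1 for v in adj}
        sigma = {v: 0 for v in adj}
        preds = {v: [] for v in adj}
        dist[s], sigma[s] = 0, 1
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair was counted from both ends.
    return {v: score / 2 for v, score in bc.items()}


scores = betweenness(adj)
print(scores)
```

Here "cat" scores 2.0 because the only shortest paths from ann to dan and from bob to dan both run through cat, while everyone else scores 0.0.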

Let’s draw the graph again and show each node with a size proportionate to its betweenness. You can see more clearly now the position of the high betweenness nodes and how they bridge sub-communities.

Now of course, the structural role doesn’t necessarily equate to the actual social role. Someone could be inactive or lurking in multiple projects and not serve as the conduit of much of anything, though on paper they appear central. But Apache participants might take a look at this larger version of the chart, where I have labeled the nodes, and see how well it matches reality in many ways.