Digital Agenda: Network Analysis Beats Political Doublespeak

“Digital Agenda” has been used in such creative ways by our political parties that I was not surprised when my friend Marco asked this old data juggler (I did data-driven web usability assessment in the ’90s) for a hand. His idea was to monitor the political platforms for the general elections that will be held next weekend (that would be 24-25 February 2013, if you’re reading this in the future), and you can download the resulting “Observatory on digital policies for the 2013 Italian general elections” (in Italian). Alessandro Garbagnati joined in too (and was invaluable).

Can three lone geeks (please, no “Three Am1g0s” jokes) take on the propaganda behemoth and come out alive? Yes they can.

Well, I’m no political analyst. I am also painfully aware of the supreme rhetorical levels politicians can achieve at campaign time: political language is crafted to mean absolutely everything. But I love a challenge, I love to see through hype and I burned to do some network analysis on the worst possible data. So here’s a part of the story of how the Observatory came to be: the funnier one, and with graphs.

Side-stepping hype

The first problem with data analysis is data preparation. The first problem of data preparation is telling data from hype, wheat from chaff. Or signal from noise, as Nate Silver would put it (gotta love Nate Silver). When it comes to political platforms, there’s an additional problem: you can’t. There is no way you can tell hype from data. People way above your pay grade have sweated every comma so that everything and its exact opposite can be inferred from the text.

Comparing apples and oranges

Choose which is better:

a nation-wide program to teach digital technologies to school teachers

economic incentives to businesses that donate pre-obsolescent PCs to schools in need

bringing Internet into every classroom.

Sorry, your answer is wrong. The three proposals cannot be compared, only debated. Endlessly. Which is exactly what politicians are for. Since our goal was different, we had to side-step the issue.

We selected 14 “areas of intervention” to classify proposals, so that apple-and-orange comparisons could be defused. For instance, the three proposals above would all fall under “Digital Literacy”. It should be noted that these areas were chosen from recurring keywords in official platform documents, and not conceived by us. Another important point is that no matter how many proposals a party makes in a given area, we only record whether or not the party has any proposal in that area. Flooding and rhetorical bravado: defused.

Avoiding statistical pitfalls

We now know which party wants to act in which areas.

So we can now build a table where each row is a party and each column an area of intervention. And we put a 1 in a cell (X, Y) if party X promises to do something in area Y. What next? The bare minimum. Remember, we are playing with loaded dice. Any attempt to read too much into this will invariably end up reading propaganda.
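The table described above can be sketched in a few lines of plain Python. This is only an illustration of the bookkeeping: the party names and the proposals are invented toy data (only the area labels come from this post), and `build_presence_table` is a hypothetical helper, not something used for the Observatory.

```python
# Toy data: party names and proposals are invented for illustration only.
proposals = [
    ("Party A", "Digital Literacy"),
    ("Party A", "Digital Literacy"),  # a second proposal in the same area
    ("Party A", "Broadband Investments"),
    ("Party B", "Digital Literacy"),
    ("Party B", "Open Data"),
    ("Party C", "Tax Incentives"),
]


def build_presence_table(proposals):
    """Return (parties, areas, table) where table[party][area] is 1 or 0."""
    parties = sorted({party for party, _ in proposals})
    areas = sorted({area for _, area in proposals})
    present = set(proposals)  # duplicates collapse: only presence is recorded
    table = {
        party: {area: int((party, area) in present) for area in areas}
        for party in parties
    }
    return parties, areas, table


parties, areas, table = build_presence_table(proposals)
for party in parties:
    print(party, [table[party][area] for area in areas])
```

Note that Party A's two proposals on Digital Literacy still yield a single 1: making more proposals in one area does not make the cell any bigger.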

So, no fancy R scripts with this junk data. Trying to assess who fares better is not a very interesting exercise, because it would bring us back to the basic problem, i.e. we’re dealing with propaganda here.

Enter Network Analysis

Beware: this is little more than an entry-level exercise, very much like shooting flies with a shotgun. Overkill, yes, but it gives us the advantage of catching one or two flies. All graphs were produced with Gephi, the open-source interactive network visualization and exploration platform.

1: What Areas Are Really Important?

Luckily, there are more indirect ways to tackle the problem. What we have so far is simply a graph, where parties and areas are nodes and arcs represent a party proposing some action in an area. This is called a bipartite graph.

One interesting thing to ask would be: which areas appear together more often? Another would be: which parties share the same target areas? These are two questions that could cut directly through any layer of propaganda and get to something interesting: common interests between (potentially rival) parties and consensus on areas. Luckily, this is exactly what network analysis and Gephi allow us to ask.
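The first question amounts to projecting the bipartite graph onto its area nodes: two areas get an arc whose weight is the number of platforms containing both. A minimal sketch in plain Python, assuming invented platforms (the party names and their area sets are made up; `area_cooccurrence` is a hypothetical helper, and the charts in this post were produced with Gephi, not with this code):

```python
from collections import Counter
from itertools import combinations

# Toy platforms: invented parties, each mapped to the set of areas it covers.
platforms = {
    "Party A": {"Digital Literacy", "Tax Incentives", "Broadband Investments"},
    "Party B": {"Digital Literacy", "Tax Incentives"},
    "Party C": {"Digital Literacy", "Broadband Investments", "Open Data"},
}


def area_cooccurrence(platforms):
    """Count, for every pair of areas, how many platforms contain both."""
    weights = Counter()
    for areas in platforms.values():
        # Sorting gives each pair a canonical (alphabetical) orientation.
        for pair in combinations(sorted(areas), 2):
            weights[pair] += 1
    return weights


weights = area_cooccurrence(platforms)
for (area_1, area_2), weight in sorted(weights.items()):
    print(f"{area_1} -- {area_2}: {weight} platform(s)")
```

The second question is the same operation with the roles swapped: parties become the nodes, and the weight of an arc is the number of areas two platforms have in common.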

Areas appearing together in at least 3 platforms (thicker/whiter arc = more platforms)

In the chart, areas are ordered clockwise by increasing number of platforms they appear in (their degree). To reduce clutter, only those arcs representing co-occurrence of the areas in 3 or more platforms are represented. Note that three areas appear together in almost all platforms: Digital Literacy, Tax Incentives, Broadband Investments (from top, counterclockwise). Let’s remove some more clutter.

Now, an arc appears only if it stands for two areas appearing together in 4 or more platforms. That is, there are at least four parties whose platform prescribes actions in both areas. With less clutter, a few things become apparent:

there is a large consensus that Digital Literacy, Tax Incentives, Broadband Investments should be pursued together

when it comes to reducing the digital divide, consensus is already diminishing

bringing digital technologies to the Public Administration (PA) and Incentives to Startups already belong in open-triangle-land: Incentives to Startups is strongly linked to Reducing the Digital Divide; this in turn is strongly linked to other areas, none of which is strongly connected to Incentives to Startups; here is where consensus becomes brittle

all other areas (from Digital Citizenship at 6:00 counterclockwise) appear way less often; these will likely be the Cinderellas of actual government action.
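The two steps behind these charts, ordering areas by degree and dropping arcs below a minimum number of platforms, can be sketched as follows. Again this is plain Python on invented platforms, with hypothetical helper names (`area_degree`, `strong_arcs`); it only illustrates the filtering, not the Gephi workflow actually used.

```python
from collections import Counter
from itertools import combinations

# Toy platforms: invented parties and area sets, for illustration only.
platforms = {
    "Party A": {"Digital Literacy", "Tax Incentives", "Broadband Investments"},
    "Party B": {"Digital Literacy", "Tax Incentives", "Open Data"},
    "Party C": {"Digital Literacy", "Tax Incentives", "Broadband Investments"},
    "Party D": {"Digital Literacy", "Broadband Investments"},
}


def area_degree(platforms):
    """Degree of an area = number of platforms it appears in."""
    return Counter(area for areas in platforms.values() for area in areas)


def strong_arcs(platforms, min_platforms):
    """Keep only pairs of areas that co-occur in at least min_platforms platforms."""
    weights = Counter(
        pair
        for areas in platforms.values()
        for pair in combinations(sorted(areas), 2)
    )
    return {pair: w for pair, w in weights.items() if w >= min_platforms}


degree = area_degree(platforms)
# Increasing degree, ties broken alphabetically.
ordered = sorted(degree, key=lambda area: (degree[area], area))
arcs = strong_arcs(platforms, min_platforms=3)
# Areas that keep no arc at this threshold.
unlinked = set(degree) - {area for pair in arcs for area in pair}

print(ordered)
print(arcs)
print(unlinked)
```

Raising `min_platforms` is the "remove some clutter" step: arcs disappear one threshold at a time, and the areas left without any arc are the ones with the least shared support in the toy data.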

It is instructive to see which areas are “Cinderellas”; from 6:00 counterclockwise we meet

Digital Citizenship

Open Source for the Public Administration

Smart City

Open Government

Open Data.

You may well say that these areas are crucial to any self-respecting digital agenda, that all of them should appear in a party’s platform, and you would be right. Here we have some indication that when it comes to digital, there is no strategic vision.

We can also note that as we walk the nodes counterclockwise, the keywords become less and less fashionable and more and more “technical”. We can safely assume that most professionals and entrepreneurs in the digital realm would deem the right side of the chart at least as important to the economy as the left side. Parties seem to think otherwise. Fashionable keywords get all the attention. This is another indication that platforms have been designed more to impress the public than to pursue strategic goals. Also, plain ignorance of “technical” issues cannot be ruled out.

What does all this mean in terms of potential policy? Simply that whoever wins the elections will find it easier to pursue thickly-connected areas than thinly-connected ones. For instance, it will be easier to pass laws on tax incentives to promote digital literacy than on Open Data to digitalise the Public Administration.

2: Who Agrees With Whom?

You’ll like this: in the following graphs nodes are now parties, and an arc means that two parties propose actions in the same area. The more areas both parties want to act in, the thicker and whiter the arc. Look what happens…

Parties whose platforms have at least 2 areas in common (thicker, whiter arc = more common areas)

Here, nodes are political parties. An arc connects two parties if their digital agendas share at least two common areas. As you can see, everybody agrees with everybody else, to a greater or lesser extent. Consider that the entire political spectrum (though not all parties) is represented. Yes, there are stronger bonds among the parties on the left side of the chart. The interesting thing is, half of them are from the political left, the other half from the political right. Let’s remove some clutter and see the really strong bonds.

Parties whose platforms have at least 4 areas in common (thicker, whiter arc = more common areas)

Now it’s better. Platforms from Lega Nord, SEL and Rivoluzione Civile have fewer than 4 areas in common with anyone. This is no surprise as their platforms cover 2, 2 and 3 areas, respectively. When we get to the stronger commonalities (the thicker arcs in the upper left quadrant), things become interesting:

Monti (political right) shares more common areas with PD (political left) than with PDL (political right)

PD (political left) and PDL (political right) have more common areas between themselves than with any other party (excluding Monti)

the M5S (Grillo, political left) platform shares more areas with Monti (political right) than with any other party

FARE (political right) has as many common areas with Monti (political right) as with PD (political left), and more with these than with PDL.
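The party-side charts rest on the same idea with the roles swapped: the weight of an arc between two parties is the number of areas their platforms share, and the chart keeps only arcs at or above a minimum. A minimal sketch in plain Python, with invented parties and platforms and hypothetical helper names (`shared_areas`, `party_arcs`); it says nothing about the real parties discussed above.

```python
from itertools import combinations

# Toy platforms: invented parties and area sets, for illustration only.
platforms = {
    "Party A": {"Digital Literacy", "Tax Incentives", "Broadband Investments", "Open Data"},
    "Party B": {"Digital Literacy", "Tax Incentives", "Broadband Investments", "Smart City"},
    "Party C": {"Digital Literacy", "Open Data"},
}


def shared_areas(platforms):
    """Map each pair of parties to the set of areas both platforms cover."""
    return {
        (p, q): platforms[p] & platforms[q]
        for p, q in combinations(sorted(platforms), 2)
    }


def party_arcs(platforms, min_shared):
    """Keep only pairs of parties sharing at least min_shared areas."""
    return {
        pair: len(common)
        for pair, common in shared_areas(platforms).items()
        if len(common) >= min_shared
    }


loose = party_arcs(platforms, min_shared=2)
strict = party_arcs(platforms, min_shared=3)
print(loose)
print(strict)
```

In the toy data Party C covers only two areas, so it can never reach a threshold of three with anyone. This is the same effect noted above for the platforms that cover only 2 or 3 areas.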

I anticipate an objection: even if two parties declare that action in the same area (say, reducing the digital divide) is needed, this does not imply that the parties will agree on which action to pursue. True. But politics is the art of the win-win compromise. This simple exercise reveals very strong bipartisan convergences on a large number of areas. In simple terms, whoever wins the elections, the very platforms tell us there will be a majority of votes in favor of some action in a large number of digital areas. It is the parties’ job to find out which action.

Conclusion

Our little exercise in network analysis has proved worthwhile. We have found that network analysis can provide some insight even with very low quality data (in this case, the carefully vague statements of Italian political platforms). We have not found out whose platform was “the best”, but we have found out some interesting facts:

which platforms have an ample digital agenda, and which only pay lip service

which joint areas are more likely to attract a majority of votes, whoever wins the elections

which potential alliances may be formed on digital issues, regardless of election results.

All this, of course, requires that political parties stick to their pre-election platforms, or are required to do so.