Resisting networks

This has been fun, historical networks, but we have to say goodbye – at least partly. Class after class, I have tried to tame historical networks and their visualizations, only to conclude that they are among the things humanists can’t handle by themselves. From the moment you seriously want to understand how your network visualization is generated, which is itself the condition for interpreting it correctly, you need someone who can deal with numbers.

There are several issues that make interdisciplinary cooperation necessary in network visualization and network analysis. Generally, if you want to visualize the networks you are working on, it is because you can’t grasp them in any other (linear, textual or tabular) manner, be it due to the complexity of the relationships you are focusing on or due to the number of actors you want to consider. Either the edges are too complicated or the nodes are too numerous – or both. The whole point of network visualization is to reduce these to less extensive blocks of information, to simplify. But for historians, who already feel terrible because not all material has been preserved or is accessible for their research, the inhibitions against simplifying are overwhelming. Of course, it is a major epistemological problem: what is historical method but a way of interpreting the sources that we have as if we were sure that they would give us the clues we need? Historians have to live with loss of information to begin with.

The class I am giving on historical networks is probably not your standard class on historical networks (if there is such a thing). I am not a computer scientist. I am a historian of literature. I conceived this class as a comparison of theory and digital practice in the field of historical networks. Interestingly, it has turned out that even this requires too much computational background to be realized. There is not much to be said about existing projects working with historical networks. Many of them don’t put their visualization results online. Those that do don’t necessarily describe the algorithm and/or the database they are using. So much for the practice. As for the theory, I seem to have difficulty explaining methodological transfer to my students. Seriously, what can history learn from the social sciences? A lot. But I feel like I have not been able to convey that so far. Which brings me to the psychological trauma of having to admit information loss. Yes, sociologists work with interviews and historians with incomplete letter collections. But still, you can transfer methods. This is probably the aspect that I find (for myself) the most interesting.

The other aspect that has been inhibiting the course of the class is the need for a visual answer, while I felt that data formatting should have been the number one issue. This aspect crystallized in particular this week, when the students presented their projects and data. I was taken aback because I did not expect them to come to the class with such huge datasets in their pockets, by which I literally mean in their pockets: Excel files saved on personal hard drives. This was really scary, for several reasons.

First, the methodological aspects of data harvesting were hardly questioned: “I filled in this field this way because I did not have time to do more”. At that level of information gathering, the gesture of simplification is not being questioned. There are obviously accepted historiographical methods for extracting key information from letters and considering them historically representative. This is where I would draw the line between historians and historians of literature. The fact that there is a letter from x to y in an old edition, to my mind, is evidence of nothing but canon building. But let’s skip that for now.

Second, I have always been told to avoid Excel since it makes export so impractical. And enrichment so complicated. And the definition of relationships so unworkable. Does anyone have an idea what alternative format I should be suggesting, if not XML directly? Because I hear more and more voices protesting against XML, saying that it does not really provide the interoperability and stability we need.

Third, where should this data (in a somewhat interoperable, citable, enrichable format) be hosted? Whose role is it to curate digital data from M.A. theses? To what extent should the university provide long-term archiving?

And fourth, the visualization. Pretty much the tip of the iceberg. The students want a visual solution to their network, are not happy with UCINET, want to try out Gephi et al. but don’t want them to be too complicated to use. This is my problem: how do I make them understand that there is no possible visualization without serious data modelling in the background? That data modelling is their research question proper, and that it even involves another step before visualization, namely the definition of the visualization algorithm that answers that question? That historical networks are not pictures on the basis of an Excel file, as big as it may be?
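To make the point about data modelling a little more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the column names (`sender`, `recipient`, `year`), the placeholder persons and the function `build_network` are invented and do not come from any student project or existing tool. It only shows that the same spreadsheet rows yield two different networks depending on a single modelling decision, namely whether a letter from A to B counts as the same tie as a letter from B to A.

```python
import csv
import io
from collections import Counter

# Hypothetical spreadsheet export: one row per letter.
# Column names and persons are invented placeholders.
RAW = """sender,recipient,year
Person A,Person B,1801
Person A,Person B,1802
Person B,Person A,1802
Person C,Person A,1803
"""


def build_network(csv_text, directed=True):
    """Turn letter rows into a set of nodes and a counter of weighted edges.

    `directed` is the modelling decision: if False, a letter from A to B
    and a letter from B to A are merged into one undirected tie.
    """
    nodes = set()
    edges = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        sender = row["sender"].strip()
        recipient = row["recipient"].strip()
        nodes.update((sender, recipient))
        if directed:
            key = (sender, recipient)
        else:
            # Sort the pair so that both directions share one key.
            key = tuple(sorted((sender, recipient)))
        edges[key] += 1  # edge weight = number of letters
    return nodes, edges


directed_nodes, directed_edges = build_network(RAW, directed=True)
undirected_nodes, undirected_edges = build_network(RAW, directed=False)
```

In this sketch the directed model keeps three distinct edges, while the undirected one collapses them into two with higher weights. Neither is the "right" picture; which one to draw depends on the research question, and that choice has to be made before any visualization tool is opened.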

While I am still not able to teach any kind of solution to this, the fact that I can diagnose the different steps so precisely in the students’ projects helps me move on with my own research corpus. Which will hopefully, in turn, make me better armed for the next class on the topic. After all, I might not give up on historical networks so easily, especially with so many computer scientists and visualization specialists around me willing to help out…

Anne Baillot

I studied German Studies and Philosophy in Paris where I got my PhD in 2002. I then moved to Berlin, where I have been living & doing research ever since. My areas of specialty include German literature, Digital Humanities, textual scholarship and intellectual history. I am currently working at the Centre Marc Bloch in Berlin as an expert in digital technologies for the humanities.