Digital Humanities and the NLeSC

By Jacqueline Hicks and Sally Wyatt

At the Netherlands eScience Center (NLeSC)

On 8 October 2015, the NLeSC hosted its 3rd national eScience Symposium at the Amsterdam ArenA. The ArenA is, of course, the home of the Ajax football team, and not a place one associates with scholarly enquiry. However, it proved to be a fantastic location for a great day of lectures, demonstrations, posters and discussion, even if the photos of famous footballers were a little distracting and all of the rooms were equipped with bars for the VIP football guests. Smaller than the usual football crowd, but substantial for discussions of eScience, over 500 people joined the Symposium.

The day began and ended with keynote lectures from José van Dijck, President of the KNAW; Alexander Rinnooy Kan, co-chair of the Dutch National Science Agenda; Leonard Smith of Oxford; and Tony Hey, about to return to the UK after his years in the Pacific Northwest. This was a great mixture of insights from the humanities and the natural sciences. The title of José van Dijck’s keynote was ‘Big data, grand challenges. On digitization and humanities research’. She posed a fundamental question for the humanities: ‘Does digital humanities (DH) favour quantitative over qualitative data?’. By implication, we can extend that to ask what the implications are for the kinds of knowledge produced, not only in the humanities but also in the social sciences, and even in the natural sciences. Debates about the relative merits of quantitative and qualitative methods have been going on for a long time in the social sciences, and it is interesting to see how the introduction of digital and computational methods in the humanities will play out. A further question, of whether DH requires data to be digital, is also important. Despite the repeated charts showing exponential increases in digital data, within the humanities we know full well that large proportions of cultural heritage collections and historical archives are not yet digital. If we only analyse digital data, we are in danger of missing important sources and skewing our interpretations.

During the late morning and early afternoon sessions, participants had a choice of five tracks – life sciences & eHealth, environment & sustainability, physics, computer & data sciences, and humanities & social sciences. The eHg co-organised the latter session, bringing together people currently funded by the NLeSC and other researchers, and the track was very well attended. The opening talk was given by Evelyn Ruppert, from Goldsmiths College London and editor of Big Data & Society. She pointed out how different framings of ‘data’ lead to different conversations and policies. A utilitarian framing of data as resource prompts questions of ownership and rights, whereas framing data as representation leads to abstractions that may or may not reflect the experiences of individuals and groups. Both of these threaten the possibility of seeing data as a social good or commons. Evelyn argued that data are always social, and can be enriched through their circulation. Big data in particular are social, both because they reflect social processes and because people are socialized by data and by the platforms through which data move. Therefore, she went on, we need to develop an ethics of care in order to better understand the ways in which people relate to one another through data, and how people relate to the data they generate and the data about themselves.

Other speakers were Piek Vossen and Oana Inel from the VU, Els Stronks, Sonja de Leeuw and Anja Volk from Utrecht University, and Kees Aarts from the University of Twente. A number of common themes emerged: how to represent uncertainty, how to identify patterns and exceptions, different framings of data, the role of theory in data-intensive times, the potential for prediction, and how to engage the ‘crowd’ to annotate, generate and analyse data. Finally, there was an important common theme, going under different labels including tool criticism, media archaeology and reflexivity: the importance of critical and sustained engagement with tools and algorithms, in order to understand what they do to people as citizens and what they mean for researchers and the knowledge produced.

These themes were picked up on 14 December, when the NLeSC hosted a one-day workshop at the Lorentz Centre in Leiden. The morning started with an opening lecture by one of us (Wyatt) called ‘Digital Humanities in the Netherlands and beyond’, which picked up on some of the themes outlined above. This was followed by presentations of the projects supported by the NLeSC, including DiliPad, which is developing a tool for searching Dutch parliamentary proceedings; DIVE, which links historical events to cultural heritage collections; PIDIMEHS, which searches for evidence of pillarization in Dutch media; and Texcavator, which is developing a tool for cultural text mining.

The afternoon sessions turned to discussion, including one asking a group of digital humanities practitioners (including another one of us, Hicks) what they thought the field would look like in five years’ time. While there were some inspiring imaginings of the technical capabilities to come in online representations of cultural repositories, the discussion also included some reflection on how the use of computational techniques affects the way research in the humanities is undertaken. Rather than reproducing concerns with the qualitative/quantitative divide (see above), a distinction was made between empirical and interpretive approaches. This gives room for both approaches in one analysis, but draws attention to the idea that it is not always necessary to answer big humanities questions with computational techniques alone; rather, such techniques can produce smaller ‘nuggets’ of empirical findings within a larger interpretive analysis.

Other points of discussion dealt with CLARIAH’s role in supporting a transition from the corpus-specific tools developed so far in the digital humanities community to more generalizable ones that can operate across corpora. Common practical issues in the field, such as how tools could be maintained as web services, and the implications of the growing variety of computational jobs for academic work, were also touched on in the course of the afternoon.