Big Humanities workshop on Big Data

As the IEEE conference on Big Data moved into a new phase, an incredible collection of humanities and technology partners gathered to share stories of what big data means to them.

Ada Lovelace described herself as a poetical scientist and an Analyst. I would call her a storyteller and codifier. She was able to both tell stories and encode them in a repeatable note format… With her father the poet and her mother the mathematician, she learnt from an early age to flit between writing and analysing… She learnt how to communicate and write while applying logic and analysis. I’m a bit of a fan. My time travelling self would no doubt have fallen for the most important person in the history of computer science (I suspect it might have been love in just one direction – so let’s move on, shall we). But before we do, it’s worth noting (and this is stealing from a conversation with the Met Office disrupter Mike Saunby) that working with Babbage not only led to computer science, it also led to Joseph Whitworth devising standards (without standards you wouldn’t be able to replace a light bulb, tune a TV, listen to a record, drive a car, or take a medicine)… But I digress. So what am I trying to say? It’s this: that interdisciplinary work across the humanities is not just a desirable thing, it is the life force of discovery that flows through human existence. And history continues to show us that when it happens, sparks (literally and metaphorically) will fly. Note: October the 15th is Ada Lovelace Day.

Getting back to the conference. It’s been a hard start, I won’t deny it (not least because the four people that might, just might, read this blog have been incredibly patient with my rants about the lack of people…). Today is another day altogether. Today I feel human. I feel connected. Today is different. I apologise for yesterday. No, actually I don’t. I want to shout at being subjected to a difficult two days of struggling with poorly presented academic knowledge. Today is incredible.

In a wonderful interplay between acronyms and metaphors I have been delighted with the truly interdisciplinary work of technologists and humanities scholars collaborating on meaningful human big data, expressing doubt and confidence in equal measure. A dash of humour has been thrown in, but underneath has been a rolling and rising discourse that has not just got under the meaning of digital humanities but has started to get under my skin. I want to know more.

What’s happened? What has happened is that Dr Tobias Blanke and Dr Mark Hedges of King’s College London have put together a remarkable workshop on Big Humanities.

I want the reader to understand that the humanities presented was further from my academic knowledge than the computer science has been. Yet in this set of talks the computer science is so much more vivid and exciting. The canvas of humanities enables me to understand. I don’t know about Victorian poetry texts. Yet I could immerse myself in the understanding of a subject elegantly presented as a visual narrative…

The science appears richer, more understandable, further advanced and more meaningful when presented at the heart of humanities.

Technologists have not held back on owning their subject – OCR is thrown in next to NLP and Cluster Analysis. (And we don’t want this to stop – researchers need to use their languages if they are to give passionate academic talks.) But maybe we need a guide? Some simple How, Why, What, Wows of big data computational techniques – when is Hadoop the better choice, and is it suited to real-time analysis? What is OCR – how does it differ from pattern recognition? Etc. This isn’t a barrier – it’s an opportunity.

There has been a wealth of presentations – and you’ll have to go to the workshop organisers for academic knowledge on this. But some thoughts, insights and connections that I have had today go a bit like this.

“[A]t a time when the web is simultaneously transforming the way in which people collaborate and communicate, and merging the spaces which the academic and non-academic communities inhabit, it has never been more important to consider the role which public communities – connected or otherwise – have come to play.” (Dunn & Hedges, 2012)

In the closing comments a panel of speakers came together to start to discuss themes, thoughts and what-nexts for humanities.

I liked Andrew Prescott’s reflection as a scholar: “As historians we don’t know who the user is. Is a curator a user?” My thought on this is that we don’t need a new form of ‘user centred histories’; instead we should rethink how we collaborate, and embrace the idea of historians as participants in a co-design process. Has anyone done this? Are there personas documenting typical (or atypical) historians? Are there design guidelines or ‘branding’ documents for working with history? Is this something we could look at? Would this make working across disciplines an easier thing? Or am I just being another voice in a mix of ideas that is just finding its feet?

Another clear, big difference between this workshop and the technology-focussed workshops and talks was the variety of data. And while Variety is a core theme of IEEE Big Data, the definition of variety is actually pretty narrow in terms of the talks I saw. All of the talks mentioned variety and then went on to show something that handled numerical data in a structured database. In humanities it seems the problem is more complex, potentially much harder (for machines), and crosses time and materiality to connect with everywhere humans have made their mark. In these talks it included 19th century newspapers, interviews, travelogues, transcripts, photographs, films, guidebooks, poems, private letters, journals and novels. This richness of data makes the problem of data grow exponentially into new dimensions. There was a lot of talk about language and translation – which should be a reasonably trivial problem once it goes through Google Translate? Right? Yes, but does Google Translate have a setting for 17th century vernacular? Does it have a setting for how a small community in the Lake District describe their world? And how often does language change? How many time zones do we need to encode to capture textual data? The problem was big data and now, I really don’t know what kind of data it is. But isn’t that the deal – it is at this very point of dealing with data, when your head spins and your hard drive melts, that you know you’re dealing with something that’s possibly bigger than big data?

At times the excitement in the room morphed into nervousness. Is this too big? And just like our friends in IEEE who worry about the end of Moore’s law, the humanities were asking: is this the end of theory? No, said Barry Smith, “We must be prepared for failure. As Beckett said: fail and fail better”. He then went on to remind us that “it’s not the first time we’ve had big data. It’s happened before and we must understand the future from the past”.

As someone pointed out in the audience: there is going to be a huge argument in the humanities…

And Christie Walker from the AHRC closed things off with: “It’s going to be great fun to stand back and see what happens”. That indeed it is.