What the Sumerians can teach us about data

I spent this afternoon wandering the British Museum's Mesopotamian collection, and I was struck by what the humanities graduates in charge of the displays missed. The way they told the story, the Sumerians' biggest contribution to the world was written language, but I think their greatest achievement was the invention of data.

Writing grew out of pictograms that were used to tally up objects or animals. Historians and other people who write for a living treat that as a primitive transitional use, a boring stepping-stone to the final goal of transcribing speech and transmitting stories. As a data guy, I'm fascinated by the power that being able to capture and transfer descriptions of the world must have given the Sumerians. Why did they invent data, and what can we learn from them?

First you get the data, then you get the power

The Sumerians were a nasty lot. Their idea of a fun time was wheeling a bunch of caged lions into an arena so the king and his friends could shoot them from a chariot. One of the perks of working for a king was the opportunity to drink poison and join him in his grave. They created seals and cuneiform writing as tools of power. They kept track of who owed them what, in a way that left evidence that could be used to convince a third party of the obligation. I could swear blind that you'd verbally promised me three lambs in the spring, and it would be your word against mine. With a written record of the transaction, I could convince the rest of the community that it was true. If you didn't hand over those lambs, some of them might help me stick a dagger between your ribs. Since these sorts of obligations are the foundation of any state, the earliest writing was a potent source of power.

That's still true today. Gathering data is not a neutral act; it will alter the power balance, usually in favor of the people collecting the information.

Power corrupts data

"The inscription on this stone is a statement of grants and privileges bestowed on the sun-god Shamash's temple by the Akkadian king Manishtushu (2269-2255 BC). It was actually written many centuries later. The object was clearly a forgery designed by the Sippar priesthood for their purposes."

As soon as records become vital in arguments about who gets what, people will figure out how to falsify them. The more important the outcome, the more temptation there is to fudge or fake them. Written records remove the problem of fallible memories, but replace it with a second-order question of provenance. How do you know the data accurately reflects what happened?

It's a good reminder that the map is not the territory. We still have a disturbing tendency to trust anything that's recorded, without understanding the subjective process that went into creating the record.

(Pre-)Digital Rights Management

This stone was planted in the ground to mark a property boundary, and the top section records the details of the claim. The bottom third is covered with threats of supernatural retribution against anyone who moves or alters the marker. The main way Sumerians protected the integrity of their data was through curses. This may seem laughable to a modern audience, but I don't think we're so different. Do you expect the FBI to actually raid your house if you copy that VHS tape? The warnings are a way of forcefully expressing society's norms, rather than a credible threat of punishment.

As geeks, we'll often roll our eyes at a technically ineffective mechanism for preventing the copying or alteration of data, but the longevity of useless curses should make us think twice. Violating the rules is a decision taken by a person, so sometimes hacking the human element of the process is the most effective prevention.

Reading the future with data

Many of the tablets archaeologists have recovered are elaborate instruction manuals on how to interpret omens. The idea was that you'd observe events that were happening now and use them to predict what was going to happen in the future. All the examples I saw at the Museum were obvious nonsense, using inputs like the shape of animal entrails, but what struck me was how respected they must have been despite their lack of results.

We've created science as a much more elaborate process for predicting the future from data, but in many ways that's lulled us into a false sense of security. The media prominently features 'scientific' studies showing that everything gives us cancer, thanks to our insatiable appetite for certainty and reassurance in the face of something terrifying and unpredictable. The lesson for me is that the results of any data-driven project will be accepted or ignored based on people's needs and fears. In the absence of real answers, we'll take bogus ones painted with a veneer of data, just like the Sumerians.

All data matters

We actually know more about everyday life in Sumer five millennia back than we do in Europe fifteen hundred years ago. The Sumerians recorded everything on stone or clay tablets, most of which were discarded after use with no thought for posterity. As it happened, the clay tablets proved remarkably resilient, and so archaeologists and scholars have found and decoded hundreds of thousands of them. This data exhaust gives a rich view into trade, worship, life, death, medicine and almost every other aspect of the Sumerians' world.

This is a big reason why I'm so fanatical about opening up data sources. It's great to see Twitter taking steps to archive our public conversations in the Library of Congress, but it's taken a year and they're still not finished. Even when they're done, storing the records in a single location and on a single system is a terrible long-term plan. The only approach that's proven to last centuries is wide distribution of many copies on a range of mediums. Craigslist is another bad example. It holds information that could be vital to understanding the details of our social and commercial lives in the future, data that's been on view to the public, and yet it refuses to discuss archiving any of it and actively blocks anyone who tries. If there's any way you can, please think about how to open up data you control; it's the best way to pass it on to posterity.

See for yourself

I had an amazing time in the British Museum's Mesopotamian galleries. I'd highly recommend them if you're ever in London, and they're completely free. Data was the aspect that fascinated me, but there's so much more held in the treasury of beautiful objects their scholars have collected. I guarantee you'll come away with a feeling of awe, and maybe a fresh view of the world around you too.