I’m going to start by asking you to contemplate the unimaginable – the age-old fantasy of the total archive, the comprehensive record of humanity’s time on earth. Picture everything that the human mind has ever created laid out before you – every song and every story, every image, inscription, and invention – in chronological order from the dawn of humankind to the present day.

This fantastical archive is the record of the passage of history: the evolution of cultures and countries, politics and economics, science and scholarship, laws and literature. Not only the major figures, stories and works with which we are more familiar, but a kaleidoscopically rich reflection of the world and its inhabitants in all of their contingency and particularity, captured in images, documents, descriptions, and artefacts as though frozen in amber.

Somewhere amidst this colossal conglomerate we should be able to draw a line. On one side of the line – the side closer to us – everything is owned and protected by default. Here rights are reserved by creators and we must either pay, or ask permission to access or use what we find. We must be careful what we touch and how we tread, lest we fall foul of the law. On the other side of the line is a public commons that is free and open for everyone, at least in principle. Here everyone can enjoy, use, and share the fruits of our collective intellectual labour. All our creations eventually end up here, after their temporary legal protections have expired.

The system we have for governing the use of our intellectual creations is supposed to provide a balance between private protection, compensation and incentivisation on the one hand, and broader public access on the other. Today we shall focus not on whether or not we have the right balance (although this is a very important discussion for another day), but on how we might go about establishing the contours of that part of our collective record which is free for everyone: on how we can map the public domain and create a stronger and more positive conception of it as an indispensable public good.

To begin our inquiry we must first depart from this idea of the total archive, and look to the world. The perfect archive of our fantasy is dismantled and scattered throughout space and time, its artefacts dispersed like the people of Babel: across countries, public institutions, private collections, and so forth. Furthermore, there is no single line demarcating the distinction between that which is protected and that which is in the public domain. Rather, there are as many different lines as there are different national copyright laws, and what is in the public domain depends on which bit of the earth you are standing on.

So in order to trace the contours of the public domain we have two main tasks before us. The first is to create an inventory of all the works in the world, or at least those which are notable by some criteria or likely to be of interest to us. The second is to model how copyright and related rights apply to these works in different countries. Let us look at each of these things one at a time.

Firstly, a list of all the works that have been created. If we could list the holdings of every library, archive, museum and gallery in the world, surely this would be a good start: a kind of super-catalogue of the cultural and memory institutions in each and every country. We will also need to augment this with other pieces of information relevant to assessing the copyright status of a given work in a given country – which might include, for example, biographical information about creators (such as their nationality and birth and death dates). To do these things we will need to be able to use and combine information from different sources.

When we first started working in this area it was not easy to come by information about works – or ‘metadata’ – from cultural institutions. In the beginning we had to patch together what we could from databases that were donated to us. For example, someone at the BBC gave us a database of information about sound recordings. A book collector gave us a large database of information about authors that he had manually compiled. But many big cultural institutions said they could not give out their databases for copyright reasons – often as they used or depended on information from third party commercial providers.

Now, over half a decade on, the situation is much better than we could have anticipated. Many major institutions have now opened up their metadata – including the British Library, the Rijksmuseum, and many others. Just last week the Tate in the UK opened up their database of information about their works. Thanks to digital federation initiatives like Europeana and the Digital Public Library of America, we now have vast caches of cultural metadata available for use under the Creative Commons CC0 license, which permits anyone to reuse, build on, and connect it to other data sources. The Open Knowledge Foundation’s OpenGLAM initiative, which I founded just over two years ago, is now flourishing – and we have a growing network of advocates working to open up cultural metadata in institutions around the world.

So that is one part of the equation: the inventory of cultural works. The other part is using this and other information to make an estimation about whether or not a given work is likely to be in the public domain, using what we know about the state of the law in countries around the world. The dream is to have a two-light system for reuse: a green light and a red light. Or perhaps even green, red and yellow if there are cases which are uncertain. However, it is often a bit more complicated than this.

To use a metaphor from the philosopher Ludwig Wittgenstein, the law is a bit like an ancient city, which has developed in response to the manifold contingent demands and concerns of its users. Some bits are very old, others very new. Bits of it have been repurposed, reworked, and remodelled. As a result it is not always as clear cut or straightforward as we might like.

In any case, our aim is clear: we want to model the most crucial bits of copyright law and related rights that are relevant to making an informed estimate as to whether or not a given work is still in copyright or whether it has entered the public domain in a given country. We started off doing very basic models for this in the UK. We also worked with the late Aaron Swartz, who was then at the Internet Archive in San Francisco and interested in mapping which works are in the public domain in the US. We have gone on to work with a global network of lawyers and legal experts to map copyright law in countries around the world, producing flow diagrams to show what questions you must answer in order to establish the copyright status of a work. Europeana then took up the mantle and built on our work to produce flow diagrams for 30 European countries.

We can then use these diagrams to generate computer software – or “Public Domain Calculators” – which combine an algorithmic rendering of these legal models with information about works in order to estimate which ones are in the public domain in a given country. These calculators can then be used on institutional websites, by aggregators, and by anyone else who is interested in knowing which items in a given collection are out of copyright. The dream is to be able to do this for every work in the world, in every country in the world – so that people will know what is in the public domain, and will be able to have a sense of the lay of the land, the contours of the cultural commons wherever they are.
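To give a rough sense of what such a calculator might look like, here is a deliberately minimal sketch in Python. It assumes a single, much simplified rule: protection lasts for a fixed number of years after the author's death and expires at the end of that calendar year. Real flow diagrams ask many more questions (type of work, place and date of publication, anonymous or corporate authorship, and so on). The names `Work`, `estimate_status` and `term_years` are hypothetical and chosen purely for illustration, and the term you pass in is an input to be looked up for the country in question, not a statement of any country's law.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    """Minimal metadata about a work (hypothetical structure)."""
    title: str
    author_death_year: Optional[int]  # None when the death year is unknown


def estimate_status(work: Work, term_years: int, current_year: int) -> str:
    """Return 'green' (likely public domain), 'red' (likely protected)
    or 'yellow' (uncertain) under a simplified life-plus-term rule.

    Assumption: protection runs until the end of the calendar year that
    falls `term_years` after the author's death.
    """
    if work.author_death_year is None:
        # Without a death date the simplified rule cannot be applied.
        return "yellow"
    expiry_year = work.author_death_year + term_years
    # The work is treated as free from 1 January of the following year.
    return "green" if current_year > expiry_year else "red"


# Illustrative records with made-up values, not real catalogue data.
old_work = Work("Example pamphlet", author_death_year=1900)
recent_work = Work("Example novel", author_death_year=2000)
unknown_work = Work("Example sketch", author_death_year=None)

for w in (old_work, recent_work, unknown_work):
    print(w.title, estimate_status(w, term_years=70, current_year=2013))
```

The yellow light matters as much as the other two: when the metadata is incomplete, the honest answer is "uncertain" rather than a guess, which is one reason why enriching the inventory of works with biographical information is so important.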

The bit that I find most exciting is what happens next: how this information about the public domain might be used by people to inform digitisation, curation, translation, transcription, publishing, research and other cultural initiatives. Imagine being able to see for a given person, topic or genre which public domain works are freely available in digital form and where the gaps are. We could trace the history of different reproductions to see which works were most widely distributed in which different periods, and which ones were less well known. We could use this to get an impression of the cultural universes of different people at different points in history – a kind of time machine that would be able to give a sense of the texts and contexts that were around in a particular place at a particular point in time. We would be able to discover, explore and republish less well known works: whether literary fragments or sketches, diagrams or diaries, images, recordings or film.

In other words, what we are talking about is a global, collaborative, distributed curatorial project to map the cultural commons and to enable everyone to access, use and benefit from it. Unlike the arrangement of items in a physical library or archive, which generally only affords one mode of classification and presentation, this virtual digital commons would afford limitless slices, lenses, annotations, sub-collections and selections – customised and catered for different audiences and interests. Imagine the potential for sifting, sorting, exploring, representing, and re-imagining the vast constellation of artefacts that constitutes our collective history – as we go from the linear sequences of words on a page or rooms in a gallery, to complex, responsive, multi-dimensional constructions of the kind we are still only just beginning to imagine – from new forms of scholarly collaboration assisted by digital technology, to interactive multimedia projects and installations that enable us to explore our cultural past.

Finally, I think it is worth noting that this is not just of historical interest. We humans are constituted by this constellation of words, objects, institutions, and practices – from the languages we speak, to the way we conduct ourselves in the world, to the stories, values and the musical and visual forms through which we express ourselves. When we create or communicate, we are very often enacting, rearticulating or building upon the phrases, expressions and patterns of others – thoughts, elements and values which have been passed down to us and shaped by countless people before us. As the philosopher Walter Benjamin wrote in his Passagenwerk about the city of Paris, “each epoch dreams the one to follow” (“chaque époque rêve la suivante”). In exploring the dreams of past generations we find out about ourselves and the world we live in now. Finding new ways to understand, critique and relate to that which came before us can furnish us with valuable material to assist us in recomposing our collective future.