GSTAR IV: Return of the GeoJSON

Following on from my Days of Archaeology in 2013, 2014 and 2015 (and for the last time), the bulk of my Day of Archaeology this year focussed on my doctoral research, writing up my thesis on Geosemantic Technologies for Archaeological Research (GSTAR). It’s been a busy three years but the project is nearing completion and will hopefully inform heritage management and research strategy over the coming years.

The aim of the project was to show how geosemantic technologies can be used to provide a framework for working with heritage data in a range of research contexts. To this end, I have built a demonstrator application which is based around a map (obvs!) for the Stonehenge landscape and which draws data from Historic Environment Records, museums and project archives, allowing users to ask questions across these diverse resources taking advantage of the semantic goodness of Linked Geospatial Data, thesauri and ontologies. Geosemantic ‘glue’ was used to integrate horizontally between resources (such as monuments and artefacts found within or nearby) and vertically (ie between excavation records and monument/event HER records and museum collection records).

The ontologies used were the CIDOC CRM, CRM-EH and GeoSPARQL, which allow the concepts used by the various sources to be aligned, whilst the terminology provided by the thesauri (published using SKOS) allows the various terms used to document these concepts to be related. In other words, the semantic tools allow for the different sources to be made interoperable and queryable with the results displayed and interacted with on a map.
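To give a flavour of how a thesaurus lets differently worded records answer the same question, here is a minimal sketch in Python. It assumes a tiny hand-written hierarchy of narrower terms; the terms, record identifiers and function names are all made up for illustration and are not taken from the actual thesauri or the GSTAR demonstrator.

```python
# Toy SKOS-style hierarchy: concept -> directly narrower concepts.
# These terms are illustrative only, not copied from any published thesaurus.
NARROWER = {
    "BARROW": ["ROUND BARROW", "LONG BARROW"],
    "ROUND BARROW": ["BOWL BARROW", "BELL BARROW"],
}


def expand(term, narrower=NARROWER):
    """Return the term together with all transitively narrower terms."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in found:
            continue
        found.add(current)
        stack.extend(narrower.get(current, []))
    return found


def matches(record_term, query_term):
    """True if a record's index term is the query term or narrower than it."""
    return record_term in expand(query_term)


# Hypothetical records from three sources, each indexed slightly differently.
records = [
    {"source": "HER", "id": "her-1", "type": "BOWL BARROW"},
    {"source": "museum", "id": "mus-7", "type": "ROUND BARROW"},
    {"source": "archive", "id": "arc-3", "type": "HENGE"},
]

hits = [r["id"] for r in records if matches(r["type"], "BARROW")]
print(hits)
```

A query for the broad term picks up records indexed with more specific terms, which is the kind of alignment the thesauri provide; a real implementation would read the hierarchy from the published SKOS data rather than a hard-coded dictionary.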

Moving forward, the approach taken and successfully demonstrated could be scaled up to act as the basis for the next generation of heritage information portals; think of the Heritage Gateway but with some additional bells and whistles:

the ability to undertake proper geospatial queries and analysis, even where there is no GIS data

spatial queries mediated using geospatial semantics, to get away from purely Cartesian views of space dependent on geometry and the problems that entails for historic information

complex querying across all of the participating providers, with differences in terminology ironed out

The demonstrator application is built using a range of standard web and geospatial technologies. Currently, the accessioning process for data is largely manual, built using the STELLAR Toolkit to process outputs from MODES and HBSMR, two major software packages used in museums and HERs respectively. A next step would be to automate this, which would be fairly straightforward from a technological if not a political perspective. If an automated pipeline could be implemented across all the HBSMR- and MODES-using institutions and organisations, this would cover an enormous amount of heritage information and, combined with a map-based portal and live feeds to desktop GIS, would greatly improve the way in which we undertake all kinds of research activities, both in academic and commercial contexts.

Information from site archives was a little tricksier, as one might expect; such data does not typically get archived in a readily useable fashion unlike information found within the structured systems used for managing Historic Environment Record data or museums collections. However, with ongoing work relating to the digital capture and sharing of fieldwork information through OASIS, HERALD and the broader Heritage Information Access Strategy (HIAS), we are undoubtedly moving towards a time when this becomes not just possible but the norm. When this happens (and note I say when not if!), we can start to extend Linked Data principles more fully to our information resources, so monument records can be directly built up from linked fieldwork records, museum collection artefact records can be layered on top of linked excavation finds records and, on top of all this, our Research Agendas and Frameworks can be truly data driven, dynamic resources drawing directly on this web of Linked Data, informing and informed by ongoing research and our shared knowledge of the past, across all of our information resources.

The use of such geosemantic ‘glue’ allows for a much more intelligent approach to finding and working with geospatial information from heterogeneous sources split across numerous providers. Take the following query for example:

Show me all the Bronze Age mounds where dolerite has been found during excavations and carved chalk balls were discovered nearby.

Using the HeritageData Periods thesaurus, it is possible to mediate different uses of language across sources to describe time-spans relating to the Bronze Age, using broader, narrower and/or related terms. We can use the FISH Event Types Thesaurus to find event records relating to interventions (including excavations) and draw on the project archives for these to check for finds of dolerite, potentially using geological ontologies such as GEON to mediate identifications of rock types. Using the FISH Object Types Thesaurus, it is possible to do the same for chalk balls or any other artefact type.

Geospatial information may well not exist for these objects as recorded in museums collections, most likely not in the form of British National Grid coordinates at least, particularly where they were discovered in antiquity. But we do often have some basic spatial information such as an associated location (eg Stonehenge), parish (eg Amesbury) or named place (eg Stonehenge Road); in such cases we can use the Ordnance Survey Linked Data plus some of the spatial relationships defined by the Simple Features specification (used by the GeoSPARQL ontology) to perform a spatial query using these index terms via a bit of geosemantic magic.

Moving forward, we can align our research questions with such resources and queries so, for example, if the dating of carved chalk balls (typically thought of as of Neolithic origin) were to change, we can use the same approach to identify contexts where such changes would have a knock on effect or where our broader understanding of deposits, sites and complexes may also need to be updated or where new research questions arise.

So this may be the end of the GSTAR project, but it’s only just the beginning for the use of such approaches within the heritage sector.
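For anyone who thinks better in code, the example query above can be sketched in a few lines of Python. This is very much a toy under stated assumptions, not the demonstrator itself: the records, identifiers, period terms and the little gazetteer of 'within' links are all hand-written for illustration, and 'nearby' is simply taken to mean 'indexed to a place within the same parish', standing in for the kind of named-place spatial relationship that GeoSPARQL would mediate.

```python
# Hypothetical period hierarchy (stand-in for a periods thesaurus).
PERIOD_NARROWER = {
    "BRONZE AGE": ["EARLY BRONZE AGE", "MIDDLE BRONZE AGE", "LATE BRONZE AGE"],
}

# Hypothetical gazetteer: named place -> the place it sits within.
WITHIN = {
    "Stonehenge Road": "Amesbury",
    "Stonehenge": "Amesbury",
    "Silbury Hill": "Avebury",
}


def periods(term):
    """The period term plus its directly narrower terms."""
    return {term, *PERIOD_NARROWER.get(term, [])}


def parish(place):
    """Follow 'within' links upwards until no larger place is recorded."""
    while place in WITHIN:
        place = WITHIN[place]
    return place


# Toy records standing in for HER, project archive and museum data.
monuments = [
    {"id": "mon-1", "type": "MOUND", "period": "EARLY BRONZE AGE", "place": "Stonehenge"},
    {"id": "mon-2", "type": "MOUND", "period": "NEOLITHIC", "place": "Silbury Hill"},
    {"id": "mon-3", "type": "MOUND", "period": "BRONZE AGE", "place": "Amesbury"},
]
excavation_finds = [
    {"monument": "mon-1", "material": "DOLERITE"},
    {"monument": "mon-3", "material": "FLINT"},
]
museum_objects = [
    {"id": "obj-1", "type": "CARVED CHALK BALL", "place": "Stonehenge Road"},
]


def bronze_age_mounds_with_dolerite_near_chalk_balls():
    with_dolerite = {f["monument"] for f in excavation_finds
                     if f["material"] == "DOLERITE"}
    ball_parishes = {parish(o["place"]) for o in museum_objects
                     if o["type"] == "CARVED CHALK BALL"}
    return [m["id"] for m in monuments
            if m["type"] == "MOUND"
            and m["period"] in periods("BRONZE AGE")
            and m["id"] in with_dolerite
            and parish(m["place"]) in ball_parishes]


print(bronze_age_mounds_with_dolerite_near_chalk_balls())
```

Note that the museum object has no coordinates at all; it is matched to the monument purely through its place name, which is the point of querying via index terms rather than geometry.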

Many thanks again to everyone who has helped, contributed and otherwise supported this research project along the way.

Wooston Castle Local Relief Model draped over a 3D Digital Terrain Model, all based on LiDAR data and available on Sketchfab

As is usual for me, my day comprises working on digital heritage projects, as in my previous Days of Archaeology (2011a, 2011b, 2012, 2013 and 2014). So no archaeological features were harmed in the making of this post!

On one current project though, my GSTAR doctoral research, I am indeed working with archaeological excavation data from the archives of Wessex Archaeology, combined with museums collections data from Wiltshire Museum and heritage inventory data from the Wiltshire Historic Environment Record. This project is nearing completion (thesis due for submission April-ish next year!). Having already shown that geospatial information can be published and used in Semantic Web / Linked Data contexts through the integration of ontologies, I’m currently building demonstrators to show how the data can then be used to undertake archaeological research, by framing fairly complex archaeological research questions as spatial queries asked across the range of resources I’ve included.

Today however, I’m working mainly on Archaeogeomancy commercial projects as I do one day a week. And thanks to the wonders of digital technologies, I’m working out of Bristol for a change; my first Day of Archaeology away from Salisbury. It’s been a busy week this week, clocking up quite a few miles, as Monday and Tuesday were spent at the Pelagios Linked Pasts event held at King’s College London where a diverse group from across the world spent a very productive couple of days talking about Linked Data with particular emphasis on people, places, space and time.

This morning’s tasks focussed on an automation project involving planning applications. I’m building a system which consumes planning data collated by Glenigan, classifies it according to type of project (as defined by the client) and then pushes out regional and property specific maps and summaries on a weekly/monthly basis for a list of properties which may be affected by these planning applications. This allows specialists in each region to assess each planning application and make recommendations regarding any responses needed. So whilst not the shiniest and most academically interesting of projects, it is the kind of GIS based systems development and automation that can really make a difference by freeing up staff time from the mundane production of such maps and reports.

This afternoon’s tasks will focus on another system I’m developing, this time to assist with the analysis and interpretation of LiDAR data. I’m building a toolkit which incorporates a select range of visualisation techniques requested by the client including Local Relief Maps, Principal Components Analysis and the usual hillshades, slope, etc. The toolkit is to be deployed to users who are not necessarily experts in the analysis and interpretation of LiDAR data or GIS so needs to be simple to use with many variables preset and also needs to be integrated within their corporate GIS solution rather than be a standalone application. The first batch of tools mentioned above are all complete and working nicely; this afternoon’s mission is to wrap up the Openness and Sky View Factor visualisations.
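As a rough illustration of the idea behind relief-style visualisations, here is a small pure-Python sketch. It is not the toolkit described above: it assumes the DTM is a plain list of lists of elevations, and it only does the basic trend-removal step (subtracting a smoothed copy of the surface from the original), whereas published Local Relief Model workflows typically involve further steps. The function names and the tiny test grid are made up for the example.

```python
def mean_filter(dtm, radius=1):
    """Smooth a grid by averaging each cell with its neighbours.

    Windows are clipped at the grid edges rather than padded.
    """
    rows, cols = len(dtm), len(dtm[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                dtm[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            smoothed[r][c] = sum(window) / len(window)
    return smoothed


def local_relief(dtm, radius=1):
    """Difference between the DTM and its smoothed version.

    Positive values sit above the local trend (banks, mounds),
    negative values sit below it (ditches, hollows).
    """
    smoothed = mean_filter(dtm, radius)
    return [
        [dtm[r][c] - smoothed[r][c] for c in range(len(dtm[0]))]
        for r in range(len(dtm))
    ]


# A flat 5 x 5 surface at 10 m with a single 1 m bump in the middle.
dtm = [[10.0] * 5 for _ in range(5)]
dtm[2][2] = 11.0

relief = local_relief(dtm)
print(relief[2][2] > 0)   # the bump stands out above the local trend
print(relief[0][0])       # flat corner, no local relief
```

Presetting something like the radius is the sort of choice that keeps a tool simple for non-specialist users: it controls the scale of feature that stands out, so a sensible default matters more than exposing every option.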

Indeed, it’s been great working with LiDAR data again lately. When thinking of a suitable image for this year’s Day of Archaeology post, the one shown above immediately leapt to mind. It shows a screenshot of the output of the Local Relief Model (LRM) tool I built draped over the Digital Terrain Model (DTM) for a rather lovely hillfort as viewed on Sketchfab. I mention this because disseminating informative views of LiDAR data has long been problematic, but platforms such as Sketchfab allow us to composite 3D and 2D products and then share them in an interactive way with anyone who has a web browser and an internet connection without the need for any specialist software at all. Nice.

Last Friday, the Day of Archaeology, was a fairly typical day involving some research and a bit of commercial work. I have a number of ongoing projects, several of which required some input last Friday. I also spent a bit of time with my latest daughter, three-week-old Florence (who has yet to show any interest in archaeology, unlike her big sister Amelia who loves ruins). One thing I rarely get to do these days is dig, my time being almost entirely filled with research, writing and other desk/computer based activities. But I still very much consider myself an archaeologist; it’s just that my tools are different. The photos I’ve used all come from my Flickr stream and are of archaeological sites, hopefully just a bit more interesting than photos of my computers…

Research

I am currently wrapping up the literature review section of my PhD and heard last Thursday that my three month review has been accepted so full steam ahead. I’ve been looking at the range of Semantic Web and Linked Data technologies out there with particular reference to archaeological and heritage applications. Within this subject area, the GSTAR project is focussing on spatial data and geosemantic techniques and builds on the preceding STAR and STELLAR projects, collaborations between the University of South Wales, English Heritage and the Archaeology Data Service.

I’ve also been working on some refinements of an ontological model, the CRM-EH, further clarifying aspects relating to the formation of archaeological features, deposits and the deposition of artefacts. Preliminary results are posted here on my blog, which I use to talk about my work in digital heritage and interesting things I come across.

Consultancy

In addition to my research, I am currently working on a number of exciting projects for clients. I have just deployed an archaeological information system to facilitate the interpretation of marine geophysics data based around Microsoft Access and Esri ArcGIS; this is currently in beta testing which gives me an opportunity to complete other projects including some tools, again built using Esri ArcGIS, to support data collation, synthesis and reporting/cartography for Desk Based Assessments (DBAs) including Environmental Impact Assessments (EIAs).

Digging, the activity which reveals archaeological features, deposits and the stratigraphic relationships between them.

Another interesting project I was working on last Friday involves the creation of a Linked Data resource relating to the recent excavations at Silbury Hill, near Avebury, Wiltshire. This site is very dear to me, having featured in my undergraduate and masters dissertations which investigated the formation of landscapes in prehistory and the spatial patterning of archaeological remains by means of movement and perception of human scale actors. This Linked Data resource relates to the later Roman activity at the site and currently comprises c.40K assertions about contexts, stratigraphy, finds and samples all held in a triple store which will be published in due course to further add to the growing number of Linked Data resources online.
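For readers unfamiliar with what 'assertions in a triple store' look like, here is a minimal sketch in Python. It assumes nothing about the actual Silbury Hill data: the identifiers and property names (such as ex:isStratigraphicallyAbove and ex:foundIn) are invented for illustration and are not CRM-EH terms, and a plain list of tuples stands in for a real triple store.

```python
# Each assertion is a (subject, predicate, object) triple.
# All identifiers and property names here are hypothetical.
triples = [
    ("context:101", "rdf:type", "ex:Context"),
    ("context:102", "rdf:type", "ex:Context"),
    ("context:101", "ex:isStratigraphicallyAbove", "context:102"),
    ("find:1", "ex:foundIn", "context:102"),
    ("find:1", "ex:material", "ceramic"),
]


def match(pattern, store=triples):
    """Return triples matching a (s, p, o) pattern; None is a wildcard."""
    return [
        t for t in store
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Which finds come from contexts directly below context 101?
below = [o for _, _, o in match(("context:101", "ex:isStratigraphicallyAbove", None))]
finds = [s for ctx in below for s, _, _ in match((None, "ex:foundIn", ctx))]
print(finds)
```

Chaining simple pattern matches like this is, in miniature, what a SPARQL query does across contexts, stratigraphy, finds and samples.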

1. Open Access Hesperia. Our journal, Hesperia, is currently housed on JSTOR. We have a Content Sharing Agreement with JSTOR, however, which allows us to share our content from beyond the 3-year moving wall. This means that in July 2012 individual readers who need to search for and download any/all Hesperia articles published from 1932-2009 will be able to do so from the ASCSA’s website for free. The PDF articles can be read on any device that can open PDFs, and they can be used without Internet access post-download. There is no DRM. I alpha-tested the behind-the-scenes upload utility yesterday with reasonable success. I need to do a batch name-change on the PDFs and then load those onto our webserver (the test links currently point to JSTOR, but this will change in July). It is my hope that I can find just over $1M with which I can endow the journal at which point I can make open access to it complete and eternal.

2. Open Bibliography on Zotero. After the LAWDI meetings, I returned to Princeton to map out what I could begin to do with the concept of linking content for the ancient world. I had briefly used Zotero to read articles posted by Tom Elliott on Twitter, but I’d never gotten into the platform as a contributor of content. Since then, I have created a Zotero group for the American School of Classical Studies at Athens in which I have now shared publicly the entire bibliography of 1,500+ Hesperia articles and about 150 (or 230+) monographs. I need to go through (and encourage others to help with this) and edit the book entries and add abstracts to earlier Hesperia articles. This will take time, but it’s a good start.

3. Linking in eBooks. June saw the publication of our latest printed monograph, Isthmia: The Roman and Byzantine Graves and Human Remains (Isthmia IX), by Joseph L. Rife. I spent yesterday and will spend today creating links in the PDF eBook. My previous attempts at linking were restricted to links between text, note, table, and image. I have done this in Isthmia IX, tedium made bearable through listening to hardcore punk, gangsta rap, and the Euro 2012 match between Germany and Italy. This is only the first step. The next is to attempt to create dynamic, outward-looking links from every bibliographic citation and every footnote to actual articles and books on the Internet. This could be insane and/or impossible, but I’m going to try. I am also going to attempt to link each inventoried object as presented on the ASCSA’s open access website for archaeological data, ascsa.net. Lastly, I’m going to try to link from places mentioned in Rife’s book to records in Pleiades. Wish me luck.

The above is what I’m doing now and in July, and I’m looking forward to sharing/linking with other archaeologists worldwide on these and future projects.