Guide to Interoperability

Many Arctic science organizations recognize the importance of sharing information. A data center or monitoring network can increase its visibility by hosting a web page or data catalog where end users can browse metadata, the descriptive information that makes data more discoverable and accessible.

The problem is that the number of data catalogs keeps growing, leaving the Arctic data landscape fragmented and frustrating end users who want to find relevant data easily. In this context, organizations or initiatives can showcase their efforts more successfully by releasing metadata in a form that is broadly compatible with various portals. The information then becomes visible to more users and has greater impact.

Once an organization decides to release metadata, the next hurdle is choosing an implementation path that maximizes compatibility with other information systems. Ideally, organizations will release metadata through web services – live data feeds between databases and applications – so that metadata stays up to date and comprehensive. This brief guide is an attempt to facilitate the interoperability of metadata, specifically for sets of metadata that span projects, observing sites, and datasets. It is intended for existing or potential Partners collaborating with AOV, and may be helpful as an example of successful implementation.

Why Create Web Services?

The ultimate goal is that information for multiple observing networks is discoverable, authoritative, and up to date, with due credit given to data sources. The information should also be accessible to various groups, to use in a variety of ways for their own purposes.

In essence, what is needed is a dynamic network of distributed nodes for information sharing. This in turn relies on establishment of web services – live data feeds that conform to community-based metadata standards and compatible web service formats. Without interoperable web services, information becomes out-of-date, or requires repeated, substantial harmonizing and reprocessing. The Arctic data community is making progress on this front, notably through ADIwg, the IARPC ADCT, the IASC/SAON ADC, and other efforts or initiatives. The AOV Team is assisting with this planning.

Why Consider the Project-Data Life Cycle?

Dynamically sharing information is becoming commonplace for dataset-level metadata, with compatible web services enabling quick federated searches across multiple data catalogs, for example. Also important is sharing project-level metadata – high-order information such as title, funding agency, project location, etc. for project tracking, program coordination, or for other purposes. And the piece in the middle that can tie it all together is site-level metadata, with details such as observation type, various keywords, and the precise location of monitoring assets such as boreholes, towers, etc., which can be useful for science planning and more. Sharing machine-readable metadata across these three levels helps to integrate information systems and science itself from funding through observation to results.

Three levels of information with metadata services linking them together.

The linkages embedded in this spectrum of the project-data life cycle are web services that are compatible as well as hierarchical. Linking across these levels assists with Arctic science goals for data discovery and access, logistical coordination, collaboration, and science planning.

Create a Web Service? Or Just Email It?

If your organization wants to share metadata with AOV or anyone else, you can simply provide a static handover. For example, with AOV you can fill out an online form or populate a spreadsheet (see the Collaborate page). Just be aware that the information will become stale and outdated with time, requiring repeated and laborious reprocessing. For these reasons and more you may decide to host a web service, which is becoming an easier task with many database management systems.

FGDC, ISO, or What?

True, there is a bewildering array of metadata standards to choose from. What follows are use cases exemplified by AOV and ARMAP, drawing from community-based standards and best practices as much as possible, with an eye toward interoperability.

Federal and state agencies in the U.S. for many years have commonly adopted the Content Standard for Digital Geospatial Metadata (CSDGM) established by the Federal Geographic Data Committee (FGDC). AOV and our sister application, ARMAP, continue to ingest and host web services based on FGDC. Specifically, our implementation of FGDC follows the 2011 release of a Project Metadata Standard by ADIwg, which blends FGDC with KML for the spatial domain.

However, recently we’ve moved to geospatial metadata standards established by the International Organization for Standardization (ISO), which offer benefits for information exchange at the international scale, are becoming more broadly adopted in the U.S., and offer specific advantages for spanning the project-data life cycle (see below). The databases and services underlying AOV and ARMAP take advantage of the ISO 19115-1 conceptual model and the new ISO 19115-3 XML implementation. Our use of ISO differs slightly from – but supports core fields adopted by – the ADIwg Project Metadata Profile for ISO 19115-2.

Why Hierarchical?

One advantage of the ISO 19115-3 metadata standard is that it allows the interconnected levels of information to be combined effectively. Metadata about scientific datasets can be nested within metadata about the observing site, which in turn is nested within an enveloping metadata record for the project. This hierarchical approach is what makes it possible to link across the project-data life cycle.

This approach is also flexible, scalable, and efficient, because the nested pieces of metadata can be embedded dynamically with “XLinks” – external URLs embedded in XML tags – so that projects, sites, and datasets are pulled together with information from different, appropriate sources (in a distributed system of interoperable nodes). For example, in a metadata record for a recent ITEX study funded by NSF AON, the project-level metadata record is in a service hosted by ARMAP but links to information about associated sites hosted by AOV, which in turn links to information about corresponding datasets dynamically served by the NSF Arctic Data Center.

In actuality, the site-level metadata piece is composed of two separate services. Each project-level metadata record links to a list of observing sites, which includes links to individual site-level metadata records. This is easier to follow by digging into the XML as provided in the next section.
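The chain described above (project record, then site list, then individual site records, then datasets) can be sketched in Python as a small XLink walker. This is only an illustration under stated assumptions: the element names, the example.org URLs, and the in-memory FAKE_WEB dictionary are hypothetical stand-ins, not the actual ISO 19115-3 tags or the AOV, ARMAP, or Arctic Data Center services. Only the XLink namespace URI is the standard W3C one.

```python
import xml.etree.ElementTree as ET

# Standard W3C XLink namespace; ElementTree exposes attributes as "{namespace}name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
NS = 'xmlns:xlink="http://www.w3.org/1999/xlink"'

# Hypothetical stand-in for four separate web services (no network access needed).
FAKE_WEB = {
    "https://example.org/project/1":
        f'<project {NS}><siteList xlink:href="https://example.org/project/1/sites"/></project>',
    "https://example.org/project/1/sites":
        f'<siteList {NS}><site xlink:href="https://example.org/site/7"/></siteList>',
    "https://example.org/site/7":
        f'<site {NS}><dataset xlink:href="https://example.org/dataset/42"/></site>',
    "https://example.org/dataset/42":
        "<dataset><title>Hypothetical soil temperature series</title></dataset>",
}


def follow_xlinks(url, fetch, max_depth=5):
    """Walk xlink:href references depth-first and return (depth, url) pairs."""
    visited = []

    def walk(current, depth):
        # Stop on overly deep chains and on URLs we have already seen.
        if depth > max_depth or any(current == seen for _, seen in visited):
            return
        visited.append((depth, current))
        root = ET.fromstring(fetch(current))
        for element in root.iter():
            href = element.attrib.get(XLINK_HREF)
            if href:
                walk(href, depth + 1)

    walk(url, 0)
    return visited


if __name__ == "__main__":
    for depth, url in follow_xlinks("https://example.org/project/1", FAKE_WEB.__getitem__):
        print("  " * depth + url)
```

In a real setting the fetch argument would be replaced by a function that performs an HTTP GET against each service; passing it in as a parameter keeps the traversal logic independent of where each record is hosted.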

Implementation Examples

The metadata web services inherent to AOV and ARMAP are illustrated with ISO XML links in the table below. The template XML files are embedded with explanatory text, whereas the use case XML files are from live services for an NSF-funded AON project. Together they can assist with generating a workflow.

Additional templates will be made available when possible. The templates and use cases above were last updated on May 21, 2018.

Which Fields to Use?

Metadata records for projects, observing sites, and scientific datasets can each include a multitude of descriptive fields, or tags. It can be helpful to identify a minimum set of core fields while designing or maintaining databases and services. Indeed, most important for interoperability in general is the ability to "crosswalk" fields with compatible definitions. Existing and potential Partners are advised to peruse:

Note also that some of the fields in the AOV Database are constrained to codelists established by ISO, ADIwg, and GCMD or to informal "pick lists" customized for AOV. For more on this, see the contributors' template spreadsheet.
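As a rough illustration of what a crosswalk and a pick-list check might look like in practice, here is a minimal Python sketch. Every field name, the mapping itself, and the list of observation types are hypothetical placeholders chosen for the example; they are not the AOV core fields or the ISO, ADIwg, or GCMD codelists.

```python
# Hypothetical mapping from one partner's local column names to shared core fields.
CROSSWALK = {
    "proj_title": "title",
    "funder": "fundingAgency",
    "lat_dd": "latitude",
    "lon_dd": "longitude",
    "obs_type": "observationType",
}

# Hypothetical pick list standing in for a controlled codelist.
OBSERVATION_TYPES = {"soil temperature", "air temperature", "snow depth"}


def crosswalk_record(local_record):
    """Rename local fields to core fields; return (mapped_record, problems)."""
    mapped = {}
    problems = []
    for name, value in local_record.items():
        target = CROSSWALK.get(name)
        if target is None:
            # Fields without a compatible definition are reported, not silently dropped.
            problems.append(f"no crosswalk for field '{name}'")
            continue
        mapped[target] = value
    observation = mapped.get("observationType")
    if observation is not None and observation not in OBSERVATION_TYPES:
        problems.append(f"'{observation}' is not in the observation type pick list")
    return mapped, problems


if __name__ == "__main__":
    record = {"proj_title": "Example project", "obs_type": "soil temperature", "notes": "draft"}
    print(crosswalk_record(record))
```

Returning the list of problems alongside the mapped record makes it easy to see which local fields still need a compatible definition before a record is shared.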

Why RESTful?

Largely as an aside, note that the metadata services hosted by AOV and ARMAP are “RESTful”, meaning they conform to a standardized form of dynamic interchange between database management systems and applications. In a bit more detail, the services follow the Representational State Transfer (REST) architectural style, in which clients query servers over HTTP. REST has many advantages, given that it is simple, scalable, reliable, and – in particular – fast. An additional advantage is the ability to establish permanent URLs for sets or subsets of records.
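To illustrate the idea of a permanent URL for a subset of records, the following Python sketch builds a query URL from a set of filters. The base URL and the filter names are assumptions made up for this example and do not describe the actual AOV or ARMAP endpoints. Sorting the filters is a design choice of the sketch, so that the same subset always yields the same URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; replace with the address of a real service.
BASE_URL = "https://example.org/api/sites"


def subset_url(**filters):
    """Build a stable URL for the subset of records matching the given filters."""
    # Sorting makes the URL independent of the order in which filters are given.
    query = urlencode(sorted(filters.items()))
    return f"{BASE_URL}?{query}" if query else BASE_URL


if __name__ == "__main__":
    url = subset_url(network="ITEX", format="xml")
    print(url)
    # The filters can be recovered from the URL by any client that receives it.
    print(parse_qs(urlsplit(url).query))
```

Because the URL fully describes the request, it can be bookmarked, cited, or embedded in another metadata record, and a client can fetch it later with an ordinary HTTP GET.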

Why Use a Geohash?

When establishing a database management system for site-level metadata, it can be helpful to create unique identifiers that incorporate geographic location explicitly. For this reason, each ISO component XML file for AOV has embedded in its filename a geohash, an encoding system for geospatial coordinates. For example, the characters “bs8tstbehhw9” in the XML file named “SoilTemperatureProbe_bs8tstbehhw9_ITEXSoilTemperatureProbeBarrowWetSiteCBWCT2_9549.xml” are derived from latitude and longitude. Advantages of using geohashes include the ability to uniquely identify a site in part by its geographic location, to represent location as a single alphanumeric string, to accommodate varying degrees of precision, and to support visualization. For more information, see geohash.org. To generate geohashes, AOV uses a Python script on GitHub.
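For readers who want to see how a geohash relates to coordinates, here is a minimal, self-contained Python sketch of the standard geohash algorithm. It is not the script that AOV uses. The function names are made up for this example, and the filename helper simply assumes that the geohash is the second underscore-separated part of the name, as in the example above.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode latitude and longitude (decimal degrees) as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def geohash_decode(geohash):
    """Return the (lat, lon) center of the cell that a geohash identifies."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    use_lon = True
    for char in geohash:
        index = BASE32.index(char)
        for shift in range(4, -1, -1):
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (index >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


def geohash_from_filename(filename):
    """Assumes the Type_geohash_Name_id.xml pattern seen in the example filename."""
    return filename.split("_")[1]


if __name__ == "__main__":
    name = "SoilTemperatureProbe_bs8tstbehhw9_ITEXSoilTemperatureProbeBarrowWetSiteCBWCT2_9549.xml"
    print(geohash_decode(geohash_from_filename(name)))
    print(geohash_encode(42.605, -5.603, precision=5))  # ezs42
```

Shortening a geohash from the right gives a larger cell that contains the original point, which is how a single string accommodates varying degrees of precision.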

In Closing

It's really a matter of deciding your overall goals, and then finding a solution that meets them with enough foresight that you won't have to commit more effort sooner than you think. Please let us know if you have any questions. The ultimate goal is for end users and decision makers to easily find and access comprehensive information to guide the planning, completion, and impact of Arctic monitoring activities.