Oil metadata standards - can they help us find stuff?

Tuesday, January 17, 2017

Energistics is developing a new industry standard for metadata. Will it help people in the industry to find stuff and make sure they don't miss anything important? We interviewed Energistics CTO Jay Hollingsworth.

We're all familiar with the organisational challenge of helping people find the right piece of information at the right time. It applies to geologists trying to find all the information a company has about a certain field. It applies to police trying to keep track of people who may be potential terrorists.

One pathway forward which may be helpful is to have a standard system for labelling data and information.

But for the system to work well for experts, the labelling system would probably also need to be designed by experts.

For example, you would need to be a geologist to know how to label or tag a certain geological report, so someone who might benefit from it would be able to find it, by searching for key words.

As with the tagging system on eBay, if the labels are too broad, a search returns too many results to look through, like searching geological papers for ones about the Jurassic period. But if the labels are too narrow, there is less chance of people finding what they want. The scope needs to be just right.

The labelling system would also need to be universally used within a certain industry, because many different companies would need to share it. It wouldn't work for one company to set its own labelling standard.

To try to solve the problem, oil and gas standards organisation Energistics is setting up an international oil and gas data labelling ('metadata') standard.

More specifically, Energistics has created a special oil and gas 'profile', or limited view, of the vast International Organization for Standardization (ISO) 19115-1:2014 metadata standard.

The full name is the Energy Industry Profile of ISO 19115-1:2014 ("EIP") metadata specification. The current release is v1.1.

History

To understand how the EIP standard works, it helps to know a little about how the underlying ISO standard came to exist, says Jay Hollingsworth, chief technology officer of Energistics.

Metadata was first used in the scientific publishing world, where publishers have manually indexed technical papers for decades with key words, such as 'oil, gas, cretaceous rocks, Wytch Farm, Jurassic, BP.'

When computers came along, this information was loaded into computerised systems, together with data such as the author, publisher and publication date.

But these early computer systems could have been harder to use than previous manual tagging systems, because the various computer indexing systems were not compatible.

So in 1995, a group of US librarians developed a standard system, known as the 'Dublin Core Metadata Initiative' - with Dublin referring to the town of Dublin, Ohio, where the schema originated, and 'Core' referring to the metadata terms being able to be used for a wide range of resources.

The Dublin Core work led to the first ISO (International Organization for Standardization) standard for words which can describe any piece of data. It can be used for describing articles in a magazine, web pages, or any kind of web resource, Mr Hollingsworth says.

As a separate project, in 1990 the US government set up the 'Federal Geographic Data Committee' to help discover and transfer maps in a standard way. The committee realized that every US government agency was making maps, often the same maps over and over again. For example, every government agency was making maps of US counties by median income.

The idea was that people could reduce work if it was easier to share the information - and it would be easier to share with the general public. Also it could make it possible for government agencies to reuse work done by other agencies.

There were also other groups working on geographic metadata at the same time - including an ISO Committee and a group in Australia and New Zealand.

The geographic metadata is designed to be usable with any data set, in any computer program, not necessarily a map, he says.

For maps, the metadata might include the co-ordinate reference system chosen and who commissioned the map, in addition to the kind of metadata found in the Dublin Core standard.

This work, in the US, Australia and New Zealand, eventually came together to make ISO Standard 19115. It includes a standard way to describe the metadata, ways to encode the metadata in XML, and standard fields every metadata record must have (like a title and an author).

Profiles

ISO standards can be immensely complicated. 'They are practically legal documents, described in a really formal way,' Mr Hollingsworth says. 'It's really big and really complicated, almost no-one would be able to use the entire thing.'

So communities develop 'profiles', where one profile might use, for example, 10 per cent of the whole standard.

As an example, the Norwegian government has developed a special Norwegian 'profile' of Energistics' PRODML standard for production data reporting, which operators in Norway use to report to the government.

The profile can specify which fields are mandatory and which fields are optional. So for example if the master standard includes 150 different attributes which could be applied to a well, the 'profile' might include 10 of them and make 2 of them mandatory.

It is also possible that you would have attributes mandatory on the profile but optional on the master standard, for example, if there was some information which the Norwegian government really wanted, in the example above.
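
The idea of a profile can be sketched in a few lines of Python. This is only an illustration of the principle described above, not the EIP or PRODML schema: the attribute names, the MASTER table and the make_profile and validate helpers are all hypothetical. It shows a profile selecting a subset of a master standard and making one attribute mandatory that the master leaves optional.

```python
# Hypothetical master standard: attribute name -> is it mandatory in the master?
MASTER = {
    "well_name": True,
    "operator": False,
    "country": False,
    "spud_date": False,
    "total_depth": False,
    "status": False,
}


def make_profile(master, selected, extra_mandatory=()):
    """Build a profile as {attribute: is_mandatory} from a subset of the master."""
    unknown = [name for name in selected if name not in master]
    if unknown:
        raise ValueError(f"not in the master standard: {unknown}")
    outside = [name for name in extra_mandatory if name not in selected]
    if outside:
        raise ValueError(f"mandatory but not selected: {outside}")
    # An attribute is mandatory in the profile if the master requires it
    # or if the profile tightens it.
    return {name: master[name] or name in extra_mandatory for name in selected}


def validate(record, profile):
    """Return a list of problems; an empty list means the record conforms."""
    problems = [
        f"missing mandatory field: {name}"
        for name, required in profile.items()
        if required and name not in record
    ]
    problems += [
        f"field not in profile: {name}" for name in record if name not in profile
    ]
    return problems


# A profile using 3 of the 6 master attributes, with "country" made mandatory.
profile = make_profile(
    MASTER, ["well_name", "operator", "country"], extra_mandatory=["country"]
)
print(validate({"well_name": "A-1", "status": "producing"}, profile))
```

In this sketch a record that omits 'country' is rejected by the profile even though the master standard would accept it, which is the Norwegian-style tightening described above.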

There are a number of other profiles made for ISO 19115, including a profile for North America and a profile for fisheries data.

Oil and gas profile

The work to develop an oil and gas profile started a few years ago, with a group of professionals, mainly working with GIS (geographic information systems), who decided that a metadata standard would be useful, but not the full ISO 19115.

These professionals became Energistics' Energy Industry Metadata Work Group, originally formed by representatives from Chevron, Shell, BHP Billiton Petroleum, Arizona State Geological Survey, Esri and Gimmal Group. To determine which metadata would be most valuable to the industry, the Work Group solicited requirements and feedback from a global stakeholder community. The organizations that participated included Apache, BP, ConocoPhillips, Devon Energy, ExxonMobil, Maersk Oil, Total, and a number of technology companies, service companies, and publishers.

The community input revealed the need for metadata not in the then-current ISO 19115, so Energistics secured membership on a newly formed ISO Project Team charged with updating ISO 19115. That membership resulted in enhancements to the ISO standard itself, now ISO 19115-1, which are exploited by the EIP.

The initial focus of the project is on metadata for information with spatial co-ordinates, including maps, interpretations, modelling project data, and raw geospatial data.

A demonstration 'reference implementation' was developed for Energistics together with Esri, NOAA's Geophysical Data Center, and the University of Colorado, showing how it can work to make it easier to find information.

The objective is to make oil and gas data easy to find - based on what oil and gas people look for. Usually the first (and easiest) search is by geography, but after that people might search for information relevant to someone in their discipline for that region.

The geographic search could be by the name of a region, or by latitude and longitude.

Consider that an enormous number of papers, books, government maps and magazine articles have been published about any one region. Someone might want to search for information relevant to rock scientists without scanning through any of the text itself, just finding it through metadata.
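
The search pattern described here, geography first and discipline second, can be sketched as a filter over metadata records alone. The Record and BBox classes, the field names and the sample records below are all hypothetical and are not taken from the EIP specification. The bounding-box test is a simple overlap check that assumes no extent crosses the 180 degree meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Geographic extent in decimal degrees (hypothetical field names)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        # Two boxes overlap if they overlap in both longitude and latitude.
        return (
            self.west <= other.east and other.west <= self.east
            and self.south <= other.north and other.south <= self.north
        )


@dataclass(frozen=True)
class Record:
    """A metadata record; the document text itself is never inspected."""
    title: str
    place_names: frozenset
    extent: BBox
    keywords: frozenset


def search(records, *, place=None, area=None, keyword=None):
    """Return records matching every criterion that was supplied."""
    hits = []
    for rec in records:
        if place is not None and place.lower() not in {p.lower() for p in rec.place_names}:
            continue
        if area is not None and not rec.extent.intersects(area):
            continue
        if keyword is not None and keyword.lower() not in {k.lower() for k in rec.keywords}:
            continue
        hits.append(rec)
    return hits


# Illustrative records only; names and co-ordinates are invented.
RECORDS = [
    Record("Regional structure map", frozenset({"North Basin"}),
           BBox(2.0, 58.0, 4.0, 60.0), frozenset({"geology", "structure"})),
    Record("Seismic survey outline", frozenset({"North Basin"}),
           BBox(3.0, 59.0, 5.0, 61.0), frozenset({"geophysics"})),
    Record("Outcrop study", frozenset({"South Shelf"}),
           BBox(10.0, 40.0, 12.0, 42.0), frozenset({"geology"})),
]

# Search by latitude and longitude, then narrow by discipline.
for rec in search(RECORDS, area=BBox(3.5, 59.5, 3.6, 59.6), keyword="geology"):
    print(rec.title)
```

Searching by region name works the same way, for example search(RECORDS, place="North Basin"). How useful such a search is depends on how well the keywords were chosen when the record was indexed.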

Mr Hollingsworth suggests that technical publishers should adopt the metadata standard. It may be a good commercial proposition for them if they sell articles one by one, because it will make them easier to find.

The publishing company would need to do the indexing, and Energistics can provide advice about what sort of searches people usually make, which can help make the metadata relevant. 'Our group would supply domain expertise to them to know what to scan for,' he said.

The same standard can also be applied to data from specialist software, such as Petrel projects or an OpenWorks database. This can make it easy to find the data used to create a certain map, as well as the map itself.