EPN-TAP

EPN-TAP is a VO access protocol dedicated to Planetary Science data. It is based on the TAP mechanism from IVOA, complemented by sets of parameters and associated lists of values. In this regard, it is similar to ObsTAP but with a different scope.

EPN-TAP was described in its first version in Erard et al., Astronomy and Computing 7, p. 52-61 (2014). Version 1 was designed as part of the IDIS activity in Europlanet-RI (FP7).

EPN-TAP version 2 is a major update of the protocol to accommodate larger services and simplify setup and use of data services. All parameters are described here.

New in v2

The "laundry list" format makes services easier to design and to query

Allows grouping of results from several services at once

Supports multispacecraft observations

Speeds up mirroring of services (support for partial updates)

Better support of footprints, and better interface with GIS

Main evolutions relative to v1

1) The previous notion of dataset is deprecated

This was complex to handle in the database, and in general not relevant for the services.

2) Grouping of products is rationalized in version 2

A granule is still a record/line in the epn_core view database, and corresponds to the smallest data unit described by the service.

A "product" is typically a data file, or a service output, that can be reached through a URL.

In version 2 both concepts coincide, while in v1 a single granule could be composed of several products related to the same initial data (an "observation" in v2).

In EPN-TAP v2, granules are referenced by 3 parameters:

granule_uid provides a unique ID for the granule in the service, i.e., for each line in the epn_core view. It is equivalent to the previous index parameter (index is a reserved word in many database languages and should not have been used in the first place).

granule_gid is related to a type of product: it is identical for all granules containing the same type of information for different observations (e.g., calibrated files). An explicit string is recommended in this field, with suggested standard values (e.g., preview/native/calibrated/geometry/projection…).

obs_id is related to an observation: it is identical for all granules related to the same observation that contain different types of data (e.g., raw and calibrated data, associated geometry, etc.). In many EPN-TAP v1 services, such products were described together on the same line with a unique index parameter.

These 3 parameters can be arbitrary alphanumeric strings — see example application to APIS service below.

In practice, different products related to the same observation are no longer described together on a single line of the epn_core view, but on successive lines associated by the same obs_id, each with a different granule_gid (and a specific granule_uid, see Table 2 below).

Each line in the epn_core view must describe only one product (plus a thumbnail wherever relevant). The notion of "main product" (which was more or less explicit in v1) is therefore deprecated, and the epn_core view in v2 includes more lines than in v1. Although less compact than the previous table presentation, this list presentation is much more efficient for machine-handling, and easier to design.
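As an illustration of the list presentation, the sketch below groups rows of an epn_core view by obs_id and indexes the products of each observation by granule_gid. Only the three parameter names come from EPN-TAP v2; the row values and the helper name group_by_observation are made up for the example.

```python
from collections import defaultdict

# Hypothetical epn_core rows, one product per line (values are illustrative only).
rows = [
    {"granule_uid": "obs1_raw", "granule_gid": "native", "obs_id": "obs1"},
    {"granule_uid": "obs1_cal", "granule_gid": "calibrated", "obs_id": "obs1"},
    {"granule_uid": "obs1_geo", "granule_gid": "geometry", "obs_id": "obs1"},
    {"granule_uid": "obs2_cal", "granule_gid": "calibrated", "obs_id": "obs2"},
]


def group_by_observation(rows):
    """Return {obs_id: {granule_gid: granule_uid}} for a list of epn_core rows."""
    uids = [r["granule_uid"] for r in rows]
    # granule_uid identifies one line of the view, so duplicates are an error.
    if len(uids) != len(set(uids)):
        raise ValueError("granule_uid must be unique within the service")
    observations = defaultdict(dict)
    for r in rows:
        observations[r["obs_id"]][r["granule_gid"]] = r["granule_uid"]
    return dict(observations)


observations = group_by_observation(rows)
```

Here observations["obs1"] lists the native, calibrated and geometry granules of the same observation, which in v1 would typically have shared a single line.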

3) The notion of table/service parameters is deprecated

Such parameters were available in the registry only, and were not directly accessible by TAP. In v2, constant parameters must be replicated in every line of the epn_core view. Such parameters may be duplicated in the registry declaration though, so as to provide a fast description of services.

4) Footprints can be provided through s_region

s_region is a parameter in the ObsCore standard of IVOA, and ADQL allows for powerful query functions such as intersections or inclusions. s_region introduces a pgSphere variable of type spoly providing description of the observed area on a sphere, which allows for accurate searches.

This can be used to provide footprints of spatially extended data either on the sky or on a planetary surface (as long as it is reasonably ellipsoidal). Use with 3D shape models needs study (this may fail with concave 3D shapes such as 67P). Also need to study how concave footprints/polygons are handled, TBC. The use in the context of GIS also has to be studied.
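A footprint search on s_region could look like the sketch below, which only builds the ADQL query string. INTERSECTS and CIRCLE are standard ADQL geometry functions; the helper name footprint_query, the selected columns, and the empty coordinate-system argument are assumptions made for the example, and a given service may expect something different.

```python
def footprint_query(table, lon_deg, lat_deg, radius_deg):
    """Build an ADQL query selecting granules whose s_region intersects a circle.

    Assumption: the coordinate-system argument of CIRCLE is left empty.
    """
    for value in (lon_deg, lat_deg, radius_deg):
        # Only plain numbers are interpolated into the query text.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("coordinates and radius must be numbers")
    return (
        "SELECT granule_uid, obs_id, s_region "
        f"FROM {table} "
        "WHERE 1 = INTERSECTS(s_region, "
        f"CIRCLE('', {lon_deg}, {lat_deg}, {radius_deg}))"
    )


query = footprint_query("vvex.epn_core", 10.0, -5.0, 2.0)
```

The resulting string can be submitted through any TAP client. Whether the comparison behaves correctly for concave footprints is one of the open points listed above.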

The C1/C2 min/max parameters can still be used to provide a bounding box of the observation.
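A minimal sketch of deriving such a bounding box from the vertices of a footprint is given below. It assumes the vertices are (longitude, latitude) pairs in degrees and simply takes the extreme values, so it is wrong for footprints that cross the 0/360 meridian or contain a pole, and it ignores the curvature of the polygon edges. The function name bounding_box and the sample vertices are illustrative.

```python
def bounding_box(vertices):
    """Return (c1min, c1max, c2min, c2max) from (longitude, latitude) vertices.

    Naive min/max over the vertices: no handling of meridian crossing or poles.
    """
    if not vertices:
        raise ValueError("at least one vertex is required")
    lons = [lon for lon, _ in vertices]
    lats = [lat for _, lat in vertices]
    return min(lons), max(lons), min(lats), max(lats)


# Illustrative footprint, in degrees.
footprint = [(10.0, -5.0), (14.0, -4.0), (13.0, 2.0), (9.5, 1.0)]
c1min, c1max, c2min, c2max = bounding_box(footprint)
```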

5) Better support of evolving services

creation_date and modification_date are now mandatory parameters for every granule. The latter is intended to optimize mirroring of services, by identifying the granules to be updated/copied.

These parameters must be provided as ISO 8601 strings with format "2013-11-17T10:41:00.00+01:00" (with no space) where the indication of time zone is mandatory. The associated data type must be "timestamp". ADQL supports this format, and filtering based on time zone is possible.
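The sketch below shows how a mirroring script might use modification_date. It parses the timestamps with their mandatory time zone and keeps the granules modified after the last mirroring date. The row values and the function names parse_stamp and granules_to_update are assumptions for the example, not part of the protocol.

```python
from datetime import datetime

# Matches strings such as "2013-11-17T10:41:00.00+01:00"; %z makes the zone mandatory.
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_stamp(text):
    """Parse a timestamp; raises ValueError if the time zone is missing."""
    return datetime.strptime(text, STAMP_FORMAT)


def granules_to_update(rows, last_mirror):
    """Return the granule_uid of every row modified after last_mirror."""
    since = parse_stamp(last_mirror)
    return [
        r["granule_uid"]
        for r in rows
        if parse_stamp(r["modification_date"]) > since
    ]


# Hypothetical rows (illustrative values only).
rows = [
    {"granule_uid": "g1", "modification_date": "2013-11-17T10:41:00.00+01:00"},
    {"granule_uid": "g2", "modification_date": "2013-11-17T10:00:00.00+00:00"},
    {"granule_uid": "g3", "modification_date": "2013-11-17T11:00:00.00+01:00"},
]
to_update = granules_to_update(rows, "2013-11-17T10:45:00.00+01:00")
```

Because the comparison is made on time-zone-aware values, g2 is selected even though its local clock reading (10:00) is earlier than that of the mirroring date (10:45).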

6) Support of coordinated observations

The target_time_min/max parameters now provide observation time in the reference frame of the target. This is intended to facilitate the cross-correlation of observations from different locations, e.g., telescopic observations in support of space missions, or multi-spacecraft campaigns.

7) Axes ranges

All parameters defining a range are now introduced with a min and max value. All floating point parameters are now in double precision to prevent errors.

8) Thematic extensions

Some science fields will require optional parameters, which need to be used consistently between services addressing the same field. Such extensions have to be designed by sub-groups involved in the corresponding data services, either as providers or users. This includes:

Bibcode preferred if available (does that include link?), doi, or other biblio id, URL… UCD: meta.bib, or meta.bib.bibcode (if bibcode). ObsCore: meta.bib.bibcode (always as bibcode), Utype obscore:Curation.reference.

internal_reference (Text): list of granule_uid(s) in the current service. UCD: meta.id.cross. Use to link one granule to a set of other granules. Only to solve situations that would otherwise require several tables.

external_link (Text, url): link to a web page providing more details on the granule. UCD: meta.ref.url. Link to an individual page in a web site associated to the database, e.g., a planet page in Exoplanets service. This is a way to provide extra granule information which cannot be accommodated in the table.

Other modifications

Multivalued lists, with elements separated by #, can be parsed by the ADQL/RegTAP function ivo_hashlist_has like this:

select * from vvex.epn_core where 1 = ivo_hashlist_has(lower(target_name),'Venus')

The lower function is mandatory to handle values that may contain upper-case characters (lower-casing is implicit on the 2nd argument).

Beware that only complete elements between separators will be found. The provider has to split the string according to expected searches, e.g.: "Composite Infrared Spectrometer#CIRS", not "Composite Infrared Spectrometer (CIRS)".

Parameters supporting multivalued lists include:

dataproduct_type (this one is best avoided when possible)

target_name

alt_target_name

target_class

instrument_host_name

instrument_name

measurement_type

bib_reference

processing_level
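The matching rule for multivalued lists can be mimicked in a few lines. This is only a model of the behavior described above (complete elements between # separators, compared without regard to case), not the RegTAP implementation; the function name hashlist_has is hypothetical.

```python
def hashlist_has(hashlist, item):
    """Return 1 if item equals one complete '#'-separated element, else 0.

    Assumption: both sides are compared in lower case.
    """
    elements = [element.lower() for element in hashlist.split("#")]
    return 1 if item.lower() in elements else 0


# A value split by the provider according to expected searches is found...
found_split = hashlist_has("Composite Infrared Spectrometer#CIRS", "cirs")
# ...while the same acronym embedded in a single element is not.
found_embedded = hashlist_has("Composite Infrared Spectrometer (CIRS)", "cirs")
```

With these definitions found_split is 1 and found_embedded is 0, which is why the provider has to anticipate the search terms when building the list.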

• Default values (to be reviewed): NULL/void will never return an answer to a query using this parameter (TBC, seems ADQL-related; to be corrected if it is a limitation of the client). For float / double: -inf for *_min, +inf for *_max; still TBC (NaN won't do), to be tested on a real case. Not needed for strings? (i.e., NULL/void is OK?) Default behavior looks OK in fact. If we need to preserve the NULL values on queried parameters (i.e., when NULL is considered as "I don't know"), this can be done with OR param IS NULL (as an option in the client? This is feasible manually on TapHandle).

• UCDs: to be reviewed against PDS4 and IPDA, and completed (ObsCore checked)

• Processing levels: to be reviewed against PDS4 (again; mostly related to geometry, which is considered as derived in PDS4)

• min vs max: if only one value available, it must appear in both fields

• Optional parameters: they come in sets that are logically related; if one is present, the related ones must be present also (e.g., 3 access_* parameters)

• granule_gid: any general indication to providers? I.e., preview, native, calibrated, geometry… A client should be able to display the values present in a service, TBC

If access_format = "PDS3label" (it has to be here: we need to know that there is a detached label), then access_url points to the label, and data_access_url points to the file (this parameter is mandatory in this case: although the data file name is in the label, it can be in another directory). A script can then recover both files and do something with them. This mechanism can be extended to other formats with detached labels (ENVI…). The solution with datalink seems OK: detached labels are provided under datalink_url in a table, although no attempt has been made to read them from the portal yet.

• Utypes

Need to clean up current doc (2.0). Utypes correspond to DM fields. They are supposedly used to identify the meaning of parameters and help, e.g., tools to grab required quantities. This will not work in some areas though, e.g., with spectral tools, as they currently use UCD instead of Utype for this purpose (not many tools appear to actually rely on Utype in fact). See discussion here for usage (a bit old?): http://www.ivoa.net/documents/Notes/UTypesUsage/20130213/NOTE-utypes-usage-1.0-20130213.html

To handle this in practice:

Associate each parameter to a specific Utype in EPNCore - all names need to start with the epncore: prefix/namespace.

Then map epncore Utype to other DM (find equivalent parameter, or trace back the original templates of EPNCore parameters - often from ObsCore)

Reuse Utype from other DM each time it makes sense. TBC: the epncore: namespace is still required (even when using Utypes from other DM). This allows tools to handle EPNCore parameters like existing parameters from other DM, i.e., with no specific implementation. The problem is that small differences in the use of parameters may preclude reusing the same Utype (TBC: does that apply to units also?)

Cross our fingers: known Utype (from other DMs) may be usable in existing tools (e.g. a tool supporting Provenance would grab equivalent info in EPN-TAP services transparently) Unclear if the use of the namespace makes it more complicated in tools.

About ADQL and S_REGION. My first try with PgSphere in DaCHS is not very promising. I could create a fov in celestial coordinates that is seen as an s_region in the EPNcore table, but the PgSphere capabilities are rather limited. For instance, it is not possible to get the East/West-most longitudes or the North/South-most latitudes of a polygon. Such a function would help a lot for EPN-TAP: after building an s_region, we could extract right away the c1min/max c2min/max bounding box coordinates. We have to see if PostGIS has such capabilities and if DaCHS could optionally support it.

I think I would do it when preparing the data table: I'll have to compute the footprints somehow, probably in IDL. Once I have the vector, it's straightforward to extract min/max lon/lat, which I'd feed in the table as c1/c2 min/max.

I've tested Mikhail's stuff on VVEx. It's perfect to provide a rough footprint on a planet, and easy to compute from geometry files; the only point is that I need to open all the files. Then TapHandle (at least) can send it to Aladin (and yes, this is plotted on Mars). We have to think about implementing this on VESPA.

What about the subgranule_url parameter introduced by Micha in the CRISM service? This points to a script allowing to extract a spectrum from a cube, therefore it introduces a post-processing function in the table itself. It does not look very VO to me, but this is actually very handy.

Doesn't it shift the view to center it on the freshly received footprint? This is the impression I had when working with a Mars map. You can experiment on TAP services in Heidelberg, they all have footprints s_region style.