Source and Catalog Contribution Page

Rationale

This page gathers contributions to and discussions from the Source and Catalogs focused session, which took place at the Paris Interop 2019.

The goal of this session was to ask people involved in data curation and client development (and anyone else interested) what they require to make source-related data interoperable. Meeting those requirements takes both modeling effort and serialization mechanisms, and it is complex work:

There is no clear common definition of what a source is

Can be a detection by a telescope

Can be one row in a catalogue of compiled data

Can be anything related to a sky location emitting a signal detectable by astronomers

Source data are multi-faceted

Something that modelers hate

Is a model relevant or possible or even desirable?

The scope of a global source model is huge

It covers a large part of our science

It likely could not be described by a single model

This work should be shared among multiple working groups. The models need both domain and modelling expertise.

The effects on VOTable make it clearly of interest to Apps.

The data models themselves, with appropriate use cases, are the primary driver led by DM.

Challenges (on behalf of TD)

Data model complexities, and the lack of actual data models, have often been conflated with mapping complexities and the lack of successful demonstrations. The mapping is seen as the problem, when it is only a part of it, and sometimes a small part.

Pierre Fernique's very fair question in Victoria 2018 was a great example of this. "But what about my proper motions?"

This is a complex problem, and we haven't been able to bring a commensurate amount of focused resources to it.

In my opinion, we lack specific, documented, user stories, use cases, etc. Pierre's case would be a good example if written up, and it's a case that must be demonstrated as clearly working in order to move on.

Straw Man Steps Forward

To address some of those issues, we suggest something like the following steps to get moving. It may be that we don't have the resources to do all of that, in which case maybe we settle on per-topic work-arounds (as with proper motion), or find a way to get more involvement from the greater community.

Get the TCG/Exec to agree on, then coordinate (mandate?) some steps like these.

Get a group of people to commit to work on this.

a. This can be worked on in parallel with the other steps, but it is important to know who is out there with the time, interest, commitment and expertise to help.

b. Keep sharing, presenting, and inviting participation throughout.

Define what problem(s) we are solving with a short-ish list of concrete user stories. For now, only include very important stories, and make them very specific, including what data sources will be used and what we will do with the data.

From those stories, decide what data we need to model.

Model the necessary data and map these models onto real data.

Need first to define the mapping from the model(s) to Astropy objects.

Astropy mapping is not sufficient. Explore different mapping syntaxes by implementing the user stories.

a. This can be done in parallel with the modelling effort, and can evolve with it.

Knowing the pencil and paper mapping is a prerequisite for each experiment. Before trying the mapping syntax, we must know unambiguously how the model maps to an implemented data structure (hopefully including Astropy).
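As an illustration of what a pencil and paper mapping could look like once written down, here is a minimal Python sketch for the proper motion case. Everything in it is an assumption made for illustration: the role names, the column names (ra, dec, pmra, pmdec), the SourcePosition class and the map_row helper are hypothetical and are not taken from any IVOA model or from Astropy.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical "pencil and paper" mapping: model role -> (column name, unit).
# Role and column names are illustrative only, not part of any IVOA model.
MAPPING = {
    "longitude": ("ra", "deg"),
    "latitude": ("dec", "deg"),
    "pm_longitude": ("pmra", "mas/yr"),
    "pm_latitude": ("pmdec", "mas/yr"),
}

# Roles that must be present for a row to be usable in this sketch.
REQUIRED = ("longitude", "latitude")


@dataclass
class SourcePosition:
    """Hypothetical target data structure for one catalogue row."""
    longitude: float
    latitude: float
    pm_longitude: Optional[float] = None
    pm_latitude: Optional[float] = None


def map_row(row, mapping=MAPPING):
    """Build a SourcePosition from a row (a dict of column name -> value)."""
    values = {}
    for role, (column, _unit) in mapping.items():
        if row.get(column) is not None:
            values[role] = float(row[column])
        elif role in REQUIRED:
            # A required role with no value makes the mapping ambiguous.
            raise KeyError(f"required column {column!r} for role {role!r} is missing")
    return SourcePosition(**values)


# Example row with made-up values; the extra "name" column is ignored.
row = {"ra": 10.5, "dec": -20.25, "pmra": 3.1, "pmdec": -1.2, "name": "src-1"}
position = map_row(row)
print(position)
```

The table is kept separate from the code that applies it, so the mapping itself can be reviewed on its own before any mapping syntax is tried. With Astropy, the same four roles might be passed to SkyCoord as ra, dec, pm_ra_cosdec and pm_dec, provided the catalogue's proper motion in right ascension already includes the cos(dec) factor, which is exactly the kind of detail the mapping has to state unambiguously.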