Introduction

Background

The High Energy Astrophysics Science Archive Research Center (HEASARC) at
NASA Goddard Space Flight Center is a multi-mission archive facility supporting
the high-energy astrophysics community around the world. HEASARC services
include providing, via a variety of user interfaces, access to many different
types of data and information including proposals and grants tracking
information, astronomical catalogs, observation logs, images, etc.

Previously, most HEASARC information was housed in a home-grown database
system, based on that used for EXOSAT. As the HEASARC holdings continued to
grow, certain disadvantages of this system began to manifest themselves.

As with any home-grown system, continuity of maintenance became a problem.
Adding functionality, in the form of enriched metadata to allow more complex
or multi-mission data searching, was difficult. Further, the older database
software had file names and locations directly linked to the metadata, so
that changes at the lowest levels of the system could have an impact
throughout.

In addition, other information at the HEASARC and affiliated organizations
was kept in separate databases, on different platforms running commercial
relational database management system software. There was no well-defined way
to communicate between these information repositories. This caused
inconvenience both to users, who were required to deal with several disparate
systems, and to developers, who had to create similar functions redundantly
for the separate systems. It also caused problems when data and information
were transferred between, for example, the HEASARC and the processing facility
or the deep archive.

Therefore, the database development efforts at the HEASARC had two main
goals: to migrate to a standard commercial database management system, and
to facilitate the exchange of information between database systems.

Maintaining and Enhancing Services

As an operational facility, the HEASARC had the obligation to avoid
disrupting the level of service it provided to its users. This implied
that current user interfaces, or enhanced versions thereof, must continue to
be available. The HEASARC currently provides both a command-line interface
and a variety of services accessible via the World-Wide Web.

In addition, many HEASARC users have become accustomed to accessing the
data files and tables directly via File Transfer Protocol (FTP) and
Structured Query Language (SQL) respectively. This type of access must
continue to be supported as well.

For maximum flexibility in the support of underlying heterogeneity, both
in the types of information within the HEASARC database and in communicating
between components of the HEASARC systems and with other organizations, it
became clear that what was required was a facility not only for describing
data files via entries in tables (catalog level), but also for describing the
catalog-level tables themselves in such a way that information could be
exchanged about what tables and attributes are available for search. This
level in the information hierarchy is what we are calling "meta-information,"
or the "metabase."

The meta-information design is intended to be as generic as possible.
Since the astronomical content in HEASARC's database resides in the
catalog-level tables, the meta-information itself contains nothing specific
to astronomy; it only describes these tables and the parameters available in
them. Therefore, it could theoretically be used to describe information used
in any discipline. On a practical level, this at least facilitates the use of
data across astrophysics missions that may arrange their information
differently.
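
As a minimal sketch of the two-level hierarchy described above, the following
Python example uses the standard-library sqlite3 module to hold one
catalog-level table alongside a pair of metabase tables that describe it. All
table and column names here (xte_obslog, meta_tables, meta_parameters, and so
on) are hypothetical illustrations and are not taken from the actual HEASARC
schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one catalog table and its metabase."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Catalog level: each row describes one data file (hypothetical layout).
        CREATE TABLE xte_obslog (
            obs_id TEXT, ra_deg REAL, dec_deg REAL, data_file TEXT
        );
        -- Metabase level: rows describe catalog-level tables and their parameters.
        CREATE TABLE meta_tables (table_name TEXT PRIMARY KEY, description TEXT);
        CREATE TABLE meta_parameters (table_name TEXT, parameter TEXT, units TEXT);
        """
    )
    conn.execute(
        "INSERT INTO meta_tables VALUES (?, ?)",
        ("xte_obslog", "Hypothetical observation log"),
    )
    conn.executemany(
        "INSERT INTO meta_parameters VALUES (?, ?, ?)",
        [
            ("xte_obslog", "obs_id", None),
            ("xte_obslog", "ra_deg", "degrees"),
            ("xte_obslog", "dec_deg", "degrees"),
        ],
    )
    return conn


def available_tables(conn):
    """List the catalog-level tables that the metabase advertises."""
    rows = conn.execute("SELECT table_name FROM meta_tables ORDER BY table_name")
    return [name for (name,) in rows]


def searchable_parameters(conn, table_name):
    """Return (parameter, units) pairs registered for one catalog-level table."""
    rows = conn.execute(
        "SELECT parameter, units FROM meta_parameters "
        "WHERE table_name = ? ORDER BY parameter",
        (table_name,),
    )
    return rows.fetchall()
```

In this sketch a client would consult the metabase tables first to learn which
tables and parameters exist, and only then build a query against the
catalog-level table, so nothing astronomy-specific needs to be built into the
client itself.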

Historical Requirements

The requirements analysis was undertaken as the first step in the
development of a multi-mission database management system for the HEASARC.
It was used as a vehicle for discussion and as a springboard for the design
effort. It was not intended as a formal specifications document, nor has it
been continually updated to reflect the evolution of our thinking during the
prototyping and implementation phases of the project. It is included here as
a useful summary of our original goals.

The HEASARC Database System shall be developed so as not to preclude
migration of tables between different relational database management
systems (e.g. Ingres, Oracle, Sybase).

The HEASARC Database System shall be accessible via the
following user interfaces, at a minimum:

a Web interface

a command-line interface

an e-mail interface

The HEASARC Database System shall have a data dictionary or
system metabase including at a minimum for each table or view:

definition

access privileges

creator

creation date

fields to use for standard coordinate, time, class and name
searches

units and other conventions used to store the data, such as
epoch of the coordinates
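
One possible way to picture a single data dictionary entry with the minimum
contents listed above is the following Python sketch. The class name, field
names, and sample values are hypothetical and are not drawn from the HEASARC
implementation; the sketch only mirrors the listed items one for one.

```python
from dataclasses import dataclass, field
from datetime import date

# The four standard search types named in the requirement.
STANDARD_SEARCHES = ("coordinate", "time", "class", "name")


@dataclass
class DataDictionaryEntry:
    """Hypothetical metabase record describing one table or view."""

    table_name: str
    definition: str
    access_privileges: str
    creator: str
    creation_date: date
    # Maps a standard search type to the table fields that support it.
    search_fields: dict = field(default_factory=dict)
    # Units and other storage conventions, such as the coordinate epoch.
    conventions: dict = field(default_factory=dict)

    def missing_search_fields(self):
        """Return the standard search types with no fields assigned yet."""
        return [kind for kind in STANDARD_SEARCHES if kind not in self.search_fields]


entry = DataDictionaryEntry(
    table_name="example_catalog",
    definition="Illustrative source catalog",
    access_privileges="public",
    creator="archive_staff",
    creation_date=date(1995, 1, 1),
    search_fields={"coordinate": ["ra", "dec"], "name": ["target_name"]},
    conventions={"ra": "degrees", "dec": "degrees", "epoch": "J2000"},
)
```

Here `entry.missing_search_fields()` reports the time and class searches as
not yet mapped, which is the kind of check a generic interface could run
before offering a search form for the table.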

The HEASARC Database System shall be queryable using SQL.

The HEASARC Database System shall include an application
programmers' interface (API) for FORTRAN and C calls.

The HEASARC Database System shall accommodate the following
types of information, at a minimum:

The HEASARC Database System shall be able to accommodate information
concerning multiple types of science data files, to include among
others:

telemetry

multiple product file types

auxiliary (e.g. calibration)

screened

unscreened

raw (FITS-converted telemetry)

The HEASARC Database System shall allow the identification and
retrieval of data files or granules individually and/or in groups as
defined by the mission.

The HEASARC Database System shall be developed on a schedule such
that it is tested and ready to accommodate XTE (i.e. first ingest of
accepted proposals expected late March/early April 1995), with migration
of additional data on a schedule to be determined.

The HEASARC Database System shall include a core set of generic
attributes based to the extent possible on the FITS standard, and
consistent table structures wherever feasible, in order to facilitate
cross-mission data search by attribute.

The HEASARC Database System shall accommodate the information
held currently in existing HEASARC DB tables, and shall support data
from a technically unbounded number of active and inactive missions,
subject to resource limitations.

The HEASARC Database System shall accommodate the frequent
updating of database contents associated with the support of active
missions.

The HEASARC Database System shall be capable of interfacing
as a node of the Astrophysics Data System.

The HEASARC Database System shall be capable of providing
secure access for proprietary information.

The HEASARC Database System shall have tools to check the
consistency of the database tables and the data archives.

The HEASARC Database System shall have tools to check the
validity of joins between database tables.

The HEASARC Database System shall be designed to accommodate
the distribution of the data archives among multiple servers in a manner
transparent to the user.
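
The requirement above for tools that check the consistency of the database
tables and the data archives could be approached along the following lines.
This is a minimal Python sketch under stated assumptions: the tables are
assumed to record each data file as a path relative to a single archive root
directory, and the function name and report keys are hypothetical.

```python
from pathlib import Path


def check_archive_consistency(catalogued_files, archive_root):
    """Compare file paths recorded in the tables with files found on disk.

    catalogued_files: iterable of POSIX-style paths relative to archive_root,
    as they would be read from a catalog-level table (assumed layout).
    """
    root = Path(archive_root)
    catalogued = set(catalogued_files)
    # Collect every regular file under the archive root as a relative path.
    on_disk = {
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    }
    return {
        # Listed in the tables but absent from the archive.
        "missing_from_archive": sorted(catalogued - on_disk),
        # Present in the archive but not listed in any table.
        "missing_from_tables": sorted(on_disk - catalogued),
    }
```

An archive distributed among multiple servers would need this comparison run
per server, or against a combined file listing, rather than against one local
directory tree.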

Status

The migration of the HEASARC database to a commercial RDBMS (first Ingres,
and later Sybase) was initially completed in 1995, when the HEASARC's
World-Wide Web database interface, Browse, was released. The overall
architecture and implementation have continued to evolve since then,
however.