The Northern California Earthquake Data Center, a joint project of the
Berkeley Seismological Laboratory and the U.S. Geological Survey at Menlo
Park, serves as an "on-line" archive for various types of digital data
relating to earthquakes in central and northern California. The NCEDC is
located at the Berkeley Seismological Laboratory, and has been accessible to users via
the Internet since mid-1992.

The primary goal of the NCEDC is to provide a stable and permanent archival
and distribution center of digital geophysical data for northern and central
California such as seismic waveforms, electromagnetic data, GPS
data, and earthquake parametric data. The principal networks contributing
seismic data to the data center are the Berkeley Digital Seismic Network
(BDSN) operated by the Seismological Laboratory, the Northern California
Seismic Network (NCSN) operated by the USGS, and the Bay
Area Regional Deformation (BARD) GPS network. The NCSN digital waveforms
date from 1984 to the present, the BDSN digital waveforms date from
1987 to the present, and the BARD GPS data date from 1993 to the present.

The NCEDC continues to use the World Wide Web as a principal
interface for users to request, search, and receive data from the NCEDC.
The NCEDC has implemented a number of useful and original mechanisms of data
search and retrieval using the World Wide Web, which are available to anyone
on the Internet. All of the documentation about the NCEDC, including the
research users' guide, is available via the Web. Users can perform catalog
searches and retrieve hypocentral information and phase readings from the
various earthquake catalogs at the NCEDC via easy-to-use forms on the Web.
In addition, users can peruse the index of available broadband data at the
NCEDC, and can request and retrieve broadband data in standard SEED format
via the Web. Access to all datasets is available via research accounts at
the NCEDC. The NCEDC's home page address is
http://quake.geo.berkeley.edu/

The initial phase of archiving the historic NCSN earthquake seismograms from
1984 through the present and BDSN data from 1987 through the present is
essentially complete. All historic NCSN data tapes have been read and data for all
local and regional events have been loaded onto the NCEDC. A list of
teleseismic events recorded by the NCSN is now available on the NCEDC, and
the NCSN is determining which of those events have sufficient data to archive
at the NCEDC. The 16-bit BDSN data from 1987-1991 have been converted to MiniSEED
and are now online.

The total size of the datasets archived at the NCEDC is shown in Table
7.1.

The archival of current BDSN and HFN seismic data is an ongoing task. BDSN and HFN data are
telemetered from 29 seismic dataloggers in real-time to the BSL, where they are
written to disk files. Each day, an extraction process creates a daily
archive by retrieving all continuous and event-triggered data for the previous
day. The daily archive is run through quality control procedures to correct
any timing errors, triggered data are reselected based on the REDI, NCSN, and
UCB earthquake catalogs, and the resulting daily collection of data is
archived at the NCEDC.

NCSN waveform data continue to be collected and processed by the USGS at
Menlo Park, and shipped to the datacenter on Exabyte tapes. The NCSN has
developed procedures to send event waveform files from Menlo Park to
the NCEDC via the Internet, but this transfer was delayed by TCP/IP software
problems on the VAX computers at Menlo Park. New software has been acquired
that fixes the problem, and we expect to begin Internet transfers shortly.
Parametric information, such as event catalogs and phase readings from both
the BDSN and NCSN networks, is automatically updated on the NCEDC on a daily
basis.

Several problems prevented the NCEDC from loading many of the NCSN
waveforms. First, we discovered that a significant number of waveforms had
been erroneously loaded at the NCEDC in VAX (little-endian) byte order
instead of the Sun (big-endian) byte order used at the NCEDC. In addition,
problems with the hardware on the new mass storage system and with the
software for the old mass storage system significantly delayed the migration
of data from the old system to the new one. Since no additional space was
available on the old mass store, we needed to migrate the old data to the
new mass store before fixing the byte ordering problem in the existing data
files, and we needed to fix the byte ordering problem before loading new
NCEDC data onto the mass store. As a result, we have a significant backlog
of current NCSN data to archive at the NCEDC.
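
The byte ordering repair described above amounts to reinterpreting each
stored sample in little-endian order and rewriting it in big-endian order.
The following Python sketch illustrates the idea only; it assumes the samples
are 32-bit signed integers, which this report does not state, and the
function name swap_int32_samples is hypothetical rather than an NCEDC tool.

```python
import struct


def swap_int32_samples(raw: bytes) -> bytes:
    """Re-pack little-endian 32-bit integers as big-endian.

    Assumption: samples are 32-bit signed integers; the actual NCSN
    waveform file layout is not described in this report.
    """
    if len(raw) % 4 != 0:
        raise ValueError("buffer length must be a multiple of 4 bytes")
    count = len(raw) // 4
    # "<" reads the buffer as little-endian (VAX order).
    samples = struct.unpack(f"<{count}i", raw)
    # ">" writes the same values as big-endian (Sun order).
    return struct.pack(f">{count}i", *samples)


# Illustrative data only: three samples written in VAX byte order.
little = struct.pack("<3i", 1, -2, 1000)
big = swap_int32_samples(little)
```

Only the sample values are converted here; any file headers would need their
own field-by-field treatment, since text fields must not be swapped.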

The NCEDC continues to archive and process electric and magnetic field data
acquired from three dataloggers at three sites (SAO, PKD, and PKD1). At PKD and SAO,
three components of magnetic field and two or four components of electric field
are digitized and telemetered in real-time along with seismic data to the
Seismological Laboratory, where they are processed and archived at the NCEDC
in a similar fashion to the seismic data. The system generates continuous data
channels at 40 Hz, 1 Hz, and 0.1 Hz for each component of data.
All of these data are archived and remain available online at the NCEDC.
Using programs written by Dr. Martin Fullekrug at the Stanford University
STAR Laboratory, the NCEDC is computing and archiving magnetic activity and
Schumann resonance analysis using the 40 Hz data from this dataset.

The datalogger at PKD1 acquires 8
channels of low-frequency, long-baseline electric field data for an ongoing
project by Dr. Steve Park of UC Riverside. These data are acquired and archived
in the same manner as the other electric field data at the NCEDC.

The NCEDC continues to expand its archive of GPS data through the BARD
(Bay Area Regional Deformation) network of continuously monitored GPS
receivers in northern California. The NCEDC GPS archive now includes 40 sites
in northern California. There are 25 core BARD sites owned and operated by
UC Berkeley, LLNL, USGS, UC Davis, Trimble Navigation, and Stanford. Data
from the other northern California sites are collected from sites operated by
JPL, the US Coast Guard, and Scripps Institution of Oceanography.

Most of the Berkeley BARD sites are co-located with seismic stations,
and data from these sites are acquired in real-time using a shared frame relay
telemetry link. The remaining Berkeley BARD stations use dedicated frame relay
and/or spread spectrum radio to provide data in real-time to UC Berkeley, where
the data are automatically processed and archived at the NCEDC on a daily basis.
Data from the USGS sites are downloaded by the USGS and transferred to the NCEDC
on a daily basis, and are automatically archived by the NCEDC. Data from the
other sites are automatically acquired from their respective operators on an
hourly or daily basis, and are archived by the NCEDC.

The NCEDC is participating in the UNAVCO-sponsored GPS Seamless Archive
Centers (GSAC) initiative, which is developing common protocols and interfaces
for the exchange and distribution of continuous and survey-mode GPS data.

The Unocal Corporation operated a micro-seismic monitoring network in the
Geysers region of northern California. In prior years, Unocal had released
six years of triggered event waveform data from 1987-1994 for archival and
distribution at the NCEDC. Through an updated agreement with the NCEDC this year,
Unocal released triggered event waveform data and a preliminary hypocenter
catalog for an additional four years of data from 1995-1998. The total
dataset represents over 150,000 events that were recorded by the Unocal
Geysers network, and is available via research accounts at the NCEDC.
Although Unocal did not release a preliminary hypocenter catalog for the first
six years of data or phase readings for any of the events, several scientists
have already deemed this dataset a useful addition to the NCEDC. Due to
problems encountered with the tape media used by Unocal to archive their data,
the NCEDC is still loading portions of the 1995-1998 data.

Event seismograms from the Parkfield High Resolution Seismic Network from 1987
through June 1998 have been loaded onto the NCEDC in their raw SEGY format. A
number of events have faulty timing due to the lack or failure of a precision
timesource for the network. Due to funding limitations, there is currently no
ongoing work to correct the timing problems in the older events or to create
MiniSEED volumes for these events. However, a preliminary catalog for a
significant number of these events has been constructed, and the catalog is
available via the web at the NCEDC. The raw SEGY data files are available via
research accounts at the NCEDC.

The NCEDC, in conjunction with the Council of the National Seismic
System (CNSS), is producing and distributing a world-wide composite catalog
of earthquakes based on the catalogs of the national and various US regional
networks. Each network updates its earthquake catalog on a daily basis at
the NCEDC, and the NCEDC constructs a composite world-wide earthquake
catalog by combining the data, removing duplicate entries that may occur
when multiple networks record an event, and giving priority to the data
from each network's authoritative region. The catalog, which
includes data from 14 regional and national networks, is available for
use at the NCEDC and is made available to anyone over the Internet.
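
The composite catalog logic (combine, remove duplicates, prefer the
authoritative network) can be sketched in Python as follows. This is an
illustration under stated assumptions, not the NCEDC's actual procedure: the
network codes NET_A and NET_B, the rectangular regions, and the matching
tolerances of 16 seconds and 100 km are all hypothetical.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    network: str
    time: float  # origin time in seconds
    lat: float
    lon: float
    mag: float


# Hypothetical authoritative regions: (min_lat, max_lat, min_lon, max_lon).
REGIONS = {
    "NET_A": (36.0, 42.0, -125.0, -118.0),
    "NET_B": (32.0, 36.0, -122.0, -114.0),
}


def distance_km(a: Event, b: Event) -> float:
    """Great-circle distance between two epicenters (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def is_authoritative(ev: Event) -> bool:
    """True if the epicenter lies inside the reporting network's own region."""
    box = REGIONS.get(ev.network)
    if box is None:
        return False
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= ev.lat <= max_lat and min_lon <= ev.lon <= max_lon


def same_event(a: Event, b: Event, max_dt: float = 16.0, max_km: float = 100.0) -> bool:
    """Treat two entries as duplicates if close in both time and space."""
    return abs(a.time - b.time) <= max_dt and distance_km(a, b) <= max_km


def composite(events):
    """Merge network catalogs, keeping one entry per earthquake."""
    merged = []
    for ev in sorted(events, key=lambda e: e.time):
        for i, kept in enumerate(merged):
            if same_event(ev, kept):
                # Duplicate: replace only if the new entry is authoritative
                # and the one already kept is not.
                if is_authoritative(ev) and not is_authoritative(kept):
                    merged[i] = ev
                break
        else:
            merged.append(ev)
    return merged


events = [
    Event("NET_B", 1000.0, 37.80, -122.20, 4.1),
    Event("NET_A", 1001.5, 37.85, -122.25, 4.0),
    Event("NET_B", 5000.0, 34.00, -118.00, 3.2),
]
catalog = composite(events)
```

In this example the first two entries describe the same earthquake, and the
NET_A entry is kept because the epicenter falls inside NET_A's region.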

The NCEDC moved into new facilities at the newly renovated Berkeley Seismological
Laboratory in McCone Hall in January 1999. This move reunited all of the NCEDC
mass storage systems and computers into a single location and onto the same high-speed
100 Mbit/s switched network.

Last year, partial funding for a new mass storage system was made available
from the USGS in mid-year, but the purchase was
deferred in an attempt to acquire higher density drives, which ultimately were
not available. The new mass storage system was purchased in late spring 1998, and
consists of a Sun Ultra 450 computer, a 1.3 TByte DISC 517-slot jukebox with
two 2.6 GByte Magneto Optical (MO) drives, an 11-slot AIT tape jukebox which
holds 25 GBytes per tape, and the SAM-FS hierarchical filesystem management
software. Only two MO drives and minimal media were purchased at that time.

This year, the mass storage system was upgraded from its initial 1.3 TByte
capacity to 2.5 TByte capacity by replacing its 2.6 GByte MO
drives with four 5.2 GByte MO drives and 5.2 GByte MO media. The mass storage
system can be upgraded to a total of 1000 slots (5.2 TByte) capacity with the
addition of a second media picker, drives, and media cells.

The older NCEDC data stored on the original two 300 GByte Sony WORM jukeboxes
are currently being migrated to the new mass storage system in order to
reduce maintenance and operating costs and increase the speed of access to
the data. The data migration should be completed by the end of 1999.

The new hardware and software system can be configured to automatically create
multiple copies of each data file. The NCEDC is using this feature to create
an online copy of each data file on MO media, and another copy on AIT tape which
will be stored offline.

Currently both the USGS and BSL construct and maintain earthquake
catalogs for northern and central California. The "official" UC Berkeley
earthquake catalog begins in 1910, and the USGS "official" catalog begins in
1966. Both of these catalogs are archived and available through the NCEDC,
but the existence of two catalogs has caused confusion among both researchers and
the public. The BSL and the USGS have spent considerable effort over the past
year to define procedures for merging the data from the two catalogs into a
single northern and central California earthquake catalog in order to
present a unified view of northern California seismicity. The differences in
time period, variations in data availability, and mismatches in regions of
coverage all complicate the task.

From 1910 through 1967, the BSL catalog is the primary source of northern
California earthquake information. Only limited phase data are available for
this time period, although location and magnitude information is provided.
The NCSN began to come online in 1966, and observations from this network are
available beginning in July of that year.

Starting with data from 1996, the BSL and USGS are working to generate a "joint"
catalog by merging phase data and relocating the earthquakes. One of the
initial complications in this project is matching up events between the two
catalogs. Due to the sparse nature of the BSL instrumentation over the years,
the BSL catalog is only complete at the magnitude 3 level while the USGS
catalog is generally complete at the magnitude 2 level. However, the BSL
catalog includes regional events from southern California, Nevada, Utah,
Oregon, and Washington. Thus neither catalog is a subset of the other. Other
complications include foreshock/aftershock sequences, where one organization
might read a foreshock and the mainshock and the other might read the
mainshock and an immediate aftershock. Since limited phase data are available
for the BDSN until 1976, most events during this period will combine the USGS
location with the BSL magnitude. Where BSL phase data are available, an event
that appears in both catalogs will have its phase and amplitude data merged,
the event will be relocated with the combined phase data, and magnitudes will
be recomputed using the available amplitude readings and new location.
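
The matching and merging rules described above can be outlined in Python. The
sketch below is a simplified illustration, not the procedure used by the BSL
and USGS: the class names, the station labels, and the tolerances of 5 seconds
and 0.5 degrees are hypothetical, and the relocation step is represented only
by a flag.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEvent:
    time: float  # origin time in seconds
    lat: float
    lon: float
    mag: float
    phases: list = field(default_factory=list)


@dataclass
class JointEvent:
    time: float
    lat: float
    lon: float
    mag: float
    phases: list
    relocate: bool  # True if the merged phases call for a new location


def build_joint(usgs, bsl, max_dt=5.0, max_deg=0.5):
    """Pair events from the two catalogs and merge each pair."""
    joint = []
    unmatched_bsl = list(bsl)
    for u in usgs:
        candidates = [
            b for b in unmatched_bsl
            if abs(b.time - u.time) <= max_dt
            and abs(b.lat - u.lat) <= max_deg
            and abs(b.lon - u.lon) <= max_deg
        ]
        if not candidates:
            # USGS-only event: carried over unchanged.
            joint.append(JointEvent(u.time, u.lat, u.lon, u.mag, list(u.phases), False))
            continue
        b = min(candidates, key=lambda c: abs(c.time - u.time))
        unmatched_bsl.remove(b)
        if b.phases:
            # Both catalogs have phase data: merge and flag for relocation.
            # Location and magnitude here are provisional placeholders.
            joint.append(JointEvent(u.time, u.lat, u.lon, b.mag, u.phases + b.phases, True))
        else:
            # No BSL phase data: USGS location with BSL magnitude.
            joint.append(JointEvent(u.time, u.lat, u.lon, b.mag, list(u.phases), False))
    # BSL-only events (for example, regional events) are kept as well,
    # since neither catalog is a subset of the other.
    for b in unmatched_bsl:
        joint.append(JointEvent(b.time, b.lat, b.lon, b.mag, list(b.phases), False))
    return sorted(joint, key=lambda e: e.time)


usgs = [
    CatalogEvent(100.0, 37.5, -122.0, 2.8, ["P@STA1"]),
    CatalogEvent(900.0, 38.0, -122.5, 2.1, ["P@STA2"]),
]
bsl = [
    CatalogEvent(101.0, 37.6, -122.1, 3.1),
    CatalogEvent(2000.0, 39.5, -119.8, 3.6, ["P@STA3"]),
]
joint = build_joint(usgs, bsl)
```

A simple time-and-distance window such as this one would not resolve the
foreshock/aftershock ambiguities noted above, which is one reason the
associations require review.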

The process of consolidating the BSL data to be merged into the joint catalog
has uncovered many details in the original catalog which were ambiguous or
poorly documented or inconsistent over time, such as the use of channel names
for phase and amplitude readings. Significant time was spent resolving
these issues before the joint catalog could be constructed.

The USGS and BSL produced an initial joint catalog from the USGS and BSL
catalogs in the spring of 1999. The BSL spent considerable time analyzing
the resulting catalog, and has identified problems with specific earthquake
associations and other related problems. We anticipate creating a final
merged catalog within the next year.

Most of the parametric data archived at the NCEDC, such as earthquake
catalogs, phase and amplitude readings, waveform inventory, and instrument
responses, have been stored in flat text files. Flat files are easily stored
and viewed, but are not efficiently searched. Over the last year, the NCEDC,
in collaboration with the USGS/SCEC Data Center and TriNet, has continued
development of database schemas to store the parametric data from the joint
earthquake catalog, station history, complete instrument response for all data
channels, and waveform inventory.

The parametric schema supports tables and associations for the joint
earthquake catalog. It allows for multiple hypocenters per event, multiple
magnitudes per hypocenter, and association of phases and amplitudes with
multiple versions of hypocenters and magnitudes respectively. The instrument
response schema represents full multi-stage instrument responses (including
filter coefficients) for the broadband dataloggers. The hardware tracking
schema will represent the interconnection of instruments, amplifiers, filters,
and dataloggers over time. The parametric schema will be used to store the joint
northern California earthquake catalog and the CNSS composite catalog.
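
The relationships in the parametric schema (multiple hypocenters per event,
multiple magnitudes per hypocenter, and phases and amplitudes associated with
more than one version) can be illustrated with a small relational sketch.
The example below uses Python's built-in sqlite3 module purely for
illustration; the table and column names are hypothetical and are not the
schema developed by the NCEDC and its collaborators.

```python
import sqlite3

# Hypothetical tables; names and columns are illustrative only.
SCHEMA = """
CREATE TABLE event (evid INTEGER PRIMARY KEY);
CREATE TABLE hypocenter (
    hypo_id INTEGER PRIMARY KEY,
    evid INTEGER NOT NULL REFERENCES event(evid),
    origin_time REAL, lat REAL, lon REAL, depth_km REAL);
CREATE TABLE magnitude (
    mag_id INTEGER PRIMARY KEY,
    hypo_id INTEGER NOT NULL REFERENCES hypocenter(hypo_id),
    mag_type TEXT, value REAL);
CREATE TABLE phase (
    phase_id INTEGER PRIMARY KEY,
    station TEXT, phase_name TEXT, pick_time REAL);
CREATE TABLE amplitude (
    amp_id INTEGER PRIMARY KEY,
    station TEXT, value REAL);
-- Association tables let one reading belong to several versions.
CREATE TABLE hypocenter_phase (
    hypo_id INTEGER REFERENCES hypocenter(hypo_id),
    phase_id INTEGER REFERENCES phase(phase_id),
    PRIMARY KEY (hypo_id, phase_id));
CREATE TABLE magnitude_amplitude (
    mag_id INTEGER REFERENCES magnitude(mag_id),
    amp_id INTEGER REFERENCES amplitude(amp_id),
    PRIMARY KEY (mag_id, amp_id));
"""

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

# One event with two hypocenter versions; the second has two magnitudes.
db.execute("INSERT INTO event (evid) VALUES (1)")
db.executemany(
    "INSERT INTO hypocenter VALUES (?, ?, ?, ?, ?, ?)",
    [(10, 1, 100.0, 37.50, -122.00, 8.0), (11, 1, 100.2, 37.52, -122.03, 7.5)],
)
db.executemany(
    "INSERT INTO magnitude VALUES (?, ?, ?, ?)",
    [(20, 10, "ML", 3.1), (21, 11, "ML", 3.0), (22, 11, "Md", 2.9)],
)
# The same phase reading is associated with both hypocenter versions.
db.execute("INSERT INTO phase VALUES (30, 'STA1', 'P', 103.4)")
db.executemany("INSERT INTO hypocenter_phase VALUES (?, ?)", [(10, 30), (11, 30)])

# Count the magnitudes attached to each hypocenter of event 1.
rows = db.execute(
    """SELECT h.hypo_id, COUNT(m.mag_id)
       FROM hypocenter h LEFT JOIN magnitude m ON m.hypo_id = h.hypo_id
       WHERE h.evid = 1 GROUP BY h.hypo_id ORDER BY h.hypo_id"""
).fetchall()
```

Keeping the associations in separate tables, rather than storing a hypocenter
identifier on each phase, is what allows a reading to be reused when an event
is relocated.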

The entire description for the BDSN network and data archive has been entered
into the hardware tracking, SEED instrument response, and waveform tables.
Programs have been developed to perform queries of waveform inventory and
instrument responses, and the NCEDC can now generate full SEED volumes from
the BDSN network based on information from the database and the waveforms on
the mass storage system. The second stage of development will include the
NCSN waveform inventory and later the NCSN instrument response data as they are made available.

Additional details on the joint catalog effort and database schema development
may be found at
http://quake.geo.berkeley.edu/db

In a collaborative project with the IRIS DMC and other worldwide datacenters,
the NCEDC has helped develop and implement NETDC, a protocol which will
provide a seamless user interface to multiple datacenters for geophysical
network and station inventory, instrument responses, and data retrieval
requests. The NETDC system is currently operational in beta test mode
between the NCEDC and the IRIS DMC, and was demonstrated at the IUGG
meeting in Europe this summer. The NETDC implementation at the NCEDC
makes significant use of the waveform and instrument response data stored in
our newly developed database. It is scheduled to be opened to public access
at the time of the Fall 1999 AGU meeting.