The MIMIC II Waveform Database, version 2

This is version 2 of the MIMIC II Waveform Database (July 2010; 4458
records). Version 3 of the MIMIC II Waveform Database, more than 4 times the
size of version 2, is available
here. Version 3 is
recommended for all new studies. Version 2 will remain available to
support ongoing work.

The first 4458 records of the MIMIC II Waveform Database are available
here, including 4063 records of 3704 adult ICU patients (the a
records) and 395 records of 249 neonates (the n records). Records with
names beginning with a40 through a43 include a
physiologic waveform record containing up to four signals digitized at
125 Hz with 8-bit resolution, and a "numerics" record containing 10 or
more time series of vital signs sampled once per minute.
(Occasionally, technical constraints make it possible to create a
waveform record but not a numerics record, or vice versa.) Records
with names beginning with a44 through a46,
and n10, include waveform records with up to eight signals
digitized at 125 Hz with 10-bit (or occasionally 12-bit) resolution,
and numerics records sampled once per second. The waveform records
vary in length; some are several weeks in duration. It is common for
the physiologic signals to be interrupted or changed occasionally
during recordings of such long duration. The waveform records have
been constructed so that all signals available at any time during the
record are listed in a "layout header" file (see the discussion
of variable-layout records in
the WFDB Programmer's Guide). When
using a viewer such as the PhysioBank ATM,
all of these signals will be listed although in most cases only a
subset of them will be visible at any given time.
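The naming conventions above can be summarised as a small lookup. The following is only a sketch: the class name RecordFormat and the function describe_record are hypothetical helpers, not part of the WFDB software, and the values returned are the nominal figures quoted in the text (individual records may differ, for example when no numerics record could be created).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordFormat:
    max_signals: int          # upper bound on waveform signals per record
    waveform_bits: str        # nominal amplitude resolution, as described above
    numerics_interval_s: int  # seconds between successive numerics samples


# Prefix groups taken from the description above.
_FOUR_SIGNAL_PREFIXES = ("a40", "a41", "a42", "a43")
_EIGHT_SIGNAL_PREFIXES = ("a44", "a45", "a46", "n10")


def describe_record(name: str) -> RecordFormat:
    """Return the nominal format for a record name such as 'a40001'."""
    prefix = name[:3]
    if prefix in _FOUR_SIGNAL_PREFIXES:
        # Up to four signals, numerics once per minute.
        return RecordFormat(4, "8-bit", 60)
    if prefix in _EIGHT_SIGNAL_PREFIXES:
        # Up to eight signals, numerics once per second.
        return RecordFormat(8, "10-bit (occasionally 12-bit)", 1)
    raise ValueError(f"unrecognised record prefix: {prefix!r}")
```

For example, describe_record("a40001") reports at most four waveform signals and numerics at 60-second intervals.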

Within this directory, all files associated with each record are
gathered in a subdirectory named after the record. For example, the
files associated with record a40001 are all located within the
directory named a40001 (at the top of the listing
below). The files associated with the high-resolution (125
samples/second) physiologic waveform records are the master header
(a40001.hea) in which all of the
segments and their lengths are listed, the (binary) event annotation
file (a40001.al), the layout header
(a40001_layout.hea) in which
all of the available signals are listed, and the segment header (.hea)
and signal (.dat) files with names beginning with "a40001_". In
addition, the "numerics" record header
(a40001n.hea) and its associated binary trend
data file (a40001n.dat) are included, as are a low-resolution signal quality
index header (a40001T.hea) and its binary
data file (a40001T).
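As an illustration of this layout, the sketch below builds the header and numerics file names for a record and recognises its segment files. The function names record_files and is_segment_file are hypothetical, and the sketch assumes the simple case in which the directory is named after the record. The alarm annotation file and the signal quality data file are left out deliberately; take their exact names from the directory listing.

```python
def record_files(record: str) -> dict:
    """Map descriptive keys to the file names expected for one record."""
    return {
        "directory": record,                      # subdirectory named after the record
        "master_header": f"{record}.hea",         # lists all segments and their lengths
        "layout_header": f"{record}_layout.hea",  # lists all available signals
        "numerics_header": f"{record}n.hea",
        "numerics_data": f"{record}n.dat",
        "quality_header": f"{record}T.hea",       # signal quality index header
    }


def is_segment_file(record: str, filename: str) -> bool:
    """True for segment header (.hea) or signal (.dat) files of the record.

    The layout header shares the 'record_' prefix, so it is excluded
    explicitly.
    """
    if filename == f"{record}_layout.hea":
        return False
    return filename.startswith(f"{record}_") and filename.endswith((".hea", ".dat"))
```

The segment file names used in testing such a helper (for instance "a40001_0001.dat") are made up for illustration; actual segment names should be read from the master header.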

In some cases, multiple records associated with the same ICU stay are grouped
together within a directory. For example, records a40018a, a40018b, and
a40018c are grouped together in the a40018 directory. In some such cases,
records grouped together in this way may belong to different patients.

Clinical correlates: In most cases there is a patient record in
the MIMIC II Clinical Database corresponding to one or more records in
the MIMIC II Waveform Database. In such cases, the layout header of
the waveform record, and the header of the numerics record, contain
the age and sex of the patient and cross-references to the clinical
record and to any other waveform/numerics records for the same
patient. The file named MAP in this directory
contains all of this information in a single location.

ECG limitations: The ECG signals in the waveform records were
originally sampled with 12-bit precision at a high sampling rate, and
were then scaled and decimated to 500 samples per second (per signal).
The scaling reduced the effective amplitude resolution from 12 bits to
9 or 10 bits in typical cases, and as little as 7 bits in some cases.
From each set of 4 consecutive decimated samples of the same ECG
signal, one was recorded (chosen using a turning-point compressor, a
technique sometimes called "peak-picking"). The result is an ECG
signal sampled 125 times per second, but at intervals that vary
between 2 and 14 ms (averaging 8 ms). Since the interval between any
given pair of samples was not available to us, the reconstructions of
the ECG signals assume uniform 8 ms intervals. These signals, with
reduced time and amplitude resolution and with sampling jitter introduced
by the "peak-picking", were the only ECG signals that could be
captured from the ICU monitors. Although ECGs reconstructed in this
way can be readily interpreted visually, they are unsuitable as input
for certain algorithms for ECG analysis, particularly those that are
sensitive to frequency-domain features of the signal. Note that these
limitations apply only to the ECG signals, not to the other signals,
which were originally sampled at uniform 8 ms intervals (125 samples
per second) and were not scaled prior to capture.
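The 2 to 14 ms range follows from the arithmetic: at 500 samples per second the decimated samples are 2 ms apart, and one sample is kept from each block of 4, so two consecutive kept samples are between 1 and 7 sample positions apart. The sketch below demonstrates this on synthetic noise. The selection rule used here (keep the sample that differs most from the previously kept one) is a simplifying assumption for illustration, not the monitors' actual turning-point algorithm, and the function names are hypothetical.

```python
import random

DECIMATED_RATE_HZ = 500              # rate after the initial decimation
BLOCK = 4                            # one sample kept per 4 decimated samples
STEP_MS = 1000 / DECIMATED_RATE_HZ   # 2 ms between decimated samples


def peak_pick(samples):
    """Keep one sample per block of 4; return (kept indices, kept values).

    Simplified rule (an assumption): within each block, keep the sample
    that differs most from the previously kept value.
    """
    kept_idx, kept_val = [], []
    prev = samples[0]
    for start in range(0, len(samples) - BLOCK + 1, BLOCK):
        block = samples[start:start + BLOCK]
        offset = max(range(BLOCK), key=lambda k: abs(block[k] - prev))
        kept_idx.append(start + offset)
        kept_val.append(block[offset])
        prev = block[offset]
    return kept_idx, kept_val


def intervals_ms(indices):
    """True time between consecutive kept samples, in milliseconds."""
    return [(b - a) * STEP_MS for a, b in zip(indices, indices[1:])]


# Synthetic input (not real ECG data): 8 seconds of noise at 500 Hz.
rng = random.Random(0)
signal = [rng.gauss(0.0, 1.0) for _ in range(4000)]

idx, vals = peak_pick(signal)
gaps = intervals_ms(idx)
mean_gap = sum(gaps) / len(gaps)
```

Whatever rule selects the sample within each block, every value in gaps lies between 2 and 14 ms and mean_gap is close to 8 ms. A reconstruction that assumes uniform 8 ms spacing therefore misplaces individual samples by up to 6 ms.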

Alarm annotations:
The ".alarms" annotation files contain information about some of the alarms
(alerts) gathered by this project. The
timestamp associated with each alert indicates when the definition for
the corresponding alert is satisfied (not the time of onset of the condition).
The aux field of each annotation is the text of
the alert, and the subtyp field
indicates the severity of the detected event (3: "red", most critical;
2: "yellow", less critical). The chan and num
fields of these annotations are unused, and the anntyp field
in these files is
always NOTE.
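As a sketch of how these fields might be used once the annotations have been read into memory, the following mirrors the aux and subtyp fields in a small data class and decodes the severity. The class AlarmAnnotation, its time field, and the example alert texts are hypothetical placeholders; reading the binary annotation files themselves is left to WFDB-compatible software.

```python
from dataclasses import dataclass

# Severity codes as described above.
SEVERITY = {3: "red", 2: "yellow"}


@dataclass
class AlarmAnnotation:
    time: str    # when the alert definition was satisfied (placeholder format)
    aux: str     # text of the alert
    subtyp: int  # severity code


def severity(ann: AlarmAnnotation) -> str:
    """Translate the subtyp code into its severity label."""
    return SEVERITY.get(ann.subtyp, "unknown")


def red_alarms(annotations):
    """Keep only the most critical ("red") alerts."""
    return [a for a in annotations if a.subtyp == 3]


# Made-up annotations for illustration only.
example = [
    AlarmAnnotation("00:01:10", "example red alert text", 3),
    AlarmAnnotation("00:05:42", "example yellow alert text", 2),
]
labels = [severity(a) for a in example]
critical = red_alarms(example)
```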

Although the ".alarms" annotation files contain information as gathered
from the ICU monitor alerts, this information has been edited by the MIMIC
II project. Users are encouraged to contribute additional information about
events in these records for possible inclusion in future revisions of these
files.

An additional .alM annotation file is available for the 498 records listed in
RECORDS.alM. These .alM files contain the
"red" arrhythmia alerts from the corresponding .alarms annotation
files, but the chan field has been used to indicate if the
alarm was judged to be true (1) or false (3) by expert human review.
For further information about the .alM annotations, see
Reducing false alarm rates for critical arrhythmias
using the arterial blood pressure waveform; these are the
"gold standard" alarms on which that study is based, as described in
section 2.3 of the paper. It is
important to note that the .alM annotation files are not complete
tallies of all critical arrhythmias in these 498 records, since
they include only detected events (true positives and false
positives) but not any missed events (false negatives).
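Because only detected events are annotated, the .alM files support a tally of true and false alarms, and hence the fraction of detected alarms judged true, but not a sensitivity estimate. A minimal sketch follows, assuming the chan values of the annotations have already been extracted into a list. The function name tally_reviewed_alarms is hypothetical.

```python
from collections import Counter

TRUE_ALARM = 1    # chan value for alarms judged true
FALSE_ALARM = 3   # chan value for alarms judged false


def tally_reviewed_alarms(chan_values):
    """Count true and false alarms and the fraction judged true.

    Missed events (false negatives) are not annotated, so sensitivity
    cannot be derived from these counts.
    """
    counts = Counter(chan_values)
    true_n = counts[TRUE_ALARM]
    false_n = counts[FALSE_ALARM]
    reviewed = true_n + false_n
    fraction_true = true_n / reviewed if reviewed else None
    return {"true": true_n, "false": false_n, "fraction_true": fraction_true}
```

For instance, the made-up input [1, 3, 3, 1, 1] gives three true alarms, two false alarms, and a fraction of 0.6.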