1.3 Overview of data collection

The data were collected over a seven-year period, beginning in 2001,
from Boston's Beth Israel Deaconess Medical Center (BIDMC). Any
patient who was admitted to the ICU on more than one occasion may be
represented by multiple patient visits. The adult ICUs (for patients
aged 15 years and over) include medical (MICU), surgical (SICU),
coronary (CCU), and cardiac surgery (CSRU) care units. Data were also
collected from the neonatal ICU (NICU).

Figure 1.1 illustrates the data acquisition process,
which did not interfere with the clinical care of patients, since
databases were dumped off-line and bedside waveform data and derived
trends were collected by an archiving agent over TCP/IP. Source data
for the MIMIC II database consist of a) bedside monitor waveforms and
associated numeric trends derived from the raw signals, b) clinical
data derived from Philips' CareVue system, c) data from hospital
electronic archives, and d) mortality data from the Social Security
Death Index (SSDI). These data are assembled in a protected and
encrypted database (flat files for the waveforms and trends, and a
relational database for all other data). Once the
data have been assembled in a central repository and time-aligned, the
waveforms and trends for each individual are linked to the
corresponding individual's data in the relational database. (See
section 1.4.3 for more information.) The data are
then de-identified to produce a final set of data for public
consumption. (See section 1.4.4 and (6) for
more information on this detailed process.) We also calculate
standardized severity scores for the records in the database and
incorporate user feedback and corrections.
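
The linking step described above can be illustrated with a minimal sketch.
It assumes that each waveform record and each ICU stay carries a patient
identifier and a start and end time, and it links a waveform record to a
stay when the patient matches and the two time spans overlap. The class
names, field names, and sample values below are hypothetical and synthetic;
they are not the MIMIC II schema or the actual linking procedure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class IcuStay:
    """Hypothetical row from the relational (clinical) side."""
    subject_id: int
    stay_id: int
    intime: datetime
    outtime: datetime


@dataclass(frozen=True)
class WaveformRecord:
    """Hypothetical descriptor of one flat-file waveform/trend record."""
    record_name: str
    subject_id: int
    start: datetime
    end: datetime


def link_waveforms(stays: List[IcuStay],
                   waveforms: List[WaveformRecord]) -> Dict[int, List[str]]:
    """Map each stay_id to the waveform records of the same patient
    whose time span overlaps that stay."""
    linked: Dict[int, List[str]] = {stay.stay_id: [] for stay in stays}
    for wf in waveforms:
        for stay in stays:
            same_patient = wf.subject_id == stay.subject_id
            overlaps = wf.start < stay.outtime and wf.end > stay.intime
            if same_patient and overlaps:
                linked[stay.stay_id].append(wf.record_name)
    return linked


# Synthetic example: one patient with two ICU stays, plus an unrelated record.
stays = [
    IcuStay(1, 10, datetime(2003, 1, 1, 8), datetime(2003, 1, 3, 12)),
    IcuStay(1, 11, datetime(2004, 5, 2, 9), datetime(2004, 5, 4, 9)),
]
waveforms = [
    WaveformRecord("rec_a", 1, datetime(2003, 1, 1, 9), datetime(2003, 1, 2, 9)),
    WaveformRecord("rec_b", 1, datetime(2004, 5, 3, 0), datetime(2004, 5, 3, 6)),
    WaveformRecord("rec_c", 2, datetime(2003, 1, 1, 9), datetime(2003, 1, 2, 9)),
]
links = link_waveforms(stays, waveforms)
```

In this sketch a patient admitted more than once has one entry per stay,
and a waveform record with no matching patient is simply left unlinked.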

The resulting records contain realistic patient measurements, with
all the associated challenges (such as noise and gaps from missing
data), of the kind that advanced monitoring and clinical decision
support system (CDSS) algorithms would receive as input. Examples of
noise and artifact in the database, together with methods for dealing
with these problems, are described in sections 1.6 and 5.4.
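
As a simple illustration of what a missing data gap looks like to an
algorithm, the sketch below flags intervals in a numeric trend where
consecutive samples are further apart than an expected maximum spacing.
The function name, the threshold, and the sample times are assumptions
made for illustration only; this is not one of the methods described in
sections 1.6 and 5.4.

```python
from typing import List, Tuple


def find_gaps(times_s: List[float],
              max_interval_s: float) -> List[Tuple[float, float]]:
    """Return (last_sample_before_gap, first_sample_after_gap) pairs for
    every pair of consecutive timestamps more than max_interval_s apart.
    Assumes times_s is sorted in ascending order."""
    gaps = []
    for prev, curr in zip(times_s, times_s[1:]):
        if curr - prev > max_interval_s:
            gaps.append((prev, curr))
    return gaps


# Synthetic trend sampled once per minute, with one five-minute dropout.
sample_times = [0.0, 60.0, 120.0, 420.0, 480.0]
gaps = find_gaps(sample_times, max_interval_s=90.0)
```

With these sample times, the only flagged interval is the one between
120 s and 420 s.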

Figure 1.1:
Schematic of data collection and database
construction. Source data consist of bedside monitor
waveforms and trends, the ICU clinical databases, the
hospital archives, and the Social Security Death Index. These
data are assembled in a protected and encrypted database which
is then de-identified to provide one relational database plus
associated flat file bedside waveforms and trends. We also
incorporate standardized severity scores and user feedback and
corrections.