National Center for Health Statistics
Edward J. Sondik, Ph.D., Director
Jack R. Anderson, Deputy Director
Jack R. Anderson, Acting Associate Director for International Statistics
Lester R. Curtin, Ph.D., Acting Associate Director for Research and
Methodology
Jennifer H. Madans, Ph.D., Acting Associate Director for Analysis,
Epidemiology, and Health Promotion
P. Douglas Williams, Acting Associate Director for Data Standards,
Program Development, and Extramural Programs
Edward L. Hunter, Associate Director for Planning, Budget and
Legislation
Jennifer H. Madans, Ph.D., Acting Associate Director for Vital and
Health Statistics Systems
Douglas Zinn, Acting Associate Director for Management
Charles J. Rothwell, Associate Director for Data Processing and
Services
Division of Data Services
Philip R. Beattie, M.S.P.H., Director
Margot Palmer, Deputy Director
Division of Health and Utilization Analysis
Diane M. Makuc, Dr.P.H., Director
Compressed Mortality File 1968-88 on CD-ROM
(CD-ROM Series 20, No. 2A ASCII Version)
Compressed Mortality File 1968-88
Introduction
NCHS Data Use Agreement
Files on CD-ROM
Description of Mortality File and File Layout
Description of Population File and File Layout
Guidelines for Citation of Data
Introduction
The Compressed Mortality File 1968-88 (CMF 1968-88) is a county-level mortality and
population data file for the United States spanning the years 1968-88. The file permits the
calculation of national, state, and county death rates for race-sex-age groups of interest. The
mortality file contains only a select set of key analysis variables, namely, 1) state and county of
residence, 2) year of death (rather than the full date of death), 3) race (recoded to white, black,
other races), 4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying
cause-of-death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode. The national, state,
and county population estimates on the CMF are from the Bureau of the Census. The age, race,
and sex detail of the population file matches that of the mortality file.
Details of the data use restrictions are given in the NCHS Data Use Agreement section.
Further detail about the mortality and population files on the CMF can be found in the
Documentation.
NCHS Data Use Agreement
The Public Health Service Act (Section 308(d)) provides that the data
collected by the National Center for Health Statistics (NCHS), Centers for
Disease Control and Prevention (CDC), may be used only for the purpose of
health statistical reporting and analysis.
Any effort to determine the identity of any reported case is prohibited
by this law.
NCHS does all it can to assure that the identity of data subjects cannot
be disclosed. All direct identifiers, as well as any characteristics that
might lead to identification, are omitted from the dataset. Any
intentional identification or disclosure of a person or establishment
violates the assurances of confidentiality given to the providers of the
information. Therefore, users will:
1. Use the data in these datasets for statistical reporting and analysis only.
2. Make no use of the identity of any person or establishment discovered
inadvertently and advise the Director, NCHS, of any such discovery.
3. Not link these datasets with individually identifiable data from other
NCHS or non-NCHS datasets.
Files on CD-ROM
README.WPD     This file is in WordPerfect version 6.1 format. It includes general
               descriptions of the mortality and population files on the Compressed
               Mortality File 1968-88 and the file layouts.
DOCUMENT.PDF   This file is in PDF format. It contains the file documentation for the CMF
               for the period 1968-88. The file contains the NCHS Data Use Agreement,
               descriptions and record layouts for the mortality and population data files,
               detailed information about the mortality and population data, cause-of-death
               coding, computation of death rates, and a dictionary of the FIPS state
               and county codes and names.
SASCODE.TXT    This file is in ASCII text format. It provides sample PC SAS
               programs for creating a format library, a mortality file, and a population
               file from the data files on the CD-ROM.
Data Files
MORT6878 The mortality data file for 1968-78
MORT7988 The mortality data file for 1979-88
POP6878 The population data file for 1968-78
POP7988 The population data file for 1979-88
Description of the Mortality Files
The mortality data, for all years except 1972, are based on records for all deaths occurring in
the United States. For 1972, the data are based on a 50 percent sample and weighted by a factor
of 2. Deaths to foreign residents are excluded. Deaths to U.S. residents who died abroad are not
included on this file. Appendix A in the Documentation provides a description of the vital
statistics reporting system maintained by the NCHS.
The source records were condensed to 23 bytes by retaining only a select set of key analysis
variables. The variables included on the condensed record are: 1) state and county of residence,
2) year of death (rather than the full date of death), 3) race (recoded to white, black, other races),
4) sex, 5) age group at death (specific age recoded to 16 age groups), 6) underlying cause-of-
death (4-digit ICD code), and 7) 69 or 72 cause-of-death recode.
Including only these few variables on the file and recoding some of them into a limited
number of categories resulted in numerous records having identical values on all of the variables.
The number of records on the file was reduced substantially by aggregating records with identical
values on all of the variables into one record. A count indicating the number of identical records
was added to the aggregate record. For example, two white male residents of Clay County,
Alabama, with ages between 35 and 44 years, died from "bronchus and lung, unspecified" (ICD
162.9) in 1979. Their records were combined into one, with a 2 in the count field. Note that
there are no records on the file with zero in the count field. If no deaths occurred for a particular
combination of variable values, no record appears.
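The aggregation described above can be sketched in Python. This is only an illustration of the idea, not the procedure NCHS used: the tuple layout, the sample records, and the name `compress` are assumptions, and the cause-of-death recode field is left out for brevity.

```python
from collections import Counter

# Illustrative individual death records:
# (FIPS state+county, year, race-sex, age group, ICD code).
# The values are hypothetical except that they mirror the Clay County example.
individual_deaths = [
    ("01027", 1979, 1, 11, "1629"),
    ("01027", 1979, 1, 11, "1629"),
    ("01027", 1979, 2, 13, "1629"),
]


def compress(records):
    """Collapse identical records into sorted (record, count) pairs."""
    counts = Counter(records)
    # Counter only holds combinations that occurred, so no zero counts appear.
    return sorted(counts.items())


compressed = compress(individual_deaths)
for record, count in compressed:
    print(record, count)
```

Because only observed combinations are counted, the result has the same property as the file: a combination with no deaths simply has no record.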
Specific details
1. Underlying cause-of-death for the years 1968-78 is classified in accordance with the
Eighth Revision International Classification of Diseases, Adapted for Use in the United
States (ICDA-8) codes. Cause-of-death for the years 1979-88 is classified in accordance
with the International Classification of Diseases, Ninth Revision (ICD-9) codes. For a
further description of the ICD codes see Appendix B in the Documentation or Volume II of
the annual mortality volumes produced by the NCHS, such as Vital Statistics of the United
States, 1978, Volume II-Mortality, Part A, or Vital Statistics of the United States, 1988,
Volume II-Mortality, Part A. For a list of comparable ICD codes for the 8th and 9th
revisions and estimated comparability ratios, see Appendix B in the Documentation.
2. The fourth digit of the ICD code can assume the values 0-9 and blank. A blank
fourth digit is stored as a blank character on this file, not as a zero. Care must be
taken when reading the file to distinguish between blanks and zeros.
3. For injuries and poisonings, the external cause is coded (E800-E999) rather than the
Nature of Injury (800-999). The letter "E" is not included in the code.
4. For 1988, if there were three or fewer deaths for a given Georgia county of residence (of
deaths occurring in Georgia) with HIV infection (ICD codes *042-*044, 796.8) cited as a
cause-of-death (underlying or non-underlying cause), these records were assigned a
"missing" place of residence code (FIPS code = 13999).
5. The FIPS state and county codes contain leading zeros in both the 2-byte state code and
the 3-byte county code.
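Items 2 and 3 above affect how the ICD field should be read. The following Python sketch shows one way to keep blank and zero fourth digits distinct and to restore the "E" prefix for display. It assumes the field has been read as a 4-character string without stripping whitespace; the function names are illustrative and are not part of the sample programs on the CD-ROM.

```python
def split_icd(raw):
    """Split a 4-character ICD field into (category, fourth digit or None)."""
    if len(raw) != 4:
        raise ValueError("ICD field must be exactly 4 characters")
    category, fourth = raw[:3], raw[3]
    # A blank fourth digit means "no fourth digit"; it is not the same as "0".
    return category, (None if fourth == " " else fourth)


def format_icd(raw):
    """Return a display form such as '162.9', '162', or 'E812.0'."""
    category, fourth = split_icd(raw)
    # Codes 800-999 on the file are external-cause codes stored without the "E".
    prefix = "E" if category.isdigit() and 800 <= int(category) <= 999 else ""
    code = prefix + category
    return code if fourth is None else f"{code}.{fourth}"


print(format_icd("1629"))
print(format_icd("162 "))
print(format_icd("1620"))
print(format_icd("8120"))
```

The same caution applies to the FIPS codes in item 5: reading them as strings rather than integers preserves the leading zeros.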
File Specifications for the Mortality Files
File name    Years      Number of records    Record length    Format
MORT6878     1968-78    8,774,864            23               ASCII
MORT7988     1979-88    16,448,435           23               ASCII
The files are sorted by locations 6-9 (year of death), 1-5 (FIPS state and county codes),
10 (race-sex), 11-12 (age at death), and 13-16 (ICD code).
Field                Item and
Location    Size     Code Outline                                Format
                     FIPS Codes
                     (See Appendices E and F in the Documentation)
1-2         2        FIPS state code                             Numeric
3-5         3        FIPS county code                            Numeric
6-9         4        Year of death                               Numeric
10          1        Race-sex                                    Numeric
                       1   White male
                       2   White female
                       3   Black male
                       4   Black female
                       5   Other male
                       6   Other female
11-12       2        Age at death                                Numeric
                       01  under 1 day
                       02  1-6 days
                       03  7-27 days
                       04  28-364 days
                       05  1-4 years
                       06  5-9 years
                       07  10-14 years
                       08  15-19 years
                       09  20-24 years
                       10  25-34 years
                       11  35-44 years
                       12  45-54 years
                       13  55-64 years
                       14  65-74 years
                       15  75-84 years
                       16  85+ years
                       99  Unknown
13-16       4        ICD code for underlying cause-of-death      Numeric
                       1968-78: ICDA-8
                       1979-88: ICD-9
17-19       3        Cause-of-Death Recode                       Numeric
                     (See Appendix B in the Documentation)
                       1968-78: 69 Cause-of-Death Recode
                       1979-88: 72 Cause-of-Death Recode
20-23       4        Number of deaths                            Numeric
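The layout above translates directly into fixed-column slicing. The following Python sketch is one possible reader and is not the SAS code supplied on the CD-ROM. The names `MortalityRecord` and `parse_mortality_line` are illustrative, the sample line is constructed by hand (its recode value "000" is a placeholder, and zero padding of the count field is an assumption), and the presence of line terminators in the data files is also assumed.

```python
from typing import NamedTuple


class MortalityRecord(NamedTuple):
    state_fips: str   # locations 1-2, kept as text to preserve leading zeros
    county_fips: str  # locations 3-5
    year: int         # locations 6-9
    race_sex: int     # location 10
    age_group: int    # locations 11-12
    icd: str          # locations 13-16, fourth character may be a blank
    recode: str       # locations 17-19
    deaths: int       # locations 20-23


def parse_mortality_line(line: str) -> MortalityRecord:
    """Parse one 23-byte record; only the line terminator is stripped."""
    line = line.rstrip("\r\n")
    if len(line) != 23:
        raise ValueError(f"expected 23 characters, got {len(line)}")
    return MortalityRecord(
        state_fips=line[0:2],
        county_fips=line[2:5],
        year=int(line[5:9]),
        race_sex=int(line[9]),
        age_group=int(line[10:12]),
        icd=line[12:16],
        recode=line[16:19],
        deaths=int(line[19:23]),
    )


# Hand-built sample: white male, age 35-44, ICD 162.9, 1979, two deaths.
sample = "01" + "027" + "1979" + "1" + "11" + "1629" + "000" + "0002"
record = parse_mortality_line(sample + "\n")
print(record)
```

Python slices are zero-based and exclude the end index, so locations 6-9 become `line[5:9]`.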
Description of the Population File
There are national, state, and county population estimates on the population file of the CMF.
The population estimates are based on U.S. Bureau of the Census estimates of U.S. national,
state, and county resident populations. The 1968-69 national estimates and all of the estimates
for 1971-79 and 1981-88 are intercensal estimates of July 1 resident populations. The 1970 and
1980 population estimates are April 1 modified (modified age-race-sex) census counts. The
1968 and 1969 state and county population estimates were calculated by NCHS using linear
extrapolation. A brief description of the population estimates is provided here; a more detailed
description is provided in Appendix D in the Documentation.
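As an illustration of the linear extrapolation mentioned above, the following Python sketch projects a straight line through two later estimates back to 1969 and 1968. It is a generic sketch, not the NCHS procedure documented in Appendix D: the function name and the population values are hypothetical, and any rounding or handling of negative results is not addressed.

```python
def linear_extrapolate(year_a, pop_a, year_b, pop_b, target_year):
    """Extend the straight line through (year_a, pop_a) and (year_b, pop_b)."""
    slope = (pop_b - pop_a) / (year_b - year_a)
    return pop_a + slope * (target_year - year_a)


# Hypothetical county estimates for 1970 and 1971.
pop_1970 = 1000
pop_1971 = 1040

pop_1969 = linear_extrapolate(1970, pop_1970, 1971, pop_1971, 1969)
pop_1968 = linear_extrapolate(1970, pop_1970, 1971, pop_1971, 1968)
print(pop_1969, pop_1968)
```

With two base years one year apart, this reduces to subtracting the 1970-71 change once for 1969 and twice for 1968.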
Specific details
1. There is one record on the file for each geographic unit (total U.S., state, county) x year x
race-sex group.
2. Modifications of the population estimates made by NCHS:
a. To permit the calculation of infant mortality rates, NCHS live-birth data were substituted
for the estimates of the population under one year of age. The race code for these records is
derived from "race of mother".
b. When the age group 1-4 years did not appear on the Census file, the age group 0-4 years
was multiplied by 0.8 to obtain an estimate of the population 1-4 years.
c. For non-censal years prior to 1992, the NCHS Division of Vital Statistics uses national
population estimates rounded to the nearest 1,000 to calculate published death rates. On the
CMF, the national population estimates for 1968-69 and 1971-79 are rounded to the nearest
1,000 in accordance with this practice. However, this means that calculation of rates for
aggregate age, race, and/or sex groups involves using population estimates that were rounded
before aggregation rather than after aggregation. As a result, national death rates for
aggregate groups calculated using the rounded estimates on the CMF may differ slightly from
those published by NCHS. The national population estimates for 1981-88 on the CMF are
not rounded so that the user can round them after aggregating across subgroups and avoid the
rounding error problem.
3. National, state, and county population estimates can be identified by using the FIPS code or
the record type variable in location 140. National population records have a FIPS code of
"00000". State population records have a valid 2-digit FIPS state code and a county code of
"000" (see Appendix E in the Documentation). The record type variable assumes the value "1"
for national records, "2" for state records, and "3" for county records.
It is necessary to provide separate sets of estimates for each geographic level because the
methodology used to produce the intercensal estimates (1971-79 and 1981-88) did not smooth
them sufficiently. Thus, for the intercensal years, the sum of the population estimates of counties
within a state may not equal the state population estimate, and the sum of all state population
estimates or all county population estimates may not equal the national population estimates. For
these years, the national population estimates should be used when calculating national death
rates and the state population estimates should be used when calculating state death rates.
4. The FIPS state and county codes contain leading zeros in both the 2-byte state code and the 3-
byte county code.
5. For 1988, there was an additional county in Georgia with a "missing" county code of "999".
The six records for this county have population counts of zero.
6. Brief description of population estimates for individual years
1968-69 population estimates - National population estimates are U.S. Bureau of
the Census intercensal estimates of the July 1 resident population. State and county
population estimates were calculated by NCHS using linear extrapolation from the
corresponding July 1, 1970 and July 1, 1971 estimates.
1970 population estimates - National, state, and county population estimates are
from a modified version of the April 1, 1970 census. The original census counts were
modified by the U.S. Bureau of the Census to correct 1) errors discovered in the data and 2) race
misclassification: persons of Hispanic origin who reported their race as "other" were recoded
as "white".
1971-79 population estimates - National and county estimates are U.S. Bureau of
the Census intercensal estimates of the July 1 resident population. The Bureau of the Census
did not produce state population estimates by age, race, and sex for the 70's. Therefore, the
state population estimates for 1971-79 on this file are simply the sum of the population
estimates for the counties in each state.
Three Virginia independent cities (Manassas, Manassas Park, and Poquoson) did not
appear on the Census file prior to 1981. While these independent cities are not on the
mortality file for 1968-78, they are on the file for 1979 onwards. Therefore, the 1979
populations for these three cities were estimated from the July 1, 1980 and July 1, 1981
estimates of these cities. The 1979 population estimates for the counties containing the cities
were reduced by the estimated city populations.
1980 population estimates - National, state, and county population estimates are
from a modified version of the April 1, 1980 census. The original census counts were
modified by the U.S. Bureau of the Census in two ways: 1) persons who reported their race as
"other" (the majority being of Hispanic origin) were reassigned to one of the official race
groups, and 2) an adjustment was made for the overcount of centenarians.
April 1, 1980 population estimates for three Virginia independent cities (Manassas,
Manassas Park, and Poquoson) had to be extrapolated from July 1, 1980 estimates. The April
1 populations for the three cities were calculated as a proportion of the April 1 county
population, with the proportion obtained from the July 1, 1980 city/county estimates. The
April 1 population estimates for the counties containing the three cities were reduced by the
estimated April 1 city populations.
1981-88 population estimates - National, state, and county estimates are U.S.
Bureau of the Census intercensal estimates of the July 1 resident population.
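The rounding issue described in item 2c can be seen with a small worked example in Python. All figures are hypothetical, and expressing the rate per 100,000 population is an assumed convention for this sketch.

```python
def round_to_nearest_thousand(n):
    """Round a non-negative integer to the nearest 1,000 (halves round up)."""
    return (n + 500) // 1000 * 1000


# Hypothetical unrounded populations for three subgroups.
subgroup_pops = [1_234_400, 2_345_400, 3_456_400]

# Rounding each subgroup first, as on the 1968-69 and 1971-79 national records.
rounded_then_summed = sum(round_to_nearest_thousand(p) for p in subgroup_pops)

# Aggregating first and rounding once, which is possible for 1981-88.
summed_then_rounded = round_to_nearest_thousand(sum(subgroup_pops))

deaths = 70_000  # hypothetical death count for the aggregate group
rate_rounded_first = deaths / rounded_then_summed * 100_000
rate_rounded_last = deaths / summed_then_rounded * 100_000

print(rounded_then_summed, summed_then_rounded)
print(rate_rounded_first, rate_rounded_last)
```

Here the two denominators differ by 1,000, so the two rates differ slightly, which is the kind of discrepancy from published rates that item 2c warns about.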
File Specifications for the Population Files
File name    Years      Number of records    Record length    Format
POP6878      1968-78    206,712              140              ASCII
POP7988      1979-88    189,966              140              ASCII
The files are sorted by locations 6-9 (year), 1-5 (FIPS state and county codes), and 10 (race-sex).
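A 140-byte population record can be read with the same fixed-column approach, using the field locations listed in this section. The Python sketch below is illustrative only: the function names, the hand-built sample line, and the county name text are assumptions. It also shows how a mortality age code could be matched to a population field, with the four infant age groups (codes 01-04) sharing the live-birth count as their denominator.

```python
AGE_LABELS = ["births", "1-4", "5-9", "10-14", "15-19", "20-24", "25-34",
              "35-44", "45-54", "55-64", "65-74", "75-84", "85+"]


def parse_population_line(line):
    """Parse one 140-byte record into a dict; only the terminator is stripped."""
    line = line.rstrip("\r\n")
    if len(line) != 140:
        raise ValueError(f"expected 140 characters, got {len(line)}")
    # Thirteen 8-byte counts occupy locations 11-114.
    counts = [int(line[10 + 8 * i: 18 + 8 * i]) for i in range(13)]
    return {
        "state_fips": line[0:2],
        "county_fips": line[2:5],
        "year": int(line[5:9]),
        "race_sex": int(line[9]),
        "counts": counts,
        "county_name": line[114:139].strip(),
        "record_type": int(line[139]),
    }


def level_from_fips(state_fips, county_fips):
    """Return 1 (national), 2 (state), or 3 (county) from the FIPS codes."""
    if state_fips + county_fips == "00000":
        return 1
    return 2 if county_fips == "000" else 3


def population_index_for_age_code(age_code):
    """Map a mortality age code (01-16) to an index into the counts list."""
    if 1 <= age_code <= 4:
        return 0  # infant deaths use live births as the denominator
    if 5 <= age_code <= 16:
        return age_code - 4
    raise ValueError("no population denominator for this age code")


# Hand-built sample record with hypothetical counts 100, 200, ..., 1300.
sample = ("01" + "027" + "1979" + "1"
          + "".join(f"{n:8d}" for n in range(100, 1400, 100))
          + "Clay County".ljust(25) + "3")
rec = parse_population_line(sample)
idx = population_index_for_age_code(11)
print(AGE_LABELS[idx], rec["counts"][idx])
```

Deriving the level from the FIPS codes and comparing it with the record type in location 140 gives a simple consistency check when loading the file.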
Field                Item and
Location    Size     Code Outline                                Format
                     FIPS codes
                     (See Appendices E and F in the Documentation)
1-2         2        FIPS state code                             Numeric
3-5         3        FIPS county code                            Numeric
6-9         4        Year                                        Numeric
10          1        Race-sex                                    Numeric
                       1   White male
                       2   White female
                       3   Black male
                       4   Black female
                       5   Other male
                       6   Other female
11-18       8        Number of live births                       Numeric
19-26       8        Population in age group: 1-4 years          Numeric
27-34       8        Population in age group: 5-9 years          Numeric
35-42       8        Population in age group: 10-14 years        Numeric
43-50       8        Population in age group: 15-19 years        Numeric
51-58       8        Population in age group: 20-24 years        Numeric
59-66       8        Population in age group: 25-34 years        Numeric
67-74       8        Population in age group: 35-44 years        Numeric
75-82       8        Population in age group: 45-54 years        Numeric
83-90       8        Population in age group: 55-64 years        Numeric
91-98       8        Population in age group: 65-74 years        Numeric
99-106      8        Population in age group: 75-84 years        Numeric
107-114     8        Population in age group: 85+ years          Numeric
115-139     25       County name                                 Character
                     (See Appendix F in the Documentation)
140         1        Record type                                 Numeric
                       1   National population record
                       2   State population record
                       3   County population record
Guidelines for Citation of Data
With the goal of mutual benefit, the National Center for Health Statistics
(NCHS) requests that recipients of data files cooperate in certain actions
related to their use. Any published material derived from the data should
acknowledge NCHS as the original source. The suggested citation to appear
at the bottom of all tables is as follows:
Source: National Center for Health Statistics (span of years used)
When cited in a bibliography, the citation should read:
National Center for Health Statistics (2000). Data File
Documentation, Compressed Mortality File, 1968-88 (machine
readable data file and documentation, CD-ROM Series 20, No. 2A),
National Center for Health Statistics, Hyattsville, Maryland.
The published material should also include a disclaimer that credits any
analyses, interpretations, or conclusions reached to the author (recipient of
the data file) and not to NCHS, which is responsible only for the initial
data. Consumers who wish to publish a technical description of the data
should make an effort to ensure that the description is not inconsistent with
that published by NCHS.