Sample Selection from Microfilmed Population Schedules

Population schedules from the 1950 census are preserved on 6,278 one-hundred-foot reels of 35mm microfilm. Only the front of Form P-1, the Population and Housing Schedule, was microfilmed; this side contained geographic and other identifying information and the population census items. The back, which contained the housing census items for dwelling units, was not microfilmed. The original schedules were destroyed after microfilming, as were the punch cards originally used to process the 1950 census. Copies of the microfilm are stored at the National Archives in Washington, D.C., and at the Personal Records Service Branch of the Bureau of the Census in Pittsburg, Kansas. All microfilm processing for the public use sample project was performed by the Census Bureau at its Pittsburg, Kansas, facility.

The 6,278 microfilm reels were randomly assigned to 20 subsamples. Final sampling of cases for transcription was conducted independently within each of the 20 subsamples. This was done because cost estimates were uncertain and contingency plans called for reducing the size of the sample if necessary. However, all 20 independent subsamples were completed.
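The random assignment of reels to subsamples can be illustrated with a minimal sketch. The source says only that the assignment was random; the shuffle-and-deal method, the near-equal subsample sizes, the seed, and the function name `assign_reels_to_subsamples` are all assumptions made for illustration, not the Census Bureau's documented procedure.

```python
import random


def assign_reels_to_subsamples(n_reels=6278, n_subsamples=20, seed=1950):
    """Randomly assign reel numbers 1..n_reels to subsamples 1..n_subsamples.

    Assumption: reels are shuffled and then dealt out in turn, so subsample
    sizes differ by at most one. The source does not describe the mechanism.
    """
    rng = random.Random(seed)
    reels = list(range(1, n_reels + 1))
    rng.shuffle(reels)
    # Deal the shuffled reels round-robin across the subsamples.
    return {reel: (i % n_subsamples) + 1 for i, reel in enumerate(reels)}


assignment = assign_reels_to_subsamples()
```

Because every reel lands in exactly one subsample, dropping whole subsamples would have shrunk the sample without disturbing the design of the remaining ones, which is the contingency the text describes.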

The 6,278 microfilm reels at the Pittsburg, Kansas, facility are organized alphabetically by state, within states alphabetically by county, and within counties numerically by enumeration district. Each of the 20 subsamples is a representative, albeit clustered, sample of the United States population.

The 20 subsamples were processed separately. The public use sample file is sequenced by subsample. The household record item SUBSAMP identifies each subsample. The value of SUBSAMP does not indicate the order of processing; the original subsample numbers were changed to protect the confidentiality of the microfilm reels.

The population schedules within an enumeration district are numbered consecutively in three basic sections. Sheets 1-50 contain regular listings of persons in households (regular or group quarters). Sheets 51-70 contain listings of transient persons not residing in permanent-type living quarters; in some enumeration districts the regular listings extended into sheets 51-70, but this did not cause sampling problems. Sheets 71-100 contain listings of households or persons missed in the initial regular listing.

Within an enumeration district, households were numbered in order of visitation. This serial number is recorded in column 3 of the population schedule. Persons listed on sheets 71-100 who were members of households that were listed on sheets 1-70 were listed with the same serial number as the household to which they belonged.

Sampling Procedures for Household Selection

In the 1940 and subsequent censuses, some of the census questions were asked of all persons and others were asked of a sample of persons. In 1960, 1970, and 1980, the sample questions were included in a "long-form" questionnaire that was administered to selected households. All persons in the selected households were asked the sample questions. The public use samples from the 1960, 1970, and 1980 censuses were designed as samples of households that were administered the long-form questionnaire. Thus, the full set of census information is available for each person in these public use samples.

In the 1940 and 1950 censuses, the census sample questions were asked of persons on selected lines of the population schedule. Thus, if one person in a household was asked the sample questions, the household members listed on adjoining lines were not. Design of the public use samples for 1940 and 1950 required a compromise between the desire to provide information on all members of selected families and households, and the desire to provide as large a sample as feasible of persons who answered the census sample questions.

The sampling procedure chosen for the 1950 public use sample was designed to select two interlocking samples. The first selection was a systematic random sample of persons who answered the 3 and 1/3 percent sample questions at the bottom of the population schedule. The second selection applied only if the selected person was part of a regular household: in that case, the 100 percent census items for every member of the household were also included in the public use sample. If the selected person was part of a group quarters household (institution, transient quarters, or lodging house), no other members of the household were included.

The front side of the 1950 Population and Housing Schedule (Form P-1) included lines for enumerating 30 persons. Every 5th line on the sheet was clearly marked as a sample line. Additional questions for the 6 sample line persons were listed at the bottom of the sheet, along with space for recording the answers. These questions constituted the 20 percent sample questions. For the last sample line at the bottom of each sheet, there was a block of additional questions. These questions, asked of one of every six persons in the 20 percent sample, constituted the 3 and 1/3 percent sample questions.

Drawing of the public use sample from the microfilm records was directed by a computer program which provided instructions to the sampling clerks. The clerks, sitting at stations with a microfilm reader and video display terminal, were instructed to mount a specified microfilm reel and locate a specified enumeration district and sheet number. The sheet numbers were specified by the program to constitute a 1/11 systematic random sample of sheets. Selection of 1 of 30 persons from 1 in 11 sheets produced a 1 in 330 sample of "sample line persons" (persons in the 3 and 1/3 percent census sample). The clerk then followed a computer-directed procedure to determine the type of household of the person listed on the 6th sample line. Information was examined on relationship (columns 8 and A), serial number of dwelling unit (column 3), header information (block e, "Hotel, Large Rooming House, Institution, Military Installation, etc."), and sheet number. The clerk also tested for the presence of five or more unrelated individuals in the household listing. These procedures permitted determination of whether the person lived in a group quarters household. (See the glossary entry, GROUP QUARTERS.) If not, the person was designated as living in a regular household.
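The arithmetic behind the 1 in 330 rate, and the idea of a systematic sample of sheets, can be sketched as follows. This is an illustration only: the source does not say how the starting sheet was chosen or how the count carried across enumeration districts, so the random start within the first interval, the 50-sheet example, and the function name `systematic_sample` are assumptions.

```python
import random
from fractions import Fraction

SHEET_INTERVAL = 11    # 1 in 11 sheets (from the text)
LINES_PER_SHEET = 30   # 30 person lines per sheet (from the text)


def systematic_sample(n_units, interval, rng):
    """Return 1-based positions selected by a systematic sample.

    Assumption: the start is drawn uniformly from the first interval.
    """
    start = rng.randint(1, interval)
    return list(range(start, n_units + 1, interval))


# One designated person per selected sheet: 1/30 of persons on 1/11 of sheets.
overall = Fraction(1, LINES_PER_SHEET) * Fraction(1, SHEET_INTERVAL)

rng = random.Random(0)
selected_sheets = systematic_sample(50, SHEET_INTERVAL, rng)  # hypothetical 50-sheet run
```

Here `overall` evaluates to the fraction 1/330, matching the rate stated above, and `selected_sheets` holds sheet numbers spaced exactly 11 apart.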

The clerk transcribed the complete-count and all sample items from the population schedule for the selected person, and if the person lived in a regular household, the complete-count items for all other household members. On a second pass through the enumeration district, the clerk was instructed by the computer, based on the entries from the first pass, to record designated entries on sheets 71 and above for persons who were members of a regular household already selected for inclusion in the sample.

If the 6th sample line on the selected population schedule was blank, but at least one filled basic line followed it, no selection was made from that schedule and the clerk proceeded to the 11th following schedule. If the 6th sample line was blank and all following basic lines were also blank, one of the 6 sample lines on that schedule was selected at random. If a person was listed on the randomly selected sample line, that person and all members of his or her regular household were included in the public use sample. (In subsequent processing, answers to the 3 and 1/3 percent sample questions for this person were allocated following the same procedures used when data were missing for other reasons.) If no person was listed on the randomly selected sample line, no selection was made for the public use sample from that schedule. This procedure for handling "short" population schedules was designed to maintain the overall selection probability of 1 in 330 for sample line persons.
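The decision rule for a selected schedule can be written out as a small function. This is a sketch under stated assumptions: the schedule is reduced to two hypothetical inputs (whether each of the six sample lines is filled, and whether any filled basic line follows the 6th sample line), and the random choice is assumed to be uniform over the six sample lines. The name `select_sample_line` is invented for illustration.

```python
import random


def select_sample_line(sample_lines_filled, filled_line_follows_sixth, rng):
    """Return the 0-based index (0..5) of the selected sample line, or None.

    sample_lines_filled: six booleans, one per sample line, top to bottom.
    filled_line_follows_sixth: True if any basic line after the 6th sample
        line contains a person.
    """
    if len(sample_lines_filled) != 6:
        raise ValueError("expected exactly six sample lines")

    # Normal case: a person is listed on the 6th sample line.
    if sample_lines_filled[5]:
        return 5

    # 6th sample line blank but listing continues below it: skip the schedule.
    if filled_line_follows_sixth:
        return None

    # "Short" schedule: pick one of the six sample lines at random
    # (assumed uniform) and keep it only if a person is listed there.
    pick = rng.randrange(6)
    return pick if sample_lines_filled[pick] else None


rng = random.Random(1950)
full_sheet = select_sample_line([True] * 6, False, rng)
skipped = select_sample_line([True] * 5 + [False], True, rng)
short_sheet = select_sample_line([True, True, False, False, False, False], False, rng)
```

Under the uniform-choice assumption, each person on a sample line of a short schedule has a 1 in 6 chance of selection. Since sample lines are one-fifth of all lines, that works out to the same 1 in 30 per-schedule rate as on a full schedule.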

Some sampling and enumeration experiments were carried out in a few areas of Michigan and Ohio as part of the 1950 census. In these experiments, all members of every fifth household were asked sample questions. To draw a 1 in 330 sample similar to that used in the rest of the 1950 public use sample, a 1 in 66 sample of persons was selected from the microfilm containing the 1 in 5 sample of households. After randomly selecting the first person between 1 and 66, every sixty-sixth person thereafter was selected. All census items for the selected person and the 100 percent census items for the other members of the selected person's regular household were transcribed onto special forms similar to those used in the rest of the country. Clerical entry and further processing then proceeded as for the non-experimental areas.
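The experimental-area selection can be sketched in the same way. The 1 in 5 household rate and the 1 in 66 person interval come from the text; the function name `every_kth_person`, the seed, and the 1,000-person example list are hypothetical.

```python
import random
from fractions import Fraction


def every_kth_person(n_persons, k=66, seed=None):
    """Pick a random start between 1 and k, then every k-th person after it."""
    rng = random.Random(seed)
    start = rng.randint(1, k)
    return list(range(start, n_persons + 1, k))


# 1 in 5 households asked the sample questions, then 1 in 66 of those persons.
combined = Fraction(1, 5) * Fraction(1, 66)

picks = every_kth_person(1000, seed=7)  # positions within a hypothetical 1,000-person listing
```

The product `combined` is 1/330, which is how the 1 in 66 interval reproduces the rate used in the rest of the country.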