This chapter serves as a guide for data users to both the file and the technical documentation. Novice users trying to understand how to use the documentation and the file should read this chapter first.

Flat ASCII files by state are available for download via File Transfer Protocol (FTP) from American FactFinder on the U.S. Census Bureau's Web site, http://www.census.gov/.

The data can be imported easily into standard software packages for manipulation. The geographic header record data are presented in fixed-length ASCII format, and the data files are in comma-separated ASCII format.

The DVD product contains an updated version of the Census 2000 Data Engine software that can aggregate user-defined areas, allows for multiple geographic selections, and creates customized variables and reports. Both the State files and National file will be included as a single DVD product. The software can export files in several formats, including databases and spreadsheets. Files can also be exported for use with popular GIS applications.

The smallest component for all census geography is the block. Figure 2-3 at the end of this chapter provides an example of the various geographic hierarchies used, building from the block. Take some time to review this chart to become familiar with the different hierarchies. Begin reading the schematic from the bottom at the blocks entry. By following the lines, you can see the hierarchy very quickly. For example, follow blocks to block groups to census tracts to counties. This path indicates that census tracts and their sublevels in the hierarchy are uniquely identified within a county and do not cross county boundaries. Follow blocks to the school district hierarchy. This path tells you that school districts can cross jurisdictional boundaries but do not cross state lines. Figure 2-4 at the end of this chapter presents similar information for the American Indian area/Alaska Native area/Hawaiian home land hierarchy. Again, read the schematic from the bottom, beginning with the lowest level of geography, i.e., census blocks.

The geographic header record, Figure 2-5 at the end of this chapter, defines each field and provides its data dictionary reference name, field size, starting position, and data type. In addition, the presence or absence of an X in each summary level column indicates whether geographic information is present for that particular summary level. For example, in the column for summary level 040, an X appears for the first 10 fields, indicating that there will be information for those fields. In the county field, there is no X, indicating that there is no information for county in summary level 040. Since 040 is the summary level for state, this is perfectly logical.
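A fixed-length header line can be read by slicing it at each field's starting position and field size. The Python sketch below illustrates the idea only. The positions and sizes in LAYOUT are assumptions made for this example and should be replaced with the values given in Figure 2-5; the sample line is made up, and parse_header is a hypothetical helper name.

```python
# Illustrative layout: field name -> (starting position, field size).
# Positions are 1-based, as in a record layout chart.
# ASSUMPTION: these values are for illustration; take the real ones from Figure 2-5.
LAYOUT = {
    "FILEID": (1, 6),
    "STUSAB": (7, 2),
    "SUMLEV": (9, 3),
    "LOGRECNO": (19, 7),
}

def parse_header(line, layout=LAYOUT):
    """Slice one fixed-length header line into a dict of stripped field values."""
    record = {}
    for name, (start, size) in layout.items():
        # Convert the 1-based starting position to a 0-based slice.
        record[name] = line[start - 1:start - 1 + size].strip()
    return record

# Made-up sample line, built piece by piece so each assumed field is visible.
sample = "SF1ST " + "DE" + "040" + "00" + "000" + "01" + "0000001"
header = parse_header(sample)
print(header)
```

A field that carries no X for a given summary level would simply come back as an empty string after stripping, which is one way to check the chart against the file.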

The geographic header record includes, for the first time, space reserved to accommodate the transition from the Federal Information Processing Standards (FIPS) 55 Code Series to the American National Standards Institute (ANSI) Code Series for the identification of selected geographic entities. Each of nine fields has eight character spaces reserved for an eight-digit Geographic Names Information System (GNIS) identifier code that has been adopted as part of a new national standard. The GNIS is the nation's official geographic names repository database and has been designated by the U.S. Board on Geographic Names as the official source of geographic names for use by the federal government and its contractors. Federal agencies are expected to adopt the GNIS ID as a standard code for public and federal data exchange. The fields identified in the geographic header record are:

Field length   Field name
8              State (ANSI)
8              County (ANSI)
8              County Subdivision (ANSI)
8              Place (ANSI)
8              Consolidated City (ANSI)
8              American Indian Area/Alaska Native Area/Hawaiian Home Land (ANSI)
8              American Indian Tribal Subdivision (ANSI)
8              Alaska Native Regional Corporation (ANSI)
8              Subminor Civil Division (ANSI)

The GNIS identifiers for states, counties, and equivalent areas are supplemental codes that do not replace the federal standard two-digit state and three-digit county codes also appearing in the header. In its unique geographic identifiers, the Census Bureau will continue to maintain and use the existing five-digit codes for place, county subdivision, consolidated city, Alaska Native Regional Corporation, and subminor civil division, and it will assign and issue codes for new entities to meet customer needs, although these codes are not official or part of the new ANSI standards. The Census Bureau also will continue to maintain the existing four-digit codes for American Indian area/Alaska Native area/Hawaiian home land and three-digit codes for American Indian tribal subdivision.

The summary level sequence chart (Chapter 4) identifies each geographic level and provides the code that is in the SUMLEV field. The last geographic area type listed in the sequence identifies the geography of the summary level; the preceding area types simply identify the geographic hierarchy. Consider the following two examples:

140 State-County-Census Tract

144 State-County-Census Tract-American Indian area/Alaska Native area/Hawaiian home land

For summary level 140, the record contains data for a census tract, within a county, within a state. Census tracts are uniquely numbered within a county and do not cross county boundaries. Since counties do not cross state boundaries, this is a simple application. Thus, summary level 140 provides data for a complete census tract.

In summary level 144, the geography is more complex. The key is to work backward through the hierarchy. Thus, summary level 144 is a record for the portion of an American Indian area (or an Alaska Native area or a Hawaiian home land) within a specific census tract within a specific county within a state.

When reading the summary level sequence chart, it is important to recognize that hyphens (-) separate the individual hierarchies, while slashes (/) separate different types of geography (such as place/remainder) within the same hierarchy.
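The reading rule above (hyphens between hierarchy levels, slashes between alternative geography types, last entry names the summarized geography) can be sketched in a few lines of Python. This is an illustration only; parse_sequence is a hypothetical helper, and it assumes that no area type name itself contains a hyphen or a slash.

```python
def parse_sequence(sequence):
    """Split a summary level sequence into levels, each a list of alternative types."""
    levels = []
    for level in sequence.split("-"):
        # Slashes separate different types of geography within the same level.
        levels.append([part.strip() for part in level.split("/")])
    return levels

levels = parse_sequence(
    "State-County-Census Tract-"
    "American Indian area/Alaska Native area/Hawaiian home land"
)
summarized = levels[-1]   # the geography the record actually describes
enclosing = levels[:-1]   # the hierarchy that the geography sits within
print("Summarized geography:", " or ".join(summarized))
print("Within:", " > ".join(level[0] for level in enclosing))
```

Working backward from the last entry, as the text suggests for summary level 144, amounts to reading `summarized` first and then walking `enclosing` from right to left.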

The data in Summary File 1 and other 2010 Census summary files are segmented so that no individual file has more than 255 fields, which makes it easier to export the data into spreadsheet or database software. The data and the corresponding geographic information for an individual state are known as the file set. Because of the large size of the tables, the file set will be broken into 48 files: a geographic header record file and 47 table files. Figure 2-2 provides the file/table details.

It is easiest to think of the file set as a logical file. However, this logical file consists of 48 physical files: the geographic header record file and file01 through file47. This file design is comparable to that used in Census 2000.

A unique logical record number (LOGRECNO in the geographic header) is assigned to all files for a specific geographic entity. This is done so all records for that specific entity can be linked together across files. Besides the logical record number, other identifying fields also are carried over from the geographic header file to the table file. These are file identification (FILEID), state/U.S. abbreviation (STUSAB), characteristic iteration (CHARITER), and characteristic iteration file sequence number (CIFSN). See Figure 2-1 on the next page for an example.
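The linkage described above can be sketched as a lookup keyed on LOGRECNO. The Python example below is a minimal illustration under stated assumptions: the rows are made up, the column order of the table file (FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, then data cells) is assumed for the example, and link_by_logrecno is a hypothetical helper, not part of any Census Bureau software.

```python
import csv
import io

# Made-up comma-separated table rows; real table files carry many more cells.
# ASSUMED column order: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, data cells.
table_text = (
    "SF1ST,DE,000,01,0000001,100\n"
    "SF1ST,DE,000,01,0000002,60\n"
)

# Header records as already-parsed dicts (hypothetical values).
geo_records = [
    {"LOGRECNO": "0000001", "SUMLEV": "040"},
    {"LOGRECNO": "0000002", "SUMLEV": "050"},
]

def link_by_logrecno(geo_records, table_file, logrecno_col=4):
    """Attach each table row's data cells to the header record sharing its LOGRECNO."""
    geo_by_id = {rec["LOGRECNO"]: rec for rec in geo_records}
    linked = []
    for row in csv.reader(table_file):
        geo = geo_by_id.get(row[logrecno_col])
        if geo is not None:
            # Keep the geography fields and add the cells that follow LOGRECNO.
            linked.append({**geo, "data": row[logrecno_col + 1:]})
    return linked

linked = link_by_logrecno(geo_records, io.StringIO(table_text))
print(linked[0])
```

The same lookup can be repeated for each of the table files in a file set, since every file carries the same LOGRECNO for a given geographic entity.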

The geographic header record is standard across all electronic data products from the 2010 Census. Some header fields that appear in both files (geographic header and file01) are not used. For example, the CHARITER field will be used in the 2010 Census Summary File 2 but is always coded as 000 in the 2010 Census Summary File 1.