CMS provides a 2552-96 to 2552-10 crosswalk and a list of cost coded worksheets.
The text files in 2552-10 and 2552-96 should be helpful for getting an idea of what worksheet, column, and line combinations are on each version of the worksheets.
Per the README, the line numbers on the 2552-96 cost center coded worksheets are replaced with the Cost Center Codes.
In 2552-10, the cost coded line numbers are not changed to cost codes.
Instead, the cost center associated with each cost coded line number is described in the
alphanumeric data file value for the report number where WKSHT_CD = "A000000" and
CLMN_NUM = "00000". Another difference between 2552-96 and 2552-10 is that
the column number variable is 4 characters wide in 2552-96 and 5 characters wide
in 2552-10. So, for example, column 1 is "0100" in 2552-96 and "00100" in 2552-10.
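
The two differences described above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rows are made up, and the padding rule (one leading zero) is inferred solely from the "0100" to "00100" example, so check it against the crosswalk before relying on it.

```python
def to_2552_10_column(clmn_num_96):
    """Left-pad a 4-character 2552-96 column number to the 5-character 2552-10 width.

    Assumption: the extra character is a leading zero, as in "0100" -> "00100".
    """
    if len(clmn_num_96) != 4:
        raise ValueError(f"expected 4 characters, got {clmn_num_96!r}")
    return clmn_num_96.zfill(5)


def cost_center_labels(alpha_rows, rpt_rec_num):
    """Map LINE_NUM -> cost center description for one 2552-10 report.

    Keeps only alphanumeric rows where WKSHT_CD = "A000000" and CLMN_NUM = "00000".
    """
    labels = {}
    for rec, wksht, line, clmn, value in alpha_rows:
        if rec == rpt_rec_num and wksht == "A000000" and clmn == "00000":
            labels[line] = value.strip()
    return labels


# Made-up alphanumeric rows: (RPT_REC_NUM, WKSHT_CD, LINE_NUM, CLMN_NUM, value).
sample_alpha = [
    ("1001", "A000000", "00100", "00000", "EXAMPLE COST CENTER ONE "),
    ("1001", "A000000", "00200", "00000", "EXAMPLE COST CENTER TWO"),
    ("1001", "S200001", "00300", "00100", "NOT A COST CENTER LABEL"),
    ("1002", "A000000", "00100", "00000", "OTHER REPORT"),
]
labels = cost_center_labels(sample_alpha, "1001")
```

Here `labels` holds one description per cost coded line number of report "1001", which can then be joined to the numeric data on LINE_NUM.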

CMS began including rollups of their internal 2552-10 A, B, C, D, E, G, and S series in April 2014.
They consider the 2552-96 rollup files to be unreliable.

CMS released bad debt files for 2552-96 but not for 2552-10. However, a 2552-10
bad debt rosetta stone is available.

Select variables from the cost reports are available in the files below.
One of the variables included is ZIP Code. A
ZIP Code distance database is available.

To get the worksheets that were completed to generate the cost report data, go to
Paper-Based Manuals,
choose Publication # 15-2 for the Provider Reimbursement Manual Part 2,
then choose Chapter 36 for
"cost reporting periods ending on or after September 30, 1996"
or Chapter 40
for "cost reports with fiscal years beginning on or after May 1, 2010".
R20P236F.zip in P152_36.zip has the 2552-96 worksheets A-M,S.

The layout for the way
2552-96 cost report data was delivered might be helpful for getting an idea of
some of the variables in some of the worksheets.

The HCRIS data consists of four databases: one has alphanumeric
variables, one has numeric variables, one has hospital report
meta-variables, and one has many of the individual numeric variables
"rolled up" into one variable.

Be careful with negative amounts when using either the rollup
file or the numeric file. For some items,
losses are to be recorded as (+amount). In the past,
such values were sometimes recorded as -amount instead. Check the
worksheet instructions and the data when there may be negative amounts.
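
One way to act on this warning is to list the negative values for the items whose instructions call for a positive amount. The sketch below is hypothetical throughout: the function name, the worksheet, line, and column codes, and the sample rows are placeholders, and which items need checking has to come from the worksheet instructions.

```python
def flag_negative_amounts(numeric_rows, items_to_check):
    """Return the rows for the listed items whose value is below zero.

    items_to_check is a set of (WKSHT_CD, LINE_NUM, CLMN_NUM) tuples chosen
    by the user after reading the worksheet instructions.
    """
    flagged = []
    for rec, wksht, line, clmn, value in numeric_rows:
        amount = float(value)
        if (wksht, line, clmn) in items_to_check and amount < 0:
            flagged.append((rec, wksht, line, clmn, amount))
    return flagged


# Placeholder codes and made-up values, for illustration only.
loss_item = ("G300000", "02900", "00100")
sample_numeric = [
    ("1001", "G300000", "02900", "00100", "1500"),
    ("1002", "G300000", "02900", "00100", "-1500"),
    ("1003", "G300000", "00100", "00100", "-20"),
]
suspect = flag_negative_amounts(sample_numeric, {loss_item})
```

Only report "1002" is flagged here; report "1003" has a negative value too, but on an item that was not listed for checking.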

The primary key linking these datasets is the report
record number, RPT_REC_NUM.
The hospital report database is an ordinary and small rectangular data file.
The alphanumeric (A) and especially the numeric (N) databases are big, long, skinny files.
The A & N files have all the HCRIS report variables for all fiscal years from 1996 on.
They have five variables each:
RPT_REC_NUM, WKSHT_CD, LINE_NUM, CLMN_NUM, and the value.
Extract data
from these files using the worksheet code, line number and column number.
Worksheets have names like 255296_*.xls, where * is a letter, a through s.
The rollup files have three variables: RPT_REC_NUM; LABEL, which is a
reasonable mnemonic variable name reflecting the worksheet code, line number,
and column numbers that were rolled up; and ITEM, which is the value.
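
Extraction from the long, skinny files amounts to filtering on worksheet code, line number, and column number, then reshaping to one record per report. The sketch below assumes a headerless CSV with the five fields in the order listed above; the function name, the variable names, the codes, and the sample values are all hypothetical.

```python
import csv
import io


def extract_variables(lines, wanted):
    """Build {RPT_REC_NUM: {variable_name: value}} from a long numeric file.

    wanted maps (WKSHT_CD, LINE_NUM, CLMN_NUM) to a variable name of the
    user's choosing. Rows for any other combination are skipped.
    """
    wide = {}
    for rec, wksht, line, clmn, value in csv.reader(lines):
        name = wanted.get((wksht, line, clmn))
        if name is not None:
            wide.setdefault(rec, {})[name] = float(value)
    return wide


# Made-up rows: RPT_REC_NUM, WKSHT_CD, LINE_NUM, CLMN_NUM, value.
sample_file = io.StringIO(
    "1001,S300001,01200,0100,250\n"
    "1001,S300001,01200,0600,61000\n"
    "1001,G000000,00100,0100,99\n"
    "1002,S300001,01200,0100,80\n"
)
wanted = {
    ("S300001", "01200", "0100"): "beds",
    ("S300001", "01200", "0600"): "inpatient_days",
}
by_report = extract_variables(sample_file, wanted)
```

Because the function reads one row at a time and keeps only the requested combinations, memory use grows with the number of reports and variables kept rather than with the size of the input file.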

A statistical or database package such as SAS or Oracle,
or a programming language that can handle large files, is necessary to use the HCRIS data.
The fyYEAR.zip files are about 100 MB and can unzip to over 1 GB.
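
In a general-purpose language, one way to cope with these sizes is to stream rows straight out of the zip archive instead of unzipping it first. The following is a minimal Python sketch using only the standard library. The member name, the CSV layout, and the latin-1 encoding are assumptions for the demonstration, not facts about the fyYEAR.zip files.

```python
import csv
import io
import zipfile


def iter_zip_csv_rows(zip_source, member_name, encoding="latin-1"):
    """Yield CSV rows from one member of a zip archive, one row at a time.

    zip_source may be a path or a file-like object. The member is
    decompressed as it is read, so it is never held in memory in full.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            for row in csv.reader(text):
                yield row


# Build a tiny in-memory archive so the sketch runs without any download.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr(
        "example_nmrc.csv",  # hypothetical member name
        "1001,G000000,00100,0100,500\n1002,G000000,00100,0100,700\n",
    )
demo.seek(0)

row_count = 0
total = 0.0
for _rec, _wksht, _line, _clmn, value in iter_zip_csv_rows(demo, "example_nmrc.csv"):
    row_count += 1
    total += float(value)
```

For a real archive, `zipfile.ZipFile(path).namelist()` shows the actual member names to pass in.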

The SAS datasets created by these programs can be converted to other formats using
conversion software such as Stat/Transfer.

The files below may be helpful for extracting data.
They list every combination of worksheet code, column number, and line
number in HCRIS' alphanumeric and numeric databases.
These files can be manipulated
using software such as MS Excel, especially the alphanumeric file.
MS Excel has a limit of 65536 lines,
and the numeric file has over 200,000 unique combinations of WKSHT_CD, CLMN_NUM, and LINE_NUM.
The numeric file could be cut down using an editor,
or read with a package such as MS Access.
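
As an alternative to cutting the file down by hand, the combinations can be listed and split into pieces that fit under the 65536-line limit. This sketch is hypothetical: the function names are invented and the rows are synthetic, with the field order assumed to be RPT_REC_NUM, WKSHT_CD, LINE_NUM, CLMN_NUM, value.

```python
EXCEL_ROW_LIMIT = 65536  # line limit quoted in the text above


def unique_combinations(rows):
    """Return the sorted unique (WKSHT_CD, CLMN_NUM, LINE_NUM) combinations."""
    return sorted({(wksht, clmn, line) for _rec, wksht, line, clmn, _value in rows})


def split_for_excel(combos, limit=EXCEL_ROW_LIMIT):
    """Split a list into consecutive chunks of at most `limit` entries."""
    return [combos[start:start + limit] for start in range(0, len(combos), limit)]


# Synthetic rows: 70,000 distinct line numbers, each appearing in two reports.
rows = [
    (rec, "W000000", f"{i:05d}", "0100", "0")
    for rec in ("1001", "1002")
    for i in range(70000)
]
combos = unique_combinations(rows)
chunks = split_for_excel(combos)
```

Each chunk can then be written to its own sheet or file.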

SAS-friendly versions are already available with the SAS programs below.

Users of the old Prospective Payment System (PPS)
version of the hospital cost report data may wonder how PPS fields correspond to
data extracted from HCRIS. The worksheet codes, column numbers,
and line numbers from the most recent (fiscal years 1996-1999)
PPS files correspond exactly to the earliest 1996-1999 HCRIS
files. Most PPS fields correspond
to exactly one column and line on a 2552-96 worksheet.
A few PPS fields, however, are the sum of multiple lines.
The means files below are one way to check
that HCRIS-extracted data corresponds to PPS fields. The match
may not be exact, though. The HCRIS files have the most up-to-date
cost report data for fiscal years 1996-1999. Frequency tables of
the character variables are included as well.
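
A check of this kind can be sketched as follows: rebuild a PPS-style field by summing its component lines for each report, then compare the mean against the published one with a tolerance, since the match may not be exact. The function names, the codes, the values, the 1% tolerance, and the choice to treat a missing line as zero are all assumptions made for illustration.

```python
import math
from statistics import mean


def summed_field(report_values, component_keys):
    """Sum the component lines of one field for one report.

    Assumption: a component missing from the report counts as 0.
    """
    return sum(report_values.get(key, 0.0) for key in component_keys)


def means_agree(hcris_values, pps_mean, rel_tol=0.01):
    """Check whether the HCRIS mean is within rel_tol of the PPS mean."""
    return math.isclose(mean(hcris_values), pps_mean, rel_tol=rel_tol)


# Placeholder (WKSHT_CD, LINE_NUM, CLMN_NUM) keys and made-up values.
line_a = ("S300001", "00100", "0100")
line_b = ("S300001", "00200", "0100")
reports = {
    "1001": {line_a: 100.0, line_b: 50.0},
    "1002": {line_a: 200.0},
}
field_values = [summed_field(values, [line_a, line_b]) for values in reports.values()]
close_enough = means_agree(field_values, 175.5)
too_far = means_agree(field_values, 200.0)
```

A mismatch beyond the tolerance does not by itself show an extraction error, because the HCRIS files may hold more recent cost report data than the PPS files.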