In addition to providing a web-based platform to search and analyze the Supreme Court Database, this site also provides downloadable files that researchers can use with their own statistical software. For each version of the database there is complete parity between the data used by the web-based system and that available for download. This allows for replication of results. The site archives all legacy data to allow for further replication of analyses conducted in the past.

Choosing a File

The most difficult part of downloading the Supreme Court Database is choosing which file to use. We provide both Case Centered and Justice Centered data. In the Case Centered data the unit of analysis is the case; i.e., each row of the database contains information about an individual case. The Citation database includes one row for each dispute. Consolidated cases (those with multiple dockets) and cases with multiple issues or legal provisions are included only once. The Docket database includes a row for each docket. Thus, consolidated cases appear multiple times, but there is only a single issue and legal provision for each case. Some cases, however, involve multiple issues or legal provisions. The Legal Provision database includes a row for each issue or legal provision dealt with by the Court. Finally, the Legal Provision Including Split Votes database adds the very rare instances in which the Court has multiple vote coalitions on a single issue or legal provision. None of these files contains information about the justices' votes.

The Justice Centered data include a row for each justice participating in the case. Most cases in the modern era thus have nine rows. These are the data to select when analyzing the behavior of individual justices. The Citation, Docket, Legal Provision, and Legal Provision Including Split Votes versions of the Justice Centered data correspond to the Case Centered data described above.
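
The difference between the two units of analysis can be illustrated with a small sketch. It builds a toy Justice Centered table (one row per participating justice) and collapses it to one record per case. The field names `case_id`, `justice`, and `majority` and the helper `collapse_to_cases` are assumptions made for illustration; they are not the database's actual variable names, and the rows are invented toy data.

```python
from collections import defaultdict

# Toy Justice Centered rows: one row per participating justice per case.
# The field names (case_id, justice, majority) are illustrative assumptions,
# not the database's actual variable names.
justice_rows = [
    {"case_id": "c1", "justice": "J1", "majority": True},
    {"case_id": "c1", "justice": "J2", "majority": True},
    {"case_id": "c1", "justice": "J3", "majority": False},
    {"case_id": "c2", "justice": "J1", "majority": True},
    {"case_id": "c2", "justice": "J2", "majority": True},
]


def collapse_to_cases(rows):
    """Return one record per case with majority and minority vote counts."""
    tallies = defaultdict(lambda: {"majority_votes": 0, "minority_votes": 0})
    for row in rows:
        side = "majority_votes" if row["majority"] else "minority_votes"
        tallies[row["case_id"]][side] += 1
    return dict(tallies)


# Five justice-level rows become two case-level records.
case_rows = collapse_to_cases(justice_rows)
for case_id, tally in case_rows.items():
    print(case_id, tally)
```

Going in the other direction is not possible: a Case Centered file cannot be expanded into justice-level rows, because it carries no information about individual votes.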

Those familiar with the previous releases of the database provided by Professor Spaeth will notice a number of changes. First, many of the variables have been renamed to be more readable. Thankfully, modern software allows variable names longer than eight characters. Second, all variables and their values are labeled when the software supports it. Many variable normalizations have also changed; e.g., the three-digit codes used to represent the case ISSUE are now different. While the underlying data remain the same, this was necessary to modernize the data model and to allow for back-dating of the data. Third, it is no longer necessary for researchers to select on the unit of analysis (ANALU) variable; this is now accomplished by downloading the appropriate data file. Of course, one could take the Legal Provision Including Split Votes data and perform a custom selection. Fourth, we are no longer coding or including memoranda, the so-called "back of the book" cases (DEC_TYPE=3). Finally, in anticipation of the expansion of the database, we are no longer distributing a single "wide" database with a column for each justice's vote. This model is not sustainable for more than one hundred justices. In its stead, justice vote data are distributed in the Justice Centered format discussed above.
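
Researchers whose older scripts assume the wide layout may find it helpful to see how that layout relates to the justice-level rows now distributed. The following is a minimal sketch, assuming a wide table with one vote column per justice. The column names (`case_id`, `J1`, `J2`, `J3`), the vote codes, and the helper `wide_to_long` are hypothetical and do not reflect the layout of any actual release.

```python
def wide_to_long(wide_rows, id_field, justice_fields):
    """Reshape rows with one vote column per justice into one row per justice."""
    long_rows = []
    for row in wide_rows:
        for name in justice_fields:
            vote = row.get(name)
            if vote in (None, ""):
                continue  # blank cell: this justice did not participate
            long_rows.append({id_field: row[id_field], "justice": name, "vote": vote})
    return long_rows


# Hypothetical wide rows; the column names and vote codes are toy values.
wide_rows = [
    {"case_id": "c1", "J1": 1, "J2": 1, "J3": 2},
    {"case_id": "c2", "J1": 1, "J2": "", "J3": 1},
]

long_rows = wide_to_long(wide_rows, "case_id", ["J1", "J2", "J3"])
for row in long_rows:
    print(row)
```

In the wide layout every new justice adds a column that is blank for most cases, whereas the long layout only adds rows for the cases in which that justice took part.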