Development of Crime Forecasting and Mapping Systems for Use by Police in Pittsburgh, Pennsylvania, and Rochester, New York, 1990-2001 (ICPSR 4545)

This study was designed to develop crime forecasting as an
application area for police in support of tactical deployment of
resources. Data on crime offense reports and computer aided dispatch
(CAD) drug calls and shots fired calls were collected from the
Pittsburgh, Pennsylvania Bureau of Police for the years 1990 through
2001. Data on crime offense reports were collected from the Rochester,
New York Police Department from January 1991 through December 2001.
The Rochester CAD drug calls and shots fired calls were collected from
January 1993 through May 2001. A total of 1,643,828 records (769,293
crime offense and 874,535 CAD) were collected from Pittsburgh, while
538,893 records (530,050 crime offense and 8,843 CAD) were collected
from Rochester. ArcView 3.3 and GDT Dynamap 2000 Street centerline
maps were used to address match the data, with some of the Pittsburgh
data being cleaned to fix obvious errors and increase address match
percentages. A SAS program was used to eliminate duplicate CAD calls
based on time and location of the calls. For the 1990 through 1999
Pittsburgh crime offense data, the address match rate was 91 percent.
The match rate for the 2000 through 2001 Pittsburgh crime offense data
was 72 percent. The Pittsburgh CAD data address match rate for 1990
through 1999 was 85 percent, while for 2000 through 2001 the match
rate was 100 percent because the new CAD system supplied incident
coordinates. The address match rate for the Rochester crime offense
data was 96 percent, and 95 percent for the CAD data. Spatial overlay
in ArcView was used to add geographic area identifiers for each data
point: precinct, car beat, car beat plus, and 1990 Census tract. The
crimes included for both Pittsburgh and Rochester were aggravated
assault, arson, burglary, criminal mischief, misconduct, family
violence, gambling, larceny, liquor law violations, motor vehicle
theft, murder/manslaughter, prostitution, public drunkenness, rape,
robbery, simple assaults, trespassing, vandalism, weapons, CAD drugs,
and CAD shots fired.
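
The duplicate-call elimination described above was done in SAS, and the matching rule is not given here. The following Python sketch shows one way such a rule could work. The 10-minute window, the normalized exact-address comparison, and the field names "time" and "address" are assumptions for illustration, not details from the study.

```python
from datetime import datetime, timedelta

def drop_duplicate_calls(calls, window=timedelta(minutes=10)):
    """Drop calls at the same address that arrive within `window`
    of the most recent call kept for that address.

    The window length and the address comparison are assumptions.
    """
    last_kept = {}  # normalized address -> time of the last call kept
    kept = []
    for call in sorted(calls, key=lambda c: c["time"]):
        key = call["address"].strip().upper()
        previous = last_kept.get(key)
        if previous is not None and call["time"] - previous <= window:
            continue  # treated as a repeat report of the same incident
        last_kept[key] = call["time"]
        kept.append(call)
    return kept

# Hypothetical calls, not taken from the dataset.
start = datetime(1995, 6, 1, 22, 0)
example_calls = [
    {"time": start, "address": "100 Main St"},
    {"time": start + timedelta(minutes=4), "address": "100 main st "},
    {"time": start + timedelta(minutes=30), "address": "100 Main St"},
    {"time": start + timedelta(minutes=1), "address": "200 Oak Ave"},
]
unique_calls = drop_duplicate_calls(example_calls)
```

In this example the second call is dropped as a repeat of the first, while the call 30 minutes later at the same address is kept as a separate incident.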

Access Notes

These data are freely available.

Study Description

Citation

Cohen, Jacqueline, and Wilpen L. Gorr. Development of Crime Forecasting and Mapping Systems for Use by Police in Pittsburgh, Pennsylvania, and Rochester, New York, 1990-2001. ICPSR04545-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2006-08-31. http://doi.org/10.3886/ICPSR04545.v1

Universe:
All criminal offenses or computer aided dispatch (CAD)
calls in Pittsburgh, Pennsylvania, from 1990 through 2001.
All criminal offenses or computer aided dispatch (CAD)
calls in Rochester, New York, from 1991 through 2001.

Data Types:
administrative records data

Data Collection Notes:

The files are provided in a WinZip archive with 73
files in three folders. The Statistical Data Files folder provides
data for Pittsburgh and Rochester in comma-separated text files. The
GIS folder provides geographic files for Pittsburgh and Rochester for
use with mapping software. The Report Files folder provides the final
report, a data dictionary, and the first and last five observations.
The WinZip archive must be extracted to the C:\ drive in order for the
ArcView project file to work correctly.

Methodology

Study Purpose:
The purpose of this study was to develop crime
forecasting as an application area for police in support of tactical
deployment of resources. The crime forecasting methods and models
included (1) a multivariate model for estimating crime seasonality
based on demographic and land use characteristics, and (2) leading
indicator models with 4 and 12 time lags. An application of tracking
signals as a supporting crime analysis tool to automatically detect
crime series pattern changes was also introduced.
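
This description does not say which tracking signal formulation the study used. As an illustration only, the sketch below computes one common textbook form: the running sum of forecast errors divided by the running mean absolute error. The function name and the sample counts are hypothetical.

```python
def tracking_signal(actuals, forecasts):
    """Return the tracking signal after each period.

    Signal = cumulative forecast error / mean absolute error so far.
    This is one textbook form, assumed here for illustration.
    """
    signals = []
    cumulative_error = 0.0
    cumulative_abs_error = 0.0
    for n, (actual, forecast) in enumerate(zip(actuals, forecasts), start=1):
        error = actual - forecast
        cumulative_error += error
        cumulative_abs_error += abs(error)
        mean_abs_error = cumulative_abs_error / n
        # Avoid division by zero while every forecast has been exact.
        signals.append(cumulative_error / mean_abs_error if mean_abs_error else 0.0)
    return signals

# Hypothetical monthly counts that drift upward away from a flat forecast.
signals = tracking_signal([10, 12, 14, 16], [10, 10, 10, 10])
```

A signal that grows steadily in one direction, as it does here, suggests that the forecasts are consistently too low or too high, which is the kind of pattern change an analyst would want flagged. The threshold at which to flag is a separate choice that this sketch does not make.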

Study Design:
The crime data collected for this study were from
two Northeastern, mid-sized cities: Pittsburgh, Pennsylvania, and
Rochester, New York. The researchers had previously collected all
crime offense reports and computer aided dispatch (CAD) calls from the
Pittsburgh Bureau of Police for the years 1990 through 1998. The
current study added the years 1999 through 2001. Since Pittsburgh
started using a new record management system in 2000, all of the 1990
through 1999 data had to be reprocessed to ensure that the 1999 data
were treated identically to the 1990 through 1998 data and to make as
smooth a connection as possible to the new format of the 2000 and 2001
data. The 1990 through 1999 offense datasets were in 17 flat files
extracted from an old mainframe system. Oracle SQL Loader was used to
import the data into an Oracle database. The imported data were in 13
tables. The tables were then exported into an Access database. In
Access, links were created between the tables and various queries were
created to limit crime records to offenses only. Several fields were
concatenated to get a complete street address for each crime record. A
crime code table, created by the researchers, was joined to the
database so that each crime record would have a consistent descriptive
crime name that matched the Rochester data. The resultant table
containing the Pittsburgh offense data for 1990 through 1999 has
637,166 records. The Pittsburgh offense data for 2000 and 2001 were
taken from an Oracle database. The 132,127 records were appended to
the earlier data, with a crime code table added so each crime record
has a descriptive major code. The Pittsburgh CAD data have 874,535
records. Only CAD drugs and CAD shots are used in the forecast
models. CAD data could not be obtained for November and December of
1999. Instead, simple exponential smoothing was used to forecast those
two months, and the forecasts are used as data values in the datasets.
A SAS program was used to eliminate duplicated CAD calls based on the
time and location of the calls. The total number of crime offense and
CAD records for Pittsburgh is 1,643,828. ArcView 3.3 and GDT Dynamap
2000 Street centerline maps were used to address match the Pittsburgh
data. Some data were cleaned to fix obvious errors and increase
address match percentages. For the 1990 through 1999 crime offense
data the address match rate was 91 percent. The match rate for the
2000 through 2001 crime offense data was 72 percent. The CAD data
address match rate for 1990 through 1999 was 85 percent, while for
2000 through 2001 the match rate was 100 percent because the new CAD
system supplied incident coordinates. Once the data addresses were
matched, spatial overlay in ArcView was used to add geographic area
identifiers for each data point: precinct, car beat, car beat plus,
and 1990 Census tract. Car beat plus is an aggregation of car beats
designed to increase monthly average crime volumes while keeping the
resultant districts from crossing precinct boundaries and maintaining
compact areas. Car beats are aggregations of census tracts and were
the patrol districts used by the Pittsburgh Bureau of Police during
the study period. The next step was to aggregate a number of crime
types to monthly time series for each geography. The Rochester
offense data contain 530,050 records from January 1991 to December
2001. All files were imported and processed in Access. The Rochester
CAD data cover January 1993 to May 2001 and contain 3,767,002
records. However, only the 8,843 records containing the CAD shots and
drugs data were used. Again, SAS was used to eliminate duplicate CAD
calls. The total number of crime offense and CAD records for
Rochester is 538,893. ArcView 3.3 and GDT Dynamap 2000 Street
centerline maps were also used to address match the Rochester data. No
data cleaning was necessary. The address match rate for the Rochester
crime offense data was 96 percent, and 95 percent for the CAD
data. Spatial overlay followed in the same fashion as in Pittsburgh.
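
The gap filling for the missing November and December 1999 Pittsburgh CAD data can be sketched as follows. This is a minimal illustration of simple exponential smoothing, not the researchers' program. The smoothing constant, the initialization with the first observation, and the sample counts are all assumptions.

```python
def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing.

    Returns the final smoothed level, which serves as the forecast
    for every future period (SES forecasts are flat).
    `alpha` and the first-value initialization are assumptions.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly CAD call counts ending in October 1999.
observed = [100, 110, 120]
fill_value = ses_forecast(observed, alpha=0.5)

# The same forecast value stands in for both missing months.
filled = observed + [fill_value, fill_value]
```

Because simple exponential smoothing has no trend or seasonal term, the two imputed months receive the same value.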

Sample:
For Pittsburgh, 769,293 crime offense records and 874,535
computer aided dispatch (CAD) drug calls and shots fired call records,
for a total of 1,643,828 records, are included in the data. For
Rochester, 530,050 crime offense records and 8,843 CAD drug call and
shots fired call records, for a total of 538,893 records, are included
in the data.
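
The record counts reported in this description are consistent with one another, which the short check below confirms. All figures are taken from the text above. Only the dictionary layout and key names are introduced here for illustration.

```python
# Record counts as reported in the study description.
counts = {
    "Pittsburgh": {
        "offense_1990_1999": 637_166,
        "offense_2000_2001": 132_127,
        "cad": 874_535,
    },
    "Rochester": {
        "offense": 530_050,
        "cad": 8_843,
    },
}

# Sum the components for each city.
totals = {city: sum(parts.values()) for city, parts in counts.items()}

# Pittsburgh offense records across both record management systems.
pittsburgh_offense = (
    counts["Pittsburgh"]["offense_1990_1999"]
    + counts["Pittsburgh"]["offense_2000_2001"]
)
```

The sums match the reported totals of 1,643,828 records for Pittsburgh (including 769,293 offense records) and 538,893 records for Rochester.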

Weight:
none

Mode of Data Collection:
record abstracts

Data Source:

The individual offense incident and computer aided
dispatch (CAD) data were obtained from the Pittsburgh, Pennsylvania
Bureau of Police and Rochester, New York Police Department.