By Ned Levine, PhD
Ned Levine & Associates
CrimeStat IV was released early in 2014 and has been updated several times. Funded by the Mapping and Analysis for Public Safety program at the National Institute of Justice (NIJ), CrimeStat IV is a stand-alone spatial statistics program for the analysis of incident locations. It was developed by Ned Levine & Associates with research grants from the NIJ. The NIJ is the sole distributor of CrimeStat and makes it available for free to law enforcement and criminal justice analysts, educators, and researchers.
The program is Windows-based and interfaces with most desktop GIS programs. Its purpose is to provide supplemental statistical tools to aid law enforcement agencies and criminal justice researchers in their crime-mapping efforts. The program reads locations that can represent either individual incidents or the number of events in zones (e.g., census tracts, traffic analysis zones). It has more than 100 spatial statistical routines that are useful for crime analysts and researchers, including routines for describing the overall spatial distribution, identifying hot spots, and estimating broad regional trends through interpolation, as well as routines for analyzing the behavior of serial offenders and offender travel behavior.
It has an extensive collection of hot spot analysis tools for identifying concentrated clusters of crimes, as well as tools for identifying where the number of events is higher than expected (risk analysis). For example, Figure 1 shows a kernel density estimate of 2006 burglaries in the City of Houston (TX). Superimposed over the estimate are 1st-, 2nd-, and 3rd-order burglary hot spots identified with the Nearest Neighbor Hierarchical clustering algorithm (Nnh). The Nnh hot spots provide specific information about the location of the hot spots that is missed in the kernel density interpolation.
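To make the idea of a kernel density surface concrete, here is a minimal sketch in Python. It is not CrimeStat's implementation: it assumes a Gaussian kernel with a fixed bandwidth, and the function name `kernel_density`, the coordinates, and the grid are all invented for illustration.

```python
import numpy as np

def kernel_density(points, grid_x, grid_y, bandwidth):
    """Sum a Gaussian kernel centered on each incident over every grid cell.

    Assumption: fixed-bandwidth Gaussian kernel; other kernels are possible.
    """
    gx, gy = np.meshgrid(grid_x, grid_y)  # cell coordinates, shape (ny, nx)
    density = np.zeros_like(gx, dtype=float)
    for px, py in points:
        squared_dist = (gx - px) ** 2 + (gy - py) ** 2
        density += np.exp(-squared_dist / (2.0 * bandwidth ** 2))
    # Normalize so that each incident contributes a total mass of one.
    return density / (2.0 * np.pi * bandwidth ** 2)

# Hypothetical incident coordinates (arbitrary units): a tight cluster and one outlier.
incidents = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [4.0, 4.0]])
grid_x = np.linspace(0.0, 5.0, 51)
grid_y = np.linspace(0.0, 5.0, 51)

surface = kernel_density(incidents, grid_x, grid_y, bandwidth=0.5)

# Locate the grid cell with the highest estimated density.
peak_row, peak_col = np.unravel_index(np.argmax(surface), surface.shape)
peak = (grid_x[peak_col], grid_y[peak_row])
print("Highest density near", peak)
```

The surface is smooth, so it shows where density is highest but not where a cluster begins or ends, which is the gap that a clustering routine such as Nnh fills.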
CrimeStat III introduced a Crime Travel Demand module that allows analysts to model crime trips over a jurisdiction or even an entire metropolitan area. For example, a study was conducted of 258 bank robberies that occurred in Baltimore County, MD, from 1993 to 1997 (Levine, 2007). The crime travel demand model showed that bank robbery trips tended to originate in poorer, denser neighborhoods and, in general, involve banks that were close to the offender’s residence. Likely travel routes to the banks were modeled as well as possible escape routes on the assumption that there would be higher police presence around the bank after the robbery. Figure 2 shows the trip and escape routes that were modeled for bank robberies committed in west central Baltimore County (MD).
CrimeStat IV includes a Head-Bang module. The Head-Bang statistic was developed by the National Cancer Institute as a way to smooth zonal data in calculating rates. Zones with small populations will often produce exaggerated incident rates due to low numbers of events (e.g., cancer incidents, violent crimes). The Head-Bang routine weights the smoothing by the size of the zone. Large zones with many events keep their value whereas smaller zones are generally smoothed. For example, Figure 3 below shows the effect of smoothing 2006 Houston burglaries by Traffic Analysis Zones. The overall distribution of the data was maintained, but small zones tended to have their numbers smoothed using information from adjacent zones.
New to CrimeStat IV was a Bayesian Journey-to-crime routine that adjusts the usual journey-to-crime (geographic profiling) model with additional information on where other offenders lived who committed crimes at the same locations as the offender that the analyst is trying to profile. This model was tested on more than 1,100 serial offenders in four cities (Baltimore County; Chicago; The Hague in the Netherlands; Manchester in the U.K.) and produced a substantial improvement in the accuracy of predictions while maintaining reasonable reliability. The model was further improved by allowing the user to add other geographic information to improve the estimates, such as the location of low-income areas or the distance of each neighborhood from the central city.
Figure 4 below shows the result of a Bayesian model that incorporated such information in predicting the residence location of a serial offender in Baltimore County, MD, who committed 14 crimes between 1993 and 1996 (shown in black). Note that in spite of the dispersion of this offender’s crimes, some as far as five miles from his residence (shown in green), the model's predicted location was quite close to the actual residence. The cell with the highest probability estimate, outlined in light blue, is approximately 0.25 miles away.
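The Bayesian adjustment described above can be sketched as combining two probability surfaces over candidate residence cells. The sketch below is an illustration under stated assumptions, not the CrimeStat routine: it assumes a negative exponential distance-decay function, assumes the two surfaces are combined as a normalized product, and uses invented cell coordinates and offender counts. All function names are hypothetical.

```python
import numpy as np

def distance_decay_surface(cells, crimes, decay=1.0):
    """Journey-to-crime estimate: cells close to the crimes score higher.

    Assumption: negative exponential decay, summed over all crimes.
    """
    dists = np.linalg.norm(cells[:, None, :] - crimes[None, :, :], axis=2)
    score = np.exp(-decay * dists).sum(axis=1)
    return score / score.sum()

def conditional_surface(od_counts, crime_zones):
    """Estimate based on where other offenders lived.

    od_counts[i, j] is the number of known offenders living in cell i
    who committed crimes in zone j; crime_zones lists the zones in which
    the offender being profiled committed crimes.
    """
    score = od_counts[:, crime_zones].sum(axis=1).astype(float)
    return score / score.sum()

def bayesian_estimate(prior, likelihood):
    """Combine the two surfaces as a normalized product (an assumption)."""
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Hypothetical data: four candidate cells, two crimes, three destination zones.
cells = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
crimes = np.array([[1.0, 0.5], [2.0, 0.5]])
od_counts = np.array([[1, 0, 0],
                      [2, 5, 1],
                      [6, 0, 4],
                      [1, 1, 0]])
crime_zones = [0, 2]

prior = distance_decay_surface(cells, crimes)
likelihood = conditional_surface(od_counts, crime_zones)
posterior = bayesian_estimate(prior, likelihood)
best_cell = int(np.argmax(posterior))
print("Most likely residence cell:", best_cell)
```

In this toy example the distance-decay surface alone cannot separate the two middle cells, and the information about other offenders breaks the tie.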
Also new in CrimeStat IV are multivariate modeling tools. There is a regression module that allows non-spatial and spatial regression using both Maximum Likelihood Estimation (MLE) and Markov Chain Monte Carlo (MCMC) simulation. The MCMC method was developed during the U.S. Hydrogen Bomb Project as a means for estimating complex functions that MLE could not solve. CrimeStat IV is one of the few programs that includes MCMC spatial regression routines and is the only program that can handle very large datasets with this approach, such as those found in large and medium-sized police departments. It will be useful for researchers who want to incorporate spatial autocorrelation in their regression models as well as for advanced analysts who want to develop predictions of crime based on factors associated with the neighborhoods where crime occurs.
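For readers unfamiliar with MCMC, the following is a minimal sketch of the general idea: a random-walk Metropolis sampler for a simple non-spatial Poisson regression on simulated data. It is not the CrimeStat routine and omits any spatial term; the prior, step size, function names, and data are all assumptions made for illustration.

```python
import numpy as np

def log_posterior(beta, X, y):
    """Poisson log-likelihood (up to a constant) plus a wide normal prior.

    Assumption: independent Normal(0, 10^2) priors on the coefficients.
    """
    eta = X @ beta
    log_likelihood = np.sum(y * eta - np.exp(eta))
    log_prior = -0.5 * np.sum(beta ** 2) / 100.0
    return log_likelihood + log_prior

def metropolis(X, y, n_iter=5000, step=0.05, seed=0):
    """Random-walk Metropolis: propose a small move, accept it with
    probability min(1, posterior ratio), and record the current state."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    current = log_posterior(beta, X, y)
    draws = np.empty((n_iter, X.shape[1]))
    for i in range(n_iter):
        proposal = beta + rng.normal(scale=step, size=beta.size)
        candidate = log_posterior(proposal, X, y)
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
        draws[i] = beta
    return draws

# Simulated data: counts per zone driven by one covariate (hypothetical).
rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])  # intercept plus covariate
true_beta = np.array([1.0, 0.5])
y = rng.poisson(np.exp(X @ true_beta))

draws = metropolis(X, y)
posterior_mean = draws[1000:].mean(axis=0)  # discard the first 1000 draws as burn-in
print("Posterior mean of coefficients:", posterior_mean)
```

The sampler only needs to evaluate the posterior up to a constant, which is why the approach extends to models whose likelihood is awkward to maximize directly.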
Another new module in CrimeStat IV is for discrete choice modeling, which allows the estimation of a multivariate model where the dependent variable is a discrete, nominal variable rather than the usual continuous variable. This module includes both the Multinomial Logit and the Conditional Logit models. The first allows the choices to be related to characteristics of the decision-maker (e.g., the offender) while the second allows the choices to be related to characteristics of the choices themselves.
For example, a Multinomial Logit model was constructed relating the choice of a weapon during a Houston robbery to characteristics of the offenders and the environment in which the robberies occurred (Levine, Robertson, & Fosberg, 2013). On the other hand, a Conditional Logit model was constructed of neighborhood choice by burglars in The Hague, Netherlands where the predictive variables were characteristics of the neighborhoods and the distance that the burglar lived from the crime location (Bernasco & Nieuwbeerta, 2005).
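The distinction between the two models can be illustrated with a short sketch of how each one turns utilities into choice probabilities. The coefficients and data below are invented for illustration and are not estimates from either cited study; the function names are hypothetical.

```python
import numpy as np

def softmax(utilities):
    """Convert utilities to probabilities that sum to one."""
    shifted = np.exp(utilities - np.max(utilities))  # subtract max for numerical stability
    return shifted / shifted.sum()

def multinomial_logit_probs(x, betas):
    """Multinomial Logit: one decision-maker vector x, and a separate
    coefficient vector for each alternative (rows of betas)."""
    return softmax(betas @ x)

def conditional_logit_probs(z, beta):
    """Conditional Logit: each alternative has its own attribute vector
    (rows of z), and a single coefficient vector is shared by all."""
    return softmax(z @ beta)

# Multinomial Logit: hypothetical offender with features (constant, scaled age)
# choosing among three weapon types; the first row is the reference category.
offender = np.array([1.0, 0.3])
weapon_betas = np.array([[0.0, 0.0],
                         [0.5, -1.0],
                         [-0.2, 0.8]])
weapon_probs = multinomial_logit_probs(offender, weapon_betas)

# Conditional Logit: three hypothetical neighborhoods described by
# (distance from the offender's home, an attractiveness score).
neighborhoods = np.array([[0.5, 1.0],
                          [3.0, 1.0],
                          [6.0, 1.0]])
shared_beta = np.array([-0.4, 0.2])  # negative weight on distance
neighborhood_probs = conditional_logit_probs(neighborhoods, shared_beta)

print("Weapon choice probabilities:", weapon_probs)
print("Neighborhood choice probabilities:", neighborhood_probs)
```

With a negative coefficient on distance, the nearest neighborhood receives the highest probability when the other attributes are equal.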
There is also a time-series forecasting module in CrimeStat IV that allows police to monitor crimes by week or month for specific geographical districts. Using at least three years of data by time period, the routine incorporates seasonality adjustments and estimates the expected number of crimes per district for any single week or month and then compares it to the actual number that occurred (Gorr & Olligschlaeger, 2013). If the discrepancy is larger than what would be expected by chance, the routine outputs a signal (the Trigg index) that can inform the police department that there is a larger number of crimes occurring than expected (or, occasionally, that there are far fewer than expected). The routine also makes a prediction for the next time period. It can help police monitor crime and manage resources better, since they can direct extra resources to a district if there is a sudden outbreak of crime there.
For example, Figure 5 shows monthly Trigg signal trips for both excessively high and excessively low numbers of violent crimes in a single census tract in Pittsburgh over a 10-year period. The monthly data were collected for all census tracts in Pittsburgh over the period and the model was run using a jurisdiction-wide seasonality adjustment. As seen, the model does a good job of identifying months when the number of violent crimes was much higher than ‘normal’, adjusting for seasonality and trend over time. See Chapter 23 in the CrimeStat manual for details.
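The logic of a tracking signal can be sketched in a few lines. The example below is a simplification and not the CrimeStat routine: it uses simple exponential smoothing with no seasonality or trend adjustment, computes Trigg's tracking signal as the ratio of the smoothed forecast error to the smoothed absolute error, and flags a trip when the signal exceeds a threshold. The smoothing constants, threshold, warm-up length, function name, and monthly counts are all assumptions.

```python
def trigg_signals(counts, alpha=0.3, gamma=0.2, threshold=0.6, warmup=3):
    """Return (forecast, actual, signal, tripped) for each period after the first.

    Assumptions: simple exponential smoothing for the forecast, no
    seasonality, and no trips are flagged during the warm-up periods.
    """
    level = float(counts[0])   # initial forecast is the first observation
    smoothed_error = 0.0
    smoothed_abs_error = 0.0
    results = []
    for period, actual in enumerate(counts[1:]):
        forecast = level
        error = actual - forecast
        smoothed_error = gamma * error + (1 - gamma) * smoothed_error
        smoothed_abs_error = gamma * abs(error) + (1 - gamma) * smoothed_abs_error
        # The signal lies between -1 and 1; values near either end mean
        # recent errors have consistently had the same sign.
        signal = smoothed_error / smoothed_abs_error if smoothed_abs_error > 0 else 0.0
        tripped = period >= warmup and abs(signal) > threshold
        results.append((forecast, actual, signal, tripped))
        level = alpha * actual + (1 - alpha) * level  # update the forecast level
    return results

# Hypothetical monthly counts for one district: stable, then a sudden increase.
monthly_counts = [10, 11, 9, 10, 12, 9, 10, 11, 10, 9, 20, 22, 21]
results = trigg_signals(monthly_counts)

for forecast, actual, signal, tripped in results:
    flag = "TRIP" if tripped else ""
    print(f"forecast={forecast:6.2f} actual={actual:3d} signal={signal:+.2f} {flag}")
```

A positive trip indicates more crimes than forecast and a negative trip indicates fewer, which mirrors the high and low signal trips described for Figure 5.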
There are other new routines in CrimeStat IV as well (e.g., new tools for measuring spatial effects in zones; the ability to output KML files to Google Earth if the coordinate system is spherical; a utility for converting Excel files to dBase IV files, which are used in CrimeStat). CrimeStat IV includes extensive documentation of all the routines in the program plus many examples, both our own and those contributed by other researchers. There is also a set of CrimeStat Libraries (version 1.1) that allows database managers to integrate many of the routines into a records management system.
Both the program and documentation are free and are available from the NIJ website. Sample datasets are also provided to allow users to learn a technique. The latest version is 4.02 and is available from:
Bernasco, W., & Nieuwbeerta, P. (2005). How Do Residential Burglars Select Target Areas? A New Approach to the Analysis of Criminal Location Choice. British Journal of Criminology, 45, 296-315.
Gorr, W. L. & Olligschlaeger, A. M. (2013). Time Series Forecasting. In Levine, N. (ed), CrimeStat IV: A Spatial Statistics Program for the Analysis of Crime Incident Locations. Chapter 23.
Levine, N. (2007). Crime travel demand and bank robberies: Using CrimeStat III to model bank robbery trips. Social Science Computer Review, 25(2), 239-258.
Levine, N., Robertson, A., & Fosberg, B. (2013). Modeling correlates of weapon use in Houston robberies with the Multinomial Logit model. In Levine, N. (ed), CrimeStat IV: A Spatial Statistics Program for the Analysis of Crime Incident Locations. Bernasco, W. & Block, R., Chapter 13 on Discrete Choice Modeling, Attachment A.