By Andreas M. Olligschlaeger, TruNorth Data Systems, Inc.
Unless you’ve been living under a rock for the past three or four years you’ll have heard of predictive policing, the latest of many buzzwords to emerge in the field. Media hype has been nothing short of spectacular, with many vendors – new and old – jumping into the fray and offering software products that promise to revolutionize policing as we know it. Naturally, all of this comes at a cost: not only are some of the products extremely expensive, but they have created sometimes unrealistic expectations of crime analysts. In a few rare instances city and police managers have been convinced that software can somehow reduce the need for human analysts or even replace them entirely.
Unlike the term predictive policing, crime forecasting has been around for a long time (see for example Olligschlaeger, 1998). What is new, however, is that efforts are being made to incorporate crime forecasting techniques – some new, some old – into daily police operations. This, in essence, is what predictive policing is, although some of the techniques used overlap with crime analysis concepts that have been in use for decades.
Unfortunately the media hype has led many to believe that predictive policing is a product that can be bought off the shelf. However, as Perry et al. (2013) and others have pointed out, predictive policing is not a software package or a statistical technique. Rather, it is a complex process, of which software and statistical methodology are just one part – albeit an important one. The process of using software and analysis to produce forecasts of criminal activity is what has traditionally been known as crime forecasting.
There are many crime forecasting methodologies, depending on what it is a crime analyst wants to forecast. They range from the very simple to the highly complex. It is important to note that highly complex models – those typically sold in commercial packages – often provide only an incremental improvement in accuracy and utility over some of the simpler or intermediate methods. This means that a well-trained analyst can in many instances produce results almost as good as those of an expensive, sophisticated commercial software package by using tools that are already in his or her toolbox, can be purchased for a relatively small amount of money, or are available as open source. For increasingly cash-strapped police agencies this raises an important question: does it make sense to spend large sums of money on commercial software, or would it make more sense to train crime analysts to build their own forecasting models? Some police agencies, especially small to medium-sized ones, might benefit more from the latter and realize a greater return on investment in the long run.
This last point brings me to the purpose of this article: how to put together a basic, yet powerful and robust crime forecasting toolkit on a shoestring budget or even no budget at all. I begin by discussing some of the requirements for forecasting (primarily data and software) and provide a very brief overview of some of the most commonly used techniques for one type of crime forecasting: one step ahead forecasts by areal unit. This is followed by suggestions for the crime forecasting toolkit and some examples of how that toolkit might be used in practice.
Crime Forecasting Requirements, Data Issues, and Methods
For simple to intermediate crime forecasting methods – the primary focus of this article – most analysts already have many of the software packages they will need. Nevertheless, it is worth briefly discussing individual components because the sophistication of each will have a bearing on what it is you will be able to do.
The first component is obviously a geographic information system (GIS). While strictly not required to produce actual crime forecasts, it is useful for displaying results as well as collecting data. For example, a GIS could be used to generate frequency counts of geocoded crime incidents by census tract, census block or grid cell. Many GIS systems also have built in statistical capabilities, such as spatial regression and data manipulation routines. The more sophisticated the GIS, the more you can do with your data.
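The frequency counts mentioned above do not strictly require a GIS. As a minimal sketch, assuming incidents are already geocoded to projected x/y coordinates, the following Python assigns each incident to a square grid cell and counts incidents per cell. The function names, the 500-unit cell size and the coordinates are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def grid_cell(x, y, origin_x, origin_y, cell_size):
    """Return the (column, row) index of the grid cell containing point (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return col, row


def count_by_cell(points, origin_x=0.0, origin_y=0.0, cell_size=500.0):
    """Count incidents per grid cell; cells without incidents are simply absent."""
    return Counter(
        grid_cell(x, y, origin_x, origin_y, cell_size) for x, y in points
    )


# Hypothetical projected coordinates (for example feet or metres), not real incidents.
incidents = [(120.0, 80.0), (450.0, 499.0), (510.0, 20.0), (1499.0, 1001.0)]
counts = count_by_cell(incidents)
```

Counts by census tract or block would instead need a point-in-polygon test, which is where a GIS or a geospatial library earns its keep.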
A staple of every crime analysis unit is a SQL compatible database. Databases are essential for gathering, storing and manipulating data that is often retrieved or exported from a variety of other databases, such as an RMS or CAD system. There are a variety of ways in which data can be imported into a database, but another good tool to have in the crime forecasting toolbox (or in the crime analyst toolbox in general, for that matter) alongside a SQL database is a data migration tool. Data migration tools are very useful for setting up and conducting automated data transfers and exports on a regular basis from multiple databases. The advantage of such tools is that they can connect to multiple platforms, even if they are in the cloud, and allow the user to manipulate data before they are deposited into the crime analysis database. For example, date/time formats can be changed and fields parsed, meaning that many tasks a crime analyst normally has to perform manually can be automated, which makes these tools a significant time saver as well. Even if a crime analyst does not have direct access to an RMS or CAD database and depends on, say, regular XML export files, a data migration tool can be set up to automatically scan a local or network directory for a new export file and, if it finds one, process it.
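To make the directory-scanning idea concrete, here is a rough sketch in plain Python rather than in a dedicated migration tool. It looks for XML export files that have not yet been processed, reformats the date/time field and loads the records into a SQLite database. The XML layout (`incident`, `id`, `type`, `occurred`), the MM/DD/YYYY HH:MM source format and the table names are assumptions made for this example; a real RMS or CAD export will look different.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime
from pathlib import Path


def load_export(xml_path, conn):
    """Parse one export file, store its incidents and return how many were read."""
    rows = []
    for inc in ET.parse(xml_path).getroot().iter("incident"):
        # Assumed source format MM/DD/YYYY HH:MM, stored as ISO 8601 text.
        occurred = datetime.strptime(inc.findtext("occurred"), "%m/%d/%Y %H:%M")
        rows.append((inc.findtext("id"), inc.findtext("type"), occurred.isoformat()))
    # Incidents whose id is already stored are skipped rather than duplicated.
    conn.executemany("INSERT OR IGNORE INTO incidents VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def process_new_exports(export_dir, conn):
    """Load every *.xml file in export_dir that has not been processed before."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS incidents "
        "(id TEXT PRIMARY KEY, type TEXT, occurred TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS processed_files (name TEXT PRIMARY KEY)")
    done = {name for (name,) in conn.execute("SELECT name FROM processed_files")}
    loaded = 0
    for path in sorted(Path(export_dir).glob("*.xml")):
        if path.name in done:
            continue
        loaded += load_export(path, conn)
        conn.execute("INSERT INTO processed_files VALUES (?)", (path.name,))
        conn.commit()
    return loaded
```

Run from a scheduled task, a function like `process_new_exports` would pick up each new export once and ignore files it has already seen.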
Finally, no crime analysis unit should be without a good statistical software package. Most GIS systems and standard productivity software such as Microsoft Excel have some statistical features built into them, but in order to progress beyond the simplest crime forecasting techniques something more powerful is needed. Commercial statistics packages can be very expensive, but fortunately there are some excellent open source alternatives.
One of the issues unique to crime forecasting as compared to forecasting in other disciplines is that crime is a rare event, yet successfully building a crime forecasting model can require hundreds or thousands of data points. The more data you have and the higher quality it is, the more sophisticated a forecasting model you can build. So the first and most important step is to decide what types of data to use, how to get it, and how to incorporate it into a single dataset.
A further issue that is related to how many data points you have is choosing the size of area by which you want to forecast as well as the time frame. While many analysts will have no choice but to use census tracts or blocks, or even police beats, those who decide to use grid cells or some other variable area size will have to experiment. If you choose too large an area you will have plenty of crimes but the tactical utility of even accurate forecasts diminishes. Pick too small an area and most areas will have counts of zero or one, meaning most forecasting methods will treat areas of high crime as outliers. Naturally, the best size of areal unit will depend on the crime type as well as the jurisdiction.
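One simple way to experiment with grid cell size is to check, for each candidate size, what share of cells would hold at most one incident. The sketch below does this for a rectangular study area. The coordinates, the 400 x 400 extent and the function name are hypothetical, and the code assumes whole-number dimensions and cell sizes.

```python
from collections import Counter


def sparse_share(points, width, height, cell_size):
    """Share of grid cells in a width x height area holding at most one incident."""
    n_cols = -(-width // cell_size)   # ceiling division
    n_rows = -(-height // cell_size)
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    busy = sum(1 for c in counts.values() if c > 1)
    return 1 - busy / (n_cols * n_rows)


# Hypothetical incident coordinates inside a 400 x 400 study area.
points = [(10, 10), (20, 30), (60, 70), (90, 20),
          (110, 120), (130, 140), (300, 310), (390, 20)]

shares = {size: sparse_share(points, 400, 400, size) for size in (50, 100, 200)}
```

In this toy data the share of sparse cells falls as the cells grow, which is the trade-off described above: larger cells carry more data per unit but say less about where to deploy.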
The choice of temporal unit also affects how many crimes you will have per unit for much the same reasons as areal units. Typically forecasts are either weekly or monthly, with monthly the most commonly used.
There are many different types and categories of forecasting or predictive models used by police and the criminal justice system, depending on what the purpose of the forecast is (see Perry et al., 2013, for a more complete overview). One of the most common types of forecast produced by crime analysts is the one step ahead forecast by area. In other words, we want to forecast either levels of crime or changes in crime in the next time period (usually week or month) by area (census tract, census block, or grid cell). There are many methods that can accomplish this, varying from simple (such as simple exponential smoothing and Holt Smoothing) to intermediate (multiple regression) to highly complex (artificial neural networks and genetic algorithms) (see for example Gorr and Olligschlaeger, 2002).
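For readers who have not met the two simple methods named above, the following sketch shows one common textbook formulation of each, applied to the crime count series of a single area. The smoothing constants and the initialization (first observation as the starting level, first difference as the starting trend) are illustrative assumptions; packages such as CrimeStat or R may initialize and tune these differently.

```python
def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for y in series[1:]:
        # New level is a weighted average of the latest count and the old level.
        level = alpha * y + (1 - alpha) * level
    return level


def holt_forecast(series, alpha=0.3, beta=0.1):
    """One-step-ahead forecast from Holt's linear trend method."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


# Hypothetical monthly counts for one areal unit.
monthly_counts = [4, 6, 5, 7, 9, 8, 10]
next_month_ses = ses_forecast(monthly_counts)
next_month_holt = holt_forecast(monthly_counts)
```

Simple exponential smoothing only tracks the level of the series, while Holt smoothing also tracks a trend, so on a steadily rising series the Holt forecast will sit above the simple one.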
There are two basic types of forecasting method used in one step ahead forecasts: univariate and multivariate. Univariate methods are simpler and require fewer data points. Typically they require only a time series of crime counts by area. Multivariate methods, on the other hand, incorporate data from a variety of sources such as RMS, CAD, public property records, and others. Such methods, regression being one example, are usually more accurate and better at identifying new areas of crime because they incorporate variables that can contribute to increases or decreases in crime. Such variables are also known as leading indicators (Cohen et al., 2007).
The primary goal of crime forecasting is not just to create forecasts that are accurate, but more importantly forecasts that tell police officers what they don’t already know. For example, accurately predicting that historical crime hotspots will remain hotspots in the next time period doesn’t provide much tactical insight. Neither does accurately predicting that most census blocks will have zero homicides next month. What is really needed are models that can predict new hotspots before they emerge, although almost as important is being able to forecast decreases in crime – after all, both have implications for manpower deployment. The only type of model that can accomplish this is a multivariate one. However, univariate methods are still very useful for identifying areas with unusual departures from business as usual, which in turn can be used to identify variables to include in multivariate methods. More on this later.
While discussing the various methods in detail is beyond the scope of this article, there are a number of general tips that apply to most multivariate methods:
- Try a number of different models to see which works best.
- Use a holdout sample (a portion of the data that was not used to calibrate the forecasting model) to check how well the model generalizes. If the model performs as well or almost as well on the holdout sample as it does on the data used to estimate it, then the model is said to generalize well. Otherwise your model may be overfitting the data.
- Thoroughly analyze and explore your data before you build a forecasting model. Look for leading indicators that are correlated with what it is you are trying to forecast, and resist the temptation to throw as many variables as possible into the model.
- Overall model fit is not as important as being able to identify emerging or declining hot spots.
- If you have evidence of geographic displacement of crime, think about using spatially and temporally lagged variables, i.e. variables from neighboring areal units from the previous time period. The same is true if events in one area influence those in neighboring ones.
- Compare the results of your model to simple methods, such as the random walk. The random walk assumes that the crime count in the next time period will be the same as in the most recent one. The random walk can be surprisingly hard to beat.
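Several of these tips can be combined in a few lines. The sketch below builds a spatially and temporally lagged variable (the mean count in neighboring areas in the previous period), fits a one-variable least-squares regression on the earlier periods, and compares its mean absolute error with that of the random walk on both the estimation data and a holdout sample. The three areas, their counts, the neighbor lists and the cutoff period are all made up for illustration, and a single-predictor regression stands in for the richer multivariate models discussed in this article.

```python
def ols_fit(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def mae(predicted, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def build_rows(counts, neighbors):
    """Rows of (period, own count at t-1, mean neighbor count at t-1, count at t)."""
    rows = []
    n_periods = len(next(iter(counts.values())))
    for t in range(1, n_periods):
        for area, series in counts.items():
            nb = neighbors[area]
            lag_nb = sum(counts[j][t - 1] for j in nb) / len(nb)
            rows.append((t, series[t - 1], lag_nb, series[t]))
    return rows


# Hypothetical counts for three areas in a row (A next to B, B next to C).
counts = {"A": [2, 3, 1, 4, 2, 3, 5, 4],
          "B": [0, 1, 2, 1, 3, 2, 2, 4],
          "C": [1, 0, 0, 2, 1, 1, 3, 2]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

rows = build_rows(counts, neighbors)
train = [r for r in rows if r[0] < 6]      # estimation sample
holdout = [r for r in rows if r[0] >= 6]   # last two periods held out
intercept, slope = ols_fit([r[2] for r in train], [r[3] for r in train])


def evaluate(sample):
    """Return (random walk MAE, regression MAE) for a sample of rows."""
    actual = [r[3] for r in sample]
    random_walk = mae([r[1] for r in sample], actual)
    regression = mae([intercept + slope * r[2] for r in sample], actual)
    return random_walk, regression


rw_train, reg_train = evaluate(train)
rw_holdout, reg_holdout = evaluate(holdout)
```

If the regression error on the holdout sample is much worse than on the estimation sample, that is a sign of overfitting; if it cannot beat the random walk on either, the model is not yet earning its keep.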
Now that we’ve looked at what we need to be able to do with our data and what types of software we need, it’s time to put together the actual crime forecasting toolkit. The good news is that just about everything is available from open sources. While some open source resources require at least a basic knowledge of scripting and programming languages (usually Java), there is a plethora of how-to manuals available, many of which are free. A big advantage of open source products is that there are usually numerous forums, support, and user groups associated with each product. This means that if you are looking for a script or program to do something, the chances are pretty good that someone else has already done it, or at least something similar, and all you have to do is search the web for it. Most importantly, the term “open source” does not necessarily mean cheap (as in you get what you pay for), inferior, or substandard. Quite to the contrary, many of the best known websites, including Twitter, Facebook, Yahoo, and Wikipedia, are entirely or in part developed and deployed using open source products. Those who are concerned about security and vulnerabilities associated with open source products can rest assured: they are as good as, if not better than, most commercial products in that respect.
While there are numerous open source GIS packages, none is likely to be as powerful as the GIS that most crime analysts already have, which at the time of writing is usually ArcGIS. However, open source GIS packages are available that approximate at least some of the capabilities of commercial GIS packages, as are web map servers and other geospatial tools. Like most other open source products, many of these will run on multiple platforms, including Windows, Linux and Mac operating systems. For a complete listing of open source GIS software, their capabilities as well as links to where they can be downloaded, see http://en.wikipedia.org/wiki/List_of_geographic_information_systems_software.
Most crime analysis units have access to at least one SQL compatible database, usually Microsoft Access. While Access – especially the latest versions – will suit most analysts just fine, there are a number of very sophisticated and high powered open source databases available. Of those, MySQL (http://dev.mysql.com/) and PostgreSQL (http://www.postgresql.org/) are the most popular and sophisticated, easily rivaling some of the most prominent commercial products, including enterprise products. For those of you using ArcGIS, connectors are available for both databases. Using a more sophisticated open source database makes sense for those analysts who consistently deal with large amounts of data, need to create a data warehouse, or need to share data with other units or analysts.
Data migration tools are extremely useful for those analysts that regularly receive data in the form of direct SQL queries or export files from one or more databases such as CAD, RMS and others. Crime analysts in general spend too much time focusing on data collection and integration and not enough time doing analysis. With a little bit of work (and perhaps some help from the IT department) it is easy to set up routine data exports/imports from and to other databases. Many migration tools have simple drag and drop interfaces where all you need to do is fill in parameters such as export file locations, database credentials, SQL commands, etc. Those analysts with even basic programming skills can add as much sophistication as desired, like data transformations, SQL commands to automatically increment crime counts by areal unit and time period based on new CAD and/or RMS incidents coming in, etc.
The two most commonly used data migration tools are Mule (www.mulesoft.com) and Apache Camel (camel.apache.org). Both can connect to all common commercial and open source databases (even in the cloud) and run on any popular operating system. Mule and Camel also both work with the two most widely used open source integrated development environments (IDEs): Eclipse and NetBeans. For example, you can download a version of Mule that is already bundled with Eclipse and ready to use as soon as you have installed it.
The final piece of software in our crime forecasting toolkit is a decent statistical package. There are numerous general purpose open source statistical packages available, as well as some government funded ones that are free. Of the latter the one that really stands out is CrimeStat (http://nij.gov/topics/technology/maps/Pages/crimestat.aspx). CrimeStat was developed specifically for crime analysts and contains many spatial statistical routines useful for crime forecasting. While crime analysts have been using CrimeStat for years, it is worth mentioning because the latest version – CrimeStat 4 – includes two types of univariate forecasting, simple exponential smoothing and Holt smoothing as well as a Trigg signal detector that can identify areas that exhibit signs of unusual activity.
As far as general open source statistical packages are concerned, there is a plethora available (see http://en.wikipedia.org/wiki/List_of_statistical_packages for a detailed list). However, one in particular – R (http://www.r-project.org/) – easily stands out. R not only has highly complex statistical routines that can be used for crime forecasting and data exploration, but it can also produce high quality graphics. In addition to user groups, tutorials and forums a number of publications can help the budding crime forecaster get started (see for example Lander (2014) and Zumel and Mount (2014)). An excellent open source tutorial by Rob Hyndman and George Athanasopoulos of Monash University in Australia on using R for forecasting that explores some of the simple, intermediate and even advanced methods such as artificial neural networks can be found at https://www.otexts.org/fpp.
Finally, RStudio (http://www.rstudio.com/) is a powerful integrated development environment for R that will work on all major operating systems and includes tools that allow you to create interactive reports and visualizations for the web, which makes it well suited to disseminating forecasts and other crime analysis products.
Crime Forecasting Examples
There are six steps involved in producing multivariate one step ahead forecasts by areal unit. They are:
- Aggregate – obtain data by areal unit and time period. The more data you have, both in terms of the number of variables and the length of the time series, the better.
- Explore – pick a crime that you want to forecast and analyze the data for other variables that might impact changes in that crime.
- Explain – use theory combined with data analysis and an explanatory model such as regression to identify leading indicators.
- Forecast – develop forecasting models that use the leading indicators to produce one step ahead forecasts of crime.
- Feedback – learn from forecast errors and identify possible sources of error.
- Revise – make changes to the forecasting model, if any are needed, and re-estimate it for the next time period.
For univariate models steps 2 and 3 can be omitted. Further recommendations for producing crime forecasts are listed in Gorr and Olligschlaeger (2002).
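The first step, aggregation, is often the most tedious in practice. As a minimal sketch, assuming each incident has already been assigned to an areal unit and has a date, the following builds a table of monthly counts per area with explicit zeros for quiet months. The tract identifiers, the dates and the function name are hypothetical.

```python
from collections import Counter
from datetime import date


def monthly_panel(incidents, areas, months):
    """Return {area: [count per month]} with zeros where nothing was recorded."""
    counts = Counter((area, (d.year, d.month)) for area, d in incidents)
    return {a: [counts[(a, m)] for m in months] for a in areas}


# Hypothetical incidents as (areal unit, date of occurrence) pairs.
incidents = [("T1", date(2014, 1, 5)), ("T1", date(2014, 1, 20)),
             ("T2", date(2014, 2, 3)), ("T1", date(2014, 3, 9))]
areas = ["T1", "T2", "T3"]
months = [(2014, 1), (2014, 2), (2014, 3)]

panel = monthly_panel(incidents, areas, months)
```

Listing the areas and months explicitly matters: an area with no incidents at all, like T3 here, would otherwise vanish from the data instead of appearing as a series of zeros.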
One good way to explore your data is to use simple forecasts, apply the Trigg tracking signal and visualize the results on a map (see Gorr and Olligschlaeger, 2013a and 2013b). The Trigg tracking signal helps to identify areas showing unusual levels of crime (either increases or decreases).
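As a rough illustration of the idea, the sketch below computes the tracking signal in its usual textbook form: an exponentially smoothed forecast error divided by an exponentially smoothed absolute forecast error, which always lies between -1 and 1. The smoothing constant, the trip threshold and the warm-up length are assumptions chosen for this example and are not taken from CrimeStat, whose implementation may differ in its details.

```python
def trigg_signal(actuals, forecasts, gamma=0.15):
    """Tracking signal per period: smoothed error over smoothed absolute error."""
    smoothed_err = 0.0
    smoothed_abs = 0.0
    signals = []
    for actual, forecast in zip(actuals, forecasts):
        err = actual - forecast
        smoothed_err = gamma * err + (1 - gamma) * smoothed_err
        smoothed_abs = gamma * abs(err) + (1 - gamma) * smoothed_abs
        signals.append(smoothed_err / smoothed_abs if smoothed_abs else 0.0)
    return signals


def signal_trips(signals, threshold=0.5, warmup=3):
    """Flag +1 (unusually high), -1 (unusually low) or 0 for each period.

    The first `warmup` periods are never flagged because the signal starts
    out at +/-1 until a few errors have accumulated.
    """
    trips = []
    for t, s in enumerate(signals):
        if t < warmup or abs(s) <= threshold:
            trips.append(0)
        else:
            trips.append(1 if s > 0 else -1)
    return trips


# Hypothetical counts and one-step-ahead forecasts for a single area.
actuals = [3, 1, 3, 1, 6, 7, 8]
forecasts = [2, 2, 2, 2, 2, 3, 4]
signals = trigg_signal(actuals, forecasts)
trips = signal_trips(signals)
```

Errors that alternate in sign keep the signal near zero, while a run of errors in the same direction pushes it towards 1 or -1, which is what marks an area as departing from business as usual.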
Table 1 is an example of how CrimeStat 4 outputs results from univariate forecasting algorithms, including the Trigg signal strength and any signal trips. These data are then used to produce output such as that shown in Figure 1, which is a map of Trigg signal trips – meaning that shaded areas had unusually high (red) or low (blue) levels of activity – by census tract. Note that there are three areas where red and blue tracts neighbor each other. This is quite possibly an indicator of geographic displacement. One potential reason for geographic displacement of crime is police activity such as arrests for the crime you are trying to forecast, so police activity in a neighboring census tract would be a potential candidate as a leading indicator. Drilling down into data in other areas showing signs of unusual activity could provide further clues.
Another good idea is to create a matrix with all correlations between the dependent variable (the one you want to forecast) and the independent variables. Those variables that are highly correlated with the dependent variable might be good candidates for leading indicators. Table 2 shows correlations between dependent variable LLDRGTOT (log of total drug calls for service by area and time period) and 11 other variables. For example, the table shows that the number of nuisance bars (NBARS) is positively correlated and the log of median household income (LMDHHINC) is negatively correlated. Once candidates for leading indicators have been identified, explanatory models such as multiple regression can be used to verify that they are indeed significant contributors to changes in the dependent variable.
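Screening candidates by correlation takes only a few lines. The sketch below computes the Pearson correlation between a dependent variable and each candidate and ranks the candidates by the strength of the relationship. The variable names are borrowed from Table 2, but the numbers are invented for illustration and are not the values behind that table.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_candidates(dependent, candidates):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scored = [(name, pearson(values, dependent)) for name, values in candidates.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)


# Invented values for five area-periods; only the variable names come from the text.
lldrgtot = [1.0, 1.5, 2.0, 2.5, 3.0]
candidates = {
    "NBARS": [0, 1, 1, 2, 3],
    "LMDHHINC": [11.0, 10.8, 10.9, 10.4, 10.2],
}
ranking = rank_candidates(lldrgtot, candidates)
```

A high correlation only nominates a variable; whether it genuinely leads changes in crime still has to be checked with an explanatory model.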
Once you have identified your leading indicators it’s time to do some actual modeling. As mentioned earlier, it is always a good idea to try different models as well as different combinations of variables to see what works best. Table 3 is an example of what a comparison between different models might look like. The table lists a total of eight different models, including the random walk, which assumes that the number of crimes in the next time period is the same as that of the previous time period. In addition to CCF, which is a neural network based model, three types of regression models (Simple, Poisson and Tobit) are estimated with and without spatially lagged averages. Finally, results are compared to a holdout sample in order to determine how well each model generalizes. One thing that stands out is how difficult it is to beat the random walk, at least as far as overall model fit is concerned. Only the neural network model was able to beat it consistently in both the training data set and the holdout sample. With one exception all models generalize fairly well.
However, overall model fit doesn’t necessarily tell the whole story. Remember that one of the goals of crime forecasting is to tell police what they don’t already know. One way to do this – there are others – is to look at areas where crime was zero in the previous time period and non-zero in the current one. Table 4 shows a comparison of the same models shown in Table 3 for those instances. It is clear that both the simple regression and neural network models beat the random walk and that including spatially lagged averages in regression models improves results, at least for this particular data set. Also worth noting is that while the most complex method – neural networks – performs the best overall, it is only marginally better than simple regression. This reinforces the notion that intermediate methods can produce results almost as good as complex ones.
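The zero to non-zero comparison is easy to reproduce for your own models. The sketch below picks out the areas that had no crime in the previous period but some in the current one, and computes the mean absolute error of a set of forecasts on just those areas. All counts and the model forecasts shown are hypothetical.

```python
def emerging_cases(previous, current):
    """Indices of areas with zero crime last period and non-zero crime this period."""
    return [i for i, (p, c) in enumerate(zip(previous, current)) if p == 0 and c > 0]


def mae_on(indices, forecasts, actuals):
    """Mean absolute error restricted to the given areas."""
    return sum(abs(forecasts[i] - actuals[i]) for i in indices) / len(indices)


# Hypothetical counts for five areas in two consecutive periods.
previous = [0, 0, 3, 0, 1]
current = [2, 0, 4, 1, 0]
model_forecast = [1.2, 0.1, 3.5, 0.6, 0.8]   # invented output of some model

emerging = emerging_cases(previous, current)
random_walk_error = mae_on(emerging, previous, current)   # random walk = last period
model_error = mae_on(emerging, model_forecast, current)
```

By construction the random walk forecasts zero for every one of these areas, so this subset shows directly whether a model can anticipate crime where none was recorded the period before.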
While this article only briefly touched on some of the many crime forecasting methods available, the reader will hopefully have drawn the conclusion that crime forecasting within the context of predictive policing is without a doubt feasible for any police department and crime analysis unit regardless of financial resources. There is no doubt that specialized commercial products have played and will continue to play an important role in police departments that can afford them, but sophisticated, high end and free software products are available that can produce forecasts rivaling or at least coming close to those of many expensive commercial products. So for many agencies it makes sense to invest in training and, since data is such an important factor in producing good crime forecasts, to ensure that their crime analysis units have the most complete, accurate and wide ranging data sets available.
Cohen, J., Gorr, Wilpen and Olligschlaeger, Andreas M. (2007): “Leading Indicators and Spatial Interactions: A Crime-Forecasting Model for Proactive Police Deployment”, Geographical Analysis 39 (1), 105-127
Gorr, Wilpen and Olligschlaeger, Andreas M. (2002): “Crime Hotspot Forecasting: Modeling and Comparative Evaluation”, Final Project Report, Grant 98-IJ-CX-K005, National Institute of Justice
Lander, Jared P. (2014): “R for Everyone”, Addison Wesley Data and Analytics Series
Gorr, Wilpen and Olligschlaeger, Andreas M. (2013a): “Time Series Forecasting”, CrimeStat 4 documentation, Chapter 23
Gorr, Wilpen and Olligschlaeger, Andreas M. (2013b): “The CrimeStat Time Series Forecasting Module”, CrimeStat 4 documentation, Chapter 24
Olligschlaeger, Andreas M. (1997): “Spatial Analysis of Crime Using GIS-Based Data: Weighted Spatial Adaptive Filtering and Chaotic Cellular Forecasting with Applications to Street Level Drug Markets”, unpublished dissertation, Carnegie Mellon University
Olligschlaeger, Andreas M. (1998): “Crime Mapping in the Next Century: An Artificial Neural Network based Early Warning System”, in David Weisburd and Tom McEwen (eds.) Crime Mapping and Crime Prevention, Crime Prevention Studies, Volume 8, Criminal Justice Press
Perry, Walter L., McInnis, Brian, Price, Carter C., Smith, Susan C. and Hollywood, John S. (2013): “Predictive Policing”, Rand Corporation
Zumel, Nina and Mount, John (2014): “Practical Data Science with R”, Manning Publications