RTMDx Report

Model-06_12_2014_162627

6/12/2014 17:59

Result Summary

A significant Risk Terrain Model containing 2 Risk Factors was found for leeds_burgd_2011 in leeds_district (see "Best" Model Specification below).

Analysis Input Details

The Risk Terrain Modeling Diagnostics Utility was run to generate a model of leeds_burgd_2011 for leeds_district on 06/12/2014 17:59:27, and the results were saved as Model-06_12_2014_162627 in the C:\Users\Nick\Desktop\results folder. The model represents the risk factors for 13857 events in the leeds_burgd_2011 data set, considering the potential spatial influences of det_points and bus_stops. All geographic calculations were conducted in the projection of the study area boundary (see below), using raster cells of 250 m and an average block length of 500 m. The analysis used 9199 raster cells, of which 2389 contained events.

Analysis Parameters

The Utility was provided with the following risk factors and parameters:

Name	Feature Count	Operationalization	Spatial Influence	Analysis Increment
det_points	66118	Both_Proximity_and_Density	3 Blocks	Whole
bus_stops	4202	Proximity	3 Blocks	Whole

These 2 risk factors generated 9 variables that were tested for significance. Testing began by building an elastic net penalized regression model assuming a Poisson distribution of events. Through cross validation, this process selected 5 variables as potentially useful. These variables were then used in a bidirectional stepwise regression process, starting with a null model, to build an optimal model by optimizing the Bayesian Information Criterion (BIC). This score balances how well the model fits the data against the complexity of the model. The stepwise regression process was conducted for both Poisson and Negative Binomial distributions, with the best BIC score used to select between the distributions.
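
One reading consistent with the count of 9 is that each operationalization of a risk factor is tested at every whole-block increment up to its maximum spatial influence. The Python sketch below enumerates the variables under that assumption, using the 500 m block length from the report. The variable names are chosen here for illustration and are not RTMDx's internal identifiers.

```python
# Illustrative enumeration of candidate variables; names are hypothetical.
BLOCK_LENGTH_M = 500  # average block length stated in the report

# (risk factor, operationalizations, maximum spatial influence in blocks)
risk_factors = [
    ("det_points", ["proximity", "density"], 3),
    ("bus_stops", ["proximity"], 3),
]

def candidate_variables(factors, block_length_m):
    """List one variable per factor, operationalization, and whole-block distance."""
    variables = []
    for name, operationalizations, max_blocks in factors:
        for op in operationalizations:
            for blocks in range(1, max_blocks + 1):
                variables.append(f"{name}_{op}_{blocks * block_length_m}")
    return variables

variables = candidate_variables(risk_factors, BLOCK_LENGTH_M)
print(len(variables))
```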

"Best" Model Specification

The RTMDx Utility determined that the best risk terrain model was a Negative Binomial type II model with 2 risk factors and a BIC score of 18325. The model also includes an intercept term that represents the background rate of events and an intercept term that represents overdispersion of the event counts:

Type	Name	Operationalization	Spatial Influence (m)	Coefficient	Relative Risk Value
Rate	bus stops	Proximity	500	3.0617	21.3638
Rate	det points	Density	500	2.1578	8.6521
Rate	Intercept	--	--	-3.3565	--
Overdispersion	Intercept	--	--	1.6985	--
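
Because the rate model uses a log link, each Relative Risk Value in the table is the exponential of its coefficient, and the exponential of the rate intercept is the expected event count in a cell exposed to neither risk factor. A short Python check of those figures follows. It uses the rounded coefficients printed above, so agreement with the report is only to about three decimal places.

```python
import math

# Coefficients copied from the model specification table (rounded to 4 d.p.)
coefficients = {"bus stops": 3.0617, "det points": 2.1578}
rate_intercept = -3.3565

# Relative risk value = exp(coefficient) under a log link
relative_risk = {name: math.exp(b) for name, b in coefficients.items()}

# Expected events per cell when both risk factors are absent
baseline_rate = math.exp(rate_intercept)

for name, rr in relative_risk.items():
    print(f"{name}: {rr:.4f}")
print(f"baseline events per cell: {baseline_rate:.4f}")
```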

R Text Summary

*******************************************************************
Family:  c("NBII", "Negative Binomial type II") 

Call:  gamlss(formula = crime_count ~ r01d01_det_points_density_500 +  
    r02p01_bus_stops_proximity_500, sigma.formula = ~1, family = NBII,  
    data = raster.data, method = mixed(3, 10)) 

Fitting method: mixed(3, 10) 

-------------------------------------------------------------------
Mu link function:  log
Mu Coefficients:
                                Estimate  Std. Error  t value    Pr(>|t|)
(Intercept)                       -3.356     0.15578   -21.55  1.717e-100
r01d01_det_points_density_500      2.158     0.04072    52.99   0.000e+00
r02p01_bus_stops_proximity_500     3.062     0.15789    19.39   3.897e-82

-------------------------------------------------------------------
Sigma link function:  log
Sigma Coefficients:
  Estimate  Std. Error     t value    Pr(>|t|)  
   1.69852     0.03932    43.20191     0.00000  

-------------------------------------------------------------------
No. of observations in the fit:  9199 
Degrees of Freedom for the fit:  4
      Residual Deg. of Freedom:  9195 
                      at cycle:  3 
 
Global Deviance:     18288.17 
            AIC:     18296.17 
            SBC:     18324.68 
*******************************************************************
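
The AIC and SBC values in the summary can be recovered from the global deviance: both are the deviance plus a penalty per fitted degree of freedom (2 for AIC, the natural log of the number of observations for SBC). A small Python sketch, using the rounded figures printed above; the helper name is hypothetical.

```python
import math

# Values copied from the R text summary
global_deviance = 18288.17
n_observations = 9199
df_fit = 4

def penalized_deviance(deviance, df, penalty):
    """Deviance plus a fixed penalty for each fitted degree of freedom."""
    return deviance + penalty * df

aic = penalized_deviance(global_deviance, df_fit, 2.0)
sbc = penalized_deviance(global_deviance, df_fit, math.log(n_observations))
print(round(aic, 2), round(sbc, 2))
```

The SBC is the BIC quoted in the model specification, where it is rounded to 18325.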

Risk Terrain Map Production

The selected risk terrain model was used to assign relative risk scores to cells, ranging from 1 for the lowest risk cell to 184.8 for the highest risk cell. These scores allow cells to be easily compared. For instance, a cell with a score of 184.8 has an expected rate of crime that is 184.8 times higher than a cell with a score of 1.

You can reproduce these risk scores in common GIS software by operationalizing the risk factors using the "best" model specifications displayed above. Risk factors based upon proximity should be set to 1 for areas within the distance threshold and 0 elsewhere. Risk factors based upon density should be set to 1 for areas 2 standard deviations above the mean value after applying a kernel density operation of the specified bandwidth and set to 0 in other areas.
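
The two operationalization rules can be sketched in plain Python. This is an illustration on toy inputs, not the RTMDx implementation: the function names, the cell-center coordinates, and the precomputed density values are all assumptions, and the kernel density step itself is left to the GIS software. The report does not say whether the standard deviation is the population or sample form, nor whether the distance threshold is inclusive; the population form and an inclusive threshold are assumed here.

```python
import math
import statistics

def proximity_layer(cell_centers, features, threshold_m):
    """Return 1 for cells whose nearest feature is within threshold_m, else 0."""
    layer = []
    for cx, cy in cell_centers:
        nearest = min(math.hypot(cx - fx, cy - fy) for fx, fy in features)
        layer.append(1 if nearest <= threshold_m else 0)
    return layer

def density_layer(density_values, n_sd=2.0):
    """Return 1 for cells more than n_sd standard deviations above the mean density."""
    cutoff = statistics.mean(density_values) + n_sd * statistics.pstdev(density_values)
    return [1 if value > cutoff else 0 for value in density_values]

# Toy data: four cell centers (meters) and a single feature at the origin
cells = [(0.0, 0.0), (300.0, 400.0), (600.0, 0.0), (2000.0, 2000.0)]
bus_stops = [(0.0, 0.0)]
bus_layer = proximity_layer(cells, bus_stops, threshold_m=500)
print(bus_layer)

# Toy kernel density values with one pronounced hot spot
densities = [1.0] * 9 + [50.0]
det_layer = density_layer(densities)
print(det_layer)
```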

The 2 manually produced risk map layers can then be combined through map algebra to produce a risk terrain map and to calculate relative risk scores. For example, using ArcGIS for Desktop's "Raster Calculator" function, you can copy and paste the following formula to assign a relative risk score to each cell, updating the risk map layer names as needed:

Exp(-3.3565 + 3.0617 * "bus stops" + 2.1578 * "det points") / Exp(-3.3565)
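
The same map algebra can be checked outside a GIS. The Python sketch below evaluates the formula for every combination of the two binary layers. The function name is hypothetical, and the coefficients are the rounded values from the formula above.

```python
import math

# Rounded coefficients from the Raster Calculator formula
INTERCEPT = -3.3565
B_BUS_STOPS = 3.0617
B_DET_POINTS = 2.1578

def relative_risk_score(bus_stops, det_points):
    """Relative risk score for one cell; inputs are the 0/1 layer values."""
    linear = INTERCEPT + B_BUS_STOPS * bus_stops + B_DET_POINTS * det_points
    return math.exp(linear) / math.exp(INTERCEPT)

# Score every combination of the two binary risk map layers
scores = {
    (bus, det): relative_risk_score(bus, det)
    for bus in (0, 1)
    for det in (0, 1)
}
for (bus, det), score in scores.items():
    print(f"bus stops={bus}, det points={det}: {score:.1f}")
```

A cell with neither risk factor scores 1, and a cell with both scores about 184.8, which matches the range reported for the risk terrain map.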