Near-repeat modelling: hands on

Finally, let's see how well our model predicts crimes. Here's the work-through; where we use new tools, we'll point them out.

1) From the original CSV file layer, select all the crimes falling in the week after the crimes we selected earlier, so 2003-10-27 to 2003-11-03, and save them as a new layer we'll call "next-week".

2) Click on Vector -> Data Management Tools -> Split Vector Layer, and run this on the robbery-buffer layer to split it into three layers, one per week, using the WEEK_NO column. Note that this tool needs you to select an output directory, not a filename.

3) Go into Layer -> Add Layer -> Add Vector Layer, and add the three new layers. You need the ".shp" shapefiles. You can select all three files at once.

4) Go into Vector -> Data Management Tools -> Join Attributes By Location, and join the next-week file (as the target) with the split vector file for Week 41. Keep only matching records and use the first record found for attributes (the default settings). We'll call the layer produced 'r41'.

5) Now join the r41 file with the split vector file for Week 42 in the same way to make r42.

6) Finally, join the r42 file with the split vector file for Week 43 in the same way to make r43. This should now contain only the points that lie beneath circles from all three weeks' worth of buffers.

7) Right-click r43 and select 'show feature count' to show the number of crimes covered by buffers from all three weeks.

8) To determine the crimes not covered by any circles, use the Join Attributes By Location tool to join next-week and the original robbery-buffer layer, with the same settings as before. This will give you all the crimes covered by at least one buffer. Use "show feature count" on the layer produced to show how many there are, then "show feature count" on the next-week layer to show the total number of crimes. Subtract the number covered by at least one buffer from the total to give the number not covered by any buffers.
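
If you want to check the logic of steps 4 to 8 outside the GIS, the counting can be sketched in a few lines of Python. This is only an illustration, not part of the practical: it assumes each buffer is a simple circle given as (x, y, week) in projected metres, and the function names and sample coordinates are made up for the example.

```python
from math import hypot

def covered_weeks(point, buffers, radius=200.0):
    """Return the set of week numbers whose buffer circles contain the point."""
    px, py = point
    return {week for (bx, by, week) in buffers
            if hypot(px - bx, py - by) <= radius}

def coverage_counts(points, buffers, weeks=(41, 42, 43), radius=200.0):
    """Count points under all weeks' buffers, under at least one, and under none."""
    all_weeks = set(weeks)
    in_all = 0
    in_any = 0
    for point in points:
        hit = covered_weeks(point, buffers, radius)
        if hit:
            in_any += 1          # like the join against the whole buffer layer (step 8)
        if all_weeks <= hit:
            in_all += 1          # like the chained joins r41 -> r42 -> r43 (steps 4 to 6)
    return {
        "total": len(points),
        "all_weeks": in_all,
        "any_week": in_any,
        "none": len(points) - in_any,
    }

# Hypothetical data: buffer centres (x, y, week) and next-week crime points (x, y).
buffers = [(0, 0, 41), (100, 0, 42), (0, 100, 43), (1000, 1000, 41)]
points = [(50, 50), (1050, 1000), (5000, 5000)]

counts = coverage_counts(points, buffers)
print(counts)
```

Chaining the three joins while keeping only matching records is the same as asking for points that fall inside a buffer from every week, which is what the subset test in the loop does.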

With a 200m buffer, we get 37 covered by three buffers, 75 in next-week, and 52 covered by at least one buffer, giving 23 not covered by any buffers. Of course the quality of this prediction will depend on the prior method used by your police, but 52 out of 75 (roughly two-thirds) seems a pretty good ratio, given we weren't initially sure this burglary-centred technique would work on street robbery data.
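
As a quick check on the arithmetic, here are the same figures as proportions of the next-week total. The variable names are just for illustration; the numbers are the ones quoted above.

```python
total = 75        # crimes in next-week
any_buffer = 52   # covered by at least one buffer
all_three = 37    # covered by buffers from all three weeks

none = total - any_buffer
print(none)                          # 23
print(f"{any_buffer / total:.1%}")   # 69.3%
print(f"{all_three / total:.1%}")    # 49.3%
print(f"{none / total:.1%}")         # 30.7%
```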