Mapping #OpFortress tweets
Hampshire Constabulary are running Operation Fortress, an operation to combat drug-related crime in Southampton, and have been posting tweets about it with the hashtag #OpFortress. It has been an effective way of showing progress and engaging with the public. The tweets sometimes give advice or ask for help, and they also report operation updates such as arrests, raids or crimes, often with the location at which the event took place.
I thought it would be interesting to see whether these events or crimes could be geolocated solely from the tweets. You could also get this data from the police crime data API at data.police.uk, but I thought it would be cool to pull it from Twitter instead. If the approach works for this project, then in principle you could geolocate tweets from any feed, just from the text, to street level anywhere in Great Britain.
Processing the tweets
The location data I am looking for in each tweet will be in the form of road and place names, but I don’t know how it will be structured, whether any abbreviations will be used, or even whether the spelling will be correct. So I am going to have to apply some kind of fuzzy text search to the unstructured tweet text, matching it against a known list of road names and place names.
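To make the idea of fuzzy matching concrete, here is a minimal sketch using Python's standard `difflib` module. It is not necessarily the algorithm used for the map. The road list, the abbreviation table, the `normalise` and `fuzzy_lookup` helpers and the 0.85 cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical sample of known road names; the real list would come from the source data.
KNOWN_ROADS = ["SHIRLEY ROAD", "PORTSWOOD ROAD", "ABOVE BAR STREET", "MILLBROOK ROAD"]

# Assumed, incomplete abbreviation table. Note "ST" can also mean "SAINT",
# so a real implementation would need to treat it with more care.
ABBREVIATIONS = {"RD": "ROAD", "ST": "STREET", "AVE": "AVENUE"}


def normalise(phrase):
    """Upper-case the phrase, drop full stops and expand known abbreviations."""
    words = phrase.upper().replace(".", "").split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def fuzzy_lookup(phrase, names=KNOWN_ROADS, cutoff=0.85):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(normalise(phrase), names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(fuzzy_lookup("Shirley Rd"))      # abbreviation
print(fuzzy_lookup("Portswod Road"))   # misspelling
print(fuzzy_lookup("Downing Street"))  # not in the list
```

A higher cutoff reduces false matches at the cost of missing badly misspelled names, so the value would need tuning against real tweets.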
I decided to use Ordnance Survey Open Data products as my source data.
The datasets are available under an Open Data licence, so I can download them and load them into a database ready for searching. Having read a few of the tweets, I noticed that they generally contain the names of the roads where events occurred, so I will primarily use the OS Locator product to find road names in tweets, falling back to the 50K Gazetteer when no road is matched.
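The road-first, gazetteer-second lookup could be sketched as below, using an in-memory SQLite database. The table names, column names and the `build_db` and `locate` helpers are hypothetical, and the rows are made-up placeholder values rather than real Ordnance Survey data. The actual products have many more fields.

```python
import sqlite3


def build_db():
    """Create an in-memory database with two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE roads (name TEXT, easting INTEGER, northing INTEGER)")
    conn.execute("CREATE TABLE places (name TEXT, easting INTEGER, northing INTEGER)")
    # Placeholder rows: the coordinates are invented for the example.
    conn.executemany(
        "INSERT INTO roads VALUES (?, ?, ?)",
        [("SHIRLEY ROAD", 440000, 113000)],
    )
    conn.executemany(
        "INSERT INTO places VALUES (?, ?, ?)",
        [("SHIRLEY", 439900, 113900), ("PORTSWOOD", 442000, 114000)],
    )
    return conn


def locate(conn, name):
    """Look for the name in the roads table first, then fall back to places."""
    for table in ("roads", "places"):  # fixed table names, safe to interpolate
        row = conn.execute(
            f"SELECT name, easting, northing FROM {table} WHERE name = ?",
            (name.upper(),),
        ).fetchone()
        if row:
            return (table,) + row
    return None


conn = build_db()
print(locate(conn, "Shirley Road"))  # found in roads
print(locate(conn, "Portswood"))     # falls back to places
print(locate(conn, "Nowhere"))       # no match in either table
```

This version uses exact matching to keep the fallback logic visible. In practice the comparison would be the fuzzy one described earlier.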
The process started with finding all the #OpFortress tweets from @HantsPolice. Matching them against the Open Data products then involved sifting through the words in each tweet, testing whether they existed in, or resembled something in, the database of road and place names, and extracting the location data for any match. I was able to write some code to do all this and found some standard libraries to access Twitter data, so the process can be run again and again, capturing new tweets and data.
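As a rough illustration of that sifting step, the sketch below filters a list of tweet texts by hashtag, slides a window of one to three words over each tweet, and fuzzy-matches every phrase against the known names. It is one possible approach rather than the code behind the map. The tweets, the `ROADS` dictionary, its coordinates and the helper names are invented for the example, and fetching tweets from Twitter is left out entirely.

```python
import difflib
import re

# Hypothetical lookup of road name -> (easting, northing); values are made up.
ROADS = {"SHIRLEY ROAD": (440000, 113000), "MILLBROOK ROAD": (439000, 112500)}


def ngrams(words, max_len=3):
    """Yield phrases of max_len words down to 1 word, longest first."""
    for size in range(max_len, 0, -1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def locate_tweet(text, names=ROADS, cutoff=0.9):
    """Return (matched_name, coordinates) for the first phrase that matches."""
    words = re.findall(r"[A-Za-z']+", text.upper())
    for phrase in ngrams(words):
        hit = difflib.get_close_matches(phrase, list(names), n=1, cutoff=cutoff)
        if hit:
            return hit[0], names[hit[0]]
    return None


def locate_all(tweets, hashtag="#OPFORTRESS"):
    """Keep only tweets with the hashtag and attach any location found."""
    results = []
    for text in tweets:
        if hashtag not in text.upper():
            continue
        match = locate_tweet(text)
        if match:
            results.append((text, *match))
    return results


sample = [
    "Two arrests after a warrant in Shirley Raod this morning #OpFortress",
    "Delays on Millbrook Road this evening",
    "Thanks for your support #OpFortress",
]
for text, name, coords in locate_all(sample):
    print(name, coords, "<-", text)
```

Trying longer phrases first means a two-word road name is preferred over a one-word place name that happens to be part of it.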
Visualising the results
The output from the matching process can be formatted into something that OpenLayers can parse and project onto a map. The results are overlaid on the Ordnance Survey OpenSpace backdrop mapping service.
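The exact output format is not important here, but GeoJSON is one format that OpenLayers can read, so a minimal sketch of the formatting step might look like this. The `to_geojson` helper, the property names and the sample match are assumptions. The coordinates are left as eastings and northings, which suits a map set up in British National Grid; strict GeoJSON expects longitude and latitude, so they would need reprojecting for other maps.

```python
import json


def to_geojson(matches):
    """Turn (tweet_text, matched_name, (easting, northing)) tuples into GeoJSON text."""
    features = []
    for text, name, (easting, northing) in matches:
        features.append({
            "type": "Feature",
            # Coordinates are kept in the source grid, not converted to lon/lat.
            "geometry": {"type": "Point", "coordinates": [easting, northing]},
            "properties": {"tweet": text, "matched_name": name},
        })
    return json.dumps({"type": "FeatureCollection", "features": features}, indent=2)


# Made-up example match, in the same shape as the matching step might produce.
example = [("Warrant executed in Shirley Road #OpFortress", "SHIRLEY ROAD", (440000, 113000))]
print(to_geojson(example))
```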
View the op-fortress map.