Event Registration is a prototype developed for the Norwegian Police to support investigations, prevent crime, and provide statistics for better enterprise governance. The entire back-end stack of the system is built on semantic technologies integrated with Java through the Apache Jena framework for semantic programming.

RDF – flexible statements

Semantic Web technologies (do not let the term «Web» fool you, as they can be applied entirely independently of the Web) include the W3C standards RDF for representing data, SPARQL for querying RDF data, and OWL and RDFS for applying a data model to the data. RDF data can be fully schema-free and exist solely as fact statements, which makes it well suited to systems where agreeing on a data schema up front is not beneficial. This allows the data schema to evolve, adapt, and be extended throughout the life cycle of the system. In Event Registration, it is important that the police can express anything about everything, so the need for a flexible data structure is great. This was the main reason semantic technologies were adopted in the project.
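To make the idea of schema-free fact statements concrete, here is a minimal sketch in plain Java. It is not how Event Registration stores data (that is done with Jena and an RDF store); the `Triple` and `FactStore` types and the `ex:` names are hypothetical and exist only for illustration. The point is that a new kind of fact can be added without changing any schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an RDF statement:
// every fact is a (subject, predicate, object) triple.
record Triple(String subject, String predicate, String object) {}

// Hypothetical in-memory fact store, for illustration only.
class FactStore {
    private final List<Triple> facts = new ArrayList<>();

    // Any fact can be stated; no predicate has to be declared up front.
    void add(String subject, String predicate, String object) {
        facts.add(new Triple(subject, predicate, object));
    }

    // Returns every object stated for the given subject and predicate.
    List<String> objectsOf(String subject, String predicate) {
        return facts.stream()
                .filter(t -> t.subject().equals(subject) && t.predicate().equals(predicate))
                .map(Triple::object)
                .toList();
    }

    int size() {
        return facts.size();
    }

    // Builds a small example: a theft event described by three facts.
    static FactStore theftExample() {
        FactStore store = new FactStore();
        store.add("ex:event2", "rdf:type", "ex:Theft");
        store.add("ex:event2", "ex:stolenItem", "ex:wallet1");
        // A property nobody planned for can be added later, with no schema change.
        store.add("ex:event2", "ex:crowdDensity", "high");
        return store;
    }
}
```

Calling `FactStore.theftExample().objectsOf("ex:event2", "ex:stolenItem")` returns a list containing `ex:wallet1`. In real RDF the subjects and predicates would be full IRIs rather than short strings.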

Imagine a case where a woman comes into the police station to file a report about a theft. She has just come from a large event with a big, dense crowd, where she suddenly discovered that her wallet had been stolen from her bag. The event she was attending and the theft itself are registered as the first two events in the event sequence canvas depicted in Picture 1.

The police have few leads to follow in this case, but two weeks later the victim comes back to the police station and files a new report: she has received a letter from a bank stating that she has opened an account there. The victim tells the police that she never opened this account herself, which suggests that she may have been the victim of identity theft.

The police now investigate further by talking to an employee at the bank in question, who confirms that the account was indeed opened in the victim’s name, and that a consumer loan of 500 000 Norwegian kroner (approximately USD 60,000) was also taken out in the same name. The bank employee also notices that the consumer loan was paid out to the same bank account, and that one day later a car was purchased from a car retailer. Finally, they see that the remaining money was used in a transaction of some sort abroad, suggesting that the offender had already left the country.

Picture 1: The full case of an extensive identity theft.

The chain of events is registered by the police, and now that they have a hypothesis of how the events occurred, they can use the «Phenomena-search» functionality to identify similar cases (see Picture 2).

Picture 2: Searching for patterns across cases matching the given case from Picture 1.

In Picture 2, the police have drawn a pattern on the search canvas to the left, matching the chain of events from Picture 1. The search results on the right are clusters of patterns in other cases that match the pattern on the search canvas. This lets the police uncover phenomena and gain valuable insight and knowledge from similar cases.

This blog post merely scratches the surface of what the phenomena search can do, as we have not even touched on filtering by attribute values, or on adding geographical places or persons and their roles to the equation. Still, the example illustrates that by discovering phenomena across cases, the police can see patterns that help them prevent crime and solve cases.

For instance, they could prevent similar thefts by encouraging people at large events to secure their valuables properly, by allocating police resources at such events more efficiently, and by looking at how the routines for opening bank accounts and taking out loans can be made more secure.

Finally, the chain of events suggests that several organized criminal groups are cooperating in this type of crime, as it points to one group being responsible for stealing identity papers and selling them to others who want fake papers. This makes it easier for the police to understand what they are up against, and the knowledge and experience extracted from similar cases can be extremely valuable for the investigation.

Technical aspects

The technical part of the phenomena search starts off with the requesting client building a JSON search criteria structure of the entities and the relations between them from the search canvas.

The next step is to convert the JSON search criteria into SPARQL queries by using the ARQ API in the Apache Jena framework. ARQ is Jena's SPARQL query engine. It is used here to build SPARQL queries dynamically and to execute them against an RDF database, in this case TDB, Jena's native RDF store.
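The conversion from search criteria to a SPARQL query can be sketched roughly as follows. Note the assumptions: the actual system builds its queries with the ARQ API, whereas this sketch uses plain string building so that it runs without any dependencies. The `EventNode`, `Relation`, and `SearchCriteria` types, the `ex:` namespace, and the `followedBy` predicate are hypothetical and not taken from the real data model.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical shape of the search criteria after parsing the client's JSON.
record EventNode(String variable, String type) {}
record Relation(String from, String predicate, String to) {}
record SearchCriteria(List<EventNode> nodes, List<Relation> relations) {}

class PatternQueryBuilder {
    // Assumed example namespace, not the one used by Event Registration.
    static final String PREFIX = "PREFIX ex: <http://example.org/events#>\n";

    // Reject anything that is not a simple identifier, since names are
    // concatenated directly into the query text.
    private static String safe(String name) {
        if (!name.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unsafe name: " + name);
        }
        return name;
    }

    // Emits one type pattern per event node and one triple pattern per relation.
    static String toSparql(SearchCriteria criteria) {
        String vars = criteria.nodes().stream()
                .map(n -> "?" + safe(n.variable()))
                .collect(Collectors.joining(" "));
        StringBuilder sb = new StringBuilder(PREFIX);
        sb.append("SELECT ").append(vars).append(" WHERE {\n");
        for (EventNode n : criteria.nodes()) {
            sb.append("  ?").append(safe(n.variable()))
              .append(" a ex:").append(safe(n.type())).append(" .\n");
        }
        for (Relation r : criteria.relations()) {
            sb.append("  ?").append(safe(r.from()))
              .append(" ex:").append(safe(r.predicate()))
              .append(" ?").append(safe(r.to())).append(" .\n");
        }
        return sb.append("}").toString();
    }
}
```

For a pattern with a `Theft` node `e1`, an `AccountOpening` node `e2`, and a `followedBy` relation from `e1` to `e2`, the generated WHERE clause contains the lines `?e1 a ex:Theft .`, `?e2 a ex:AccountOpening .`, and `?e1 ex:followedBy ?e2 .`. Building the query as an object structure, as a query-building API allows, avoids the string validation that this sketch has to do by hand.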

Finally, the results of the executed SPARQL query are converted into a JSON structure that is sent back to the client that issued the initial request. Picture 3 shows an overview of the entire data flow of the phenomena search.

Picture 3: The overview of the data flow of the phenomena search.
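The last step of the data flow, turning query results into JSON, might look something like the sketch below. It assumes that each result row has already been read into a map from variable name to value. The `ResultJson` class and the example values are hypothetical, and a real implementation would most likely use a JSON library instead of hand-written escaping.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ResultJson {
    // Wraps a string in double quotes, escaping backslashes and double quotes only.
    // Control characters are not handled; a JSON library would cover those.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serializes each row as a JSON object and the list of rows as a JSON array.
    // Key order follows the iteration order of each map.
    static String toJson(List<Map<String, String>> rows) {
        return rows.stream()
                .map(row -> row.entrySet().stream()
                        .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                        .collect(Collectors.joining(",", "{", "}")))
                .collect(Collectors.joining(",", "[", "]"));
    }
}
```

Passing in maps with a defined iteration order, such as `TreeMap` or `LinkedHashMap`, keeps the key order in the output stable.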
