SPATIAL AND COLLATERAL DATA MINING FOR CRIME DETECTION AND ANALYSIS



DATA AGGREGATION

The term “aggregation” is used here as a neutral word for the mathematical term “set”. The mathematical methods developed for analyzing these coherent aggregations are referred to under the general term “cluster analysis.” Most modern techniques are extensions of methods developed within mathematical statistics, though some have arisen from data management techniques developed by computer scientists. Parametric statistics, such as the mean and standard deviation, can be used to describe clusters. Nonparametric statistics make fewer assumptions about the distribution of the data; their terms and methods are less familiar, but they are particularly useful when the number of data points is small.

For both types of techniques, the analyst builds a model of the data. The model is usually mathematical and might be represented as a graph, table, or drawing. A model is an explanation of the data; if it is good, it may also support predictions about future data points. Cluster analysis can therefore provide a foundation for predictive modeling. For example, to develop a model that identifies persons likely to commit major crimes, the analyst would look for common indicators or information patterns within separate clusters, such as a cluster of fraud offenders and a cluster of petty criminals. Once these aggregations are located and formally delineated, data on prospective criminals can be examined to identify the cluster to which each is most likely to belong.
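As a minimal sketch of the idea above, assume the clusters have already been delineated and each record has been reduced to numeric features. The choice of features (age and height), the cluster names, and all values below are hypothetical illustration data, not drawn from the police records. Each cluster is summarised by its per-feature mean and standard deviation, and a new record is assigned to the cluster whose mean is nearest:

```python
from math import dist
from statistics import mean, stdev

# Hypothetical, already-delineated clusters of (age, height_cm) records.
clusters = {
    "A": [(19, 165), (21, 170), (23, 168)],
    "B": [(40, 175), (44, 180), (42, 178)],
}

def describe(points):
    """Return a (mean, standard deviation) pair for each feature of a cluster."""
    columns = list(zip(*points))
    return [(mean(c), stdev(c)) for c in columns]

def nearest_cluster(point, clusters):
    """Assign a point to the cluster whose centroid is closest (Euclidean distance)."""
    centroids = {
        name: tuple(mean(c) for c in zip(*pts)) for name, pts in clusters.items()
    }
    return min(centroids, key=lambda name: dist(point, centroids[name]))

# Parametric description of each cluster.
summary = {name: describe(pts) for name, pts in clusters.items()}

# "Membership" of a new, hypothetical record.
assigned = nearest_cluster((22, 169), clusters)
```

Nearest-centroid assignment is only one simple way to decide membership; it is used here because it needs nothing beyond the mean already mentioned in the text.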

The data source, from the police department of Hyderabad, is an investigation report prepared by police officers about a criminal offence; it contains descriptive information on the crime type, time of incident, type of weapon, and details of the incident. The key terms extracted include those referring to the age, gender, and physical description of the suspect, the time and location of the incident, and the type of injury or death resulting from it. PolyAnalyst software provides a user-defined interface for building the application. Data mining techniques are used to generate reports, graphs, and charts from historical data.

Spatial data:
  • Digital map of Hyderabad metro area with Municipal ward and Police station limit boundaries.
  • City map showing vital installations and VIP residence areas, movement areas, and corridors
  • Choropleth map of number of crimes within each police station limits
Non-spatial data (attribute information from police records):
  • CrimeID: Unique ID that designates each individual crime.
  • CrimeName: Disguised name of the crime.
  • Gender: Gender of the individual criminal.
  • Age: Age of the individual criminal.
  • Height: Height of the individual criminal.
  • Location: Location of the individual criminal.
  • CrimeType: Indicates the type of crime with which the criminal is associated.
  • WeaponUsed: Indicates the type of weapon the criminal used.
The project components are:
  • Dataset
  • Rules
  • Graphs and Charts
  • Reports
These four components are discussed in the next few pages.

DataSet: The World dataset gives the structure of the data. When importing, make sure that Procedure codes are designated as categories rather than numbers.
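The reason for importing codes as categories can be illustrated outside PolyAnalyst with a small sketch. The CSV content, the `CrimeTypeCode` column, and its values are hypothetical; the point is only that a code is a label, so it is kept as text rather than converted to a number:

```python
import csv
import io

# Hypothetical export; CrimeTypeCode is a label, not a quantity.
raw = "CrimeID,CrimeTypeCode\nC001,07\nC002,12\nC003,07\n"

# csv.DictReader leaves every field as a string, so codes stay categorical.
rows = list(csv.DictReader(io.StringIO(raw)))

# The distinct categories present in the data.
categories = sorted({row["CrimeTypeCode"] for row in rows})

# Converting to a number would lose the leading zero and invite meaningless
# arithmetic such as averaging codes.
as_number = int(rows[0]["CrimeTypeCode"])
```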

RULES: The following rules have been formulated:
Location of crime:
This rule acts as an attribute in the World dataset for identifying the spatial pattern (location or name of police station limit) of a crime.
Duration_of_Crime: 2003 - year(firstCrime)
This rule acts as an attribute in the World dataset for calculating how long an individual has been active in crime.
Age_of_firstcrime: Age - Duration_of_Crime
This rule acts as an attribute in the World dataset for calculating the age of the person when the first crime was committed.
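The two arithmetic rules can be sketched as follows. This is an illustration rather than PolyAnalyst rule syntax: the record layout and the `FirstCrimeYear` field (standing in for year(firstCrime)) are assumptions, the sample values are hypothetical, and Age is assumed to be the age as of the reference year 2003.

```python
REFERENCE_YEAR = 2003  # reference year used in the Duration_of_Crime rule

def apply_rules(record):
    """Return a copy of the record with the two derived attributes added."""
    # Duration_of_Crime: 2003 - year(firstCrime)
    duration = REFERENCE_YEAR - record["FirstCrimeYear"]
    derived = dict(record)
    derived["Duration_of_Crime"] = duration
    # Age_of_firstcrime: Age - Duration_of_Crime
    derived["Age_of_firstcrime"] = record["Age"] - duration
    return derived

# Hypothetical record: aged 35 in 2003, first crime recorded in 1995.
sample = {"CrimeID": "C001", "Age": 35, "FirstCrimeYear": 1995}
result = apply_rules(sample)
```

For this sample the duration is 8 years, which gives an age of 27 at the first crime.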

Graphs and Charts:

The following charts are obtained.
  • Crime link to weapon.
  • Location link to Crime.
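The data behind such link charts amounts to counting how often pairs of attribute values occur together. The sketch below assumes records shaped like the attribute list given earlier; the records themselves and the `link_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical records using the Location, CrimeType, and WeaponUsed attributes.
records = [
    {"Location": "Station A", "CrimeType": "Theft", "WeaponUsed": "None"},
    {"Location": "Station A", "CrimeType": "Robbery", "WeaponUsed": "Knife"},
    {"Location": "Station B", "CrimeType": "Robbery", "WeaponUsed": "Knife"},
    {"Location": "Station B", "CrimeType": "Robbery", "WeaponUsed": "Firearm"},
]

def link_counts(records, left, right):
    """Count co-occurrences of the values of two attributes across records."""
    return Counter((r[left], r[right]) for r in records)

# Crime link to weapon.
crime_weapon = link_counts(records, "CrimeType", "WeaponUsed")

# Location link to crime.
location_crime = link_counts(records, "Location", "CrimeType")

# Crimes per police station limit, the quantity a choropleth map would shade.
crimes_per_station = Counter(r["Location"] for r in records)
```

Each count can then be drawn as the weight of a link between the two values.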