SPATIAL AND COLLATERAL DATA MINING FOR CRIME DETECTION AND ANALYSIS

KIRANMAI CHERUKURI
VNRVJ INSTITUTE OF ENGG. AND TECHNOLOGY, HYDERABAD, INDIA.

MURALIKRISHNA IV
CENTRE FOR SPATIAL INFORMATION TECHNOLOGY, JNTU, HYDERABAD, INDIA.

VENUGOPALA REDDY A
OSMANIA UNIVERSITY COLLEGE OF ENGINEERING, HYDERABAD, INDIA

INTRODUCTION
Data mining techniques can help in the discovery and exploitation of knowledge, which can aid many aspects of knowledge management. Information on knowledge falls into three categories:
DATA AGGREGATION
The term “aggregation” is used here as a neutral word for the mathematical term “set”. The mathematical methods developed for analyzing these coherent aggregations are referred to under the general term “cluster analysis.” Most modern techniques are extensions of methods developed as part of mathematical statistics, though some have arisen out of data management techniques developed by computer scientists.

Parametric statistics, such as the mean and standard deviation, can also be used to describe clusters. Nonparametric statistics use more esoteric terms and methods, and are particularly useful when the number of data points in the population is small. For both types of techniques, the analyst builds a model of the data. The model is usually mathematical, and might be represented as a graph, table, or drawing. A model is an explanation of the data. If the model is good, it might be capable of supporting predictions about future data points.

Cluster analysis can provide a foundation for predictive modeling. For example, to develop a model that identifies persons likely to commit major crimes, common indicators or information patterns would be sought among separate clusters of fraudulent and petty criminals. Once these aggregations are located and formally delineated, data on prospective criminals can be examined to identify the cluster in which they are likely to hold “membership.”

The data source, from the Police Department of Hyderabad, is an investigation report by police officers about criminal offences; it contains descriptive information on crime type, time of incident, type of weapon and details of the incident. The key terms extracted include terms referring to the age, gender, and physical description of the suspect, the time and location of the incidents, and the type of injury or death resulting from the incidents. PolyAnalyst software provides a user-defined interface to build the application.
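The cluster description and “membership” steps discussed in this section can be illustrated with a minimal sketch. It assumes two already-delineated clusters, describes each by its per-attribute mean and standard deviation, and assigns a new record to the cluster with the nearest centroid. The attributes (age at first crime, number of prior offences), the cluster names and all values are hypothetical, and nearest-centroid assignment is only one simple choice; the paper does not state which method PolyAnalyst uses.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical records: (age at first crime, number of prior offences).
clusters = {
    "petty": [(19, 1), (22, 2), (20, 1), (24, 3)],
    "fraud": [(35, 6), (41, 8), (38, 5), (44, 9)],
}

def describe(points):
    """Return a (mean, standard deviation) pair for each attribute of a cluster."""
    columns = list(zip(*points))
    return [(mean(column), pstdev(column)) for column in columns]

def assign(point, clusters):
    """Return the name of the cluster whose centroid is nearest to the point."""
    centroids = {
        name: tuple(m for m, _ in describe(points))
        for name, points in clusters.items()
    }
    return min(centroids, key=lambda name: dist(point, centroids[name]))

summary = {name: describe(points) for name, points in clusters.items()}
prospect = (37, 7)  # hypothetical new record
label = assign(prospect, clusters)

print(summary)
print(prospect, "->", label)
```

With real data the attributes would normally be scaled before distances are compared, since an attribute with a large numeric range otherwise dominates the assignment.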
Data mining techniques are used to generate reports, graphs and charts of historical data.

Spatial data:
DataSet: The World dataset gives the structure of the data. Make sure to designate that Procedure codes should be imported as categories rather than numbers.

RULES: The following rules have been formulated:

This rule acts like an attribute in the World dataset for identifying the spatial pattern (location or name of police station limit) of crime.

Duration_of_Crime: 2003 - year( firstCrime )
This rule acts like an attribute in the World dataset for calculating how long an individual has been involved in crime.

Age_of_firstcrime: Age - Duration_of_Crime
This rule acts like an attribute in the World dataset for calculating the age of the person when the first crime was committed.

Graphs and Charts: The following charts are obtained.
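The two derived-attribute rules given above, Duration_of_Crime and Age_of_firstcrime, can be sketched in Python as follows. This is only an illustration of the arithmetic, not PolyAnalyst's rule syntax; the record layout and the sample values are hypothetical, and the reference year 2003 is taken from the rule as stated in the paper.

```python
from datetime import date

REFERENCE_YEAR = 2003  # year used in the paper's Duration_of_Crime rule

def duration_of_crime(first_crime: date, reference_year: int = REFERENCE_YEAR) -> int:
    """Duration_of_Crime: reference year minus the year of the first crime."""
    return reference_year - first_crime.year

def age_of_first_crime(age: int, first_crime: date) -> int:
    """Age_of_firstcrime: current age minus Duration_of_Crime."""
    return age - duration_of_crime(first_crime)

# Hypothetical record; the field names mirror the attributes named in the rules.
record = {"Age": 34, "firstCrime": date(1995, 6, 12)}
record["Duration_of_Crime"] = duration_of_crime(record["firstCrime"])
record["Age_of_firstcrime"] = age_of_first_crime(record["Age"], record["firstCrime"])

print(record)
```

For this sample record the duration is 8 years and the age at first crime is 26.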
Link analysis is a powerful and important technique for discovering information in large, complex data sets. Link analysis is a data mining strategy for identifying “events” which occur together. An “event” in this sense can be any crime. The goal of link analysis is usually to find common indicators of an event so that the corresponding opportunity can be exploited. There are three general types of link analysis that arise frequently in the majority of applications:

Associations are groups of “events” that regularly occur together. For example, if the goal is to determine ways to detect and limit crimes related to burglaries, it might be important to discover the type of houses that are targeted and their location in the city, whether downtown or on the outskirts. Associations are the simplest link relationships, and the easiest to discover. They are often used to suggest hypotheses for data mining.

Sequential patterns that occur reliably can be used to formulate heuristics (rules of thumb): “we should look for crime C among persons who are involved in A and B.”

Stratification divides the crime locations, which have a spatial attribute, and the criminals, which are a non-spatial attribute, into “strata” for some analytic purpose, in this case to answer a question. Stratification can be used to perform link analysis by retrieving records from the spatial data store using the hypothesized links as the retrieval keys. For example, suppose that an analyst suspects that certain high-level crimes are more likely to happen in a particular locality under a given police station limit. This hypothesis may be tested by retrieving lists from the data store.
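The stratification example above can be sketched as a simple retrieval: records are grouped by police station limit (the hypothesized link used as the retrieval key), and the share of high-level crimes in the suspected stratum is compared with the overall share. The station names, crime types and the "high-level" flag are hypothetical sample data, and comparing against the overall share is just one plain way to check the hypothesis, not a method taken from the paper.

```python
from collections import defaultdict

# Hypothetical records: (police station limit, crime type, is high-level crime).
records = [
    ("Station A", "murder", True),
    ("Station A", "burglary", False),
    ("Station A", "armed robbery", True),
    ("Station B", "theft", False),
    ("Station B", "burglary", False),
    ("Station B", "murder", True),
    ("Station C", "theft", False),
    ("Station C", "theft", False),
]

def stratify(records):
    """Group records into strata keyed by police station limit."""
    strata = defaultdict(list)
    for station, crime, high_level in records:
        strata[station].append((crime, high_level))
    return dict(strata)

def high_level_share(stratum):
    """Fraction of records in one stratum that are flagged as high-level."""
    return sum(1 for _, high_level in stratum if high_level) / len(stratum)

strata = stratify(records)
shares = {station: high_level_share(rows) for station, rows in strata.items()}
overall = sum(1 for _, _, high_level in records if high_level) / len(records)

suspected = "Station A"  # the locality named in the analyst's hypothesis
supported = shares[suspected] > overall

print(shares)
print("overall share:", overall, "hypothesis supported:", supported)
```

On a real data store the same grouping would typically be a query keyed on the station attribute, and a statistical test would be needed before drawing conclusions from small strata.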
Fig. 1. Crime link weapon

Fig. 2. Spatial data analysis - Location link Crime

Reports are the main asset of this project. They give valuable information regarding our goal. Summary Statistics, PolyNet Predictor, nearest neighbor, decision tree and link analysis are the tools used for preparing the reports.

Summary Statistics and regression: The Summary Statistics exploration engine provides basic statistics about your data, including means, standard deviations, and frequencies. In addition, the Summary Statistics report includes frequency charts for each category, string, and yes/no variable.

Fig. 3. Frequencies chart giving statistics of weapons used.

The Predicted and Real vs. Counter graph allows you to see how closely the PolyAnalyst prediction follows the actual value of the attribute over the range of the dataset.

Fig. 4. Decision Tree algorithm

The decision tree algorithm helps solve the task of classifying cases into multiple categories. Here the target attribute is CrimeType, and using this target attribute the decision tree algorithm categorizes the dataset into six sub-datasets. Here the decision tree found a dependence on crimes related to Gun, Computer, Knife, Rifle and Phone. Link Analysis clearly gives the spatial distribution of the types of crimes that are occurring.
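As a rough illustration of how the strength and sign of a link between two attribute values might be scored, the sketch below computes the phi coefficient (the correlation between two yes/no indicators) for a weapon value and a police station limit. The paper does not say how PolyAnalyst computes its link weights, so the phi coefficient is an assumption chosen for simplicity, and the incident records and station names are hypothetical.

```python
from math import sqrt

# Hypothetical incident records: (weapon, police station limit).
incidents = [
    ("Knife", "Station A"), ("Knife", "Station A"), ("Knife", "Station A"),
    ("Gun", "Station A"),
    ("Gun", "Station B"), ("Gun", "Station B"), ("Gun", "Station B"),
    ("Knife", "Station B"),
]

def phi(records, weapon, station):
    """Phi coefficient between 'weapon == weapon' and 'station == station'."""
    n11 = n10 = n01 = n00 = 0
    for w, s in records:
        has_weapon, at_station = (w == weapon), (s == station)
        if has_weapon and at_station:
            n11 += 1
        elif has_weapon:
            n10 += 1
        elif at_station:
            n01 += 1
        else:
            n00 += 1
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        return 0.0  # one of the values never (or always) occurs: no measurable link
    return (n11 * n00 - n10 * n01) / denominator

knife_link = phi(incidents, "Knife", "Station A")
gun_link = phi(incidents, "Gun", "Station A")

print("Knife / Station A:", knife_link)
print("Gun / Station A:", gun_link)
```

A positive score means the two values occur together more often than independence would suggest, and a negative score means they occur together less often.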
The central object of the Link Analysis report is the spatial linking and understanding. It displays the positive and negative correlations (links) found between attribute values (nodes) as a cyclical graph with directed links, and the diagram allocates appropriate correlation weights to the links. Red lines indicate positive correlations between values of attributes, while blue lines indicate negative correlations. The color intensity and weight of each line visually represent the strength of the association: thicker and darker lines have higher correlations.

Conclusions: The performed analysis serves as an illustration of some crime detection techniques and some charts for obtaining valuable results. These results should help the investigating officers of our Police Department identify hidden patterns without needing to know the local demographics and behavioral patterns. Only the spatial distribution of police station limits is needed, and for this the tools of GIS are extremely beneficial. Moreover, this analysis presents a much better overall picture of the incidents, as it deals with both the structured and textual portions of the database. Integrating the tools of data mining, GIS and database management systems leads to spatial data mining. Though the tools are not based on well-defined algorithms at present and are a bit fuzzy, continued efforts and research on spatial data mining would certainly lead to improved management by police officers in the course of time. Integrating spatial data with collateral data and understanding the hidden patterns of past crimes through data mining helps in achieving the following:
© GISdevelopment.net. All rights reserved.