Crime Analytics: Exploring Analysis of Crimes Through R Programming Language
Abstract - The complexity of geographical crime patterns creates challenges for many law enforcement agencies as well as for the
community. Crime analytics is an intelligent form of crime analysis designed to aggregate crime-related data. We use different
machine learning algorithms to collect data sets. Our focus in this study is to design a crime analysis system that identifies trends
and patterns. This study provides solutions that reveal valuable records which can be used effectively for analyzing and recording
information. The R programming language offers a new way to connect to enormous volumes of police crime data, and R
streamlines the processing and interpretation of crime analysis. The study used an easily understood histogram to show crime rates
in every district. R explores predictions related to crime patterns, which depict the crimes committed. It is a tool that can help law
enforcement set up various stages of development to check potential problems before they become disastrous. It should be
emphasized that R can help in the analysis of data concerning small units or districts.

I. INTRODUCTION

A crime-free nation is an important factor in fostering investment and economic growth. Records show that common crimes averaged 5.6% (SWS 2017). This is 2 points below the previous property crime rate of 7.6% recorded in 2016 (SWS 2017). The data show a steady drop in the rate. A total of 227,757 crimes were recorded from January to June, lower by 47,945 compared to 275,702 crimes (Albayalde 2017).

The evolution of technology leads to digital information. This advancement drives both the interest in and the capacity for keeping the community safe. This massive volume of data helps us to fight crime. It makes it possible to predict the behavior or patterns of crime and to classify threats in the community in order to prevent future attacks or terrorizing.

Studies by Thomas Rich and by Cynthia Mamalian and Nancy LaVigne of the National Institute of Justice have found that the majority of departments using crime mapping are creating automated pin maps and generating hotspot maps. Along a range of difficulty, pin maps and hot spot analysis are at the low end. For more advanced analysis, some knowledge of cartography and geographic information systems is required. The complexity of these latest innovations may well prevent their adoption by all but the most technologically advanced departments.

Data mining denotes knowledge discovery from data and the analysis of patterns to turn them into useful information. It is a tool that can be used for crime detection. The incorporation of space and time enhances the detection of crime patterns; these patterns give situational awareness to security agents and help prevent potentially problematic areas (Neil and Gorr, 2017). Crime patterns are very useful for locating where a crime is likely to happen. Historical datasets are analyzed statistically with the Apriori algorithm, which produces interesting frequent patterns of criminal hotspots. The decision tree classifier and the Naïve Bayesian classifier are used to predict potential types of crimes. The paper brings together the analysis by combining demographic information with crime datasets to capture the factors that affect the neighborhood. Those findings can be used to raise people's awareness of dangerous locations or to help agencies predict future crimes in specific locations.

K-means clustering and geographical information systems (GIS) are also used to detect crime hotspots, through the use of data mining tools to create a structure. The unique feature of this study is the application of factors. The mapping of the information serves as the foundation of the crime parameters. The analysis of the factor application is used to uncover the hidden structure of the data that is present yearly. This procedure can help to predict where crime is located. With such data, preemptive measures may be applied to prevent the next crime. Detecting crime patterns is challenging. Data mining is one of the evolving fields that can work with large volumes of data. The crime intensity of a location on a real-time basis (Bolla, 2014) should provide insights for aggregating knowledge about it.

Data mining in the crime literature has increased rapidly over the past years, and it has become necessary to improve understanding of it in order to lessen the crime rate. This systematic review focuses on the techniques and technologies used in previous studies of data mining in crime. These are classified into different categories and presented using visualizations. Crime data mining research also indicates the challenges related to it.

The main purpose of this study is to identify the patterns and behavior of existing crime accurately with the use of R programming. This will determine areas of opportunity for improvement in the algorithms and procedures used. It can also help us determine the best-fit process to be used for further study.

II. EXISTING STUDIES

This section examines the behavior of the algorithms used in the literature presented, in order to show the strengths and weaknesses of the studies to which they are applied. The following are the existing works:

Clustering and classification data mining algorithms are used to detect criminals and crime spots. Data stored in databases are used to classify the activities of criminals. The identification of criminals is based on crime spots and witnesses. These criminal “hotspots” will help law enforcers to avoid or reduce criminal activity [Sukanya, 2012].

Spatial analysis relates to space; this method helps us to identify the patterns and the threshold. These patterns predict the potential crime spots. The assessment of the

Agarwal et al. used the RapidMiner tool for analyzing crime rates and anticipating crime rates using different data mining techniques. Their work performs crime analysis using the K-Means clustering algorithm. The main objective of their crime analysis is to extract crime patterns, predict crime based on the spatial distribution of existing data, and detect crime. Their analysis includes tracking homicide crime rates from one year to the next (Jyoti Agarwal, Renuka Nagpal and Rajni Sehgal, 2013).

Most of the literature introduces a method to map out crime in various locations. It provides a database, crime parameters, and the structure of the correlations among the defining variables.

III. ALGORITHMS USED AND THEIR ANALYSIS

Table I shows the systematic review of the literature: the algorithms used by the studies presented, their strengths and weaknesses, and their findings.

Study | Algorithm | Objective | Findings
ToppiReddy, H. K. R., Saini, B., & Mahajan, G. (2018). Crime Prediction & Monitoring Framework Based on Spatial Analysis. Procedia Computer Science, 132, 696-705. | K-Nearest Neighbor Algorithm; Naïve Bayes | To develop an accurate real-time crime prediction to reduce crime rates in the community, as crime occurrences depend on many complex factors. | Helps crime analysts to analyze crime networks by means of various interactive visualizations.
Prabakaran, S., & Mitra, S. (2018, April). Survey of Analysis of Crime Detection Techniques Using Data Mining and Machine Learning. In Journal of Physics: Conference Series (Vol. 1000, No. 1, p. 012046). IOP Publishing. | Naïve Bayesian; K-Mean Clustering | To process crime characteristics to help society toward better living. | These techniques are used when the dimensionality is high, and the k-mean algorithm is fast, robust and gives easy-to-understand results.

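To make the K-Means entry in the table concrete, the following is a minimal sketch in base R. It is not taken from the reviewed studies: the incident coordinates are synthetic, and the choice of two clusters is an assumption made only for illustration.

```r
# Synthetic incident coordinates (invented for illustration only):
# two loose groups of 30 incidents each.
set.seed(42)
incidents <- data.frame(
  x = c(rnorm(30, mean = 0, sd = 0.5), rnorm(30, mean = 5, sd = 0.5)),
  y = c(rnorm(30, mean = 0, sd = 0.5), rnorm(30, mean = 5, sd = 0.5))
)

# Partition the incidents into k = 2 groups. nstart reruns the algorithm
# from several random starts and keeps the best solution found.
fit <- kmeans(incidents, centers = 2, nstart = 10)

# Cluster centers approximate candidate hotspot locations.
print(fit$centers)

# Number of incidents assigned to each cluster.
print(table(fit$cluster))
```

With real data, the number of clusters would have to be chosen from the data rather than fixed in advance.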
different sections, such as data mining, Extraction and Classification, Pattern Identification, Prediction and Visualization.

Extraction and Classification are used to get the data source; it is recommended to use a classifier as the statistical method. The algorithm will classify the data. Using this step, we can create a model that fits the training data. In the context of classification, the model estimates the probability of a given data point falling into a certain class. This classifier method finds a way to predict by estimating the individual probability of each feature. In this section, the data will be extracted to get a summary or description, and to identify the fields involved and the levels of the data sets.

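The idea of estimating the probability that a record falls into a class from per-feature probabilities can be sketched with a small categorical Naive Bayes written in base R. This is an illustrative sketch only, not the study's implementation: the training records, the column names (district, period, crime) and the helper function naive_bayes_posterior are all hypothetical, and add-one smoothing is an assumption.

```r
# Hypothetical training records (invented for illustration).
train <- data.frame(
  district = c("north", "north", "south", "south", "south", "north"),
  period   = c("night", "day", "night", "night", "day", "night"),
  crime    = c("burglary", "theft", "burglary", "burglary", "theft", "theft"),
  stringsAsFactors = TRUE
)

# Posterior probability of each class for one new record, using
# categorical Naive Bayes with add-one (Laplace) smoothing.
naive_bayes_posterior <- function(train, target, new_record) {
  classes  <- levels(train[[target]])
  features <- setdiff(names(train), target)
  scores <- sapply(classes, function(cl) {
    rows  <- train[train[[target]] == cl, , drop = FALSE]
    prior <- nrow(rows) / nrow(train)
    # Smoothed probability of each feature value given the class.
    likelihoods <- sapply(features, function(f) {
      matches <- sum(rows[[f]] == new_record[[f]])
      (matches + 1) / (nrow(rows) + nlevels(train[[f]]))
    })
    prior * prod(likelihoods)
  })
  # Normalize so the class probabilities sum to 1.
  scores / sum(scores)
}

posterior <- naive_bayes_posterior(
  train, target = "crime",
  new_record = list(district = "south", period = "night")
)
print(round(posterior, 3))
```

For this toy data the posterior favors burglary for a night-time record in the south district. A maintained package would normally replace the hand-written function in practice.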
In Figure 1, the researcher explores some functions of R, such as listing the variables. The figure shows that we have 22 variables in the data file. The variable names are converted to lower case, since R code is predominantly written in lower case.

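The inspection step described for Figure 1 can be sketched in base R as follows. The data frame below is a hypothetical three-column stand-in for the actual data file, which has 22 variables; the column names are invented for illustration.

```r
# Hypothetical stand-in for the crime data file (the real file has 22 variables).
crimes <- data.frame(
  Incident_ID = 1:4,
  DISTRICT    = c("A", "B", "A", "C"),
  Crime_Type  = c("Theft", "Burglary", "Theft", "Assault"),
  stringsAsFactors = TRUE
)

# Convert all variable names to lower case for consistent references.
names(crimes) <- tolower(names(crimes))

# Inspect the structure: number of variables, their types, factor levels.
str(crimes)
print(ncol(crimes))             # number of variables
print(levels(crimes$district))  # distinct values of a categorical field
print(summary(crimes))          # per-column summary
```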
After determining the pattern, a new set of rules will arise; however, if the pattern is the same, we can predict that another crime will occur.

In this section, the law enforcer takes the opportunity to prevent it by providing the necessary safety measures. In this case, the police or any analyst treats this as the detection of crime. The mounting information that technology makes available for processing has enabled law enforcement agencies to aggregate data on various crimes. Investigators in law enforcement agencies apply classification techniques to these data to form decision-aid tools that facilitate investigations.

The implementation and analysis method of prediction can be done using a decision tree. A decision tree is analogous to a graph whose nodes represent attributes. The researcher proposes a decision tree-based classification model to predict possible crimes. Law enforcement agencies use this model to determine crime patterns and predict future trends [Nasridinov, Park 2013].

The Visualization section in Figure 4 shows a graphical representation of crime trends. It is an art to transform data into useful information. It shows the occurrences by the dates when the crimes happened, which makes it easier for the viewer to interpret the information. Below is the graphical representation of Total Crime by District.

The heat map displays the probable regions where crimes can be prevented by taking preventive measures. The x-axis plots the district while the y-axis plots the crime rate. The heat map shows the regions which have a high volume of crimes. The data is based on historical records. Deering District has a high crime rate.

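A chart of total crime by district can be sketched in base R as shown below. This sketch uses a plain bar chart rather than the heat map of the study, and the records are invented: only the Deering district name comes from the text, while "District B" and "District C" are hypothetical placeholders.

```r
# Hypothetical incident records (invented; only the district column matters here).
crimes <- data.frame(
  district = c("Deering", "Deering", "Deering",
               "District B", "District B", "District C")
)

# Count incidents per district and sort from highest to lowest.
totals <- sort(table(crimes$district), decreasing = TRUE)

# Draw the bar chart: districts on the x-axis, incident counts on the y-axis.
# pdf(NULL) discards the drawing so the script also runs without a display;
# remove the pdf(NULL) and dev.off() lines to view the plot interactively.
pdf(NULL)
barplot(totals, main = "Total Crime by District",
        xlab = "District", ylab = "Number of crimes")
invisible(dev.off())

print(totals)
```

Sorting the counts before plotting puts the district with the highest volume first, which makes the busiest districts easy to spot.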
rate as well as the safety and security of the community and its people.

The clustering technique is used to group similar types of crimes together; based on the clustering results, the hotspots of burglary-type crime will be identified. This result will help to reduce burglary. In the future, the hotspots of all types of crime will be identified, and through this, crime activities will be reduced.

REFERENCES

[1] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, Neural Information Processing Systems, pp. 1106–1114, 2012.
[2] Cortes, C, “Support-vector networks”. Machine Learning. 20 (3): 273–297, 1995.
[3] Carullo A, Parvis M. An Ultrasonic Sensor for Distance Measurement in Automotive Applications. In: IEEE Sensors J. 1(2):143p.
[4] Singh, S., et al. (2017). Waste Segregation System Using Artificial Neural Networks. Helix, the Scientific Explorer. Helix Vol. 7(5): DOI 10.29042/2017-2053-2058
[5] Ranada, P. (2015). Why PH is world’s 3rd biggest dumper of plastics in the ocean. RapplerBlog. Retrieved October 6, 2015. From https://www.rappler.com/sciencenature/environment/108276-philippines-plastic-pollution-ocean-conservancystudy
[6] Badilla, N. (2017). 45 percent of Metro’s garbage not properly disposed. Special Report. Retrieved December 27, 2017. From https://www.manilatimes.net/45-percent metros-garbage-not-properly-disposed/370791
[7] GMA News Online. (2018). PHL 1 of 5 countries that produce half of world's plastic waste - UN report. Retrieved June 5, 2018. From http://www.gmanetwork.com/news/lifestyle/healthandwellness/655744/phl-1of-5-countries-that-produce-half-of-world-s-plastic-waste-un-report/story
[8] ToppiReddy, H. K. R., Saini, B., & Mahajan, G. (2018). Crime Prediction & Monitoring Framework Based on Spatial Analysis. Procedia Computer Science, 132, 696-705.
[9] Prabakaran, S., & Mitra, S. (2018, April). Survey of Analysis of Crime Detection Techniques Using Data Mining and Machine Learning. In Journal of Physics: Conference Series (Vol. 1000, No. 1, p. 012046). IOP Publishing.
[10] Tayal, D. K., Jain, A., Arora, S., Agarwal, S., Gupta, T., & Tyagi, N. (2015). Crime detection and criminal identification in India using data mining techniques. AI & Society, 30(1), 117-127.
[11] Sukanya, M., Kalaikumaran, T., and Karthik, S. Criminals and crime hotspot detection using data mining algorithms: clustering and classification. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 1, Issue 10, December 2012.