PILAR M. FANDINO et al: CRIME ANALYTICS: EXPLORING ANALYSIS OF CRIMES THROUGH R . .

Crime Analytics: Exploring Analysis of Crimes through R Programming Language

Pilar M. Fandino, Jose B. Tan Jr.

Technological Institute of the Philippines, Manila, Philippines.


Email: [email protected]; [email protected]

Abstract - The complexity of geographical crime patterns creates challenges for many law enforcement agencies as well as for the community. Crime analytics is an intelligent form of crime analysis designed to aggregate crime-related data. We apply different machine learning algorithms to the collected data sets. Our focus in this study is to design a crime analysis system that identifies trends and patterns. The study provides solutions that reveal valuable records which can be used effectively for analyzing and recording information. Applying the R programming language gives a new way to connect to enormous volumes of police crime data, and R streamlines the processing and interpretation of crime analysis. The study uses an easily understood histogram to show crime rates in every district. R supports predictions related to crime patterns, which depict the crimes committed. It is a tool that can help law enforcement stage its efforts and check potential problems before they become disastrous. It should be emphasized that R can help in the analysis of data concerning small units or districts.

Keywords - crime analytics, data mining, aggregate, machine learning

DOI 10.5013/IJSSST.a.20.S2.29 29.1 ISSN: 1473-804x online, 1473-8031 print

I. INTRODUCTION

A crime-free nation is an important factor in fostering investment and economic growth. A record shows that common crimes averaged 5.6% (SWS 2017). This is 2 points below the previous figure of 7.6% in the 2016 record of the property crime rate (SWS 2017). The data show a steady drop in the rate. A total of 227,757 crimes were recorded from January to June, lower by 47,945 compared to 275,702 crimes (Albayalde 2017).

The evolution of technology leads to digital information. This advancement drives both the interest in and the capacity for keeping the community safe. The massive volume of data helps us to fight crime. It makes it possible to predict the behavior or patterns of crime and to classify threats in the community in order to prevent future attacks or acts of terror.

Studies by Thomas Rich and by Cynthia Mamalian and Nancy LaVigne of the National Institute of Justice have found that the majority of departments using crime mapping are creating automated pin maps and generating hotspot maps. Along a range of difficulty, pin maps and hotspot analysis are at the low end. For more advanced analysis, some knowledge of cartography and geographic information systems is required. The complexity of these latest innovations may well prevent their adoption by all but the most technologically advanced departments.

Data mining denotes knowledge discovery in data and the analysis of patterns into useful information. It is a tool that can be used for crime detection. The inclusion of space and time enhances the detection of crime patterns; these patterns give security agents situational awareness and help keep potential problem areas from developing (Neil and Gorr, 2017). Crime patterns are very useful for locating where a crime is likely to happen. Using historical data, preemptive measures may be applied to prevent the next crime. Detecting crime patterns is challenging. Data mining is one of the evolving fields that can work with large volumes of data. The crime intensity of a location, assessed on a real-time basis (Bolla, 2014), should yield insights that aggregate knowledge about it.

The literature on data mining in crime has grown rapidly in the past years, and improving on it has become necessary to lessen the crime rate. This systematic review focuses on the techniques and technologies used in previous studies of data mining in crime. The studies are classified into different categories and presented using visualizations. Crime data mining research also indicates the challenges related to it.

The main purpose of this study is to identify the patterns and behavior of existing crime accurately with the use of R programming. This will determine areas of opportunity for improvement in the algorithms and procedures used. It can also help us determine the best-fit process to be used for further study.

II. EXISTING STUDIES

This section aims to determine and understand the behavior of the algorithms used in the literature presented. It shows the strengths and weaknesses of each study. The following are the existing works:

One study uses clustering and classification data mining algorithms to detect criminals and crime spots. Data stored in databases are used to classify the activities of criminals. The identification of criminals is based on crime spots and witnesses. These criminal "hotspots" will help law enforcers to avoid or lessen criminal activity [Sukanya, 2012].

Spatial analysis relates to space; this method helps us to identify patterns and thresholds. These patterns predict potential crime spots. The datasets are assessed through statistical analysis with the Apriori algorithm, which produces interesting frequent patterns of criminal hotspots. The decision tree classifier and the Naïve Bayesian classifier are used to predict potential types of crimes. The paper combines crime datasets with demographic information to capture the factors that affect the neighborhood. The findings can be used to raise people's awareness of dangerous locations or to help agencies predict future crimes in specific locations.

K-means clustering and a geographical information system (GIS) are also used as data mining tools to detect crime hotspots and to create a structure. The unique feature of this study is the application of factor analysis. The mapping of the information serves as the foundation of the crime parameters. Factor analysis is used to uncover the hidden structure present in the data for each year. This procedure can help to predict where crime is located.

Agarwal et al. used the RapidMiner tool to analyze crime rates and to anticipate the crime rate using different data mining techniques. Their crime analysis uses the K-Means clustering algorithm. The main objective of their analysis is to extract crime patterns, predict crime based on the spatial distribution of existing data, and detect crime. Their analysis includes tracking homicide rates from one year to the next (Jyoti Agarwal, Renuka Nagpal and Rajni Schgal, 2013).

Most of the literature introduces a method to map out crime in various locations. It provides a database, crime parameters, and the structure of the correlations among the defining variables.

III. ALGORITHMS USED AND THEIR ANALYSIS

Table I shows the systematic review of the literature: the algorithms used by the studies presented, their strengths and weaknesses, and their findings.

TABLE I. SYSTEMATIC REVIEW OF LITERATURE AND ALGORITHM

Columns: Authors; Algorithm/Architecture/Framework Used; Strength; Weaknesses; Findings

1. Authors: Detecting Hot Spots on Crime Using Data Mining and Geographical Information System
   Algorithm/Architecture/Framework Used: K-Mean Clustering; Factor Analysis
   Strength: The unique feature of this study is the application of factor, k-mean clustering and Geographical Information System (GIS) analysis as data mining tools to develop the hidden structure present in the data for each year.
   Weaknesses: Factor analysis is extended with the techniques of Varimax/Quartimax criterion for orthogonal rotation. Even though the results obtained by both the criterions were very similar, the Varimax rotation provided relatively better clustering of crime data. Consequently, only the results of Varimax rotation are reported here. We have decided to retain 76
   Findings: It shows that with only two of the seven HIMs it is possible to have a position independent object classification algorithm that allows differentiating and classifying

2. Authors: Toppi Reddy, H. K. R., Saini, B., & Mahajan, G. (2018). Crime Prediction & Monitoring Framework Based on Spatial Analysis. Procedia Computer Science, 132, 696-705.
   Algorithm/Architecture/Framework Used: K-Nearest Neighbor Algorithm; Naïve Bayes
   Strength: To develop an accurate real-time crime prediction to reduce crime rates in the community as crime occurrences depends on many complex factors.
   Findings: Helps the crime analysts to analyze crime networks by means of various interactive visualization.

3. Authors: Prabakaran, S., & Mitra, S. (2018, April). Survey of Analysis of Crime Detection Techniques Using Data Mining and Machine Learning. In Journal of Physics: Conference Series (Vol. 1000, No. 1, p. 012046). IOP Publishing.
   Algorithm/Architecture/Framework Used: Naïve Bayesian; K-Mean Clustering
   Strength: To process crime characteristics to help the society for a better living.
   Findings: These techniques are used when the dimensionality is high and k-mean algorithm is fast, robust and gives easy to understand result.
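
Several of the reviewed studies rely on k-means clustering to locate hotspots. As a rough illustration of the idea only, the following Python sketch (the study itself works in R) runs plain k-means on a handful of made-up incident coordinates. The function name `kmeans`, the sample points, and the choice of starting centroids are all assumptions for the example and do not come from any of the reviewed papers.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) on coordinate tuples.

    Returns the final centroids and the cluster index of each point.
    """
    centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


# Hypothetical incident coordinates forming two loose groups.
incidents = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(incidents, initial_centroids=[incidents[0], incidents[3]])
print(labels)
print(centroids)
```

Each resulting centroid can be read as the center of one candidate hotspot. In practice the number of clusters and the starting centroids strongly affect the result, so they would need to be chosen with care.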

Table I depicts the systematic review of the literature and shows the strengths, weaknesses, and findings of the different studies. It also indicates where each technique was applied and the conclusion, which further discusses the output of each study. It notes as well some of the disadvantages of the different studies, which helped the proponent identify which algorithm and application should be used and why.

IV. METHODOLOGY USED

This study allows the researcher to explore data mining using R. R is an open-source programming language that can be used to look into the data. For this research, the proponent used crime data from Chicago as the dummy data, applying historical data to make the analysis more manageable. Data mining is a new approach in data collection. The algorithm design is composed of different sections: Data Mining, Extraction and Classification, Pattern Identification, Prediction, and Visualization.

Extraction and Classification are used to get the data source; it is recommended to use a statistical classifier. The algorithm classifies the data. Using this step, we can create a model that fits the training data. In the context of classification, the model estimates the probability of a given data point falling into a certain class. This classifier method predicts the outcome from the individual probability of each feature. In this section, the data are extracted to get a summary or description and to identify the fields involved and the levels of the data sets.
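
The text does not name the classifier used at this step. The following Python sketch (the study itself works in R) assumes a Naive Bayes classifier, one of the methods listed in the review, to show how the probability of a record falling into each class can be built from per-feature probabilities. The records, feature values, and function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies within each class."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict_proba(model, row):
    """Return P(class | row) under the independence assumption, with add-one smoothing."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior probability of the class
        for i, value in enumerate(row):
            # Smoothed probability of this feature value given the class.
            score *= (feature_counts[(i, label)][value] + 1) / (n + len(values[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical training records: (location type, time of day) -> crime type.
rows = [("street", "night"), ("street", "night"), ("residence", "day"),
        ("residence", "night"), ("street", "day"), ("residence", "day")]
labels = ["theft", "theft", "burglary", "burglary", "theft", "burglary"]

model = train_naive_bayes(rows, labels)
probabilities = predict_proba(model, ("street", "night"))
print(probabilities)
```

The add-one smoothing keeps a feature value that never occurred with a class from forcing that class's probability to zero.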

Figure 1. Exploring Variable

In Figure 1, the researcher explores some functions of R, such as listing the variables. The figure tells us that there are 22 variables in the data file. The variable names are converted to lower case, since work in R is predominantly done in lower case.

Figure 2. Size and Structure of data file.

Figure 2 shows that the data file contains 779919 observations and 22 variables. Nine variables are integers: the year, the beat, the district, the ward, the community, the area, the x-coordinate, the y-coordinate, and the ID. The case number, date, block, and the others are character variables.
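
The exploration steps described for Figures 1 and 2 (listing the variables, lower-casing their names, and checking which are integer and which are character) can be sketched as follows. This is a Python illustration only, since the study performs these steps in R; the two sample rows and their values are invented, and the column names merely resemble the fields mentioned in the text.

```python
import csv
import io

# Tiny hypothetical sample; the real data file is far larger.
SAMPLE = """ID,Case Number,Date,Block,Beat,District,Ward,Year
1001,HX100001,01/05/2015 10:30,012XX W MAIN ST,1211,12,27,2015
1002,HX100002,01/06/2015 23:10,034XX N OAK AVE,1922,19,44,2015
"""


def load_rows(text):
    """Parse CSV text and convert every column name to lower case."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name.lower(): value for name, value in row.items()} for row in reader]


def column_types(rows):
    """Label a column 'integer' if every value parses as an int, else 'character'."""
    types = {}
    for name in rows[0]:
        try:
            for row in rows:
                int(row[name])
            types[name] = "integer"
        except ValueError:
            types[name] = "character"
    return types


rows = load_rows(SAMPLE)
types = column_types(rows)
print(len(rows), "observations,", len(types), "variables")
for name, kind in types.items():
    print(f"{name}: {kind}")
```

A summary of this kind is a quick check that identifiers and location codes were read as numbers and that free-text fields such as the block were left as characters.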
Pattern Identification (see Figure 3) is the trend identification section of the study. It identifies the trends and patterns of crime. The Apriori algorithm is used in this section to determine the association rules that highlight the prevailing trends in the data file; the rules correspond to the location attributes, such as the x-coordinate, y-coordinate, ward, and district.
Figure 3. Local data in frame.
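
To make the role of the Apriori algorithm in this section more concrete, here is a minimal Python sketch (the study itself works in R) of a level-wise frequent-itemset search followed by a rule-confidence calculation. The transactions, the item labels such as `district=11`, and the support threshold are hypothetical and are not taken from the study's data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support} for frequent itemsets."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while current:
        # Keep the candidates whose support meets the threshold.
        level = {}
        for candidate in current:
            support = sum(candidate <= t for t in transactions) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Build larger candidates, pruning any with an infrequent subset.
        size += 1
        unions = {a | b for a in level for b in level if len(a | b) == size}
        current = [c for c in unions
                   if all(frozenset(s) in level for s in combinations(c, size - 1))]
    return frequent


def confidence(frequent, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical records, each reduced to a set of attribute labels.
transactions = [
    frozenset({"district=11", "theft", "night"}),
    frozenset({"district=11", "theft", "day"}),
    frozenset({"district=11", "theft", "night"}),
    frozenset({"district=7", "burglary", "night"}),
    frozenset({"district=7", "theft", "day"}),
]

frequent = frequent_itemsets(transactions, min_support=0.4)
rule = confidence(frequent, frozenset({"district=11"}), frozenset({"theft"}))
print(rule)
```

A rule with high confidence, such as a district that is strongly associated with one crime type, is the kind of pattern this section aims to surface.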

After the pattern is determined, a new set of rules is derived; if the rules are the same, we can predict that another crime will occur.

In this section, the law enforcer takes the opportunity to prevent it by providing the necessary safety measures. In this case, the police or any analyst takes this as the detection of crime. The mounting information that technology makes available for processing has enabled law enforcement agencies to aggregate data on various crimes. Investigators in law enforcement agencies apply classification techniques to these data to form decision-aid tools that facilitate investigations.

Prediction can be implemented and analyzed using a decision tree. A decision tree is similar to a graph whose nodes represent the attributes. The researcher proposes a decision tree-based classification model to predict possible crimes. Law enforcement agencies use this model to determine crime patterns and predict future trends [Nasridinov, Park 2013].

The Visualization section, in Figure 4, shows a graphical representation of crime trends. Visualization is the art of transforming data into useful information. It shows the occurrences by the dates when the crimes happened, which makes it easier for the viewer to interpret the information. The figure below shows the graphical representation of total crime by district.

The heat map displays the regions where crimes are probable, so that preventive mechanisms can be put in place. The x-axis plots the district, while the y-axis plots the crime rate. The heat map shows the regions that have a high volume of crimes. The data are based on historical records. Deering District has a high crime rate.

Figure 4. Total Number of Crimes in each District
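
The per-district totals behind a chart like Figure 4 amount to a count of records grouped by district. The Python sketch below (the study itself produces its charts in R) shows that aggregation together with a simple text histogram. The records and district numbers are made up, and the helper names are assumptions for the example.

```python
from collections import Counter


def crimes_per_district(records):
    """Count records per district and return the totals ordered by district."""
    counts = Counter(record["district"] for record in records)
    return dict(sorted(counts.items()))


def text_histogram(counts, width=30):
    """Render one bar per district, scaled so the largest count fills `width`."""
    peak = max(counts.values())
    lines = []
    for district, n in counts.items():
        bar = "#" * round(width * n / peak)
        lines.append(f"{district:>3} | {bar} {n}")
    return lines


# Hypothetical records; only the district field matters here.
records = [{"district": d} for d in (11, 11, 7, 11, 25, 7)]

counts = crimes_per_district(records)
busiest = max(counts, key=counts.get)
for line in text_histogram(counts):
    print(line)
```

The district with the longest bar is the one a reader would pick out first, which is how a single district stands out in the figure.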

V. CONCLUSION

The flexibility of R gives us a new direction for effectively and accurately predicting the crime pattern in a location. Different algorithms can be built on the training set of the data file. Because certain attributes contribute relatively little, owing to the randomness of crimes and the features of the algorithm used, the branches of the decision tree become more rigid and give accurate results only if the test set follows the model pattern. On the other hand, the linear regression algorithm can handle randomness in the test cases, sampling across certain attributes and avoiding errors that lead to errors in prediction. The effectiveness and accuracy of this project can be improved by using machine learning algorithms that can predict intense crimes.

The application of data mining opens up the study of trends in crime and its learning curve. Using this application can be long and tedious because of the large volume of data, but it will help law enforcers get the accurate data they need. This precision can be used to infer and create new knowledge on how to slow down or lessen the crime rate and to improve the safety and security of the community and its people.

Clustering is used to group similar types of crimes together; based on the clustering results, burglary hotspots can be identified. This result will help to reduce burglary. In the future, hotspots for all types of crime will be identified, and through this, criminal activity will be reduced.

REFERENCES

[1] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", Neural Information Processing Systems, pp. 1106–1114, 2012.
[2] Cortes, C, "Support-vector networks". Machine Learning. 20 (3): 273–297, 1995.
[3] Carullo A, Parvis M. An Ultrasonic Sensor for Distance Measurement in Automotive Applications. In: IEEE Sensors J. 1(2):143p.
[4] Singh, S., et al. (2017). Waste Segregation System Using Artificial Neural Networks. Helix, the Scientific Explorer. Helix Vol. 7(5): DOI 10.29042/2017-2053-2058
[5] Ranada, P. (2015). Why PH is world’s 3rd biggest dumper of plastics in the ocean. RapplerBlog. Retrieved October 6, 2015. From https://www.rappler.com/sciencenature/environment/108276-philippines-plastic-pollution-ocean-conservancystudy
[6] Badilla, N. (2017). 45 percent of Metro’s garbage not properly disposed. Special Report. Retrieved December 27, 2017. From https://www.manilatimes.net/45-percent metros-garbage-not-properly-disposed/370791
[7] GMA News Online. (2018). PHL 1 of 5 countries that produce half of world's plastic waste — UN report. Retrieved June 5, 2018. From http://www.gmanetwork.com/news/lifestyle/healthandwellness/655744/phl-1of-5-countries-that-produce-half-of-world-s-plastic-waste-un-report/story
[8] ToppiReddy, H. K. R., Saini, B., & Mahajan, G. (2018). Crime Prediction & Monitoring Framework Based on Spatial Analysis. Procedia Computer Science, 132, 696-705.
[9] Prabakaran, S., & Mitra, S. (2018, April). Survey of Analysis of Crime Detection Techniques Using Data Mining and Machine Learning. In Journal of Physics: Conference Series (Vol. 1000, No. 1, p. 012046). IOP Publishing.
[10] Tayal, D. K., Jain, A., Arora, S., Agarwal, S., Gupta, T., & Tyagi, N. (2015). Crime detection and criminal identification in India using data mining techniques. AI & society, 30(1), 117-127.
[11] Sukanya.M, T.Kalaikumaran and Dr.S.Karthik. Criminals and crime hotspot detection using data mining algorithms: clustering and classification. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) Volume 1, Issue 10, December 2012.
