Crime Analysis and Prediction Using Data

In recent times, crime has become a persistent source of trouble for people and society. An increasing crime rate leads to an imbalance in the fabric of a country. To analyse such criminal activities and respond to them in advance, it is necessary to understand crime patterns. This study presents one such crime pattern analysis, using crime data obtained from Kaggle, an open source, which is in turn used for the prediction of the most recently occurring crimes.

Journal Publication of International Research for Engineering and Management (JOIREM)

Volume: 10 Issue: 04 | MAR-2023

Crime Analysis and Prediction Using Data

1Bhole Abhedya, 2Chaudhari Ankit, 3Nagla Sanket, 4Jadhav Priyanka, 5Prof. Kanchan Dhomse

1Department of Computer Engineering, MET Bhujbal Knowledge City Institute of Engineering, Adgaon

---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – In recent times, crime has become a persistent source of trouble for people and society. An increasing crime rate leads to an imbalance in the fabric of a country. To analyse such criminal activities and respond to them in advance, it is necessary to understand crime patterns. This study presents one such crime pattern analysis, using crime data obtained from Kaggle, an open source, which is in turn used for the prediction of the most recently occurring crimes. The major aim of this project is to estimate which type of crime contributes the most, along with the time period and location where it happened. Machine learning algorithms such as Naïve Bayes are applied in this work to classify the various crime patterns, and the accuracy achieved was comparatively high when compared to prior works.

Key Words: Crime, Analyse, Crime patterns, Kaggle, Estimate, Naïve Bayes, Accuracy

INTRODUCTION: Crime has become a major threat that is considered to be growing in intensity. An action is said to be a crime when it violates the rules, goes against government laws and is highly offensive. Crime pattern analysis requires study of different aspects of criminology and of how patterns are indicated. The Government has to spend a lot of time and work applying technology to govern some of these criminal activities. Hence, machine learning techniques and crime records are required to predict crime types and patterns. Such techniques use existing crime data and predict the crime type and its occurrence based on location and time. Researchers have undertaken many studies that help in analysing crime patterns along with their relations in a specific location. Analysing hotspots has become an easier way of classifying crime patterns, which assists officials in resolving crimes faster. This approach uses a dataset obtained from Kaggle, an open source, based on various factors along with the time and space in which crime occurs over a certain period. We applied a classification algorithm that helps in locating the type of crime and the hotspots of criminal actions that take place at a certain time and day. The proposed work applies machine learning algorithms to find matching criminal patterns, along with their category, from the given temporal and spatial data.

I. Literature Survey
Crimes are of different types and occur at various geographical locations. Many research scholars have suggested mechanisms to analyse the relationship between crime and social variables, including unemployment, income, level of education and so on. Suhong Kim and Param Joshi [1] proposed two different machine learning models for prediction, K nearest [...] Shraddha S. Kavathekar [3] used association rule mining in predicting crimes. Some machine learning algorithms including Deep Neural Network (DNN) and Artificial Neural Network

© 2023, JOIREM |www.joirem.com| Page 1



Fully connected convolution layers have been used in building the prediction model, mainly for multi-labelled data classification. It was implemented using TensorFlow, an API mainly designed for deep learning, with dropout layers. These findings suggest that when there are many missing values, pre-processing is needed, because crimes do not occur uniformly but concentrate in particular areas. An Artificial Neural Network (ANN) solves problems by prognosis based on trend analysis. It comprises an enormous number of processing elements that work together in building a model. Chandy and Abraham [4] proposed a random forest classifier for extracting features for data processing using cloud computing. The extracted features are request number, user identification, expiry time, time of arrival and memory requirement. After feature extraction, the workload is predicted using the trained data obtained from the learning stage, which allows the details of the features extracted from the user's request to be learned. Rohit Patil, Muzamil Kacchi, Pranali Gavali and Komal Pimparia [5] suggest an Apriori algorithm for frequent patterns, and the result obtained from K-means is used. Due to the increase in crime rate over recent years, a system has to handle an enormous amount of data, which takes more time to analyse manually. Hence, advanced machine learning approaches like K-means clustering have been used. A literature survey on spatial and temporal hotspot prediction of crime [6] categorizes and evaluates techniques for detecting the location and time of crime hotspots by performing a Systematic Literature Review (SLR). Fuzhan Nasiri, Zakikhani, Kimiya and Tarek Zayed [7] suggested a failure prediction model that helps in detecting corrosion in gas transmission pipelines. Most prediction models depend entirely on experimental test data or involve only limited historical data records. This leads to ignoring the corrosion from various geographical circumstances. Nikhli Dubey and Setu K. Chaturvedi [8] presented a pertinent analysis of data mining approaches for detecting impending future crime. "A Computational mechanism to classify the crime using machine learning techniques" [9] proposed a flexible computational tool to analyse the crime rate in a country, which helps in classifying cybercrimes. Hyeon-Woo Kang and Hang-Bong Kang [10] suggested a fusion method based on a Deep Neural Network for predicting criminal activities from feature-level data with sufficient parameters.

Existing System:
In prior work, the dataset obtained from the open source is first pre-processed to remove duplicated values and features. A decision tree has been used to find crime patterns and also to extract features from large amounts of data. It provides a primary structure for the further classification process. Features are then extracted from the classified crime patterns using a Deep Neural Network. Based on the prediction, the performance is calculated for both training and test values. Crime prediction helps in forecasting future occurrences of any type of criminal activity and helps officials to resolve them at the earliest.

Drawbacks:
The pre-existing works have low accuracy, since the classifier uses categorical values, which produces an outcome biased towards nominal attributes with more values.

The classification techniques are not suited to regions with inadequate data or to real-valued attributes. The parameter value of the classifier must be tuned, and hence an optimal value needs to be assigned.

Proposed System:

The data obtained is first pre-processed using the machine learning filter and wrapper techniques in order to remove irrelevant and repeated data values. This also reduces the dimensionality, so the data is cleaned. The data then undergoes a splitting process, in which it is divided into test and training data sets, and the model is trained and tested on these sets. This is followed by mapping. The crime type, year, month, time, date and place are mapped to integers to make classification easier. The independent effect between the attributes is analysed initially using Naïve Bayes. Bernoulli Naïve Bayes is used for classifying the independent features extracted. The crime features are labelled, which allows the occurrence of crime at a particular time and location to be analysed. Finally, the crime that occurs the most, along with its spatial and temporal information, is obtained. The performance of the prediction model is determined by calculating the accuracy rate. The prediction model is written in Python and run on Colab, an online environment for data analysis and machine learning models.

Advantages:
1. The proposed algorithm is well suited to crime pattern detection, since most of the featured attributes depend on time and location.
2. It also overcomes the problem of analysing the independent effect of the attributes.
3. Initialization of an optimal value is not required, since it accounts for real-valued and nominal values and also for regions with insufficient information.
4. The accuracy is relatively high when compared to other machine learning prediction models.

Module Description:
1. Data pre-processing
2. Mapping
3. Naïve Bayes classification
4. Crime prediction
5. Evaluation

A. Data Pre-Processing:
Data obtained from the open source must first be pre-processed in order to remove inconsistencies. The dataset chosen is for the city of Denver, with an enormous amount of crime data spanning six years. The machine learning filter and wrapper techniques are applied to find missing values in the specified attributes. Data cleaning plays a vital role in training a prediction model and also in the performance of the subsequent process.

Instances are filtered and irrelevant content is removed from the dataset. The filter method measures the significance of the features; the correlation with the dependent variable is considered in feature selection. The wrapper method measures how useful a feature subset is by actually training a prediction model on it. After pre-processing, the data is split into test and training sets.

B. Mapping:
The crime features, such as the crime type and the date and time on which the crime occurred, are first segregated. Each is then mapped to an integer for easy labelling. The labelled details are further analysed and used in graph plotting. Python is chosen as the programming language for implementing the proposed work, since it is well suited to machine learning. The matplotlib package is imported to plot a graph showing the occurrence of criminal activities. The crime that occurred the most can be plotted in the graph, which contributes to the further prediction process.
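The mapping step can be illustrated with a minimal sketch. It assumes a handful of made-up records (the field names `crime`, `month` and `weekday` and all values are hypothetical, not taken from the Denver dataset) and uses only the Python standard library. Each distinct category is assigned an integer in order of first appearance, and the per-crime counts are the kind of data that would then be handed to a plotting call such as matplotlib's `plt.bar`.

```python
from collections import Counter

def build_mapping(values):
    """Assign each distinct category an integer code in order of first appearance."""
    mapping = {}
    for value in values:
        mapping.setdefault(value, len(mapping))
    return mapping

# Hypothetical records for illustration only (not from the paper's dataset).
records = [
    {"crime": "larceny", "month": "October", "weekday": "Friday"},
    {"crime": "burglary", "month": "October", "weekday": "Sunday"},
    {"crime": "larceny", "month": "June", "weekday": "Friday"},
    {"crime": "assault", "month": "June", "weekday": "Saturday"},
    {"crime": "larceny", "month": "October", "weekday": "Saturday"},
]

# One integer mapping per attribute; every record then becomes a tuple of integers.
fields = ("crime", "month", "weekday")
mappings = {f: build_mapping(r[f] for r in records) for f in fields}
encoded = [tuple(mappings[f][r[f]] for f in fields) for r in records]

# Frequency of each crime type; the most frequent one would be the tallest bar in a plot.
crime_counts = Counter(r["crime"] for r in records)
most_common_crime, frequency = crime_counts.most_common(1)[0]

print(mappings["crime"])             # {'larceny': 0, 'burglary': 1, 'assault': 2}
print(encoded[0])                    # (0, 0, 0)
print(most_common_crime, frequency)  # larceny 3
```

Keeping the mapping dictionaries around makes it possible to translate a predicted integer label back into a readable crime type later.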

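The cleaning, filter and splitting steps of the Data Pre-Processing module can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the rows are invented, mutual information is swapped in as the filter score (the paper speaks only of correlation with the dependent variable), and the wrapper step, which would retrain the model on candidate feature subsets, is left out for brevity.

```python
import math
import random
from collections import Counter

def clean(rows):
    """Drop rows with missing values and exact duplicates, keeping first occurrences."""
    seen, kept = set(), []
    for row in rows:
        key = tuple(row)
        if None in key or "" in key or key in seen:
            continue
        seen.add(key)
        kept.append(row)
    return kept

def mutual_information(xs, ys):
    """Filter score: mutual information (in nats) between a feature column and the label."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle deterministically, then split into training and test sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

# Hypothetical rows: (district, weekday, time of day, crime type).
raw = [
    ("Gateway", "Fri", "afternoon", "larceny"),
    ("Gateway", "Fri", "afternoon", "larceny"),   # exact duplicate, removed
    ("Gateway", "Sat", "night", "burglary"),
    ("Downtown", "Fri", None, "larceny"),         # missing value, removed
    ("Downtown", "Sat", "night", "burglary"),
    ("Downtown", "Mon", "afternoon", "larceny"),
]

rows = clean(raw)
labels = [r[-1] for r in rows]
names = ("district", "weekday", "time_of_day")
scores = {name: mutual_information([r[j] for r in rows], labels)
          for j, name in enumerate(names)}
train, test = split(rows)

print(len(rows))                        # 4
print(round(scores["time_of_day"], 3))  # 0.693
print(round(scores["district"], 3))     # 0.0
```

In this toy data the time of day determines the crime type completely, so it scores highest, while the district carries no information and would be the first candidate for removal.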

Table 1. Dataset Collection

Table 2. Crime Dataset with occurrence date and time


C. Naïve Bayes Classification:
Naïve Bayes is applied because crime prediction is usually concerned with temporal and spatial data. The independent effect among the attribute values is analysed first, since the selected crime attributes have an independent effect upon them. They are used to create a model by training it on crime data relating to robbery, burglary, murder, sexual abuse, armed robbery, chain snatching, gang rape and highway robbery. Some extended variants of Naïve Bayes have been applied.

D. Crime Prediction:
The expected crime type is predicted from the supported crime features. The features are then applied to nominal values. This can be explained clearly by taking a single tuple as an instance.

Considering a tuple:
{Gateway town, 20th October 2020, 2:30 PM, Friday} => {Larceny – a crime involving the theft of an individual's property}
Considering the probable occurrence based on each feature extracted:
1. {Gateway town} => {Theft has occurred}
2. {October} => {Theft has occurred}
3. {2020} => {Theft has occurred}
4. {2:30 PM} => {Theft has occurred}
5. {Friday} => {Theft has occurred}

Evaluation:
The performance of the prediction is then evaluated in order to achieve a high degree of accuracy compared to the pre-existing model. Training is done with cross-validation, which trains the model on different subsets of the training data and evaluates the accuracy over all splits. In Python, to calculate the accuracy we need to pass arguments such as the model name, the target set and CV, which specifies how the data is split. Finally, the mean and the standard deviation of the average precision are calculated. An accuracy of 93.07% has been achieved, which is a great increase compared to existing prediction models.

Conclusion:
In this paper, the difficulty in dealing with nominal distributions and real-valued attributes is overcome by using two classifiers, Multinomial NB and Gaussian NB. Little training time is required, and the approach is well suited to real-time predictions. It also overcomes the problem of working with a continuous target set of variables, which the existing work could not fit. Thus the crime that occurs the most could be predicted and spotted using Naïve Bayesian classification. The performance of the algorithm is also calculated using standard metrics; average precision, recall, F1 score and accuracy are the main metrics in the algorithm evaluation. The accuracy could be increased further by implementing additional machine learning algorithms.

Future Work:
Though it overcomes the problems of the existing work, it has some limitations. When a class label is absent, the estimated probability will be zero. As a future extension of the proposed work, applying more machine learning classification models is expected to increase accuracy in crime prediction and enhance the overall performance. A better study for future improvement could take income information for neighbourhoods into consideration, in order to see whether any relationship exists between the income levels of a neighbourhood and its crime rate.
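As a concrete illustration of the Naïve Bayes Classification, Crime Prediction and Evaluation modules, the following sketch trains a small categorical Naïve Bayes model, predicts the crime type for the Gateway town tuple and scores the model with k-fold cross-validation. Everything here is an assumption made for illustration: the class `CategoricalNaiveBayes`, the helper `cross_val_accuracy`, the training tuples and the add-one (Laplace) smoothing are not taken from the paper, which does not show its code. The smoothing term is included because it avoids the zero-probability problem mentioned under Future Work. A library call of the kind the Evaluation section describes, such as scikit-learn's `cross_val_score(model, X, y, cv=...)`, plays the role of `cross_val_accuracy` here.

```python
import math
import statistics
from collections import Counter, defaultdict

class CategoricalNaiveBayes:
    """Naive Bayes for categorical features with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n_samples = len(y)
        n_features = len(X[0])
        # counts[j][c][v]: number of class-c rows whose j-th feature equals v
        self.counts = [defaultdict(Counter) for _ in range(n_features)]
        self.values = [set() for _ in range(n_features)]
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[j][label][v] += 1
                self.values[j].add(v)
        return self

    def log_scores(self, row):
        """Per class: log P(class) + sum over features of log P(feature value | class)."""
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / self.n_samples)
            for j, v in enumerate(row):
                numerator = self.counts[j][c][v] + self.alpha
                denominator = n_c + self.alpha * len(self.values[j])
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)

def cross_val_accuracy(make_model, X, y, k):
    """Accuracy on each of k folds; fold f holds samples f, f+k, f+2k, ..."""
    scores = []
    for fold in range(k):
        test = set(range(fold, len(X), k))
        train = [i for i in range(len(X)) if i not in test]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(model.predict(X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return scores

# Hypothetical training tuples: (place, month, weekday, time of day) -> crime type.
X = [
    ("Gateway town", "October", "Friday", "afternoon"),
    ("Gateway town", "October", "Friday", "evening"),
    ("Gateway town", "June", "Friday", "afternoon"),
    ("City park", "October", "Sunday", "night"),
    ("City park", "June", "Saturday", "night"),
    ("Gateway town", "June", "Saturday", "night"),
]
y = ["larceny", "larceny", "larceny", "burglary", "burglary", "burglary"]

model = CategoricalNaiveBayes().fit(X, y)
prediction = model.predict(("Gateway town", "October", "Friday", "afternoon"))
fold_scores = cross_val_accuracy(CategoricalNaiveBayes, X, y, k=3)

print(prediction)  # larceny
print(statistics.mean(fold_scores), statistics.stdev(fold_scores))
```

The six toy tuples are cleanly separable, so every fold scores perfectly; that says nothing about accuracy on real crime data and is unrelated to the 93.07% reported above.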

References:

[1] Suhong Kim, Param Joshi, Parminder Singh Kalsi, Pooya Taheri, “Crime Analysis Through Machine Learning”, IEEE Transactions on November 2018.

[2] Benjamin Fredrick David. H and A. Suruliandi, “Survey on Crime Analysis and Prediction using Data mining techniques”, ICTACT Journal on Soft Computing on April 2012.

[3] Shruti S. Gosavi and Shraddha S. Kavathekar, “A Survey on Crime Occurrence Detection and prediction Techniques”, International Journal of Management, Technology And Engineering, Volume 8, Issue XII, December 2018.

[4] Chandy, Abraham, "Smart resource usage prediction using cloud computing for massive data processing systems", Journal of Information Technology 1, no. 02 (2019).

[5] Rohit Patil, Muzamil Kacchi, Pranali Gavali and Komal Pimparia, “Crime Pattern Detection, Analysis & Prediction using Machine Learning”, International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, Volume: 07.

[6] Umair Muneer Butt, Sukumar Letchmunan, Fadratul Hafinaz Hassan, Mubashir Ali, Anees Baqir and Hafiz Husnain Raza Sherazi, “Spatio-Temporal Crime Hotspot Detection and Prediction: A Systematic Literature Review”, IEEE Transactions on September 2020.

[7] Nasiri, Zakikhani, Kimiya and Tarek Zayed, "A failure prediction model for corrosion in gas transmission pipelines", Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, (2020).
