Crime Analysis and Prediction Using Data
Department of Computer Engineering, MET Bhujbal Knowledge City Institute of Engineering, Adgaon
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – Crime has become a persistent source of trouble for people and society, and a rising crime rate unsettles the stability of a country. To analyse criminal activity and respond to it ahead of time, it is necessary to understand crime patterns. This study carries out such a crime pattern analysis using crime data obtained from the open-source platform Kaggle, which is in turn used to predict the most recently occurring crimes. The main aim of this project is to estimate which type of crime contributes the most, along with the time period and location where it happened. Machine learning algorithms such as Naïve Bayes are applied in this work to classify the various crime patterns, and the accuracy achieved was comparatively high when compared to previous works.

Key Words: Crime, Analyse, Crime patterns, Kaggle, Estimate, Naïve Bayes, Accuracy

Introduction:
Crime has become a major threat that continues to grow in intensity. An action is said to be a crime when it violates the rules, goes against government laws and is highly offensive. Crime pattern analysis requires study of different aspects of criminology and of how patterns are identified. The Government has to spend a lot of time and work applying technology to control some of these criminal activities. Hence, machine learning techniques and crime records are needed to predict the crime type and patterns. The approach uses existing crime data and predicts the crime type and its occurrence based on location and time. Researchers have undertaken many studies that help in analysing crime patterns and their relations in a specific location. Analysing hotspots has become an easier way of classifying crime patterns, which helps officials resolve crimes faster. This approach uses a dataset obtained from the open-source platform Kaggle, based on various factors along with the time and place where crimes occur over a certain period. We applied a classification algorithm that helps in locating the type of crime and the hotspots of criminal activity at a given time and day. The proposed work applies machine learning algorithms to find matching criminal patterns, along with their category, from the given temporal and spatial data.

Literature Survey:
Crimes are of different types and occur at different geographical locations. Many research scholars have suggested mechanisms to analyse the relationship between crime and social variables such as unemployment, income, level of education and so on. Suhong Kim and Param Joshi [1] proposed two different machine learning models used for prediction, K nearest
Shraddha S. Kavathekar [3] used association rule mining in predicting crimes. Some machine learning algorithms, including Deep Neural Network (DNN) and Artificial Neural Network
Fully connected convolution layers have been used in building the prediction model, mainly for multi-label data classification. It was implemented using TensorFlow, a framework mainly designed for deep learning, with dropout layers. These findings suggest that when the number of missing values is high, pre-processing is needed, because crimes do not occur uniformly but are concentrated in particular areas. An Artificial Neural Network (ANN) makes its prognosis through trend analysis when solving problems. It comprises a large number of processing elements that work together in building a model. Chandy and Abraham [4] proposed a random forest classifier for extracting features for data processing in cloud computing. The extracted features are request number, user identification, expiry time, time of arrival and memory requirement. After feature extraction, the workload is predicted using the trained data from the learning stage, which captures the details of the features extracted from the user's request. Rohit Patil, Muzamil Kacchi, Pranali Gavali and Komal Pimparia [5] suggest an Apriori algorithm for frequent patterns and use the result obtained from K-means. Because the crime rate has increased in recent years, a system has to handle an enormous amount of data, which takes considerable time to analyse manually. Hence, advanced machine learning approaches such as K-means clustering have been used. A literature survey on spatial and temporal hotspot prediction of crime [6] categorizes and evaluates techniques for detecting the location and time of crime hotspots by performing a Systematic Literature Review (SLR). Fuzhan Nasiri, Zakikhani, Kimiya and Tarek Zayed [7] suggested a failure prediction model that helps in detecting corrosion in gas transmission pipelines. Most prediction models depend entirely on experimental test data or on limited historical data records. As a result, they ignore corrosion arising from various geographical circumstances. Nikhil Dubey and Setu K. Chaturvedi [8] presented a pertinent analysis of data mining approaches for detecting impending future crime. A computational mechanism to classify crime using machine learning techniques [9] proposed a flexible computational tool to analyse the crime rate in a country, which helps in classifying cybercrimes. Hyeon-Woo Kang and Hang-Bong Kang [10] suggested a fusion method based on a Deep Neural Network for predicting criminal activities from feature-level data with sufficient parameters.

Existing System:
In previous work, the dataset obtained from the open source is first pre-processed to remove duplicated values and features. A decision tree is used to find crime patterns and to extract features from a large amount of data. It provides a primary structure for the further classification process. Features of the classified crime patterns are then extracted using a Deep Neural Network. Based on the prediction, the performance is calculated for both training and test values. Crime prediction helps in forecasting future occurrences of any type of criminal activity and helps officials resolve them at the earliest.

Drawbacks:
The pre-existing works yield low accuracy because the classifier uses categorical values, which produces an outcome biased towards the nominal attributes with greater value.
The classification techniques are not suited to regions with inadequate data or to real-valued attributes. The value of the classifier's parameter must be tuned, so an optimal value has to be assigned.

Proposed System:
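The classification and evaluation steps this paper proposes can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the Conclusion names Multinomial NB and Gaussian NB, whereas this sketch uses a plain categorical Naïve Bayes written from scratch so that it runs without external libraries. The toy records, the `CategoricalNaiveBayes` class, the `cross_val_accuracy` helper and the add-one smoothing value `alpha` are all assumptions made for illustration. The smoothing is included because the Future Work section notes that an unseen value otherwise drives the estimated probability to zero.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes over nominal features with add-alpha smoothing (alpha must be > 0)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # assumption: the paper does not state a smoothing value

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        # value_counts[(label, feature index)][value] -> frequency in training data
        self.value_counts = defaultdict(Counter)
        self.vocab = [set() for _ in rows[0]]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(label, i)][value] += 1
                self.vocab[i].add(value)
        return self

    def log_scores(self, row):
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior of the crime type
            for i, value in enumerate(row):
                seen = self.value_counts[(label, i)][value]
                # the extra +1 slot reserves probability mass for unseen values
                denom = count + self.alpha * (len(self.vocab[i]) + 1)
                score += math.log((seen + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


def cross_val_accuracy(make_model, rows, labels, cv=3):
    """Mean accuracy over cv folds; fold k tests on every cv-th row starting at k."""
    accuracies = []
    for k in range(cv):
        test_idx = set(range(k, len(rows), cv))
        train = [(r, y) for j, (r, y) in enumerate(zip(rows, labels))
                 if j not in test_idx]
        model = make_model().fit([r for r, _ in train], [y for _, y in train])
        hits = sum(model.predict(rows[j]) == labels[j] for j in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / cv


# Hypothetical records: (location, month, time of day, weekday) -> crime type
rows = [
    ("Gateway town", "October", "afternoon", "Friday"),
    ("Gateway town", "October", "afternoon", "Saturday"),
    ("Gateway town", "June", "afternoon", "Friday"),
    ("Harbour road", "October", "night", "Friday"),
    ("Harbour road", "June", "night", "Sunday"),
    ("Harbour road", "June", "night", "Saturday"),
]
labels = ["Larceny", "Larceny", "Larceny", "Assault", "Assault", "Assault"]

model = CategoricalNaiveBayes().fit(rows, labels)
prediction = model.predict(("Gateway town", "October", "afternoon", "Friday"))
accuracy = cross_val_accuracy(CategoricalNaiveBayes, rows, labels, cv=3)
print(prediction, accuracy)
```

Scores are accumulated as log-probabilities to avoid numerical underflow when many features are multiplied together. The `cross_val_accuracy` helper takes a model, a target set and a number of folds, the same three arguments the Evaluation section lists for computing accuracy in Python.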
© 2023, JOIREM |www.joirem.com| Page 2
Journal Publication of International Research for Engineering and Management
(JOIREM)
Volume: 10 Issue: 04 | MAR-2023
A. Crime Prediction:
The expected crime type is predicted from the supported crime features. The features are then mapped to nominal values. This can be explained by taking a single tuple as an instance.
Considering a tuple:
{Gateway town, 20th October 2020, 2:30 PM, Friday} => {Larceny – a crime that involves the theft of a person's property}
Considering the probable occurrence based on each feature extracted:
1. {Gateway town} => {Theft has occurred}
2. {October} => {Theft has occurred}
3. {2020} => {Theft has occurred}
4. {2:30 PM} => {Theft has occurred}
5. {Friday} => {Theft has occurred}

Evaluation:
The performance of the prediction is then evaluated, with the aim of achieving a higher degree of accuracy than the pre-existing model. Training is done with cross-validation, which trains the model on different subsets of the training data and evaluates the accuracy over all the splits. In Python, to calculate the accuracy we need to pass arguments such as the model name, the target set and CV, which helps in signifying the

Conclusion:
In this paper, the difficulty of dealing with nominal distributions and real-valued attributes is overcome by using two classifiers, Multinomial NB and Gaussian NB. Little training time is required, which makes the approach well suited to real-time predictions. It also overcomes the problem of working with a continuous target set of variables, which the existing work could not fit. Thus the crimes that occur the most can be predicted and located using Naïve Bayesian classification. The performance of the algorithm is also calculated using standard metrics: average precision, recall, F1 score and accuracy. The accuracy could be increased further by implementing additional machine learning algorithms.

Future Work:
Though the proposed work overcomes the problems of the existing work, it has some limitations. When a class label is absent, the estimated probability will be zero. As a future extension of the proposed work, applying more machine learning classification models is expected to increase accuracy in crime prediction and enhance the overall performance. Taking income information for neighbourhoods into consideration would provide a better study for future improvement, in order to
References: