Journal Publication of International Research for Engineering and Management (JOIREM)

Volume: 10 Issue: 04 | May-2023

Crime Prediction And Analysis Using Data Mining

1Abhedya Bhole, 2Ankit Chaudhari, 3Sanket Nagla, 4Priyanka Jadhav


Prof. Kanchan Dhomse

Department of Computer Engineering, MET Bhujbal Knowledge City Institute of Engineering, Adgaon, Nashik

------------------------------------------------------------------------------------------------------------------------------------
© 2023, JOIREM | www.joirem.com

Abstract:
Crime analysis and prediction is a systematic approach for identifying crime. The system can predict regions that have a high probability of crime occurrence and visualize crime-prone areas. Using data mining, we can extract previously unknown, useful information from unstructured data, and new information is predicted from existing datasets. Crime is a treacherous and common social problem faced worldwide. It affects the quality of life, economic growth, and reputation of a nation. To secure society from crime, advanced systems and new approaches are needed to improve crime analytics and protect communities. We propose a system that can analyze, detect, and predict the probability of various crimes in a given region. This paper explains various types of criminal analysis and crime prediction using several data mining techniques.

Keywords: Crime prediction, Decision trees, Linear Regression, k-means.

1. INTRODUCTION:
The volume of crime data is increasing day by day because modern technologies and hi-tech methods help criminals carry out illegal activities. According to the Crime Record Bureau, crimes such as burglary and arson have increased, as have crimes such as murder, sex abuse, and gang rape [2]. Crime data will be collected from various blogs, news sites, and websites. This large body of data is used as a record for creating a crime report database. The knowledge acquired through data mining techniques helps reduce crime, because it helps find culprits faster and identify the areas most affected by crime.

Data mining helps solve crimes faster and gives good results when applied to crime datasets; the information obtained can help the police department.

One approach the police have found useful is the identification of crime 'hot spots', which are areas with a high concentration of crime. Data mining techniques can produce important results from crime report datasets. The very first step in the study of crime is crime analysis. Crime analysis means exploring, interrelating, and detecting relationships between various crimes and the characteristics of each crime. This analysis helps in preparing statistics, queries, and maps on demand. It also helps to see whether a crime fits a known pattern or whether a new pattern is necessary.

Crimes can be predicted because criminals are active and operate in their comfort zones. Once successful, they try to replicate the crime under similar circumstances. The occurrence of crime depends on several factors, such as the intelligence of the criminals and the security of a location. This work follows the steps used in data analysis, in which the important phases are data collection, data classification, pattern identification, prediction, and visualization. The proposed framework uses different visualization techniques to show crime trends and various ways of predicting crime using machine
learning algorithms.

CRIME DATA ANALYSIS:

Collection and analysis of crime-related data are imperative for security agencies. The most important aspects to be addressed are the use of coherent methods to classify these data by rate and location of occurrence, the detection of hidden patterns among crimes committed at different times, and the prediction of their future relationships. One of the most popular approaches is hot spot analysis; popular techniques used for this purpose are point pattern analysis and clustering/distance statistics. Another popular approach is the discovery of patterns or trends through various techniques from data mining, text mining, spatial analysis, and self-organizing maps [1]. A crime analysis tool should be able to identify crime patterns quickly and efficiently for future crime pattern detection and action.

The main purposes of crime analysis are:

1. Extraction of crime patterns through crime analysis, based on available criminal information
2. Crime recognition
3. Problem of identifying techniques

CRIME ANALYSIS METHODOLOGY

The crime analysis methodologies are:

- Data Collection
- Classification
- Pattern Identification
- Prediction
- Visualization

Data collection:
Data collection is the first step in crime analysis. Data are collected from various websites, news sites, and blogs. The collected data is stored in a database for further processing. This data is unstructured, and the database supports object-oriented programming that is easy to use and flexible. Crime data is unstructured because the number of fields, the content, and the size of a document can differ from one document to another, so the better option is a schema-less database. The absence of joins also reduces complexity. Other benefits of using an unstructured database are:

- Support for large volumes of structured, semi-structured, and unstructured data.
- Object-oriented programming that is easy to use and flexible.

Classification

In this step we use the Naive Bayes algorithm, a supervised learning method. The Naive Bayes classifier is a probabilistic classifier which, given an input, returns a probability distribution over the set of all classes rather than a single output. One of its main advantages is that it is simple and converges quicker than logistic regression [2]. It also needs less memory than algorithms such as SVM (Support Vector Machine), which take a lot of memory. Using the Naive Bayes algorithm, we create a model by training on crime data related to vandalism, murder, robbery, burglary, sex abuse, gang rape, etc. Naive Bayes works well with a small amount of training data for estimating the classification parameters. When estimating a probability such as P(A) * P(B|D) * P(C|D) * P(E|D), one of the factors is sometimes zero, for example P(C|D) = 0, which makes the whole product zero [2].

Pattern identification:

The third step is pattern identification, where we identify trends and patterns in crime. To find crime patterns that occur frequently, we use the Apriori algorithm. Apriori can be used to determine association rules that highlight general trends in the database. Pattern identification helps police officials work effectively and prevent crime in particular places by providing security, CCTV, alarms, etc.

Crime Prediction

The second approach is predicting the crime type that
might occur in a specific location within a particular time. To predict an expected crime type, four related features of the crime are provided: the occurrence month, the occurrence day of the week, the occurrence time, and the crime location. Prediction states the probability of an event in a future time period. A classification approach is used for crime prediction in data mining [1], to classify areas into hotspots and cold spots and to predict whether an area will be a hotspot for residential burglary. A variety of classification techniques are used for predicting crime [1]:

- K-Nearest Neighbor (k-NN)
- Decision trees (J48)
- Support Vector Machine (SVM)
- Neural Networks

Linear regression methods are also used for crime prediction based on the crime probability. The formula for a regression is: predicted score Y = aX + b, where Y is the predicted score, a is the slope of the line, and b is the intercept.

Some theories used for predicting crime are:

- Integrated theory
- Biological theory
- Psychological theory
- Sociological theory
- Conflict theory
- Victimization theory
- Choice theory

Visualization

Crime-prone areas can be represented graphically using a heat map, which indicates the level of activity: darker colours indicate low activity and brighter colours indicate high activity.

[Figure: Flow chart of crime analysis]

DATA

This dataset contains a record of incidents that the Austin Police Department responded to and wrote a report on. Data is from 2003 to present, and the dataset is updated weekly. Understanding the following conditions will allow you to get the most out of the data provided. Due to methodological differences in data collection, different data sources may produce different results. Because the database is updated weekly, a similar or identical search done on different dates can produce different results.

Comparisons should not be made between numbers generated with this database and any other official police reports. The data represents only calls for police service where a report was written. Totals in the database may vary considerably from official totals following investigation and final categorization. Therefore, the data should not be used for comparisons with Uniform Crime Report statistics. The Austin Police Department does not assume any liability for any decision made or action taken or not taken by the recipient in reliance upon any information or data provided. Pursuant to section 552.301 (c) of the Government Code, the City of Austin has designated certain addresses to receive requests for public information sent by electronic mail.
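
The Apriori step named under pattern identification can be illustrated with a small sketch. This is not the authors' implementation: the function name apriori, the incident records, and the support threshold are all hypothetical and chosen only to show how frequent itemsets are grown level by level and how a rule's confidence follows from the support counts.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Support count = number of transactions containing the candidate.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Level 1: frequent single items.
    items = {frozenset([i]) for t in transactions for i in t}
    current = {c: n for c, n in count(items).items() if n >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c: n for c, n in count(candidates).items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical incident records (crime type, time of day, area type).
incidents = [
    {"burglary", "night", "residential"},
    {"burglary", "night", "commercial"},
    {"robbery", "day", "commercial"},
    {"burglary", "night", "residential"},
    {"vandalism", "night", "residential"},
]
frequent = apriori(incidents, min_support=3)

# Confidence of the rule burglary -> night = support(both) / support(burglary).
confidence = (
    frequent[frozenset({"burglary", "night"})] / frequent[frozenset({"burglary"})]
)
```

In this toy data every burglary record is also a night record, so the rule "burglary implies night" has full confidence, which is the kind of general trend an association rule is meant to surface.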

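
The regression formula given under Crime Prediction, Y = aX + b, can be fitted with ordinary least squares. The sketch below is a minimal illustration, taking a as the slope and b as the intercept; the helper names fit_line and predict and the monthly incident counts are assumptions made up for the example and are not taken from the Austin dataset.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) minimising the squared error of y = a*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx          # slope
    b = y_bar - a * x_bar  # intercept
    return a, b


def predict(a, b, x):
    """Predicted score Y for input X."""
    return a * x + b


# Illustrative data only: month index -> number of reported incidents.
months = [1, 2, 3, 4, 5]
incidents = [12, 15, 18, 21, 24]  # made up; lies exactly on y = 3x + 9

a, b = fit_line(months, incidents)
next_month = predict(a, b, 6)  # extrapolate one month ahead
```

Because the made-up counts lie exactly on a line, the fit recovers that line; with real incident counts the residuals would show how far the data departs from linearity.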



ALGORITHMS

The algorithms chosen for our experiment are:

- Instance-based algorithm
- Decision tree
- Linear regression
- K-means algorithm

Instance Based Algorithm

Instance-based learning, also called memory-based learning, is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. These algorithms store their training set; when predicting a value or class for a new instance, they compute distances to the training instances to make a decision. The algorithms in this category for numerical prediction can be divided into two types: similarity-based (e.g., Euclidean or entropy-based) and regression-based (e.g., LWL). Regression is one of the most popular methods for numerical prediction. The advantage of instance-based algorithms over other machine learning methods is their ability to adapt the model to previously unseen data: instance-based learners may simply store a new instance or throw an old instance away. Their disadvantages are greater storage requirements and computational complexity.

Linear Regression

Linear regression is a simple form of regression. It attempts to model the relationship between two variables by fitting a linear equation to observed data, and it is widely used in statistics. For this purpose, linear functions are used whose unknown parameters, i.e., the weights of the independent variables, are estimated from the training data [1]; the fitted function can then be used to predict values. One of the most common estimation methods is least mean squares. Linear regression algorithms for prediction include simple regression, multiple regression, and pace regression, which is suitable for data of high dimensionality and only accepts binary nominal attributes. The main advantage of linear regression is that it gives a far greater understanding of the variables that can influence outcomes in the coming weeks, months, and years. Its disadvantage is its linearity: if the data has non-linear dependencies, a linear regression model will output the best-fitting line, which may not fit very well.

Decision Tree

The advantages of decision trees are that they are very simple to understand, they help determine the worst, best, and expected values for different scenarios, and they can be combined with other decision techniques. Their disadvantages are that they are unstable, they are often relatively inaccurate, and calculations can get very complex.

K-Means Algorithm:

K-means is the simplest and most commonly used partitioning algorithm among the clustering algorithms in scientific and industrial software. Its acceptance is mainly due to its simplicity. The algorithm is also suitable for clustering large datasets, since its computational complexity is low and grows linearly with the number of data points. The advantages of k-means are that it is relatively simple to implement, scales to large datasets, guarantees convergence, and easily adapts to new examples. Its disadvantages are that k must be chosen manually, the result depends on the initial values, and it has difficulty clustering data of varying sizes and density.

CONCLUSION:

This paper focused on building predictive models for crime frequencies per crime type per month. Crime rates in India are increasing day by day due to many factors such as increase in poverty, implementation, corruption, etc. The proposed model is very useful for both investigating agencies and police officials in taking

necessary steps to reduce crime. The project helps crime analysts analyze these crime networks by means of various interactive visualizations.

Future enhancement of this research work will focus on training bots to predict crime-prone areas using machine learning techniques. Since machine learning is similar to data mining, advanced machine learning concepts can be used for better prediction. Data privacy, reliability, and accuracy can be improved for enhanced prediction.

REFERENCE:

[1] Ginger Saltos and Mihaela Cocea, An Exploration of Crime Prediction Using Data Mining on Open Data, International Journal of Information Technology & Decision Making, 2017.

[2] Shiju Sathyadevan, Devan M.S, Surya Gangadharan.S, Crime Analysis and Prediction Using Data Mining, First International Conference on Networks & Soft Computing (IEEE), 2014.

[3] Khushabu A. Bokde, Tisksha P. Kakade, Dnyaneshwari S. Tumasare, Chetan G. Wadhai, B.E Student, Crime Detection Techniques Using Data Mining and K-Means, International Journal of Engineering Research & Technology (IJERT), 2018.

[4] H. Benjamin Fredrick David and A. Suruliandi, Survey on Crime Analysis and Prediction Using Data Mining Techniques, ICTACT Journal on Soft Computing, 2017.

[5] Tushar Sonawanev, Shirin Shaikh, Rahul Shinde, Asif Sayyad, Crime Pattern Analysis, Visualization and Prediction Using Data Mining, Indian Journal of Computer Science and Engineering (IJCSE), 2015.

[6] RajKumar.S, Sakkarai Pandi.M, Crime Analysis and Prediction Using Data Mining Techniques, International Journal of Recent Trends in Engineering & Research, 2019.

[7] Sarpreet Kaur, Dr. Williamjeet Singh, Systematic Review of Crime Data Mining, International Journal of Advanced Research in Computer Science, 2015.

[8] Ayisheshim Almaw, Kalyani Kadam, Survey Paper on Crime Prediction Using Ensemble Approach, International Journal of Pure and Applied Mathematics, 2018.

[9] Dr. M. Sreedevi, A. Harha Vardhan Reddy, Ch. Venkata Sai Krishna Reddy, Review on Crime Analysis and Prediction Using Data Mining Techniques, International Journal.

[10] K.S.N. Murthy, A.V.S. Pavan Kumar, Gangu Dharmaraju, International Journal of Engineering, Science and Mathematics, 2017.

[11] K.S.N. Murthy, A.V.S. Pavan Kumar, Gangu Dharmaraju, International Journal of Engineering, Science and Mathematics, 2017.

[12] Hitesh Kumar Reddy ToppyiReddy, Bhavana Saini, Ginika Mahajan, Crime Prediction & Monitoring Framework Based on Spatial Analysis, International Conference on Computational Intelligence Data Science.

Area Sensitivity   Notable event   VIP Presence   Criminal group   Crime
Yes                Yes             Yes            No               Yes
Yes                Yes             No             Yes              No
No                 No              No             Yes              No
Yes                No              No             No               No
Yes                Yes             Yes            Yes              Yes
No                 Yes             No             No               No
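
The zero-probability problem mentioned in the Classification step can be reproduced on the table above. The sketch below assumes that the last column, Crime, is the class label and the other four columns are features; that reading is an assumption, as is the use of Laplace (add-one) smoothing, a common remedy that this paper does not itself name. The helper names train and posterior are hypothetical.

```python
from collections import Counter

# Rows of the table: (Area Sensitivity, Notable event, VIP Presence,
# Criminal group) followed by the Crime label (assumed to be the class).
ROWS = [
    (("Yes", "Yes", "Yes", "No"), "Yes"),
    (("Yes", "Yes", "No", "Yes"), "No"),
    (("No", "No", "No", "Yes"), "No"),
    (("Yes", "No", "No", "No"), "No"),
    (("Yes", "Yes", "Yes", "Yes"), "Yes"),
    (("No", "Yes", "No", "No"), "No"),
]


def train(rows):
    """Count class frequencies and (feature index, value, class) frequencies."""
    class_counts = Counter(label for _, label in rows)
    value_counts = Counter(
        (i, v, label) for values, label in rows for i, v in enumerate(values)
    )
    return class_counts, value_counts


def posterior(values, class_counts, value_counts, alpha=1.0, n_values=2):
    """Return P(class | values); alpha=0 disables smoothing."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, v in enumerate(values):
            # Smoothed estimate of P(feature i = v | class).
            score *= (value_counts[(i, v, label)] + alpha) / (n + alpha * n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


class_counts, value_counts = train(ROWS)
query = ("Yes", "Yes", "No", "No")  # hypothetical new area, VIP Presence = No

unsmoothed = posterior(query, class_counts, value_counts, alpha=0.0)
smoothed = posterior(query, class_counts, value_counts, alpha=1.0)
```

Both rows labelled Crime = Yes have VIP Presence = Yes, so without smoothing the factor for VIP Presence = No given Crime = Yes is zero and the whole product for that class vanishes. With add-one smoothing the class keeps a small non-zero probability, so the other three features still contribute to the decision.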

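
The same table can also illustrate how a decision tree chooses its first split. J48 is Weka's implementation of C4.5, which ranks attributes with an information-gain-based criterion; the sketch below computes plain information gain only, again assumes that Crime is the class label, and uses hypothetical helper names. It is an illustration, not the procedure used in this paper.

```python
from collections import Counter
from math import log2

COLUMNS = ["Area Sensitivity", "Notable event", "VIP Presence", "Criminal group"]
ROWS = [  # feature values in COLUMNS order, then the Crime label
    (("Yes", "Yes", "Yes", "No"), "Yes"),
    (("Yes", "Yes", "No", "Yes"), "No"),
    (("No", "No", "No", "Yes"), "No"),
    (("Yes", "No", "No", "No"), "No"),
    (("Yes", "Yes", "Yes", "Yes"), "Yes"),
    (("No", "Yes", "No", "No"), "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy reduction obtained by splitting the rows on one column."""
    labels = [label for _, label in rows]
    groups = {}
    for values, label in rows:
        groups.setdefault(values[column], []).append(label)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(COLUMNS)}
best = max(gains, key=gains.get)  # attribute a tree would split on first
```

On these six rows VIP Presence separates the two classes perfectly, so it has the highest gain, while Criminal group leaves the class mix unchanged and gains nothing. With so few rows the ranking is fragile, which matches the instability listed among the disadvantages of decision trees.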

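
The properties listed in the K-Means Algorithm section (k chosen manually, dependence on initial values, guaranteed convergence) can be seen in a short sketch of the standard iteration. The function name kmeans, the seed parameter, and the incident coordinates are hypothetical and serve only as an illustration.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial values: k distinct input points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments are stable: converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical incident coordinates forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
centroids, clusters = kmeans(points, k=2)
```

Here k = 2 is supplied by hand and the starting centroids come from a seeded random sample, which makes both listed weaknesses explicit: a different k or an unlucky start on less cleanly separated data can produce different clusters.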