Crime Rate Prediction
Presented by: BHAGYASHREE DABHADE, DATA SCIENCE (SEM4), IAR/12227
Faculty mentor: MR. VIKRANT MADNURE
Project Analysis
[Diagram: Project; Problem Definition; System Design]
1. Community Policing
   Author: G. Cordner
   Source: The Oxford Handbook of Police & Policing
   Year: 2014
   Description: It aims at establishing an active and equal partnership between the police and the public through which crime and community ...
   Limitation: Hostility between the police and neighborhood residents can hinder productive partnerships.
2. Digital Crime and Digital Terrorism
   Authors: R. W. Taylor, E. J. Fritsch, and J. Liederbach
   Source: Prentice Hall Press
   Year: 2014
   Description: The potential of crowdsourcing can also be explored in crime reporting and awareness projects.
   Limitation: When cyber criminals acquire a victim's financial information, they can steal money from the victim's account or take out loans in the victim's name. This can cause the victim serious financial problems.
Need for the New System
• Existing methods include K-NN, RF, SVM, and Bayes models.
• Crime rate prediction is a systematic approach for finding crime patterns and
trends. A crime rate prediction system is intended to speed up the process of
solving crimes and to help reduce the crime rate.
Proposed System
• By using the K-NN and Decision Tree algorithms, we can obtain high precision values.
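To make "precision" concrete, here is a minimal sketch of a K-NN classifier and a precision calculation using only the Python standard library. The toy data, the feature values, and the function names `knn_predict` and `precision` are assumptions for illustration and are not taken from the project.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Sort training points by Euclidean distance to the query point.
    neighbours = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

def precision(y_true, y_pred, positive):
    """Precision = true positives / all predictions of the positive class."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)

# Hypothetical toy data: two numeric features per area, label = crime-rate category.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["low", "low", "low", "high", "high", "high"]
test_X = [(1.1, 0.9), (5.1, 5.0), (0.8, 1.2), (4.9, 4.8)]
test_y = ["low", "high", "low", "high"]

pred = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print(pred, precision(test_y, pred, positive="high"))
```

On real data, precision would be measured on a held-out test set rather than on a handful of hand-picked points.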
Limitations of the proposed system
• Hardware Specifications:
• Processor – 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00 GHz
• RAM – 8.00 GB
• OS – Windows 11 Home
• Keyboard, Mouse.
Targeted users
• This model is proposed for the betterment of society. The targeted users are especially women and
family members.
Data preprocessing
[Flow diagram: Crime Historical Data; Data transformation; Data filtering; Machine Learning algorithm; Data visualization; Classification; Model Prediction]
Data Analysis
• Clustering: From a machine learning point of view, clusters correspond to hidden patterns in the data.
The search for clusters is unsupervised learning, and the resulting system represents a data concept.
• K-Means Clustering: K-means is a centroid-based clustering algorithm. We calculate the distance
between each data point and every centroid, and assign the point to the cluster with the nearest centroid.
• It is an unsupervised algorithm.
• To find the value of “K”, we used two methods: the “Silhouette Coefficient” and the “Elbow method”.
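The assignment and update steps described above can be sketched as follows. This is a bare-bones illustration written with the Python standard library, not the project's implementation; the function name `kmeans`, the random initialisation, and the toy points are assumptions.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Basic k-means: returns (labels, centroids) for a list of coordinate tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to the cluster with the nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy points in two dimensions.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

Because the starting centroids are random, different seeds can give different clusterings; running several restarts and keeping the best result is a common safeguard.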
K-Means Clustering
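Silhouette Coefficient: a metric used to assess the quality of a clustering. Its value
ranges from -1 to 1. Values near 1 indicate that points sit well inside their own cluster, and values near -1 suggest that points may be assigned to the wrong cluster.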
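As a rough illustration of how the coefficient is computed: for each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to the points of any other cluster, and the score is (b - a) / max(a, b). The function name `silhouette_score` and the toy data below are assumptions; scoring single-point clusters as 0 is a common convention.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention: a single-point cluster scores 0
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight, well-separated groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette_score(points, [0, 1, 0, 1])   # mixed-up grouping
print(good, bad)
```

To choose K, the score would be computed for several candidate values of K and the value with the highest score preferred.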
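Elbow method: The elbow point is used as a cutoff in mathematical optimization to decide the
point at which diminishing returns are no longer worth the additional cost. For K-means, this is the value of K beyond which adding more clusters no longer reduces the within-cluster sum of squares substantially.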
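A small sketch of the elbow method: compute the within-cluster sum of squares (WCSS) for a range of K values and look for the K where the curve flattens. The compact k-means below, the helper names `kmeans_wcss` and `elbow_curve`, the number of restarts, and the toy points are all assumptions made for illustration.

```python
import math
import random

def kmeans_wcss(points, k, seed=0, iters=100):
    """Run a basic k-means and return the within-cluster sum of squares (WCSS)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (unchanged if it has none).
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(math.dist(p, c) for c in centroids) ** 2 for p in points)

def elbow_curve(points, k_values, restarts=5):
    """WCSS for each candidate K, keeping the best of a few random restarts."""
    return {k: min(kmeans_wcss(points, k, seed=s) for s in range(restarts)) for k in k_values}

# Hypothetical toy data with three visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
curve = elbow_curve(points, range(1, 7))
for k, wcss in curve.items():
    print(k, round(wcss, 2))
```

The elbow is read off by eye or by comparing successive drops in WCSS; a large drop followed by small ones points to the K at the bend.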
K-Means Clustering
Implementation of K-Means:
We proceed with K=3, as the data can then be separated into High, Mid, and Low
categories. The resulting cluster sizes were 553, 21, and 205 for clusters 0, 1, and 2, respectively.
Decision Tree
•Gini impurity:
If all data in the selected sample of the dataset belongs to the
same class, the sample is pure; if the data is a mixture of different
classes, it is impure. Gini impurity measures the likelihood that a
new instance would be classified incorrectly if it were labeled at
random according to the class distribution in the sample.
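The definition above corresponds to the formula 1 minus the sum of squared class proportions. A minimal sketch follows; the function name `gini_impurity` and the example labels are assumptions for illustration.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity = 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

# A pure sample has impurity 0; an even two-class mix has impurity 0.5.
pure = gini_impurity(["High", "High", "High"])
mixed = gini_impurity(["High", "Low"])
three_way = gini_impurity(["High", "High", "Low", "Mid"])
print(pure, mixed, three_way)
```

A decision tree that splits on Gini impurity picks, at each node, the split that gives the lowest weighted impurity in the child nodes.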
•Information Gain:
A commonly used measure of purity is called information. For
each node of the tree, the information gain measures how
much information a feature gives us about the class, that is, how much
splitting on that feature reduces the entropy of the class labels.
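Information gain can be sketched as the entropy of the labels before a split minus the weighted entropy after it. The function names `entropy` and `information_gain` and the example feature values are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical example: one feature separates the classes perfectly, the other not at all.
labels = ["high", "high", "low", "low"]
informative = information_gain(["night", "night", "day", "day"], labels)
uninformative = information_gain(["a", "b", "a", "b"], labels)
print(informative, uninformative)
```

A tree built with this criterion would split first on the feature with the largest gain.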
Decision Tree
CRIME RATE PREDICTION
Final Project evaluation