
Crime Rate Prediction

This document summarizes a student project on crime rate prediction. The project aims to build a model using Indian crime data to predict crime rates and provide location-based insights. Existing systems are based on foreign data and have lower accuracy. The proposed system uses K-NN and decision tree algorithms to achieve higher precision. The targeted users are police, crime branches, and citizens seeking safety insights. The system design involves preprocessing historical crime data, applying machine learning algorithms for classification, and visualizing results.

Uploaded by

bsdrinku

CRIME RATE PREDICTION

Presented by: BHAGYASHREE DABHADE
DATA SCIENCE (SEM4)
IAR/12227
Faculty Mentor: MR. VIKRANT MADNURE
Project Analysis

[Diagram: PROJECT, with branches Analysis, Problem Definition, The objective of new system, Requirement Determination, System Design, Diagrams]


Existing System
• Crime Prediction Using Decision Tree (J48) Classification Algorithm
• Crime predictions are based on population, median household income,
median family income (which differs from household income for non-family
households), and the number of people in poverty.

Point to note: Existing systems are based on foreign data.


LITERATURE SURVEY

1. Community Policing
Author: G. Cordner
Journal: The Oxford Handbook of Police & Policing
Year: 2014
Method and description: It aims at establishing an active and equal partnership between the police and the public through which crime and community
Drawbacks: Hostility between the police and neighborhood residents can hinder productive partnerships.

2. Digital Crime and Digital Terrorism
Authors: R. W. Taylor, E. J. Fritsch and J. Liederbach
Journal: Prentice Hall Press
Year: 2014
Method and description: The potential of crowdsourcing can be well explored in crime reporting and awareness projects as well.
Drawbacks: When cyber criminals acquire a victim's financial information, they can steal money from his account or take loans in his name. This can cause the victim a lot of financial problems.
Need for the New System
K-NN, RF, SVM, and Bayes models are the existing methods.

They had lower accuracy in prediction.

The results are not perfect.


Objective of the New System

The objective of Crime Rate Prediction is to build a model on Indian crime
data with which we can search for a location and get crime rate insights.

It provides the crime probability based on Human Nature.
Problem Definition

• Crime Rate Prediction is a systematic approach for finding crime patterns and
trends. Building a crime rate prediction system speeds up the process of
solving crimes and reduces the rate of crime.
Proposed System

This project helps users find out whether an area is safe or not.

It is especially meant for crime branches, to speed up solving problems.

It finds patterns and insights in the data for future prediction.
Advantages of the proposed system
• The prediction methodology in this model has high accuracy. It is a data mining approach that
helps predict crime patterns and speeds up the process of solving crimes.

• By using K-NN and Decision Tree algorithms, we are able to get high precision values.

• The results are accurate: the decision tree model reports an accuracy score of 0.861538.
Limitations of the proposed system

• The system can still be made more user friendly.

• Data with other meaningful information will create more scope for this model.
System requirements
• Software Specifications:
• Jupyter notebook.

• Hardware Specifications:
• Processor – 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz 3.00 GHz
• RAM – 8.00 GB
• OS – Windows 11 Home ©2017 Microsoft Corporation
• Keyboard, Mouse.
Targeted users
• This model is proposed for the betterment of society; the targeted users are especially women and
family members.

• Police and other officers working in the crime branch.


System Design
• About Dataset :
• The dataset is from wolrddata.com and contains 823 rows & 33
columns.
• With attributes like Year, rape cases, kidnapping, and dowry deaths, a total of
29 types of crime are recorded on the basis of location (State &
District).
• Find the most affected area.
Diagram

[Flow diagram: Crime Historical Data, Data preprocessing, Data transformation, Data filtering, Machine Learning algorithm, Classification, Data visualization, Model Prediction]

Data Analysis

Based on the data, the maximum number of crime cases is seen in the state of Maharashtra.

Murders are found to be high in Bihar.

In terms of women's security, Delhi, Rajasthan, and Maharashtra rank lowest in the list.

Other IPC crimes show a high count in Gujarat.


K-Means Clustering

• Clustering: From a machine learning point of view, clusters correspond to hidden patterns, the search for
clusters is unsupervised learning, and the resulting framework represents a data concept.

• K-Means Clustering: K-means is a centroid-based clustering algorithm, where we calculate the distance
between each data point and a centroid to assign it to a cluster.

• It is an unsupervised algorithm

• For finding the value of “K” we used 2 Methods namely “Silhouette Coefficient” and “Elbow method”.
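
The centroid-based assignment described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the project's notebook code: the function name `kmeans`, its `init` parameter, and the sample values are assumptions made up for the example.

```python
import math
import random


def kmeans(points, k, init=None, iterations=100, seed=0):
    """Plain k-means on a list of numeric tuples; returns (centroids, labels)."""
    # Hypothetical helper: start from the given centroids, or from k random data points.
    if init is not None:
        centroids = list(init)
    else:
        centroids = random.Random(seed).sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # empty cluster keeps its centroid
        converged = new_labels == labels and new_centroids == centroids
        labels, centroids = new_labels, new_centroids
        if converged:
            break
    return centroids, labels


# Made-up one-dimensional "crime totals", for illustration only.
sample = [(1.0,), (2.0,), (3.0,), (50.0,), (52.0,), (54.0,), (200.0,), (210.0,)]
centroids, labels = kmeans(sample, 3, init=[(1.0,), (50.0,), (200.0,)])
print(centroids, labels)
```

With random starting points the result can depend on the seed, so in practice the algorithm is usually run several times and the best run is kept.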
K-Means Clustering

Silhouette Coefficient: A metric used to calculate the goodness of a clustering technique. Its value
ranges from -1 to 1.

Elbow method: The elbow point is used as a cutoff point in mathematical optimization to decide the
point at which diminishing returns are no longer worth the additional cost.
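
Both quantities can be computed from a set of points and their cluster labels. The sketch below is a from-scratch illustration under stated assumptions: the function names and the sample values are invented for the example. The elbow method plots the within-cluster sum of squares (here `inertia`) against K, and the silhouette coefficient is averaged over all points.

```python
import math
from collections import defaultdict


def group_by_cluster(points, labels):
    """Collect the points that share each cluster label."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    return clusters


def inertia(points, labels):
    """Within-cluster sum of squared distances to each cluster mean (elbow-plot quantity)."""
    total = 0.0
    for members in group_by_cluster(points, labels).values():
        center = tuple(sum(col) / len(members) for col in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points; needs at least two clusters."""
    clusters = group_by_cluster(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for single-point clusters
            continue
        # a: mean distance to the other points in the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Made-up values, for illustration only.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(inertia(pts, [0, 0, 0, 0]), inertia(pts, [0, 0, 1, 1]))
print(silhouette_coefficient(pts, [0, 0, 1, 1]))
```

A sharp drop in inertia followed by a flat tail marks the elbow, and a silhouette value close to 1 indicates well-separated clusters.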
K-Means Clustering

Implementation of K-Means:
We proceed further with the value of K=3, as the data can be separated into High, Mid, and Low
categories. We got counts of 553, 21, and 205 for clusters 0, 1, and 2, respectively.
K-Means Clustering

Index numbers indicate clusters as:


• Crime Rate Low: 0 (dark blue / indigo)
• Crime Rate Medium: 2 (teal)
• Crime Rate High: 1 (yellow)
Decision Tree

But why DT?


Decision Tree

•Approach to make a decision tree:


While making a decision tree, at each node of the tree we ask
different types of questions. Based on the question asked, we
calculate the corresponding information gain.

•Gini impurity:
If all data in the selected sample of the dataset belongs to the
same class, the sample is pure; if the data is a mixture of different
classes, it is impure. Gini impurity measures the
likelihood of incorrectly classifying a new instance of a
random variable if it were labeled at random according to the class
distribution in the sample.

•Information Gain:
A commonly used measure of purity is called information. For
each node of the tree, the information value measures how
much information a feature gives us about the class.
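
As a minimal sketch of these two measures (function names and the sample labels are assumptions for illustration, not taken from the project): Gini impurity is one minus the sum of squared class proportions, and information gain is the entropy of a node minus the size-weighted entropy of its children after a split.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 means the sample is pure."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Made-up class labels, for illustration only.
node = ["High", "High", "Low", "Low"]
split = [["High", "High"], ["Low", "Low"]]
print(gini_impurity(node), information_gain(node, split))
```

A question that separates the classes completely, as in this toy split, gives the largest possible gain for that node.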
Decision Tree
Decision Tree

•Accuracy of the Model:


Accuracy is measured by the accuracy score together with the confusion matrix. The score of our
model is 0.861538.
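
For reference, an accuracy score is the share of predictions on the diagonal of the confusion matrix. The sketch below uses made-up labels and hypothetical function names; it does not reproduce the project's data or its 0.861538 result.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Made-up labels, for illustration only.
classes = ["Low", "Mid", "High"]
y_true = ["Low", "Low", "Mid", "High", "High", "Mid"]
y_pred = ["Low", "Mid", "Mid", "High", "High", "Mid"]
cm = confusion_matrix(y_true, y_pred, classes)
print(cm, accuracy_from_confusion(cm))
```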
Thank You
Shiash info solutions pvt.

CRIME RATE PREDICTION
Final Project evaluation

Presented by: BHAGYASHREE DABHADE
DATA SCIENCE (SEM4)
IAR/12227
Faculty Mentor: MR. VIKRANT MADNURE
