Group 6 - SMa - Crime Data Analysis Using Data Mining - Presentation
DATA MINING
By Group 6
Rohan Bhowmick 12200117030
Ishika Chakrabarti
Anwesha Chakraborty
12200117045
12200118011
METHODOLOGY
IMPLEMENTATION
FUTURE WORK
CONCLUSION
REFERENCES
INTRODUCTION
CRIME DATA ANALYSIS
OBJECTIVES
THE OBJECTIVE OF THIS PROJECT IS:
THEORY & DISCUSSION
Retail Industry
Telecommunication Industry
Intrusion Detection
CRIME DATA ANALYSIS USING DATA MINING
LOOKING AHEAD
WHY ANALYSE CRIME?
DATA MINING AND CRIME PATTERN
Crime information can be framed as a data-mining problem, which helps analysts
identify crimes and reach decisions faster. In crime terminology, a cluster is a
group of crimes in a geographical region, that is, a crime hot spot.
STEPS IN DOING CRIME ANALYSIS
But the data we collected is very unstructured, so how do we store it?
The advantage of a NoSQL database over an SQL database is that it allows data to
be inserted without a predefined schema.
Unlike an SQL database, it does not need to know in advance what we are storing,
its size, and so on.
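The schema-less idea can be illustrated with a minimal sketch. It uses a plain Python list as an in-memory stand-in for a document store; the slides do not name a specific NoSQL product, and the records, field names, and the `insert` and `find` helpers are hypothetical.

```python
import json

# Hypothetical crime records: each document carries whatever fields were available.
documents = [
    {"type": "robbery", "city": "Kolkata", "date": "2019-03-14"},
    {"type": "burglary", "city": "Howrah", "items_stolen": ["laptop", "cash"]},
    {"headline": "Vehicle theft reported near station", "source": "news"},
]

def insert(collection, doc):
    """Append a document as-is; no schema is declared or checked."""
    collection.append(json.loads(json.dumps(doc)))  # round-trip to mimic JSON storage
    return len(collection) - 1

def find(collection, **criteria):
    """Return documents whose fields match all criteria; absent fields never match."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in criteria.items())]

collection = []
for doc in documents:
    insert(collection, doc)

robberies = find(collection, type="robbery")
```

Documents with different fields sit side by side, and a query simply skips documents that lack the requested field.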
METHODOLOGY
Classification
• Naïve Bayes: a supervised learning method as well as a statistical method.
• The algorithm classifies a news article into the crime type it fits
best, e.g. "What is the probability that a crime document D belongs to a
given class C?"
• Test results show that Naïve Bayes achieves more than 90% accuracy.
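The question "what is the probability that document D belongs to class C?" can be sketched with a small multinomial Naïve Bayes classifier that picks the class maximising log P(C) plus the sum of log P(word | C). This is an illustrative sketch, not the project's implementation: the training sentences and labels are made up, and add-one (Laplace) smoothing is an assumption.

```python
import math
from collections import Counter, defaultdict

def train(labelled_docs):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in labelled_docs:
        words = text.lower().split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the class C maximising log P(C) + sum of log P(word | C)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)  # add-one smoothing
        for w in text.lower().split():
            if w in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled news snippets.
training = [
    ("armed men robbed a jewellery shop", "robbery"),
    ("gang robbed bank at gunpoint", "robbery"),
    ("house burgled while family was away", "burglary"),
    ("thieves broke into locked house at night", "burglary"),
]
model = train(training)
prediction = classify("shop robbed at gunpoint", model)
```

Working in log space avoids numerical underflow when many word probabilities are multiplied.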
Pattern Identification
• Apriori algorithm: used to determine association rules that highlight
general trends.
• The result of this phase is the crime pattern for a particular place.
• Once a general crime pattern for a place is known, a new case that follows
the same pattern indicates that the area has a chance of crime
occurrence.
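The pattern-identification step can be sketched as follows. This is a minimal Apriori implementation under stated assumptions: each incident is a hypothetical set of attributes (time of day, place, crime type), and the support and confidence thresholds are arbitrary illustration values.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[ante]
                if conf >= min_confidence:
                    out.append((set(ante), set(itemset - ante), conf))
    return out

# Hypothetical incidents described by attribute sets.
incidents = [
    {"night", "market", "theft"},
    {"night", "market", "theft"},
    {"night", "station", "theft"},
    {"day", "market", "assault"},
    {"night", "market", "robbery"},
]
frequent = apriori(incidents, min_support=0.4)
found = rules(frequent, min_confidence=0.7)
```

In this toy data every theft happened at night, so the rule theft -> night has confidence 1.0; a new case matching such a rule would fit the learned pattern for that place.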
Prediction
• Decision tree: simple to understand and interpret.
• It is robust and works well with large datasets.
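A decision tree's interpretability is easiest to see in a small example. The sketch below builds an ID3-style tree using information gain; the slides do not say which tree algorithm or features were used, so the feature names (`area`, `time`) and the "high"/"low" risk labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def best_feature(rows, labels, features):
    """Pick the feature with the highest information gain."""
    def gain(f):
        remainder = 0.0
        for value in {r[f] for r in rows}:
            subset = [l for r, l in zip(rows, labels) if r[f] == value]
            remainder += len(subset) / len(labels) * entropy(subset)
        return entropy(labels) - remainder
    return max(features, key=gain)

def build(rows, labels, features):
    """Build a tree: a leaf is a label, a node is (feature, {value: subtree}, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    f = best_feature(rows, labels, features)
    rest = [x for x in features if x != f]
    branches = {}
    for value in {r[f] for r in rows}:
        idx = [i for i, r in enumerate(rows) if r[f] == value]
        branches[value] = build([rows[i] for i in idx], [labels[i] for i in idx], rest)
    return (f, branches, majority)

def predict(tree, row):
    while isinstance(tree, tuple):
        f, branches, majority = tree
        tree = branches.get(row[f], majority)  # unseen value: fall back to majority
    return tree

# Hypothetical training data: crime risk by area and time of day.
rows = [
    {"area": "market", "time": "night"}, {"area": "market", "time": "day"},
    {"area": "station", "time": "night"}, {"area": "station", "time": "day"},
    {"area": "park", "time": "night"}, {"area": "park", "time": "day"},
]
labels = ["high", "low", "high", "low", "high", "low"]
tree = build(rows, labels, ["area", "time"])
risk = predict(tree, {"area": "market", "time": "night"})
```

Here the tree splits on `time` alone, which can be read directly as the rule "night implies high risk" for this toy data.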
Visualization
• A heat map indicates the level of activity, usually with darker colors for
low activity and brighter colors for high activity.
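The counting behind a heat map can be sketched without a plotting library: incidents are binned into grid cells and each cell's count is mapped to a shade. The coordinates, bounds, and grid size below are hypothetical, and text characters stand in for colors.

```python
def grid_counts(points, bounds, rows, cols):
    """Count points per cell of a rows x cols grid over bounds=(min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bounds
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # skip points outside the mapped area
        c = min(int((x - min_x) / (max_x - min_x) * cols), cols - 1)
        r = min(int((y - min_y) / (max_y - min_y) * rows), rows - 1)
        grid[r][c] += 1
    return grid

def render(grid, shades=" .:*#"):
    """Map each count to a character; denser characters mean more incidents."""
    peak = max(max(row) for row in grid) or 1
    lines = []
    for row in reversed(grid):  # emit the top row (largest y) first
        lines.append("".join(shades[round(v / peak * (len(shades) - 1))] for v in row))
    return "\n".join(lines)

# Hypothetical incident coordinates on a 3 x 3 unit map.
points = [(0.5, 0.5), (0.6, 0.4), (0.7, 0.6), (2.5, 2.5), (1.5, 0.5)]
grid = grid_counts(points, bounds=(0, 0, 3, 3), rows=3, cols=3)
picture = render(grid)
```

The same count grid could be handed to a plotting library to produce a colored heat map.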
IMPLEMENTATION USING DATA VISUALIZING TECHNIQUE
MOST REPORTED CRIME
DIFFERENT TYPES OF CRIME WITH CRIME RATE
CRIMES AGAINST CHILDREN
CRIME AGAINST WOMEN
DISTRIBUTION OF CRIME OVER THE YEAR
CRIME PREDICTION
CRIME PREDICTION RESULTS
FUTURE WORK
Criminal Profiling
• Helps the crime investigators to record the characteristics of
criminals.
CONCLUSION
An acceptable data-mining model that produces excellent results
when analysing a crime data set requires a large amount of
historical data for creating and testing the model.
The more than 150500 crime records used in this work can
give an estimate and lead to an acceptable model. VS Code and
Excel were used to pre-process and analyse the collected
crime and criminal data.
REFERENCES
• [1] A. Malathi and S. Santhosh Baboo. An enhanced algorithm to
predict a future crime using data mining. International Journal of Computer
Applications, 21(1):1–6, May 2011. Published by Foundation of Computer
Science.
• [2] Eibe Frank and Remco R. Bouckaert. Naive Bayes for text classification with
unbalanced classes. In Proceedings of the 10th European Conference on
Principles and Practice of Knowledge Discovery in Databases, PKDD'06,
pages 503–510, Berlin, Heidelberg, 2006. Springer-Verlag.
• [4] Lior Rokach and Oded Maimon. Decision trees. In Oded Maimon and Lior
Rokach, editors, The Data Mining and Knowledge Discovery Handbook, pages
165–192. Springer, 2005.
• [5] Tong Wang, Cynthia Rudin, Daniel Wagner, and Rich Sevieri. Detecting patterns
of crime with series finder. In Proceedings of the European Conference on
Machine Learning and Principles and Practice of Knowledge Discovery in
Databases (ECMLPKDD 2013), 2013.
• [6] R. Sanjana, H. S. S, and S. Sruthi, ‘Big Data Approach for Crime Classification
and Visualization Using Crime Dataset’, Int. J. Innov. Res. Sci. Eng. Technol., vol. 7,
no. 2, pp. 25–29, 2018.
• [7] A. Jain and V. Bhatnagar, ‘Crime Data Analysis Using Pig with Hadoop’, Procedia
Comput. Sci., vol. 78, no. December 2015, pp. 571–578, 2016.
THANK YOU