
PREDPATROL (Predictive Patrolling) - IEEEFormat

This document describes a machine learning based tool called PREDPATROL that aims to classify and predict crime types using real-world crime reports and data. It compares the performance of Naive Bayes and Artificial Neural Network classification algorithms on this task. The tool could help police departments analyze crime data more efficiently to aid in investigations and curb crime rates.


PREDPATROL (Predictive Patrolling):

A machine learning based visualization tool for crime type prediction

Sashaank Pejathaya Murali, Sangavi Vijayakumar, Rushvanth Bhaskar, Karthika Subbaraj


Department of Information Technology, SSNCE
SSN College of Engineering
Chennai, India

Abstract: Crimes exist everywhere in different forms, and the data collected about them is very large in volume, which can be managed through data mining techniques. This research can be very useful in crime detection, but its usefulness depends on the volume and correctness of the data, because more data gives more accurate prediction results. Classification is a well-known supervised learning methodology in data analytics. Meaningful sets of data that are relevant to a particular application are mined from the huge amount of data available. It is a highly efficient technique used to classify or group data into appropriate classes and also to predict unknown classes. This research focuses mainly on classifying a given input text to predict the class of crime that it belongs to. The dataset used in this research was collected from the internet, and our tool was trained with it for each type of crime. This paper compares the Naïve Bayes Classifier Algorithm with the Artificial Neural Network Algorithm for predicting the crime type. A huge data set is generated every year from crime reporting. This data can prove very useful in analyzing and predicting crime. Crime analysis is an area of indispensable importance in the police department. To address these issues and assist the police in meeting the increasing demands of the current community, a crime type prediction tool is proposed.

Keywords: Crime Prediction, Classification, Machine Learning, Naïve Bayes, Artificial Neural Networks

I. INTRODUCTION
It is often seen that the crime rate in our city is quite high and that the Police Department finds it difficult to manage its huge datasets and gets buried in paperwork. The proposed service will make their tasks easier by managing the dataset efficiently and helping them identify the type of crime. Crimes are recurring social issues that disrupt our everyday lives and impact the economic growth of any society. Crime is a common menace that induces a sense of fear and discomfort among people, leading to a lack of unity and harmony among us. It hampers our routine by making us avoid certain places at night or break healthy associations with our neighbors, thereby causing chaos and damage to our communal activities. It also leads to the relocation of some families to a less crime-ridden community in order to live a normal and peaceful life. This greatly affects the economy of the society by introducing impalpable costs, like psychological disturbances and decreased quality of life for the victims of these crimes, and substantial costs to manage the increased policing in the society, correction facilities, etc., which place a huge financial burden on taxpayers and governments [1]. The high crime rate that is prevalent in today's world is causing a lot of mayhem among people from different countries, who are looking for ways to detect crime patterns by studying various criminal behaviors and trying to comprehend the numerous characteristics of the crimes that are committed [1]. Due to the lack of open crime datasets available for case studies, and the inconsistency and inadequacy of the very few that are available, data mining in the area of Crime Analysis is a major challenge faced by researchers today, who are looking for new ways to enhance Crime Data Analysis to curb this growing societal menace. One of the most commonly used and important techniques in data mining is Classification. This research focuses on applying the Naïve Bayes classification algorithm to crime reports and comparing its accuracy in classifying crime categories with that of the Artificial Neural Network classification algorithm.

In this research, real-time crime reports are used for training the tool. PredPatrol's mission is simple: to aid the Police Department in predicting crimes. The proposed service is not meant to replace skilled crime analysts and experienced officers, but to aid them in achieving results instantly. It is a web application that takes a crime scene investigation report as input and outputs the category of crime committed. In the recent past the crime rates here have been increasing and the action taken to arrest the accused has been quite slow. The motivation behind this research came from this plight. It was decided to provide the police with a system that will help them expedite the prediction of crimes. The end users of the proposed service would be the members of the Police Department, as they would be aided in solving cases faster. This system will be used by the Police Department extensively, where they can feed in simple real-time data and get a visual and refined output. It can accelerate the process of taking action against crimes.

The organization of this paper is as follows. Section 2 discusses the literature survey. Section 3 covers the methodology and framework, such as the classification methods, the crime dataset/reports and the measures for performance evaluation. Section 4 highlights the experimental results of the classification algorithms for classifying the type of crime. Finally, Section 5 presents the conclusion and future work.

II. LITERATURE SURVEY

A. Criminology and Crime Analysis
Criminology is the study of crimes and the behavior of criminals, and is a practice that recognizes the characteristics of crime [2]. It is one of the most significant areas where data mining approaches can produce vital outcomes. Crime analysis, a part of criminology, is a process that deals with identifying crimes and their associations with criminals. The huge volume of crime datasets and the complexity of the links between these data have made criminology a suitable area for applying data mining methods. The knowledge obtained from data mining techniques can aid and support the police [3]. According to [4], solving crimes is a complicated task that needs both human experience and intelligence, and data mining is an approach that can assist the police with crime detection.

B. Reason for Predictability of Crime
There is strong evidence that crime is statistically predictable, mainly because criminals function in their comfort zones [5]. That is, they commit the same kind of crimes that they have committed in the past, commonly close to the same location and time and with similar patterns.

C. Review of Classification Algorithms Used In Crime Prediction
In this work [6], three classification algorithms, namely Naïve Bayes, C4.5 and K-Nearest Neighbor (KNN), are compared using various well-known missing data filling algorithms on a real-time crime dataset with numerous missing data. The results show that higher accuracy in classification can be acquired by combining the KNN classification algorithm with the GBWKNN missing data filling algorithm.

The main objective of this research [7] is to classify crimes based on the frequency of occurrence. A theoretical model based on techniques such as clustering and classification is applied to a real crime dataset recorded by police in England. Weights are allocated to the features to improve the model's quality.

In this work [8], an experiment is performed to find better supervised classification algorithms to predict crime status by using two feature selection methods on a real-time crime dataset. The comparison finds that Neural Networks, k-Nearest Neighbor and Naïve Bayes are better classifiers than Support Vector Machine and Decision Tree.

In this research [9], classification is applied to a real-time crime dataset to predict the category of crime for different states of the USA. The real-time crime dataset used in this research was acquired from the 1995 FBI UCR. This work compares two classification algorithms, namely Decision Tree and Naïve Bayes, for predicting the category of crime for various states in the USA. The experimental results showed that Decision Tree outperformed the Naïve Bayes algorithm.

The following is related work regarding the two classification algorithms that are used in PredPatrol.

1) Naïve Bayes Classifiers: This work [10] gives a solution for the problem of prediction of criminals (that is, the problem of identifying the most likely suspect) using the Naïve Bayes theory. Obtaining a crime dataset is a hard task owing to confidentiality issues, so the crime dataset is produced synthetically. The proposed methodology is tested on the criminal prediction problem and the results show that the proposed model provides high scores in identifying the most likely suspects.

In this research [11], crime data is used with the Naïve Bayes classification algorithm in the Rapid Miner tool, and it is demonstrated how efficiently and accurately the Naïve Bayes algorithm could manage this data.

In this work [12], the crime dataset is classified into vulnerable and non-vulnerable for effective crime control strategies. The classification algorithms are applied individually on real crime data and their performance is analyzed using standard measures such as accuracy, time and Receiver Operating Characteristic (ROC). The results showed that C4.5 performed better, with higher accuracy on the three datasets, than Naïve Bayes. The results also revealed that the two classifiers performed better under the percentage split approach than under the 10-fold validation approach.

2) Artificial Neural Networks: In this study [13], a hybrid crime classification model was proposed by combining the Artificial Bee Colony (ABC) algorithm with the Artificial Neural Network (ANN) algorithm. The proposal was to use Artificial Bee Colony as a learning method for the ANN, thus producing better results. This hybrid algorithm is applied to a real-time crime dataset to predict classes of crime. The dataset was obtained from the UCI Machine Learning repository. The experimental results show that the proposed combination of ANN and ABC outperformed other classification algorithms and achieved a high accuracy.
III. METHODOLOGY

1) Dataset: The training data (for each category of crime, the respective appropriate passages/texts) on which the classifier was trained was taken mostly from [14]. A sample of the data collected is shown below:

“Coroner's Inspector Luwinda Johnson arrived at the scene at approximately 7:10 a.m. and joined R/Is at the body's location. Inspector Johnson pronounced the victim deceased at 7:24 a.m. by visual observation that the victim was not breathing and by tactile observation that the victim did not have a palpable carotid pulse or any other indications of heartbeat or respiration. Inspector Johnson visually examined the body and observed what appeared to be multiple sharp force injuries to the back, chest and right forearm as well as a blunt force injury to the right side of the head. Inspector Johnson noted that rigor mortis was not yet evident and early indications of livor mortis were observable on the anterior of the body and the right side of the face. While she withheld an official estimate pending an autopsy, Inspector Johnson speculated the victim had been dead approximately 2 to 4 hours. Inspector Johnson indicated that further details would be available in the official autopsy report.”

The highlighted words in the above text form the Bag of Words for this report. The above text is a sample of the dataset that was used for training our classifiers for the category of murder.

2) Ground Truth and Metric Used: The accuracy of the model is tested using the test data, the input Crime Scene Investigation Report (CSIR). The ground truth refers to the label of each training sample, i.e., the category/outcome to which each training sample belongs, which is defined in advance [15].

3) Naïve Bayesian Classifier: The Naïve Bayes Classifier needs to be fed directly with text and the corresponding labels. It assumes no interrelation between the words of a sample text. So, the task is simply to associate a label with a text based on the number and frequency of words.

Algorithm followed for Naïve Bayes Classifier:
1. Words (tokens) are extracted from the CSIR and accuracy is initialized to 0.
2. For each class c, the prior probability of c is assigned to accuracy.
3. For each word w, the conditional probability of word w given class c is added to accuracy.
4. The class with maximum accuracy is displayed as the category of crime.

4) Artificial Neural Networks: Given a set of documents, it is possible to extract a dictionary of words and represent each document in the set as an array, using the Bag of Words model to represent text in vector form. The size of this array will be the size of the input layer of the ANN and the number of classes will be the size of the output layer. The size of the hidden layer can vary, which is automatically handled by PredPatrol's algorithm depending on the data fed to the ANN. The texts to be used for training are stored in an array. The dictionary for those texts is obtained and then the training data is created by associating each text with the appropriate classification.

Algorithm followed for ANN:
1. The input layer contains the bag of words representation of the CSIR.
2. Data is fed to the hidden layer where it gets converted to a logistic classifier (wx + b). The loss function and the parameter matrices are stored as a matrix termed synapse. (b - bias function)
3. The logistic classifier matrix is then scaled to sigmoid non-linearity. This forms the hidden layer.
4. The loss function is optimized using Stochastic Gradient Descent (SGD), by iterating a specific number of times over the training data, resulting in an optimal parameter matrix.
5. The output layer serves to add the SGD optimized bias term to the matrix dot product.
6. The final step is to convert the logistic scores to probabilities (accuracies) using the softmax function (activation function).
7. The class with maximum accuracy is displayed as the category of crime.
8. The optimization iterates over the complete data a specified number of times. If a number of successive iterations makes no difference to the error, the iteration stops.

5) ANN Weight Update Rule: The Delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. The delta rule for a neuron with a linear activation function, which is used in the implementation, is given by

$w_i \leftarrow w_i + \eta \, (y^e - o^e) \, x_i^e$, where $o^e = \sum_{i=0}^{d} w_i x_i^e$

where,
$\eta$ is a small constant called the learning rate
$y^e$ is the target output
$o^e$ is the actual output
$x_i^e$ is the ith input.

Initialization: Examples $\{(x^e, y^e)\}_{e=1}^{N}$, initial weights $w_i$ set to small random values, learning rate parameter taken as 0.1.
Repeat
    for each training sample $(x^e, y^e)$
        output is calculated: $o^e = \sum_{i=0}^{d} w_i x_i^e$
        if the ANN does not respond correctly, weights are updated: $w_i \leftarrow w_i + \eta \, (y^e - o^e) \, x_i^e$
until termination condition is satisfied.
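The ANN steps listed in this section (bag-of-words input, sigmoid hidden layer, softmax output) can be illustrated with a minimal forward-pass sketch in Python. This is not PredPatrol's JavaScript implementation: the layer sizes, the weight and bias values, and the helper names `dense` and `forward` are all hypothetical and chosen only to show the flow of data through the layers.

```python
import math

CLASSES = ["murder", "kidnap", "theft", "non-crime"]

def sigmoid(z):
    """Logistic non-linearity used for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Compute w.x + b for every unit; weights[j][i] links input i to unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(bow, w_hidden, b_hidden, w_out, b_out):
    """Bag-of-words vector -> sigmoid hidden layer -> softmax class probabilities."""
    hidden = [sigmoid(z) for z in dense(bow, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))

# Hypothetical toy network: 3 vocabulary words, 2 hidden units, 4 classes.
bow = [1, 0, 1]
w_hidden = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0], [0.2, 0.3], [-0.5, 0.5], [0.0, 0.0]]
b_out = [0.0, 0.0, 0.0, 0.0]

probs = forward(bow, w_hidden, b_hidden, w_out, b_out)
predicted = CLASSES[probs.index(max(probs))]
```

The class reported is the one with the largest softmax probability, which corresponds to step 7 of the list. Training (steps 4 and 8) is deliberately left out of this sketch.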
Example: Suppose the ANN implemented in PredPatrol accepts two inputs $x_1 = 2$ and $x_2 = 1$, with weights $w_1 = 0.5$, $w_2 = 0.3$ and $w_0 = -1$.
The output of the ANN is:
$o = 2 \times 0.5 + 1 \times 0.3 - 1 = 0.3$

If the given correct output is zero, i.e., $y^e = 0$, the weights are adjusted according to incremental gradient descent as follows (the arithmetic corresponds to $\eta = 1$ and a bias input $x_0 = 1$):

$w_1 = 0.5 + (0 - 0.3) \times 2 = -0.1$
$w_2 = 0.3 + (0 - 0.3) \times 1 = 0$
$w_0 = -1 + (0 - 0.3) \times 1 = -1.3$
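The worked example can be checked with a short Python sketch of the delta rule. The function names `delta_rule_step` and `train` are hypothetical, the bias is modeled as a weight on a constant input of 1, and a fixed number of epochs stands in for the unspecified termination condition. The learning rate is set to 1 here only because that is the value the example's arithmetic implies.

```python
def delta_rule_step(weights, inputs, target, learning_rate):
    """One delta-rule update for a linear neuron.

    weights and inputs are ordered [bias, x1, x2, ...]; the bias input is 1.
    Returns the output before the update and the updated weights.
    """
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
    return output, new_weights

def train(examples, weights, learning_rate, epochs):
    """Repeat the update over all examples for a fixed number of epochs (assumed stop rule)."""
    for _ in range(epochs):
        for inputs, target in examples:
            _, weights = delta_rule_step(weights, inputs, target, learning_rate)
    return weights

# Values from the worked example: w0 = -1, w1 = 0.5, w2 = 0.3, x1 = 2, x2 = 1, target 0.
weights = [-1.0, 0.5, 0.3]
inputs = [1.0, 2.0, 1.0]
output, updated = delta_rule_step(weights, inputs, target=0.0, learning_rate=1.0)
# output is 0.3 and updated is approximately [-1.3, -0.1, 0.0]

# The same example with the learning rate of 0.1 mentioned in the initialization.
trained = train([(inputs, 0.0)], weights, learning_rate=0.1, epochs=50)
trained_output = sum(w * x for w, x in zip(trained, inputs))
```

With a rate of 1 the updated weights give an output of -1.5 on the same input, which overshoots the target of 0. With the rate of 0.1 the first update would move $w_1$ from 0.5 to 0.44, and repeated updates bring the output on this input close to 0.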

6) Proposed Model of PredPatrol: In the proposed tool, the crime scene investigation report is given as input and the results are obtained as the accuracy of the input report (text) in each category of crime. The categories of crime are Murder, Theft and Kidnap, with an additional class for Non-Crime. The proposed model of PredPatrol is shown in Fig. 1.

Fig. 1. Proposed Model of PredPatrol.

A user can upload a file (crime scene investigation report/text) and choose which algorithm he/she wants to use on it for viewing results (either Naïve Bayes or ANN) from the two buttons. The GUI of PredPatrol is shown in Fig. 2.

Fig. 2. User Interface of PredPatrol

The visualization is done using graphs. The four probability results for each class are plotted in a graph with an appropriate scale. A total of 3 graphs are presented: one for each classifier and one graph comparing the winner of NBC and ANN. The novel aspect here is comparing the performance of NBC and ANN for text classification in a very user-interactive manner.

7) Workflow: The workflow (methodology) of PredPatrol is shown in Fig. 3.

Fig. 3. Workflow of PredPatrol.

8) Implementation Details: PredPatrol is a web application written entirely using HTML, CSS, and JavaScript. A JavaScript implementation of the Naïve Bayes Classifier was developed. A file must be uploaded to the tool, which then reads the contents of the file and passes it through an NBC and an ANN. The membership probabilities for each of the 4 classes are obtained as the result for both classifiers. The text belongs to the class with the highest probability for each classifier. The accuracy of each prediction is measured and reported.
For the Artificial Neural Network, two JavaScript libraries, mimir and brain, were used. brain is a Neural Network library written in JavaScript. Mimir is a micro module that uses the Bag of Words model to represent text in a vector form. Mimir also performs tf-idf analysis to weigh the importance of a word within the context of a set of texts.
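To make the Bag of Words and tf-idf ideas mentioned in the implementation details concrete, here is a small standalone Python sketch. It does not use or reproduce the mimir library: the helper names are hypothetical, the tokenizer is a simple lowercase word splitter, and the tf-idf formula (term frequency times log of inverse document frequency) is one common variant chosen for illustration.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(texts):
    """Dictionary of all distinct words, in a fixed (sorted) order."""
    return sorted({token for text in texts for token in tokenize(text)})

def bag_of_words(text, vocabulary):
    """Vector of word counts, one entry per vocabulary word."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocabulary]

def tf_idf(word, text, texts):
    """Term frequency in `text` weighted by how rare `word` is across `texts`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    tf = tokens.count(word) / len(tokens)
    df = sum(1 for t in texts if word in tokenize(t))
    if df == 0:
        return 0.0
    return tf * math.log(len(texts) / df)

# Two short illustrative texts (not taken from the training data).
texts = [
    "Blood was found at the scene.",
    "The window was broken and the piggy bank was empty.",
]
vocabulary = build_vocabulary(texts)
vector = bag_of_words(texts[0], vocabulary)
```

In this toy corpus a word such as "the" appears in both texts and gets a tf-idf weight of 0, while "blood" appears in only one text and gets a positive weight. The length of `vector` equals the vocabulary size, which is the quantity the methodology uses as the size of the ANN input layer.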
IV. EXPERIMENTAL RESULTS AND DISCUSSION

This section discusses the results of our research, comparing the two algorithms used for text classification. The main metric used here is the accuracy of the classifiers.

For the Naïve Bayes Classifier, to find the class to which a given report must be assigned, the probability of each word given each class must be calculated. These values are multiplied together, and the product is multiplied by the prior probability of the class. After performing the above calculations for all the classes of set C, the one with the highest probability is the accuracy, given by (1).

$\text{Accuracy} = c_{map} = \arg\max_{c \in C} P(c|d) = \arg\max_{c \in C} P(c) \prod_{k} P(t_k|c)$    (1)

where $t_k$ represents the words contained in the report, $C$ represents the set of classes used for classification, $P(c|d)$ is the conditional probability of class c given report d, $P(c)$ is the prior probability of class c and $P(t_k|c)$ is the conditional probability of token $t_k$ given class c. The conditional probability of a token given a class, represented as the relative frequency of the term t in texts which belong to a particular class c, is given by (2).

$P(t|c) = \frac{T_{ct}}{\sum_{t' \in V} T_{ct'}}$    (2)

Here $T_{ct}$ is the number of times term t has occurred in the training data from class c, including instances where it has occurred more than once, and $V$ is the total set of words.

For the ANN, the input variables are $x_i$, the weights are $w_i$ and the synapses are $w_i x_i$ (the products of weights and input variables). The summation of the synapses and the bias (b) is $\sum_i w_i x_i + b$. The accuracy of the ANN is given by the softmax function (f) of $\sum_i w_i x_i + b$, given by (3).

$\text{Accuracy} = f\left(\sum_i w_i x_i + b\right)$    (3)

For testing purposes, PredPatrol was tried on a single sentence: “Blood was found at the scene.”

According to the Naïve Bayes Classifier, the probability with which each word of the sentence falls in a crime category is given in Table I. In the case of the ANN, the probability of each word in a class is not computed. It can be noticed that the word “blood” has its highest probability (0.833) under the murder class compared to the other classes.

TABLE I. PROBABILITY OF EACH WORD IN A CLASS OF CRIME

Words    Murder   Kidnap   Theft    Non-Crime
blood    0.833    0.166    0.166    0.166
was      0.585    0.683    0.683    0.071
found    0.166    0.722    0.722    0.166
at       0.585    0.5      0.683    0.258
the      0.5      0.5      0.5      0.5
scene    0.833    0.166    0.166    0.166

The comparison of accuracies (in percentage) of the Naïve Bayes and the ANN algorithms for each class of crime is given in Table II.

TABLE II. COMPARISON OF ACCURACIES OF NAÏVE BAYES AND ANN (IN %)

Classes      Naïve Bayes   ANN
murder       91            0.9
kidnap       10            19
theft        34            27
non-crime    0.02          25

Thus, from Table II it can be deduced that the Naïve Bayes Classifier classifies the sentence as “Murder” because it has the highest accuracy (91%), and the ANN classifies the sentence as “Theft” because it has the highest accuracy (27%), which is not expected (misclassification). The ANN does not give the expected classification in the initial iterations because it needs constant tuning of weights and multiple times the training that the Naïve Bayes Classifier gets. Thus, it misclassifies the above report into the wrong class. The Bag of Words for the above sample sentence consists of {blood, found, scene}. Naïve Bayes has classified the sample sentence correctly under the murder class, with the probability of each word in a class clearly visible. This kind of clarity is not observable in the ANN during the initial iterations.

The results obtained for the Naïve Bayes Classifier and the ANN on the text are shown in Fig. 4.

The ANN was trained with a relatively small amount of data. A network with higher training requires a larger training set. Determining the weights that give optimum results is extremely time consuming. An optimal network topology implies constructing multiple ANNs and choosing the one for which the mean squared error is the lowest. The performance of the ANN is affected by several factors, which include the number of hidden nodes, the nature of the inputs and the time taken for the network to train. It is highly time-consuming to build and train multiple ANNs for each class.
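The Naïve Bayes decision rule described in this section can be sketched in Python using the word probabilities of Table I for the sample sentence. Two things here are assumptions rather than facts from the paper: the class priors are taken as uniform (the paper does not report them), and the scores are accumulated as sums of logarithms, which is equivalent to multiplying the probabilities. The sketch is not expected to reproduce the percentages of Table II; it only shows which class the rule selects.

```python
import math

CLASSES = ["murder", "kidnap", "theft", "non-crime"]

# Conditional word probabilities copied from Table I.
WORD_PROBS = {
    "blood": {"murder": 0.833, "kidnap": 0.166, "theft": 0.166, "non-crime": 0.166},
    "was":   {"murder": 0.585, "kidnap": 0.683, "theft": 0.683, "non-crime": 0.071},
    "found": {"murder": 0.166, "kidnap": 0.722, "theft": 0.722, "non-crime": 0.166},
    "at":    {"murder": 0.585, "kidnap": 0.5,   "theft": 0.683, "non-crime": 0.258},
    "the":   {"murder": 0.5,   "kidnap": 0.5,   "theft": 0.5,   "non-crime": 0.5},
    "scene": {"murder": 0.833, "kidnap": 0.166, "theft": 0.166, "non-crime": 0.166},
}

# Assumption: uniform priors, since the paper does not list P(c).
PRIORS = {c: 1.0 / len(CLASSES) for c in CLASSES}

def log_score(tokens, cls):
    """log P(c) plus the sum of log P(t_k | c) over the known tokens."""
    score = math.log(PRIORS[cls])
    for token in tokens:
        if token in WORD_PROBS:  # words outside the table are skipped
            score += math.log(WORD_PROBS[token][cls])
    return score

def classify(tokens):
    """Return the class with the highest score and all per-class scores."""
    scores = {c: log_score(tokens, c) for c in CLASSES}
    return max(scores, key=scores.get), scores

full_sentence = ["blood", "was", "found", "at", "the", "scene"]
bag = ["blood", "found", "scene"]

predicted_full, scores_full = classify(full_sentence)
predicted_bag, scores_bag = classify(bag)
```

Under these assumptions both the full sentence and the three-word Bag of Words are assigned to the murder class, which agrees with the Naïve Bayes outcome reported for the sample sentence.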
Various factors like sensitivity and specificity of the ANN are not significant enough to be generalizable to a variety of classes. Thus, large training texts from a variety of settings are needed. A limitation of ANN, similar to other data-derived techniques, is that over-fitting to the training data compromises the generalizability of the classes. The performance of the ANN can be increased by increasing the size of the training data.

The performance of the Naïve Bayes algorithm is compared with that of the ANN algorithm in the classification of crimes, in the form of bar charts. Fig. 5 shows the comparison of the winning scores (highest accuracy) of the two algorithms. It indicates that Naïve Bayes gives a better winning score (91%) than the ANN (27%). The results show that the Naïve Bayes classification algorithm outperformed the ANN classification algorithm.

Fig. 4. Comparison of accuracies of Naïve Bayes and ANN for the sample sentence.

Fig. 5. Comparison of Winning Scores for sample sentence.

The above results were for one sentence of the test data. PredPatrol was also tried on various crime scene investigation reports, such as the text for “Theft” shown below.

“At approximately 7:15 a.m., Friday morning, Mrs. King, the seventh-grade science teacher, thought something was fishy as she walked down the hall and noticed that her door was open. She walked into her classroom and immediately discovered that the small aquarium had been broken and her prized gold fish were gasping in the sink. Beside the broken aquarium were the shattered remains of the pink piggy bank that had been on the shelf above the aquarium. A can of blue paint was spilled on the floor. Footprints of a barefooted burglar led to an open window. Bits of a white powdery substance were found next to the broken, empty, piggy bank. The only other item found was a half-eaten large chunk of chocolate candy.”

Fig. 6 represents the results for a full CSIR.

Fig. 6. Comparison of Winning Scores for full CSIR.

Thus, the Naïve Bayes classification algorithm has proved to work the best for the proposed tool and has given an accuracy of 83% (calculated as the ratio of the number of inputs correctly classified to the total number of inputs) when compared to the ANN algorithm, even though both algorithms were trained equally. The ANN requires much more training and weight tuning. It is also evident that the ANN does not give accurate results in the initial iterations because it needs constant feedback and error correction.

In the Naïve Bayes classifier, computing the parameters has a complexity of O(|C||V|). The set of parameters consists of |C||V| conditional probabilities and |C| prior probabilities. All the necessary preprocessing for computing the parameters (extracting the vocabulary, counting the terms, etc.) can be done in a single pass over the training data. Therefore, this component has a time complexity of O(|D| L_a), where |D| is the number of reports and L_a is the average length of a report. In ANNs, by contrast, the time complexity of a single iteration depends on the ANN's structure. For a trained Multi Layer Perceptron, the complexity of classification (the forward propagation) is roughly the number of multiplications needed to compute the activations of all neurons (vector product) in the ith layer of the net: NumberOfNodesInLayer(i) * NumberOfNodesInLayer(i-1). The worst case would be that the nodes are equally distributed, which results in O(n^2), where n is the number of neurons in the network. Thus, Naïve Bayes naturally proves to be much faster than the ANN. Hence it is concluded that the Naïve Bayes Classifier is the best algorithm for text classification and works the best even with less training.
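The two cost expressions discussed in the complexity comparison can be made concrete with a short Python sketch. The function names and the layer sizes in the example are hypothetical; the code simply counts the quantities named in the text, namely the per-layer products NumberOfNodesInLayer(i) * NumberOfNodesInLayer(i-1) for a forward pass and the |C||V| + |C| parameters of Naïve Bayes.

```python
def forward_multiplications(layer_sizes):
    """Multiplications in one forward pass of a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    Each layer i costs nodes(i) * nodes(i-1) multiplications.
    """
    return sum(prev * cur for prev, cur in zip(layer_sizes, layer_sizes[1:]))

def naive_bayes_parameter_count(num_classes, vocabulary_size):
    """|C||V| conditional probabilities plus |C| prior probabilities."""
    return num_classes * vocabulary_size + num_classes

# Hypothetical sizes: a 12-word vocabulary, one hidden layer of 8 nodes, 4 classes.
ann_cost = forward_multiplications([12, 8, 4])   # 12*8 + 8*4 = 128
nb_params = naive_bayes_parameter_count(4, 12)   # 4*12 + 4 = 52
```

These counts only illustrate how each expression grows with the vocabulary and network size; they are not measurements of PredPatrol itself.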
V. CONCLUSION AND FUTURE WORK
PredPatrol is a web application primarily for predicting the type of crime committed and is an aid to the Police Department. This tool uses the Naïve Bayes classifier to predict crime, and its accuracy is compared with that of the Artificial Neural Network algorithm (Naïve Bayes gives an 83% accuracy compared to ANN). Naïve Bayes is found to emerge as the best for text classification for several reasons: it is non-trivial to decipher the process going on underneath the decisions the ANN is making, troubleshooting the implementations of ANN algorithms is tricky, and ANNs do not provide information about the relative significance of the various parameters. ANNs require a large diversity of training for operation. The knowledge acquired during the training of the algorithm is stored in an implicit manner, and hence it is hard to come up with a reasonable interpretation of the overall structure of the network. Naïve Bayes is a classifier which can be used to take different decisions based on results. ANNs need to be retrained after a single instance. They can be made incremental, but this is not guaranteed to be optimal compared to retraining on the entire data set. However, ANN outperforms Naïve Bayes when correlations between input variables need to be considered. Naïve Bayes assumes no dependence between input variables. Therefore, the accuracy of the Naïve Bayes classifier is impacted when the assumption is wrong. The correlation between input variables can be handled by an ANN with an appropriate network structure. This tool has been developed and was also tested successfully using various reports and test cases. It is highly user friendly and can be utilized by police officials, who feed in the crime scene investigation report to get the desired results. For future work there is a plan to extend the tool with ordering of the events of the crime based on supervised learning algorithms (event sequencing) and also with identification of possible suspects based on centrality analysis.

REFERENCES
[1] Ahishakiye, Emmanuel, et al., “Crime Prediction Using Decision Tree (J48) Classification Algorithm.” Analysis, 6.03, 2017.
[2] A. Malathi, B.S. Santhosh, “Algorithmic Crime Prediction Model Based on the Analysis of Crime Clusters.” Global Journal of Computer Science and Technology, Volume 11, Issue 11 Version 1.0, July 2011.
[3] M.R. Keyvanpour, M. Javideh, and M.R. Ebrahimi, “Detecting and investigating crime by means of data mining: a general crime matching framework.” Procedia Computer Science, World Conference on Information Technology, Elsevier B.V., Vol. 3, pp. 872-830, 2010.
[4] S. Nath, “Crime data mining, Advances and innovations in systems”, K. Elleithy (ed.), Computing Sciences and Software Engineering, pp. 405-409, 2007.
[5] L. P. Walter, M. Brian, C. P. Carter, C. S. Susan and S. H. John, Predictive Policing: The Role Of Crime Forecasting In Law Enforcement Operations, Rand Corporation, 2013.
[6] Sun, Cuicui, et al., “Detecting crime types using classification algorithms.” BioTechnology: An Indian Journal, 10.24, 2014.
[7] Kiani, Rasoul, Siamak Mahdavi, and Amin Keshavarzi, “Analysis and prediction of crimes by clustering and classification.” Analysis, 4.8, 2015.
[8] Shojaee, Somayeh, et al., “A study on classification learning algorithms to predict crime status.” International Journal of Digital Content Technology and its Applications, 7.9, pp. 361, 2013.
[9] Iqbal, Rizwan, et al., “An experimental study of classification algorithms for crime prediction.” Indian Journal of Science and Technology, 6.3, pp. 4219-4225, 2013.
[10] Vural, Mehmet Sait, and Mustafa Gök, “Criminal prediction using Naive Bayes theory.” Neural Computing and Applications, 28.9, pp. 2581-2592, 2017.
[11] Ahmed, Waseem, Md Tabrez Nafis, and Siddhartha Sankar Biswas, “Performance Analysis of Naïve Bayes Algorithm on Crime Data using Rapid Miner.” International Journal, 8.5, 2017.
[12] Obuandike, N. Georgina, Alhasan John, and M. B. Abdullahi, “Classification of Crime Data for Crime Control Using C4.5 and Naïve Bayes Techniques.” International Journal of Mathematical Analysis and Optimization: Theory and Applications, pp. 139-153, 2017.
[13] S. Anuar, A. Selamat, R. Sallehuddin, “Hybrid Artificial Neural Network with Artificial Bee Colony Algorithm for Crime Classification.” Phon-Amnuaisuk S., Au T. (eds) Computational Intelligence in Information Systems. Advances in Intelligent Systems and Computing, vol 331, 2015.
[14] https://www.crimescene.com
[15] Gupta, Amit, et al., “A Comparative Study of Classification Algorithms using Data Mining: Crime and Accidents in Denver City the USA.” Education 7.7, 2016.
