CRIMECAST: A Crime Prediction and Strategy Direction Service


19th International Conference on Computer and Information Technology, December 18-20, 2016, North South University, Dhaka, Bangladesh

Nafiz Mahmud, Khalid Ibn Zinnah, Yeasin Ar Rahman, Nasim Ahmed
Chittagong University of Engineering & Technology
Chittagong-4349, Bangladesh

Abstract— Research in criminology provides a key piece of information about criminal psychology: a criminal does not roam unknown territory but commits crimes when an opportunity arises in a concentrated or familiar area, i.e. a hotspot. A crime prediction model can therefore be simulated using crime pattern theory, analyzing verified past crime data to predict future criminal activity. The aim of this paper is to introduce CRIMECAST, a crime prediction and strategy direction service that attempts to predict probable future crimes by simulating a probabilistic model together with an Artificial Neural Network (ANN). CRIMECAST is a spatial crime analysis process that focuses on authentic crime history, predicts crime, develops a strategy map, and provides security alerts. Our simulation on a very large dataset shows that CRIMECAST outperforms all other methods of making crime predictions.

Keywords— Crime Prediction, Hotspot Detection, Oracle DB, Probabilistic Model, ANN

I. INTRODUCTION
When law was introduced into social life, acts that violate it inevitably followed. These acts, for which a penalty is imposed, are known as crimes. Because of the inexorable increase in crime, predicting future crime accurately and efficiently is a difficult challenge, and the need for crime prediction is on the rise [1], [2], [3]. Crime is naturally unpredictable, yet it is neither random nor uniform [4], [3]. Regions with a higher occurrence of criminal events, referred to here as hotspots, have a larger share of criminal activity than regions with a lower concentration of criminal events, referred to here as coldspots.

Hotspots change frequently with the development of new residential areas, markets, employment, etc. In hotspots criminals find crime opportunities easily, so crime occurrence is higher. Detecting hotspots enables law enforcement agencies to focus on them and prioritize the use of resources for better crime prevention.

In coldspots, by contrast, a criminal has to hunt for opportunities to commit crime and thus faces higher risks, so crime occurrence is lower. For feasible crime prediction in coldspots, the focus should be on the individual occasional offender. This calls for a well-optimized probabilistic model and massive data analysis, both of which are required to develop real-time crime prediction with precise time and location.

We present CRIMECAST, a mathematical simulation process that analyses crime rate, crime locations, timing, nature of crime, damages, and scope of crime from past years (preferably up to 30 years) and predicts future crime. The process relates past crime trends and factors to their possible influence on the future scope of crime. It can be used to identify both frequent offenders and their hotspots and occasional offenders in coldspots. The authors in [5] propose a Levy Flight Model to capture the dynamics of hotspots. CRIMECAST uses a statistical model and an ANN implementation to predict the spatial behavior of both frequent and occasional offenders. CRIMECAST can also trace and predict a criminal's mobility for crimes such as serial robbery, which is usually done using a directional movement approach and data analysis [6].

The rest of this paper is organized as follows. Section II addresses related work. The CRIMECAST model is presented in Section III. Section IV describes our experimental evaluation and results. Section V provides the conclusion, and future scope is discussed in Section VI.

II. BACKGROUND
Several criminological studies have explored different crime prediction methods. In [7], the authors developed a point-pattern-based density model for crime prediction. This model computes the likelihood of criminal occurrence at a fixed location by analyzing previous incidents. In [2], Rossmo modeled a recognized method of determining the home location of an offender by analyzing the locations of that person's crimes. Collaborative filtering methods [8] infer a user's behavior from the behavior of similar users, based on the idea that users with similar behavior in the past will behave similarly in the future. In [5], the authors propose a lattice model for predicting residential burglary in which each location is assigned a dynamic interest/attractiveness value. Each of these methods proposes a solution model for a different crime prediction problem, so the method presented here cannot be compared with them directly.

978-1-5090-4090-2/16/$31.00 ©2016 IEEE 414 ISBN 978-1-5090-4089-6



In every crime prediction model, a data mining technique is used to predict future crime. In [9], the authors introduce the basic data mining techniques that are used to predict crime.

A. Multivariate time series clustering:

A multivariate time series is a flow of data points recorded at regular time intervals. This technique is efficient for identifying similar crime trends. In [10], the authors discuss the approach used for this technique. Because some crimes carry more importance, i.e. weight, than others (for example, murder has more weight than burglary), the approach applies a weighting scheme in the clustering algorithm. The distance function is defined by a weighted version of the Minkowski distance. The effectiveness of this technique has been tested on an Indian crime dataset provided by the Indian national crime records.

B. Support Vector Machine:

In [11], the authors introduce the approach used for this technique. For a given dataset and a predefined level of crime, it selects a subset of the crime dataset using a k-clustering algorithm and labels each data point in the selected set. Points whose crime rate is above the given rate are hotspots, and points whose crime rate is below it are coldspots. Experiments show that SVM gives precise output when an appropriate algorithm is used.

C. BAYESIAN NETWORK:

A Bayesian network is a probabilistic model that uses a directed acyclic graph (DAG) to represent a set of random variables and their dependencies. Nodes represent variables and edges represent conditional dependencies; nodes that are not connected represent variables with no direct dependency. In [12], the authors introduce a crime prediction model that creates geographic profiles, each giving the probability of a location being the next crime site. The final prediction is made by combining all profiles, which can be adjusted by Bayesian learning theory. This technique has been tested on a crime dataset from Gansu, China, and it can successfully predict a criminal's intentions and next crime scene.

D. FUZZY TIME SERIES:

Fuzzy time series was first introduced to forecast student enrolments at the University of Alabama [13]. It involved complex matrix operations that caused heavy overhead. In [14], the authors proposed a simplified process. In [15], the authors explored and developed the application of the fuzzy time series technique to crime prediction, using seventeen years of historical crime data from the city of Delhi. The technique works even if some data is not available.

E. ARTIFICIAL NEURAL NETWORK:

A neural network has a massive number of processing elements, called neurons, each of which works only on local data. In [16], the authors introduce a crime prediction model that focuses on areas with high crime occurrence. A crime incidence-scanning algorithm is employed to identify clusters, i.e. hotspots. This provides enough data to train an ANN capable of determining crime trends within them. The approach is enhanced by applying the Gamma test, without which the ANN takes a long time to train. The results of crime prediction using an ANN are satisfactory, and prediction accuracy is generally high.

III. PROPOSED METHODOLOGY
In this section we present the components of the CRIMECAST model, starting with dataset preparation.

A. Dataset preparation
The dataset should contain accurate information about several crime domains, preferably covering up to 30 years. Examples of domains, the most common data types, are Murder, Rape, Drug Trafficking, Sexual Harassment, Burglary, Theft, and Pickpocketing.
A domain can be expressed as

$$\sum_{y=1}^{n} d(i, y, s)$$

where i = name of the crime, y = considered time period, s = location.
The dataset should also contain information on various variables of crime such as climate, season, and cultural factors.

B. Calculating Probability of Crime Occurrence

Using the following mathematical steps we determine the probability of occurrence of each crime.

i) Since some crimes are more important than others, we use a precedence factor to label the severity of a specific crime. For example, Murder will have a larger precedence factor than Pickpocketing or Snatching.
It is expressed as p(i, s) = Precedence Factor of crime i in location s.

ii) A crime that has been occurring on a larger scale in recent times has a higher probability of happening again than a crime that used to occur on a larger scale in earlier times. We therefore include a Time Impact Factor in the calculation.
It is expressed as f(i, y, s) = Time Impact Factor of crime i in location s in time period y.

iii) Then we take into consideration the number of occurrences of a specific crime in time period y, because a crime that occurs more often has a higher probability of occurring again than a crime with fewer occurrences.


It is expressed as d(i, y, s) = number of occurrences of crime i in location s in time period y.

iv) Probability Factor:
Then we calculate a probability factor, expressed as

$$O(i, s) = \sum_{y=1}^{n} p(i, s)\, f(i, y, s)\, d(i, y, s)$$

where n = number of time periods taken into consideration.

v) Probability of Occurrence of Crime i in Location s:
We determine this with the following equation:

$$O.P(i, s) = \frac{O(i, s)}{\sum_{i=1}^{m} O(i, s)}$$

where m = number of domains in the given dataset.
As a percentage, Probability of Occurrence = O.P(i, s) * 100%

C. Hotspot Detection

Hotspot detection plays a very important role in crime prediction. The locality aspect of crime states that criminals do not attempt to move far from their anchor locations. Detecting hotspots therefore enables law enforcement agencies to focus and prioritize the use of their assets. For a given collection of data from several locations and a predefined threshold limit of crime rate, which can be adaptively adjusted by the neural network, we propose the following mathematical process to detect hotspots.

i) We calculate the total Probability Factor of all crimes in location s with the following equation:

$$T(s) = \sum_{i=1}^{m} O(i, s)$$

where m = number of domains in the given dataset.

ii) Then we determine the Crime Rate of location s, denoted by C(s), using the formula

$$C(s) = \frac{T(s)}{\sum_{s=1}^{q} T(s)}$$

where q = number of locations in the given dataset.
As a percentage, Crime Rate of location s = C(s) * 100%

Locations with a crime rate above the predefined threshold limit are hotspots, and locations with a crime rate below it are coldspots.

D. Implementation of Artificial Neural Network

We simulated a neural network for adaptive adjustment of the Precedence Factor p(i, s) and the Time Impact Factor f(i, y, s), because in the predictive model we used arbitrary weights for these factors in the calculation. To obtain real-time feedback and precise prediction, these factors need to be adjusted for the many variables of crime, such as weather and season. The weights are adjusted in every iteration of the training algorithm, yielding more precise output. We implemented two hidden layers. We propose the following mathematical process for training the neural network:

i) Compute β for all nodes other than the output nodes using

$$\beta_j = \sum_{k} w_{j \to k}\, o_k (1 - o_k)\, \beta_k$$

ii) Compute weight changes for all weights using

$$\Delta w_{i \to j} = r\, o_i\, o_j (1 - o_j)\, \beta_j$$

where
o = node output
β = error term propagated back to a node
Δw = weight change
r = rate parameter

Simulation of CRIMECAST ANN Design:

Fig. 1. CRIMECAST ANN

E. Consideration of variable of crimes
Crime occurs all over the world. Many variables, such as weather/climate, season, and cultural factors, play a role in the occurrence of crime.

i) Weather: Criminological studies have shown that there is a correlation between weather and crime. Research on the weather variable of crime will help authorities prepare better for different weather conditions and adjust their manpower appropriately. CRIMECAST fetches weather information from forecast.io and adjusts the crime prediction using the Artificial Neural Network.

ii) Season: Researchers in criminology and related disciplines (psychology, psychiatry) have long tried to associate crime rates with environmental factors. In Quetelet 1911, the author first drew attention to seasonal patterns in crime. Later, in Quetelet 1968, the thermic law was introduced, which stated that crime rates peak during summertime. An increase or decrease of light has a direct effect on crime in most countries. CRIMECAST also takes into account the varying impact of this variable using the ANN.

IV. EXPERIMENTAL EVALUATION

This section describes the implementation on a simulated dataset, the results, and a comparison. In this dataset, Impact Factor (P) refers to the Precedence Factor and Impact Factor (T) refers to the Time Impact Factor over the time domain. Table I presents the simulated data. The data was inserted into an Oracle database in order to perform the calculation.

Table I
Simulated Dataset

Domain               JAN     FEB     MAR     APR     IMPACT FACTOR (P)
Murder               3247    2865    3378    2232    1.00
Rape                  324     218     312     233    0.80
Theft                4893    5959    4782    5974    0.30
Robbery               212     231     332     543    0.75
Kidnapping            127      89     323     321    0.40
IMPACT FACTOR (T)    0.17    0.33    0.50    0.66    -

The comparative Probability of Occurrence and the Precedence Factor of each crime are shown in Fig. 2. The values have been amplified 10 times for better visualization. This graph will be generated for each of the crime locations. It indicates how to prioritize and allocate the limited resources of law enforcement agencies. Additional categories can be added if the data is readily available.

Fig. 2. Probability of Occurrence and Precedence Factor graph implementation by JSP (Java)

While calculating the Probability of Occurrence, we have to calculate the Probability Factor of each crime for every location. Collecting the Probability Factors of several locations gives the dataset of Table II.

Table II
Calculated Probability Factor in Different Locations

Domain / Location    Mirpur      Uttara     Motijheel    Baridhara    Gulshan
Murder               10401.99    8077.46    11401.46     5280.17      4045.55
Rape                   867.30     768.77      500.56     1096.77       642.66
Theft                 4056.42    2000.55     5042.13     1121.33      2353.66
Robbery               1844       1142.33     1753.33     1000.42      1545.66
Kidnapping            1647        800.67     1500.47      645.17       850.96

The result of applying the hotspot detection method to the dataset of Table II is shown in Fig. 3.

Fig. 3. Detected hotspots (pie chart titled "Hotspot Detection": Mirpur 27%, Uttara 18%, Motijheel 29%, Baridhara 13%, Gulshan 13%)

A comparison between the mathematical implementation and the Artificial Neural Network implementation of our crime prediction model, displayed in Fig. 4, shows that the ANN implementation gives more precise predictions than the mathematical implementation.

Fig. 4. ANN vs. Mathematical implementation
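As an illustration of the calculations in Sections III-B and III-C, the following Python sketch applies the Probability Factor, Probability of Occurrence, and Crime Rate formulas to the values of Table I and Table II. It is a minimal sketch under stated assumptions, not the JSP and Oracle implementation used in the experiments: the function and variable names are hypothetical, and the 20% threshold (an equal share among five locations) is an assumption, since the paper does not state the threshold limit it used.

```python
# Illustrative sketch of the Section III-B and III-C formulas.
# All names here are hypothetical; only the numbers come from Tables I and II.

# Table I: occurrence counts d(i, y, s) for JAN..APR at a single location.
COUNTS = {
    "Murder": [3247, 2865, 3378, 2232],
    "Rape": [324, 218, 312, 233],
    "Theft": [4893, 5959, 4782, 5974],
    "Robbery": [212, 231, 332, 543],
    "Kidnapping": [127, 89, 323, 321],
}
# Table I: Impact Factor (P), i.e. the precedence factor of each crime.
PRECEDENCE = {"Murder": 1.00, "Rape": 0.80, "Theft": 0.30,
              "Robbery": 0.75, "Kidnapping": 0.40}
# Table I: Impact Factor (T), i.e. the time impact factor of each month.
TIME_IMPACT = [0.17, 0.33, 0.50, 0.66]

# Table II: probability factors per location, in the row order
# Murder, Rape, Theft, Robbery, Kidnapping.
TABLE_II = {
    "Mirpur": [10401.99, 867.30, 4056.42, 1844.00, 1647.00],
    "Uttara": [8077.46, 768.77, 2000.55, 1142.33, 800.67],
    "Motijheel": [11401.46, 500.56, 5042.13, 1753.33, 1500.47],
    "Baridhara": [5280.17, 1096.77, 1121.33, 1000.42, 645.17],
    "Gulshan": [4045.55, 642.66, 2353.66, 1545.66, 850.96],
}


def probability_factor(precedence, time_impact, counts):
    """O(i, s): sum over periods of precedence * time impact * count."""
    return sum(precedence * f * d for f, d in zip(time_impact, counts))


def shares(values):
    """Divide each value by the overall total, as in O.P(i, s) and C(s)."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def detect_hotspots(crime_rate, threshold):
    """Return the locations whose crime rate exceeds the threshold."""
    return sorted(loc for loc, rate in crime_rate.items() if rate > threshold)


# Section III-B: probability factor and probability of occurrence (Table I).
factors = {crime: probability_factor(PRECEDENCE[crime], TIME_IMPACT, COUNTS[crime])
           for crime in COUNTS}
occurrence = shares(factors)

# Section III-C: total probability factor per location, then crime rate C(s).
totals = {loc: sum(values) for loc, values in TABLE_II.items()}
crime_rate = shares(totals)

THRESHOLD = 0.20  # assumed value (equal share of five locations), not from the paper
hotspots = detect_hotspots(crime_rate, THRESHOLD)

for crime, prob in occurrence.items():
    print(f"{crime:<11} O = {factors[crime]:9.2f}  O.P = {prob:6.1%}")
for loc, rate in crime_rate.items():
    print(f"{loc:<10} C(s) = {rate:5.1%}")
print("Hotspots:", hotspots)
```

With these inputs the crime rates round to 27% (Mirpur), 18% (Uttara), 29% (Motijheel), 13% (Baridhara), and 13% (Gulshan), which agrees with the shares labeled in Fig. 3. Under the assumed 20% threshold, Mirpur and Motijheel would be flagged as hotspots; a different threshold would change that set.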


V. CONCLUSION

In this paper, we proposed our crime prediction model CRIMECAST, simulated with both a statistical and an ANN implementation. As we are not aware of any such implementation of a crime prediction system in our country, we were unable to compare our approach. By using an authentic real-life dataset with more domain nodes and hidden layers in the neural network, we would be able to obtain more accurate predictions.

VI. FUTURE WORK

Since all the domains we worked with in CRIMECAST were public crimes, our crime prediction model is unable to predict familicide crimes such as domestic rape, honor killing, and child killing. To predict these types of crimes we need to analyze massive social data such as social behavioral patterns, social values, norms, and cultural values. Although genocide falls within the public crime domain, it is beyond prediction by a mathematical model.

REFERENCES

[1] Brantingham, Patricia L., and Paul J. Brantingham. "Nodes, paths and edges: Considerations on the complexity of crime and the physical environment." Journal of Environmental Psychology 13, no. 1 (1993): 3-28.
[2] B. Chandra, Manish Gupta, M.P Gupta: "A Multivariate Time Series Clustering Approach for Crime Trends Prediction" pp. 892-896 IEEE 2008.
[3] Rossmo, D. Kim. Geographic profiling. CRC press, 1999.
[4] J. L. LeBeau, "The Methods and Measure of Centrography and the spatial Dynamics of Rape" Journal of Quantitative Criminology, Vol.3, No.2, pp.125-141, 1987.
[5] Liu, Hua, and Donald E. Brown. "Criminal incident prediction using a point-pattern-based density model." International journal of forecasting 19, no. 4 (2003): 603-622.
[6] Renjie Liao, Xueyao Wang, Lun Li and Zengchang Qinh, "A Novel Serial Crime Prediction Model Based on Bayesian Learning Theory" Ninth International Conference on Machine Learning and Cybernetics, Qingdao, pp. 1757-1762, IEEE 2010.
[7] Tayebi, Mohammad A., Uwe Glässer, and Patricia L. Brantingham. "Learning where to inspect: Location learning for crime prediction." In Intelligence and Security Informatics (ISI), 2015 IEEE International Conference on, pp. 25-30. IEEE, 2015.
[8] Sherman, Lawrence W., Patrick R. Gartin, and Michael E. Buerger. "Hot spots of predatory crime: Routine activities and the criminology of place." Criminology 27, no. 1 (1989): 27-56.
[9] Short, Martin B., Maria R. D'Orsogna, Virginia B. Pasour, George E. Tita, Paul J. Brantingham, Andrea L. Bertozzi, and Lincoln B. Chayes. "A statistical model of criminal behavior." Mathematical Models and Methods in Applied Sciences 18, no. supp01 (2008): 1249-1267.
[10] Preetika Saxena et al. "Forecasting Enrollments based on Fuzzy Time Series with Higher Forecast Accuracy Rate" Int. J. Computer Technology & Applications, pp. 957-961, 2012.
[11] Anand Kumar Shrivastav, Dr. Ekata "Applicability of Soft computing technique for Crime Forecasting: A Preliminary Investigation" International Journal of Computer Science & Engineering Technology pp. 415-421, 2012.
[12] Wang, Hao, Manolis Terrovitis, and Nikos Mamoulis. "Location recommendation in location-based social networks using user check-in data." In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 374-383. ACM, 2013.
[13] Weisburd, David, Elizabeth R. Groff, and Sue-Ming Yang. The criminology of place: Street segments and our understanding of the crime problem. Oxford University Press, 2012.
[14] S. M. Chen and Hsu, "A new method to forecasting enrollments using fuzzy time series", International Journal of Applied Science and Engineering, pp. 234-244, 2004.
[15] Nikhil Dubey et al. "A Survey Paper on Crime Prediction Technique Using Data Mining", Int. Journal of Engineering Research and Applications, 2014.
[16] Jonathan J. Corcoran, Ian D. Wilson, J. Andrew Ware: "Predicting the geotemporal variations of crime and disorder" International Institute of Forecasters, Elsevier 2003.
