CRIMINOVA: CRIME FORECAST

A Project Report submitted to

A P J ABDUL KALAM TECHNOLOGICAL UNIVERSITY


THIRUVANANTHAPURAM
in partial fulfillment of the requirements for the award of the degree of
Bachelor of Technology in
Computer Science and Engineering
Submitted by,

Abhijith A V (PEC20CS002)
Anziya A S (PEC20CS011)
Don Sabu (PEC20CS019)
Nithin Kumar S (PEC20CS030)
Under the guidance of
Mrs. Liji Sara Varghese

College of Engineering Pathanapuram


(Under CAPE, Govt. of Kerala)
May 2024
COLLEGE OF ENGINEERING PATHANAPURAM
An ISO 9001:2015 Certified Institution
Under Cooperative Academy of Professional Education
Established by Government of Kerala
Elikkattor P.O, Piravanthoor, Kollam – 689696

Certificate
This is to certify that the Project Phase - II report entitled “CRIMINOVA: Crime Forecast”
is a bonafide record of the project work carried out by Abhijith A V, Anziya A S, Don Sabu,
and Nithin Kumar S during the year 2023-2024, under our guidance and supervision, in partial
fulfilment of the requirements for the award of the B.Tech Degree in Computer Science and
Engineering of APJ Abdul Kalam Technological University. This report has not been submitted
in any form to any other University or Institute for any purpose.

Mrs. Liji Sara Varghese
Project Guide
Assistant Professor
Dept. of CSE

Mrs. Jooby E
Project Coordinator
Assistant Professor
Dept. of CSE

Mrs. Prameela S
Project Coordinator
Assistant Professor
Dept. of CSE

Mr. Prasanth R
Head of the Department
Dept. of CSE

Place: CE Pathanapuram
Date:May-2024

Phone (Office) : 8547852810, Mobile: 9495562111,


E-mail : [email protected] Website : www.cepathanapuram.ac.in
DECLARATION
We declare that this project report titled CRIMINOVA: Crime Forecast submitted in
partial fulfilment of the degree of B. Tech in Computer Science & Engineering is a record
of original work carried out by us under the supervision of Mrs. Liji Sara Varghese, and
has not formed the basis for the award of any other degree or diploma, in this or any other
Institution or University. In keeping with the ethical practice of reporting scientific information,
due acknowledgements have been made wherever the findings of others have been cited.

Abhijith A V (PEC20CS002)
Anziya A S (PEC20CS011)
Don Sabu (PEC20CS019)
Nithin Kumar S (PEC20CS030)

Place: CE Pathanapuram
Date:May-2024
Acknowledgement

Firstly, we would like to thank the Almighty, by whose grace we were able to complete our
project within the given time.

We express our sincere gratitude to Professor Dr. R Bijukumar, Principal of College
of Engineering Pathanapuram, for providing us with a well-equipped laboratory and all other
facilities.

We also express our sincere gratitude to Mr. Prasanth R, Head of the Computer Science
and Engineering Department, for providing all support and cooperation.

We are very thankful to Mrs. Jooby E and Mrs. Prameela S, our Project
coordinators, for giving us moral support and cooperation. It is our utmost pleasure to convey
our sincere gratitude to Mrs. Liji Sara Varghese, our guide, for providing valuable
suggestions and support.

Finally, we thank our families and friends, who contributed to the successful fulfilment of this
project work.

Abhijith A V (PEC20CS002)
Anziya A S (PEC20CS011)
Don Sabu (PEC20CS019)
Nithin Kumar S (PEC20CS030)

Abstract

Criminal activity is one of the major problems in our society. With such activities
increasing globally every day, crime investigation agencies find it difficult to manage and
investigate incidents, either because of a shortage of police personnel or because criminals
outpace the investigation process. The traditional investigation process takes the police
considerable time to build criminal profiles, to anticipate the location of the next crime, or
to identify crime patterns. There is therefore a need to analyze historical crime patterns
more effectively and in less time, and to predict the location and type of future crimes. Police
departments need a systematic way to analyze criminal profiles and to find the criminals
who may be associated with a given crime. An advanced analytics system is also required
to track other information sources, such as traffic sensors, calls, videos, and police service
calls, for monitoring criminal activities. In this project, we discuss how machine learning
approaches can be used to prevent and deal with such cases.

Contents

Acknowledgement
Abstract
List of Figures

1 Introduction
1.1 Background
1.2 Objectives
1.3 Purpose and Scope
1.4 Problem Definition
1.5 Outline of the Report

2 Literature Review
2.1 Spatio-Temporal crime hotspot detection and prediction
2.2 An empirical analysis of machine learning algorithms for crime prediction using Stacked Generalization: An Ensemble Approach
2.3 Smart policing technique with crime type and risk score prediction based on machine learning for early awareness of risk situation
2.4 A study on predicting crime rates through machine learning and data mining
2.5 Criminal behavior analysis based on machine learning techniques
2.6 Novel Multi-Module approach to predict crime
2.7 Multimodal deep learning crime prediction using Tweets

3 Requirement Analysis
3.1 Functional Requirements
3.2 Non Functional Requirements
3.3 Hardware Requirements
3.4 Software Requirements
3.5 Project Schedule

4 Methodology
4.1 Proposed System
4.2 System Architecture
4.3 Module Description
4.3.1 Data Processing Module
4.3.2 Machine Learning Module
4.3.3 Web Application Module
4.4 Algorithm

5 Implementation

6 Result and Discussions

7 Conclusion and Future Scope
7.1 Conclusion
7.2 Future Scope

References

List of Figures

3.1 Gantt chart
4.1 System Architecture
5.1 Importing and Loading Data
5.2 Data Preprocessing
5.3 Feature Engineering
5.4 Model Training and Evaluation
5.5 Setting Flask App and Defining Routes
5.6 Rendering HTML Templates
5.7 Processing Form Data
5.8 Main
6.1 Data
6.2 Heatmap
6.3 Distribution
6.4 Output 1
6.5 Output 2
6.6 Output 3

Chapter 1

Introduction

1.1 Background
Crime remains a significant challenge faced by societies worldwide. From property crimes
to violent offenses, crime rates continue to pose a threat to public safety and well-being. Law
enforcement agencies, despite their dedication, often struggle with limited resources and a
reactive approach focused on solving crimes after they occur. Predicting criminal activity can
be difficult, making it challenging to prevent crimes before they happen. Predictive policing,
utilizing data-driven technologies and predictive analytics, offers a proactive solution. By
analyzing past crime data, identifying patterns, and forecasting future criminal activity, law
enforcement can optimize resource allocation and reduce crime rates. However, challenges
such as data privacy, algorithmic bias, and ethical implications must be addressed. Despite
these challenges, the potential benefits of predictive policing, including improved crime
prevention and enhanced public safety, make it a promising tool in modern law enforcement.

1.2 Objectives
This project aims to enhance public safety through crime prediction. Firstly, we will develop
advanced predictive models that leverage data on past crimes, geographical factors, and socio-
economic indicators. By analyzing these combined datasets, the models will learn to identify
patterns and trends associated with criminal activity. Secondly, these models will be used to
assist law enforcement in allocating their resources more effectively. By pinpointing areas
with a higher likelihood of crime occurrence, police can proactively deploy their personnel to
high-risk zones, deterring potential crimes before they happen. Ultimately, this system aims to
contribute to a significant reduction in crime by anticipating potential occurrences and enabling
preventative measures to be taken by law enforcement.

1.3 Purpose and Scope


The main purpose of the proposed system is to enhance public safety. To that end, the system aims to:

• Develop predictive models using crime data, geography, and socioeconomics to forecast
criminal activity accurately.

• Aid law enforcement in proactive resource allocation for public safety measures.

• Contribute to crime prevention by identifying and addressing potential occurrences.

Scope of the proposed system:

• Predict crime locations and times to aid police resource allocation.

• Leverage diverse data sources such as crime records and demographics.

• Utilize machine learning for crime prediction.

1.4 Problem Definition


Current reactive crime-solving methods struggle to predict and prevent crime, necessitating a
proactive approach through a crime prediction system.

1.5 Outline of the Report


Chapter 1 gives a brief introduction to the project and to this report.
Chapter 2 presents the background, including a detailed literature survey. Chapter 3 covers
the requirement analysis. Chapter 4 describes the proposed design, including the system
architecture and the module descriptions. Chapter 5 deals with the implementation and gives
detailed information on the tools and algorithms used. Chapter 6 presents the
results, and Chapter 7 presents the conclusion and future scope of the work.

Dept. of Computer Science & Engineering 2 College of Engineering, Pathanapuram


Chapter 2

Literature Review

2.1 Spatio-Temporal crime hotspot detection and prediction


This study underscores the pivotal role of security in nation-building, highlighting law
enforcement agencies’ responsibility in curbing crime for societal welfare and economic
growth. Leveraging technological advancements, particularly Geographic Information Systems
(GIS), has facilitated crime detection and prediction techniques. The abundance of data in
recent years has spurred research interest in crime investigation and trend analysis, aiding
in policy formulation for safer communities. Spatial analysis reveals uneven distribution of
crimes, necessitating hotspot mapping for deeper insights. Integrating spatial and temporal
information through GIS has revolutionized crime prediction accuracy. Time series analysis
techniques and deep learning models like CNN and LSTM have shown promise in forecasting
crimes. However, challenges persist, such as the need for ample data and improving prediction
system robustness. Despite existing literature focusing on crime analysis, a systematic
review is warranted to consolidate knowledge, address challenges, and propose future research
directions. This systematic literature review (SLR) aims to bridge this gap, covering empirical
evidence, techniques, challenges, and dataset characteristics from 2010 to 2019, guided by
established methodologies.

2.2 An empirical analysis of machine learning algorithms
for crime prediction using Stacked Generalization: An
Ensemble Approach
This study introduces a tailored crime prediction approach leveraging Support Vector
Machines (SVM) implemented in MATLAB. The aim is to refine predictions by mitigating the
impact of outliers, thus enhancing overall accuracy and reliability. However, challenges such as
data quality and quantity concerns are acknowledged, as they can significantly affect algorithm
effectiveness. Additionally, issues related to generalization, where models struggle with unseen
data, and the computational resources required pose further obstacles to system development
and deployment. Despite these challenges, the study underscores the importance of exploring
ensemble approaches like SBCPM in machine learning algorithms for crime prediction. By
emphasizing the potential for improved accuracy and robustness, it highlights the significance
of such techniques in advancing predictive modeling for crime prevention and intervention
strategies. Overall, the paper emphasizes the need to address these challenges systematically to
realize the full potential of ensemble methods in crime prediction, ultimately enhancing public
safety and law enforcement efforts.

2.3 Smart policing technique with crime type and risk score
prediction based on machine learning for early awareness
of risk situation
This paper presents a novel machine learning-based technique for predicting crime type and
risk level, crucial for prompt and efficient law enforcement responses. Leveraging text-based
criminal case summaries, the system utilizes the KICS data format, containing comprehensive
policing data. With consideration for 21 representative crime types, the system predicts the
specific type of crime for each case. Additionally, a formula is developed to calculate crime risk
level, considering severity and damage. DNN and CNN-based prediction models are designed
for both crime type and risk score prediction. Evaluation demonstrates superior performance
compared to traditional algorithms, with CNN-based models outperforming SVM and naïve
Bayes by 7% and 8% respectively in crime type prediction. The developed technology,
implemented as a user-friendly software platform, empowers field personnel like police officers
to swiftly identify crime types and risk levels upon receiving new cases, enhancing operational
efficiency.

2.4 A study on predicting crime rates through machine
learning and data mining
The study emphasizes the critical importance of crime analysis in ensuring security and
justice administration within nations. Traditional methods like paperwork and statistical
analysis have limitations in accurately predicting crime occurrences. However, with the
adoption of machine learning and data mining techniques, crime prediction accuracy has
significantly improved. This review surveys various machine learning and data mining
approaches applied to crime analysis and prediction, aiming to provide a concise overview
of their effectiveness. Supervised learning methods, particularly Logistic Regression, have
been widely utilized in crime prediction due to their efficacy. Challenges such as data quality,
generalization, and computational resources are acknowledged, underscoring the need for
stronger prediction algorithms. Historical crime data, often collected using Geographical
Information Systems (GIS), are utilized to forecast future crime patterns and trends. Clustering
methods and supervised learning models are applied to extract crime locations and analyze
crime patterns. Ensemble learning techniques have also shown promise in enhancing crime
prediction accuracy. Despite the challenges, machine learning and data mining algorithms
offer valuable insights for crime analysis and prediction, paving the way for more effective
crime prevention strategies. The study provides a comparative analysis of different prediction
techniques and outlines key challenges and opportunities in the field, aiming to guide future
research efforts for improved crime prediction systems.

2.5 Criminal behavior analysis based on machine learning
techniques
The paper explores the utilization of machine learning (ML) techniques in crime prediction,
emphasizing their significance in contemporary law enforcement. It begins with an introduction
underscoring the necessity of predictive patterns in crime identification and sets out to survey
ML approaches applied to criminal profiling. The literature review section summarizes
various studies, including clustering techniques, blockchain technology for surveillance, and
analytic algorithms for crime prediction. Additionally, it delves into the application of ML
in criminal activities, particularly in financial institutions for fraud detection and money
laundering prevention. The section stresses the importance of scenario analysis in evaluating
the risks associated with AI-driven crime-fighting tools. The conclusion underscores ethical
considerations and risk assessments in deploying AI for crime prevention, advocating for
responsible use and suggesting avenues for future research. Overall, the paper offers valuable
insights for researchers, law enforcement agencies, and policymakers seeking to harness ML
for enhancing crime prediction and prevention efforts.

2.6 Novel Multi-Module approach to predict crime


The paper outlines a comprehensive approach for crime prediction utilizing deep learning
techniques. The proposed method consists of two modules: Feature Level Fusion and Decision
Level Fusion. The former employs temporal-based Attention LSTM and Spatio-Temporal
based Stacked Bidirectional LSTM models, with a Fusion model leveraging their training
data. The latter module combines these models’ outputs for final predictions. The architecture
forecasts crime occurrences for the next hour based on data from the past twenty-four hours,
providing insights into future crime categories, times, and locations. Experimental analysis
focused on San Francisco and Chicago datasets yielded promising results, with Mean Absolute
Error of 0.008 and 0.02, Coefficient of Determination of 0.95 and 0.94, and Symmetric Mean
Absolute Percentage Error of 1.03% and 0.6%, respectively. The proposed model demonstrates
superior performance compared to existing models, offering law enforcement a valuable tool
for crime prevention.

2.7 Multimodal deep learning crime prediction using Tweets


This paper presents a novel approach to crime prediction by leveraging social media
data, particularly Twitter, alongside historical crime incident records. The study aims to
integrate semantic knowledge from text data and crime data using data fusion techniques,
enhancing the predictive capabilities of the model. By applying data fusion to a ConvBiLSTM
model, independent vectors from both tweet and crime modalities are combined into a unified
representation. The study conducted experiments using datasets from the Chicago police
department and crime-related tweets specific to Chicago. Performance evaluation against
various crime prediction models, including traditional deep-learning and BERT-based models,
demonstrated the superiority of the proposed ConvBiLSTM model with multimodal data
fusion, achieving an accuracy of 97.75%. This approach showcases promising results in
enhancing crime prediction accuracy by incorporating social media sentiment analysis into
predictive models.

Chapter 3

Requirement Analysis

3.1 Functional Requirements


• User Input Handling: Accept location and timestamp input through a web form.

• Model Prediction: Utilize a pre-trained Random Forest Classifier to predict crime likelihood.

• Crime Identification: Determine the most probable type of crime.

• Result Presentation: Display predicted crime type and nature on the web page.

3.2 Non Functional Requirements


• Provide timely predictions in response to user input.

• Handle multiple user requests without performance degradation.

• Offer an intuitive user interface for easy interaction.

• Ensure continuous availability with minimal downtime.

3.3 Hardware Requirements


• Server: Standard server hardware with a multi-core processor, sufficient RAM, and
storage space.

• Network Infrastructure: Reliable network connectivity with high-speed internet.

• GPU: Optional dedicated GPU card for accelerated model inference.

3.4 Software Requirements

Operating System (OS)

The application should be designed to operate efficiently across various operating systems to
accommodate a wide range of users. Compatibility with major OS platforms such as Windows,
macOS, and various distributions of Linux, including Ubuntu, CentOS, and Fedora, is crucial.
By ensuring compatibility with multiple operating environments, the application can reach a
larger user base and provide a consistent user experience across different platforms.

Python (Programming Language)

Python serves as the primary programming language for the project due to its versatility,
ease of use, and extensive ecosystem of libraries and frameworks. Leveraging the latest
stable version of Python 3.x ensures access to the latest language features, performance
improvements, and security updates. Additionally, Python’s strong support for scientific
computing and machine learning makes it an ideal choice for implementing machine learning
algorithms and processing large datasets efficiently.

Data Manipulation Libraries (Pandas)

Pandas is essential for handling structured data within the application, particularly for tasks
such as data cleaning, transformation, and analysis. Its powerful data manipulation capabilities,
including support for data alignment, indexing, and aggregation, streamline the processing
of crime datasets. Integrating the latest version of Pandas ensures compatibility with
new features, bug fixes, and performance improvements, enhancing the application’s data
processing capabilities.

Scientific Computing Libraries (NumPy)

NumPy plays a critical role in scientific computing tasks, providing support for mul-
tidimensional arrays, mathematical functions, and linear algebra operations. Within the
project, NumPy facilitates numerical computations, data preprocessing, and statistical analysis,
enabling efficient manipulation and processing of numerical data. Leveraging the latest
version of NumPy ensures access to new features, optimizations, and bug fixes, improving
the performance and reliability of numerical computations.

Web Browser

Users interact with the application through modern web browsers such as Google Chrome,
Mozilla Firefox, Apple Safari, or Microsoft Edge. Ensuring compatibility with a variety of
web browsers enhances the accessibility and usability of the application, allowing users to
access it seamlessly from their preferred browser on desktop or mobile devices. Compatibility
testing across different browsers helps identify and address any compatibility issues, ensuring
a consistent user experience across platforms.

Integrated Development Environment (IDE) or Text Editor

Developers use integrated development environments (IDEs) or text editors for writing,
editing, and debugging code during the development process. Popular IDEs such as PyCharm,
Jupyter Notebook, Visual Studio Code, Sublime Text, and Atom provide features such as
syntax highlighting, code completion, and version control integration, improving developer
productivity and code quality. Choosing the right IDE or text editor based on individual
preferences and project requirements helps streamline the development workflow and ensure
code consistency and quality.

Flask Framework

The system shall employ Flask, a lightweight and flexible web framework, for developing
and deploying the web application. Flask shall provide routing, request handling, template
rendering, and other essential functionalities to facilitate the development of user-friendly web
interfaces.

Geocoding Service

The system shall integrate with a geocoding service, such as Nominatim (provided by
OpenStreetMap), to convert user-provided addresses into geographical coordinates (latitude
and longitude). API requests to the geocoding service shall be efficiently managed to minimize
latency and ensure timely response to user input.
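
One possible way to keep geocoding latency low is to cache results, so that a repeated address does not trigger a second request. The sketch below is illustrative only and is not taken from the project code: `CachedGeocoder` and `fake_lookup` are hypothetical names, the coordinates are made up, and the lookup function is injected so that a real client (for example, one that wraps Nominatim) could be substituted.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


class CachedGeocoder:
    """Wrap an address-lookup function with an in-memory cache (hypothetical helper)."""

    def __init__(self, lookup: Callable[[str], Optional[Coordinates]]) -> None:
        self._lookup = lookup
        self._cache: Dict[str, Optional[Coordinates]] = {}
        self.remote_calls = 0  # number of times the underlying service was called

    def geocode(self, address: str) -> Optional[Coordinates]:
        # Normalise case and whitespace so equivalent addresses share one cache entry.
        key = " ".join(address.lower().split())
        if key not in self._cache:
            self.remote_calls += 1
            self._cache[key] = self._lookup(key)
        return self._cache[key]


def fake_lookup(address: str) -> Optional[Coordinates]:
    """Stand-in for a real geocoding request; the coordinates are invented."""
    table = {"town hall": (10.0, 76.0)}
    return table.get(address)


geocoder = CachedGeocoder(fake_lookup)
first = geocoder.geocode("Town  Hall")   # cache miss, calls the lookup
second = geocoder.geocode("town hall")   # cache hit, no second call
missing = geocoder.geocode("nowhere")    # unknown address, result is None
```

Unknown addresses are cached as `None` as well, which avoids repeating a request that has already failed; a production system would probably also want cache expiry.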

3.5 Project Schedule


This section describes the total time available for the project and how that time
is divided among its different phases. Figure 3.1 shows the Gantt chart
that describes our schedule.

Figure 3.1: Gantt chart

Chapter 4

Methodology

4.1 Proposed System


In an effort to bolster public safety, a web application for crime prediction is proposed. This
system harnesses the power of machine learning to analyze historical crime data and identify
patterns that might predict future occurrences. Law enforcement agencies can leverage this
system to strategically allocate resources and implement targeted prevention measures.
The system functions by collecting crime data from reliable sources, such as police reports
or public safety databases. This data undergoes a meticulous cleaning and preparation process
to ensure its accuracy and usefulness for analysis. Machine learning algorithms, specifically a
trained model like a Random Forest Classifier, are then employed. These algorithms delve into
the preprocessed data, uncovering relationships and patterns between past crime occurrences
and various contributing factors. This could include aspects like timestamps, locations, and
specific crime types.
The user interface provides a user-friendly platform for interaction. Through a web form,
individuals can enter a specific location (address) and a desired timestamp. The system then
takes over, extracting relevant features from this user-provided data. By utilizing the insights
gleaned from the trained model, the system predicts the likelihood of different crime types
occurring at the specified location and time. This prediction is then translated into a clear and
concise message displayed on the web page, informing the user about the most probable crime
and its nature. This information helps users make informed decisions about
their safety and take precautions where needed.
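
The feature-extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `extract_features` and the particular feature names are hypothetical, since the exact feature set is not listed here; the idea is only that a location and a timestamp are turned into numeric inputs for the model.

```python
from datetime import datetime
from typing import Dict


def extract_features(latitude: float, longitude: float, moment: datetime) -> Dict[str, float]:
    """Turn a location and a timestamp into numeric model inputs (assumed feature set)."""
    return {
        "latitude": latitude,
        "longitude": longitude,
        "year": moment.year,
        "month": moment.month,
        "day": moment.day,
        "hour": moment.hour,
        "day_of_week": moment.weekday(),           # Monday is 0, Sunday is 6
        "is_weekend": int(moment.weekday() >= 5),  # 1 for Saturday or Sunday
    }


# Example query: a made-up coordinate pair on a Saturday night.
features = extract_features(9.0, 76.9, datetime(2024, 5, 4, 22, 30))
```

The same function would have to be applied to the historical records at training time, so that the model sees identical features during training and prediction.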

4.2 System Architecture


The system architecture for crime prediction using machine learning comprises several key
components:

• User Interface (UI): This component enables users to interact with the system by
providing input, such as location and time data, and receiving crime predictions. The
UI may take the form of a web application, mobile app, or other user-friendly interface.

• Machine Learning Kernel: At the heart of the system lies the machine learning kernel,
built around a Random Forest Classifier, which is responsible for crime prediction. This
component processes user input, extracts relevant features, queries the database for
historical crime data, and applies the trained machine learning model to make predictions.

• Database: The database stores historical crime data used for training the machine
learning model and making predictions. It contains detailed information about past
crime incidents, including timestamps, locations, crime types, and other relevant factors.
The database provides the necessary data for the machine learning kernel to analyze and
derive insights.

• Result Presentation: Once the machine learning model has processed the user input
and made predictions, the results are presented back to the user through the UI. This
component displays the predicted crime likelihoods, associated crime types, severity
levels, or overall risk assessments based on the specified location and time.

Overall, this system architecture outlines a machine learning-driven approach to crime
prediction, where users interact with the system to receive insights and predictions about
potential crime occurrences based on historical data.

Figure 4.1: System Architecture

4.3 Module Description


This section describes three modules:

• Data Processing Module

• Machine Learning Module

• Web Application Module

4.3.1 Data Processing Module

This module relies on the Pandas, NumPy, Matplotlib, and Seaborn libraries. Pandas is the
main tool for data manipulation and analysis, offering robust support for handling structured
data. NumPy complements Pandas with efficient numerical operations, while Matplotlib and
Seaborn provide options for visualizing data in various formats. Together, these libraries
form a cohesive framework for loading, cleaning, preprocessing, and visualizing crime data,
ensuring its readiness for subsequent analysis.
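As an illustration of the cleaning step, the following minimal sketch uses Pandas to drop incomplete rows, exact duplicates, and records with impossible coordinates. The column names (timestamp, latitude, longitude, crime_type), the sample rows, and the helper clean_crime_data() are assumptions made for this example and are not taken from the project's dataset.

```python
import pandas as pd

# Hypothetical sample records; the column names are assumptions for illustration.
raw = pd.DataFrame({
    "timestamp": ["2023-01-05 22:15", "2023-01-05 22:15", None, "2023-02-11 03:40"],
    "latitude": [8.52, 8.52, 8.49, 95.0],   # 95.0 lies outside the valid range
    "longitude": [76.94, 76.94, 76.95, 76.90],
    "crime_type": ["theft", "theft", "assault", "burglary"],
})


def clean_crime_data(df: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows, exact duplicates, and impossible coordinates."""
    df = df.dropna(subset=["timestamp", "latitude", "longitude", "crime_type"])
    df = df.drop_duplicates()
    valid = df["latitude"].between(-90, 90) & df["longitude"].between(-180, 180)
    return df[valid].reset_index(drop=True)


cleaned = clean_crime_data(raw)
print(cleaned)
```

Which rows count as outliers depends on the dataset; the coordinate range check here is only one possible rule.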


4.3.2 Machine Learning Module

At the heart of the crime prediction system lies the machine learning module, empowered
by libraries like Scikit-learn and Joblib. Scikit-learn stands out as a comprehensive toolkit for
machine learning tasks, offering a wide array of algorithms and utilities for model training,
evaluation, and prediction. Meanwhile, Joblib plays a crucial role in the persistence of
machine learning models, enabling seamless saving and loading of trained models. Within
this module, sophisticated algorithms such as the Random Forest Classifier are employed to
construct predictive models based on historical crime data, leveraging patterns and correlations
to anticipate future crime occurrences.
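The role of Joblib can be illustrated with a short round trip: a fitted Random Forest model is saved to disk and loaded back, as a web application might do at startup. This is a sketch under stated assumptions: the training data is synthetic, the two feature columns (hour and day of week) and the three class codes are hypothetical, and the file is written to a temporary directory.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: the columns could be hour and day of week (an assumption).
rng = np.random.default_rng(0)
X = rng.integers(0, 24, size=(60, 2))
y = rng.integers(0, 3, size=60)  # three hypothetical crime-type codes

model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Save the fitted model to disk and load it back.
path = os.path.join(tempfile.mkdtemp(), "rf_model.pkl")
joblib.dump(model, path)
restored = joblib.load(path)

same = bool((model.predict(X) == restored.predict(X)).all())
print("Predictions identical after reload:", same)
```

Persisting the model this way separates training from serving, so the web application does not need to retrain on every start.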

4.3.3 Web Application Module

The user interface of the crime prediction system is built on the Flask web framework,
which forms the core of the web application module. Flask's lightweight and flexible
architecture makes it a good fit for developing interactive web applications. It handles
user requests, routes them to the appropriate functions, and renders HTML templates to
present prediction results. Additionally, Geopy, a geocoding library, may be incorporated
to process location data input by users, translating addresses into geographic coordinates
for spatial analysis.
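The geocoding step could look like the sketch below. It assumes Geopy's Nominatim geocoder, which needs network access, so the lookup is passed in as a callable and the example runs offline against a stand-in. The helper names address_to_coords() and make_nominatim_geocoder(), the user agent string, and the sample coordinates are all hypothetical.

```python
from typing import Callable, Optional, Tuple


def address_to_coords(address: str, geocode: Callable) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for an address, or None if it is not found."""
    location = geocode(address)
    if location is None:
        return None
    return (location.latitude, location.longitude)


def make_nominatim_geocoder(user_agent: str = "crime-forecast-demo") -> Callable:
    """Build a Geopy geocoder; imported lazily because it needs network access."""
    from geopy.geocoders import Nominatim
    return Nominatim(user_agent=user_agent).geocode


# Offline stand-in that mimics the attributes of a Geopy Location object.
class FakeLocation:
    latitude, longitude = 9.09, 76.86  # made-up coordinates


def fake_geocode(address: str):
    return FakeLocation() if address == "known place" else None


print(address_to_coords("known place", fake_geocode))
print(address_to_coords("nowhere", fake_geocode))
```

In a live setting, address_to_coords(user_input, make_nominatim_geocoder()) would perform the real lookup; handling the None case matters because users may enter addresses that cannot be resolved.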

4.4 Algorithm
The following step-by-step algorithm describes crime prediction using the Random Forest algorithm:
Step 1. Input: Historical crime dataset containing features such as timestamps, locations,
and crime types.
Step 2. Data Preprocessing:
- Clean the historical crime data to remove missing values, outliers, and inconsistencies.
- Extract relevant features from the dataset, such as year, month, day, hour, and day of the
week from the timestamps.
- Encode categorical features like crime types using techniques such as one-hot encoding.
- Split the dataset into training and testing sets for model evaluation.
Step 3. Model Training:
- Initialize a Random Forest classifier with hyperparameters like the number of trees
(n_estimators), maximum depth, and minimum samples per leaf.
- Train the Random Forest model on the preprocessed training data, where features are the
extracted attributes, and labels are the crime types.
Step 4. Feature Importance:
- Assess the importance of features in the trained Random Forest model to understand which
factors contribute most to crime occurrences.
- Features with higher importance scores indicate a stronger influence on crime prediction.
Step 5. Prediction:
- Preprocess user-provided input, including location and timestamp, in the same way as the
training data.
- Extract relevant features from the input data and encode categorical features as needed.
- Use the trained Random Forest model to predict the likelihood of different crime types
occurring at the specified location and time.
- The prediction is based on the majority vote or probability distribution of the individual
decision trees in the Random Forest ensemble.
Step 6. Output:
- Display the predicted crime likelihoods for various crime types to the user, providing
insights into potential risks associated with the specified location and time.
- Visualize the predictions using charts or maps to enhance user understanding and
decision-making.
This algorithm enables users to input location and timestamp data and receive predictions
about potential crime occurrences, helping them make informed decisions to enhance personal
safety and security.
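Steps 3 to 5 can be sketched with scikit-learn as follows. This is an illustrative example on synthetic data, not the project's code: the feature names, the hyperparameter values for max_depth and min_samples_leaf, and the made-up rule that "theft" occurs at night and "fraud" during the day are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic history: thefts cluster at night, fraud in the daytime (made-up pattern).
rng = np.random.default_rng(42)
hours = rng.integers(0, 24, size=400)
history = pd.DataFrame({
    "hour": hours,
    "day_of_week": rng.integers(0, 7, size=400),
    "latitude": rng.uniform(8.8, 9.2, size=400),
    "longitude": rng.uniform(76.7, 77.0, size=400),
})
labels = np.where((hours >= 20) | (hours < 5), "theft", "fraud")

# Step 3: initialize and train the classifier.
model = RandomForestClassifier(n_estimators=100, max_depth=5,
                               min_samples_leaf=2, random_state=0)
model.fit(history, labels)

# Step 4: impurity-based feature importances (they sum to 1).
importances = pd.Series(model.feature_importances_, index=history.columns)

# Step 5: class probabilities for one query, averaged over the trees.
query = pd.DataFrame([{"hour": 23, "day_of_week": 5,
                       "latitude": 9.0, "longitude": 76.9}])
probabilities = dict(zip(model.classes_, model.predict_proba(query)[0]))

print(importances.sort_values(ascending=False))
print(probabilities)
```

Because the synthetic label depends only on the hour, that feature should dominate the importance ranking; on real crime data the ranking would reflect whatever structure the dataset actually contains.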



Chapter 5

Implementation

The implementation of the Crime prediction system involves several steps and functions to
perform various tasks.
Importing Libraries: The code starts by importing the necessary libraries: pandas,
numpy, matplotlib, Flask, joblib, and RandomForestClassifier from sklearn. These libraries
are used for data manipulation, visualization, model training and persistence, and web
application development.
Loading Data: The code loads a CSV file named 'data.csv' into a pandas DataFrame using
the pd.read_csv() function. The loaded data is stored in the variable data.

Figure 5.1: Importing and Loading Data

Data Preprocessing: The 'timestamp' column in the DataFrame is converted into a datetime
format using a list comprehension and the pd.to_datetime() function. This step
ensures that the timestamp data is suitable for analysis.


Figure 5.2: Data Preprocessing

Feature Engineering: Additional features like year, month, day, etc., are extracted from
the timestamp column using the ’dt’ accessor provided by pandas. These features are stored
in a new DataFrame called dat and then concatenated with the original DataFrame data using
’pd.concat()’.
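The preprocessing and feature engineering steps described above might look like the following sketch. The two sample rows and the crime_type column are hypothetical, and the conversion is applied to the whole column at once rather than through a list comprehension; the variable names data and dat follow the text.

```python
import pandas as pd

# Hypothetical rows; the real 'data.csv' is not reproduced here.
data = pd.DataFrame({
    "timestamp": ["2023-01-05 22:15:00", "2023-07-16 03:40:00"],
    "crime_type": ["theft", "burglary"],
})

# Vectorised conversion; unparseable values become NaT instead of raising.
data["timestamp"] = pd.to_datetime(data["timestamp"], errors="coerce")

# Derive calendar features with the .dt accessor and collect them in 'dat'.
dat = pd.DataFrame({
    "year": data["timestamp"].dt.year,
    "month": data["timestamp"].dt.month,
    "day": data["timestamp"].dt.day,
    "hour": data["timestamp"].dt.hour,
    "dayofweek": data["timestamp"].dt.dayofweek,  # Monday = 0
})

# Attach the new feature columns to the original DataFrame.
data = pd.concat([data, dat], axis=1)
print(data)
```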

Figure 5.3: Feature Engineering

Model Training: The data is split into training and testing sets using the train_test_split()
function from sklearn. Then, a RandomForestClassifier model is instantiated with 100
estimators and trained on the training data using the fit() method.
Model Evaluation: The trained model is used to make predictions on the testing data
using the predict() method. The accuracy of the model is evaluated using the accuracy_score()
function and the classification_report() function from sklearn.metrics.
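A minimal sketch of the training and evaluation flow is given below. It assumes synthetic features (hour and day of week), two made-up classes, and an 80/20 split; the split ratio and random seeds are not stated in the text and are chosen only for illustration. The variable name rfc follows the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Synthetic features (hour, day of week) with a label that depends on the hour.
rng = np.random.default_rng(1)
X = np.column_stack([rng.integers(0, 24, 500), rng.integers(0, 7, 500)])
y = np.where(X[:, 0] >= 18, "night", "day")  # made-up classes

# Hold out 20% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

rfc = RandomForestClassifier(n_estimators=100, random_state=0)
rfc.fit(X_train, y_train)

y_pred = rfc.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```

The synthetic label is a deterministic function of the hour, so the score here says nothing about performance on real crime data; classification_report() adds per-class precision, recall, and F1, which are more informative than accuracy when crime types are imbalanced.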

Figure 5.4: Model Training and Evaluation

Setting up Flask App: The code initializes a Flask application instance by creating an
object of the Flask class with the name app.
Loading the Trained Model: The code loads the trained Random Forest model ('rf_model.pkl')
using the joblib.load() function and assigns it to the variable rfc.
Defining Routes: The code defines several routes using the @app.route() decorator. Each
route corresponds to a different URL path and HTTP method.

Figure 5.5: Setting Flask App and Defining Routes

Rendering HTML Templates: The routes return HTML templates using the
render_template() function. The templates are located in the 'templates' directory of the
Flask application.
Processing Form Data: When the user submits a form with timestamp data, the
/result.html route receives the form data using the POST method. The predict() function
preprocesses the data, makes predictions using the trained model, and renders the result on
the web page.

Figure 5.6: Rendering HTML Templates

Figure 5.7: Processing Form Data

Running the Flask App: The code checks if the script is run directly (__name__ ==
'__main__') and starts the Flask application with debug mode enabled. Overall, this Flask
code sets up routes for different pages of the web application, loads a trained machine
learning model, processes form data, and renders HTML templates to display the results on
the web page.
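The route structure described in this chapter can be sketched as below. To keep the example self-contained, it returns inline HTML instead of calling render_template(), replaces the loaded model with a hypothetical placeholder function predict_crime(), and exercises the routes through Flask's test client rather than starting a server. The /result.html path and the timestamp form field follow the text; everything else is an assumption.

```python
from flask import Flask, request

app = Flask(__name__)


def predict_crime(timestamp: str) -> str:
    """Placeholder for the model call; a real app would use a loaded classifier."""
    return "theft" if timestamp else "unknown"


@app.route("/")
def index():
    # A real application would return render_template(...) with a template file.
    return ('<form action="/result.html" method="post">'
            '<input name="timestamp"><button>Predict</button></form>')


@app.route("/result.html", methods=["POST"])
def result():
    # Read the submitted form field and pass it to the prediction function.
    timestamp = request.form.get("timestamp", "")
    return f"Predicted crime type: {predict_crime(timestamp)}"


# Exercise the routes in-process; a deployment would call app.run() instead.
client = app.test_client()
response = client.post("/result.html", data={"timestamp": "2024-05-01 21:00"})
print(response.get_data(as_text=True))
```

Debug mode, as mentioned above, is convenient during development but should be disabled when the application is exposed to other users.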

Figure 5.8: Main



Chapter 6

Result and Discussions

Figures 6.1 and 6.2 depict the processed crime dataset and its corresponding heatmap,
showing the distribution of crime occurrences across different locations and timestamps. The
heatmap provides a visual representation of crime hotspots and trends, aiding in understanding
patterns and identifying areas of high crime density.

Figure 6.1: Data

Figure 6.2: Heatmap


Figure 6.3 displays a bar graph illustrating the distribution of crime incidents by month
and day of the week. This visualization helps in identifying any temporal patterns or trends in
crime occurrences, such as seasonal variations or specific days with higher crime rates.

Figure 6.3: Distribution

Figures 6.4, 6.5, and 6.6 show the user interface of the crime prediction system. Users are
prompted to input their location (address) and the desired date for which they seek crime
predictions. Upon submitting the information, the system processes the input data, applies
the trained machine learning model, and provides predictions on potential crime occurrences at
the specified location and date.

Figure 6.4: Output 1


Figure 6.5: Output 2

Figure 6.6: Output 3



Chapter 7

Conclusion and Future Scope

7.1 Conclusion
In conclusion, the development and implementation of the crime prediction system mark a
significant step forward in leveraging machine learning technology to enhance public safety
and law enforcement efforts. Through the analysis of historical crime data and the utilization
of advanced predictive algorithms, the system provides valuable insights into potential crime
occurrences, enabling proactive measures to be taken to mitigate risks and allocate resources
effectively. The system’s ability to identify crime patterns, hotspots, and temporal trends
empowers both individuals and law enforcement agencies to make informed decisions and
take timely actions to address security concerns. By leveraging data-driven approaches, such
as heatmap visualizations and predictive modeling, the system contributes to a more proactive
and strategic approach to crime prevention and intervention. Furthermore, the user-friendly
interface facilitates seamless interaction with the system, allowing users to input their location
and desired date to receive personalized crime predictions. This empowers individuals to take
proactive measures to safeguard themselves and their communities, fostering a sense of security
and confidence in the effectiveness of the system.

7.2 Future Scope


Looking ahead, the future scope of the crime prediction system lies in exploring advanced
AI applications like anomaly detection to identify suspicious activities in real-time. Integrating
real-time data streams such as social media sentiment analysis and weather data holds
promise for enhancing prediction accuracy and comprehensiveness. Additionally, continuously
adapting the system to stay ahead of evolving criminal tactics and emerging crime types is
crucial. This involves ongoing research and development efforts to refine algorithms, expand
data sources, and improve predictive models. By staying abreast of technological advancements
and evolving security challenges, the system can remain effective in addressing the dynamic
nature of criminal activity and contribute to the creation of safer communities.



References

[1] Mandalapu, Varun, et al. "Crime prediction using machine learning and deep learning: A
systematic review and future directions." IEEE Access (2023).

[2] Baek, Myung-Sun, et al. "Smart policing technique with crime type and risk score
prediction based on machine learning for early awareness of risk situation." IEEE Access
9 (2021): 131906-131915.

[3] Kshatri, Sapna Singh, et al. "An empirical analysis of machine learning algorithms for
crime prediction using stacked generalization: an ensemble approach." IEEE Access 9
(2021): 67488-67500.

[4] Travaini, Guido Vittorio, et al. "Machine learning and criminal justice: A systematic
review of advanced methodology for recidivism risk prediction." International Journal of
Environmental Research and Public Health 19.17 (2022): 10594.

[5] Tam, Sakirin, and Ömer Özgür Tanrıöver. "Multimodal Deep Learning Crime Prediction
Using Crime and Tweets." IEEE Access (2023).

[6] Kwan-Loo, Kevin B., et al. "Detection of violent behavior using neural networks and pose
estimation." IEEE Access 10 (2022): 86339-86352.

[7] Tasnim, Nowshin, Iftekher Toufique Imam, and M. M. A. Hashem. "A novel multi-module
approach to predict crime based on multivariate spatio-temporal data using
attention and sequential fusion model." IEEE Access 10 (2022): 48009-48030.

[8] Chen, Fan, et al. "Wifi Log-Based Student Behavior Analysis and Visualization
System." The International Archives of the Photogrammetry, Remote Sensing and Spatial
Information Sciences 43 (2022): 493-499.


[9] Lin, Chih-Yang, et al. "Invisible adversarial attacks on deep learning-based face
recognition models." IEEE Access (2023).

[10] Abdelfattah, Mazen, et al. "Towards universal physical attacks on cascaded camera-lidar
3d object detection models." 2021 IEEE International Conference on Image Processing
(ICIP). IEEE, 2021.
