Assignment Data Mining

This document provides a summary of 10 papers related to using data mining in crime investigation. It includes the title, authors, year published, journal, impact factor, and proposed algorithms or methods for each paper. The papers cover a variety of topics like detecting financial fraud, using data mining for law enforcement, and applying data mining techniques to identify crime patterns. Many of the papers used common data mining methods and algorithms like naive Bayes classifiers, decision trees, and clustering. The papers evaluated the proposed methods using different crime and network datasets and metrics like accuracy, recall, and precision.

Uploaded by

Umair Saeed
Copyright
© All Rights Reserved

Muhammad Musab Khalid (21262)
Assignment: Data Mining
Topic: Crime Investigation
Using Data Mining in Market
MSCS (1 semester)

Sr. # | TITLE OF PAPER | AUTHORS
1 | A Review of Data Mining Applications in Crime | Hossein Hassani, Xu Huang, Emmanuel S. Silva, and Mansi Ghodsi
2 | AN OVERVIEW OF A CRIME DETECTION SYSTEM USING THE ART OF DATA MINING | SRITHA ZITH DEY BABU, DIGVIJAY PANDEY, ISMAIL SHEIK
3 | A Survey on Malware Detection Using Data Mining Techniques | YANFANG YE, TAO LI, DONALD ADJEROH, S. SITHARAMA IYENGAR
4 | Data Mining for the Internet of Things: Literature Review and Challenges | Feng Chen, Pan Deng, Jiafu Wan, Daqiang Zhang, Athanasios V. Vasilakos, and Xiaohui Rong
5 | Detecting Financial Fraud Using Data Mining Techniques: A Decade Review from 2004 to 2015 | Mousa Albashrawi
6 | An Overview of Data mining application in judicial cases to identify patterns of crime and crime detection | Nima Norouzi, Elham Ataei
7 | Using Data Mining to Detect Health Care Fraud and Abuse: A Review of Literature | Hossein Joudaki, Arash Rashidian, Behrouz Minaei-Bidgoli, Mahmood Mahmoodi, Bijan Geraili, Mahdi Nasiri & Mohammad Arab
8 | A Systematic Survey of Online Data Mining Technology Intended for Law Enforcement | MATTHEW EDWARDS, AWAIS RASHID, and PAUL RAYSON
9 | Big data analytics for security and criminal investigations | M.I. Pramanik, Raymond Y.K. Lau, Wei T. Yue, Yunming Ye and Chunping Li
10 | Data mining in anti-money laundering field | Noriaki Yasaka
Sr. # | YEAR | JOURNAL | HEC CATEGORY | IMPACT FACTOR | PROPOSED ALGO / METHOD
1 | 2016 | Water Environment Research | X | 1.946 | The naive Bayes classifier was proposed
2 | 2020 | Scientometrics | W | 2.77 | (Naïve formula is used) We propose real-time prediction, though it will be difficult to be accurate because criminals commit their crimes using different and complex methods.
3 | 2017 | Communications of the ACM | W | 4.654 | Intelligent malware detection methods
4 | 2015 | SAGE Open | W | 1.54 | Hierarchical clustering methods used
5 | 2016 | International Journal of Data Science and Analytics | X | 3.11 | The proposed classification framework (i.e., naïve Bayes, decision tree, neural network, and SVM)
6 | 2021 | Scientometrics | W | 2.77 | Classification techniques are used to predict discrete features, while predictive methods model continuous functions. Prediction techniques include linear and nonlinear regression, neural networks, and support vector machines
7 | 2015 | Annals of Internal Medicine | W | 25.39 | Knowledge Discovery from Databases (KDD)
8 | 2015 | Communications of the ACM | W | 4.654 | Naive Bayes classifier running
9 | 2017 | Water Environment Research | X | 1.946 | The proposed security games are bi-level models that consider an attacker's ability to gather information about the defense strategy before planning an attack
10 | 2017 | International Journal of Development Issues | X | 0.8 | Sophisticated algorithms that can detect illegal behaviors quickly
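Several of the methods listed above rest on the naive Bayes classifier, so a small worked example may help. The following is a minimal sketch, not the implementation used in any of the reviewed papers. It assumes categorical features and Laplace (add-one) smoothing, and the toy records (month, day of week, time of day) and crime labels are hypothetical, invented purely for illustration.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    vocab = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the label with the highest log-posterior."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            seen = value_counts[(i, label)][value]
            # Laplace smoothing keeps unseen values from zeroing the product
            score += math.log((seen + 1) / (count + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (month, day of week, time of day) -> crime type
rows = [
    ("Jan", "Sat", "night"),
    ("Jan", "Sun", "night"),
    ("Feb", "Sat", "night"),
    ("Feb", "Mon", "day"),
    ("Mar", "Tue", "day"),
    ("Mar", "Wed", "day"),
]
labels = ["burglary", "burglary", "burglary", "fraud", "fraud", "fraud"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("Jan", "Sat", "night")))
```

Working in log space avoids numerical underflow when many feature probabilities are multiplied together.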
MUEEN UD DIN
MS CS - 21439

Sr. # | COMPARISON / BENCHMARK / EXISTING METHOD | DATASETS | EVALUATION PARAMETERS / MEASURES
1 | an overall 80–90% accuracy for tests on New Zealand newspapers and generally about 75% accuracy for cross-country scenarios | open-source communities and crime datasets | By evaluating the performance on the suspect description module, an overall recall rate of 70% and 100% precision was achieved.
2 | The bridge of data between the police station and system of data mining will report about further and upcoming crimes | Network datasets used | (1) Crime month. (2) Crime day of the week. (3) The real crime time
3 | Naive Bayes Classifier (NBC) | Using 121 datasets | parameter including the feature selection method (e.g., Document Frequency, FS, or Gain Ratio) (DF, Gain Ratio (GR), and FS) in malware detection
4 | Classification | CLOUDS: a decision tree classifier for large datasets | the parameters of the transformation remain the same for every time series regardless of its nature, related research including DFT [86], wavelet functions related topic [87], and PAA [72].
5 | Expert Systems with Applications and Decision Support Systems | European region was only reported | Country, Frequency, and Percentage (%)
6 | Data mining methods Used (Descriptive and Predictive) | This paper examines existing databases with a crime data mining approach (Data Bank) | PSN project used data mining and predictive analytics to investigate violent shooting-related crimes
7 | Data Mining (DM), Knowledge Discovery from Databases (KDD) and Business Intelligence (BI) | Primary studies (Dataset) that used data mining for detecting health care fraud and abuse | we think that low- and middle-income countries can use data mining techniques as an instrument for evaluating provider's behavior
8 | Their detection system reached an AUC of 91% on their experimental dataset, rising to 99.7% with additional components | Both experiments reused the Second Life and Entropia datasets | an evaluation against manually identified ground truth would be stronger justification of the method's validity
9 | Artificial neural network (ANN) | Handle voluminous datasets | systematic evaluation of state-of-the-art data mining technologies including intelligent agents, link analysis, text mining, machine learning (ML)
10 | methods of estimating money laundering is the suspicious transaction report (STR) which is reported to financial intelligence units (FIU) | United Nations Office on Drugs and Crime | The suspiciousness evaluation rules are the primary repository of knowledge in FAIS
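One of the rows above reports a detection system by its AUC (91%, rising to 99.7%). As a reminder of what that number measures, the sketch below computes AUC directly from its definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The scores and labels are hypothetical and are not taken from any of the reviewed studies.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical detector scores; label 1 marks a known-bad account
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; larger evaluations would normally use a rank-based computation instead.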
Sr. # | RESULTS | DRAWBACKS | ADVANTAGES
1 | The results are promising, with 76% detection accuracy on real world 'blacklist' data and sound evidence that the approach can provide effective warnings several months ahead of the official release of blacklists | Crime continues to remain a severe threat to all communities and nations across the globe alongside the sophistication in technology and processes that are being exploited to enable highly complex criminal activities | Data also points toward the need for more training and investment in educating and empowering youth with knowledge on the advantages
2 | We found that crimes also vary with the criminal's age, health, or political power | To predict crime and analyze crime activity | The data mining system will also help people to make a direct connection with police and the law of justice officials
3 | Their results showed that the FS was very accurate using just the top 50 features | To protect legitimate users from these threats, anti-malware software products from different companies, including Comodo, Kaspersky, Kingsoft, and Symantec, | Advantage of being exhaustive in detecting malicious logic. In other words, static analysis does not have the coverage problem that dynamic
4 | Data mining technologies are integrated with IoT technologies for decision making support and system optimization. | Data mining involves discovering novel, interesting, and potentially useful patterns from large data sets and applying algorithms to the extraction of hidden information. | advantage of distributed parallel computer systems
5 | data mining technique in detecting financial fraud with 13%, followed by both neural network and decision tree with 11%, while support vector machine is represented by 9% and naïve Bayes by 6%. Besides fraud detection. | This paper aims to review research studies conducted to detect financial fraud using data mining tools within one decade and communicate the current trends to academic scholars and industry practitioners | Besides this benefit, researchers can take advantage of knowing the most frequently used methods
6 | the results should be evaluated, and its importance should be explained. Usually, in classification problems, a matrix of complexity is a useful tool | The main purpose of this research is to study the method based on data mining. This method can be studied in various fields such as identification, forecasting, and crime prevention using data mining tools and algorithms and using the existing database and their military arrangement at the crime scene to prevent crime. | advantage of the industrial opportunity to reap the true benefits of advanced computing technology in areas that include position monitoring, intelligent control, and database knowledge discovery
7 | Recommends seven general steps for mining health care claims (or insurance claims) to detect fraud and abuse (after preprocessing of data): 1) Identifying the most important attributes of data by domain experts | The scale of this problem is large enough to make it a priority issue for health systems. | As advantages, Shin et al. used a simple definition of anomaly score and extracted 38 features for detecting abuse
8 | This increases the risk of reviewer bias and human error affecting results, especially with regards to the classification of borderline cases | Technical issues with a module of the Dark Web Portal | Behind web data comes the other significant data source, email, which has the advantage of being both long established and well used
9 | accountable for executing a specific task and then synthesizing these intermediate results for the final solution | We also describe some challenging issues of big data analytics in the context of security and criminal investigation perspectives. | We identify five major technologies namely, link analysis, intelligent agents, text mining, ANNs, and ML which have been widely used in various domains for developing the technical foundations of an automated security and criminal investigation system
10 | we recognized that the data-mining theory such as multivariate data analysis (linear regression, logistic regression, cluster analysis) and artificial intelligence technique | It is difficult to determine the definition of money laundering. Drugs and Crime estimated the figure to be some 3.6% of global GDP (2.3%-5.5%), equivalent to about US$2.1 trillion (2009), in which is included 2.7% of global GDP (2.3%-5.5%) | the advantages of the decision to the same degree as the decision maker himself” Luhmann, N., Risk (2008), 4th ed., New Brunswick, New Jersey, p. 68
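The "matrix of complexity" mentioned in the results above is more commonly called a confusion matrix, and it is where the precision and recall figures quoted in the evaluation measures come from. The sketch below tallies the four cells for one class of interest and derives both metrics. The label lists are hypothetical, chosen only so that the outcome mirrors the reported 70% recall and 100% precision; they are not data from any reviewed paper.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Tally true/false positives and negatives for one class of interest."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def precision_recall(counts):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 10 true suspects, 7 of them found, no false alarms
actual = ["suspect"] * 10 + ["other"] * 5
predicted = ["suspect"] * 7 + ["other"] * 3 + ["other"] * 5

counts = confusion_counts(actual, predicted, positive="suspect")
precision, recall = precision_recall(counts)
print(dict(counts), precision, recall)
```

High precision with lower recall, as here, means every flagged case was correct but some true cases were missed, which is why the two numbers are usually reported together.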
Sr. # | SUMMARY
1 | summary of data mining applications in crime that can act as a quick reference guide for researchers.
2 | To predict crime and analyze crime activity we need to proceed with a systematic approach with data mining. By using a data mining system one can predict locations that have a huge probability
3 | To summarize, malware detection is now conducted in a client-server manner with the cloud-based architecture
4 | The Internet of Things concept arises from the need to manage, automate, and explore all devices, instruments, and sensors in the world. In order to make wise decisions both for people and for the things in IoT, data mining technologies are integrated with IoT technologies for decision making support and system optimization
5 | Financial fraud has been a big concern for many organizations across industries and in different countries. This review provides a fast and easy-to-use source for both researchers and professionals, classifies financial fraud applications into a high-level and detailed-level framework
6 | human social conditions make confronting the phenomenon of crime inevitable. The main purpose of this research is to study the method based on data mining
7 | Inappropriate payments by insurance organizations or third party payers occur as a result of error, abuse or fraud. the technical methods used in KDD and data mining, and paid little attention to the practical implications of their findings for health care managers and decision makers
8 | Internet accessibility is widening and an increasing amount of crime is taking on a digital aspect. This study could be used as a basis for deeper systematic exploration of the literature regarding a particular subgroup of topics or techniques identified in our results.
9 | The purpose of this review article is to provide researchers and practitioners with a retrospective view of several methodologies and technologies. We identify five major technologies namely, link analysis, intelligent agents, text mining, ANNs, and ML which have been widely used in various domains for developing the technical foundations of an automated security and criminal investigation system
10 | The detection and decision making process should be done using a risk-based approach with his professional knowledge. However, there is no specific rule or standard as to what constitutes suspicious activities
