Digital Forensics Using Data Mining: Abstract
Abstract:
In this paper, we review the challenges that forensics analysts face when examining and
analysing the large amounts of data gathered as digital evidence. Digital forensics is the
process of identifying, preserving, extracting and documenting digital evidence. Forensic
investigations become very complex when dealing with large volumes of data; as a result,
more time is spent while fewer outputs and results are produced. Using data mining
techniques alongside the digital forensics process helps to overcome this issue. We discuss
different proposed methods that combine digital forensics with data mining techniques. One
base paper is selected and analysed comprehensively in this paper.
1 Introduction:
Digital forensics, also known as computer forensics, is the process of identifying, collecting,
analysing and examining digital evidence while preserving the information and maintaining
the integrity of the evidence. The main goal of digital forensics is to preserve electronic
evidence in its original form throughout the investigation process. An investigation in digital
forensics has four phases: collection, examination, analysis and presentation. In the collection
phase, digital evidence is collected from the suspect's device. This may include seizing
devices present at the scene of the crime, as well as other devices that may contain potential
digital evidence, with maximum care taken to avoid contaminating the evidence. The second
phase is the examination phase, where data is extracted from the evidence containers and then
examined. It also includes the extraction of relevant pieces of information and the recovery of
any deleted files and folders. In the analysis phase, the extracted data is then analysed in detail
to draw conclusions and results. The presentation phase is the final phase, where the forensics
analyst prepares and presents detailed reports of the investigation outcomes from the analysis
phase. Figure 1.1 shows the digital forensics investigation process.
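The integrity requirement described above is commonly supported in practice by recording a cryptographic hash of each evidence item at collection time and recomputing it later. The following is a minimal sketch of that idea, not a method taken from the reviewed papers: the choice of SHA-256, the function names and the throwaway file standing in for an evidence image are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_digest):
    """True if the file still matches the digest recorded at collection."""
    return sha256_of_file(path) == recorded_digest


# Demo on a throwaway file that stands in for an evidence image (hypothetical).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"example evidence bytes")
    evidence_path = handle.name

recorded = sha256_of_file(evidence_path)       # stored at collection time
print(is_unchanged(evidence_path, recorded))   # True: nothing has changed

with open(evidence_path, "ab") as handle:      # simulate contamination
    handle.write(b"!")
print(is_unchanged(evidence_path, recorded))   # False: digest no longer matches

os.remove(evidence_path)
```

Reading in chunks keeps memory use flat, which matters for the large evidence volumes this paper is concerned with.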
Today, every organization holds a huge amount of data, including employee details,
contracts, business information and other sensitive information. All of these details are stored
in some kind of database, either relational or non-relational (referred to as NoSQL). In digital
forensics, the hard part is getting the data out of these large databases for analysis. Forensics
analysts, law enforcement and intelligence agencies face an extremely difficult challenge in
analysing the large volumes of data collected from organizational databases involved in
crimes and terrorism. A suitable technique for tackling this issue is data mining. Data mining
is a scientific method of extracting meaningful information from existing databases. By using
data mining techniques, we can identify different patterns in related data and also construct
relationships among data to perform analysis. This comes in handy in the analysis phase of
digital forensics, as it saves time and makes the work easier for the forensics analyst. We can
also generate crime-related patterns through data mining to anticipate future crimes. Just like
digital forensics, data mining also follows a set of general phases: data collection, data
storage and management, data access and operations, and data presentation. Figure 1.2 shows
the process of data mining below:
In general, digital forensics software only assists the forensics analyst during investigations.
Solving a crime takes a lot more than assistance, and this is where data mining techniques for
digital forensics become useful. With data mining, not only can a large amount of data be
analysed to produce crime-related patterns and correlations, but investigation time and
complexity are also reduced, which is a basic need of digital forensics.
2 Related Work
This section discusses solutions and frameworks proposed for digital forensics through data
mining. Researchers have introduced many frameworks for extracting meaningful data and
crime-related patterns from databases and various other sources of data. Some of them are as
follows:
In 2016, Ashwinkumar Malwadkar and Prof. Sonali Patil [2] proposed a system for digital
forensics comprising computer forensics tools and data mining techniques. A data mining
algorithm based on the Apriori algorithm was proposed to work with the data gathered by the
computer forensics tools and to find crime patterns, correlations and associations between
data items.
In 2012, K. K. Sindhu and B. B. Meshram [3] proposed a tool for digital forensics that finds
the motive, cyber crime patterns and the frequency with which a specific attack happened
over a certain period of time. The proposed system combines digital forensics investigation
with crime data mining techniques. Data gathered from different areas, e.g. network traffic,
file system analysis and log file analysis, was analysed by a data mining algorithm.
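To make the idea of "frequency of a specific attack over a certain period of time" concrete, the sketch below counts attack labels inside a time window. It is an illustration only and does not reproduce the tool of Sindhu and Meshram: the log records, the attack labels and the function name are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical, already-parsed log records: (ISO timestamp, attack label).
LOG = [
    ("2012-03-01T10:15:00", "port_scan"),
    ("2012-03-01T11:02:00", "sql_injection"),
    ("2012-03-02T09:30:00", "port_scan"),
    ("2012-03-05T14:45:00", "port_scan"),
    ("2012-03-09T08:00:00", "brute_force"),
]


def attack_frequency(records, start, end):
    """Count each attack label whose timestamp satisfies start <= t < end."""
    counts = Counter()
    for stamp, label in records:
        when = datetime.fromisoformat(stamp)
        if start <= when < end:
            counts[label] += 1
    return counts


window = attack_frequency(LOG, datetime(2012, 3, 1), datetime(2012, 3, 6))
print(window.most_common())  # [('port_scan', 3), ('sql_injection', 1)]
```

In a real investigation the records would come from parsed network captures or log files rather than a hard-coded list.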
In 2018, Raburu George, Omollo Richard and Okumu Daniel [4] proposed different data
mining techniques for digital forensics to extract data from extremely large databases. The
main focus was on merging data mining techniques with digital forensics to extract digital
evidence and reduce complexity.
In 2012, Rosamaria Bertè, Fabio Marturana, Gianluigi Me and Simone Tacconi [5] proposed
an approach for digital forensics investigations based on data mining techniques and
knowledge management theory (KMT). The main aim of the proposed system was to speed
up investigations by prioritizing seized devices for analysis.
In 2015, Prof. Sonal Honale and Jayshree Borkar [6] presented a framework for live digital
forensics using data mining. The data mining algorithms used are K-Means and the Apriori
algorithm, which determine the cyber attacks that occur and count the number of times any
specific attack occurs during system running time. Various system tools are also used, namely
WinPcap, Jpcap and WMIC.
In 2015, Peng Cheng and Hui Qu [7] proposed a model for digital forensics combined with
data mining. Data mining techniques were applied during the data analysis process. A
network traffic forensics model based on Hadoop was proposed. The model monitors network
traffic and performs real-time collection of traffic data for sampling and analysis.
2.8 Data Mining Methods Applied to a Digital Forensics Task for Supervised Machine Learning [8]
2.9 Imminent accession of Artificial Intelligence based Forensic Exploratory with Data Mining Analysis
In 2017, S. Umar, A. Praveen, S. Gouse and N. Deepthi [9] discussed the role of data mining
techniques in digital forensics. They also discussed how, and at which stage of the procedure,
data mining techniques should be applied in digital forensics investigations. Different
methods are discussed: association, grouping, classification and "unpredictable" methods.
All of the methods discussed above have their own advantages and disadvantages. A brief
summary of the literature is given in Table 2.1 below:
Table 2.1

[1] Model Based Approach
    Description: A model combined with data mining techniques and system tools for
    gathering data.
    Tools and parameters: WEKA, which is a collection of data mining algorithms, is used;
    electronic mail messages are used as unstructured data.
    Limitations: It works well for small data sets, but memory can overflow if the data
    volume is very large.

[4] Techniques Based Approach
    Description: Suggests a way to implement data mining techniques at the right place and
    the right time to reduce investigation time and complexity.
    Tools and parameters: No tools are used. It is just a literature review of certain data
    mining techniques.
    Limitations: A non-standard model-based diagram shows the point where data mining
    should be applied.

[6] Framework Based Approach
    Description: Framework for live digital forensics with K-Means and the Apriori
    algorithm during system running time.
    Tools and parameters: Tools used are WinPcap, Jpcap, WMIC, K-Means and the Apriori
    algorithm. Parameters used are network traffic packets and log files.
    Limitations: The Apriori algorithm can be slow for large item sets. Another problem
    with the Apriori algorithm is that it works well for finding events that occur
    frequently, but at the same time it is very hard to find rarely occurring events.
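The last limitation in the table, that frequent events are found easily while rare ones are missed, follows from the minimum-support threshold used in frequent-pattern mining. The sketch below shows the effect on single events only (not full Apriori). The event names, counts and threshold values are made up for illustration.

```python
from collections import Counter

# Hypothetical event stream: two common events and one rare event.
EVENTS = ["login_failure"] * 40 + ["port_scan"] * 9 + ["data_exfiltration"] * 1


def frequent_events(events, min_support):
    """Keep events whose relative frequency is at least min_support."""
    counts = Counter(events)
    total = len(events)
    return {e: n / total for e, n in counts.items() if n / total >= min_support}


# With a 10% threshold the rare (but possibly most interesting) event is lost.
print(sorted(frequent_events(EVENTS, 0.10)))  # ['login_failure', 'port_scan']

# Lowering the threshold recovers it, at the cost of keeping many more
# candidates, which is one reason Apriori slows down on large item sets.
print(sorted(frequent_events(EVENTS, 0.01)))
```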
Data mining is a scientific field concerned with constructing interesting structures from data
that is already stored in a certain format. These structures are essentially patterns, or
statistical or graphical representations of the data. Data mining is still a young field in the
context of criminal and intelligence analysis. Cyber crime data mining techniques that are
helpful for solving crimes are clustering, association, deviation detection, classification, and
string operator techniques. Below is a diagram that shows these data mining techniques:
The data mining algorithm used is the Apriori algorithm, which finds associations. It works in
the following steps:
1: In the first step, we identify the item sets from the evidence case report.
2: Then we make a set of items/variables, Item Set Is = {I1, I2, I3, I4, I5, ..., In}.
3: In the third step, we identify/determine the action set that performs actions on the item
set, Action Set As = {A1, A2, A3, A4, ..., AN}.
The item sets extracted from the case report are stored as attributes of different tables.
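As a rough illustration of how Apriori finds frequent item sets, the sketch below implements the textbook level-wise version (count, join, prune). It is not the base paper's implementation: the case records, item names and the `min_support` count are hypothetical, and the action set As is not modelled.

```python
from itertools import combinations

# Hypothetical item sets, one per evidence case report.
CASES = [
    {"phishing_email", "fake_login_page", "credential_theft"},
    {"phishing_email", "fake_login_page"},
    {"phishing_email", "credential_theft"},
    {"port_scan", "brute_force"},
    {"phishing_email", "fake_login_page", "credential_theft"},
]


def apriori(transactions, min_support):
    """Return {frozenset(items): count} for item sets seen in at least
    min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count step: how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-item sets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its k-1 subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


result = apriori(CASES, min_support=3)
for items, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(items), count)
```

With these sample cases, the pair of "fake_login_page" and "credential_theft" appears in only two reports, so it falls below the threshold and the three-item set is pruned without being counted.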
Association rules are based on the idea that a second event tends to occur as a result of a first
event. We denote this as X ------> Y, where the arrow shows the association between events
X and Y: if X occurs, Y is expected to occur as well. A flowchart diagram shows the working
of the proposed system.
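The strength of a rule X ------> Y is usually measured by its support (how often X and Y occur together) and its confidence (how often Y occurs among the cases that contain X). The sketch below computes both. These measures are standard in association rule mining, but the case data and function names here are hypothetical and are not taken from the base paper.

```python
# Hypothetical item sets, one per evidence case report.
CASES = [
    {"phishing_email", "fake_login_page", "credential_theft"},
    {"phishing_email", "fake_login_page"},
    {"phishing_email", "credential_theft"},
    {"port_scan", "brute_force"},
    {"phishing_email", "fake_login_page", "credential_theft"},
]


def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    return count(transactions, itemset) / len(transactions)


def confidence(transactions, x, y):
    """Share of the transactions containing X that also contain Y."""
    x_count = count(transactions, x)
    if x_count == 0:
        return 0.0
    return count(transactions, set(x) | set(y)) / x_count


# fake_login_page ------> phishing_email holds in every case that has X.
print(confidence(CASES, {"fake_login_page"}, {"phishing_email"}))  # 1.0
# The reverse rule is weaker: one phishing case has no fake login page.
print(confidence(CASES, {"phishing_email"}, {"fake_login_page"}))  # 0.75
```

The two results differ because confidence is directional, which is why the arrow in X ------> Y should not be read as a symmetric relationship.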
The proposed system is a combination of digital forensics tools and data mining techniques.
Its flow of work is shown in the complete system diagram below:
4 Conclusion:
Digital forensics is the process of identifying, preserving, extracting and presenting digital
evidence. Digital evidence is data gathered from the devices of suspects or criminals. These
devices range from small to large and include mobile phones, PDAs, laptops, iPods,
computers and other electronic devices. The amount of data can be very large, and it becomes
very difficult for investigators to analyse it. Because of this, forensic investigations can be
very time-consuming and complex. To overcome this issue, data mining techniques can be
used. Data mining is the process of developing data structures that reveal crime-related
patterns, so that large amounts of data can be analysed with minimum time and complexity.
This research addressed the need for data mining techniques in the digital forensics process.
Different proposed methods that combine digital forensics with data mining techniques were
discussed. Using data mining techniques along with digital forensics techniques can help
forensics analysts and intelligence agencies to reduce the complexity of their work and to
reach results in less time.
5 References:
[1] Chrysoula Tsochataridou, Avi Arampatzis and Vasilios Katos, "Improving Digital
Forensics Through Data Mining", Department of Electrical and Computer Engineering,
Democritus University of Thrace, Xanthi, Greece.
[2] Ashwinkumar Malwadkar and Sonali Patil, "Data Mining Techniques for Digital Forensic
Analysis", Department of Information Technology, K. J. Somaiya College of Engineering,
Mumbai, Maharashtra.
[3] K. K. Sindhu and B. B. Meshram, "Digital Forensics and Cyber Crime Data Mining",
Computer Engineering Department, Shah and Anchor Kutchhi Engineering College, Mumbai,
India; Computer Engineering Department, Veermata Jijabai Technological Institute, Mumbai,
India.
[4] Raburu George, Omollo Richard and Okumu Daniel, "Applying Data Mining Principles in
the Extraction of Digital Evidence", Department of Computer Science and Software
Engineering, Jaramogi Oginga Odinga University of Science and Technology, Kenya.
[5] Rosamaria Bertè, Fabio Marturana, Gianluigi Me and Simone Tacconi, "Data Mining
based Crime-Dependent Triage in Digital Forensics Analysis", Department of Computer
Science, Systems and Production, University of Tor Vergata, Rome, Italy; Servizio Polizia
Postale e delle Comunicazioni, Rome, Italy.
[6] Sonal Honale and Jayshree Borkar, "Framework for Live Digital Forensics using Data
Mining", Computer Science and Engineering Department, Aabha Gaikwad College of
Engineering, Nagpur, India.
[7] Peng Cheng and Hui Qu, "A digital forensic model based on data mining", Training
Department, Engineering College of CAPF, Xi'an 710086, China; Faculty of Science,
Engineering College of CAPF, Xi'an 710086, China.
[8] Antonio J. Tallón-Ballesteros and José C. Riquelme, "Data Mining Methods Applied to a
Digital Forensics Task for Supervised Machine Learning", Department of Languages and
Computer Systems, University of Seville, Reina Mercedes Avenue, Seville, 41012 Spain.
[9] S. Umar, A. Praveen, S. Gouse and N. Deepthi, "Imminent accession of Artificial
Intelligence based Forensic Exploratory with Data Mining Analysis", Department of Computer
Science Engineering, MLRIT, Hyderabad; Department of Computer Science Engineering,
IARE, Hyderabad; Department of Computer Science Engineering, MLRIT, Hyderabad;
Department of Computer Science Engineering, CMR ENGG & TECH, Hyderabad.