
This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3343360

Date of publication xxxx 00, 0000, date of current version July 11, 2023.
Digital Object Identifier 10.1109/ACCESS.2017.DOI

Cryptographic Techniques for Data Privacy in Digital Forensics
TAIWO BLESSING OGUNSEYI¹ AND OLUWASOLA MARY ADEDAYO² (MEMBER, IEEE)
¹ International Faculty of Applied Technology, Yibin University, Yibin 644000, Sichuan Province, China
² Department of Applied Computer Science, The University of Winnipeg, Manitoba, Canada
Corresponding author: Oluwasola Mary Adedayo (e-mail: [email protected])
This work was supported by the research and development program of Yibin University and grants provided by the University of Winnipeg.

ABSTRACT The acquisition and analysis of data in digital forensics raise different data privacy challenges.
Many existing works on digital forensic readiness discuss what information should be stored and how
to collect relevant data to facilitate investigations. However, the cost of this readiness often directly
impacts the privacy of innocent third parties and suspects if the collected information is irrelevant.
Approaches that have been suggested for privacy-preserving digital forensics focus on the use of policy,
non-cryptography-based, and cryptography-based solutions. Cryptographic techniques have been proposed
to address issues of data privacy during data analysis. As the utilization of some of these cryptographic
techniques continues to increase, it is important to evaluate their applicability and challenges in relation
to digital forensics processes. This study provides digital forensics investigators and researchers with
a roadmap to understanding the data privacy challenges in digital forensics and examines the various
privacy techniques that can be utilized to tackle these challenges. Specifically, we review the cryptographic
techniques applied for privacy protection in digital forensics and categorize them within the context of
whether they support trusted third parties, multiple investigators, and multi-keyword searches. We highlight
some of the drawbacks of utilizing cryptography-based methods in privacy-preserving digital forensics and
suggest potential solutions to the identified shortcomings. In addition, we propose a conceptual privacy-
preserving digital forensics (PPDF) model that is based on the use of cryptographic techniques and analyze
the model within the context of the above-mentioned factors. An evaluation of the model is provided
through a consideration of identified factors that may affect an investigation. Lastly, we provide an analysis
of how existing principles for preserving privacy in digital forensics are addressed in our PPDF model.
Our evaluation shows that the model aligns with many of the existing privacy principles recommended for
privacy protection in digital forensics.

INDEX TERMS Cryptographic Techniques, Data Privacy, Digital Forensics, Forensic Readiness, Privacy-
Preserving Digital Forensics

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4

I. INTRODUCTION
The rise in the rate of occurrence of cybercrimes can, on one hand, be attributed to the increasing use of interconnected computers and hand-held devices and their ability to store huge amounts of information. On the other hand, it could be attributed to the increased level of digitization [1]. This advancement has made it possible for a user to utilize multiple devices and to access numerous digital services daily [2], which in a way provide digital footprints of the user’s everyday life. In addition, this advancement has caused a growing digital dependence and has aided digital evidence in finding its way to the courtrooms [3].

Digital forensics has become a part of many investigations in cases where computers and digital devices have been used to facilitate a crime, where they are the object of the crime, or in cases where they contain information relevant to an incident [4]. The significant development that has been experienced in the field of digital forensics over the last two decades can be attributed to the growing body of scientific research that now exists and the engaging initiatives from organizations such as the National Institute of Standards and Technology. These initiatives, for example, the Computer Forensics Tools Testing (CFTT), National Software Reference Library (NSRL), and the Computer Forensics Reference
Data Sets (CFReDS) [5], [6] have been instrumental in the validation of new tools and the provision of research data sets that have led to increased and diverse contributions in the field. Digital forensics, as an essential domain of forensics, seeks to extract evidence from computers and other digital devices to help uncover crime. Digital content, including audio data, images, videos, logs, emails, metadata, and cache data, existing on many devices can be used by law enforcement to understand the details of an event or find supporting evidence during an investigation. In many cases, however, these digital devices contain other information, including personal, business-related, health-related, and financial records and confidential information, that may be exposed during the analysis of the device despite its irrelevance to the event being investigated. Given that forensic investigators usually have full access to devices that are considered pertinent to an incident, access to this information threatens the privacy of those whose information may be on the device [7].

FIGURE 1. Digital Forensics Processes.

Addressing data privacy in digital forensics seems contradictory, as the latter involves the extraction of all data for investigation, while the former advocates the need for the control of data access. However, the question remains: if a suspect’s privacy is infringed on and personal information is revealed in a bid to uncover a crime, what happens when such a suspect is found innocent? Likewise, if the device contains information about the device owner’s relationship with other third-party individuals, how is the third party’s privacy protected?

Privacy protection involves the right to control one’s data, including identity, personal data, and personal activities [8]. Although the collection and analysis of such information may sometimes be important, collecting only relevant data during an investigation while ignoring the non-relevant data is a key point for privacy protection in digital forensics [9]. Balancing the needs of a forensic investigator to support a fair trial with the privacy rights of those being investigated or those associated with them is a quest where both aspects conflict with each other [10], [11]. As a branch of forensic science, digital forensics focuses on the application of scientific methods in the investigation of evidence present in digital devices for understanding and reconstructing the sequence of events that have transpired in the generation of the said evidence [12]. It ensures that the digital forensics processes (depicted in Figure 1) of identification, preservation, acquisition, examination, analysis, and presentation of digital evidence are completed in a legally acceptable manner [13]. Regardless of whether the information on a device is relevant or not, the overall goal of these processes is to provide information that supports or refutes a hypothesis about an incident. Although the amount of non-relevant information is often significantly more than the relevant data, there has been less focus on how to preserve an individual’s privacy in such data [14] or the effect that such privacy breaches may have on the subject involved.

Existing works that have considered the issue of privacy in digital forensics have mostly described principles and investigative guidelines that can be applied to different stages of the forensic processes. The goal of this article is to provide forensic investigators and researchers with a roadmap for developing practical approaches to address some of the data privacy challenges in digital forensics. To achieve this, we examine various privacy techniques that have been proposed to tackle privacy challenges in digital forensics, suggest potential solutions to the drawbacks associated with these techniques, and, with these drawbacks in mind, propose a conceptual model that strives to preserve user privacy in digital forensics. More specifically, the main contributions of this study are as follows:
• We review the use of cryptographic techniques for privacy protection in digital forensics and analyze their characteristics within the context of whether they support trusted third parties, multiple investigators, and multi-keyword searches. We refer to these three factors as our analysis factors.
• We propose a simple conceptual model for a privacy-preserving digital forensics investigation process, describe how each of the cryptography-based techniques may be used within the model, present the mathematical representation and algorithm of the model, and examine the model within the context of the identified analysis factors.
• Lastly, we explore some of the analysis factors that may come into play in the use of the model and evaluate how the model aligns with some of the existing principles and guidelines for preserving privacy in digital forensic investigations.

The rest of the paper is organized as follows. Section 2 gives an overview of privacy challenges and examines how and where privacy concerns may arise in each subdomain of digital forensics during an investigation. In Section 3, we discuss the different cryptographic techniques, how they have been utilized for privacy protection in digital forensics, and the drawbacks of, as well as potential solutions to, applying cryptographic techniques for privacy protection in digital forensics. Section 4 presents a privacy-preserving conceptual model for digital forensic investigation, highlighting the entities involved in the model, the investigation model, its analysis factors, and mathematical representation. Section 5 provides some discussion and evaluation of the model, and Section 6 contains the conclusions and some future research work.

II. PRIVACY CHALLENGES. WHOSE PRIVACY? WHAT PRIVATE DATA?
To address the issues of privacy in digital forensics and provide some context, it is important to establish how and where private data about individuals may be exposed during investigations. This section discusses privacy challenges from the perspective of whose privacy may be impacted, and what data may be considered private in different domains of digital forensics.

Because digital forensics is mostly concerned with the holistic acquisition and investigation of digital evidence, an underlying principle of digital forensics is that forensic investigations must be reliable, complete, accurate, and verifiable. Existing studies [11], [15]–[17] have shown that finding a balance between acquiring case-relevant information and invading users’ privacy is a major challenge. In an investigation, data may be retrieved from devices belonging to the suspect, victim, or even witnesses [18]. Although the main focus of the digital forensics process is to collect and analyze data that is relevant to the investigation, many digital devices will contain information that belongs to individuals or entities other than the primary user of the device, and an investigator may need to access such information simply to determine its relevance. Privacy concerns can be viewed in relation to four main entities during investigations: the primary owner of a device, third parties, secondary users, and service providers. These entities are similar to those described by [11].

A. WHOSE PRIVACY?
One aspect of privacy concerns relates to how private information about the primary owner of a device is handled, especially when such information has no benefit to the investigation. Analyzing medical history, browsing and buying patterns, communication metadata, and cell phones (which tend to hold a greater quantity of information on lifestyles, associations, and activities) can provide a complete view of individuals’ lives [15]. The direct or indirect exposure of a suspect’s information raises privacy concerns if the suspect is later found not guilty. As such, a suspect’s confidential information should be kept private by forensic investigators until the suspect is found guilty. Satisfying the requirement that there should be no bias in the analysis of digital evidence in an investigation implies that the possibility of being guilty or innocent should be considered, as well as the impact of the investigation on a device owner after the investigation [19].

Even though private data collected on a device will mostly belong to the primary owner in many cases, such data may also expose details about their relationships or interactions with third parties, that is, individuals who are not directly involved in a crime and/or are not a suspect or the victim. For example, communication records and photos or videos about family members, friends, or colleagues may be present on the device. The need for users’ informed consent when collecting data from their devices is still a concern in digital forensics, and there have been recommendations that informed consent from a user should be simply spelled out, complete, and explicable by the users [20]. While some of the information on a device may be relevant to an investigation, and the examination of data may be done with the owner’s consent, such consent does not extend to the private information of third parties. However, obtaining the consent of third parties before viewing their information for relevance or further examination is not feasible in many cases; thus, preserving third parties’ privacy becomes the responsibility of the investigator.

Another privacy concern relates to data about secondary users (who are unrelated to the investigation) on a shared or multi-user device. In some cases, the device may provide mechanisms to separate each user’s data, e.g., users with different profiles. However, where this is not the case, e.g., in the case of a shared account or when an investigation involves a corporate email server, the privacy of secondary users may be violated during the examination and analysis of data. Such information should be separated and handled in a way that protects the secondary user’s privacy.

Privacy concerns may also involve information relating to service providers, since their interaction with the device may result in some details of their applications, application interfaces, or application functions being stored locally on the device. For example, information about endpoints, confidential information about accounts operated by the primary user, or purchase records may be exposed in the investigation process. Determining whether such information is relevant to an investigation should be done on an individual basis, taking into account the possible implications for both the device owner and the service provider [11].

Lastly, when dealing with service providers or different jurisdictions, the interpretation of privacy is also a challenge. Although privacy is understood in similar ways at a high level, different jurisdictions may understand specific details and interpret privacy differently [20]. Since data may be stored on servers in different countries (e.g., for cloud forensics), what is considered user privacy infringement in one jurisdiction might not be in another jurisdiction. The situation is made more complex when a cloud service provider (CSP) is using services from another CSP located in a separate jurisdiction [21]. Therefore, it is important to be aware of the differences that may exist in such interpretations during an investigation.

B. WHAT PRIVATE DATA?
Tracing back to the origin of digital forensics in the late 1990s, when computer forensics was done by law enforcement personnel with computing expertise, the field has grown
to be a significant part of any investigation [22]. Coupled with the growing use of the internet, mobile devices, and other technological advances, different subdomains of digital forensics such as mobile forensics, cloud forensics, network forensics, and Internet of Things (IoT) forensics have emerged to address the challenges of handling various types of data and analysis techniques in different aspects of computing. Although the nature of the information being examined may differ between subdomains, the issue of privacy is a concern in almost every subdomain. In what follows, we give an overview of some domains of digital forensics and give a practical indication of how privacy concerns may arise in the different domains as shown in Figure 2.

FIGURE 2. Digital forensics subdomains.

1) Network Forensics
Network forensics is a branch of digital forensics that deals with network-related investigations and may involve the tracking of external and internal network attacks by focusing on inherent network vulnerabilities and communication mechanisms [23]. It involves the identification, capturing, analysis, and reconstruction of network events to discover evidential information about the source of security attacks in a way that preserves the integrity of the data. Network forensics may deal with either dynamic or static data depending on whether the collection and analysis are done on the fly or post-mortem. Some aspects of network forensics include web forensics, which involves the analysis of web browsers and web servers to collect user information; email forensics; and cloud forensics, which focuses on investigating incidents that occur primarily in the cloud environment.

Network data or traffic captured for network forensics contains a lot of information about a user’s activities. This may include websites visited, the amount of time spent on each webpage, details of successful and unsuccessful login attempts, unencrypted credentials, records of illegal file download or intellectual property abuse, accessed multimedia files, emails, email attachments, and other documents sent or retrieved over the network [24]. Most of the existing network forensics frameworks focus on the collection of data and have major impacts on the privacy of the primary network user, third parties, and external parties in many cases [25].

2) Internet of Things (IoT) Forensics
IoT forensics is a relatively new sub-domain of digital forensics [19] that evolved due to the increase in IoT devices, cybercrimes related to these devices, and the machine-to-machine (M2M) communication enabled by IoT technology. It focuses on identifying, acquiring, and analyzing evidential information from IoT infrastructures and devices such as wearables, small devices, sensors, connected cars, and RFID for investigative purposes.

IoT forensics processes comprise three levels of forensics, namely, device-level forensics, network-level forensics, and cloud-level forensics [26]. Due to the distributed nature and heterogeneity of IoT infrastructures as well as limitations of digital forensics tools, IoT forensics at the network and cloud levels has not been fully adopted in digital forensics, and efforts to design standard frameworks, models, or methods for IoT forensics are still in their infancy [27]–[29]. However, there have been attempts to extract and analyze data from devices such as Google Home and Google Assistant apps [30].

Although the fact that many devices store data only for a short time is a challenge for IoT forensics, the traces left behind and the way IoT devices and frameworks are used imply that a significant amount of information such as user activities, network traffic, system logs, communication and network usage patterns, and private data about device users can be collected, depending on the device being examined or the scenario. In addition to privacy concerns relating to the primary owner of an IoT device, data collected during IoT forensics may affect the privacy of third parties, external users, and service providers as earlier described.

3) Database Forensics
Database forensics is the branch of digital forensics that extracts evidential information from database systems [31]. It is related to the study of metadata and the application of investigative techniques to database contents and metadata. Database forensics investigation focuses on artifacts such as database logs, schema, data structure, metadata (file system), storage engine, etc. [32] and may involve the inspection and validation of the timestamps relating to data updates to validate a user’s action.

Because databases are used to store critical and sensitive information in almost all computing systems and applications, they serve as a significant source of information that can be useful for forensic analysis. Much of the critical and sensitive information stored in a database, for example, information about an application or its operation, an organization, processes on a device, or location or transaction histories, creates privacy concerns both for the primary user of the database and for other entities that may be involved. Many of the existing process models for database forensics [33] focus on the ability to recover evidential information from a database and do not consider how the privacy of those whose data are stored in the database being investigated may be impacted.

4) Multimedia Forensics
Multimedia forensics involves attempts to explore, analyze, and retrieve information about multimedia such as images, audio, video, and text. Specifically, it is the analysis of digital multimedia content to produce evidence in the forensics domain [34]. Image forensics, as an aspect of multimedia forensics, investigates images by analyzing the authenticity and integrity of data to detect forgeries or manipulations as well as trace the history of the image. In audio forensics, tools and techniques of audio engineering and digital signal processing are applied to study audio data as part of a legal proceeding or for either civil or criminal investigations [35].
Video forensics focuses on the examination, comparison, and evaluation of video for investigative purposes.

While multimedia data being examined may be relevant to an investigation, it may also contain images, conversations, or videos of other individuals or portions that may be irrelevant to an investigation. For example, in audio or video forensics, a recording can provide a real-time, eyewitness account of an event [36] but may also capture personal conversations or events that the entities involved may prefer not to disclose. Also, images being analyzed may look compromising for an entity, even though they may eventually be found to be inauthentic.

5) Computer Forensics
Computer forensics focuses on the procedure of obtaining and analyzing computer-related information including data files, hard disks, file storage, etc. [34]. Some aspects of computer forensics include disk forensics, memory forensics, and file system forensics. Disk forensics usually involves the process of extracting the content of a file or recovering the contents of a deleted file [36] from a disk drive. The process used to achieve this depends on the data in the disk or its partition.

In memory forensics, live analysis of the RAM can provide detailed information on executed commands in a system, running processes, internet history, system credentials, etc. [37]. File system forensics focuses on the application of knowledge about a file system in discovering evidence and recovering deleted data [38]. Although the ability to retrieve metadata, access encrypted data, find hidden information, or extract data from unallocated spaces on a disk may have significant benefits for digital forensics investigations, many of these processes carry the risk of exposing sensitive private data that may turn out to be irrelevant to an investigation.

6) Mobile Forensics
Mobile device forensics involves the analysis of mobile phones and devices to recover digital evidence. From databases that store GPS records, chat history, and messages, to records and logs about applications installed on a phone or mobile device, the collection and analysis of data from mobile devices can expose several private details about the device’s user, third parties, secondary users, or even service providers, depending on the scenario being investigated.

Lastly, we note that although collection and analysis of evidence (including digital evidence) are expected to follow certain rules in different countries [39], the need to follow all “reasonable lines of inquiry” in investigations requires that privacy concerns are considered from the onset of an investigation and built into the digital forensics processes. Unfortunately, many of the existing digital forensics investigation models focus mainly on technical aspects of investigations and do not address the issue of privacy in digital forensics. The works [10], [11] that have considered the issue of privacy in digital forensics have mostly provided principles and investigative guidelines that can be applied to different stages of the forensic processes to enhance privacy preservation during analysis, but there are only a handful of solutions that implement or support these guidelines practically. In Section 5, we discuss how some of these principles align with our proposed conceptual approach for privacy preservation in digital forensics.

TABLE 1. Summary of the Privacy Principles and Policies for Digital Investigation

S/No | Privacy-Principles [11] | Privacy-Policies [40]
1 | Before conducting an investigation, it is crucial to define its scope to ensure it is proportionate and justifiable. Privacy should be safeguarded through appropriate measures, and documented for transparency. | Make two identical hard disk copies and leave one in an environment trusted by the affected party.
2 | Full-scale data extraction should only be used when targeted methods could jeopardize the investigation’s integrity due to a significant risk. | Remove any unneeded data using specialized erasure tools, such as Evidence Eliminator.
3 | Full-scale data extraction must be justified and supported by evidence, aligning with the specific circumstances of the ongoing investigation. | Limit the search for evidence to the goal of the investigation.
4 | The investigation scope should adapt as needed, with maximum privacy protection as the initial approach. Examination methods may expand if necessary to serve the investigation’s purpose. | Handle time-stamped events in the strictest confidence.
5 | Investigators should recognize when they have reached an investigation threshold and halt further probing activities. | Obtain packet acknowledgment via the use of a token rather than the IP address.
6 | The use of examined data for investigative purposes should undergo legal and procedural oversight. | Safely store all internal transaction logs.
7 | When targeted data extraction is not feasible, consider screening methods and criteria before a comprehensive data review. Evaluate their suitability and use them before resorting to a full data examination, as needed. | Preserve event logs in external nodes.
8 | The investigating authority must document its decision-making processes in creating an examination strategy and make them accessible for third-party evaluation to ensure transparency. | Ensure that organizational policy describes actionable items related to attacks.
9 | Investigating authorities should take steps to define and carry out digital device investigations consistently. | Establish policies to safeguard backed-up data relevant to an investigation.
10 | In cases of a suitable relationship between the investigating authority and the device owner, prompt communication of investigative strategies and any alterations is recommended. | Handle disposal of data in a secure manner.
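
As an illustration of policy 5 in Table 1 (using a token rather than the IP address), the following is a minimal sketch of how network records might be pseudonymized before analysis. It is not taken from [40]: the use of HMAC-SHA256, the function names make_token and pseudonymize, the record layout, and the per-case key are all assumptions made here for illustration. The idea is that the same address always maps to the same token under one key, so flows can still be correlated, while the raw address is not shown to the analyst.

```python
import hashlib
import hmac
import secrets


def make_token(ip_address: str, key: bytes, length: int = 16) -> str:
    """Return a keyed pseudonym (hex string) for an IP address.

    Hypothetical helper: HMAC-SHA256 is an assumption, not the scheme in [40].
    """
    digest = hmac.new(key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def pseudonymize(records, key):
    """Copy each record, replacing the src and dst addresses with tokens."""
    masked_records = []
    for record in records:
        masked = dict(record)  # leave the original record untouched
        masked["src"] = make_token(record["src"], key)
        masked["dst"] = make_token(record["dst"], key)
        masked_records.append(masked)
    return masked_records


# A per-case secret key; whoever holds it can recompute tokens for known addresses.
case_key = secrets.token_bytes(32)

# Illustrative records using documentation-only address ranges.
records = [
    {"src": "192.0.2.10", "dst": "198.51.100.7", "bytes": 512},
    {"src": "192.0.2.10", "dst": "203.0.113.5", "bytes": 2048},
]
masked = pseudonymize(records, case_key)
```

In this sketch, an authorized party holding the key could check whether a token corresponds to a specific address of interest by recomputing it, whereas an analyst without the key sees only tokens. Truncating the digest shortens the tokens at the cost of a higher collision probability.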

III. PRIVACY-PRESERVING TECHNIQUES FOR DIGITAL FORENSICS
Generally, techniques for data privacy protection in digital forensics can be categorized into policy-based, non-cryptography-based, and cryptography-based approaches. Policy-based approaches give data owners insight into how their private data should be collected, used, and disclosed if needed [9]. They also provide insight into how policies are developed to achieve privacy objectives without hindering law enforcement agents during criminal investigations, and to restrict access to unrelated files [40]. Existing works that have addressed privacy concerns in digital forensics through the specification of principles and guidelines provide details that can be incorporated into the definition of policies relating to privacy in digital forensics. For instance, [11] suggested a set of privacy-preserving data processing principles that define conduct that is indicative of privacy protection. These principles contain a set of investigative behaviors designed to balance the requirement for effective investigative processes with the need to prevent unnecessary invasion of privacy, particularly during the data extraction and examination stages of digital forensics. The principles are summarized in Table 1. Similarly, [40] proposed policies that could protect privacy, from both the user's perspective and the investigator's perspective, without hindering law enforcement investigations of crimes, as depicted in Table 1.

Non-cryptography-based techniques for privacy protection in digital forensics mainly focus on the use of blockchain technology [41]. As an emerging technology, blockchain has recently been adopted for privacy protection in digital forensics, particularly in cloud forensics [42], [43], IoT forensics [44], [45], mobile forensics [46], multimedia forensics [47], and data management (for maintaining a chain of custody) [48]. This is due to its decentralized and transparent information-sharing approach. However, the adoption of this technology for privacy protection in digital forensics, especially in the traditional digital forensics subdomain, is still in its infancy [49].

Cryptography-based techniques involve the use of encryption and have been adopted in many aspects of computing to provide confidentiality, integrity, authentication, and non-repudiation [50], so the application of these techniques in preventing or limiting access to information, as well as protecting private data, is not new. However, the application of cryptographic techniques in digital forensics is relatively new, with only a small number of studies focusing on different domains of digital forensics.

To provide a road map for understanding the use of privacy-preserving techniques in digital forensics, we describe existing studies that have deployed cryptographic techniques that allow the examination and analysis of digital evidence while preserving the privacy of those involved. The description of these studies is based on the following analysis factors: (i) whether the study requires computation from a trusted or an untrusted third party to function properly, (ii) whether it permits multiple investigators to access forensics data, and (iii) whether it supports multi-keyword searches. These analysis factors are essential for a feasible privacy-preserving digital forensic solution to consider. The following sections provide an overview of each cryptographic technique that can be considered and how it has been applied in digital forensics. We also describe the drawbacks seen in these studies and suggest possible solutions to support the use of these techniques in digital forensics. A summary of the

relevant literature is provided in Table 2.

A. HOMOMORPHIC ENCRYPTION
Homomorphic encryption (HE) allows computation on encrypted data without the need to first decrypt the data, so that the computing party learns neither the inputs nor the computed results. HE is classified into three categories: fully homomorphic encryption (FHE), somewhat homomorphic encryption (SWHE), and partial homomorphic encryption (PHE) [51]. HE may be used in digital forensics for data protection by allowing evidential information to be encrypted and analyzed without first decrypting the data, thus ensuring the data privacy of those involved. This technique provides resilience in situations where computations are carried out by an untrusted or potentially compromised party [52].

One common use of HE in digital forensics has been the preservation of logs, particularly in cloud environments or where the privacy of individual log records needs to be preserved while allowing such records to be analyzed together. Log files contain useful information crucial to forensic investigations, but may contain private information that needs to be protected. [53] proposed a logging scheme that considers log segmentation and distributed storage to collect logs from distributed edge nodes and protect log confidentiality by taking into account edge-cloud characteristics. The system utilized a multi-index-chain (MIC) technique and a distributed storage cluster to acquire forensics data without relying on a service provider. The index files include information on the distributed log block being shared with MIC peers through its network, which allows forensics investigators to collect related log blocks based on index files and distributed storage clusters. To ensure log privacy, the authors implemented the partial homomorphic encryption scheme for the proposed system.

In another study, [54] proposed an efficient privacy-preserving IoT-based log management system for digital forensics that captures and preserves forensics logs continuously for IoT devices in a cloud environment. Through the use of homomorphic encryption, the authors designed an automated and secure log collection model that is capable of preserving smart environment logs from distributed edge nodes in a fog-enabled cloud environment. To preserve and transfer logs securely from IoT devices to the cloud with consideration for delay sensitivity and task offloading, they introduced a fog layer between the IoT and cloud layers. The three layers offer different security controls for log secrecy and privacy to tackle multi-stakeholder (multi-tenancy) issues and log alteration. Similarly, [55] utilized the (partial) homomorphic encryption scheme for secure log management using a Tor network. The study designed a secure logging system that ensures the confidentiality and integrity of the logs by applying HE for encrypted operations on the logs, while the Tor network improves the privacy and security of log data in transmission.

In terms of handling other types of forensic data, [56] designed a privacy-preserving multiple keyword search system over encrypted data which keeps the subject of the investigation confidential and protects irrelevant data from the investigator. The system requires a server administrator who is in charge of a suspect's encrypted data to receive a set of case-related encrypted keywords from an investigator. The administrator searches these keywords against the suspect's encrypted data and returns the resulting data to the investigator, who then decrypts them for investigation. More specifically, their scheme supports both conjunctive and disjunctive keyword searches over encrypted data to help generate robust investigation data. The conjunctive keyword search returns documents containing all of the keywords, while the disjunctive keyword search returns aggregated documents containing either one of the keywords or all of them.

Most of the studies examined do not require a trusted third party, which suggests that the systems either do not require any input from a third party or could work with an untrusted third party while still preserving privacy. However, they lack a multi-keyword search. Therefore, investigators cannot submit multiple search keywords to retrieve the most relevant data.

B. COMMUTATIVE ENCRYPTION
Commutative encryption enables plaintext to be encrypted more than once using different users' public keys and does not require decryption before the encryption/re-encryption process [57]. Commutative encryption can be used for privacy protection in digital forensics by allowing both the investigator and the server administrator (or a device owner) to encrypt evidential information using different encryption keys. This ensures that an investigator has access to case-related information only and that the relevant data comes from the server (or device) without alteration. In general, this technique may be used to address the challenge of getting relevant data in situations where information (relevant and irrelevant) is stored with a third-party service provider.

A design for privacy-preserving forensic investigations for shared servers was proposed by [58]. The proposed system allows the server administrator to encrypt all data of interest that are stored on the server; this prevents the investigator from learning any data. The investigator then encrypts case-related keywords and sends them to the server administrator. The administrator searches for the relevant keywords in the encrypted data and returns the relevant data to the investigator, who decrypts the data to perform analysis on it. [59] improved the proposed system to include a verification technique such that the authenticity and integrity of the encrypted data collected from the administrator can be verified, to establish whether the presented evidence is actually from the server without any alteration. However, both systems require a trusted third party to function well, which may sometimes be infeasible. In addition, the systems do not support queries from multiple investigators, which could be a challenge in a situation where two or more investigators are required.
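The commutative property described in Section III-B can be illustrated with a small sketch. It uses a textbook exponentiation cipher modulo a public prime, where applying two keys in either order gives the same ciphertext. This is an assumption chosen for illustration only and is not the construction used in [58] or [59]; the names `keygen`, `encode`, `encrypt`, and `decrypt`, the toy modulus, and the sample words are all hypothetical.

```python
import hashlib
import math
import secrets

# Public prime modulus (2**127 - 1 is a Mersenne prime); toy size, illustration only.
P = 2**127 - 1

def keygen():
    """Pick a secret exponent that is invertible modulo P - 1."""
    while True:
        e = secrets.randbelow(P - 3) + 2          # 2 <= e <= P - 2
        if math.gcd(e, P - 1) == 1:
            return e

def encode(word):
    """Map a keyword to a nonzero element of Z_P via SHA-256."""
    h = int.from_bytes(hashlib.sha256(word.lower().encode()).digest(), "big")
    return h % (P - 1) + 1

def encrypt(key, value):
    """Exponentiation cipher: E_k(x) = x**k mod P, so E_a(E_b(x)) == E_b(E_a(x))."""
    return pow(value, key, P)

def decrypt(key, value):
    """Remove one layer using the inverse exponent modulo P - 1."""
    return pow(value, pow(key, -1, P - 1), P)

admin_key = keygen()           # held by the server administrator
investigator_key = keygen()    # held by the investigator

stored_words = ["invoice", "holiday", "transfer"]   # hypothetical server data
query = "transfer"                                  # hypothetical case keyword

# Each party adds its own layer; the order of the layers does not matter.
admin_layer = [encrypt(admin_key, encode(w)) for w in stored_words]
double_data = [encrypt(investigator_key, c) for c in admin_layer]
double_query = encrypt(admin_key, encrypt(investigator_key, encode(query)))

# Plaintext words are zipped in here only to display which entry matched.
matches = [w for w, c in zip(stored_words, double_data) if c == double_query]
```

In this sketch, equality of doubly encrypted values reveals a keyword match without either party seeing the other's plaintext. A deployable scheme would also need authenticated parameters and a vetted group choice, which are outside the scope of this illustration.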

TABLE 2. Summary of the analyzed privacy-preserving digital forensics studies

Authors  Year  CT Used                       DF Domain           Trusted Third Party  Multiple Investigators  Multi-keyword Search
-------  ----  ----------------------------  ------------------  -------------------  ----------------------  --------------------
[53]     2019  HE                            IoT Forensics       No                   Yes                     No
[54]     2020  HE                            IoT Forensics       No                   No                      No
[55]     2013  HE                            Log Forensics       No                   No                      No
[56]     2011  HE                            Computer Forensics  Yes                  No                      Yes
[58]     2011  Commutative Encryption & HE   Computer Forensics  Yes                  No                      No
[59]     2013  Commutative Encryption & HE   Computer Forensics  Yes                  No                      Yes
[61]     2017  Secret Sharing                Cloud Forensics     Yes                  No                      No
[62]     2017  Secret Sharing                Cloud Forensics     Yes                  No                      No
[63]     2013  Secret Sharing                Computer Forensics  Yes                  Yes                     No
[64]     2013  Secret Sharing                Computer Forensics  Yes                  Yes                     Yes
[66]     2015  Searchable Encryption         Email Forensics     Yes                  Yes                     Yes
[67]     2016  Searchable Encryption         Disk Forensics      No                   No                      No
[68]     2021  IBE                           Cloud Forensics     Yes                  Yes                     No

CT - Cryptographic Techniques, DF - Digital Forensics, IBE - Identity-Based Encryption
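Several entries in Table 2 rely on (partial) homomorphic encryption for log analysis. As a minimal sketch of what "computing on encrypted data" means, the following toy implementation of the Paillier cryptosystem (an additively homomorphic scheme) adds two encrypted counters without decrypting them. The choice of Paillier, the tiny primes, the function names, and the "failed login" counters are assumptions made for illustration; the surveyed studies do not necessarily use this scheme, and the parameters are far too small to be secure.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)      # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """c = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return pow(1 + n, m, n2) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    """m = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) // n."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)

n, priv = keygen()
node_a = encrypt(n, 17)    # hypothetical count of failed logins at edge node A
node_b = encrypt(n, 25)    # hypothetical count at edge node B
total = decrypt(n, priv, add_encrypted(n, node_a, node_b))   # 42
```

Only the holder of the private key can read the aggregated count, while the party that performs the addition sees ciphertexts only.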

C. SECRET SHARING
Secret sharing as a cryptographic tool can be applied in a situation where access to sensitive information has to be protected by more than one party. It is a scheme in which different shares of a secret are distributed to parties such that only a fixed subset of parties can reconstruct the secret [60]. Secret sharing may be used to preserve privacy in digital forensics by dividing a secret (a suspect's data) into n pieces/data files and storing them in different locations to prevent leaking. Each data file holds no intelligible information about the secret, and the original secret cannot be reconstructed from any one separate file. This may be particularly useful in situations involving a cloud environment or service provider, or where the data of interest is maintained by a server administrator.

Focusing on cloud forensics, [61] proposed a solution based on secret sharing and message authentication codes (MAC) for robust logging of cloud events for forensic investigations. Since there is at least one logging server in a cloud environment, the proposed system attached a MAC to every piece of data written to the log file by the log server, thus creating a chain that prevents an attacker from modifying events without being detected. As a further security buffer, each event of the log is divided into n shares and circulated among random nodes in the cloud, and the events are then recorded in an immutable database. Using the search space reduction technique in a cloud environment, [62] designed a secure cloud forensic solution, based on the set of inputs that define the historical activity data for the virtual machine, that efficiently searches for forensic evidence within cloud and edge environments without compromising the privacy of non-targets. Leveraging standard metering, network logs, and a secret sharing scheme, the study proposes a privacy-preserving solution that reduces the digital forensics target search space during an investigation in the cloud.

Aiming to improve investigation efficiency and privacy of data, [63] proposed the use of a secret sharing scheme to secure keyword searching and matching procedures. The authors treated data files managed by a server administrator as a sequence of words, with each word and keyword treated as a secret and divided into n secret shares. A third party is then required to match each word in a file with each keyword from the investigator. Precisely, the third party matches the shares of each word to the shares of each keyword given by the investigator until a match is found. Once a match is found, t shares of all remaining words of the same file are forwarded to the investigator to reconstruct the whole file based on the principle of (t,n)-threshold secret sharing. Data integrity and authenticity were guaranteed in the system by utilizing a digital signature [64]. The proposed system can also be queried by multiple investigators. One major downside is the inability of an investigator to query forensics data using multiple keywords.

D. SEARCHABLE ENCRYPTION
Searchable encryption (SE) is an encryption technique that allows search operations over encrypted data. SE can be either searchable symmetric encryption (SSE) or Public-

Key Searchable Encryption (PKSE) [65]. SE can be used to preserve privacy in digital forensics by allowing case-relevant keywords to be searched over encrypted evidential information, without direct access to a suspect's personal information. Targeting email forensics, [66] presented a privacy-preserving email forensics system that analyzes email data in a corporate environment. The system enabled non-interactive threshold keyword searches on encrypted emails by utilizing searchable encryption. More specifically, an investigator with the encrypted data searches the encrypted data for selected keywords. The search process reveals the content of an email if it contains at least t of the keywords that the investigator is searching for. Otherwise, the investigator learns nothing about the content of the email or whether any of the selected keywords are contained in the encrypted data. As a follow-up to [66], the authors [67] proposed an improvement to the above-described system and implemented it for disk image forensics. The system included an additional step of pre-processing disk images before applying a protection mechanism (encryption).

E. IDENTITY-BASED ENCRYPTION (IBE)
Identity-based encryption is a form of public key encryption in which a user/sender can generate a public key from a known unique identifier, such as the email address of the receiver, and a trusted third-party server calculates a corresponding private key from the public key. [68] proposed an IBE-based secure cloud storage system that is compatible with cloud forensics and supports digital forensics investigations by using multiple private key generators (PKGs) to generate the (encryption) keys. The system permits a legal authority or an investigator to act as a party in the key generation, in collaboration with another trusted key generation authority which acts as the other PKG. Whenever the need for a forensic investigation arises in the cloud environment, the legal authority collaborates with the trusted key generation authority to re-generate the private key and decrypt the file contents, then provides the decrypted files for further forensics analysis. Since neither the legal authority nor the trusted key generation authority can act alone to generate the private key for decryption, data access can be controlled. However, the scheme has two shortcomings: the file content has to be decrypted before any forensics analysis, and only single-keyword search is permitted, which makes the scheme impractical when large file content is to be analyzed.

F. DRAWBACKS OF DEPLOYING CRYPTOGRAPHIC TECHNIQUES AND POTENTIAL SOLUTIONS
As seen from the literature reviewed and Table 2, very few studies have explored the use of cryptography-based techniques for mitigating data privacy challenges in digital forensics. This is in part due to the following reasons, as seen in existing studies. Possible ways of addressing the identified issues are also described below.
• The resulting size of ciphertext from encrypted evidential information is too large. The collection stage of the digital forensics process often results in an enormous dataset, and encrypting this dataset produces a substantially larger ciphertext. This has the potential of making an investigation less thorough and requiring more resources for data processing. A possible solution for this challenge is to encrypt only case-relevant data, implying that the verification of keywords to determine data relevancy should be included in digital forensics models to ascertain whether compiled keywords are case-relevant or not. While this will not fully eliminate the privacy concerns described earlier, it has the potential to limit their occurrence [69].
• With the increased rate at which data of interest in a forensic investigation may be generated, investigators now require more resources and time to collect, examine, and analyze forensic data, regardless of whether the data is encrypted or not. To address this challenge, investigators must work with relevant authorities and service providers to understand where to look for relevant evidence. To achieve this, it may be important to focus on who, what, when, where, why, and how questions when reviewing possible sources of evidence.
• Some of the cryptography-based schemes lack a verification method to ascertain whether the keyword searches are case-relevant or not. Case-relevant keywords are instrumental in reducing the volume of data collected and encrypted and in reducing the potential access to private data. A careful selection of the keywords through collaboration between investigators and other stakeholders (e.g., law enforcement or a service provider) would be necessary to address some aspects of this challenge. In addition, models for privacy-preserving digital forensics should integrate keyword verification techniques to determine what data should be collected and/or encrypted and ultimately reduce the investigation time.
• The management of encryption and decryption keys in existing cryptography-based techniques is a challenge that can limit the usability of some solutions. While this challenge may be addressed simply by minimizing the transfer of keys between parties involved in an investigation, a dedicated entity that generates key pairs, encrypts data, and decrypts the ciphertext should be integrated into privacy-preserving digital forensics models.

With all these drawbacks and solutions in mind, we propose a simple conceptual privacy-preserving digital forensics model that utilizes a cryptography-based scheme in the following section. We then explain how the different cryptographic techniques described earlier may be incorporated into this conceptual model and other factors that may be considered in their use.
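Since the techniques reviewed in this section are revisited in the conceptual model that follows, a concrete view of the (t,n)-threshold principle from Section III-C may be helpful. The sketch below follows Shamir's polynomial-based secret sharing over a prime field, which is one standard way to realize a (t,n)-threshold scheme; it is an illustrative assumption rather than the exact construction of [61]-[64]. The prime, the function names `split` and `reconstruct`, and the sample word are hypothetical.

```python
import secrets

PRIME = 2**127 - 1   # field modulus; must exceed any secret that is shared

def split(secret, t, n):
    """Return n shares (x, y); any t of them determine the secret."""
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):          # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# A single word of a file treated as the secret, split into n = 5 shares.
word = int.from_bytes(b"transfer", "big")
shares = split(word, t=3, n=5)
recovered = reconstruct(shares[1:4])        # any 3 of the 5 shares suffice
```

Fewer than t shares give no information about the word in this scheme, which is the property that lets shares be stored in separate locations.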

IV. PRIVACY-PRESERVING DIGITAL FORENSICS MODEL
This section describes our conceptual model for privacy-preserving digital forensics (PPDF) as shown in Figure 3. The model employs encryption in the handling of evidential data throughout the digital forensic investigation process. In what follows, we first describe the entities involved in the model and their responsibilities, then discuss the model based on the digital forensic investigation processes. Furthermore, we examine how the model overcomes the shortcomings highlighted in Section 3 and lastly present the mathematical description of the model. The model is discussed in light of how the different cryptographic techniques discussed in Section 3 may be used at each stage.

A. PRIVACY-PRESERVING MODEL ENTITIES
The main goal of the privacy-preserving digital forensics model is to allow the identification, preservation, acquisition, analysis, documentation, and presentation of evidence from digital devices while preserving the privacy of the individuals involved. The entities that exist in the privacy-preserving model, as well as the assumptions relating to each entity within the model, are described as follows.
• User: The user is an individual who is involved in a case that requires a digital forensics investigation. The user could be a suspect, an accused person, a victim, or a third party. Data that belongs to a user, especially data that is not relevant to the investigation, should be kept private. Hence, it is important for investigators to perform their investigative role without compromising the user's privacy.
• Investigator: The investigator may be a law enforcement agent (LEA) or someone involved in the examination and analysis of digital evidence and may interact with both the user(s) and the service provider. The investigator is mostly responsible for generating an asymmetric key pair (Pri_key, Pub_key) and making the public key (Pub_key) available to the Service Provider (SP), but the creation of the key pair may be delegated to a cooperating user if necessary. The investigator also determines the case-relevant keywords required to ascertain the data that should be extracted. The investigator is also required to work closely with the SP(s) to retrieve evidence within the SP's jurisdiction.
• Service Provider (SP): The service provider is charged with searching the suspect's data in its custody for case-relevant keywords, encrypting relevant data, and sending it to the investigator(s). We assume that this entity only receives input from the investigator and does not collude with the user; hence, it can be trusted.

B. CONCEPTUAL MODEL BASED ON THE DIGITAL INVESTIGATION PROCESSES
The digital forensics investigation process is divided into six stages (as described in Section 1). However, in this section, we categorize the processes into four stages based on the four privacy-preserving stages depicted in Figure 3 and discuss the possible applications and limitations of the cryptographic techniques described in Section 3 at each stage of the privacy-preserving model.

Stage 1: Preparation / Key Generation
The first step of the digital forensic process is the identification stage. At this stage, an investigator recognizes the nature of an incident, prepares the tools and equipment needed during the investigation, and defines the tasks to be accomplished during the investigation. Furthermore, and while this may be a difficult task, they also work with other entities (users and service providers) to ensure data and user privacy. This stage can also be referred to as the forensics readiness stage. As part of the preparation for the investigation, the investigator is expected to obtain the necessary approval or warrant for investigation and devise appropriate data preservation and chain of custody mechanisms. The goal at this stage is to ensure that the right resources, both material and human, are employed for evidence preservation and to prevent irretrievable damage to digital evidence due to its volatility.

No cryptographic technique is required at this stage since there is no interaction with data. However, measures to ensure data privacy should be integrated into the stages right from the beginning and included as part of a readiness plan. This should include generating encryption keys to facilitate privacy preservation throughout the investigation process. In the case of a cooperating user (i.e., a user who shows a willingness to cooperate with the investigator during the investigation process, e.g., the victim in an incident), the investigator may decide to delegate the responsibility of generating the encryption key to the user to limit access to their personal information and foster their involvement in maintaining their privacy during the investigation. The public key is made available to the investigator and/or the service provider as the case may be. On the other hand, if the user is uncooperative, whether because they were not found at the crime scene, could not be located, or for any other reason, the investigator generates the encryption key pair and makes the public key available to a service provider if necessary. In practice, law enforcement agents who deal with digital evidence are trained to properly utilize recent technology to enhance their investigation [70], [71]. Therefore, it is natural to assume that LEAs can generate and manage the public key, generated either by the user or the investigator.

The decision regarding whether or not a user generates the key pair should be made on a case-by-case basis depending on well-defined factors that have been identified based on the incident type, user request, or other conditions relevant to the case. In a situation where a user is delegated to generate the public key, the custody of the corresponding private key is discussed in Section 5.1. Public key sharing is done based on the location of the evidential information, either in the cloud or non-cloud environment. For preparation in a cloud-based environment, the public key is shared with

FIGURE 3. An overview of the proposed PPDF model.

a cloud service provider, while the LEA manages the non-cloud environment. The key generation step in this stage is an important step that is required regardless of which encryption techniques described in Section 3 are eventually used. This step is also required regardless of the data acquisition method later used, either static or live acquisition. A flowchart representing the steps involved in this stage is depicted in Figure 4.
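The decision flow of Stage 1 can be summarized in a few lines of code. The following is a minimal sketch under stated assumptions: the `KeyPlan` record, the `plan_key_generation` helper, and the string labels are hypothetical names introduced here, and no actual key material is generated. It only mirrors who generates the key pair and who manages the public key according to the description above.

```python
from dataclasses import dataclass

@dataclass
class KeyPlan:
    generated_by: str         # who creates the asymmetric key pair
    public_key_manager: str   # who receives or manages the public key

def plan_key_generation(user_cooperating, delegate_to_user, evidence_in_cloud):
    """Hypothetical helper mirroring the Stage 1 narrative."""
    # Key generation is delegated only to a cooperating user, and only
    # when the investigator decides to do so on a case-by-case basis.
    if user_cooperating and delegate_to_user:
        generated_by = "user"
    else:
        generated_by = "investigator"
    # The public key goes to the cloud service provider for cloud-based
    # evidence; otherwise the law enforcement agent (LEA) manages it.
    if evidence_in_cloud:
        manager = "cloud service provider"
    else:
        manager = "law enforcement agent"
    return KeyPlan(generated_by, manager)

victim_case = plan_key_generation(True, True, evidence_in_cloud=True)
absent_suspect_case = plan_key_generation(False, False, evidence_in_cloud=False)
```

Keeping this decision explicit and recorded is one way to support the case-by-case justification that the stage calls for.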

Stage 2: Preservation / Encryption


Next to the identification and preparation stage of the digital
forensics process is the preservation and acquisition stage.
This stage focuses on the isolation, preservation, and collec-
tion of evidential data from the digital crime scene. Collected
data typically includes both case-relevant and non-relevant
data since it may be almost impossible to determine which
information is relevant right from the crime scene [72]. As
shown in Figure 5, data acquisition and preservation may
involve making an image copy of disk drives and other digital
objects found at the crime scene, or running data collection
tools and writing the output to external storage for servers
and other critical devices that cannot be powered off [73].
FIGURE 4. A flow diagram of the preparation and key generation stage.

For our conceptual PPDF model, this stage also includes the generation of keywords that are related to the crime being investigated. The investigator(s) who have oversight of the case establish a set of case-related keywords, known as the first keywords. A search of the first keywords is conducted on the collected data without any encryption of the collected data or the keywords. The data retrieved from the first keyword search are then encrypted using the appropriate public key generated earlier (Pub_key). To ensure that as much data as possible can be gathered, the first keyword

search should use disjunctive keywords as described in [74].
This helps to generate more independent and non-interrelated
data [75], which gives the investigator access to more case-
related information. However, for evidential information with
a service provider (e.g. cloud service provider), the first
keywords are encrypted by the investigator before being sent
to the service provider as further discussed in Section 5.1.
It is important to note that we differentiate between our use of ‘case-related’ and ‘case-relevant’ (or ‘case-pertinent’) data. Case-related data is any data that can be linked to the criminal case but may not necessarily be relevant; it is associated with the first keyword search. Case-pertinent data is relevant to the case and is associated with the second keyword search described in Stage 3 below.

FIGURE 5. A flow diagram of the preservation and encryption stage.

Apart from ensuring that the data retrieved is related to the case being investigated, one of the goals of this step is to ensure that non-related data is not encrypted, thus significantly reducing the size of the generated ciphertext that needs to be analyzed further. With regard to privacy, we note that information considered to be case-related or case-relevant may still contain private data; however, this must be included as part of the analysis to ensure a holistic view of the data and/or the investigation. For example, if the date of an incident is used as a search keyword, all files created on this date may be retrieved as related data, but their relevance is yet to be determined. Data considered to be relevant is still included in the analysis even though it may contain some private information.
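The idea of encrypting only the case-related results, using the incident date as the search keyword, can be sketched as follows. The sketch uses a toy Paillier scheme, which is additively homomorphic, purely as a stand-in for whichever homomorphic scheme is actually chosen; it is not the scheme specified for the model. The primes, the file list, and all function names are assumptions made for illustration, and the key size is far too small to be secure.

```python
import math
import secrets
from datetime import date

# Toy Paillier parameters (assumption: small primes for readability; NOT secure).
P, Q = 104723, 104729
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse of lambda; valid because g = n + 1


def encrypt(m):
    """Paillier encryption with g = n + 1: c = (1 + m*n) * r^n mod n^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """Recover m = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    x = pow(c, LAM, N2)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


# Hypothetical user data: (file name, creation date, size in KB).
files = [
    ("invoice.pdf", date(2023, 3, 14), 120),
    ("holiday.jpg", date(2022, 8, 1), 2048),
    ("chat_export.txt", date(2023, 3, 14), 35),
]
incident_day = date(2023, 3, 14)

# Only files created on the incident date are treated as case-related and encrypted.
related = [f for f in files if f[1] == incident_day]
ciphertexts = [encrypt(size_kb) for _, _, size_kb in related]

# A simple computation on ciphertexts: the total size of the related files.
total_ct = ciphertexts[0]
for ct in ciphertexts[1:]:
    total_ct = add_encrypted(total_ct, ct)

total_kb = decrypt(total_ct)
print(len(ciphertexts), total_kb)  # 2 155
```

Here two of the three files are encrypted, so the ciphertext to be analyzed shrinks accordingly, and the sum is computed without decrypting the individual values.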
The main privacy-related task in this stage is the encryption of results from the disjunctive keyword search. Of the privacy techniques discussed earlier in Section 3, homomorphic encryption, commutative encryption, and searchable encryption schemes fit this purpose, because the encryption of search results can be easily performed by the investigator alone when any of these three schemes is utilized. The secret sharing scheme requires a suspect’s data to be shared among multiple parties; this may not be viable for the conceptual PPDF model, as the entities involved in the model (i.e., investigator, user, and service provider) are separated and would not typically be able to hold portions of data independently or share data. For identity-based encryption, the encryption key is generated based on a user’s identity, and a centralized server is required to generate the private key.

To utilize either secret sharing or identity-based encryption for privacy protection in digital forensics, a trusted third party or central server will be required to manage the distribution of the suspect’s data into n shares and to generate a private key for decryption. This requirement can be seen in all related solutions discussed in Section 3, where all applications of secret sharing or identity-based encryption require a trusted third party, as depicted in Table 1. Thus, the use of encryption techniques for privacy protection in digital forensics requires that realistic assumptions be made and that the practical feasibility of a proposed solution be considered.

Stage 3: Extraction / Computation
After the evidence encryption comes the extraction of case-pertinent data and the analysis of the dataset. This stage examines the encrypted case-related evidence to extract data relevant to the investigation and seeks to ensure data confidentiality. To achieve this, another set of keywords, known as the second keywords, which are designed to be very case-specific, is established and searched on the encrypted case-related data as described in Figure 6. The main objectives of this stage are to extract case-pertinent information from encrypted data, thus yielding relevant pieces of evidence to work with, and to perform analysis, such as pattern recognition and classification, on the encrypted case-pertinent evidence in order to determine the significance of the data, identify evidence patterns, and draw conclusions about the case. This is also the stage at which investigators test their hypotheses.

The second keyword search is performed to obtain a fine-grained result that contains the case-pertinent data on the investigation and reduces the dataset that needs to be examined or analyzed further. The second keyword searches are also performed on the encrypted case-related evidence received from a cloud service provider, if any. The resulting ciphertext from the search is then analyzed to discover its relationship to the case investigated. An example of such analysis on the encrypted data is pattern recognition, where evidence can be classified based on similar features
as demonstrated in [76]. The second keyword selection and search for case-relevant data may be repeated several times to aid the confirmation or refutation of a hypothesis. This stage of the model achieves data confidentiality and user privacy, as investigators only get to work on encrypted data. Moreover, other (non-evidential and non-case-related) information is not examined further but is stored. An investigator does not have access to this information unless there is a justification for such access, with permission given by a superior investigator. This approach is in line with the third principle of privacy-preserving digital investigation described in [11].

FIGURE 6. A flow diagram of the extraction and computation stage.

Considering the cryptographic schemes discussed earlier, only the homomorphic encryption and searchable encryption schemes can support this stage of the model. Other cryptographic techniques, such as commutative encryption, secret sharing, and identity-based encryption, would in most cases still need to be used in conjunction with a homomorphic encryption scheme to allow computation on encrypted data without decrypting it first. Also, the use of secret sharing and identity-based encryption requires a central server or a dedicated third party, as discussed earlier.

Stage 4: Presentation / Decryption
The last stage involves the summarization and description of findings, as well as further validation or refutation of hypotheses made during an investigation, as shown in Figure 7. The resultant ciphertext from the encrypted analysis is then decrypted using the corresponding private key of the public key scheme used for encryption.

Regardless of whether the key pair used for encryption is generated by the user or an investigator in stage 1, it is the sole responsibility of the investigator to decrypt the encrypted outcome from the search using the associated key. This is why the corresponding private key, for a public key generated by a user, is stored in a private key storage system, as further described in Section 5.1. It is important to note that several factors come into play in the application of the conceptual model described. In the following section, we examine the factors that may be considered with regard to the model.

An example of a scenario where the model is applicable could be a case in which a suspect has been accused of defrauding someone who made an online purchase. We assume that the suspect was apprehended and their smartphone confiscated by an LEA. Following the four stages delineated in our conceptual PPDF model, in the first stage (preparation), the LEA defines the tasks to perform during the investigation, acquires a forensic image of the smartphone’s media storage, prepares the needed tools and equipment to extract case-related data from the smartphone, and obtains the necessary warrant/approval. In the second stage, case-related keywords are curated based on the crime at hand; this could involve searching text messages, emails, call logs, documents, social media posts, browsing history, and other content for pertinent information. The keywords are searched and the result is encrypted using the public key generated either by the user or the LEA.

Using a set of case-relevant keywords (second keywords), the encrypted case-related data from the smartphone is further searched to obtain encrypted data pertinent to the investigation. Investigators may refine the keyword searches based on initial findings or narrow down the results as described above. The encrypted case-relevant data is analyzed to uncover patterns and make findings related to the crime. Analyses such as metadata pattern analysis, network traffic analysis, file structure analysis, frequency and location pattern analysis, and correlation analysis could be carried out on the encrypted case-relevant data to discover its relationship to the case investigated. Lastly, the result obtained from all the analyses is decrypted, and the investigator’s hypotheses are either validated or refuted.

C. CONCEPTUAL MODEL ANALYSIS FACTORS CONSIDERED
In our examination of the conceptual model, we address the challenges outlined in Section 3. These challenges encompass: (i) the conceptual model’s capacity to accommodate untrusted third parties, such as service providers, or to function autonomously without external involvement; (ii) the model’s handling of queries from multiple investigators, particularly in scenarios necessitating such interactions; and (iii) the extent to which the model facilitates multi-keyword searches. Subsequently, we explicate the model’s performance within these contexts.
FIGURE 7. A flow diagram of the presentation and decryption stage.

The conceptual model presumes that a portion of the user’s evidential data resides with a cloud service provider, an assumption commonly made due to the ubiquity and cost-effectiveness of cloud storage. However, it is unwise to rely on the trustworthiness of the cloud service provider in safeguarding evidential information. Consequently, user information is encrypted with a public key shared with the service provider. Furthermore, a set of encrypted keywords is sent to the service provider. Using these encrypted keywords, the service provider executes a disjunctive keyword search on the encrypted dataset and subsequently transmits the resulting dataset to the investigator, as presented in stage 2. Consequently, the model demonstrates its operability in the presence of an untrusted third party, enhancing its feasibility.

The concept of multiple investigators denotes that two or more investigators can simultaneously access and manipulate the same dataset. This scenario may arise when the investigation spans different regions or countries. Within the conceptual model, different investigators can access and collaborate on the same case by sharing pertinent resources, including sets of keywords (both first and second), any encrypted dataset provided by the service provider, and the encrypted resultant dataset obtained from the initial keyword search.

Lastly, the inclusion of multi-keyword searches allows investigators to submit multiple keywords to retrieve more comprehensive data, in contrast to the limited results yielded by single-keyword searches. The conceptual model accommodates both single and multi-keyword searches. The initial keyword search, i.e., the disjunctive search, constitutes a multi-keyword search that yields independent and case-related data. Conversely, the second keyword search entails a conjunctive keyword search, retrieving case-pertinent information, as elucidated in Stage 3. In addition to other considerations, the incorporation of keyword searches in the proposed model aims to avoid encrypting all user data, both relevant and non-relevant. This approach is necessitated by the substantial computational complexity associated with cryptography-based privacy techniques and the extensive data typically possessed by users. In light of the aforementioned analysis factors, the conceptual model addresses each of these challenges at the design level, thereby serving as a proof-of-concept model.

D. FORMAL SPECIFICATION OF THE CONCEPTUAL MODEL
This section presents the formal description of the conceptual PPDF model. The notation used is defined in Table 3, and the mathematical representation and algorithms of the model are also presented.

TABLE 3. Variables and notations used in the model description

Symbol          Definition
$Pub_{key}$     Public key for encryption
$Pri_{key}$     Private key for decryption
$F_{kywd}$      First set of keywords searched from the user data
$S_{kywd}$      Second set of keywords searched from case-related data
$UD$            User data (both relevant and non-relevant)
$Cr_{kw}$       Case-related keywords retrieved from user data
$Cr'_{kw}$      Encrypted case-related keywords
$Crv_{kw}$      Case-relevant keywords retrieved from case-related data
$Crv'_{kw}$     Encrypted case-relevant keywords

Using a generic homomorphic encryption key generation scheme defined in [77], we describe the key generation process for the conceptual model in Algorithm 1. Algorithm 1 takes as input n, the dimension of the ideal lattice, and t, the bit length of the coefficient with the size being 128 bits, and outputs the public key and the private key. For the keyword generation, let $k_{n+1} = \{k_1, k_2, \ldots, k_{n+1}\} \in UD$ be the set of keywords that can be produced from user data, and let $k_n = \{k_1, k_2, \ldots, k_n\}$ be the set of first keywords from user data, denoted $F_{kywd}$ as shown in Table 3. A disjunctive keyword search will be $\{k_1 \cup k_2 \cup \cdots \cup k_n\} \in UD$. Therefore, $Cr_{kw} = \{k_1 \cup k_2 \cup \cdots \cup k_n\} \in UD$. Similarly, the second keyword search will be $\{k_1 \cap k_2 \cap \cdots \cap k_m\} \in Cr_{kw}$, and $Crv_{kw} = \{k_1 \cap k_2 \cap \cdots \cap k_m\} \in Cr_{kw}$. Note that $Cr_{kw} \in UD \gg Crv_{kw} \in UD$, which implies that $Cr'_{kw} \in UD \gg Crv'_{kw} \in UD$; hence $Crv'_{kw} \in Cr'_{kw}$.

The encryption stage involves the encryption of both the case-related keywords retrieved through the first keyword search and the case-relevant keywords retrieved through the second keyword search, as illustrated in Algorithm 2. For the encryption of case-relevant keywords, the same process as in Algorithm 2 is followed, but with the public key $(d, r)$ and the case-relevant keywords ($Crv_{kw}$) as input and ($Crv'_{kw}$) as output.

For the investigator(s) to accept or refute some of the hypotheses, the encrypted case-relevant keywords have to be analyzed and conclusions drawn based on the outcome of the analysis. Following a similar pattern described in
[78], we delineate the classification of the encrypted data. Specifically, we outline the K-Means classification algorithm on forensics data, an approach presented in [79], to determine the significance of the analyzed data, as shown in Algorithm 3, while the decryption algorithm is presented in Algorithm 4.

Algorithm 1: Key Generation
Input: dimension n, bit length t
Output: $Pub_{key} = (d, r)$, $Pri_{key} = (w_i)$
1: Choose a random vector v from $v(x) = \sum_{i=0}^{n-1} v_i x^i$, where each $v_i$ is a random t-bit integer and $\sum_{i=0}^{n-1} v_i \equiv 1 \bmod 2$
2: Compute the resultant d of $v(x)$ and $f(x)$, and the coefficient $w_1$ of the linear term of $w(x)$, where $w(x) \times v(x) = d \bmod f(x)$
3: If $\gcd(w_i, d) \neq 1$ then
4:   Go to 1
5: else
6:   Compute $w_0$ and $r = \frac{w_1}{w_0} \bmod d$
7:   Compute an odd $w_i$ via $w_i = r w_i + 1 \bmod d$ and $w_0$, $w_1$. The subscripts are modulo n.
8: Output: $Pub_{key} = (d, r)$, $Pri_{key} = w_i$
9: end

Algorithm 2: Encryption of case-related keywords
Input: Public key $(d, r)$, case-related keywords ($Cr_{kw}$)
Output: Encrypted case-related keywords ($Cr'_{kw}$)
1: To encrypt plaintext $Cr_{kw} \in \mathbb{Z}$ with $pk = (d, r)$
2: Choose a random noise vector $\vec{u} = (u_0, u_1, \ldots, u_{n-1})$ with $u_i \in \{0, \pm 1\}$
3: Compute the resultant ciphertext of $Cr_{kw}$ as $c = \mathrm{Enc}(Cr_{kw}, pk) = \left[b + 2\sum_{i=0}^{n-1} u_i r^i\right]_d \in \left[-\frac{d}{2}, \frac{d}{2}\right)$
4: Set $\vec{a} = 2\vec{u} + Cr_{kw}\,\vec{e}_1 = (2u_0 + Cr_{kw}, 2u_1, \ldots, 2u_{n-1}) \in \mathbb{Z}^n$ with $\vec{e}_1 = (1, 0, \ldots, 0)$
5: Output $Cr'_{kw}$
6: end

Algorithm 3: K-Means Computation on Encrypted Data
Input: Encrypted case-relevant keywords $Crv'_{kw} = \mathrm{Enc}[k_1 \cap k_2 \cap \cdots \cap k_m]$, the number of clusters q, and the termination condition $\mu$
Output: Encrypted clusters $C' = (c'_1, c'_2, \ldots, c'_q)$
1: Randomly select q data records $C^{(l)} = c_1^{(l)}, c_2^{(l)}, \ldots, c_q^{(l)}$ as the initial clusters, where $l = 1$
2: For each $k_i$ in $Crv'_{kw}$, assign it to the closest cluster $Crv_{i,j}^{(l)}$
3: Compute the counts $m_{i,j}^{(l)}$ of each cluster, $m_{i,j}^{(l)} = (1 \le j \le q)$, and q local centers $w_i^{(l)} = (w_{i,1}^{(l)}, w_{i,2}^{(l)}, \ldots, w_{i,q}^{(l)})$, where $w_{i,j}^{(l)}$ is a d-dimensional point
4: If $\mathrm{maxDist}(c_{i,j}^{(l)}, c_{i,j}^{(l+1)})$, $1 \le j \le q$, is greater than $\mu$, the algorithm iterates; otherwise, output the final results
5: Output: $C' = (c'_1, c'_2, \ldots, c'_q)$
6: end

Algorithm 4: Decryption
Input: Private key $Pri_{key} = w_i$, encrypted clusters $C' = (c'_1, c'_2, \ldots, c'_q)$
Output: A set of clusters in plaintext
1: Compute $[C' \cdot w_i]_d \pmod 2$
2: Let $a \in \mathbb{Z}$; then we recover the plaintext since $\|\vec{a} \times W\|_\infty < \frac{d}{2}$
3: Output: $C = (c_1, c_2, \ldots, c_q)$
4: end

V. CONSIDERATIONS AND EVALUATION OF THE PPDF MODEL
In this section, we examine some of the factors that may be considered at different stages of the conceptual model. We also discuss an evaluation of the model with respect to some of the existing principles for preserving privacy in digital forensics.

A. CONSIDERATIONS OF THE PPDF MODEL
As mentioned earlier, when a cooperating user is delegated to generate the asymmetric key pair ($Pri_{key}$, $Pub_{key}$), the public key is shared among the entities involved, such as the investigator and/or the service provider. The corresponding private key for decryption is stored in a private key storage system by the LEA. This is primarily to prevent private key loss and to enable the investigator to access the key for decryption after the analysis of the encrypted case-relevant data, without having to wait for the user. However, it is worth mentioning that for the investigator to access the private key, permission must be issued by a senior investigator who has oversight of the case.

In a case where the user’s data is stored with a service provider, an encrypted set of the case-related keywords (first keywords) is sent to the service provider to ensure that the provider does not become privy to the investigation details. This assumes the use of homomorphic encryption, thus allowing the service provider to perform the search. The service provider performs this encrypted case-related keyword search on the evidence within its jurisdiction. The service provider then encrypts the data retrieved from the first keyword search with the public key shared by the investigator. Afterwards, the service provider sends the encrypted case-related data to the investigator, who then queries it to extract case-relevant data. The encryption of the first keywords is only necessary when evidential information is with a service provider (cloud environment) and not in a non-cloud environment.

When dealing with a cloud environment, the extraction and computation of data in stage 3 of the PPDF model may be challenging, particularly if the cloud service provider is not willing to grant the investigator access to necessary data. This is a common issue that may be encountered for both privacy-preserving and non-privacy-preserving digital forensics process models. It is also related to the drawback of accessing information in data centers, which may be in a jurisdiction different from that of the investigator. For a non-
cloud environment, the investigator, in most cases, has the essential devices and/or information to perform the investigation. Therefore, investigators can perform the identified tasks in this stage.

B. EVALUATION OF THE PPDF MODEL
To evaluate the proposed conceptual PPDF model, we consider how each stage of the model aligns with many of the existing principles for privacy protection in digital forensics models. Table 4 shows the categorization of the existing principles into the four stages of the conceptual PPDF model, depicting the alignment of the processes in each stage with each of the principles.

In [9], the authors proposed the classification of evidential information into different privacy levels to avoid encrypting a user’s entire data and to reduce the investigation cost in terms of time and resources. Considering both the user’s and the investigator’s perspectives, the authors first classified this information into private and non-private for users and relevant and non-relevant for investigators. This resulted in four groups: non-private and non-relevant, non-private and relevant, private and non-relevant, and lastly private and relevant. Subsequently, they defined three privacy levels for evidential information to enable more efficient privacy-preserving digital forensics investigation. The privacy levels are direct accessible data (DAD), privacy-preserving accessible data (PAD), and non-accessible data (NAD). For DAD, the data is relevant and non-private, so it can be directly extracted and analyzed. PAD signifies relevant and private data; hence, privacy-preserving technique(s) must be applied during data extraction and analysis. NAD implies that the data is not relevant to the case and is not accessible to the investigator. This is in line with stages 2 and 3 of the conceptual PPDF model, in which only related and relevant data are extracted and the appropriate privacy-preserving technique is applied.

TABLE 4. Classification of existing privacy-preserving digital forensics principles

Ref. No   Preparation /     Preservation /    Extraction /    Presentation /
          Key Generation    Encryption        Computation     Decryption
[9]       –                 DAD, PAD, NAD     –               –
[10]      –                 L1, L2, L3, L4    –               –
[11]      PD1, PD4, PD9     PD2, PD3,         PD5, PD6,       –
                            PD7, PD10         PD8
[40]      PP3, PP8, PP9,    PP1, PP2, PP4,    –               –
          PP10              PP5, PP6, PP7

L1-L4: privacy-accurate levels of information, classified based on its private level; PD1-PD10: privacy-preserving data processing principles for consideration when conducting digital forensics extraction and examination of data from a digital device; PP1-PP10: privacy policies for protecting user information in DF.

Similarly, using cryptographic techniques and blind signatures, the authors in [10] proposed a system involving a sequential release of private information in digital forensics investigation based on prior knowledge of the private information and proof of a hypothesis. In other words, it explores the feasibility of protecting more sensitive information until knowledge about less sensitive information has been demonstrated. To balance the efficacy of the investigation against user privacy, they propose a scale for partitioning information into privacy-accurate levels, where (L1) denotes partitioned information with a low privacy-accurate level and (L4) implies a high privacy-accurate level. The four-level privacy-accurate scale is described as follows: (L1): the evidence does not divulge any personal information; (L2): evidence may refer to personal information; (L3): evidence may infer personal information; and (L4): evidence undeniably divulges personal information. This somewhat supports the four possible groups of forensics data posited in [9] and is in line with our conceptual PPDF model’s case-related and case-relevant approach to evidential information discussed in stages 2 and 3.

Another study in [11] proposed a set of ten privacy-preserving data processing principles for consideration during the extraction and examination of evidential information from digital devices in digital forensics investigation, represented as PD1 - PD10 and summarized in Table 1. Emphasizing the need to balance the requirement for effective investigative processes with the need to prevent unnecessary invasion of privacy, the principles highlight the concerns regarding potential privacy invasion caused by the examination of digital devices in criminal investigations. We classified these ten principles (PD1 - PD10) under the first three stages of the conceptual model, as depicted in Table 4, because the principles only consider user privacy concerns from the extraction to the examination stage of evidential information. Notably, the first principle (PD1), which states that “the scope of any investigation should be defined and evaluated before its implementation to ensure that it is both proportionate and justifiable”, is in line with stage 1 of the conceptual PPDF model, where we underscore that an investigator must define the investigation scope and ensure that data privacy is integrated as part of a readiness plan. Furthermore, the third principle (PD3), “where a need for the extraction and examination of all available data from a given digital device is established, this need must be both evidenced and justifiable with regards to the current investigation scenario”, aligns with stage 3 of the PPDF model, in which the investigator does not have access to non-case-related information unless there is a justification supported by permission from a superior investigator with oversight of the case.

Lastly, the author in [40] identified the significance of privacy policies in protecting users’ private information and, hence, posited that such privacy-preserving policies should restrict an investigator from analyzing a user’s private data. The author defined ten privacy-preserving policies, denoted by PP1 - PP10 in Table 4 and summarized in Table 1, from both the user’s and the investigator’s perspectives, which cover the first three stages of the digital forensics process model.
Five of the ten policies were based on the investigator’s [5] J. R. Lyle, R. P. Ayers, and D. R. White, "Digital forensics at the National
perspective, while two and three were based on the users’, Institute of Standards and Technology." US Department of Commerce,
National Institute of Standards and Technology, 2008.
and both investigator and user perspectives, respectively. [6] K. Kent, S. Chevalier, T. Grance, and H. Dang, "SP 800-86. Guide to
These policies are in alignment with the processes involved integrating forensic techniques into incident response." US Department of
in each stage of our PPDF model. For example, of particular Commerce, National Institute of Standards and Technology, 2006.
[7] R. Verma, J. Govindaraj, and G. Gupta, “Data privacy perceptions about
interest among the ten policies is the third policy (PP3) to digital forensic investigation in India,” in Ifip international conference on
“limit the search for evidence to the goal of the investigation” digital forensics pp. 25–45, 2016.
which aligns with the first stage of our conceptual PPDF [8] L. Englbrecht and G. Pernul, “A privacy-aware digital forensics investiga-
tion in enterprises,” in Proceedings of the 15th International Conference
model. In general, the conceptual PPDF model aligns with on Availability, Reliability, and Security, pp. 1-10, 2020.
the key existing privacy-preserving principles and privacy [9] W. Halboob, R. Mahmod, N. I. Udzir, and M. T. Abdullah, “Privacy
levels outlined in the literature. levels for computer forensics: toward a more efficient privacy-preserving
investigation,” Procedia Computer Science, 56, 370-375, 2015.
[10] N. J. Croft, and M. S. Olivier, “Sequenced release of privacy-accurate
VI. CONCLUSION information in a forensic investigation,” in Digital Investigation, 7(1-2),
In this study, we discuss the various cryptographic techniques 95-101, 2010.
[11] G. Horsman, “Defining principles for preserving privacy in digital forensic
that can be utilized for privacy protection in digital forensics, examinations,” Forensic Science International: Digital Investigation, 40,
with an analysis of relevant studies that have utilized any of 301350, 2022.
the techniques for privacy protection. We provide a summary [12] S. Raghavan, “Digital forensic research: current state of the art,” in CSI
of the findings for each study, highlight some drawbacks to Transactions on ICT 1(1), 91-114, 2013.
[13] R. McKemmish, "What is Forensic Computing," Trends and Issues in
the use of each cryptographic technique for privacy protec- Crime and Criminal Justice, 118, 1-6, 1999.
tion in digital forensics, and recommend potential solutions [14] S. Saleem, O. Popov, and I. Bagilli, "Extended Abstract Digital Forensics
to address the highlighted drawbacks. Moreover, we pro- Model with Preservation and Protection as Umbrella Principles," Procedia
Computer Science, 35, 812-821, 2014.
posed a conceptual model for a privacy-preserving digital [15] E. Vincze, "Challenges in digital forensics,"Police Practice And Research,
forensics model that is based on cryptographic techniques 17, 183-194, 2016.
and consider how and where each encryption technique may [16] N. M. Karie, and H. S. Venter, “Taxonomy of challenges for digital
forensics,” Journal of forensic sciences, 60(4), 885-893, 2015.
be used in the model. We present the mathematical representation and algorithm of the model and examine how the model may perform within the context of some identified analysis factors. We also examine some of the factors that may be considered at each stage of the model in specific situations and evaluate the model via a comparison with existing principles for preserving privacy in digital forensics investigations.
This study provides digital forensics investigators and researchers with a roadmap for addressing data privacy challenges in digital forensics, specifically by using cryptographic techniques. Our evaluation of the conceptual model shows that it performs well within the context of the analysis factors and that it supports all the key privacy principles that have been suggested in the literature for privacy preservation in digital forensics. The model focuses on the overall digital forensics process but can be adapted to specific scenarios and subdomains of digital forensics as necessary. In future work, we plan to implement the model and conduct an empirical study to determine its practicality and performance. Extension of the model to address some of the challenges in specific digital forensics subdomains will also be explored.
[27] T. Janarthanan, M. Bagheri, and S. Zargari, “IoT forensics: an overview
of the current issues and challenges,” in Digital Forensic Investigation of
REFERENCES Internet of Things (IoT) Devices, 223-254, 2021.
[1] J. K. Malik and S. Choudhury, “A brief review on Cyber Crime-Growth [28] I. Yaqoob, I. A. T. Hashem, A. Ahmed, S. A. Kazmi, and C. S. Hong,
and Evolution,” Pramana Research Journal, 9(3), 242, 2019. “Internet of things forensics: Recent advances, taxonomy, requirements,
[2] F. Casino et al., "Research Trends, Challenges, and Emerging Topics in and open challenges,” Future Generation Computer Systems 92, 265-275,
Digital Forensics: A Review of Reviews," in IEEE Access, vol. 10, pp. 2019.
25464-25493, 2022. [29] A. A. Boozer, A. John, and T. Mukherjee, “Internet of Things Software
[3] D. Barrett, “The Basics of Digital Forensics: The Primer for Getting and Hardware Architectures and Their Impacts on Forensic Investigations:
Started in Digital Forensics,” The Journal of Digital Forensics, Security Current Approaches and Challenges,” Journal of Digital Forensics, Secu-
and Law:, JDFSL, 9(1), 83, 2014. rity and Law, 16(2), 4, 2021.
[4] R. Kaur and A. Kaur, “Digital forensics,” International Journal of Com- [30] A. Akinbi and T. Berry, “Forensic investigation of Google assistant,”
puter Applications, 50(5), 2012. Social Netw. Comput. Sci., vol. 1, no. 5, pp. 1-10, Sep. 2020.

VOLUME 4, 2016 17

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3343360

Ogunseyi et al.: Cryptographic Techniques for Data Privacy in Digital Forensics

[31] O. M. Adedayo, and M. S. Olivier, “Ideal log setting for database forensics International Journal of Engineering Research and Technology (IJERT),
reconstruction,” Digital Investigation, 12, 27-40, 2015. vol. 2 Issue 12, 2013.
[32] R. Chopade, and V. K. Pachghare, “Ten years of critical review on database [56] S. Hou, T. Uehara, S. M. Yiu, L.C. Hui, and K. P. Chow, “Privacy-
forensics research,” Digital Investigation, 29, 180-197, 2019. preserving confidential Forensic investigation for shared or remote
[33] A. Al-dhaqm et al., "Database Forensic Investigation Process Models: A servers,” in The Seventh International Conference on Intelligent Informa-
Review,” in IEEE Access, vol. 8, pp. 48477-48490, 2020. tion Hiding and Multimedia Signal Processing, pp. 378-383, 2011.
[34] G. M. Jones, and S. G. Winter, “An Insight into Digital Forensics: His- [57] K. Huang and R. Tso, "A commutative encryption scheme based on
tory, Frameworks, Types, and Tools,” Mangesh M. Ghonge, Sabyasachi ElGamal encryption,” in International Conference on Information Security
Pramanik, Ramchandra Mangrulkar, and Dac-Nhuong Le (eds.) Cyber and Intelligent Control, pp. 156-159, 2012.
Security and Digital Forensics, , 105–126, 2022. [58] S. Hou, T. Uehara, S. M. Yiu, L. C. Hui, and K. P. Chow, “Privacy-
[35] R. C. Maher, “Overview of audio forensics,” in Intelligent Multimedia preserving multiple keyword search for confidential investigation of re-
Analysis for Security Applications, Springer, Berlin, Heidelberg, pp. 127- mote forensics,” in Third International Conference on Multimedia Infor-
144, 2010. mation Networking and Security, pp. 595-599, 2011.
[36] A. R. Javed, W. Ahmed, M. Alazab, Z. Jalil, K. Kifayat and T. R. [59] S. Hou, R. Sasaki, T. Uehara, and S. M. Yiu, “Double Encryption for Data
Gadekallu, "A Comprehensive Survey on Computer Forensics: State-of- Authenticity and Integrity in Privacy-preserving Confidential Forensic
the-Art, Tools, Techniques, Challenges, and Future Directions,” in IEEE Investigation,” J. Wirel. Mob. Networks Ubiquitous Comput. Dependable
Access, vol. 10, pp. 11065-11089, 2022. Appl., 4(2), 104-113, 2013.
[37] R. Kumars, M. Alazab, and W. Wang, “A survey of intelligent techniques [60] T. B. Ogunseyi, and C. Yang, “Survey and analysis of cryptographic tech-
for Android malware detection,” in Malware Analysis Using Artificial niques for privacy protection in recommender systems,” in International
Intelligence and Deep Learning, Springer, Cham, (pp. 121-162), 2021. conference on cloud computing and security, pp. 691-706), 2018.
[38] B. Carrier, and E. Spafford, “An event-based digital forensic investigation [61] G. Weir, A. Aßmuth, M. Whittington, and B. Duncan, “Cloud accounting
framework,” Digital Investigation, 2004. systems, the audit trail, forensics and the EU GDPR: how hard can
[39] J. C. Deprez, C. Ponsard, and N. Matskanis, “A goal-oriented requirements it be?,” in British Accounting and Finance Association (BAFA) Annual
analysis for the collection, use, and exchange of electronic evidence across Conference, 2017.
EU countries,” in IEEE 24th International Requirements Engineering [62] A. Odebade, T. Welsh, S. Mthunzi, and E. Benkhelifa, “Mitigating anti-
Conference Workshops (REW), pp. 106-113, 2016. forensics in the cloud via resource-based privacy preserving activity attri-
[40] S. Srinivasan, “Security and privacy vs. computer forensics capabilities,” bution,” in Fourth International Conference on Software Defined Systems
Information Systems Control Journa, 4, 1-3, 2007. (SDS) , pp. 143-149, 2017.
[41] S. Kumar, A. Singh, A. Benslimane, P. Chithaluru, M.A. Albahar, R.S. [63] S. Hou, S. Yiu, T. Uehara, and R. Sasaki, “Application of secret sharing
Rathor, and R.M. Álvarez, "An optimized intelligent computational secu- techniques in confidential forensic investigations,” in Proceedings of the
rity model for interconnected blockchain-IoT system & cities," in Ad Hoc Second International Conference on Cyber Security, Cyber Peace fare and
Networks, 103299, 2023. Digital Forensics, pp. 69-76, 2013.
[42] Pallavi and V. Bharti, "A Comprehensive Review of Cloud Forensics and [64] S. Hou, S. M. Yiu, T. Uehara, and R. Sasaki, “A privacy-preserving
Blockchain-Based Solutions," in 6th International Conference on Elec- approach for collecting evidence in forensic investigation,” Int. J. Cyber-
tronics, Communication and Aerospace Technology, pp.749-754, 2022. Security and Digital Forensics (IJCSDF), 2(1), 70-78, 2013.
[43] G. Ragu, and S. Ramamoorthy, "A blockchain-based cloud forensics
[65] M. I. Mihailescu and S. L.Nita, “A Searchable Encryption Scheme with
architecture for privacy leakage prediction with cloud," in Healthcare
Biometric Authentication and Authorization for Cloud Environments,” in
Analytics, vol. 4, 100220, 2023.
Cryptography, 6(1), 8, 2022.
[44] G. Kumar, R. Saha, C. Lal, and M. Conti, "Internet-of-Forensic (IoF):
[66] F. Armknecht, and A. Dewald, “Privacy-preserving email forensics,” Dig-
A blockchain-based digital forensics framework for IoT applications," in
ital Investigation, 14, S127-136, 2015.
Future Generation Computer Systems, vol. 120, pp. 13-25, 2021.
[67] K. Afifah, and R. S. Perdana, “Development of search on encrypted
[45] M. Kim, Y. Shin, T. Shon, and W. Jo, "Digital forensic analysis of
data tools for privacy-preserving in digital forensic,” in International
intelligent and smart IoT devices," in J Supercomput, vol. 79. pp. 973-997,
Conference on Data and Software Engineering (ICoDSE), pp. 1-6, 2016.
2023.
[46] J. Gu, B. Sun, X. Du, J. Wang, Y. Zhuang, and Z. Wang, "Consortium [68] D. Unal, A. Al-Ali, F. O. Catak, and M. Hammoudeh, “A secure and effi-
blockchain-based malware detection in mobile devices," in IEEE Access, cient Internet of Things cloud encryption scheme with forensics investiga-
pp. 12118-12128, 2018. tion compatibility based on identity-based encryption,” Future Generation
[47] M. Kerr, F.V. Han, and R. Schyndel, "A blockchain implementation for Computer Systems, 125, 433-445, 2021.
the cataloguing of cctv video evidence," in 2018 15th IEEE International [69] T. B. Ogunseyi and O. M. Adedayo, “Cryptographic Techniques in Digital
Conference on Advanced Video and Signal Based Surveillance (AVSS), pp. Forensics,” In American Academy of Forensic Sciences (AAFS) Annual
1-6, 2018. Scientific Conference, 366, 2023.
[48] S. H. Gopalan, S. A Suba, C. Ashmithashree, A. Gayathri, V. Jebin [70] H. Chen, J. Schroeder, R. V. Hauck, L. Ridgeway, H. Atabakhsh, H. Gupta,
Andrews, "Digital forensics using blockchain," in International Journal and A. W. Clements, “COPLINK Connect: information and knowledge
of Recent Technology and Engineering, vol. 8(2), pp. 182-184, 2019. management for law enforcement,” Decision support systems, 34(3), 271-
[49] T.K. Dasaklis, F. Casino, and C. Patsakis, "Sok: Blockchain solutions for 285, 2003.
forensics," in Technology Development for Security Practitioners , pp. 21- [71] B. Martini, and K. K. R. Choo, “An integrated conceptual digital forensic
40, 2021. framework for cloud computing,” Digital investigation, 9(2), 71-80, 2012.
[50] M. A. Khan, M. T. Quasim, N. S. Alghamdi, and M. Y. Khan, "A secure [72] H. L. Bulbul, H. G. Yavuzcan, and M. Ozel, “Digital forensics: an
framework for authentication and encryption using improved ECC for IoT- analytical crime scene procedure model (ACSPM),” Forensic science
based medical sensor data." IEEE Access, 8, 52018-52027, 2020. international, 233(1-3), 244-256, 2013.
[51] A. Viand, P. Jattke, and A. Hithnawi, “Sok: Fully homomorphic encryption [73] F. Y. Law, K. P. Chow, M. Y. Kwan, and P. K. Lai, “Consistency issue
compilers,” in IEEE Symposium on Security and Privacy (SP), IEEE, pp. on live systems forensics,” in Future Generation Communication and
1092-1108, 2021. Networking (FGCN 2007), vol. 2, pp. 136-140, 2007.
[52] T. B. Ogunseyi and T. Bo, “Fast decryption algorithm for paillier ho- [74] Y. Zhang, Y. Li, and Y. Wang, “Conjunctive and disjunctive keyword
momorphic cryptosystem,” in IEEE International Conference on Power, search over encrypted mobile cloud data in public key system,” Mobile
Intelligent Computing and Systems (ICPICS) , pp. 803-806, 2020. Information Systems, 2018.
[53] J. Park, and E-N Huh, “eCLASS: Edge-Cloud-Log Assuring-Secrecy [75] S. Tahir, L. Steponkus, S. Ruj, M. Rajarajan, and A. Sajjad, “A parallelized
Scheme for Digital Forensics,” Symmetry, 11, 1192, 2019. disjunctive query-based searchable encryption scheme for big data,” Fu-
[54] K. Janjua, M. A. Shah, A. Almogren, H. A. Khattak, C. Maple, I. U. Din, ture Generation Computer Systems, 109, 583-592, 2020.
“Proactive Forensics in IoT: Privacy-Aware Log-Preservation Architecture [76] R. Altschaffel, R. Clausing, C. Kraetzer, T. Hoppe, S. Kiltz, and J.
in Fog-Enabled-Cloud Using Holochain and Containerization Technolo- Dittmann, “Statistical pattern recognition based content analysis on en-
gies,” Electronics, 9(7):1172, 2020. crypted network: Traffic for the teamviewer application,” in 2013 Seventh
[55] M. M. Rathinraj, M. J. R. Rajalakshmi, and M. M. Saranya, “Partial Homo- International Conference on IT Security Incident Management and IT
morphic Encryption for Secure Log Management Using Tor Network,” in Forensics, pp. 113-121, 2013.

18 VOLUME 4, 2016

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3343360

Ogunseyi et al.: Cryptographic Techniques for Data Privacy in Digital Forensics

[77] Y. Zhang, R. Liu, and D. Lin, "Improved Key Generation Algorithm for
Gentry’s Fully Homomorphic Encryption Scheme," in Kim, H., Kim, DC.
(eds) Information Security and Cryptology – ICISC 2017. Lecture Notes
in Computer Science, vol 10779. Springer.
[78] Y. Zhu, X. Li, "Privacy-preserving k-means clustering with local synchro-
nization in peer-to-peer networks," in Peer-to-Peer Network and Applica-
tion vol. 13, pp.2272–2284 2020.
[79] L. B. Nicole, L. Lishu, "Clustering digital forensic string search output,"
in Digital Investigation vol. 11, pp. 314-322, 2014.

TAIWO BLESSING OGUNSEYI received his Ph.D. from the Communication University of China, Beijing, China, in 2020. He is currently an Assistant Professor at Yibin University, Sichuan, China. His research interests include applied cryptography, privacy-enhancing technologies, data privacy, information security, cybersecurity, and machine learning.

OLUWASOLA MARY ADEDAYO (Member, IEEE) received her Ph.D. degree in Computer Science from the University of Pretoria, South Africa, in 2015. She is currently an Assistant Professor at the University of Winnipeg, Manitoba, Canada. Her research interests include databases, database forensics, digital forensics, cybersecurity, and privacy. She is a member of the Canadian Society of Forensic Science as well as an Associate member of the American Academy of Forensic Sciences.
