Healthcare Data Breaches: Implications For Digital Forensic Readiness

This document analyzes healthcare data breaches and their implications for digital forensic readiness. It finds that while new technologies are transforming healthcare, most data breaches are caused by human error, theft, hacking, or ransomware rather than issues with new technologies. Healthcare information is highly sensitive and valuable to cybercriminals. The document examines past breach causes and associated challenges for digital investigations. It proposes a conceptual architecture for forensic auditing to help capture relevant digital artifacts, especially related to privilege misuse, to aid potential future investigations.


Journal of Medical Systems (2019) 43: 7

https://doi.org/10.1007/s10916-018-1123-2

SYSTEMS-LEVEL QUALITY IMPROVEMENT

Healthcare Data Breaches: Implications for Digital Forensic Readiness


Maxim Chernyshev 1 · Sherali Zeadally 2 · Zubair Baig 3

Received: 12 June 2018 / Accepted: 15 November 2018 / Published online: 28 November 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018

Abstract
While the healthcare industry is undergoing disruptive digital transformation, data breaches involving health information
are not usually the result of integration of new technologies. Based on published industry reports, fundamental security
safeguards are still considered to be lacking with many documented data breaches occurring as the result of device and
equipment theft, human error, hacking, ransomware attacks and misuse. Health information is considered to be one of the
most attractive targets for cybercriminals due to its inherent sensitivity, but digital investigations of incidents involving
health information are often constrained by the lack of the necessary infrastructure forensic readiness. Following the
analysis of healthcare data breach causes and threats, we describe the associated digital forensic readiness challenges in
the context of the most significant incident causes. With specific focus on privilege misuse, we present a conceptual
architecture for forensic audit logging to assist with capture of the relevant digital artefacts in support of possible future
digital investigations.

Keywords Computer crime · Forensics · Health information management · Security · Threat

This article is part of the Topical Collection on Systems-Level Quality Improvement

* Maxim Chernyshev
[email protected]

1 Edith Cowan University, Perth, Australia
2 University of Kentucky, Lexington, KY 40506-0224, USA
3 Commonwealth Scientific and Industrial Research Organisation (CSIRO), Data61, Melbourne, Australia

Introduction

Driven by the need to move away from incidental doctor-centered care towards more accessible patient-centered care, the healthcare industry is undergoing disruptive transformation [1]. The associated changes will facilitate increased adoption of technology as part of the evolving health information technology (HIT) ecosystem. Technology-based innovation trends, specifically in the areas of (1) digital health, (2) big data and (3) precision health, are instrumental in supporting the delivery of the future healthcare vision [2].

Ubiquitous Internet connectivity, coupled with the increasing adoption of mobile, wearable and Internet of Things (IoT) technologies, will underpin solutions that handle unprecedented amounts of health information records. Fortunately for patients and practitioners, the increased availability of the data generated and collected will enable more accurate and faster clinical and biomedical actions, including proactive life-saving interventions.

However, health information is also considered to be the most attractive target for cyber criminals. Depending on record completeness, a single patient's file can be sold for several hundred dollars on the dark web [3]. The transformational changes and integration of new technological elements into the HIT ecosystem will expand the attack surface of healthcare services [4]. The associated threat landscape suggests that data breaches involving health information are not generally the result of sophisticated attacks on contemporary technological building blocks, such as IoT sensors and wearable medical devices [5]. In contrast, widespread human errors, misuse and physical actions such as loss and theft have been the major causes behind hundreds of publicly disclosed healthcare data breaches worldwide. This threat pattern is considered unique to the healthcare industry. Leveraging publicly available healthcare data breach information and industry reports, we examine the associated implications from a digital forensic perspective. We aim to identify key digital forensic readiness challenges associated with the primary causes of these breaches. The contributions of this paper are as follows:

• We conduct an analysis of healthcare data breaches focusing on the location of the digital artifacts that can contain potential digital evidence.
• We identify relevant digital forensic readiness challenges that reflect the unique threat pattern of the healthcare sector.
• We present a conceptual architecture to address these challenges in investigations of incidents involving privilege abuse with a specific focus on electronic medical record (EMR) systems.

Despite widespread availability of healthcare data breach information and statistical analyses provided by industry, we are not aware of other works examining the associated implications with a specific digital forensic focus.

Data breaches in healthcare

Regulatory landscape overview

The highly sensitive nature of medical records has been recognized worldwide. There are several privacy frameworks and regulations in existence today in various countries. As shown in Table 1, several jurisdictions such as Australia, the United Kingdom (UK), and the European Union (EU) have enacted legislation that encompasses specific conditions pertaining to the handling and protection of health information under respective privacy laws [7–9]. The United States (US) has dedicated legislation as part of the long-standing Health Insurance Portability and Accountability Act (HIPAA) 1996 [6]. Although other nations do not necessarily have healthcare-specific provisions in their privacy legislation, health information is still usually covered under the broader definition of personal data.

Several regulations such as the HIPAA Breach Notification Rule in the US and the recently introduced Notifiable Data Breaches (NDB) scheme [10] in Australia mandate compulsory data breach notification requirements. Generally speaking, notifications are issued when the data breach poses a high risk of harm to affected individuals, which is often the case with health information, given its highly sensitive nature.

To this effect, the US Department of Health and Human Services (HHS) data breach portal [14] lists over 2250 data breaches (including over 390 still under investigation) for the period between 2009 and 2018. The Australian NDB scheme received 63 submissions in the first six weeks after its introduction, of which health service providers were responsible for the largest share (24%).

Health information definition

The concept of health information requires specific attention. Sometimes, terms such as health records, medical records and health information are used almost interchangeably. Table 2 provides several definition summaries of these similar terms. There is no single universal description for health information because slightly varying terms and definitions have been adopted across the different contexts. Based on the key common aspects of these definitions, health information is considered to be:

Table 1 Major privacy frameworks and regulations

| Country | Framework / Regulation | Healthcare-specific provisions |
| United States | Health Insurance Portability and Accountability Act (HIPAA) 1996 [6] | Dedicated |
| United Kingdom | Data Protection Bill 2017 [7] | Specific conditions (a) |
| European Union | General Data Protection Regulation (GDPR) 2018 [8] | Specific conditions (a) |
| Australia | Privacy Act 1988 [9], Notifiable Data Breaches (NDB) Scheme 2018 [10] | Specific conditions (b) |
| Singapore | Personal Data Protection Act (PDPA) 2012 [11] | Advisory guidelines |
| Canada | Personal Information Protection and Electronic Documents Act (PIPEDA) 2000 [12] | Not specified (c) |
| Japan | Act on the Protection of Personal Information 2005 [13] | Not specified |
| New Zealand | Privacy Act 1993 [3] | Not specified |

(a) Specific definitions and rules around data concerning health, genetic data and biometric data
(b) Specific conditions on obtaining explicit consent, usage of health information and access to collected information
(c) Selected provinces have specialized health-related privacy legislation in place

Table 2 Definitions of health information

| Framework / Regulation | Term used | Definition summary |
| HIPAA 1996 [6] | Protected health information (PHI) | Individually identifiable health information that is transmitted by electronic media, maintained in electronic media, or transmitted or maintained in any other form or medium (a). |
| Data Protection Bill 2017 [7] | Health record | Any record of information relating to someone's physical or mental health that has been made by (or on behalf of) a health professional. |
| GDPR 2018 [8] | Data concerning health | Personal data related to the physical or mental health of a natural person, including the provision of health care services, which reveal information about his or her health status. |
| Privacy Act 1988 [9] | Health information | Any information about someone's health or a disability, as well as any other personal information collected while the person is receiving a health service. |

(a) Excludes certain categories such as education and employment records

• Handled in any form, physical or digital.
• Associated with all aspects of personal health, including physical and mental health.
• Related to past, present and future encounters with medical practitioners.
• Collected, transmitted, processed and stored by various types of organizations (not only healthcare providers).
• Usually linked to individually identifiable data.

Unlike other definitions of health information, which operate in broader terms, the HIPAA's protected health information (PHI) description also includes a list of 18 elements, which must be removed from the data set so that it is no longer identifiable health information. In addition to the more conventional attributes such as names and contact details, this list also includes endpoint identifiers such as device serial numbers, Internet Protocol (IP) addresses and Universal Resource Locators (URLs). Therefore, it is important to have full clarity on the particulars of the definition that is applicable to the legal context of any digital forensic investigation because identical sets of attributes may not necessarily be considered health information in different contexts.

Sensitivity and motives

It is also important to recognize the various sensitivity levels associated with health information [15]. These levels, as shown in Table 3, are based upon the perceived impact of the breach on the individual's privacy and social wellbeing, as well as the potential types of crime that can be committed based on different categories of health information. For example, in certain contexts where healthcare systems do not include compulsory health insurance, such as the US, identity theft using stolen health information can facilitate access to healthcare services by non-insured individuals and also allow them to obtain prescription drugs for financial gain [16]. Whilst financial gain is by far the most common motive, other reasons can include curiosity, grudge and espionage [5].

The attraction of individuals with nefarious objectives to health information is clearly motivated. Given the different types of crime made possible using such data, the black market value of health information is at least 10 to 20 times more than the value of credit card data [17]. Depending on completeness, recency and accuracy, patient files can be sold starting from 10 USD per record up to 1000 USD per record [3]. For example, following a failed extortion attempt, a database containing health information of 655,000 patients was made available for purchase through one of the dark web marketplaces [18]. In this case, the extortion demands were not targeted directly at the individuals whose information was exposed, but rather at the entities responsible for safeguarding their data. Scenarios where criminals become aware of sensitive diagnoses or clinical history via a data breach and then target the affected high-profile individuals are also theoretically possible.

Threat actors and actions

External actors are not necessarily always the biggest threat to health information. The fact that the sector is facing a unique threat pattern has been recognized by the latest special issue Protected Health Information Data Breach Report (PHIDBR) published by Verizon [5]. The 2018 report examines 1368 healthcare data breaches disclosed since the beginning of 2016 and in particular breaches that affect the healthcare industry or involve patients or their health information. Based on this report, the unique threats pertinent to the sector can be summarized as follows:

Table 3 Health information sensitivity levels based on [15]

| Sensitivity | Data categories | Access control scope | Possible crime |
| Normal | Personal; Social | Wide | Identity theft |
| Sensitive | Financial information; Health risks | Based on medical staff role | Fraud |
| Highly sensitive | Clinical diagnoses | Treatment by nominated medical staff only | Extortion |
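As a rough illustration of how the levels in Table 3 might be encoded for an access check, the following minimal Python sketch maps each data category to its sensitivity level and applies the corresponding access control scope. The names `Sensitivity`, `CATEGORY_LEVEL`, `POSSIBLE_CRIME` and `may_access`, as well as the simplified two-flag role model, are assumptions made for this example only; they are not taken from [15] or from any particular EMR system.

```python
from enum import Enum


class Sensitivity(Enum):
    NORMAL = "normal"
    SENSITIVE = "sensitive"
    HIGHLY_SENSITIVE = "highly sensitive"


# Data categories per sensitivity level, transcribed from Table 3.
CATEGORY_LEVEL = {
    "personal": Sensitivity.NORMAL,
    "social": Sensitivity.NORMAL,
    "financial information": Sensitivity.SENSITIVE,
    "health risks": Sensitivity.SENSITIVE,
    "clinical diagnoses": Sensitivity.HIGHLY_SENSITIVE,
}

# Possible crime per sensitivity level, transcribed from Table 3.
POSSIBLE_CRIME = {
    Sensitivity.NORMAL: "identity theft",
    Sensitivity.SENSITIVE: "fraud",
    Sensitivity.HIGHLY_SENSITIVE: "extortion",
}


def may_access(category, is_medical_staff, is_nominated_for_patient):
    """Return True if the requester falls within the access scope of Table 3.

    Assumed reading of the scopes (a simplification for illustration):
    "Wide" admits any requester, "Based on medical staff role" admits any
    medical staff member, and the highly sensitive level admits only staff
    nominated for the patient's treatment.
    """
    level = CATEGORY_LEVEL[category]
    if level is Sensitivity.NORMAL:
        return True
    if level is Sensitivity.SENSITIVE:
        return is_medical_staff
    return is_medical_staff and is_nominated_for_patient
```

Under these assumptions, a denied request for a highly sensitive category is exactly the kind of event that an audit trail would need to record for a later investigation.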

• Internal actors (insiders) are associated with the majority (58%) of recent data breaches.
• Similar to electronic health information, breaches also affect paper-based records.
• Given the lack of fundamental, standard security controls across the affected entities, the sector remains highly susceptible to malware attacks and in particular ransomware.

In contrast to the previous special issue report by Verizon published in 2015 [23], which presented external actors as the most significant threat, we observe a slight pattern shift where internal actors have become the most significant concern. Actions performed by internal actors encompass both human error and misuse. Although the majority of insiders are not malicious, some form of misuse is still involved in almost a third (29.5%) of all analyzed data breach incidents.

Table 4 presents several published healthcare data breach reports and key associated threat pattern characteristics. We found minor disparities in the reported results, leading to a potential lack of clarity around the associated landscape. Thus, we perform our own analysis in order to obtain a clearer understanding of the associated threats.

Based on the types of breaches and threat action groups shown in Fig. 1, when it comes to the "how" aspect, which represents the cause of the breach, internal actors would primarily be associated with unauthorized access, disclosure, loss and improper disposal of health information. In contrast, external actors would generally be associated with hacking, malware and physical incidents, primarily theft. Thus, both internal and external actors need to be considered as similarly significant groups.

Although physical crimes such as device and document theft have been a major issue in the past, the rate of theft has dropped significantly in the last three years. Stolen devices such as laptops and external storage media may contain digital forensic artifacts that could be used to determine whether health information was actually accessed by an unauthorized individual. However, the recovery rate of stolen office equipment in the US is reported to be under 6% [24]. Thus, medical information present on stolen devices is assumed to have been compromised by default and we do not consider the digital forensic implications associated with physical threats further in this work. Similarly, breaches involving human error are excluded from the scope of the subsequent discussion because document mishandling, improper disposal, loss and publishing errors are unlikely to be a common focus of digital investigations.

Digital forensic implications

Digital forensics

The discipline of digital forensics aims to extract court-admissible evidence by using scientifically designed and validated methods applied to data on digital devices [25]. The common digital forensic process involves evidence identification, collection, examination, analysis and reporting. The integrity of the evidence must be verifiably preserved throughout all stages of the process. There are several digital forensic

Table 4 Notable Data Breach Reports

| Publisher | Coverage | Sample size | Context | Key threat pattern characteristic |
| Verizon [5] | 2016–2017 | 1368 | Global | Prevalence of internal actors (57.5%) closely followed by external actors (42%) |
| Bitglass [19] | 2014–2017 | 1179 | US | Increasing pervasiveness of external actors (70.9% in 2017) |
| Maryland Health Care Commission [20] | 2010–2016 | 1780 | US (a) | Growing concern over external actors in 2014–2016 |
| Office of the Australian Information Commissioner (OAIC) [21] | Q1 2018 | 63 | Australia | Largest share of breaches (24%) reported by health service providers, internal actors (50%), external actors (44%) |

(a) Maryland, US compared to the entire US

[Figure 1: two bar charts of the number of breaches per year, 2010 to 2016. Panel a: US Department of Health and Human Services (HHS), N=2166, series Hacking/IT Incident, Loss, Theft, Improper Disposal, Other, Unauthorized Access/Disclosure. Panel b: VERIS Community Database (VCDB), N=2759, series Error, Malware, Physical, Hacking, Misuse, Social.]

Figure 1 Breakdown of healthcare breach types by year based on data provided by the US Department of Health and Human Services (HHS) including archived breaches and breaches under investigation (1A) [14]. Breakdown of threat actor actions by year based on data available in the VERIS Community Database (VCDB) (1B) [22]

process models (DFPM) that are used to guide digital investigations in a structured and documented manner [26–28]. In particular, the integrated digital forensic process model (IDFPM) was proposed to standardize the forensic process terminology and eliminate the differences that exist among the various other models [29]. One of the key features of IDFPM is the introduction of the preparation phase to satisfy the need to establish operational and infrastructure readiness as a critical component of the process model. Subsequently, we argue that the healthcare industry can facilitate improved outcomes in digital investigations by assisting with the establishment of the necessary infrastructure readiness aspects. Such readiness would lead to increased usefulness of evidence as well as a decrease in the associated investigation costs [30]. In the next sections, we focus on the identification of the relevant data sources associated with the most common threat actions and varieties. Knowing how to facilitate digital artifact collection and what sources to target is one of the key steps in the forensic readiness implementation process [31].

Threat actions

Given the motives and various types of actors involved in incidents leading to healthcare breaches, we also examine the different types of threat actions (such as hacking, ransomware, phishing and privilege abuse) and the specific types of assets where digital artifacts containing potential evidence may be located. From Fig. 2, we observe that:

• Specific actions taken as part of hacking are generally not known. This is likely because cyber criminals can rely on tailored tactics and zero-day exploits. As the alleged representative of the "TheDarkOverlord" hacking collective stated, "I keep all my exploits private for my own use. Never publish it…" [18]. The tracks associated with these exploitations, if left by the attackers, may be difficult to uncover without expensive specialist forensic assistance.
• Ransomware is by far the most common type of malware used.

[Figure 2: four bar charts of breach count per threat action variety. Panel a: Hacking. Panel b: Malware. Panel c: Misuse. Panel d: Social Engineering.]

Figure 2 Breakdown of healthcare breach threat action varieties for hacking (2A), malware (2B), misuse (2C) and social engineering (2D). As explained in section 2B, physical actions and human error are not included in the analysis [22]

• Privilege abuse is the most significant concern when it comes to misuse.
• Phishing is the most popular tactic used as part of social engineering attacks.

Subsequently, we also examine the types of assets most commonly involved in these incidents. As shown in Fig. 3, we identify the following sources where digital artifacts may be located:

• Servers (such as web and email servers).
• Databases.
• End-user computers.
• Removable media.

The relatively high prevalence of paper and films in the HHS data set (Fig. 3A) is explained by the fact that breach causes that include improper disposal, human error and misuse are grouped under the same category (unauthorized access and disclosure).

Forensic readiness implications

Hacking

In the context of hacking, achieving forensic readiness is considerably difficult and perhaps is no different to any other industry. The attack surface is vast and can encompass all elements that comprise the infrastructure of the business as well as any of its third-party services, including desktop and mobile endpoints, local wired and wireless networks, border routers, as well as the Internet service provider (ISP) and the cloud infrastructure. To keep up with the constantly changing infrastructure landscape, the science of digital forensics must undergo rapid, continuous evolution [32].

Therefore, digital forensic readiness for investigations involving targeted and random hacking is largely dependent on the security architecture of the HIT ecosystem elements and its ability to capture and preserve vast amounts of digital artifacts that are associated with network-borne threats. These artifacts can be collected via:

1) Traditional sources such as routers, firewalls, security information and event management (SIEM) solutions, intrusion detection systems (IDS) (host and network), honeypots, data loss prevention (DLP) solutions and others.
2) Dynamically evolving components such as software defined networks (SDN), contemporary cloud architectures such as serverless computing and microservices, application programming interfaces (APIs) and container-based virtual environments.

Given that breach discovery can take months and even years [5], the key aspect of readiness in this context is associated with the ability to retain and efficiently analyze vast amounts of digital artifacts. Forensic readiness can be facilitated by using a cloud forensic logging-as-a-service solution to enable artifact capture and retention [33]. Subsequently, intelligent and highly scalable visualization schemes (such as those using frequent item mining and hypergraphs) can be used to assist with streamlining the artifact analysis to pinpoint the potential evidence [34].

Ransomware

In the context of ransomware, which prevents resource access by authorized users until a ransom is paid, achieving digital forensic readiness is also very challenging. Modern ransomware variants called "cryptolockers" encrypt user data and any mounted backup locations upon infection and subsequently demand ransom payment in cryptocurrency such as

[Fig. 3: two bar charts of the number of breaches. Panel a: US Department of Health and Human Services (HHS), by location: Network Server, Paper/Films, Email, Other, Desktop Computer, Electronic Medical Record, Laptop, Other Portable Electronic Device. Panel b: VERIS Community Database (VCDB), by asset variety: Web application, Database, Laptop, Documents, Mail, Desktop, Unknown, Flash drive, Disk drive, Other.]

Fig. 3 Locations of breached health information based on data breaches not caused by theft or loss (3A) [14]. Top ten asset varieties compromised in healthcare data breaches (3B) [22]

Bitcoin. To identify the perpetrators, digital investigations handling ransomware infections usually adopt the "follow the money" strategy [35].

Identification of payment recipients requires the ability to apply the relevant Bitcoin and blockchain de-anonymization tools and techniques. For example, the freely available open-source tool called BitCluster [36] can be used to analyze Bitcoin transactions and group them by participating entities based on public key hashes, which can be effective when tracing ransom payments associated with the same public key as long as the recipients maintain this key for the duration of the campaign.

Several commercial services such as Elliptic¹ state that they offer the capabilities necessary to detect and investigate criminal activity that involves cryptocurrency, leading to criminal convictions [37]. The need to focus on these currencies as part of investigations has facilitated the emergence of the "crypto forensics" concept. In addition to the availability of validated data mining-based approaches that aim to identify payment recipients based on transaction patterns, readiness can be facilitated by cross-jurisdictional sharing of information by cryptocurrency payment processors to enable monitoring of financial criminal activities such as tax evasion and money laundering [35].

¹ https://www.elliptic.co/

Phishing

Combating phishing is challenging and resource-intensive [38]. Phishers have the advantage of being able to easily spawn a new infrastructure and leverage infected web servers and botnets to support their campaigns. The mechanisms that trick the targets into following links and opening malicious attachments are also evolving constantly. There are many approaches aimed at detecting phishing emails, including those based on email content, structural characteristics, behavior and hybrid algorithms [39]. To enable the collection of relevant digital artifacts, a traceback framework would need to be established that includes a forensic backend capturing the relevant attributes of phishing emails in support of subsequent investigations. However, as phishing attacks are constantly advancing, there are still many unresolved challenges to overcome [40].

Privilege abuse

Health information handled by a service resides in a data store. Although emerging architectures to be adopted as part of the digital transformation of healthcare will feature a higher degree of decentralization, most health service providers still rely on systems implemented using the traditional client-server architectures.

Specifically, Electronic Medical Record (EMR) systems usually use a centralized data store supported by a relational database management system (RDBMS) which can be accessed by a local or remote client. Whilst access to health information is generally controlled via role-based access control (RBAC) policies, not all EMR systems share the same degree of granularity when it comes to their support for policy definitions, and the effectiveness of RBAC controls often depends on the correctness of these definitions. In addition, some systems require specific considerations to support open access policies to enable streamlined handling of emergencies [41].

Subsequently, users of EMR systems often have the ability to access more health information than necessary based on their role and patient context, as evidenced by the number of breaches caused by misuse or unauthorized internal access. Several examples of internal actors who have accessed many patient records via an EMR system can be found in the descriptions of data breaches published via the HHS portal [14]. Although it is not clearly evident how such cases of privilege misuse have been uncovered, it is reasonable to assume that discovery would require either random or targeted access log audits to take place or be prompted by external discoveries. Such audits can be manual, heuristics-based or powered by machine learning algorithms [42]. Therefore, from a digital forensic infrastructure readiness perspective, investigation of privilege abuse would be supported by the ability to identify and produce forensically sound digital evidence based on the digital artifacts contained in EMR audit logs.

Table 5 presents a summary of the various types of threats discussed and the associated forensic readiness challenges from an infrastructure perspective. These threats are applicable to the entire industry landscape. In the context of EMR systems, however, the issue of privilege abuse could be considered a high-impact area for achieving infrastructure forensic readiness due to high prevalence.

Misuse of electronic medical record (EMR) systems

Traces of actions performed by EMR users (or other connected services such as programmatic access clients) are expected to be captured in audit logs [41]. In fact, technical safeguards contained in the HIPAA security standards include specific requirements around establishing "mechanisms that record and examine activity in information systems that contain or use electronic protected health information" [6]. However, given that the standard has no associated implementation specifications, specific implementations normally differ, making it difficult to compare their logging actions. First, this could be because a large portion of the smaller healthcare service providers do not have an established information security capability. There is also a lack of incentive to fund and implement specific information security initiatives that focus on key risks associated with the lack of auditable activity

Table 5 Digital forensic readiness complexity levels for top healthcare threat types

| Threat type | Prevalence (a) | Data source diversity | Data source examples | Infrastructure readiness challenges |
|---|---|---|---|---|
| Hacking | Medium | Extreme | Firewall; Intrusion Detection System (IDS); Web server; Hypervisor host; Container orchestration service | Limited ability to process and visualize vast amounts of digital artifacts |
| Ransomware | Low | High | Local computer; Cryptocurrency blockchain | Lack of cross-jurisdictional cooperation to support the “follow the money” strategies |
| Phishing | Low | Medium | Email content and metadata; Landing server | Highly transient nature of linkable artifacts |
| Privilege abuse | High | Low | Database server; Application access logs | Varying degree of logging built into Electronic Medical Record (EMR) systems |

(a) Based on data breach count per threat variety overall as presented in the VERIS Community Database (VCDB) [22]
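
For the privilege abuse row of Table 5, where application access logs are the main data source, a rule-based audit over those logs could look roughly like the sketch below. It is an illustration only, not a method from the paper: the `AccessEvent` fields, the `care_team` mapping and the `watchlist` of high-profile patients are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One entry of a hypothetical EMR access log."""
    user_id: str
    patient_id: str
    action: str  # e.g. "view" or "update"


def flag_suspicious_access(events, care_team, watchlist):
    """Return events that touch a watchlisted patient or a patient
    outside the user's assigned care team (both rules are assumptions)."""
    flagged = []
    for event in events:
        assigned = care_team.get(event.user_id, set())
        if event.patient_id in watchlist or event.patient_id not in assigned:
            flagged.append(event)
    return flagged


events = [
    AccessEvent("nurse01", "p100", "view"),
    AccessEvent("nurse01", "p200", "view"),   # not on nurse01's care team
    AccessEvent("clerk07", "p300", "view"),   # high-profile patient
]
care_team = {"nurse01": {"p100"}, "clerk07": {"p300"}}
watchlist = {"p300"}

flagged = flag_suspicious_access(events, care_team, watchlist)
for event in flagged:
    print(event.user_id, event.action, event.patient_id)
```

Such a rule only works if viewing activity is logged in the first place, which is why the degree of logging built into the EMR system is the limiting factor for this threat type.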

trails. Such entities also possibly run outdated or freely available EMR systems that cannot be patched or upgraded due to business and technological constraints. In other words, integration of additional components that can facilitate forensically sound audit logging may not always be possible, especially if not natively supported by the EMR system already in place. Second, in the absence of a formal specification that defines exactly what events and attributes must be logged, the way to address the requirement remains open to interpretation, thus representing one of the other key challenges. Therefore, system architects must consider the concept of mandatory log events (MLEs) (forensics-enabling activities and actions) as part of the solution design and implementation. Identification of MLEs can be achieved using standards-, resource- and heuristics-driven methods as described in [43]. However, as the study suggests, identifying the best way to specify mandatory logging requirements for ease of comprehension and adoption by software engineers remains an open research challenge.

Availability of auditable action trails is crucial in safeguarding patient privacy and assisting with investigation of incidents involving privacy violations via means of privilege abuse. Once available, the data can be used to facilitate auditing activities at various sophistication levels, ranging from random audits, regular algorithmic audits and rule-based alerting to machine learning-powered behavior analysis and intelligent proactive analytics. In the latter category, we can observe vendors introducing the concepts of Proactive Patient Privacy Analytics (P3A) [44] and Ambient Cognitive Cyber Surveillance [45] for healthcare, specifically aimed at addressing the issue of activity log capture and streamlining of automated analysis to detect violations. The specific analysis use cases are also unique to the healthcare sector and reflect the previously discussed motives such as personal gain, snooping or curiosity. For example, such use cases focus on uncovering specific patterns involving access to health information of co-workers, neighbors, family members and high-profile individuals [44].

Despite many privacy-enhancing and digital forensic benefits, the EMR system landscape faces the challenging task of integrating forensically sound audit logging capabilities. These challenges include both the requirement specification and implementation issues, and there is an opportunity to introduce intelligent solutions that are able to identify MLEs based on operations on health information and can preserve the associated attributes in a forensically sound manner.

Table 6 shows a comparison of several open-source EMR systems. There are significant differences across their available audit logging capabilities. Some of these systems have been adopted worldwide and are used to handle millions of patient records. However, despite most of these systems having some form of audit logging focusing on patient record creation and modification, the majority of them do not have provisions for tracking viewing activity. Unfortunately, the lack of coverage of these activities means that there are likely no artifacts captured by these systems that could become

Table 6 Comparison of built-in audit logging capabilities of selected open source electronic medical record (EMR) solutions

| Name | Version | Last updated | Access model | Audit logging capabilities |
|---|---|---|---|---|
| OpenMRS | 2.3.1 | 2018 | Web-based | Separate audit module that tracks create, update, and delete (CRUD) operations on database objects (not production-ready, not built-in) |
| OpenEMR | 5.0.1 | 2018 | Web-based | Database query-based logging of all operations on patient records with log record integrity checks and optional encryption (HIPAA-friendly) |
| FreeMED | n/a | 2017 | Web-based | Built-in provisions for logging patient record operations exist, but no specific logging method calls could be located in source code upon manual inspection by the authors |
| NOSH | 2.0 | 2018 | Web-based | Automated logging of create, update, and delete (CRUD) operations on data (not context-specific) |
| Solismed | 2.3 | 2018 | Web-based | Built-in audit log module that tracks access to all system activity areas that contain patient data |
| HospitalRun | 1.0.0-beta | 2018 | Web-based; Offline | Tracking of creation and modification of patient records |
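
Two properties that Table 6 compares, coverage of viewing activity and log record integrity checks, can be illustrated with a small tamper-evident audit log. This is a minimal sketch under stated assumptions: the record fields and the SHA-256 hash chain are choices made for this example and are not taken from OpenEMR or any other system in the table.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first record's predecessor


def _digest(body):
    """Hash a record body deterministically (sorted keys)."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(log, user, action, patient_id, timestamp):
    """Append an audit record that is chained to the previous record's hash."""
    body = {
        "user": user,
        "action": action,          # "view" is logged like any other action
        "patient_id": patient_id,
        "timestamp": timestamp,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    log.append({**body, "hash": _digest(body)})


def verify_chain(log):
    """Return True only if no record was altered, removed or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_event(audit_log, "dr_lee", "create", "p100", "2018-04-08T09:00:00Z")
append_event(audit_log, "nurse01", "view", "p100", "2018-04-08T09:05:00Z")
print(verify_chain(audit_log))
```

Chaining each record to its predecessor means that editing or deleting an earlier entry invalidates every later hash, which is one common way to make a log's integrity checkable after the fact.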

useful in privilege abuse investigations. Integration of audit logging capabilities, where not already present or inadequate, remains a challenging task. Thus, we propose a different conceptual solution.

Proposed audit logging architecture for EMR systems

The proliferation of cloud services and the variety of the “X as a service” type offerings did not bypass the field of digital forensics. Numerous solutions and architectures have been proposed to leverage this service delivery model to enable secure cloud-based aggregation and analysis of digital artifacts to improve forensic infrastructure readiness [46–49]. Subsequently, we leverage the established concept of forensic logging as a service (FaaS) to propose an audit logging architecture for EMR systems that is generic and EMR implementation-agnostic, as shown in Fig. 4. To implement this architecture, an EMR deployment would require the integration of two additional elements, namely payload analyzers and an identification module. These elements are not required to be part of the EMR system itself. Rather, they can be deployed alongside it at the technical infrastructure level. Specifically, payload analyzers are expected to reside inline transparently and perform asynchronous lightweight operations such as identifying payloads carrying health information to flag MLE activities.

At the technical level, payload interception can be achieved via multiple options, which include 1) placing a scriptable reverse proxy server in front of the back-end web server, 2) deploying a packet analyzer, 3) replaying binary database logs, 4) installing a custom browser extension (for limited access terminals) and others. Payload analyzers can be protocol-aware and can rely on simple heuristics (such as keyword, expression and medical term matching) to detect payloads of interest that need to be directed to the identification module for analysis. This module utilizes a multi-step process for health information identification, such as the PHI data identification chain based on natural language processing (NLP) presented in [50]. Given the user and device context, the time of activity and the set of identified health information categories, the identification module uses the FaaS application programming interface (API) to enable the collection and preservation of the identified artifacts in the FaaS platform.

The FaaS component in the proposed architecture is based on the architectures described in [47, 48]. In particular, it includes a secure collection and preservation module, an examination and analysis module and a presentation module to support the remaining phases of the digital forensic process. In addition, the component features a provenance module which is responsible for guaranteeing the chain of custody and maintaining the record of artifact access and usage. The FaaS API can be leveraged by external digital forensic tools and privacy analytics solutions in read-only fashion to support additional use cases and analysis types. Such an architecture still requires practical validation using real-life EMR systems and a FaaS component implementation.

Despite it only being presented at the conceptual level, we believe it has practical applicability given its key benefits. First, it does not require modification of the EMR system itself because it only requires the ability to inject payload analyzers across the various EMR system component communication channels, which are often based on standard technological components. Second, given the intelligent health information detection capability (as long as the payload itself is readable by the analyzer), such an architecture could be generically

Fig. 4 Conceptual architecture describing the implementation of the digital forensic process in the context of forensic logging for an Electronic Medical Record (EMR) system. The architecture of the forensic logging as a service component is based on [47, 48]

[Figure: Forensic Logging as a Service (provenance module; collection and preservation module; examination and analysis module; presentation module; application programming interface (API)), accessed by an investigator, digital forensic tools and privacy analytics. An identification module links the service to the Electronic Medical Record (EMR) system (user interface, programmatic interface, back-end, data store), which is accessed by the EMR user and device. Legend: forensic process; payload analyzer.]
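
The flow in Fig. 4 from payload analyzer to identification module to the FaaS API can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the regular expressions, the `InMemoryFaaSClient` class and its `preserve` method are hypothetical stand-ins, and the NLP-based identification step of [50] is reduced here to passing the flagged payload through unchanged.

```python
import re
from datetime import datetime, timezone

# Assumed keyword and expression heuristics; a real analyzer would use a
# curated medical vocabulary and protocol-aware parsing.
PHI_PATTERNS = [
    re.compile(r"\bdiagnos(is|es|ed)\b", re.IGNORECASE),
    re.compile(r"\bprescri(bed|ption)s?\b", re.IGNORECASE),
    re.compile(r"\bMRN[:\s]*\d{6,}\b", re.IGNORECASE),
]


def analyze_payload(payload, user, device):
    """Lightweight analyzer: return a candidate artifact if any heuristic matches."""
    matched = [pattern.pattern for pattern in PHI_PATTERNS if pattern.search(payload)]
    if not matched:
        return None
    return {
        "user": user,
        "device": device,
        "observed_at": datetime.now(timezone.utc).isoformat(),
        "matched_patterns": matched,
        "payload": payload,
    }


class InMemoryFaaSClient:
    """Hypothetical stand-in for a FaaS collection and preservation API."""

    def __init__(self):
        self.artifacts = []

    def preserve(self, artifact):
        self.artifacts.append(artifact)


def identify_and_preserve(payload, user, device, faas):
    """Stand-in for the identification module: forward flagged payloads to FaaS."""
    candidate = analyze_payload(payload, user, device)
    if candidate is None:
        return False
    faas.preserve(candidate)
    return True


faas = InMemoryFaaSClient()
identify_and_preserve('{"note": "Diagnosis: asthma, MRN: 1234567"}', "nurse01", "ws-12", faas)
identify_and_preserve('{"page": "login"}', "nurse01", "ws-12", faas)
print(len(faas.artifacts))
```

Keeping the analyzer to cheap pattern checks and deferring heavier identification to a separate step mirrors the split between the two elements in the figure, so that interception adds little latency to the EMR system's own traffic.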

applicable to systems deployed on premise, in the cloud and also future service deployments that support device communications of healthcare IoT services. Finally, the FaaS platform required is also a generic service that could be part of a larger system collecting artifacts from a diverse set of sources, enabling further context augmentation and additional artifact diversity.

We do acknowledge that not all EMR systems, such as legacy systems already deployed today, would be compatible with the proposed architecture. In cases where system updates that require mandatory audit logging are not feasible, database forensics-based approaches [51] could be considered as an alternative artifact identification and extraction avenue. Specifically, these tasks could be achieved based on the use of query logs natively supported by RDBMS to identify operations involving health information.

Another potential limitation of the proposed architecture is its explicit focus on assisting the enabling of MLE logging to support future digital investigations potentially involving privilege abuse. However, we believe that given the perceived prevalence of this threat in the industry, addressing it requires specialized solutions such as the architecture proposed in this paper. Collection of artifacts associated with threats applicable to EMR systems such as hacking can readily be achieved from traditional sources as we have described in Section C. At the same time, the proposed architecture can be utilized for other types of systems that involve access to and operations on potentially sensitive personal information. For example, a human resource application could leverage the same architecture following a recalibration of the associated payload analyzers and the identification module to recognize the data of interest, such as personnel data, for the associated operating domain.

Conclusion

Despite several transformational technological changes, the healthcare industry still remains highly susceptible to compromises of valuable health information. The threats behind numerous healthcare data breaches represent a unique pattern with the majority of incidents attributed to internal actors. Although human error is widespread, the rate of privilege abuse associated with unauthorized access to health information is alarming.

Focusing on privilege abuse, we have discussed the implications for infrastructure forensic readiness in the context of EMR systems used by many healthcare service providers to handle health information. We have examined several long-standing and freely available open-source EMR systems that are used to handle millions of patient records worldwide. We found that these systems do not always incorporate a sufficient level of forensic logging required to assist investigations

focusing on privilege abuse. Subsequently, without the necessary audit trails it may be challenging to support or refute hypotheses that aim to identify and prove the fact that these incidents occurred. To address this issue, we have proposed an architecture that incorporates an intelligent real-time artifact identification module which can be deployed alongside the EMR system and be integrated into a cloud forensic logging service. In future work, we plan to validate the conceptual architecture and further assess its practical applicability and the ease of implementation and EMR system integration.

Acknowledgements We thank the anonymous reviewers for their valuable comments which helped us to improve the organization and content of this paper.

Compliance with Ethical Standards

Conflict of interest The authors declare that they have no conflict of interest.

Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.

References

1. Cresswell, K. M., and Sheikh, A., Health information technology in hospitals: current issues and future trends. Future Hospital Journal 2(1):50–56, 2015.
2. Bhavnani, S. P., Parakh, K., Atreja, A., Druz, R., Graham, G. N., Hayek, S. S., Krumholz, H. M., Maddox, T. M., Majmudar, M. D., Rumsfeld, J. S., and Shah, B. R., 2017 Roadmap for Innovation - ACC Health Policy Statement on Healthcare Transformation in the Era of Digital Health, Big Data, and Precision Health: A Report of the American College of Cardiology Task Force on Health Policy Statements and Systems of Care. Journal of the American College of Cardiology 70(21):2696–2718, 2017. https://doi.org/10.1016/j.jacc.2017.10.018.
3. Trustwave, The value of data: a cheap commodity or a priceless asset, 2017.
4. Islam, S. R., Kwak, D., Kabir, M. H., Hossain, M., and Kwak, K.-S., The internet of things for health care: a comprehensive survey. IEEE Access 3:678–708, 2015.
5. Verizon, Protected health information data breach report, 2018.
6. U.S. Department of Health & Human Services (HHS), The HIPAA privacy rule. https://www.hhs.gov/hipaa/for-professionals/privacy/index.html. Accessed 8 April 2018.
7. Information Commissioner’s Office (ICO), Data Protection Bill 2017. https://ico.org.uk/for-organisations/data-protection-bill/. Accessed 8 April 2018.
8. European Union (EU), Home Page of EU GDPR. https://www.eugdpr.org/. Accessed 8 April 2018.
9. Office of the Australian Information Commissioner (OAIC), Privacy Act. https://www.oaic.gov.au/privacy-law/privacy-act/. Accessed 8 April 2018.
10. Office of the Australian Information Commissioner (OAIC), Notifiable data breaches scheme. https://www.oaic.gov.au/privacy-law/privacy-act/notifiable-data-breaches-scheme. Accessed 8 April 2018.
11. Singapore Personal Data Protection Commission, Personal data protection act overview. https://www.pdpc.gov.sg/Legislation-and-Guidelines/Personal-Data-Protection-Act-Overview. Accessed 8 April 2018.
12. Office of the Privacy Commissioner of Canada, The Personal information protection and electronic documents act (PIPEDA). https://www.priv.gc.ca/en/privacy-topics/privacy-laws-in-canada/the-personal-information-protection-and-electronic-documents-act-pipeda/. Accessed 8 April 2018.
13. Japan Personal Information Protection Commission, Act on the Protection of Personal Information Act No. 57 of (2003), 2005.
14. U.S. Department of Health & Human Services (HHS), Breach Portal: notice to the secretary of HHS breach of unsecured protected health information. https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf. Accessed 7 April 2018.
15. Blum, B. I., and Orthner, H. F., Implementing health care information systems. In: Implementing Health Care Information Systems. Springer, pp 3–21, 1989.
16. Medical Identity Fraud Alliance (MIFA), The growing threat of medical identity fraud: a call to action, 2013.
17. Czeschik, C., Black Market Value of Patient Data. In: Claudia Linnhoff-Popien RS, Michael Zaddach (ed) Digital Marketplaces Unleashed. Springer-Verlag, 2018. 10.1007/978-3-662-49275-8_78
18. Dissent, D., 655,000 patient records for sale on the dark net after hacking victims refuse extortion demands. The Daily Dot. https://www.dailydot.com/layer8/655000-patient-records-dark-net/. Accessed 21 April 2018.
19. Bitglass, Healthcare breach report 2018: Security Procedures Thwart Attacks, 2018.
20. Moffit, R. E., Health care data breaches: a changing landscape, 2017.
21. Office of the Australian Information Commissioner (OAIC), Notifiable Data Breaches - Quarterly Statistics Report: January 2018–March 2018, 2018.
22. VERIS Community Database (VCDB) Project, The VERIS Community Database (VCDB). http://veriscommunity.net/vcdb.html, 2018.
23. Verizon, Protected health information data breach report, 2015.
24. Federal Bureau of Investigation (FBI), Table 16 property stolen and recovered. https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-16. Accessed 22 April 2018.
25. Palmer, G., A road map for digital forensic research. In: First Digital Forensic Research Workshop, Utica, pp 27–30, 2001.
26. Baryamureeba, V., and Tushabe, F., The enhanced digital investigation process model. In, 2004.
27. Carrier, B., and Spafford, E. H., An event-based digital forensic investigation framework. In: Digital forensic research workshop, 2004.
28. Cohen, F., Toward a Science of Digital Forensic Evidence Examination. In Advances in Digital Forensics VI. Springer Berlin Heidelberg, pp 17–35, 2010.
29. Kohn, M. D., Eloff, M. M., and Eloff, J. H. P., Integrated digital forensic process model. Comput Secur 38:103–115, 2013. https://doi.org/10.1016/j.cose.2013.05.001.
30. Tan, J., Forensic readiness. Cambridge: @stake, 2001, 1–23.
31. Sachowski, J., Implementing Digital Forensic Readiness: From Reactive to Proactive Process. 1st edn. Syngress, 2016.
32. Hunt, R., and Zeadally, S., Network Forensics: An Analysis of Techniques, Tools, and Trends. Computer 45(12):36–43, 2012. https://doi.org/10.1109/MC.2012.252.
33. Khan, S., Gani, A., Wahab, A. W. A., Bagiwa, M. A., Shiraz, M., Khan, S. U., Buyya, R., and Zomaya, A. Y., Cloud log forensics: Foundations, state of the art, and future directions. ACM Computing Surveys (CSUR) 49(1):7, 2016.
34. Jiang, J., Chen, J., Choo, K.-K. R., Liu, C., Liu, K., and Yu, M., A Visualization Scheme for Network Forensics Based on Attribute Oriented Induction Based Frequent Item Mining and Hyper Graph. In Digital Forensics and Cyber Crime. Cham: Springer International Publishing, pp 130–143, 2018.

35. MacRae, J., and Franqueira, V. N., On Locky Ransomware, Al Capone and Brexit. In: International Conference on Digital Forensics and Cyber Crime, Springer, pp 33–45, 2017.
36. BitCluster, BitCluster. https://www.bit-cluster.com. Accessed 28 April 2018.
37. Elliptic, Elliptic. https://www.elliptic.co/what-we-do/bitcoin-forensics. Accessed 28 April 2018.
38. Vargas, J., Bahnsen, A. C., Villegas, S., and Ingevaldson, D., Knowing your enemies: Leveraging data analysis to expose phishing patterns against a major US financial institution. In: Electronic Crime Research (eCrime), 2016 APWG Symposium on. IEEE, pp 1–10, 2016.
39. Hamid, I. R. A., Samsudin, N. A., Mustapha, A., and Arbaiy, N., Dynamic Trackback Strategy for Email-Born Phishing Using Maximum Dependency Algorithm (MDA). In Recent Advances on Soft Computing and Data Mining. Cham: Springer International Publishing, pp 263–273, 2017.
40. Gupta, B. B., Tewari, A., Jain, A. K., and Agrawal, D. P., Fighting against phishing attacks: state of the art and future challenges. Neural Computing and Applications 28(12):3629–3654, 2017. https://doi.org/10.1007/s00521-016-2275-y.
41. Jayabalan, M., and Daniel, T., Continuous and Transparent Access Control Framework for Electronic Health Records: A Preliminary Study. In: International Conference on Information Technology on Information Technology, Information Systems, and Electrical Engineering (ICITISEE 2017), 2017.
42. Kose, I., Gokturk, M., and Kilic, K., An interactive machine-learning-based electronic fraud and abuse detection system in healthcare insurance. Applied Soft Computing 36:283–299, 2015.
43. King, J., Stallings, J., Riaz, M., and Williams, L., To log, or not to log: using heuristics to identify mandatory log events – a controlled experiment. Empirical Software Engineering 22(5):2684–2717, 2017. https://doi.org/10.1007/s10664-016-9449-1.
44. Protenus, Getting Schooled on Patient Privacy Analytics. https://blog.protenus.com/getting-schooled-on-patient-privacy-analytics. Accessed 3 May 2018.
45. Cognetyx, The inconvenient truth about patient data security and privacy in healthcare. https://www.cognetyx.com/the-inconvenient-truth-about-patient-data-security-and-privacy-in-healthcare-cognetyxs-new-ambient-cognitive-cyber-surveillance-solution-is-addressing-this-proble/. Accessed 3 May 2018.
46. Zawoad, S., Dutta, A. K., and Hasan, R., SecLaaS: secure logging-as-a-service for cloud forensics. In: Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security. ACM, pp 219–230, 2013.
47. Nanda, S., and Hansen, R. A., Forensics as a Service: Three-tier Architecture for Cloud based Forensic Analysis. In: Parallel and Distributed Computing (ISPDC), 2016 15th International Symposium on. IEEE, pp 178–183, 2016.
48. Zawoad, S., and Hasan, R., Faiot: Towards building a forensics aware eco system for the internet of things. In: Services Computing (SCC), 2015 IEEE International Conference on. IEEE, pp 279–284, 2015.
49. Raju, B. K., Moharil, B., and Geethakumari, G., FaaSeC: Enabling Forensics-as-a-Service for Cloud Computing Systems. In: 2016 IEEE/ACM 9th International Conference on Utility and Cloud Computing (UCC). pp 220–227, 2016.
50. Yang, H., and Garibaldi, J. M., Automatic detection of protected health information from clinic narratives. Journal of Biomedical Informatics 58:S30–S38, 2015. https://doi.org/10.1016/j.jbi.2015.06.015.
51. Frühwirt, P., Kieseberg, P., Schrittwieser, S., Huber, M., and Weippl, E., InnoDB database forensics: Enhanced reconstruction of data manipulation queries from redo logs. Information Security Technical Report 17(4):227–238, 2013.
