05 Healthcare Data Analytics

This document discusses healthcare data analytics. It begins by defining descriptive, predictive, and prescriptive analytics. It then discusses key concepts in healthcare data analytics like big data, machine learning, data mining, and text mining. The document outlines challenges to healthcare data analytics like data quality issues and ethical concerns. It concludes by discussing research using electronic health records to predict hospital readmissions.

11/27/2021

Princess Sumaya University for Technology


The King Hussein School for Computing Sciences

Course 13768: Health information systems

Topic 5: Healthcare Data Analytics

Dr. Rafat Hammad

Acknowledgements: Most of these slides were prepared by Robert Hoyt, Elmer V. Bernstam, and William Hersh
and adopted for our course. Additional slides have been
added from the references listed in the syllabus.

Learning Objectives

∗ After reviewing this presentation, viewers should be able to:
  ∗ Discuss the difference between descriptive, predictive and prescriptive
    analytics
  ∗ Describe the characteristics of “Big Data”
  ∗ Enumerate the necessary skills for a worker in the data analytics field
  ∗ List the limitations of healthcare data analytics
  ∗ Discuss the critical role electronic health records play in healthcare data
    analytics

Introduction

∗ One of the promises of the growing clinical data in
electronic health record (EHR) systems is secondary
use (or re-use) of the data for other purposes, such as
quality improvement and clinical research
∗ Interest in healthcare data has grown exponentially
due to EHR incentives after the HITECH Act and the
addition of genomic information that will eventually
be integrated with EHRs

Introduction

∗ The term analytics is achieving wide use both in and out of
healthcare. A leader in the field defines analytics as “the
extensive use of data, statistical and quantitative analysis,
explanatory and predictive models, and fact-based
management to drive decisions and actions”
∗ IBM defines analytics as “the systematic use of data and
related business insights developed through applied
analytical disciplines to drive fact-based decision making for
planning, management, measurement and learning”

Different Types of Analytics


Increasing functionality and value

∗ Descriptive – standard types of reporting that describe
current situations and problems (how many uninsured
patients do we have with type 2 diabetes?)
∗ Predictive – simulation and modeling techniques that
identify trends and portend outcomes of actions taken (can
we predict who will be readmitted for heart failure in the
next 30 days?)
∗ Prescriptive – optimizing clinical, financial, and other
outcomes (of those patients identified as high risk for
readmission for heart failure, is it more cost effective to
case manage in the hospital or at home?)
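
The three questions above can be sketched in a few lines of Python. This is only an illustration: the patient records, the prior-admissions threshold, and the cost figures are hypothetical, and the "predictive" step is a toy rule rather than a fitted model.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    insured: bool
    has_type2_diabetes: bool
    prior_admissions: int  # hypothetical feature: admissions in the past year


# Hypothetical records, invented purely for illustration.
PATIENTS = [
    Patient("p1", insured=False, has_type2_diabetes=True, prior_admissions=3),
    Patient("p2", insured=True, has_type2_diabetes=True, prior_admissions=0),
    Patient("p3", insured=False, has_type2_diabetes=False, prior_admissions=1),
    Patient("p4", insured=False, has_type2_diabetes=True, prior_admissions=0),
]


def count_uninsured_diabetics(patients):
    """Descriptive: report on the current situation."""
    return sum(1 for p in patients if not p.insured and p.has_type2_diabetes)


def flag_high_risk(patients, threshold=2):
    """Predictive (toy rule): flag patients with many prior admissions.

    A real model would be fitted to, and validated on, historical data.
    """
    return [p.patient_id for p in patients if p.prior_admissions >= threshold]


def cheaper_option(expected_costs):
    """Prescriptive: choose the option with the lowest expected cost."""
    return min(expected_costs, key=expected_costs.get)
```

Here `cheaper_option({"hospital": 1200.0, "home": 900.0})` would select `"home"`; in practice the expected costs would themselves come from a model, not from fixed numbers.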

Analytics Concepts

∗ Machine learning is the area of computer science that aims
to build systems and algorithms that learn from data
∗ Data mining is defined as the processing and modeling of
large amounts of data to discover previously unknown
patterns or relationships
∗ Text mining, a sub-area, applies data mining techniques to
mostly unstructured textual data
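
As a minimal sketch of text mining on unstructured notes, the Python below counts how often chosen terms appear across a set of clinical notes. The function name and the sample notes are hypothetical. The approach is deliberately naive: it ignores negation, so "No edema" still counts as a mention of edema, which is one reason real clinical text mining relies on natural language processing.

```python
import re
from collections import Counter


def term_counts(notes, terms):
    """Count occurrences of each term (single lowercase words) across notes."""
    counts = Counter()
    for note in notes:
        # Lowercase the note and split it into alphabetic tokens.
        tokens = re.findall(r"[a-z]+", note.lower())
        for term in terms:
            counts[term] += tokens.count(term)
    return counts


# Hypothetical notes, invented for illustration.
notes = [
    "Patient reports dyspnea. Dyspnea worse at night.",
    "No edema.",
]
counts = term_counts(notes, ["dyspnea", "edema"])
```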

Analytics Concepts

∗ Provenance, which is where the data originated and how
trustworthy it is for large-scale processing and analysis
∗ Business intelligence, which in healthcare refers to the
“processes and technologies used to obtain timely, valuable
insights into business and clinical data”
∗ Learning health system, where data can be used for
continuous learning to allow the healthcare system to better
carry out disease surveillance and response, targeting of
healthcare services, improving decision-making, managing
misinformation, reducing harm, avoiding costly errors, and
advancing clinical research

Big Data

∗ Another related term is big data, which describes large and
ever-increasing volumes of data that adhere to the
following attributes:
  ∗ Volume – ever-increasing amounts
  ∗ Velocity – quickly generated
  ∗ Variety – many different types
  ∗ Veracity – from trustworthy sources
∗ While big data is considered a buzzword by some, we already
have to deal with terabytes and petabytes of information
today. With the addition of genomics, big data will escalate

Big Data

∗ Healthcare organizations are generating an ever-increasing
amount of data. In all healthcare organizations, clinical
data takes a variety of forms, from structured (e.g., images,
lab results, etc.) to unstructured (e.g., textual notes
including clinical narratives, reports, and other types of
documents)
∗ For example, it was estimated by Kaiser-Permanente in 2013
that its current data store for its 9+ million members
exceeds 30 petabytes (petabyte = 1024 terabytes) of data

Big Data

∗ Another example is CancerLinQ, which will provide a
comprehensive system for clinicians and researchers
consisting of EHR data collection, application of clinical
decision support, data mining and visualization, and quality
feedback
∗ Lastly, IBM’s Watson is now focusing on healthcare,
specifically oncology, so that massive amounts of cancer
information/research can be analyzed and applied to
individual patient decision making

The Analytics Big Data Pipeline


According to Kumar et al

• One begins with multiple data sources, which
are extracted, cleansed, and normalized
• Statistical processing prepares the data for
output
• Finally, the data helps generate descriptive,
predictive and prescriptive analytics
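
The pipeline stages can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the two sources, the field names, and the choice of blood glucose as the measurement. The factor of 18 is the commonly used approximate conversion from mmol/L to mg/dL for glucose. It is not the implementation described by Kumar et al.

```python
from statistics import mean

# Hypothetical raw readings from two sources that use different units.
SOURCE_A = [
    {"id": "p1", "glucose": "110", "unit": "mg/dL"},
    {"id": "p2", "glucose": "", "unit": "mg/dL"},  # missing value
]
SOURCE_B = [{"id": "p3", "glucose": "7.0", "unit": "mmol/L"}]

MGDL_PER_MMOL = 18.0  # approximate glucose conversion factor


def extract(*sources):
    """Pull rows from every source into a single list."""
    return [row for source in sources for row in source]


def cleanse(rows):
    """Drop rows whose measurement is missing."""
    return [r for r in rows if r["glucose"].strip()]


def normalize(rows):
    """Convert every reading to mg/dL so the sources can be combined."""
    out = []
    for r in rows:
        value = float(r["glucose"])
        if r["unit"] == "mmol/L":
            value *= MGDL_PER_MMOL
        out.append({"id": r["id"], "glucose_mg_dl": value})
    return out


def describe(rows):
    """Statistical processing: a simple descriptive summary."""
    values = [r["glucose_mg_dl"] for r in rows]
    return {"n": len(values), "mean_mg_dl": mean(values)}


summary = describe(normalize(cleanse(extract(SOURCE_A, SOURCE_B))))
```

The summary covers two usable readings; the row with the missing value is dropped at the cleansing stage rather than silently treated as zero.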

Big Data
Big Data will Drive ACOs

∗ Accountable care organizations (ACOs) provide
incentives to deliver high-quality care in cost-efficient
ways that will require a robust IT architecture, health
information exchange (HIE) plus analytics. This
approach would be used to predict and quickly act on
excess costs
∗ As one pundit put it: ACOs = HIE + Analytics

Challenges to Data Analytics

∗ Data generated in the routine care of patients may be limited
in its use for analytical purposes. For example, data may be
inaccurate or incomplete. It may be transformed in ways that
undermine its meaning (e.g., coding for billing priorities)
∗ It may exhibit the well-known statistical phenomenon of
censoring, i.e., the first instance of disease in the record may not
be when it was first manifested (left censoring) or the data
source may not cover a sufficiently long time interval (right
censoring)
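
A minimal Python sketch of how an analyst might flag the two kinds of censoring described above. The function names and the one-year look-back are hypothetical choices for illustration, not standard definitions: a first diagnosis recorded soon after the record begins is treated as possibly left-censored, and a record that ends before the required follow-up is complete is treated as right-censored.

```python
from datetime import date, timedelta


def possibly_left_censored(coverage_start, first_recorded, lookback_days=365):
    """True if the first recorded diagnosis falls so close to the start of
    the record that the disease may have begun before coverage did."""
    return first_recorded - coverage_start < timedelta(days=lookback_days)


def right_censored(index_date, coverage_end, followup_days):
    """True if the data source ends before the follow-up window is complete."""
    return coverage_end < index_date + timedelta(days=followup_days)
```

For example, a patient discharged on 15 December 2020 cannot be assessed for 30-day outcomes if the data source ends on 31 December 2020.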

Challenges to Data Analytics

∗ Data may also incompletely adhere to well-known
standards, which makes combining it from different
sources more difficult
∗ Clinical data mostly allows observational and not
experimental studies, thus raising issues of cause-and-effect
of findings discovered
∗ Research questions asked of the data tend to be
driven by what can be answered, as opposed to
prospective hypotheses

Challenges to Data Analytics

∗ Data are not always as objective as one might like, and
“bigger” is not necessarily better
∗ There are ethical concerns over how the data of
individuals is used, the means by which it is collected,
and the possible divide between those who have access
to data and those who do not
∗ Who owns the data and who can use it?

Research and Application of Analytics

∗ There is an emerging base of research that
demonstrates how data from operational clinical
systems can be used to identify critical situations or
patients whose costs are outliers
∗ There is less research, however, demonstrating how
this data can be put to use to actually improve clinical
outcomes or reduce costs. Studies using EHR data for
clinical prediction have been proliferating

Research and Application of Analytics

∗ One common area of focus has been the use of data analytics
to identify patients at risk for hospital readmission within 30
days of discharge. The importance of this factor comes from
the US Centers for Medicare and Medicaid Services (CMS)
Readmissions Reduction Program that penalizes hospitals for
excessive numbers of readmissions
∗ This has led to research using EHR data to predict hospital
readmissions. Thus far, the results are mixed and several
examples of trials are included in the textbook chapter
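
Before readmissions can be predicted, they have to be labeled in the data. The Python sketch below shows one simple way to do that for a single patient's hospital stays. The function name and the data layout are hypothetical, and the sketch considers only the gap between discharge and the next admission; official measure definitions apply additional rules that are not modeled here.

```python
from datetime import date


def readmitted_within_window(stays, window_days=30):
    """Label each stay of one patient.

    stays: list of (admit_date, discharge_date) tuples.
    Returns one boolean per stay, in chronological order: True if the next
    admission begins within window_days of that stay's discharge.
    """
    ordered = sorted(stays)
    flags = []
    for i, (_, discharge) in enumerate(ordered):
        if i + 1 < len(ordered):
            gap = (ordered[i + 1][0] - discharge).days
            flags.append(0 <= gap <= window_days)
        else:
            # No later admission on record; this may reflect right censoring.
            flags.append(False)
    return flags


# Hypothetical stays, invented for illustration.
stays = [
    (date(2021, 1, 1), date(2021, 1, 5)),
    (date(2021, 1, 20), date(2021, 1, 25)),
    (date(2021, 6, 1), date(2021, 6, 3)),
]
flags = readmitted_within_window(stays)
```

Only the first stay is flagged, because the second admission begins 15 days after the first discharge. These labels would then serve as the outcome variable for a predictive model.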

Research and Application of Analytics


Scenarios for EHR Data Analysis

∗ Predicting 30-day risk of readmission and death
among HIV-infected inpatients
∗ Identification of children with asthma
∗ Risk-adjusting hospital mortality rates
∗ Detecting postoperative complications
∗ Measuring processes of care
∗ Determining five-year life expectancy
∗ Detecting potential delays in cancer diagnosis
∗ Identifying patients with cirrhosis at high risk for
readmission
∗ Predicting out of intensive care unit cardiopulmonary
arrest or death

Research and Application of Analytics


Identifying Patients for Research Using EHR Data

∗ Identifying patients who might be eligible for
participation in clinical studies
∗ Determining eligibility for clinical trials
∗ Identifying patients with diabetes and the earliest
date of diagnosis
∗ Predicting diagnosis in new patients
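
As a sketch of the diabetes item in this list, the Python below finds each patient's earliest recorded diabetes diagnosis from coded encounter data. The record layout and function name are hypothetical. The code prefixes assume ICD-10 coding, where E10 and E11 denote type 1 and type 2 diabetes mellitus; a real cohort definition would usually also draw on laboratory results and medications.

```python
from datetime import date


def earliest_diagnosis(records, code_prefixes=("E10", "E11")):
    """Return {patient_id: earliest date} for matching diagnosis codes.

    records: iterable of (patient_id, diagnosis_code, encounter_date).
    """
    earliest = {}
    for patient_id, code, seen_on in records:
        if code.startswith(code_prefixes):
            if patient_id not in earliest or seen_on < earliest[patient_id]:
                earliest[patient_id] = seen_on
    return earliest


# Hypothetical coded encounters, invented for illustration.
records = [
    ("p1", "E11.9", date(2019, 5, 2)),
    ("p1", "E11.9", date(2017, 3, 14)),
    ("p2", "I10", date(2018, 8, 1)),
    ("p3", "E10.9", date(2020, 1, 9)),
]
first_seen = earliest_diagnosis(records)
```

The earliest recorded date is not necessarily the date of onset; it is only the first time the diagnosis appears in this data source.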

Research and Application of Analytics


Use EHR Data to Replicate Randomized Controlled Trials

∗ Virtual Data Warehouse (VDW) Project was able to demonstrate
a link between childhood obesity and hyperglycemia in
pregnancy
∗ United Kingdom General Practice Research Database
(UKGPRD), a repository of longitudinal records of general
practitioners, was able to demonstrate the ability to replicate
the findings of the Women’s Health Initiative and RCTs of other
cardiovascular diseases

Research and Application of Analytics


Use EHR Data to Replicate Randomized Controlled Trials

∗ Other data repositories have helped to predict a
variety of cancers, risk for venous thromboembolism
(blood clots) and even rare medical disorders
∗ Note the info box in the next slide that discusses data
analytics by the Veterans Health Administration (VHA)

Case Study: Veterans Health Administration (VHA)

The VHA is a large healthcare system with a long track record of EHR use (VistA). In 2013, the
VHA had 30 million unique electronic patient records with 2 billion clinical notes (100,000
notes added daily). They also have had a corporate data warehouse (CDW) of structured
data which allows them to analyze clinical and administrative data for patients at risk of
hospital admission (from falls, coronary disease, PTSD, etc.). Analytics are run once weekly
on all primary care patients looking for “at risk” patients who would likely require more
coordinated care using care managers, home health and telehealth. In 2012, VHA researchers
reported in the American Journal of Cardiology on the use of predictive analytics on heart
failure patients. Specifically, using six categories of risk factors derived from the EHR they
could successfully predict which patients were at risk of hospitalization and death.

According to Dr. Stephen Fihn, Director of Analytics and Business Intelligence for the VHA,
the VHA is embarking on a 24-month pilot project to expand the use of healthcare data
analytics. They will use natural language processing and machine learning to analyze patient
records to aid in diagnosis, identify dangerous drug-drug interactions and optimally design
treatment strategies.

Research and Application of Analytics


Using Genomic Information and EHRs

∗ Researchers have carried out genome-wide association
studies (GWAS) that associate specific findings from the EHR
(the “phenotype”) with the growing amount of genomic and
related data (the “genotype”) in the Electronic Medical
Records and Genomics (eMERGE) Network
∗ eMERGE has demonstrated the ability to identify genomic
variants associated with atrioventricular conduction
abnormalities, red blood cell traits, white blood cell count
abnormalities, and thyroid disorders

Research and Application of Analytics


Using Genomic Information and EHRs

∗ More recent work has “inverted” the paradigm to
carry out phenome-wide association studies
(PheWAS) that associated multiple phenotypes with
varying genotypes
∗ Genome-wide and phenome-wide association studies
are also discussed in the chapter on bioinformatics

Role of Informaticians in Analytics

∗ There has been little focus on the human experts who will
carry out analytics, to say nothing of those who will support
their efforts in building systems to capture data, put it into
usable form, and apply the results of analysis
∗ Where will these workers come from and what will be the
education of those who work in this emerging area, that
some call data science?
∗ We do know that data analytics experts are in high demand

Role of Informaticians in Analytics

∗ From basic biomedical scientists to clinicians and
public health workers, those who are researchers and
practitioners are drowning in data, needing tools and
techniques to allow its use in meaningful and
actionable ways
∗ Dr. Hersh believes that a strong background in Health
Informatics or Biomedical Informatics is the best
preparation for the healthcare data analytics field

Role of Informaticians in Analytics

∗ Data science is more than statistics or computer science
applied in a specific subject domain. It requires an
understanding of data, its varying types, and how to
manipulate and leverage it
∗ The field requires skills in machine learning, a strong
foundation in statistics (especially Bayesian), computer
science (representation and manipulation of data), and
knowledge of correlation and causation (modeling)

The Need for Data Analytics Experts

∗ A report by McKinsey consulting states that there will soon
be a need in the US for 140,000-190,000 individuals who have
“deep analytical talent” and an additional 1.5 million “data-savvy
managers needed to take full advantage of big data”
∗ An analysis by SAS estimated that by 2018, there will be over
6400 organizations that will hire 100 or more analytics staff
∗ Another report found that data scientists currently comprise
less than 1% of all big data positions, with more common job
roles consisting of developers (42% of advertised positions),
architects (10%), analysts (8%) and administrators (6%)

The Need for Data Analytics Experts

∗ The technical skills most commonly required for big data
positions as a whole were NoSQL, Oracle, Java and SQL
∗ PriceWaterhouseCoopers noted that healthcare
organizations need to acquire talent in systems and data
integration, data statistics and analytics, technology and
architecture support, and clinical informatics
∗ Business knowledge is also useful

The Need for Data Analytics Experts


What Skill Sets Should Universities Train For?

∗ Programming - especially with data-oriented tools, such as
SQL and statistical programming languages
∗ Statistics - working knowledge to apply tools and
techniques
∗ Domain knowledge - depending on one's area of work,
bioscience or health care
∗ Communication - being able to understand needs of people
and organizations and articulate results back to them

Conclusions

∗ Healthcare data has proliferated greatly, in large part due to the
accelerated adoption of EHRs
∗ Analytic platforms will examine data from multiple sources, such
as clinical records, genomic data, financial systems, and
administrative systems
∗ Analytics is necessary to transform data to information and
knowledge
∗ Accountable care organizations and other new models of
healthcare delivery will rely heavily on analytics to analyze
financial and clinical data
∗ There is a great demand for skilled data analysts in healthcare;
expertise in informatics will be important for such individuals
