Real‑world data: a brief review of the methods, applications, challenges and opportunities
Abstract
Background The increased adoption of the internet, social media, wearable devices, e-health services, and other
technology-driven services in medicine and healthcare has led to the rapid generation of various types of digital data,
providing a valuable data source beyond the confines of traditional clinical trials, epidemiological studies, and lab-
based experiments.
Methods We provide a brief overview of the types and sources of real-world data and the common models and approaches used to utilize and analyze real-world data. We discuss the challenges and opportunities of using real-world data for evidence-based decision making. This review does not aim to be comprehensive or cover all aspects of this intriguing topic (from both the research and practical perspectives) but serves as a primer and provides useful sources for readers who are interested in this topic.
Results and Conclusions Real-world data hold great potential for generating real-world evidence for designing and conducting confirmatory trials and for answering questions that may not be addressed otherwise. The volume and complexity of real-world data also call for the development of more appropriate, sophisticated, and innovative data processing and analysis techniques, for maintaining scientific rigor in research findings, and for attention to data ethics in order to harness the power of real-world data.
Keywords Real-world data (RWD), Real-world evidence (RWE), Electronic health records, Machine learning, Artificial
intelligence, Causal inference
Fig. 1 RWD Types and Sources (source: Fig. 1 in [16] with written permission by Dr. Brandon Swift to use the figure)
vaccination [3–5], to model localized COVID-19 control strategies [6], to characterize COVID-19 and flu using data from smartphones and wearables [7], to study behavioral and mental health changes in relation to the lockdown of public life [8], and to assist in decision and policy making, among others.
In what follows, we provide a brief review of the types and sources of RWD (Section 2) and the common models and approaches used to utilize and analyze RWD (Section 3), and we discuss the challenges and opportunities of using RWD for evidence-based decision making (Section 4). This review does not aim to be comprehensive or cover all aspects of this intriguing topic (from both the research and practical perspectives) but serves as a primer and provides useful sources for readers who are interested in this topic.

Characteristics, types and applications of RWD
RWD have several distinctive characteristics compared to data collected from randomized trials in controlled settings. First, RWD are observational, as opposed to data gathered in a controlled setting. Second, many types of RWD are unstructured (e.g., texts, imaging, networks) and at times inconsistent due to entry variations across providers and health systems. Third, RWD may be generated in a high-frequency manner (e.g., measurements at the millisecond level from wearables), resulting in voluminous and dynamic data. Fourth, RWD may be incomplete and lack key endpoints for an analysis, given that the original collection was not for such a purpose; for example, claims data usually do not have clinical endpoints, and registry data have limited follow-up. Fifth, RWD may be subject to bias and measurement errors (random and non-random); for example, data generated from the internet, mobile devices, and wearables can be subject to selection bias, an RWD dataset may be an unrepresentative sample of the underlying population that a study intends to understand, and claims data are known to contain fraudulent values. In summary, RWD are messy, incomplete, heterogeneous, and subject to different types of measurement errors and biases. A systematic scoping review of the literature suggests that the data quality of RWD is not consistent and that quality assessments are challenging due to the complex and heterogeneous nature of these data. The sub-optimal data quality of RWD is well recognized [9–12]; how to improve it (e.g., to regulatory grade) is work in progress [13–15].
There are many different types of RWD. Figure 1 [16] provides a list of the RWD types and sources in medicine; we also refer readers to [11] for a comprehensive overview of RWD data types. Here we use a few common RWD types, i.e., EHRs, registry data, claims data, patient-reported outcome (PRO) data, and data collected from wearables, as examples to demonstrate the variety of RWD and the purposes for which they can be used.
EHRs are collected as part of routine care across clinics, hospitals, and healthcare institutions. EHR data are typical RWD – noisy, heterogeneous, both structured and unstructured (e.g., text, imaging), and dynamic – and they require careful and intensive pre-processing efforts [17]. EHRs have created unprecedented opportunities for data-driven approaches to learn patterns, make new discoveries, and assist preoperative planning, diagnostics, and clinical prognostication, among others [18–27]; to improve predictions of selected outcomes, especially when linked with administrative and claims data and combined with proper machine learning techniques [27–30]; and to validate and replicate findings from clinical trials [31].
Registry data come in various types. For example, product registries include patients who have been exposed to a biopharmaceutical product or a medical device; health services registries consist of patients who have had a common procedure or hospitalization; and disease registries contain information about people diagnosed with a specific type of disease. Registry data enable the identification and sharing of best clinical practices, improve the accuracy of estimates, and provide valuable data for supporting regulatory decision-making [32–35]. Especially for rare diseases, where clinical trials are often small and data are subject to high variability, registries provide a valuable data source to help understand the course of a disease and provide critical information for confirmatory clinical trial design and for translational research to develop treatments and improve patient care [34, 36, 37]. Readers may refer to [38] for a comprehensive overview of registry data and how they help the understanding of patient outcomes.
Claims data refer to data generated during the processing of healthcare claims in health insurance plans or from practice management systems. Although claims data are collected and stored primarily for payment purposes, they have been used in healthcare to understand patients' and prescribers' behavior and how they interact, to estimate disease prevalence, to learn about disease progression, disease diagnosis, medication usage, and drug-drug interactions, and to validate and replicate findings from clinical trials [31, 39–46]. A known pitfall of claims data, on top of some of the common data characteristics of RWD, is fraud, such as upcoding¹ [47]. The data fraud problem can be mitigated with detailed audits and the adoption of modern statistical, data mining, and ML techniques for fraud detection [48–51].
¹ Upcoding refers to instances in which a medical service provider obtains additional reimbursement from insurance by coding a service it provided as a more expensive service than what was actually performed.
PRO data refer to data reported directly by patients on their health status. PRO data have been used to provide RWE on the effectiveness of interventions, symptom monitoring, and relationships between exposures and outcomes, among others [52–55]. PRO data are subject to recall bias and large inter-individual variability.
Wearable devices generate continuous streams of data. When combined with contextual data (e.g., location data, social media), they provide an opportunity to conduct expansive research studies that are large in scale and scope [56] and that would otherwise be infeasible in controlled trials. Examples of using wearable RWD to generate RWE include applications in neuroscience and environmental health [57–60]. Wearables generate huge amounts of data; advances in data storage, real-time processing capabilities, and efficient battery technology will be essential for their full utilization.

Using and analyzing RWD
A wide range of research methods are available to make use of RWD. In what follows, we outline a few approaches, including pragmatic clinical trials, target trial emulation, and applications of ML and AI techniques.
Pragmatic clinical trials are trials designed to test the effectiveness of an intervention in the real-world clinical setting. Pragmatic trials leverage the increasingly integrated healthcare system and may use data from EHRs, claims, patient reminder systems, telephone-based care, etc. Due to the data characteristics of RWD, new guidelines and methodologies have been developed to mitigate bias in the RWE generated from RWD for decision making and causal inference, especially for per-protocol analyses [61, 62]. The research question under investigation in pragmatic trials is whether an intervention works in real life, and the trials are designed to maximize the applicability and generalizability of the intervention. Various types of outcomes can be measured in these trials, but they are mostly patient-centered, instead of the typical measurable symptoms or markers used in explanatory trials. For example, the ADAPTABLE trial [63, 64] is a high-profile pragmatic trial and the first large-scale, EHR-enabled clinical trial conducted within the U.S. It used EHR data to identify around 450,000 patients with established atherosclerotic cardiovascular disease (CVD) for recruitment and eventually enrolled about 15,000 individuals at 40 clinical centers, who were randomized to two aspirin dose arms. Electronic follow-up for patient-reported outcomes was completed every 3 to 6 months, with a median follow-up of 26.2 months, to determine the optimal dosage of aspirin in CVD patients,
with the primary endpoint being the composite of all-cause mortality, hospitalization for non-fatal myocardial infarction, or hospitalization for non-fatal stroke. The cost of ADAPTABLE is estimated to be only 1/5 to 1/2 that of a traditional RCT of the same scale.
Target trial emulation is the application of design and analysis principles from (target) randomized trials to the analysis of observational data [65]. By precisely specifying the target trial's inclusion/exclusion criteria, treatment strategies, treatment assignment, causal contrast, outcomes, follow-up period, and statistical analysis, one may draw valid causal inferences about an intervention from RWD. Target trial emulation can be an important tool, especially when a comparative evaluation is not yet available or feasible in randomized trials. For example, [66] employs target trial emulation to evaluate real-world COVID-19 vaccine effectiveness, measured by protection against COVID-19 infection or related death, in racially and ethnically diverse, elderly populations by comparing newly vaccinated persons with matched unvaccinated controls using data from the US Department of Veterans Affairs health care system. The emulated trial was conducted with clearly defined inclusion/exclusion criteria and identification of matched controls, including matching based on propensity scores with careful selection of model covariates. Target trial emulation has also been used to evaluate the effect of colon cancer screening on cancer incidence over eight years of follow-up [67] and the risk of urinary tract infection among diabetic patients [68].
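As a minimal illustration of the matching step described above (not the analysis code of [66] or the other cited studies), the sketch below pairs newly treated individuals with untreated controls on a propensity score estimated from baseline covariates. The column names, the treatment indicator, and the caliper value are hypothetical, and matching is done with replacement for simplicity.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def emulate_matched_cohort(df, covariates, treat_col="vaccinated", caliper=0.05):
    """1:1 nearest-neighbor propensity-score matching (illustrative sketch only)."""
    # Step 1: estimate the propensity of treatment from baseline covariates
    ps_model = LogisticRegression(max_iter=1000)
    ps_model.fit(df[covariates], df[treat_col])
    df = df.assign(ps=ps_model.predict_proba(df[covariates])[:, 1])

    treated = df[df[treat_col] == 1]
    controls = df[df[treat_col] == 0]

    # Step 2: for each treated person, find the closest control on the propensity score
    # (matching with replacement, so a control may be reused)
    nn = NearestNeighbors(n_neighbors=1).fit(controls[["ps"]])
    dist, idx = nn.kneighbors(treated[["ps"]])

    # Step 3: keep only pairs whose propensity scores differ by no more than the caliper
    keep = dist.ravel() <= caliper
    matched_controls = controls.iloc[idx.ravel()[keep]]
    matched_treated = treated[keep]
    return pd.concat([matched_treated, matched_controls])

# Hypothetical usage with baseline covariates measured before "time zero" of the emulated trial:
# cohort = emulate_matched_cohort(cohort_df, ["age", "sex", "charlson_index"], treat_col="vaccinated")
```

A real emulation would additionally pre-specify eligibility, time zero, the treatment strategies being contrasted, and the outcome analysis, as described in the paragraph above.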
RWD can also be used as historical controls and refer- tions and classification (e.g., disease diagnosis), variable
ence groups for controlled trials, with assessment of the selections (e.g, biomarker screening), data visualization,
quality and appropriateness of the RWD and employ- etc, rather than generating regulatory-level RWE; but this
ment of proper statistical approaches for analyzing the may change soon as regulatory agencies are aggressively
data [69]. Controlling for selection bias and confounding evaluating ML/AI for generating RWE and engaging
is key to the validity of this approach because of the lack stakeholders on the topic [96–99].
of randomization and potentially unrecognized baseline It would be more effective and powerful to combine
differences, and the control group needs to be compa- the expertise from statistical inference and ML when it
rable with the treated group. RWD also provide a great comes to generating RWE and learning causal relation-
opportunity to study rare events given the data volumi- ships. One of the recent methodological developments
nousness [70–72]. These studies also highlight the need is indeed in that direction – leveraging the advances in
for improving the RWD data quality, developing surro- semi-parametric and empirical process theory and incor-
gate endpoints, and standardizing data collection for out- porating the benefits of ML into comparative effective-
come measures in registries. ness using RWD. A well-known framework is targeted
In terms of analysis of RWD, statistical models and learning [100–102] that has been successfully applied in
inferential approaches are necessary for making sense causal inference for dynamic treatment rules using EHR
of RWD, obtaining causal relationships, testing/validat- data [103] and efficacy of COVID-19 treatments [104],
ing hypotheses, and generating regulatory-grade RWE among others.
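To make the text-to-vector step mentioned above concrete, here is a minimal sketch (assumed for illustration, not taken from any cited study) that turns free-text clinical notes into numeric vectors with a bag-of-words/TF-IDF representation and fits a simple classifier; the note texts and the outcome label are hypothetical. Deep embedding models would replace the vectorizer in a larger pipeline, but the overall pattern is the same.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical EHR notes and a binary outcome (e.g., 30-day readmission)
notes = [
    "patient reports chest pain, troponin elevated, started on aspirin",
    "routine follow-up, blood pressure controlled, no complaints",
    "shortness of breath, bilateral infiltrates on chest x-ray",
]
readmitted = [1, 0, 1]

# TF-IDF turns each note into a sparse real-valued vector; the classifier
# then learns a prediction rule from those vectors.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1),
                      LogisticRegression(max_iter=1000))
model.fit(notes, readmitted)
print(model.predict_proba(["follow-up visit, chest pain resolved"])[:, 1])
```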
It would be more effective and powerful to combine the expertise from statistical inference and ML when it comes to generating RWE and learning causal relationships. One of the recent methodological developments is indeed in that direction – leveraging the advances in semi-parametric and empirical process theory and incorporating the benefits of ML into comparative effectiveness research using RWD. A well-known framework is targeted learning [100–102], which has been successfully applied in causal inference for dynamic treatment rules using EHR data [103] and in evaluating the efficacy of COVID-19 treatments [104], among others.
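In the same spirit, although simpler than a full targeted maximum likelihood estimator, the sketch below combines an ML-based outcome model with a propensity model in a doubly robust (AIPW-style) estimate of an average treatment effect. All variable names and the simulated data are hypothetical; the estimator shown illustrates the general idea of pairing ML nuisance models with a statistically principled estimator, not the cited implementations, and cross-fitting is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

def aipw_ate(X, a, y):
    """Augmented inverse-probability-weighted (doubly robust) ATE estimate.

    X: baseline covariates (n x p), a: binary treatment (n,), y: outcome (n,).
    ML models estimate both nuisance functions; if either the outcome model or
    the propensity model is well specified, the estimate remains consistent.
    """
    # Outcome regressions under treatment and control
    mu1 = GradientBoostingRegressor().fit(X[a == 1], y[a == 1]).predict(X)
    mu0 = GradientBoostingRegressor().fit(X[a == 0], y[a == 0]).predict(X)
    # Propensity score, truncated away from 0 and 1 for stability
    ps = GradientBoostingClassifier().fit(X, a).predict_proba(X)[:, 1]
    ps = np.clip(ps, 0.01, 0.99)
    # AIPW estimate of the average treatment effect
    return np.mean(mu1 - mu0
                   + a * (y - mu1) / ps
                   - (1 - a) * (y - mu0) / (1 - ps))

# Hypothetical usage with simulated data (true treatment effect = 2.0)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
y = 2.0 * a + X[:, 1] + rng.normal(size=500)
print(aipw_ate(X, a, y))
```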
Regardless of whether an RWD project focuses on causal inference or on prediction and classification, the representativeness of the RWD for the population to which the conclusions of the project will be generalized is critical; otherwise, estimation or prediction can be misleading or even harmful.
The information in RWD might not be adequate to validate the appropriateness of the data for generalization; in that case, investigators should resist the temptation to generalize to groups that they are unsure about.

Challenges and opportunities
Various challenges – from data gathering to data quality control to decision making – still exist at all stages of the RWD life cycle, despite all the excitement around its transformative potential. We list some of the challenges below, where plenty of opportunities for improvement exist and greater efforts are needed to harness the power of RWD.
Data quality: RWD are now often used for purposes other than those for which they were originally collected and thus may lack information on critical endpoints and may not always be positioned for generating regulatory-grade evidence. On top of that, RWD are messy, heterogeneous, and subject to various measurement errors, all of which contribute to the lower quality of RWD compared to data from controlled trials. As a result, the accuracy and precision of results based on RWD are negatively impacted, and misleading results or false conclusions can be generated. While these issues do not preclude the use of RWD in evidence generation and decision making, data quality issues need to be consistently documented and addressed as much as possible through data cleaning and pre-processing (e.g., imputation to fill in missing values, over-sampling for imbalanced data, denoising, combining disparate pieces of information across databases, etc.). If an issue cannot be addressed during the pre-processing stage, efforts should be made to correct for it during data analysis, or caution should be used when interpreting the results. Early engagement of key stakeholders (e.g., regulatory agencies if needed, research institutes, industry, etc.) is encouraged to establish data quality standards and reduce unforeseen risks and issues.
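As a small, self-contained illustration of the pre-processing steps just listed (imputation and over-sampling), the sketch below uses standard scikit-learn components. The feature names and the imbalanced outcome are hypothetical, and a real RWD pipeline would add many more checks (denoising, record linkage, plausibility rules, etc.).

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.utils import resample

# Hypothetical RWD extract with missing values and a rare outcome
df = pd.DataFrame({
    "age": [71, 64, np.nan, 58, 80, 69],
    "sbp": [150, np.nan, 132, 140, np.nan, 128],
    "event": [0, 0, 0, 0, 1, 0],   # rare endpoint (imbalanced classes)
})

# Step 1: impute missing values (median imputation as a simple default)
imputer = SimpleImputer(strategy="median")
df[["age", "sbp"]] = imputer.fit_transform(df[["age", "sbp"]])

# Step 2: over-sample the minority class so the classes are balanced
minority = df[df["event"] == 1]
majority = df[df["event"] == 0]
minority_up = resample(minority, replace=True, n_samples=len(majority), random_state=0)
balanced = pd.concat([majority, minority_up]).reset_index(drop=True)
print(balanced["event"].value_counts())
```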
Efficient and practical ML and statistical procedures: The fast growth of digital medical data, along with the workforce and investment flooding into the field, drives the rapid development and adoption of modern statistical procedures and ML algorithms to analyze the data. The availability of open-source platforms and software greatly facilitates the application of these procedures in practice. On the other hand, the noisiness, heterogeneity, incompleteness, and imbalance of RWD may cause considerable under-performance of existing statistical and ML procedures and demand new procedures that target RWD specifically and can be effectively deployed in the real world. Further, the availability of open-source platforms and software and the accompanying convenience, while offered with good intentions, also increases the chance of practitioners misusing the procedures if they are not equipped with proper training in, or an understanding of, the principles of the techniques before applying them to real-world situations. In addition, to maintain scientific rigor during the RWE generation process, results from statistical and ML procedures require medical validation, either using expert knowledge or by conducting reproducibility and replicability studies, before they are used for decision making in the real world [105].
Explainability and interpretability: Modern ML approaches are often employed in a black-box fashion, and there is a lack of understanding of the relationships between inputs and outputs and of causal effects. Model selection, parameter initialization, and hyper-parameter tuning are also often conducted in a trial-and-error manner, without domain expert input. This is in contrast to the medical and healthcare field, where interpretability is critical to building patient/user trust, and doctors are unlikely to use technology that they do not understand. Promising and encouraging research on this topic has already started [106–111], but more research is warranted.
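One widely used, model-agnostic way to peek inside a black-box model is permutation importance, which measures how much predictive performance drops when a feature is shuffled. The sketch below, with hypothetical features and outcome, is a minimal example of that idea; it is an illustration only, not a substitute for the interpretability methods surveyed in [106–111].

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical tabular data standing in for an EHR-derived feature matrix
X, y = make_classification(n_samples=1000, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle each feature in the held-out set and record the drop in accuracy;
# large drops indicate features the model relies on most.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: mean importance = {result.importances_mean[i]:.3f}")
```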
Reproducibility and replicability: Reproducibility and replicability² are major principles in scientific research, and research with RWD is no exception. If an analytical procedure is not robust and its output is not reproducible or replicable, the public would call into question the scientific rigor of the work and doubt the conclusions of an RWD-based study [113–115]. Result validation, reproducibility, and replicability can be challenging for RWD given their messiness, incompleteness, and unstructured nature, but they need to be established, especially considering that the generated evidence could be used in regulatory decisions and affect the lives of millions of people. Irreproducibility can be mitigated by sharing raw and processed data and code, assuming no privacy is compromised in the process. For replicability, given that RWD are not generated from controlled trials and every dataset may have its own unique characteristics, complete replicability can be difficult or even infeasible. Nevertheless, detailed documentation of data characteristics and pre-processing, pre-registration of analysis procedures, and adherence to open science principles (e.g., code repositories [116]) are critical for replicating findings on different RWD datasets, assuming they come from the same underlying population. Readers may refer to [117–119] for more suggestions and discussions on this topic.
² Reproducibility refers to "instances in which the original researcher's data and computer codes are used to regenerate the results," and replicability refers to "instances in which a researcher collects new data to arrive at the same scientific findings as a previous study." [112]
Privacy: Ethical issues exist when an RWD project is implemented, among which privacy is a commonly discussed topic. Information in RWD is often sensitive, covering medical histories, disease status, financial situations, and social behaviors, among others. Privacy risk can increase dramatically when different databases (e.g., EHR, wearables, claims) are linked together, a common practice in the analysis of RWD. Data users and policymakers should make every effort to ensure that RWD collection, storage, sharing, and analysis follow established data privacy principles (i.e., lawfulness, fairness, purpose limitation, and data minimization). In addition, privacy-enhancing technology and privacy-preserving data sharing and analysis can be deployed; there already exist plenty of effective and well-accepted state-of-the-art concepts and approaches, such as differential privacy³ [120] and federated learning⁴ [121, 122]. Investigators and policymakers may consider integrating these concepts and technologies when collecting and analyzing RWD and when disseminating the results and RWE derived from them.
³ Differential privacy provides a mathematically rigorous framework in which randomized procedures are used to guarantee individual privacy when releasing information.
⁴ Federated learning enables local devices to collaboratively learn a shared model while keeping all training data on the local devices without sharing, mitigating privacy risks.
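As a minimal illustration of the differential privacy idea in footnote 3, the sketch below releases a noisy count using the Laplace mechanism: for a counting query (sensitivity 1), adding Laplace noise with scale 1/ε yields an ε-differentially private release. The query and the ε value are hypothetical and chosen only for illustration.

```python
import numpy as np

def dp_count(values, predicate, epsilon=1.0, rng=None):
    """Release a count under epsilon-differential privacy via the Laplace mechanism.

    A counting query changes by at most 1 when one person's record is added or
    removed (sensitivity = 1), so Laplace noise with scale 1/epsilon suffices.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical example: number of patients older than 65 in a linked RWD extract
ages = [54, 71, 68, 80, 49, 66, 73]
print(dp_count(ages, lambda a: a > 65, epsilon=0.5))
```

Smaller values of ε give stronger privacy but noisier releases; real deployments also track the cumulative privacy budget across repeated queries.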
Diversity, Equity, Algorithmic fairness, and Transparency (DEAT): DEAT is another important set of ethical issues to consider in an RWD project. RWD may contain information from various demographic groups, which can be used to generate RWE with improved generalizability compared to data collected in controlled settings. On the other hand, certain types of RWD may be heavily biased and unbalanced toward a certain group, not as diverse or inclusive, and in some cases may even exacerbate disparities (e.g., wearables and access to facilities and treatment may be limited to certain demographic groups). Greater effort will be needed to gain access to RWD from under-represented groups and to effectively take into account the heterogeneity in RWD while being mindful of limitations in diversity and equity. This topic also relates to algorithmic fairness, which aims at understanding and preventing bias in ML models and is an increasingly popular research topic in the literature [123–127]. Incorrect and misleading conclusions may be drawn if trained models systematically disadvantage a certain group (e.g., a trained algorithm might be less likely to detect cancer in Black patients than in white patients, or in men than in women). Transparency means that information and communication concerning the processing of personal data must be easily accessible and easy to understand. Transparency ensures that data contributors are aware of how their data are being used and for what purposes, and that decision-makers can evaluate the quality of the methods and the applicability of the generated RWE [128–131]. Being transparent when working with RWD is critical for building trust among the key stakeholders during the RWD life cycle (individuals who supply the data, those who collect and manage the data, data curators who design studies and analyze the data, and decision and policy makers).
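A simple way to start checking for the kind of systematic disadvantage described above is to compare a model's error rates across groups. The sketch below computes group-wise true-positive rates (the quantity underlying the "equal opportunity" fairness criterion) for hypothetical predictions and group labels; it illustrates the auditing idea only and is not drawn from the cited fairness literature.

```python
import numpy as np

def true_positive_rate_by_group(y_true, y_pred, group):
    """Per-group sensitivity: P(prediction = 1 | outcome = 1, group)."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)
        rates[g] = y_pred[mask].mean() if mask.any() else float("nan")
    return rates

# Hypothetical audit: does a diagnostic model miss more true cases in one group?
y_true = [1, 1, 0, 1, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 0, 0, 0]
group  = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(true_positive_rate_by_group(y_true, y_pred, group))
# A large gap between groups would warrant further investigation of the
# training data and the model before deployment.
```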
The above challenges are not isolated but rather connected, as depicted in Fig. 2. Data quality affects the performance of statistical and ML procedures; the data sources and the cleaning and pre-processing process relate to result reproducibility and replicability; how the data are analyzed and which statistical and ML procedures are used have an impact on reproducibility and replicability; and whether privacy-preserving procedures are used during data collection and analysis, and how information is shared and released, relate to data privacy, DEAT, and explainability and interpretability, which can in turn affect which ML procedures to apply and the development of new ML techniques.

Conclusions
RWD provide a valuable and rich data source beyond the confines of traditional epidemiological studies, clinical trials, and lab-based experiments, with lower cost in data collection compared to the latter.
If used and analyzed appropriately, RWD have the potential to generate valid and unbiased RWE, with savings in both cost and time compared to controlled trials, and to enhance the efficiency of medical and health-related research and decision-making. Procedures that improve the quality of the data and overcome the limitations of RWD to make the best of them have been, and will continue to be, developed. With the enthusiasm, commitment, and investment in RWD from all key stakeholders, we hope that the day when RWD unleash their full potential will come soon.

Abbreviations
AI: artificial intelligence
CVD: cardiovascular disease
COVID: coronavirus disease
DEAT: diversity, equity, algorithmic fairness, and transparency
EHR: electronic health records
ML: machine learning
NLP: natural language processing
PRO: patient-reported outcome
RWD: real-world data
RWE: real-world evidence

Acknowledgements
We thank the editor and two referees for reviewing the paper and providing suggestions.

Authors' contributions
FL and PD came up with the general idea for the article. FL did the literature review and wrote the manuscript. PD reviewed and revised the manuscript. Both authors have read and approved the manuscript.

Funding
Not applicable.

Availability of data and materials
Not applicable. This is a review article. No data or materials were generated or collected.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
Both authors are Senior Editorial Board Members for the journal BMC Medical Research Methodology.

Received: 8 April 2022  Accepted: 22 October 2022
Published: 5 November 2022

References
1. US Food and Drug Administration, et al. Real-World Evidence. 2022. https://www.fda.gov/science-research/science-and-research-special-topics/real-world-evidence. Accessed 1 Sep 2022.
2. Wikipedia. Real world data. 2022. https://en.wikipedia.org/wiki/Real_world_data. Accessed 19 Mar 2022.
3. Powell AA, Power L, Westrop S, McOwat K, Campbell H, Simmons R, et al. Real-world data shows increased reactogenicity in adults after heterologous compared to homologous prime-boost COVID-19 vaccination, March-June 2021, England. Eurosurveillance. 2021;26(28):2100634.
4. Hunter PR, Brainard JS. Estimating the effectiveness of the Pfizer COVID-19 BNT162b2 vaccine after a single dose. A reanalysis of a study of 'real-world' vaccination outcomes from Israel. medRxiv. 2021.02.01.21250957. https://doi.org/10.1101/2021.02.01.21250957.
5. Henry DA, Jones MA, Stehlik P, Glasziou PP. Effectiveness of COVID-19 vaccines: findings from real world studies. Med J Aust. 2021;215(4):149.
6. Firth JA, Hellewell J, Klepac P, Kissler S, Kucharski AJ, Spurgin LG. Using a real-world network to model localized COVID-19 control strategies. Nat Med. 2020;26(10):1616–22.
7. Shapiro A, Marinsek N, Clay I, Bradshaw B, Ramirez E, Min J, et al. Characterizing COVID-19 and influenza illnesses in the real world via person-generated health data. Patterns. 2021;2(1):100188.
8. Ahrens KF, Neumann RJ, Kollmann B, Plichta MM, Lieb K, Tüscher O, et al. Differential impact of COVID-related lockdown on mental health in Germany. World Psychiatr. 2021;20(1):140.
9. Hernández MA, Stolfo SJ. Real-world data is dirty: Data cleansing and the merge/purge problem. Data Min Knowl Disc. 1998;2(1):9–37.
10. Corrigan-Curay J, Sacks L, Woodcock J. Real-world evidence and real-world data for evaluating drug safety and effectiveness. JAMA. 2018;320(9):867–8.
11. Makady A, de Boer A, Hillege H, Klungel O, Goettsch W, et al. What is real-world data? A review of definitions based on literature and stakeholder interviews. Value Health. 2017;20(7):858–65.
12. Franklin JM, Schneeweiss S. When and how can real world data analyses substitute for randomized controlled trials? Clin Pharmacol Ther. 2017;102(6):924–33.
13. Miksad RA, Abernethy AP. Harnessing the power of real-world evidence (RWE): a checklist to ensure regulatory-grade data quality. Clin Pharmacol Ther. 2018;103(2):202–5.
14. Curtis MD, Griffith SD, Tucker M, Taylor MD, Capra WB, Carrigan G, et al. Development and validation of a high-quality composite real-world mortality endpoint. Health Serv Res. 2018;53(6):4460–76.
15. Booth CM, Karim S, Mackillop WJ. Real-world data: towards achieving the achievable in cancer care. Nat Rev Clin Oncol. 2019;16(5):312–25.
16. Swift B, Jain L, White C, Chandrasekaran V, Bhandari A, Hughes DA, et al. Innovation at the intersection of clinical trials and real-world data science to advance patient care. Clin Transl Sci. 2018;11(5):450–60.
17. Sun W, Cai Z, Li Y, Liu F, Fang S, Wang G. Data processing and text mining technologies on electronic medical records: a review. J Healthc Eng. 2018;2018:4302425. https://doi.org/10.1155/2018/4302425.
18. Wu J, Roy J, Stewart WF. Prediction modeling using EHR data: challenges, strategies, and a comparison of machine learning approaches. Med Care. 2010;48(6 Suppl):S106-13. https://doi.org/10.1097/MLR.0b013e3181de9e17, https://pubmed.ncbi.nlm.nih.gov/20473190/.
19. Botsis T, Hartvigsen G, Chen F, Weng C. Secondary use of EHR: data quality issues and informatics opportunities. Summit Transl Bioinforma. 2010;2010:1.
20. Kawaler E, Cobian A, Peissig P, Cross D, Yale S, Craven M. Learning to predict post-hospitalization VTE risk from EHR data. In: AMIA annual symposium proceedings. vol. 2012. p. 436. American Medical Informatics Association.
21. Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques for electronic health record (EHR) analysis. IEEE J Biomed Health Inform. 2017;22(5):1589–604.
22. Poirier C, Hswen Y, Bouzillé G, Cuggia M, Lavenu A, Brownstein JS, et al. Influenza forecasting for French regions combining EHR, web and climatic data sources with a machine learning ensemble approach. PLoS ONE. 2021;16(5):e0250890.
23. Zheng T, Xie W, Xu L, He X, Zhang Y, You M, et al. A machine learning-based framework to identify type 2 diabetes through electronic health records. Int J Med Inform. 2017;97:120–7.
24. Pivovarov R, Perotte AJ, Grave E, Angiolillo J, Wiggins CH, Elhadad N. Learning probabilistic phenotypes from heterogeneous EHR data. J Biomed Inform. 2015;58:156–65.
25. Zhao D, Weng C. Combining PubMed knowledge and EHR data to develop a weighted Bayesian network for pancreatic cancer prediction. J Biomed Inform. 2011;44(5):859–68.
26. Veturi Y, Lucas A, Bradford Y, Hui D, Dudek S, Theusch E, et al. A unified framework identifies new links between plasma lipids and diseases from electronic medical records across large-scale cohorts. Nat Genet. 2021;53(7):972–81.
27. Kwon BC, Choi MJ, Kim JT, Choi E, Kim YB, Kwon S, et al. RetainVis: Visual analytics with interpretable and interactive recurrent neural networks on electronic medical records. IEEE Trans Vis Comput Graph. 2018;25(1):299–309.
28. Mahmoudi E, Kamdar N, Kim N, Gonzales G, Singh K, Waljee AK. Use of electronic medical records in development and validation of risk prediction models of hospital readmission: systematic review. BMJ. 2020;369:m958.
29. Desai RJ, Wang SV, Vaduganathan M, Evers T, Schneeweiss S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw Open. 2020;3(1):e1918962.
30. Huang L, Shea AL, Qian H, Masurkar A, Deng H, Liu D. Patient clustering improves efficiency of federated machine learning to predict mortality and hospital stay time using distributed electronic medical records. J Biomed Inform. 2019;99:103291.
31. Bartlett VL, Dhruva SS, Shah ND, Ryan P, Ross JS. Feasibility of using real-world data to replicate clinical trial evidence. JAMA Netw Open. 2019;2(10):e1912869.
32. Dreyer NA, Garner S. Registries for robust evidence. JAMA. 2009;302(7):790–1.
33. Larsson S, Lawyer P, Garellick G, Lindahl B, Lundström M. Use of 13 disease registries in 5 countries demonstrates the potential to use outcome data to improve health care's value. Health Affairs. 2012;31(1):220–7.
34. McGettigan P, Alonso Olmo C, Plueschke K, Castillon M, Nogueras Zondag D, Bahri P, et al. Patient registries: an underused resource for medicines evaluation. Drug Saf. 2019;42(11):1343–51.
35. Izmirly PM, Parton H, Wang L, McCune WJ, Lim SS, Drenkard C, et al. Prevalence of systemic lupus erythematosus in the United States: estimates from a meta-analysis of the Centers for Disease Control and Prevention National Lupus Registries. Arthritis Rheumatol. 2021;73(6):991–6.
36. Jansen-Van Der Weide MC, Gaasterland CM, Roes KC, Pontes C, Vives R, Sancho A, et al. Rare disease registries: potential applications towards impact on development of new drug treatments. Orphanet J Rare Dis. 2018;13(1):1–11.
37. Lacaze P, Millis N, Fookes M, Zurynski Y, Jaffe A, Bellgard M, et al. Rare disease registries: a call to action. Intern Med J. 2017;47(9):1075–9.
38. Gliklich RE, Dreyer NA, Leavy MB, editors. Registries for Evaluating Patient Outcomes: A User's Guide. 3rd ed. Rockville (MD): Agency for Healthcare Research and Quality (US); 2014 Apr. Report No.: 13(14)-EHC111. PMID: 24945055.
39. Svarstad BL, Shireman TI, Sweeney J. Using drug claims data to assess the relationship of medication adherence with hospitalization and costs. Psychiatr Serv. 2001;52(6):805–11.
40. Izurieta HS, Wu X, Lu Y, Chillarige Y, Wernecke M, Lindaas A, et al. Zostavax vaccine effectiveness among US elderly using real-world evidence: Addressing unmeasured confounders by using multiple imputation after linking beneficiary surveys with Medicare claims. Pharmacoepidemiol Drug Saf. 2019;28(7):993–1001.
41. Allen AM, Van Houten HK, Sangaralingham LR, Talwalkar JA, McCoy RG. Healthcare cost and utilization in nonalcoholic fatty liver disease: real-world data from a large US claims database. Hepatology. 2018;68(6):2230–8.
42. Sruamsiri R, Iwasaki K, Tang W, Mahlich J. Persistence rates and medical costs of biological therapies for psoriasis treatment in Japan: a real-world data study using a claims database. BMC Dermatol. 2018;18(1):1–11.
43. Quock TP, Yan T, Chang E, Guthrie S, Broder MS. Epidemiology of AL amyloidosis: a real-world study using US claims data. Blood Adv. 2018;2(10):1046–53.
44. Herland M, Bauder RA, Khoshgoftaar TM. Medical provider specialty predictions for the detection of anomalous Medicare insurance claims. In: 2017 IEEE international conference on information reuse and integration (IRI). New York City: IEEE; 2017. p. 579–88.
45. Momo K, Kobayashi H, Sugiura Y, Yasu T, Koinuma M, Kuroda SI. Prevalence of drug–drug interaction in atrial fibrillation patients based on a large claims data. PLoS ONE. 2019;14(12):e0225297.
46. Ghiani M, Maywald U, Wilke T, Heeg B. RW1 Bridging the gap between clinical trials and real world data: evidence on replicability of efficacy results using German claims data. Value Health. 2020;23:S757–8.
47. Silverman E, Skinner J. Medicare upcoding and hospital ownership. J Health Econ. 2004;23(2):369–89.
48. Kirlidog M, Asuk C. A fraud detection approach with data mining in health insurance. Procedia-Soc Behav Sci. 2012;62:989–94.
49. Li J, Huang KY, Jin J, Shi J. A survey on statistical methods for health care fraud detection. Health Care Manag Sci. 2008;11(3):275–87.
50. Viaene S, Dedene G, Derrig RA. Auto claim fraud detection using Bayesian learning neural networks. Expert Syst Appl. 2005;29(3):653–66.
51. Phua C, Lee V, Smith K, Gayler R. A comprehensive survey of data mining-based fraud detection research. arXiv preprint arXiv:1009.6119. 2010.
52. Roche N, Small M, Broomfield S, Higgins V, Pollard R. Real world COPD: association of morning symptoms with clinical and patient reported outcomes. COPD J Chronic Obstructive Pulm Dis. 2013;10(6):679–86.
53. Small M, Anderson P, Vickers A, Kay S, Fermer S. Importance of inhaler-device satisfaction in asthma treatment: real-world observations of physician-observed compliance and clinical/patient-reported outcomes. Adv Ther. 2011;28(3):202–12.
54. Pinsker JE, Müller L, Constantin A, Leas S, Manning M, McElwee Malloy M, et al. Real-world patient-reported outcomes and glycemic results with initiation of control-IQ technology. Diabetes Technol Ther. 2021;23(2):120–7.
55. Touma Z, Hoskin B, Atkinson C, Bell D, Massey O, Lofland JH, Berry P, Karyekar CS, Costenbader KH. Systemic lupus erythematosus symptom clusters and their association with patient-reported outcomes and treatment: analysis of real-world data. Arthritis Care Res. 2022;74(7):1079–88.
56. Martinez GJ, Mattingly SM, Mirjafari S, Nepal SK, Campbell AT, Dey AK, et al. On the quality of real-world wearable data in a longitudinal study of information workers. In: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). New York City: IEEE; 2020. p. 1–6.
57. Christensen JH, Saunders GH, Porsbo M, Pontoppidan NH. The everyday acoustic environment and its association with human heart rate: evidence from real-world data logging with hearing aids and wearables. Royal Soc Open Sci. 2021;8(2):201345.
58. Johnson KT, Picard RW. Advancing neuroscience through wearable devices. Neuron. 2020;108(1):8–12.
59. Pickham D, Berte N, Pihulic M, Valdez A, Mayer B, Desai M. Effect of a wearable patient sensor on care delivery for preventing pressure injuries in acutely ill adults: A pragmatic randomized clinical trial (LS-HAPI study). Int J Nurs Stud. 2018;80:12–9.
60. Adams JL, Dinesh K, Snyder CW, Xiong M, Tarolli CG, Sharma S, et al. A real-world study of wearable sensors in Parkinson's disease. NPJ Park Dis. 2021;7(1):1–8.
61. Hernán MA, Robins JM, et al. Per-protocol analyses of pragmatic trials. N Engl J Med. 2017;377(14):1391–8.
62. Murray EJ, Swanson SA, Hernán MA. Guidelines for estimating causal effects in pragmatic randomized trials. arXiv preprint arXiv:1911.06030. 2019.
63. Hernandez AF, Fleurence RL, Rothman RL. The ADAPTABLE Trial and PCORnet: shining light on a new research paradigm. Ann Intern Med. 2015;163(8):635–6.
64. Baigent C. Pragmatic trials-need for ADAPTABLE design. N Engl J Med. 2021;384(21).
65. Hernán MA, Robins JM. Using big data to emulate a target trial when a randomized trial is not available. Am J Epidemiol. 2016;183(8):758–64.
66. Ioannou GN, Locke ER, O'Hare AM, Bohnert AS, Boyko EJ, Hynes DM, et al. COVID-19 vaccination effectiveness against infection or death in a national US health care system: a target trial emulation study. Ann Intern Med. 2022;175(3):352–61.
67. García-Albéniz X, Hsu J, Hernán MA. The value of explicitly emulating a target trial when using real world evidence: an application to colorectal cancer screening. Eur J Epidemiol. 2017;32(6):495–500.
68. Takeuchi Y, Kumamaru H, Hagiwara Y, Matsui H, Yasunaga H, Miyata H, et al. Sodium-glucose cotransporter-2 inhibitors and the risk of urinary tract infection among diabetic patients in Japan: Target trial emulation using a nationwide administrative claims database. Diabetes Obes Metab. 2021;23(6):1379–88.
69. Jen EY, Xu Q, Schetter A, Przepiorka D, Shen YL, Roscoe D, et al. FDA approval: blinatumomab for patients with B-cell precursor acute lymphoblastic leukemia in morphologic remission with minimal residual disease. Clin Cancer Res. 2019;25(2):473–7.
70. Gross AM. Using real world data to support regulatory approval of drugs in rare diseases: A review of opportunities, limitations & a case example. Curr Probl Cancer. 2021;45(4):100769.
71. Wu J, Wang C, Toh S, Pisa FE, Bauer L. Use of real-world evidence in regulatory decisions for rare diseases in the United States—Current status and future directions. Pharmacoepidemiol Drug Saf. 2020;29(10):1213–8.
72. Hayeems RZ, Michaels-Igbokwe C, Venkataramanan V, Hartley T, Acker M, Gillespie M, et al. The complexity of diagnosing rare disease: An organizing framework for outcomes research and health economics based on real-world evidence. Genet Med. 2022;24(3):694–702.
73. Hernán MA, Robins JM. Causal inference. Boca Raton: CRC; 2010.
74. Ho M, van der Laan M, Lee H, Chen J, Lee K, Fang Y, et al. The current landscape in biostatistics of real-world data and evidence: Causal inference frameworks for study design and analysis. Stat Biopharm Res. 2021. https://www.tandfonline.com/doi/abs/10.1080/19466315.2021.1883475.
75. Crown WH. Real-world evidence, causal inference, and machine learning. Value Health. 2019;22(5):587–92.
76. Cui P, Shen Z, Li S, Yao L, Li Y, Chu Z, et al. Causal inference meets machine learning. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York: Association for Computing Machinery; 2020. p. 3527–3528.
77. Xiong HY, Alipanahi B, Lee LJ, Bretschneider H, Merico D, Yuen RK, Hua Y, Gueroussov S, Najafabadi HS, Hughes TR, Morris Q. The human splicing code reveals new insights into the genetic determinants of disease. Science. 2015;347(6218):1254806.
78. Quang D, Chen Y, Xie X. DANN: a deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics. 2015;31(5):761–3.
79. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imaging. 2016;35(5):1207–16.
80. Van Grinsven MJ, van Ginneken B, Hoyng CB, Theelen T, Sánchez CI. Fast convolutional neural network training using selective data sampling: Application to hemorrhage detection in color fundus images. IEEE Trans Med Imaging. 2016;35(5):1273–84.
81. Kleesiek J, Urban G, Hubert A, Schwarz D, Maier-Hein K, Bendszus M, et al. Deep MRI brain extraction: A 3D convolutional neural network for skull stripping. NeuroImage. 2016;129:460–9.
82. Gibson E, Li W, Sudre C, Fidon L, Shakir DI, Wang G, et al. NiftyNet: a deep-learning platform for medical imaging. Comput Methods Prog Biomed. 2018;158:113–22.
83. Coccia M. Deep learning technology for improving cancer care in society: New directions in cancer imaging driven by artificial intelligence. Technol Soc. 2020;60:101198.
84. Bien N, Rajpurkar P, Ball RL, Irvin J, Park A, Jones E, et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: development and retrospective validation of MRNet. PLoS Med. 2018;15(11):e1002699.
85. Johansson FD, Collins JE, Yau V, Guan H, Kim SC, Losina E, et al. Predicting response to tocilizumab monotherapy in rheumatoid arthritis: a real-world data analysis using machine learning. J Rheumatol. 2021;48(9):1364–70.
86. Ravì D, Wong C, Deligianni F, Berthelot M, Andreu-Perez J, Lo B, et al. Deep learning for health informatics. IEEE J Biomed Health Inform. 2016;21(1):4–21.
87. Suzuki K. Overview of deep learning in medical imaging. Radiol Phys Technol. 2017;10(3):257–73.
88. Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng. 2017;19:221–48.
89. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88.
90. Lee JG, Jun S, Cho YW, Lee H, Kim GB, Seo JB, et al. Deep learning in medical imaging: general overview. Korean J Radiol. 2017;18(4):570–84.
91. Amyar A, Modzelewski R, Li H, Ruan S. Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: Classification and segmentation. Comput Biol Med. 2020;126:104037.
92. Oh Y, Park S, Ye JC. Deep learning COVID-19 features on CXR using limited training data sets. IEEE Trans Med Imaging. 2020;39(8):2688–700.
93. Hemdan EED, Shouman MA, Karar ME. COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv preprint arXiv:2003.11055. 2020.
94. Wang S, Zha Y, Li W, Wu Q, Li X, Niu M, et al. A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis. Eur Respir J. 2020;56(2).
95. Ardakani AA, Kanafi AR, Acharya UR, Khadem N, Mohammadi A. Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks. Comput Biol Med. 2020;121:103795.
96. US Food and Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) - Discussion Paper and Request for Feedback. 2019. https://www.fda.gov/files/medical%20devices/published/US-FDA-Artificial-Intelligence-and-Machine-Learning-Discussion-Paper.pdf. Accessed 24 Mar 2022.
97. US Food and Drug Administration. Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. 2021. https://www.fda.gov/media/145022/download. Accessed 24 Mar 2022.
98. International Coalition of Medicines Regulatory Authorities. Informal Innovation Network Horizon Scanning Assessment Report - Artificial Intelligence. 2021. https://www.icmra.info/drupal/sites/default/files/2021-08/horizon_scanning_report_artificial_intelligence.pdf. Accessed 24 Mar 2022.
99. European Medicines Agency. Artificial intelligence in medicine regulation. 2021. https://www.ema.europa.eu/en/news/artificial-intelligence-medicine-regulation. Accessed 24 Mar 2022.
100. Van der Laan MJ, Rose S. Targeted learning: causal inference for observational and experimental data. New York: Springer-Verlag; 2011.
101. Van der Laan MJ, Rose S. Targeted learning in data science: Causal inference for complex longitudinal studies. Cham: Springer; 2018.
102. van der Laan MJ, Luedtke AR. Targeted learning of the mean outcome under an optimal dynamic treatment rule. J Causal Infer. 2015;3(1):61–95.
103. Sofrygin O, Zhu Z, Schmittdiel JA, Adams AS, Grant RW, van der Laan MJ, et al. Targeted learning with daily EHR data. Stat Med. 2019;38(16):3073–90.
104. Chakravarti P, Wilson A, Krikov S, Shao N, van der Laan M. PIN68 Estimating effects in observational real-world data, from target trials to targeted learning: example of treating COVID-hospitalized patients. Value Health. 2021;24:S118.
105. Eichler HG, Koenig F, Arlett P, Enzmann H, Humphreys A, Pétavy F, et al. Are novel, nonrandomized analytic methods fit for decision making? The need for prospective, controlled, and transparent validation. Clin Pharmacol Ther. 2020;107(4):773–9.
106. Chakraborty S, Tomsett R, Raghavendra R, Harborne D, Alzantot M, Cerutti F, et al. Interpretability of deep learning models: A survey of results. In: 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation. New York City: IEEE; 2017. p. 1–6.
107. Zhang Q, Zhu SC. Visual interpretability for deep learning: a survey. arXiv preprint arXiv:1802.00614. 2018.
108. Hohman F, Park H, Robinson C, Chau DHP. Summit: Scaling deep learning interpretability by visualizing activation and attribution summarizations. IEEE Trans Vis Comput Graph. 2019;26(1):1096–106.
109. Ghoshal B, Tucker A. Estimating uncertainty and interpretability in deep learning for coronavirus (COVID-19) detection. arXiv preprint arXiv:2003.10769. 2020.
110. Raghu M, Gilmer J, Yosinski J, Sohl-Dickstein J. SVCCA: Singular vector canonical correlation analysis for deep learning dynamics and interpretability. In: 31st Conference on Neural Information Processing Systems (NIPS 2017). Long Beach; 2017. ISBN: 9781510860964.
111. Cruz-Roa AA, Ovalle JEA, Madabhushi A, Osorio FAG. A deep learning architecture for image representation, visual interpretability and automated basal-cell carcinoma cancer detection. In: International