Applications of Machine Learning in Routine Laboratory Medicine Current State and Future Directions 2022
Author manuscript
Clin Biochem. Author manuscript; available in PMC 2023 May 01.
Abstract
Machine learning is able to leverage large amounts of data to infer complex patterns that are
otherwise beyond the capabilities of rule-based systems and human experts. Its application to
laboratory medicine is particularly exciting, as laboratory testing provides much of the foundation
for clinical decision making. In this article, we provide a brief introduction to machine learning
for the medical professional in addition to a comprehensive literature review outlining the current
state of machine learning as it has been applied to routine laboratory medicine. Although still in
its early stages, machine learning has been used to automate laboratory tasks, optimize utilization,
and provide personalized reference ranges and test interpretation. The published literature leads
us to believe that machine learning will be an area of increasing importance for the laboratory
practitioner. We envision the laboratory of the future will utilize these methods to make significant
improvements in efficiency and diagnostic precision.
Keywords
Artificial Intelligence; Clinical Pathology; Biochemistry; Precision Medicine; Clinical Decision
Support
1. Introduction
The application of machine learning in medicine has garnered enormous attention over the
past decade [1–3]. Novel computational methods provide a way to learn from past examples
in order to infer complex patterns beyond the capabilities of rule-based algorithms. Along
with this attention comes expectations and promises that advances in computation will
transform the way that medicine is practiced. In fact, there are already several examples
of machine learning methods that have been approved for use by the US Food and
Drug Administration (FDA), most recognizably in the fields of radiology, cardiology, and
pathology [4,5].
The use of machine learning in laboratory medicine has also gained traction and is an
increasingly important area of which practitioners should stay abreast [6–8]. The numerical
and structured format of data in laboratory medicine lends itself well to computational
methods such as machine learning. Such advances harbor promise for the future of
medicine, where laboratory testing provides much of the basis for clinical decision making.
In this review we provide a practical introduction to machine learning for the laboratory
medicine specialist and a survey of ongoing work using machine learning in routine
laboratory testing and laboratory information systems. While there has been extensive work
in the use of machine learning in the greater field of clinical pathology, this review will
focus on its application in routine laboratory testing including clinical chemistries and
common laboratory tests such as blood counts and urinalysis [9,10]. Similarly excluded
are machine learning algorithms that rely on laboratory data to make clinical predictions
[11–14]. Although this is another growing interest in the medical application of machine
learning, we believe such algorithms pertain more to the clinical specialty related to the
model’s use case rather than the practice of laboratory medicine. The use of machine
learning in these related fields is briefly covered in section 4.3 of the text, to serve as a
reference for readers who may be interested in further exploring these areas.
In order to train a model, large amounts of structured data are required. Processing this
data involves cleaning and organizing data tables, imputing missing values, and reshaping or
combining observations so that they can be summarized and fed into a model. Furthermore,
these prior observations must be labeled so that the computer can learn from them (Figure
1). The majority of the work of developing a machine learning model is typically spent in
the data preparation phase. Important decisions must be made about which data to include
as model inputs and how they should be processed. Furthermore, the creation of accurate
labels also requires careful consideration and time. In many cases, labels are created through
manual enumeration by an expert; for example, a physician who reviews prior cases and
assigns diagnoses.
To put this all in context, consider a prior study that attempted to predict serum ferritin
levels based on other iron panel components [15]. In this example, the ferritin value was the
label. The other laboratory components were predictors. In cases where necessary laboratory
results were missing, the value was imputed using a statistical formula.
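The roles of label, predictors, and imputation in this example can be sketched in a few lines. The records, field names, and the mean-imputation rule below are illustrative assumptions only; the cited study [15] used its own imputation approach, which is not reproduced here.

```python
from statistics import mean

# Hypothetical iron-panel records; None marks a missing result.
records = [
    {"iron": 60.0, "tibc": 350.0, "transferrin_sat": 17.0, "ferritin": 25.0},
    {"iron": None, "tibc": 410.0, "transferrin_sat": 9.0, "ferritin": 8.0},
    {"iron": 110.0, "tibc": None, "transferrin_sat": 35.0, "ferritin": 140.0},
]

PREDICTORS = ["iron", "tibc", "transferrin_sat"]
LABEL = "ferritin"


def impute_with_mean(rows, columns):
    """Replace missing predictor values with the column mean (a simple stand-in)."""
    means = {
        col: mean(r[col] for r in rows if r[col] is not None) for col in columns
    }
    return [
        {**r, **{col: means[col] if r[col] is None else r[col] for col in columns}}
        for r in rows
    ]


clean = impute_with_mean(records, PREDICTORS)
X = [[r[col] for col in PREDICTORS] for r in clean]  # data matrix of predictors
y = [r[LABEL] for r in clean]                        # labels the model learns from
```

Here `X` plays the part of the data matrix in Figure 1 and `y` holds the label for each observation.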
The retrospective data that is used to develop the machine learning model is broken up into
a training set of data and a testing set of data. The algorithm learns from the training set,
and then its performance is evaluated based on how well it runs on the testing data. This
is similar to how a student might study for a test based on published practice questions,
but a set of new questions is reserved for the actual evaluation—critical in preventing the
computer from simply memorizing the “practice questions” in the training data observations.
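A minimal sketch of such a split follows. The function name, the 25% hold-out fraction, and the fixed seed are assumptions chosen for illustration, not values taken from any reviewed study.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly, then hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


observations = list(range(20))  # stand-ins for 20 labeled samples
train, test = train_test_split(observations)
```

The important property is that no observation appears in both sets, so test performance reflects questions the model has not seen.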
There is a broad suite of supervised machine learning models suitable for a variety of tasks
in medicine, including linear and logistic regression, support vector machines, and tree-
based models such as random forest and XGBoost. Tree-based algorithms were commonly
encountered in this literature review and in general have achieved good performance in
medical applications [16]. Such models use a decision tree that consists of a complex series
of decision points. The decision points are inferred during the model development based on
training data used to develop the model (Figure 2).
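To illustrate how a single decision point can be inferred from training data, the sketch below fits a one-split tree (a "stump") on one feature. It is a deliberately simplified, hypothetical example; random forest and XGBoost combine many deeper trees and use more refined splitting criteria.

```python
def fit_stump(values, labels):
    """Learn one decision point: the threshold that misclassifies the fewest
    training samples when predicting 1 for values above it."""
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighboring values
        errors = sum((v > threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical training data: one feature and a binary label.
feature = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
label = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(feature, label)


def predict(value):
    """Apply the learned decision point to a new observation."""
    return int(value > threshold)
```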
In contrast, unsupervised machine learning is when models are provided with an unlabeled
dataset. The model is left to describe relationships in the data according to patterns or
trends that it observes. Unsupervised machine learning can be used to discover previously
unknown patterns [17,18]. Examples of unsupervised machine learning models are k-means
clustering and principal component analysis.
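As a minimal sketch of k-means clustering, the example below groups unlabeled one-dimensional values around two centers. The data and starting centers are hypothetical, and practical implementations handle multiple dimensions, initialization, and convergence checks more carefully.

```python
def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled measurements that happen to form two groups.
values = [4.8, 5.1, 5.3, 9.7, 10.0, 10.4]
centers, clusters = kmeans_1d(values, centers=[0.0, 20.0])
```

No labels are supplied; the grouping emerges only from the distances between observations.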
emergent property of these methods allows for the performance of complex tasks, such as
image recognition and language interpretation, explaining its rising popularity in healthcare
[19,20].
The query was executed with a date range filter from 1 October 2011 to 30 September 2021.
Only English-language articles were included. This search returned 583 articles within the
10-year search period. As evidenced by the number of articles returned by our literature
search query over the past decade, this topic has received increasing attention over recent
years (Figure 3). Through manual title and abstract review, 544 of the original 583 articles
were excluded. Of those excluded articles, 108 did not primarily pertain to the field of
laboratory medicine or clinical pathology. Other major excluded themes included laboratory
imaging such as microscopy and cytology (166 articles), clinical prediction algorithms (90
articles), molecular medicine (47 articles), and microbiology (21 articles). The remaining 39
articles underwent full manuscript review, after which a total of 18 were included (Figure 4).
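The screening arithmetic reported above can be checked directly. The variable names are ours; the counts are those stated in the text. Note that the five named exclusion themes do not account for every excluded article.

```python
# Counts reported in the text.
returned = 583
excluded_at_screening = 544
named_exclusions = {
    "not laboratory medicine or clinical pathology": 108,
    "laboratory imaging (microscopy, cytology)": 166,
    "clinical prediction algorithms": 90,
    "molecular medicine": 47,
    "microbiology": 21,
}
included = 18

full_text_reviewed = returned - excluded_at_screening    # articles read in full
named_total = sum(named_exclusions.values())             # exclusions with a named theme
other_exclusions = excluded_at_screening - named_total   # exclusions not itemized in the text
excluded_at_full_text = full_text_reviewed - included    # dropped after full review
```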
One of the first examples of this type of work is the study by Azarkhish et al. [21] in which
a neural network model predicted iron deficiency anemia and serum iron levels based on
features from a routine complete blood count. The model achieved an impressive AUROC
of 98% for the binary classification of iron-deficiency anemia. It predicted the actual serum
iron level with less accuracy, achieving a root-mean-squared error of 0.136 mcg/dL and
R² of 0.93. It is important to note, however, that this study was limited by the relatively small
number of participants, with 149 subjects in the training group and 54 subjects in the testing
group.
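The three metrics quoted in this review (AUROC for classification; root-mean-squared error and R² for regression) can be computed as in the sketch below. The data are hypothetical and the function names are ours; the AUROC is computed by its pairwise-ranking definition rather than by any particular library routine.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-squared error, in the same units as the measurement."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Share of variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def auroc(labels, scores):
    """Probability that a random positive scores higher than a random negative
    (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical values for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
labels = [0, 0, 1, 1]
scores = [0.1, 0.6, 0.4, 0.9]
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, while an R² of 0.93 means the predictions account for 93% of the variance in the measured values.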
This work continued with Luo et al. [15], who conceived a clinical decision support tool
capable of predicting laboratory test results from related laboratory results and other clinical
information. As a proof of concept, they demonstrated a machine learning algorithm that
was capable of predicting whether serum ferritin level was abnormal with considerable
accuracy—achieving an AUROC of 97% using a random forest imputation method to fill
in required missing laboratory features that were then fed into a logistic regression model.
Meanwhile, Lidbury et al. [22] also studied the redundancy of laboratory test panels, with
a focus on liver function tests. They were able to predict whether γ-glutamyl transferase
(GGT) was normal or abnormal using other components of the liver function panel,
achieving an accuracy of 90% with a tree-based machine learning model. They concluded
that GGT offered little additional value beyond the other components of a typical liver
function panel.
Along similar lines of test result prediction and lab utilization, Xu et al. [23] studied
a machine learning model to predict laboratory test results as normal or abnormal in
order to identify low-yield, repetitive laboratory tests. Their group performed a multi-site
study of nearly 200,000 inpatient laboratory testing orders to identify the most repetitive
laboratory tests, and then attempted to predict each one. They were able to achieve an
AUROC of >90% for 20 common laboratory tests including sodium, hemoglobin, and
lactate dehydrogenase. They proposed a sensitive decision threshold corresponding to a negative
predictive value of 95% to power a clinical decision support tool aimed at reducing low-
yield, repetitive testing.
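One way to choose a decision threshold from a target negative predictive value (NPV) is sketched below. This is a generic illustration on hypothetical scores, not the procedure used by Xu et al.; the function names and the rule of taking the largest qualifying threshold are assumptions.

```python
def npv_at(threshold, labels, scores):
    """NPV when results with score < threshold are called normal (negative)."""
    called_negative = [y for y, s in zip(labels, scores) if s < threshold]
    if not called_negative:
        return None  # nothing was called negative at this threshold
    return called_negative.count(0) / len(called_negative)


def highest_threshold_meeting_npv(labels, scores, target):
    """Largest candidate threshold whose NPV still meets the target."""
    meeting = []
    for t in sorted(set(scores)):
        npv = npv_at(t, labels, scores)
        if npv is not None and npv >= target:
            meeting.append(t)
    return max(meeting) if meeting else None


# Hypothetical predicted probabilities of an abnormal result (label 1 = abnormal).
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.70, 0.90]
threshold = highest_threshold_meeting_npv(labels, scores, target=0.95)
```

Raising the threshold labels more tests as predictably normal, and therefore potentially avoidable, but only as long as the share of truly normal results among them stays at or above the target.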
In the same realm of clinical decision support and laboratory utilization, Islam et al.
[24] developed a deep learning model capable of recommending which
laboratory tests a provider should order. Rather than predicting specific test results, their
model predicted what tests should be ordered in the first place. Using features such as
clinical diagnoses, medications, prior laboratory tests, and demographic information, their
neural network model was able to achieve moderate performance with a macro-averaged AUROC of
0.76 and a micro-averaged AUROC of 0.87. One important limitation of this study, however, is that the
algorithm learned from prior testing patterns, but no expert determination was made about
whether these prior ordering behaviors were optimal in the first place. Thus a model like this
is prone to learning undesirable practices from historic testing patterns.
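The difference between the macro and micro AUROC values reported for this multi-label task can be illustrated with a small sketch. The test names and scores are hypothetical, and the averaging shown follows the common definitions (per-label mean versus pooled pairs), which we assume match those used in the study.

```python
def auroc(labels, scores):
    """Pairwise-ranking AUROC; ties between a positive and a negative count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical multi-label data: for each candidate test, whether it was ordered
# (1/0) and the model's score, across the same four encounters.
ordered = {
    "test_a": [1, 0, 1, 0],
    "test_b": [0, 0, 0, 1],
}
scored = {
    "test_a": [0.9, 0.2, 0.6, 0.4],
    "test_b": [0.5, 0.1, 0.3, 0.2],
}

# Macro: compute AUROC per test, then take the unweighted mean.
per_test = {name: auroc(ordered[name], scored[name]) for name in ordered}
auroc_macro = sum(per_test.values()) / len(per_test)

# Micro: pool every (label, score) pair across tests, then compute one AUROC.
pooled_labels = [y for name in ordered for y in ordered[name]]
pooled_scores = [s for name in ordered for s in scored[name]]
auroc_micro = auroc(pooled_labels, pooled_scores)
```

Macro averaging weights every test equally, so poorly predicted rare tests pull it down, whereas micro averaging is dominated by frequently ordered tests.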
Lee et al. [25] from South Korea proposed a deep learning neural network model
to predict low-density lipoprotein cholesterol (LDL-C) from high-density lipoprotein
cholesterol (HDL-C), total cholesterol, and triglycerides, evaluated against a ground
truth of fractionated LDL-C measurement. They showed that their model achieved better
performance than the historical Friedewald equation [26] and Martin’s “novel method” [27],
with a root mean squared error of 8.1 mg/dL versus 10.8 mg/dL and 8.3 mg/dL respectively.
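For context, the Friedewald equation that serves as the baseline here estimates LDL-C from the same three inputs. The sketch below implements that equation in mg/dL; the function name and the decision to raise an error at high triglycerides are our own choices, and the neural network of Lee et al. is not reproduced.

```python
def friedewald_ldl_c(total_cholesterol, hdl_c, triglycerides):
    """Estimate LDL-C in mg/dL as TC - HDL-C - TG/5.

    The equation is conventionally not applied when triglycerides are
    400 mg/dL or higher, so this sketch refuses to estimate in that case.
    """
    if triglycerides >= 400:
        raise ValueError("Friedewald estimate is not used at TG >= 400 mg/dL")
    return total_cholesterol - hdl_c - triglycerides / 5


# Hypothetical lipid panel, all values in mg/dL.
ldl_estimate = friedewald_ldl_c(total_cholesterol=200, hdl_c=50, triglycerides=150)
```

Martin's method keeps the same structure but replaces the fixed divisor of 5 with an adjustable factor, which is why both serve as natural comparators for a learned model.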
Finally, Dunn et al. [28] completed an experimental study using a machine learning
regression model to predict common laboratory tests using data from wearable devices such
as accelerometers and electrodermal sensors. Unfortunately, this futuristic take on
laboratory test prediction was unable to achieve meaningful performance. For example, their
model using wearable data was able to explain only 21% of the variability in hematocrit
level, which was the laboratory test for which the model performed best.
was valid. Model results were compared to expert opinion of a group of biochemists. The
model was able to correctly classify critical values as valid with a sensitivity of 91% at
a specificity of 100%, meaning the model could drastically reduce the number of critical
results requiring manual validation while keeping the rate of incorrectly validated tests to a
minimum.
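Sensitivity and specificity, as defined in Table 1, follow directly from counts of correct and incorrect classifications. The sketch below uses hypothetical labels in which a positive result means the critical value is valid.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for binary labels where 1 is positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 1 = critical value is valid, 0 = invalid.
truth = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(truth, predicted)
```

In this setting a specificity of 100% means no invalid result was released as valid, and the sensitivity gives the share of valid results that no longer need manual review.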
the test result verification process. Their model was able to automatically verify laboratory
results with a sensitivity of 99.9% and specificity of 98%. On their retrospective data, this
would have led to an 80% reduction in laboratory reports requiring manual verification
compared to their current rule-based verification system.
Meanwhile, Cao et al. [31] used a tree-based machine learning model to reduce the volume
of samples flagged for manual review. Their model, which used features from a 10-point
dipstick and urine cytometry measurements, called for a manual review rate of 32%, which
corresponded to a sensitivity of 92% and specificity of 81.5% compared against expert-labeled
ground truth manual urine microscopy results.
As for quality assurance/quality control, Farrell et al. [32] showed that a machine
learning algorithm for identifying mislabeled lab samples was able to outperform manual
verification. Their best performing algorithm was a neural network that achieved an AUROC
of 98%. A limitation of this study, however, is that the authors did not compare their performance
against rule-based delta checks, which are the current gold standard. Meanwhile, a neural
network machine learning algorithm by Fang et al. [33] was able to classify if a blood
specimen was clotted or not with moderate accuracy (AUROC 91%). Their algorithm used
coagulation testing results from the sample and compared model outputs to a ground truth of
manual inspection for clotting by laboratory technicians.
Peng et al. [35] were able to achieve significant test performance improvements with a
tree-based (random forest) machine learning model capable of reducing false positives from
newborn screening, a common issue with the highly sensitive assay. Their model, which
used 39 metabolic analytes and clinical variables such as weight and gestational age, was
able to reduce false positives by 98% for ornithine transcarbamylase deficiency and 89% for
glutaric acidemia type 1, without sacrificing any test sensitivity. They published their tool
Finally, one of the most promising applications of machine learning in medicine is the
general development of “personalized” medical diagnosis and interpretation. To this effect,
Poole et al. [36] demonstrated that a series of statistical learning methods can be used
to create more personalized reference ranges by analyzing test result distributions against
clinical features such as diagnosis codes. In an earlier study from China, Yang et al. [37] also
demonstrated a neural network model capable of predicting reference ranges for erythrocyte
sedimentation rate (ESR) testing, which is known to vary based on geographic factors such
as altitude. Their algorithm uses a number of environmental variables and is able to predict
ESR reference ranges for laboratories across China, differing only up to 3% from established
reference ranges (which vary from 4 to 21 mm/hr).
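As background for these studies, a reference range is conventionally taken as the central 95% of results in a reference population, and "personalizing" it amounts to computing that interval within a relevant subgroup. The sketch below shows only this conventional percentile calculation on hypothetical data; it is not the method of Poole et al. or Yang et al.

```python
from statistics import quantiles


def reference_interval(values):
    """Central 95% interval: the 2.5th and 97.5th percentiles of the values."""
    cuts = quantiles(values, n=40)  # 39 cut points spaced 2.5 percentiles apart
    return cuts[0], cuts[-1]


# Hypothetical results for two subgroups (arbitrary units).
group_a = [float(v) for v in range(100, 200)]  # 100 ... 199
group_b = [float(v) for v in range(120, 220)]  # 120 ... 219

interval_a = reference_interval(group_a)
interval_b = reference_interval(group_b)
```

Because the two subgroups are shifted relative to each other, a single pooled interval would flag results that are unremarkable within their own subgroup.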
In one such article, Fillmore et al. [39] studied a group of models for mapping 7 common
laboratory concepts to the United States Department of Veterans Affairs (US VA) medical
records system, where LOINC mappings are imperfect. The best performance was achieved
by a tree-based (random forest) model with an accuracy of 98%, presenting a significant
improvement over what was an otherwise tedious task of manually reviewing hundreds of
possible conceptual links.
Similarly, Parr et al. [40] developed a machine learning model to assign missing LOINC
codes and improve the accuracy of existing codes in the US VA medical records data
warehouse. Their tree-based machine learning algorithm was able to correctly identify the
LOINC code with a rate of 85% in unlabeled laboratory tests and correctly identify the
LOINC code in 96% of randomly selected previously labeled laboratory tests. In cases
where the algorithm differed from the currently assigned LOINC code, manual review
revealed that the machine learning algorithm was correct 83% of the time, compared to the
72% accuracy rate of the incumbent label.
4. Discussion
4.1. Reflections and Future Direction
Machine learning is able to leverage large amounts of data to infer complex relationships
and patterns that may otherwise be beyond the capabilities of a rule-based system or human
expert. Furthermore, while static rule-based algorithms are based on previously established
knowledge, machine learning can identify new patterns and applications, and continuously
use new data to improve its performance.
Along those lines, one of the most promising aspects of artificial intelligence in laboratory
medicine has been its success in automation. The reviewed work demonstrates significant
advancements in using machine learning algorithms to improve upon current rule-based
methods for identifying samples for manual verification or validation. Such algorithms have
already achieved excellent performance and, we anticipate, will soon be commonplace in the
modern laboratory.
Another exciting application of machine learning is its ability to leverage large amounts of
prior medical data to create more personalized interpretation of test results. Although still
in its early stages, we see the foundations for this in the work by Poole et al., who propose
a relatively simple method of using diagnosis codes to achieve slightly more personalized
reference ranges [36]. We envision future work into more comprehensive algorithms that
consider the entire clinical context of the patient to provide personalized laboratory test
reference ranges to enable precision diagnostics. The paradigm will shift from “What is a
normal hemoglobin?” to “What is a normal hemoglobin for you?”
Finally, while it is the focus of many articles considered in this review, considerable work
must still be done until machine learning prediction of laboratory test results can be utilized
to make changes in clinical practice. While studies have demonstrated a high level of
redundancy in lab panels and ordering practices, attempts at predicting laboratory test results
still fail to achieve consistently high performance across a variety of tests. Despite the lack
of a generalizable solution in this space, there are opportunities for smaller gains to be made
by optimizing testing utilization in specific situations.
them.
From a clinical point of view, as a relatively young field, machine learning in laboratory
medicine requires standardization and regulation. Currently, there are no guidelines
regarding the best practices for the clinical validation of machine learning algorithms.
Even in pathology fields where clinical machine learning tools are developing rapidly,
such as digital pathology, there are no well-established guidelines for laboratory-developed
applications or for the verification of vendor-developed software [41]. In fact, just recently,
the College of American Pathologists (CAP) assembled a committee to start addressing this
gap, including the creation of laboratory standards for AI applications [42].
Similarly, regulatory entities, such as the FDA, have not completely determined what their
This review focuses on the use of machine learning in routine laboratory testing. However,
there has been much attention given to the application of machine learning to the broader
field of pathology. Several related applications are briefly discussed here to serve as a
reference for laboratory medicine practitioners who may wish to explore these topics further.
The digitization of histopathology slides has allowed for the widespread utilization of
computer vision and other artificial intelligence methods for image interpretation. Many
studies in this area focus on histopathology in cancer [44–46]. Along these lines, the FDA
recently approved the first artificial intelligence product in pathology that can identify areas
of interest in prostate biopsy slides [47].
Similarly, digital image acquisition in microscopy has spawned additional work in this
field. Identification of cellular events such as mitosis or apoptosis can be used to flag
Finally, related to laboratory medicine is the use of machine learning for point-of-care
testing. In this field, there has been considerable emphasis on the use of predictive
algorithms in continuous glucose monitoring for patients with diabetes [52].
5. Conclusion
Machine learning promises exciting advancements in medicine, but its application in
laboratory medicine is still nascent. Because the field is young, there is an additional need for
standardization of how these algorithms are developed and presented. Regardless, several
machine learning models have achieved excellent performance in automating test result
validation and triaging samples for manual review. There is also exciting, ongoing work
in using machine learning for optimizing laboratory utilization, predicting laboratory test
results, and providing personalized laboratory test interpretation.
Acknowledgements:
The authors wish to thank Connie Wong, medical education librarian, for her help in forming our literature search
query.
Funding:
Jonathan H Chen was supported in part by the NIH/National Library of Medicine Award R56LM013365, the
Stanford Artificial Intelligence in Medicine and Imaging and Human-Centered Artificial Intelligence (AMIA-HAI)
Partnership Grant, Stanford Aging and Ethnogeriatrics Research Center (under NIH/National Institute on Aging
grant P30AG059307), the Stanford Clinical Excellence Research Center (CERC), and the Stanford Departments of
Medicine and Pathology.
References
[1]. Darcy AM, Louie AK, Roberts LW. Machine Learning and the Profession of Medicine. JAMA
2016;315:551–2. [PubMed: 26864406]
[2]. Obermeyer Z, Emanuel EJ. Predicting the Future - Big Data, Machine Learning, and Clinical
Medicine. N Engl J Med 2016;375:1216–9. [PubMed: 27682033]
[3]. Beam AL, Kohane IS. Big Data and Machine Learning in Health Care. JAMA 2018;319:1317–8.
[PubMed: 29532063]
[4]. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved
medical devices and algorithms: an online database. NPJ Digit Med 2020;3:118. [PubMed:
32984550]
[5]. Cui M, Zhang DY. Artificial intelligence and computational pathology. Lab Invest 2021;101:412–
22. [PubMed: 33454724]
[6]. Lippi G, Bassi A, Bovo C. The future of laboratory medicine in the era of precision medicine. J
Lab Precis Med 2016;1:1–5.
[7]. Cabitza F, Banfi G. Machine learning in laboratory medicine: waiting for the flood? Clin Chem
Lab Med 2018;56:516–24. [PubMed: 29055936]
[8]. Paranjape K, Schinkel M, Hammer RD, Schouten B, Nannan Panday RS, Elbers PWG, et al.
The Value of Artificial Intelligence in Laboratory Medicine. Am J Clin Pathol 2021;155:823–31.
[PubMed: 33313667]
[9]. Pillay TS. Artificial intelligence in pathology and laboratory medicine. J Clin Pathol 2021;74:407–
8. [PubMed: 34031137]
[10]. Niazi MKK, Parwani AV, Gurcan MN. Digital pathology and artificial intelligence. Lancet Oncol
2019;20:e253–61. [PubMed: 31044723]
[11]. Goldstein BA, Navar AM, Pencina MJ, Ioannidis JPA. Opportunities and challenges in
developing risk prediction models with electronic health records data: a systematic review. J
Am Med Inform Assoc 2017;24:198–208. [PubMed: 27189013]
[12]. Brnabic A, Hess LM. Systematic literature review of machine learning methods used in the
analysis of real-world data for patient-provider decision making. BMC Med Inform Decis Mak
2021;21:54. [PubMed: 33588830]
[13]. Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, et al. A clinically applicable
approach to continuous prediction of future acute kidney injury. Nature 2019;572:116–9.
[PubMed: 31367026]
[14]. Fleuren LM, Klausch TLT, Zwager CL, Schoonmade LJ, Guo T, Roggeveen LF, et al. Machine
learning for the prediction of sepsis: a systematic review and meta-analysis of diagnostic test
accuracy. Intensive Care Med 2020;46:383–400. [PubMed: 31965266]
[15]. Luo Y, Szolovits P, Dighe AS, Baron JM. Using Machine Learning to Predict Laboratory Test
Results. Am J Clin Pathol 2016;145:778–88. [PubMed: 27329638]
[16]. Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning
algorithms for disease prediction. BMC Med Inform Decis Mak 2019;19:281. [PubMed:
31864346]
[17]. Wang Y, Zhao Y, Therneau TM, Atkinson EJ, Tafti AP, Zhang N, et al. Unsupervised machine
learning for the discovery of latent disease clusters and patient subgroups using electronic health
records. J Biomed Inform 2020;102:103364. [PubMed: 31891765]
[18]. Roohi A, Faust K, Djuric U, Diamandis P. Unsupervised Machine Learning in Pathology: The
Next Frontier. Surg Pathol Clin 2020;13:349–58. [PubMed: 32389272]
[19]. Yu K-H, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng
2018;2:719–31. [PubMed: 31015651]
[20]. Shrestha A, Mahmood A. Review of Deep Learning Algorithms and Architectures. IEEE Access
2019;7:53040–65.
[21]. Azarkhish I, Raoufy MR, Gharibzadeh S. Artificial intelligence models for predicting iron
deficiency anemia and iron serum level based on accessible laboratory data. J Med Syst
2011;36:2057–61. [PubMed: 21503744]
[22]. Lidbury BA, Richardson AM, Badrick T. Assessment of machine-learning techniques on large
pathology data sets to address assay redundancy in routine liver function test profiles. Diagnosis
(Berl) 2015;2:41–51. [PubMed: 29540013]
[34]. Wilkes EH, Rumsby G, Woodward GM. Using Machine Learning to Aid the Interpretation of
Urine Steroid Profiles. Clin Chem 2018;64:1586–95. [PubMed: 30097499]
[35]. Peng G, Tang Y, Cowan TM, Enns GM, Zhao H, Scharfe C. Reducing False-Positive Results in
Newborn Screening Using Machine Learning. Int J Neonatal Screen 2020;6. 10.3390/ijns6010016.
[36]. Poole S, Schroeder LF, Shah N. An unsupervised learning method to identify reference intervals
from a clinical database. J Biomed Inform 2015;59:276–84. [PubMed: 26707631]
[37]. Yang Q, Mwenda KM, Ge M. Incorporating geographical factors with artificial neural networks
to predict reference values of erythrocyte sedimentation rate. Int J Health Geogr 2013;12:11.
[PubMed: 23497145]
[38]. Huff SM, Rocha RA, McDonald CJ, De Moor GJ, Fiers T, Bidgood WD Jr, et al. Development
of the Logical Observation Identifier Names and Codes (LOINC) vocabulary. J Am Med Inform
Assoc 1998;5:276–92. [PubMed: 9609498]
[39]. Fillmore N, Do N, Brophy M, Zimolzak A. Interactive Machine Learning for Laboratory Data
Integration. Stud Health Technol Inform 2019;264:133–7. [PubMed: 31437900]
[40]. Parr SK, Shotwell MS, Jeffery AD, Lasko TA, Matheny ME. Automated mapping of laboratory
tests to LOINC codes using noisy labels in a national electronic health record system database. J
Am Med Inform Assoc 2018;25:1292–300. [PubMed: 30137378]
[41]. Baxi V, Edwards R, Montalto M, Saha S. Digital pathology and artificial intelligence in
translational medicine and clinical practice. Mod Pathol 2022;35:23–32. [PubMed: 34611303]
[42]. College of American Pathologists. Artificial Intelligence (AI) Committee. College
of American Pathologists 2021. https://www.cap.org/member-resources/councils-committees/artificial-intelligence-ai-committee/
(accessed January 24, 2022).
vision approaches for phenotypic profiling. J Cell Biol 2016;216:65–71. [PubMed: 27940887]
[49]. Falk T, Mai D, Bensch R, Çiçek Ö, Abdulkadir A, Marrakchi Y, et al. U-Net: deep learning for
cell counting, detection, and morphometry. Nat Methods 2018;16:67–70. [PubMed: 30559429]
[50]. Wang S, Zhou Y, Qin X, Nair S, Huang X, Liu Y. Label-free detection of rare circulating tumor
cells by image analysis and machine learning. Sci Rep 2020;10:12226. [PubMed: 32699281]
[51]. Syed-Abdul S, Firdani R-P, Chung H-J, Uddin M, Hur M, Park JH, et al. Artificial Intelligence
based Models for Screening of Hematologic Malignancies using Cell Population Data. Sci Rep
2020;10:4583. [PubMed: 32179774]
[52]. Perkins BA, Sherr JL, Mathieu C. Type 1 diabetes glycemic management: Insulin therapy,
glucose monitoring, and automation. Science 2021;373:522–7. [PubMed: 34326234]
Figure 1.
Machine learning models are trained using prior observations (samples). Features from prior
observations are extracted and processed into a data matrix. In supervised machine learning,
each observation is labeled with an outcome.
Figure 2.
Graphical representation of types of machine learning models: (A) a simple decision tree
and (B) deep learning neural network.
Figure 3.
Bar plot showing PubMed query results by year, adjusted by number of months included in
the queried year (i.e. only October to December of 2011 and January to September of 2021
are included in the search query date range).
Figure 4.
Diagram showing manuscript inclusion and exclusion criteria for review.
Table 1.
True Positive Rate (Sensitivity) | Probability that a truly positive result is predicted to be positive
True Negative Rate (Specificity) | Probability that a truly negative result is predicted to be negative
False Positive Rate = 1 - Specificity | Probability that a truly negative result is falsely predicted to be positive
Table 2.
Author and Year | Objective and Machine Learning Task | Best Model | Major Themes
Azarkhish (2012) | Predict iron deficiency anemia and serum iron levels from CBC indices | Neural Network | Prediction
Cao (2012) | Triage manual review for urinalysis samples | Tree-based | Automation
Lidbury (2015) | Predict liver function test results from other tests in the panel, highlighting redundancy in the liver function panel | Tree-based | Prediction, Utilization
Demirci (2016) | Classify whether critical lab result is valid or invalid using other lab values and clinical information | Neural Network | Automation, Interpretation, Validation
Luo (2016) | Predict ferritin from other tests in iron panel | Tree-based | Prediction, Utilization
diagnoses
Lee (2019) | Predict LDL-C levels from a limited lipid panel more accurately than current gold standard equations | Neural Network | Interpretation, Prediction
Wang (2020) | Automatically verify if lab test result is valid or invalid | Tree-based | Validation, Automation
Dunn (2021) | Predict laboratory test results from wearable data | Tree-based | Prediction