How Machine Learning Will Transform Biomedicine
Perspective
This Perspective explores the application of machine learning toward improved diagnosis and
treatment. We outline a vision for how machine learning can transform three broad areas of biomedicine:
clinical diagnostics, precision treatments, and health monitoring, where the goal is to maintain
health through a range of diseases and the normal aging process. For each area, early instances of
successful machine learning applications are discussed, as well as opportunities and challenges
for machine learning. When these challenges are met, machine learning promises a future of
rigorous, outcomes-based medicine with detection, diagnosis, and treatment strategies that are
continuously adapted to individual and environmental differences.
Machine learning leverages sophisticated algorithms operating on large-scale, heterogeneous datasets to uncover useful patterns that would be difficult or impossible for even well-trained individuals to identify. There are already many applications of this approach throughout science and society, ranging from game playing (Silver et al., 2018), to product recommendations (Batmaz et al., 2019), to controlling self-driving cars (Bojarski et al., 2016). In biomedicine, work in the Human Genome Project (Venter et al., 2001), efforts in cancer omics (e.g., The Cancer Genome Atlas [Tomczak et al., 2015], the International Cancer Genome Consortium [Zhang et al., 2019], and the Clinical Proteomic Tumor Analysis Consortium [Ellis et al., 2013]), and numerous international machine learning competitions such as DREAM challenges (Saez-Rodriguez et al., 2016; Sage Bionetworks, 2020) and the Critical Assessment of Genome Interpretation (Andreoletti et al., 2019) have shown the power of this approach. The ability to collect and analyze large datasets related to medical treatments and outcomes promises to transform medicine into a data-driven, outcomes-oriented discipline with profound implications for disease detection, diagnosis, and treatment. Collection of molecular and phenotypic data has become pervasive and includes genomic testing for personalized treatment of cancer, high-resolution two- and three-dimensional anatomical imaging of organs, histological analyses of tissue biopsies, and smart watches that monitor heart rates and notify wearers of irregularities (Shilo et al., 2020). These and many other collected data provide the raw material for a future of early, more accurate diagnoses, personalized treatments, and ongoing monitoring in support of overall health.

Machine learning will help realize a future of improved health care by unlocking the potential of large biomedical and patient datasets. Early uses of machine learning in diagnosis and treatment have shown promise to diagnose breast cancer from X-rays (McKinney et al., 2020; Wu et al., 2019), discover new antibiotics (Stokes et al., 2020), predict onset of gestational diabetes from electronic health records (Artzi et al., 2020), and identify clusters of patients that share a molecular signature of treatment response (Zitnik et al., 2019). Automated pattern recognition through machine learning is essential due to the enormity and complexity of biomedical data; manual analysis is both inefficient and untenable. Equally important, many human diseases involve a complex constellation of changes that occur dynamically and vary from patient to patient. Understanding this complexity requires analysis of large-scale heterogeneous data to identify novel patterns that, after rigorous evaluation, can be used for diagnosis and treatment. Machine learning, then, can assist biomedical scientists and medical professionals by identifying and summarizing meaningful patterns from large datasets (Rajkomar et al., 2019). Careful evaluation of the patterns found and predictions made by machine learning applications in diagnosis and treatment is essential. "Ground truth" data, in which associations between data and outcome are known, can be used to rigorously evaluate the performance of novel algorithms. Such evaluation data may be quantitative, such as biomarker reduction on treatment, or more qualitative, such as overall patient health. It is also important to appreciate that ground truth may change depending on individual characteristics such as age, gender, and environmental exposures.

Recognizing this, there are a growing number of research programs designed to collect and organize large-scale datasets linking variables to health status, which can be used to train and evaluate machine learning approaches. Programs in cancer that aggregate molecular profiles from experimental model systems or patient samples together with diagnostic, prognostic, and therapeutic responses provide examples of these valuable data repositories. For example, the Cancer Dependency Map (Tsherniak et al., 2017) has collected multimodal molecular profiles, drug response, and genetic viability data on more than 1,000 cancer cell lines. The AACR Project GENIE (AACR Project GENIE Consortium, 2017) has collected genomic profiles and clinical data for more than 19,000 patients, and the ASCO CancerLinQ is building a similar database of hundreds of thousands of patients. Coupled with advanced algorithms, such programs have the potential to transform our understanding of diseases and improve our ability to predict disease outcomes.
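As an illustration of evaluating predictions against ground truth, the following minimal sketch computes sensitivity and specificity separately for each patient subgroup, reflecting the point that ground truth may differ with individual characteristics such as age. The function names, the 0.5 decision threshold, the subgroup labels, and the toy records are all hypothetical choices made for this example; they are not taken from any of the studies cited here.

```python
from collections import defaultdict


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = disease present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against subgroups with no positive or no negative cases.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def evaluate_by_group(records, threshold=0.5):
    """Evaluate model scores against ground truth within each subgroup.

    records: iterable of (group, true_label, predicted_score) tuples.
    threshold: hypothetical cutoff that turns a score into a positive call.
    """
    truths = defaultdict(list)
    calls = defaultdict(list)
    for group, label, score in records:
        truths[group].append(label)
        calls[group].append(1 if score >= threshold else 0)
    return {g: sensitivity_specificity(truths[g], calls[g]) for g in truths}


# Toy, made-up records: (age group, ground-truth label, model score).
records = [
    ("under_50", 1, 0.9), ("under_50", 0, 0.2),
    ("under_50", 1, 0.4), ("under_50", 0, 0.1),
    ("over_50", 1, 0.8), ("over_50", 1, 0.7),
    ("over_50", 0, 0.6), ("over_50", 0, 0.3),
]
results = evaluate_by_group(records)
for group, (sens, spec) in results.items():
    print(f"{group}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting metrics per subgroup rather than only in aggregate is one simple way to notice when a model that looks accurate overall performs differently for particular groups of patients.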
met needs to be addressed. We conclude by identifying several
cross-cutting challenges that, if solved, will help realize the full
and biomarkers, machine learning models are likely to be substantially more accurate than current practice, which is often limited to a few markers and reflects only a narrow view of complex diseases.

Joint human-computer diagnostic approaches, such as those illustrated in Figure 1, are likely to become common because they take advantage of the strengths of both humans and computers. In this collaborative approach, physicians will make a final diagnosis by integrating all available information, including that provided by machine learning systems (Ahuja, 2019). Machine learning systems will have a key role by automating routine diagnosis, flagging challenging cases that require more human input, and providing additional information useful in making diagnoses (e.g., Ardila et al., 2019). Moreover, machine learning systems may use different features than medical professionals to make diagnoses, though care will be required to assess the biological utility of such features. As a result, approaches that integrate knowledge from both medical professionals and advanced algorithms will lead to improved diagnoses. Ensuring that machine learning software is transparent will be critical before widespread deployment and adoption. "Transparency" in this context includes description of the optimized objectives, strengths, quantitative performance, and limitations of a particular algorithm (Cai et al., 2019) as well as the procedures used to validate the algorithm. These attributes will help medical professionals decide when and how to use machine learning applications to obtain valid results and improve decision making. Applications that use machine learning can help build trust in the system and facilitate deeper understanding of the underlying biological mechanism of disease by explaining predictions, such as by highlighting the most important features used (Ching et al., 2018; Litjens et al., 2016).

As more advanced clinical testing technologies are coupled with machine learning, it will be important to consider tradeoffs between disease detection rates, patient outcomes, and other factors that impact patient health and quality of life. Disease detection rates may increase with the use of machine learning technologies, and disease-specific research will be needed to
sign of cardiac arrest (Chan et al., 2019). In the future, machine learning software is likely to be used to identify new biomarkers from wearable and audio sensor data, perhaps by integrating data across different types of devices. Both traditional supervised learning and deep learning are likely to play roles in developing models from wearable data.

Using machine learning together with data collected from smartphones provides new opportunities for diagnostics as well. Deep learning approaches have been applied to analyze pictures from smartphone cameras to identify different types of skin cancers (Esteva et al., 2017) and also to diagnose diabetic retinopathy (Micheletti et al., 2016). Recent studies have found that sensory data (e.g., voice, tapping, response time, and accelerometer data) collected from smartphones and processed using machine learning can be used to monitor symptoms and progression of Parkinson's disease (Arora et al., 2015; Espay et al., 2016; Ginis et al., 2016; Pereira et al., 2016). These prototype applications suggest a role for machine learning where wearables, home devices, and smartphones are used to capture all kinds of data, including biometric measurements, photos, dietary intake, and even environmental information (i.e., the "exposome" [Vermeulen et al., 2020]). By connecting this information with diagnoses, machine learning will be used to identify patterns within the data that suggest a particular diagnosis.

The foundation of health management is the ongoing monitoring of individual behavior and body functioning through home and wearable devices together with readouts from routine blood sampling. Personalized models of baseline functions and activity will be built by customizing population-level models with data collected for each individual. A key advantage of this approach is that personal baselines can be established and deviations from those baselines, which may indicate a change in health status, can be detected. Using personalized models, machine learning applications will monitor individuals for any changes from normal and notify individuals when a change requires consultation with a medical professional. An interesting possibility along these lines is suggested by recent work showing that monitoring of individual internet symptom searches (in essence, self-reported health issues such as weight loss, bronchitis, cough, chest pain, etc.) coupled with machine-learned tendencies from many individuals can enable early detection of lung (White and Horvitz, 2017) and pancreatic (Paparrizos et al., 2016) cancers. This could lead to a physician or patient alert system that recommends medical attention when a more serious issue may explain the seemingly innocuous symptoms searched for. Of course, many issues regarding privacy would have to be overcome to make this possible.

Once in a clinical setting, high-fidelity imaging and molecular testing will be interpreted by medical professionals with the help of machine learning to identify noteworthy biomarkers and make a final diagnosis. Disease diagnoses that require treatment will use multiscale modeling and automated search results for similar patients to inform treatment selection.

After diagnosis and treatment, health management begins again with ongoing monitoring of individual health. This time, however, there are multiple goals that a machine learning system must meet: monitor how the individual is responding to treatment, watch for any adverse effects, and monitor overall health and
Brown, B.P., Zhang, Y.-K., Westover, D., Yan, Y., Qiao, H., Huang, V., Du, Z., Smith, J.A., Ross, J.S., Miller, V.A., et al. (2019). On-target resistance to the mutant-selective EGFR inhibitor osimertinib can develop in an allele specific manner dependent on the original EGFR activating mutation. Clin. Cancer Res. Published online February 22, 2019. https://doi.org/10.1158/1078-0432.CCR-18-3829.
Bumgarner, J.M., Lambert, C.T., Hussein, A.A., Cantillon, D.J., Baranowski, B., Wolski, K., Lindsay, B.D., Wazni, O.M., and Tarakji, K.G. (2018). Smartwatch Algorithm for Automated Detection of Atrial Fibrillation. J. Am. Coll. Cardiol. 71, 2381–2388.
Cai, C.J., Winter, S., Steiner, D., Wilcox, L., and Terry, M. (2019). "Hello AI": Uncovering the Onboarding Needs of Medical Practitioners for Human-AI Collaborative Decision-Making. Proc. ACM Hum. Comput. Interact. 3, 104.
Camacho, D.M., Collins, K.M., Powers, R.K., Costello, J.C., and Collins, J.J. (2018). Next-Generation Machine Learning for Biological Networks. Cell 173, 1581–1592.
Car, J., Tan, W.S., Huang, Z., Sloot, P., and Franklin, B.D. (2017). eHealth in the future of medications management: personalisation, monitoring and adherence. BMC Med. 15, 73.
Chan, J., Rea, T., Gollakota, S., and Sunshine, J.E. (2019). Contactless cardiac arrest detection using smart devices. NPJ Digit. Med. 2, 52.
Chang, S., Chiang, R., Wu, S., and Chang, W. (2016). A Context-Aware, Interactive M-Health System for Diabetics. IT Prof. 18, 14–22.
Chang, P., Grinband, J., Weinberg, B.D., Bardis, M., Khy, M., Cadena, G., Su, M.-Y., Cha, S., Filippi, C.G., Bota, D., et al. (2018a). Deep-Learning Convolutional Neural Networks Accurately Classify Genetic Mutations in Gliomas. AJNR Am. J. Neuroradiol. 39, 1201–1207.
Chang, Y., Park, H., Yang, H.-J., Lee, S., Lee, K.-Y., Kim, T.S., Jung, J., and Shin, J.-M. (2018b). Cancer Drug Response Profile scan (CDRscan): A Deep Learning Model That Predicts Drug Effectiveness from Cancer Genomic Signature. Sci. Rep. 8, 8857.
Ellis, M.J., Gillette, M., Carr, S.A., Paulovich, A.G., Smith, R.D., Rodland, K.K., Townsend, R.R., Kinsinger, C., Mesri, M., Rodriguez, H., and Liebler, D.C.; Clinical Proteomic Tumor Analysis Consortium (CPTAC) (2013). Connecting genomic alterations to cancer biology with proteomics: the NCI Clinical Proteomic Tumor Analysis Consortium. Cancer Discov. 3, 1108–1112.
Espay, A.J., Bonato, P., Nahab, F.B., Maetzler, W., Dean, J.M., Klucken, J., Eskofier, B.M., Merola, A., Horak, F., Lang, A.E., et al.; Movement Disorders Society Task Force on Technology (2016). Technology in Parkinson's disease: Challenges and opportunities. Mov. Disord. 31, 1272–1282.
Esteva, A., Kuprel, B., Novoa, R.A., Ko, J., Swetter, S.M., Blau, H.M., and Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 115–118.
Fu, Y., and Guo, J. (2018). Blood Cholesterol Monitoring With Smartphone as Miniaturized Electrochemical Analyzer for Cardiovascular Disease Prevention. IEEE Trans. Biomed. Circuits Syst. 12, 784–790.
Gao, F., Wang, W., Tan, M., Zhu, L., Zhang, Y., Fessler, E., Vermeulen, L., and Wang, X. (2019). DeepCC: a novel deep learning-based framework for cancer molecular subtype classification. Oncogenesis 8, 44.
Gerstung, M., Papaemmanuil, E., Martincorena, I., Bullinger, L., Gaidzik, V.I., Paschka, P., Heuser, M., Thol, F., Bolli, N., Ganly, P., et al. (2017). Precision oncology for acute myeloid leukemia using a knowledge bank approach. Nat. Genet. 49, 332–340.
Ginis, P., Nieuwboer, A., Dorfman, M., Ferrari, A., Gazit, E., Canning, C.G., Rocchi, L., Chiari, L., Hausdorff, J.M., and Mirelman, A. (2016). Feasibility and effects of home-based smartphone-delivered automated feedback training for gait in people with Parkinson's disease: A pilot randomized controlled trial. Parkinsonism Relat. Disord. 22, 28–34.
Gulshan, V., Peng, L., Coram, M., Stumpe, M.C., Wu, D., Narayanaswamy, A., Venugopalan, S., Widner, K., Madams, T., Cuadros, J., et al. (2016). Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA 316, 2402–2410.
Metzcar, J., Wang, Y., Heiland, R., and Macklin, P. (2019). A Review of Cell-Based Computational Modeling in Cancer Biology. JCO Clin. Cancer Inform. 3, 1–13.
Micheletti, J.M., Hendrick, A.M., Khan, F.N., Ziemer, D.C., and Pasquel, F.J. (2016). Current and Next Generation Portable Screening Devices for Diabetic Retinopathy. J. Diabetes Sci. Technol. 10, 295–300.
Mobadersany, P., Yousefi, S., Amgad, M., Gutman, D.A., Barnholtz-Sloan, J.S., Velázquez Vega, J.E., Brat, D.J., and Cooper, L.A.D. (2018). Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl. Acad. Sci. USA 115, E2970–E2979.
Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. (2001). The sequence of the human genome. Science 291, 1304–1351.
Vermeulen, R., Schymanski, E.L., Barabási, A.-L., and Miller, G.W. (2020). The exposome and health: Where chemistry meets biology. Science 367, 392–396.
Way, G.P., and Greene, C.S. (2019). Discovering Pathway and Cell Type Signatures in Transcriptomic Compendia with Machine Learning. Annu. Rev. Biomed. Data Sci. 2, 1–17.
Way, G.P., Sanchez-Vega, F., La, K., Armenia, J., Chatila, W.K., Luna, A., Sander, C., Cherniack, A.D., Mina, M., Ciriello, G., et al.; Cancer Genome Atlas