Artificial Intelligence/Machine Learning in Nuclear Medicine and Hybrid Imaging

Patrick Veit-Haibach • Ken Herrmann
Editors
Patrick Veit-Haibach
Joint Department of Medical Imaging
University Health Network
Toronto, ON, Canada

Ken Herrmann
Department of Nuclear Medicine
Universitätsmedizin Essen
Essen, Nordrhein-Westfalen, Germany
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2022
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher,
whether the whole or part of the material is concerned, specifically the rights of translation,
reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any
other physical way, and transmission or information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in
this book are believed to be true and accurate at the date of publication. Neither the publisher nor
the authors or the editors give a warranty, expressed or implied, with respect to the material
contained herein or for any errors or omissions that may have been made. The publisher remains
neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Foreword
It is such a pleasure to write the foreword for this visionary and timely book
on the role of artificial intelligence (AI)/machine learning (ML) in nuclear
medicine and hybrid imaging—including molecular imaging and theranos-
tics. Discussions of the role of AI/ML are increasingly en vogue in all areas
of medicine, from pathology and laboratory medicine to imaging, surgery,
neurology, and cardiology, to name a few [1, 2]. In the world of medical
imaging, examples of meaningful applications of AI/ML include distinguish-
ing normal from abnormal findings (without necessarily providing a diagno-
sis or a list of differentials); computer-assisted disease detection and scoring
to inform patient management (e.g., in the assessment of coronary calcium
[3], detection and management of thyroid nodules [4], evaluation of the pros-
tate by MRI [5], or staging of lung cancer or lymphoma with 18F-FDG PET
[6]); computer-enhanced image reconstruction [7]; and lesion tracking, quan-
tification, and categorization for the assessment of treatment response (e.g.,
with RECIST or PERCIST).
Enthusiasm about the many potential applications of AI/ML is not unmiti-
gated. Concerns have been raised about the lack of explainability [8] and
reproducibility [9] of ML-generated data, as well as the lack of validation and
proof of applicability in real-life scenarios and under varying conditions (e.g.,
across age groups and ethnicities). Importantly, to be clinically meaningful,
AI/ML models should obviate the need to perform a currently routine task or
provide some other advantage, such as an improvement in diagnostic perfor-
mance or prediction of risk and patient outcome; a reduction in errors (e.g., in
exam selection or interpretation); or increased throughput in the clinic. To
their credit, Drs. Veit-Haibach and Herrmann have assembled an outstanding
international group of authors, who address these issues head-on. The text
examines both the challenges of applying AI/ML and the wide range of ben-
efits AI/ML will likely provide for workflow and clinical care. While the
potential of AI/ML for advancing healthcare has been touted for quite a while,
advances in technology are now bringing this potential closer to realization.
AI and ML will have a tremendous impact on nuclear medicine and hybrid
imaging on both the front and the back end. Many functions will be auto-
mated, improving efficiency while ensuring much better quality control.
The book is divided into several main parts. Part I focuses on the technical
aspects of AI and ML and includes a much-needed chapter on repeatability,
reproducibility, and standardization of techniques. Lack of reproducibility
and an unwillingness to share codes or other programmatic details have been
pervasive in the field of AI and ML [9], and a lack of standardization has also
stymied the field for many years. It is now time for investigators and vendors
to address these issues. Part II lays out current and potential clinical applica-
tions. While modern computational techniques are essential to contemporary
applications of AI/ML, the groundwork for some of the projects described in
this section started decades ago, without using the terms AI and ML; for
example, early attempts at computer-assisted recognition, characterization,
and description of imaging patterns led to the generation of standardized
reports in nuclear cardiology. In this book, Slomka et al. discuss some of the
more recent work in this area, all ultimately aimed at the generation of images
of better diagnostic quality, with shorter acquisition times or lower injected
activities, as well as the prediction of cardiovascular risk. In oncology, differ-
ent ML methodologies, and the use of radiomics and radiogenomics, can be
expected to enhance lesion characterization tremendously, allowing the
assessment of much more than the standardized uptake value. In addition, the
ability to automatically extract total tumor volume quickly and with high
reproducibility will be extremely helpful. The clinical application of AI/ML
should also lead to the development of novel imaging biomarkers with pre-
dictive and prognostic value far exceeding that of currently available imaging
biomarkers. Of note, the value of these markers will be maximized when they
are integrated with each other and with other forms of data (clinical, labora-
tory, etc.) with the help of AI and ML. Beyond diagnostics, AI and ML are
expected to play growing roles in theranostics, aiding in appropriate patient
selection, dosimetry, and response assessment. Finally, as enormous amounts
of data are being gathered in medical imaging, as in all areas of modern life,
ethical and legal questions have been raised regarding the ownership of these
data and their appropriate use. It is therefore very timely that the editors
invited Prof. Prainsack, a political scientist, and coauthors to contribute a
chapter that helps put these pertinent issues into perspective and offers poten-
tial solutions.
We remain optimistic about applications of AI/ML in nuclear medicine
and hybrid imaging. While the future is unpredictable, we do not expect these
techniques to replace physicians; rather, we expect changes in the nature of
our work, reducing or eliminating some time-consuming tasks, while adding
or enhancing others. In closing, we congratulate the editors on contributing
this valuable treatise on AI/ML to the literature and hope it reaches the ample
audience it deserves.
Preface: Benefits and Challenges of AI/ML in Hybrid Imaging and Molecular Imaging
Artificial intelligence (AI) and machine learning (ML) are two buzzwords
which are currently electrifying multiple areas in medicine and beyond. Like
every new technology, AI/ML can induce a complex mix of emotions and
reactions in the medical community as well as in patients, ranging from
euphoria and optimism to fear and rejection. In several medical disciplines,
including but certainly not limited to the imaging specialties, the growing
role of AI/ML carries more immediate and far-reaching implications than in
other, perhaps more "manual," areas of medicine. Even well-recognized
magazines such as The Economist recently ran the headline "Why scan-reading
artificial intelligence is bad news for radiologists," whereas other journals
state that "Doomsday predictions about AI replacing radiologists are
unrealistic, dangerous."
Opinions on the use and integration of AI/ML are many, ranging from
dismissing such tools as toys for gadget enthusiasts all the way to regarding
them as providing true value for patients and physicians. This very
heterogeneous mix of views and perceptions, as well as the lack of a
dedicated perspective on how AI/ML may impact Nuclear Medicine and Hybrid
Imaging in particular, motivated us to initiate this book. Nuclear Medicine
and Hybrid Imaging may not seem the first choice for artificial intelligence
and machine learning applications, since the main advantage of these
techniques is the prediction of specific patterns rather than contextual
diagnosis, but there are in fact a multitude of specific areas where these
techniques can be used to our advantage.
Moreover, since many new technologies are prone to undergo the "Gartner
hype cycle," with a steep rise of interest and expectations following a
technology trigger and a deep drop of interest due to disillusionment, before
a final slope of enlightenment and a consecutive plateau of productivity, this
book intends to provide information that will hopefully accelerate the arrival
at the enlightenment and productivity phases.
Following this introduction, a total of five chapters are dedicated to
existing technologies, setting the stage for a better understanding of
clinical applications and their potential impact on molecular imaging and
theranostics. Franc et al. highlight the role and influence of AI/ML in
healthcare with a special focus on hybrid imaging and molecular imaging
(Chap. 1). They give real-world examples where such technologies could add
value in clinical practice. The Kleesiek group reviews and establishes
definitions and applications for radiomics, radiogenomics, artificial
intelligence, deep learning, and machine learning (Chap. 2). Rezai et al.
provide a short overview of
innovative fields such as nuclear medicine and hybrid imaging, but despite a
potentially disruptive impact in the long run, this will rather be an
evolution over time, requiring regular revisiting of its progress. Also, as
time has told us since the abovementioned comment, nothing happens overnight.
We as the editors believe that exciting times lie ahead for nuclear
medicine. Our field needs to rise to the opportunities, overcome the
challenges, and, most importantly, take ownership of driving our field toward
the future. We thank all the contributors, reviewers, and sponsors for
accompanying us on this journey and profoundly hope that this book will
trigger lively discussions and, more importantly, joint actions!
Part I Technology
Contents

1 Role and Influence of Artificial Intelligence in Healthcare, Hybrid Imaging, and Molecular Imaging
1.1 AI Applications Support the Infrastructure and Interventions of Healthcare, Including Molecular Imaging
1.1.1 Drug Development
1.1.2 Clinical Workflow
1.2 AI's Clinical Applications with a Focus on Molecular Imaging
1.2.1 Understanding Disease
1.2.2 Diagnosis
1.2.3 Radiologic-Pathology Correlation
1.2.4 Characterization
1.2.5 Treatment Planning
1.2.6 Prediction of Response to Treatment
1.2.7 Overall Prognosis
1.2.8 Reporting
1.3 Conclusion
References
1 Role and Influence of Artificial Intelligence in Healthcare, Hybrid Imaging, and Molecular Imaging

G. A. Davidzon (*) · B. Franc
Division of Nuclear Medicine & Molecular Imaging, Department of Radiology, Stanford University, Stanford, CA, USA
e-mail: [email protected]

Artificial Intelligence (AI) is a broad term commonly used to describe one of the highest growth industries worldwide, which is quickly finding new and exciting applications in healthcare, particularly for image-based diagnostic workup [1–3]. The concept of AI has evolved significantly since early work using model-driven algorithms in the 1940s to its current state, where computational power allows observational learning of patterns by a computer algorithm (i.e., machine learning (ML)) from large amounts of now digitally available medical data, precluding the need for a priori knowledge of an underlying process and thus eliminating the requirement for human-driven modeling and feature engineering [1]. ML, a narrower subfield of AI, develops algorithms that learn to perform tasks and to make decisions or predictions automatically from data, rather than having a behavior explicitly programmed. ML can be broadly subdivided into supervised and unsupervised types of learning. The latter techniques find patterns in data and provide structural learning (e.g., classes within a dataset) without the need for data annotation, making them suitable for learning from large and unlabeled datasets. Clustering analysis and Principal Component Analysis (PCA) are techniques typically used for these tasks. A recent study using the former technique identified four new different clinical phenotypes in septic patients, each with its corresponding host-response pattern and clinical outcome [2].

In contrast, supervised learning techniques may be applied in learning tasks using "cleaned" data, that is, data that is organized and provided in the form of input examples (and their features) paired with those examples' specific outputs, which constitute the "ground truth" (or labels) to be predicted when building a certain inference model. Supervised learning techniques include regression and classification algorithms such as linear and logistic regression, discriminant analysis, decision trees, random forests, naïve Bayes, support vector machines (SVM), and neural networks. These types of ML models have been evolving for decades and continue to be the most commonly used in healthcare. An early example is DXplain, a medical decision support system developed at the Laboratory of Computer Sciences at Massachusetts General Hospital, which takes a set of clinical findings (signs, symptoms, laboratory data) and produces a ranked list of differential diagnoses [3].

Deep Learning (DL) started having an impact in the early 2000s, and it took an additional decade for its use in healthcare. It also encompasses methods in the family of ML and most frequently refers to supervised learning using Artificial Neural Networks (ANN), which use multiple layers of features from inputs to progressively extract higher level features in order to predict labeled outputs. Examples of ANN include convolutional neural networks (CNN), deep belief networks (DBN), and recurrent neural networks (RNN). DL may also be used in unsupervised or semi-supervised learning and can capture exceedingly complex relationships between features and labels, for which it has shown capabilities similar to or exceeding those of humans in solving problems of computer vision (CV) in medicine [4–6]. Usually, the process of preparing datasets with features and labels can be time consuming; however, once these are readily available, ML algorithms can rapidly be trained and tested.

For instance, during a developing pandemic, when essential swab and serologic testing was lacking in the USA and around the globe, a pretrained ANN showed high accuracy in diagnosing coronavirus disease 2019 (COVID-19) while differentiating this disease from other types of viral or bacterial pneumonias using plain chest X-rays; similar ANNs validated chest CT against the viral RNA RT-PCR (reverse transcription polymerase chain reaction) test as a sensitive modality for diagnosis of COVID-19 [7–9].

Other computational training tasks use unstructured data to build models that can, for example, diagnose from physicians' notes in medical records (a.k.a. "free text") in combination with the medical literature, using either supervised or unsupervised natural language processing (NLP) learning techniques. Recent examples in this domain have shown that complications from spinal surgery can be screened from operative reports, that septic shock and suicide risk can be predicted from clinicians' notes, and that patients who need follow-up imaging can be detected using previous radiology reports [10–13].

Each of these ML methods is uniquely suited to certain tasks, but their products, models predicting a class or outcome in a narrow specific area of healthcare or medical imaging, are the current essence of AI in medicine.

AI's proposed applications in healthcare are diverse, from hospital finance to diagnostic and therapeutic applications, promising to improve efficiency by decreasing both the time that tasks take and the potential for medical error [14]. A key characteristic of AI's current and proposed applications is expanding what is possible today, not only increasing the speed or accuracy of current processes.
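To make the supervised/unsupervised distinction above concrete, here is a minimal sketch (an editorial illustration with synthetic data, not code from the chapter) that contrasts unsupervised clustering, in the spirit of the sepsis phenotyping study [2], with a supervised classifier trained on labeled examples, using the open-source scikit-learn library.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "patient" feature vectors: two latent groups, 10 features each.
X = np.vstack([rng.normal(0.0, 1.0, (100, 10)),
               rng.normal(2.0, 1.0, (100, 10))])
y = np.array([0] * 100 + [1] * 100)  # ground-truth labels (used only below)

# Unsupervised: discover structure without labels (cf. clustering and PCA).
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
X_2d = PCA(n_components=2).fit_transform(X)  # low-dimensional view of the data

# Supervised: learn a mapping from labeled (x, y) pairs, then predict unseen x.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```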
1.1 AI Applications Support the Infrastructure and Interventions of Healthcare, Including Molecular Imaging

At the time of writing this chapter, there were 114,002 results for "Artificial Intelligence" in PubMed and, while these entries date back to the year 1951, over fifty percent of the results correspond to the last 7 years. This highlights the importance of time in scientific breakthroughs. While the models and theoretical concepts behind what we now call the AI/Deep Learning revolution were first introduced over half a century ago by Frank Rosenblatt, a psychologist from Cornell University [15], and Marvin Minsky, a cognitive scientist from MIT [16], it was not until recently that these concepts were perfected and implemented at scale in various industries (including, more recently, healthcare) by computer and data scientists, owing to the explosive growth in the global digital datasphere and in computational power.

1.1.1 Drug Development

In drug development, AI models predict the chemical and pharmaceutical properties of small-molecule candidates for drug design and development, new applications of existing drugs, and new patients who can benefit from drugs; they also predict bioactivity and identify patient characteristics for clinical trials [17–20]. Moreover, AI in combination with sources of large amounts of data in the biosciences has been used to build in silico models of disease processes, such as cancer, to enable computer-aided design and testing of potential therapeutic compounds [21]. In the realm of infection, an approach using ML known as computational phenotyping was capable of predicting antibiotic resistance phenotypes in various bacterial pathogens, and another showed it could facilitate rapid drug development against SARS coronavirus 2 (SARS-CoV-2), the causative organism in COVID-19 [22, 23].

1.1.2 Clinical Workflow

In clinical medicine, some of the tasks most amenable to being performed by a computer, as well as some that can only be performed with levels of computational ability beyond the human brain, reside in the clinical workflow itself. By streamlining patient experience and clinical operations, and improving patient flow through key points in the clinical experience including admissions, discharges, and ICU transfers, AI has the potential to significantly improve the efficiency of clinical care [24]. AI also extends the capability of clinical care implementation, for example, through AI's incorporation into robotic surgical guidance systems [25, 26]. During the COVID-19 pandemic, AI has also been proposed as a real-time forecasting tool [27], and for early infection identification, monitoring, surveillance, and prevention, as well as mitigation of the impact on healthcare indirectly related to COVID-19 [28].

AI is now embedded in various day-to-day operations of many imaging departments, including scheduling, image acquisition, dose reduction, image reconstruction and post-processing, prioritization for reporting, classification of findings for reporting, and the reporting task itself [1]. Beyond plugging AI into various pieces of the existing workflow, molecular imaging workflows will eventually need to be redesigned to take full advantage of AI, for example through merging data sources into a data model to enable easier data exploration and visualization [29].

1.2 AI's Clinical Applications with a Focus on Molecular Imaging

From a clinical perspective, models developed using modern DL methods can, in certain circumstances, be generalized across diseases and imaging modalities and are typically less susceptible to errors in predictions secondary to noise. The interactions of various systems within a disease, as well as complex dependencies of disease states on each other, can be better understood with AI through DL because of its ability to aggregate multiple data streams from imaging, laboratory, genomics, proteomics, and pathology, as well as data from the electronic medical record, social networks, wearable sensors, and other data sources, to create integrated diagnostic systems [30]. One could envision multimodality DL modeling approaches integrating multiple data streams not only to compute disease prognosis but also in every step of the diagnostic imaging workflow, involving both upstream and downstream applications.
Fig. 1.1 Schematic representation of potentially useful AI applications along the image life cycle, including potential applications in the upstream (planning through image quality control) and downstream (triage through reporting) domains. Adapted with permission from Dr. Curt Langlotz
Such instances could include planning (e.g., patient selection and scheduling based on a patient's disease profile and previous and future interactions with the healthcare system), scanning (e.g., reducing diagnostic study radiation dose and bettering image quality), reading (e.g., automated detection and classification of pathologies), and reporting (e.g., automating reports with reproducible measurements and automating prediction of clinical outcomes) (Fig. 1.1).

1.2.1 Understanding Disease

In evaluating the ability of 18F-FDG PET to predict neuropsychological performance (NPP) in patients with neurofibromatosis Type 1 (NF 1), Schutze and colleagues built on the anatomical findings of MRI studies of NF 1, concluding that the accuracy in predicting NPP based on PET suggested an underlying metabolic pattern of cognitive function [32].

1.2.2 Diagnosis

Numerous AI applications have been developed in neurology and cardiology [36, 37].

In diagnostic imaging tasks, methods of using computational analysis to aid in the detection of lesions have evolved from early work on temporal subtraction methods and artificial neural networks performed in the 1960s–1980s to the sophisticated DL methodologies of today [38, 39]. AI can recognize specific diagnoses on imaging, such as pathologic bacteria on microscopy of blood samples or findings on a radiograph [40].

In cancer detection and diagnosis, AI can facilitate the workflow efficiency and accuracy of imaging clinician expertise through precise determination of tumor volume and its change over time and through tracking of multiple lesions [30]. Automated PET segmentation of nodules based on neural networks trained in the spatial and wavelet domains has been shown to be reproducible and volumetrically accurate and to demonstrate lower absolute relative error when compared to other automated techniques [41]. Other ML approaches have been useful in dealing with segmentation of larger and more complicated tumors of the head and neck, particularly in the setting of heterogeneous radiopharmaceutical uptake, and in segmenting brain tumors and classifying brain scans [42–44]. In evaluating measures that are not typically detectable by an imaging physician, AI can help further guide additional testing and patient management [45].

The greatest strength of AI may be its ability to integrate far more factors than are possible for a single physician. For example, by analyzing images in tandem with blood and other laboratory testing, genomics, and unstructured data from patient medical records, AI algorithms are being used to make diagnoses in a more holistic manner, decreasing the physician's difficulty in integrating disparate results from numerous tests and the medical history [46]. This approach is known as multimodality deep learning. A recent study using this approach showed that combining clinical, pathological, and imaging information increased the predictive power for clinical outcomes in glioblastoma multiforme, where survival is poor and ranges from one to two years in most patients [47]. Another recent promising methodology in this domain could be useful for pancancer prognosis prediction using clinical data, mRNA expression data, microRNA expression data, and whole slide histopathology images [48].

1.2.3 Radiologic-Pathology Correlation

The power of AI over the experience of any single imaging physician or pathologist is the ability to cross-reference imaging or other data from individual tumors against databases of limitless cases for comparison, rather than limiting comparison to those cases seen over the physician's career [30]. AI solutions for pathology have been shown to make diagnoses over tenfold faster than pathologists; while having obvious direct clinical applications, AI-based pathology has also shown high value in applications in the pharmaceutical industry [49].

1.2.4 Characterization

Beyond connecting lesions on imaging with specific pathologic correlations, AI can assist with other areas of classification as well. For example, in neurology, Parkinson's Disease severity can be classified with 99mTc-TRODAT-1 SPECT imaging based on support vector machine models [50]. Several applications of AI have been published in the cancer field, including detection, characterization, and monitoring response to treatment of various cancer types [30]. In the realm of hybrid imaging, molecular imaging modalities such as PET or SPECT provide molecular characterization of lesions possibly seen on companion anatomic imaging, such as CT. Increasingly, work is focusing on predicting the data that would be produced by the functional modality using the traditional anatomic modality in combination with artificial intelligence. For example, uptake of 68Ga-DOTATATE on PET has been used to label bone metastases as active on PET-CT, with subsequent development of AI models to predict activity using only radiomic data from the CT portion of the study [51]. This new paradigm may enable a greater global reach of the benefits of molecular imaging, allowing even those geographies that lack molecular imaging systems the ability to better characterize lesions using staple and inexpensive modalities.
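As a schematic of the paradigm just described, predicting functional information from anatomic imaging, the following sketch (an editorial illustration rather than the actual pipeline of [51]; the feature matrix and labels are synthetic stand-ins) fits a classifier on per-lesion CT radiomic features against PET-derived activity labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(7)

# Hypothetical per-lesion CT radiomic feature matrix (rows: bone lesions).
ct_features = rng.normal(size=(150, 40))
# Labels derived from the PET portion of the study: 1 = active, 0 = inactive.
pet_active = (ct_features[:, :3].sum(axis=1)
              + rng.normal(0, 1, 150) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    ct_features, pet_active, test_size=0.25, random_state=0)

# Standardize features, then fit a support vector machine classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```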
AI has been key in the proliferation of the field of radiomics. Radiomic approaches aim to identify imaging phenotypes specific to diseases that can be used in their diagnosis, characterization, and treatment management, approaches that some have coined "radiomic biopsy." These imaging phenotypes can be defined by characteristics of the images measured or observed by an imaging specialist, or by features extracted based upon predefined statistical imaging features, of which over 5000 have been described. Such radiomic features can identify key components of tumor phenotype for multiple lesions at multiple time points over the course of treatment, potentially allowing enhanced patient stratification, prognostication, and treatment monitoring for targeted therapies, although care must be taken in evaluating the generalizability of the results of these approaches [52]. Radiogenomics, the translation of intratumoral phenotypic features to genotypes, has been most explored in cancer imaging. Making these types of correlations requires the development of new methodologies that summarize the phenotypes of large heterogeneous populations of cells within a single tumor and look for underlying genotypic similarities [53].

1.2.5 Treatment Planning

In the realm of treatment, AI has become a pillar of the concept of personalized healthcare, whereby machine-based learning from numerous and seemingly endless sources of medical data is anticipated to identify insights into patterns of disease and prognosis [54, 55]. Models predicting response based upon any given choice of therapy would greatly inform the choices of drug therapy made by patients and their physicians.

In the delivery of radiation, AI has enabled dose distribution prediction for intensity-modulated treatment planning based on patient-specific geometry and prescription dose on CT for cancers of the head and neck [56, 57]. Similarly, AI models can predict radiation dose to normal organs for preemptive adjustment of technique [58]. Ideally, these types of techniques will also inform therapies based on radiopharmaceuticals as the field of theranostics grows.

In treatments involving radiation, AI has the potential to improve safety and quality. For example, in the realm of radiation oncology, DL with convolutional neural networks has been used to identify radiotherapy treatment delivery errors using patient-specific gamma images [59]. However, AI cannot be treated as a black box whose output should be trusted at face value, particularly when this output directly affects therapy. Rather, the fields of molecular theranostics and radiation therapy must recognize the fallibility of any technology that is misused, with potentially significant consequences, and the workforce, including physicians, technologists, and radiation physicists, must become more conversant in various AI approaches and in algorithm development [60].

1.2.6 Prediction of Response to Treatment

Many current pharmacologic therapies and radiotherapy approaches rely on indirect actions on disease, requiring the decoding of complicated molecular inter-relationships to define response to therapy, a task that is well suited for AI. Such predictive models may require the input of clinical factors; for example, one study used pretherapeutic clinical parameters to predict the outcome of 90Y radioembolization in patients with intrahepatic tumors [61]. Alternatively, models may focus on predictions made solely upon imaging, such as the prediction of radioresistant primary nasopharyngeal cancers from CT, MR, and PET imaging prior to IMRT, using radiomics analysis combined with machine learning to identify the most predictive features [62]. Finally, models may rely on a combination of predictors, such as a study by Jin et al. investigating the ability to predict treatment response based on a machine learning model combining computed tomography (CT) radiomic features and dosimetric parameters for patients with esophageal cancer (EC) who underwent concurrent chemoradiation (CRT) [63].
Fig. 1.2 This screen displays a mockup depicting an automated clinical reporting workstation. On the left-hand side, an 18F-NaF PET/CT is displayed. Here, the back-end generates VOIs in overlapping PET/CT images. The AI system detected and segmented lesions and prompts a clinical reader to accept or deny the addition of these to the clinical report (right-hand side)
References

12. Levis M, et al. Natural language processing of clinical mental health notes may add predictive value to existing suicide risk models. Psychol Med. 2020; https://doi.org/10.1017/S0033291720000173.
13. Lou R, et al. Automated detection of radiology reports that require follow-up imaging using natural language processing feature engineering and machine learning classification. J Digit Imaging. 2020;33(1):131–6.
14. Balasubramanian R, Libarikian A, McElhaney D. Insurance 2030 - the impact of AI on the future of insurance. 2018.
15. Rosenblatt F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev. 1958;65(6):386–408.
16. Minsky M. Steps toward artificial intelligence. Proc IRE. 1961;49(1):8–30.
17. Zhavoronkov A, et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat Biotechnol. 2019;37(9):1038–40.
18. Mamoshina P, et al. Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front Genet. 2018;9:242.
19. Aliper A, et al. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol Pharm. 2016;13(7):2524–30.
20. Fleming N. How artificial intelligence is changing drug discovery. Nature. 2018;557(7707):S55–7.
21. Bhattacharya T, et al. AI meets exascale computing: advancing cancer research with large-scale high performance computing. Front Oncol. 2019;9:984.
22. Drouin A, et al. Predictive computational phenotyping and biomarker discovery using reference-free genome comparisons. BMC Genomics. 2016;17(1):754.
23. Stebbing J, et al. COVID-19: combining antiviral and anti-inflammatory treatments. Lancet Infect Dis. 2020;20(4):400–2.
24. Tsay D, Patterson C. From machine learning to artificial intelligence applications in cardiac care. Circulation. 2018;138(22):2569–75.
25. Max DT. Paging Dr. Robot: a pathbreaking surgeon prefers to do his cutting by remote control. The New Yorker. 2019.
26. Gormley B. Impact of Auris Health's acquisition could be felt across med-tech. In: The Wall Street Journal. New York: Dow Jones & Company; 2020.
27. Hu Z, et al. Artificial intelligence forecasting of Covid-19 in China. arXiv e-prints. 2020. arXiv:2002.07112.
28. Ting DSW, et al. Digital technology and COVID-19. Nat Med. 2020;26(4):459–61.
29. Brodbeck D, et al. Making the radiology workflow visible in order to inform optimization strategies. Stud Health Technol Inform. 2019;259:19–24.
30. Bi WL, et al. Artificial intelligence in cancer imaging: clinical challenges and applications. CA Cancer J Clin. 2019;69(2):127–57.
31. Ding Y, et al. A deep learning model to predict a diagnosis of Alzheimer disease by using (18)F-FDG PET of the brain. Radiology. 2019;290(2):456–64.
32. Schutze M, et al. Use of machine learning to predict cognitive performance based on brain metabolism in Neurofibromatosis type 1. PLoS One. 2018;13(9):e0203520.
33. Winn AN, et al. Association of use of online symptom checkers with patients' plans for seeking care. JAMA Netw Open. 2019;2(12):e1918561.
34. Tomasev N, et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019;572(7767):116–9.
35. Putcha G. Blood-based detection of early-stage colorectal cancer using multiomics and machine learning. In: American Society of Clinical Oncology Gastrointestinal Cancers Symposium. 2020.
36. Bouton CE, et al. Restoring cortical control of functional movement in a human with quadriplegia. Nature. 2016;533(7602):247–50.
37. Mannini A, et al. A machine learning framework for gait classification using inertial sensors: application to elderly, post-stroke and Huntington's disease patients. Sensors. 2016;16(1):134.
38. Shiraishi J, et al. Computer-aided diagnosis and artificial intelligence in clinical imaging. Semin Nucl Med. 2011;41(6):449–62.
39. Shiraishi J, et al. Development of a computer-aided diagnostic scheme for detection of interval changes in successive whole-body bone scans. Med Phys. 2007;34(1):25–36.
40. Smith KP, Kang AD, Kirby JE. Automated interpretation of blood culture gram stains by use of a deep convolutional neural network. J Clin Microbiol. 2018;56(3):e01521.
41. Sharif MS, et al. Artificial neural network-based system for PET volume segmentation. Int J Biomed Imaging. 2010;2010:105610.
42. Belhassen S, Zaidi H. A novel fuzzy C-means algorithm for unsupervised heterogeneous tumor quantification in PET. Med Phys. 2010;37(3):1309–24.
43. Blanc-Durand P, et al. Automatic lesion detection and segmentation of 18F-FET PET in gliomas: a full 3D U-Net convolutional neural network study. PLoS One. 2018;13(4):e0195798.
44. Nobashi T, et al. Performance comparison of individual and ensemble CNN models for the classification of brain 18F-FDG-PET scans. J Digit Imaging. 2020;33(2):447–55.
45. Dagan N, et al. Automated opportunistic osteoporotic fracture risk assessment using computed tomography scans to aid in FRAX underutilization. Nat Med. 2020;26(1):77–82.
46. Kaplan DA. How radiologists are using machine learning. In: Diagnostic Imaging. New York: Springer; 2017.
47. Peeken JC, et al. Combining multimodal imaging and treatment features improves machine learning-based prognostic assessment in patients with glioblastoma multiforme. Cancer Med. 2019;8(1):128–36.
48. Cheerla A, Gevaert O. Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics. 2019;35(14):i446–54.
49. Pokkalla H, et al. Machine learning models accurately interpret liver histology in patients with nonalcoholic steatohepatitis (NASH). Hepatology. 2019;70(S1):187.
50. Hsu SY, et al. Feasible classified models for Parkinson Disease from 99mTc-TRODAT-1 SPECT imaging. Sensors. 2019;19:1740.
51. Acar E, et al. Machine learning for differentiating metastatic and completely responded sclerotic bone lesion in prostate cancer: a retrospective radiomics study. Br J Radiol. 2019;92(1101):20190286.
52. Morin O, et al. A deep look into the future of quantitative imaging in oncology: a statement of working principles and proposal for change. Int J Radiat Oncol Biol Phys. 2018;102(4):1074–82.
53. Chidester B, Do MN, Ma J. Discriminative bag-of-cells for imaging-genomics. Pac Symp Biocomput. 2018;23:319–30.
54. Mamoshina P, et al. Blood biochemistry analysis to detect smoking status and quantify accelerated aging in smokers. Sci Rep. 2019;9(1):142.
55. Ahmad MA, et al. Death vs. data science: predicting end of life. In: Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence.
56. Fan J, et al. Automatic treatment planning based on three-dimensional dose distribution predicted from deep learning technique. Med Phys. 2019;46(1):370–81.
57. Chen X, et al. A feasibility study on an automated method to generate patient-specific dose distributions for radiotherapy using deep learning. Med Phys. 2019;46(1):56–64.
58. Avanzo M, et al. Prediction of skin dose in low-kV intraoperative radiotherapy using machine learning models trained on results of in vivo dosimetry. Med Phys. 2019;46(3):1447–54.
59. Nyflot MJ, et al. Deep learning for patient-specific quality assurance: identifying errors in radiotherapy delivery by radiomic analysis of gamma images with convolutional neural networks. Med Phys. 2019;46(2):456–64.
60. Kearney V, et al. The application of artificial intelligence in the IMRT planning process for head and neck cancer. Oral Oncol. 2018;87:111–6.
61. Ingrisch M, et al. Prediction of (90)Y radioembolization outcome from pretherapeutic factors with random survival forests. J Nucl Med. 2018;59(5):769–73.
62. Li S, et al. Use of radiomics combined with machine learning method in the recurrence patterns after intensity-modulated radiotherapy for nasopharyngeal carcinoma: a preliminary study. Front Oncol. 2018;8:648.
63. Jin X, et al. Prediction of response after chemoradiation for esophageal cancer using a combination of dosimetry and CT radiomics. Eur Radiol. 2019;29(11):6080–8.
64. Inaki A, et al. Fully automated analysis for bone scintigraphy with artificial neural network: usefulness of bone scan index (BSI) in breast cancer. Ann Nucl Med. 2019;33(10):755–65.
65. Papp L, et al. Glioma survival prediction with combined analysis of in vivo (11)C-MET PET features, ex vivo features, and patient features by supervised machine learning. J Nucl Med. 2018;59(6):892–9.
66. Waite S, et al. Interpretive error in radiology. AJR Am J Roentgenol. 2017;208(4):739–49.
67. Sokolovskaya E, et al. The effect of faster reporting speed for imaging studies on the number of misses and interpretation errors: a pilot study. J Am Coll Radiol. 2015;12(7):683–8.
68. Jalal S, et al. Exploring the role of artificial intelligence in an emergency and trauma radiology department. Can Assoc Radiol J. 2020;72(1):167–74.
69. Do HM, et al. Augmented radiologist workflow improves report value and saves time: a potential model for implementation of artificial intelligence. Acad Radiol. 2020;27(1):96–105.
70. Rao B, et al. Utility of artificial intelligence tool as a prospective radiology peer reviewer - detection of unreported intracranial hemorrhage. Acad Radiol. 2020;28(1):85–93.
71. Cosgriff CVE, Celi LA. Data sharing in the era of COVID-19. Lancet Digital Health. 2020;2(5):E224.
72. Jha S, Topol EJ. Adapting to artificial intelligence: radiologists and pathologists as information specialists. JAMA. 2016;316(22):2353–4.
73. Duncan DE. Can AI keep you healthy. In: MIT Technology Review. Boston: MIT; 2017.
74. Langlotz CP. Will artificial intelligence replace radiologists? Radiol Artif Intell. 2019;1(3):e190058.
2 Introduction to Machine Learning: Definitions and Hybrid Imaging Applications

Jens Kleesiek

Contents
2.1 Introduction
2.2 History and Basic Definitions
2.3 Learning Paradigms
2.4 General Concepts of Machine Learning Methods
2.5 Classical Machine Learning Approaches
2.6 Artificial Neural Networks
2.7 Radiomics and Radiogenomics
2.8 Imaging Applications
2.9 Conclusions and Perspectives
References
parameters for improved disease and therapy monitoring. So far, most of these applications have been demonstrated for radiological images. Technically, these methods can also be applied in the same manner to nuclear and hybrid imaging use cases. Yet, due to particular differences in what can be measured, as well as in the acquisition and processing of the images, there are other applications unique to the field of nuclear imaging that can be enhanced by AI. As always, domain knowledge is important for devising novel applications that truly lead to a clinical impact.

2.2 History and Basic Definitions

The term artificial intelligence dates back to the 1950s, when it was coined by John McCarthy. The general opinion is that AI was founded as a research field at a summer conference at Dartmouth in 1956, bringing together some of the brightest minds of that time. Two years later, Frank Rosenblatt presented the perceptron, the first artificial neural network (ANN). Altered variants of these artificial neurons still serve as building blocks of modern architectures and applications. Less than two decades and many research projects later, the initial hype was followed by the first AI-Winter, presumably triggered by a book published by Marvin Minsky and Seymour Papert revealing limitations of the perceptron. The field recovered due to knowledge representations that thrived in the form of expert systems utilized for decision-making. Yet, it was hit by the second AI-Winter, which started in the late 1980s, once again not meeting the ambitious expectations. This second trough of disillusionment¹ ended in 1997 with the famous chess game in which IBM's Deep Blue defeated the reigning world champion Garry Kasparov. Since then, the field has prospered and has been propelled forward by several remarkable milestones. In 2012, a convolutional neural network (CNN) termed AlexNet won the ImageNet visual recognition challenge by a large margin. The authors stated that the depth of the model, i.e., the number of layers of the artificial neural network, was one of the primary reasons for its performance [1]. Although the term deep learning (DL) was coined earlier [2], this key event pushed deep learning into the mainstream. Training of deep models is made feasible by utilizing graphics processing units (GPUs), which have been and still are pushing the limits of fast matrix multiplication, the very same requirement dominating the computer gaming industry. Increasing volumes of data available for training, and novel network architectures intelligently designed for certain tasks, are additional components of the ongoing success story. Since 2016, the error rate for image classification on ImageNet data has been considerably better than the reported human error rate of 5.1% [3].

Artificial intelligence often refers to the ability of a machine to display intelligent human behavior. However, this is not a rigorous disambiguation, as there is no distinct definition of human intelligence either. The movement toward a general artificial intelligence of machines, i.e., the ability to solve arbitrary problems, is often referred to as strong AI. Yet, the vast majority of, if not all, AI-driven applications to this date are weak or narrow AI systems, tuned for a specific task, e.g., the detection of a tumor lesion within a medical image. The machine detects this pattern, and might even do this better and more consistently than any human physician, but it understands neither the content nor the implications it might have for a patient.

It is worth mentioning that, nonetheless, many algorithms can be utilized as general-purpose tools. The very same network architecture that was used for the detection of tumor lesions can be trained with data from a different domain and is then, for instance, able to detect pedestrians in a street scene. In turn, this means that we can expect many promising algorithms established within the computer vision community to be transferred to the hybrid imaging field.

¹ Segment of the hype cycle established by Gartner consultant Jackie Fenn.
2.3 Learning Paradigms

Machine learning is a subfield of AI subsuming various techniques for building mathematical models using data. Instead of explicitly programming the computer how to perform a task or solve a problem, the methods are designed to discover relationships or semantic meaning within the data, e.g., learning a mapping between input and output, or learning generative models of the data.

Different learning approaches can be distinguished. In supervised learning, input data x and associated labeled output data y are available. Together they are called training data. Supervised learning algorithms learn, using n input and output pairs (x_n, y_n), to predict an output label y for a new input x unseen during training. Sometimes this is described as learning with a teacher. Supervised learning algorithms are often used for classification or regression and usually display better performance in comparison to other approaches. The drawback is that a lot of training data is needed to obtain good models, and even if this data is available, annotating it can be quite laborious and thus expensive with respect to time and money. For these reasons, weakly supervised learning relies on training labels that are either noisy or imperfect but cheaper to obtain.

In classification algorithms, the output is restricted to a limited set of categories. Input data is categorized as belonging to a predefined class, for instance, to label a region with elevated SUVmax as either physiological or pathological uptake. In turn, the output of a regression algorithm is a numerical value, e.g., a floating-point number that corresponds to an SUV for a given voxel.

In unsupervised learning, only input data without labels or other output values is available. The goal is to discover structures and relationships within the data. A famous example is clustering analysis, which groups usually high-dimensional data based on a similarity measure. Other approaches are entitled autoencoders. In this set of methods, the input data serves at the same time as the output. During training, a compression or representation is learned that encodes the data. There are other approaches; all have in common that at some point meaning needs to be attributed to the structures discovered in the data. This step usually is a task reserved for humans.

In self-supervised learning, the data itself provides the supervision. There are different approaches in imaging applications where this paradigm has been successfully applied. Procedures include randomly sampling two patches from an image and letting the network learn to predict their relative position, using a monochrome version of the images for predicting pixel color, or making a jigsaw puzzle from the image and learning to reassemble the original image. These tricks enable the learning of a semantic representation of the data that can be exploited in downstream tasks, e.g., classification. We are not aware that self-supervised approaches have been utilized in the hybrid imaging community yet. This could be due to the fact that normally substantial amounts of data are needed, and often supervised approaches perform better.

Semi-supervised learning is a mixture of supervised and unsupervised methods. Often a small part of the data is properly annotated, whereas the rest of the training data lacks labels. This combination can often boost performance in comparison to utilizing only labeled or only unlabeled data during training.

Another important approach to dealing with scarcely available data is referred to as transfer learning. In this setting, a model is trained on available data from a different but to some extent related problem and is then fine-tuned on less data stemming from the actual problem at hand. An example would be the classification of radiological images utilizing a model pretrained for a classification task on the aforementioned ImageNet data. The problems share the same natural imaging statistics, i.e., the images are composed of textures and edges. Thus, if these statistics are already learned, only higher level meanings need to be established for the medical imaging data.
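As a hedged illustration of transfer learning (an editorial sketch, not code from the chapter), the snippet below loads an ImageNet-pretrained network with PyTorch/torchvision, freezes its feature extractor, and fine-tunes only a new classification head for a hypothetical binary uptake-classification task; the tensors stand in for preprocessed medical images.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pretrained on ImageNet; its early layers already encode
# generic imaging statistics (edges, textures).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pretrained feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classification layer with a new head for a hypothetical
# binary task (e.g., physiological vs. pathological uptake).
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head is fine-tuned on the (smaller) medical dataset.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 3-channel images.
x = torch.randn(4, 3, 224, 224)  # stand-in for preprocessed image slices
y = torch.tensor([0, 1, 0, 1])   # stand-in labels
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```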
In multi-task learning, several tasks are solved at once, these again being distinct tasks that share commonalities. It has been demonstrated that this can be beneficial in comparison to training separate models for each task, presumably by improving learning efficiency.

Reinforcement learning (RL) is yet another learning paradigm [4]. In this family of algorithms, learning incorporates an interaction with a real or simulated environment. Usually, the task comprises a policy and a value function. The policy function determines an action, and the value function the expected reward for this action. Next to robotics tasks, RL has been used in the past for solving imaging applications such as filtering [5], segmentation [6–10], feature extraction [11–13], and others. A very famous example that combines RL with DL is Google's AlphaGo [14]. In the ancient and very complex board game Go, the proposed algorithm was able to defeat a world champion and is presumably the strongest player in history. Thinking out of the box, detecting a tumor lesion within a PET scan can also be reformulated in terms of a game: scrolling through the stack of images in the least amount of time, while integrating clinical and historical information, to predict the treatment (action) that will yield the highest reward (value), i.e., overall survival time for the individual patient. Again, the very same algorithms that mastered Go could be employed to solve this task in precision oncology. However, for the game of Go, data can be simulated, whereas for the medical example, we would need disease histories from millions of patients.

2.4 General Concepts of Machine Learning Methods

Despite sharing commonalities, there are different ways to categorize ML methods. One way is to distinguish between discriminative and generative models. Discriminative models aim at determining a decision boundary. This boundary can be either linear or nonlinear (Fig. 2.1a). In probabilistic terms, this means that a conditional probability distribution p(y|x) is learned that allows a class label y to be assigned for a given data point x. In contrast, in generative models, the joint probability distribution p(x, y) is sought, explicitly modeling the actual distribution of each class y. Next to transforming the joint probability into a conditional probability using Bayes' rule, this allows data to actually be generated from the model by sampling from the distributions, hence the name. When looking for a higher accuracy, discriminative models are often the preferred choice.

Many ML algorithms are parameterized. For instance, parameters that are adjusted during learning are the weights of a neural network or the coefficients of a regression model. In addition, there are also hyperparameters. These hyperparameters are manually set by the user prior to starting the learning algorithm and include, e.g., the number of training steps and the learning rate.

A general approach often found, and one of the most fundamental components of ML algorithms, is that during learning an objective function, a.k.a. loss, error, or cost function, is optimized. Often this is accomplished using gradient descent, comprising a variety of iterative algorithms, or combinatorial approaches. This is necessary because, as in most real-world problems, an analytical solution usually cannot be found. Dozens of objective functions are available, and developing the right one for a given problem is an important part of designing an algorithm. A popular error is the mean squared error (MSE), which computes the average squared difference between the predicted and true values. For an image, this would result in comparing each pixel value of the predicted image to the real image by summing over all squared differences, normalized by the total number of pixels. In imaging applications, other loss functions have been described, e.g., the perceptual loss [15]. This loss aims at capturing higher level differences between images, like content or style. The advantage of such a loss is immediately apparent, as similar looking images, for instance, identical images that are only shifted by a few pixels, would yield a higher MSE in comparison to the perceptual error.
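To ground the notions of objective function and gradient descent, here is a self-contained sketch (an editorial addition with toy data) that fits a one-parameter linear model by iteratively descending the MSE loss.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: y = 3x plus noise; the model y_hat = w * x has one parameter w.
x = rng.uniform(0.0, 1.0, 200)
y = 3.0 * x + rng.normal(0.0, 0.1, 200)

w = 0.0   # initial parameter
lr = 0.5  # learning rate (a hyperparameter, set before training)

for step in range(100):
    y_hat = w * x
    mse = np.mean((y_hat - y) ** 2)        # objective (loss) function
    grad = np.mean(2.0 * (y_hat - y) * x)  # dMSE/dw
    w -= lr * grad                         # gradient descent update

print(f"learned w = {w:.3f}, final MSE = {mse:.4f}")  # w should approach 3
```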
Fig. 2.1 (a) Two-dimensional toy problem. Features 1 and 2, e.g., weight and height, characterize the instances of the two different classes, represented by blue circles and green stars. The dashed line shows a linear and the dotted line a nonlinear decision boundary. The decision boundary is learned by the ML model. For instance, if the model is too powerful or non-representative training data was used, overfitting might occur, i.e., it does not generalize well on unseen data. Depending on the type of model used, a different classification might result for an unknown data point (question mark in red box). Overfitting can be prevented by regularization. (b) Bias-variance trade-off for training ML models. The classical goal is to find the sweet spot that balances under- and overfitting (dotted vertical line within blue shading), as the models tend to perform worse on unseen test data (solid line) even though the training error (dashed line) further decreases. A recent publication proposed the existence of an interpolating regime: very powerful models, like neural networks, can be trained to interpolate the training data, classically considered as overfitting, and nevertheless display an improved performance on unseen data
During the often iterative training procedure, the parameters of the model are adjusted so that the total error is minimized. Several additional concepts are important in connection with this procedure. One of these is the bias-variance trade-off, which is useful in understanding the different types of error sources affecting the model quality. A high bias leads to underfitting, not capturing the relationship present in the data. This might be due to choosing the wrong learning algorithm or model capacity for the problem. On the other hand, a high variance is present when the model captures noise in the training data or the algorithm is trained with nonrepresentative data. It causes overfitting to the training data, and the model usually displays poor generalizability. Apart from bias and variance, the irreducible error, inherent to the problem itself, contributes to the total expected model error. The classical goal is to find the sweet spot that balances under- and overfitting. However, a recent publication suggests that this view might need to be extended [16]. Empirical evidence exists that very powerful models, like neural networks, can be trained to interpolate (and extrapolate from) the training data, classically considered as overfitting, and nevertheless display improved performance on unseen data (Fig. 2.1b).

A way to control overfitting is regularization. Regularization can be described as reducing the variance of the model without substantially increasing its bias. This can be realized by introducing constraints on the model parameters, e.g., that they do not become too large (L2-norm, ridge regression) or that a sparse solution is preferred (L1-norm, lasso). This restrains the model from becoming too flexible and, thus, prevents it from fitting the data exactly. For neural networks, techniques like dropout are used to prevent overfitting. In this procedure, neurons are randomly disabled (dropped out) during training, reducing the effect of specific neurons (and their weights) on the overall output of the network.
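As a concrete illustration of these penalties, a short scikit-learn sketch on synthetic data contrasts the L2 (ridge) and L1 (lasso) constraints; the data and the alpha values are arbitrary choices for demonstration, not recommendations:

    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 20))
    y = X[:, 0] * 3.0 + rng.normal(scale=0.1, size=100)  # only feature 0 matters

    # L2-norm penalty (ridge): shrinks all coefficients toward zero.
    ridge = Ridge(alpha=1.0).fit(X, y)

    # L1-norm penalty (lasso): prefers a sparse solution, zeroing many coefficients.
    lasso = Lasso(alpha=0.1).fit(X, y)

    print(np.count_nonzero(ridge.coef_))  # typically all 20 coefficients remain non-zero
    print(np.count_nonzero(lasso.coef_))  # typically only a few survive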
Another way of addressing overfitting is cross-validation (CV). In cross-validation, the data is split into different folds of training and validation data. The model is trained and evaluated on each of these splits, identifying the model with a parameter set that probably works best for new data not being part of the CV procedure, e.g., the one which on average worked best on the validation sets.
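A hedged sketch of this fold-wise procedure, using scikit-learn on synthetic data (the model choice, fold count, and data are illustrative only):

    import numpy as np
    from sklearn.model_selection import KFold
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    scores = []
    for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        model = LogisticRegression().fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[val_idx], y[val_idx]))  # accuracy on the held-out fold

    # The candidate (here: the model choice) is judged by its average
    # validation performance across the folds.
    print(np.mean(scores))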
The available data should be split into training, validation, and test sets. (In some sources, the meaning of the test and validation sets is reversed, but it can usually be deduced from context.) The training data is used during training, the validation data to evaluate the model during learning (serving as a proxy for test data), and the test data should only be touched at the very end for producing the final results. This allocation of the data can be quite challenging, especially if only few data are available, as is quite often the case in the medical imaging field. Therefore, when reading publications on AI applications, attention should be paid to whether the division of the data into these three groups has been carried out or whether CV has been conducted.
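One possible realization of this three-way allocation, sketched with scikit-learn (the 60/20/20 proportions are an assumption for illustration, not a rule from the text):

    import numpy as np
    from sklearn.model_selection import train_test_split

    X, y = np.arange(1000).reshape(-1, 1), np.arange(1000) % 2

    # First carve off the test set; it is only touched for the final results.
    X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Then split the remainder into training and validation data.
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.25, random_state=0
    )  # roughly 60/20/20 overall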
To assess the performance of a classification algorithm, quite often the area under the curve (AUC) of the receiver operating characteristic (ROC) is reported. The value scales between 0.0 and 1.0; the higher the AUC, the better the model. The ROC curve is obtained by plotting the true-positive rate (sensitivity) versus the false-positive rate (1.0 − specificity). Dozens of performance measures exist for assessing segmentations in images [17], and novel metrics are proposed constantly. A popular measure is the Dice score, which geometrically describes the area of the overlap of two segmentations divided by the total size of the two areas. For regression problems, other metrics exist.
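Both measures are straightforward to compute. A small sketch with invented labels and masks; the dice() helper is a hypothetical name, and its formula follows the overlap definition given above:

    import numpy as np
    from sklearn.metrics import roc_auc_score

    # AUC of the ROC: scores between 0.0 and 1.0, higher is better.
    y_true = np.array([0, 0, 1, 1])
    y_score = np.array([0.1, 0.4, 0.35, 0.8])
    print(roc_auc_score(y_true, y_score))  # 0.75

    def dice(a: np.ndarray, b: np.ndarray) -> float:
        # Twice the overlap divided by the total size of the two segmentations.
        return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

    seg_a = np.zeros((10, 10), bool); seg_a[2:7, 2:7] = True
    seg_b = np.zeros((10, 10), bool); seg_b[3:8, 3:8] = True
    print(dice(seg_a, seg_b))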
Quite often, these or similar objective functions, like the MSE or the perceptual loss, can be found in medical imaging. They encode the difference between two images with a single number. It should always be kept in mind that these numbers might not reflect the human impression, e.g., when visually comparing images, and thus the result should not be evaluated purely based on them. Instead, looking at the data and the actual results of an algorithm is of utmost importance (Fig. 2.2). If perfect metrics for assessing the results existed, they could be utilized as objective functions, and even better results could be achieved by the learning algorithm. Further, it has been pointed out that the ranking of algorithms, as seen in biomedical imaging competitions, should be interpreted with care, and reproducibility is often not possible due to missing information [18]. Thus, comparing two algorithms designed for solving an identical task is far from trivial.

2.5 Classical Machine Learning Approaches

Despite a noticeable shift to DL methods within the last years, several classical machine learning methods are frequently used. Especially within radiomics applications, they are still the predominant approach for relating imaging features to genetic or clinical results. Next to simple regression analysis, decision trees and support vector machines are often encountered for classification as well as regression tasks. But there are plenty of other approaches beyond the scope of this manuscript, e.g., Bayesian networks and genetic algorithms.

A decision tree is a very common and powerful data structure in computer science. It is built up out of layers of nodes and edges. There is a single root at the top, and at the end of the edges of the last layer are the leaf nodes that contain the results. Based on an input to the root node, the tree is traversed according to decision rules encoded in the nodes, e.g., if the SUVmax is larger than 10.0, take the left edge, otherwise the right edge to the succeeding node. Decision trees are easy to understand and train, and they are also computationally efficient. Fortunately, the rules for building up the tree can be learned from data and do not have to be set manually. A random forest consists of many individual decision trees that are combined to form an ensemble. When building up the forest, each individual tree is built by randomly sampling with replacement from the training data, resulting in different trees. Further, random subsets of features are chosen, enforcing an even greater variation among the trees in the model. Each individual decision tree in the random forest results in a class prediction, and the class with the most votes wins (note the analogy to crowd intelligence). By combining simple classifiers, i.e., individual trees, the decision boundary can become substantially more complex. Methods following this idea are subsumed as ensemble methods.
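The voting ensemble just described can be sketched in a few lines of scikit-learn; the SUVmax-style feature and the 10.0 threshold are invented for illustration, echoing the decision rule example above:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    suv_max = rng.uniform(0, 20, size=(300, 1))  # hypothetical imaging feature
    label = (suv_max[:, 0] > 10.0).astype(int)   # e.g., "suspicious" above 10.0

    # Each tree is grown on a bootstrap sample (sampling with replacement),
    # and random feature subsets enforce additional variation among trees.
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(suv_max, label)

    # The class with the most votes among the individual trees wins.
    print(forest.predict([[12.5], [4.0]]))  # expected: [1 0]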
Fig. 2.2 MSE between different PET images. In comparison to the reference image (head, upper left), the MSE is smaller for the pelvis region (lower left) than for the vertically mirrored but otherwise identical head image (lower right). The largest MSE results from comparing the reference to an upper abdominal PET slice (upper right). This illustrates the importance of choosing an appropriate loss function for the learning algorithm that, for instance, incorporates information about the context and not only raw pixel values
For quite some time, support vector machines (SVMs) were among the most popular algorithms in the field. They often lead to very good performance on reasonably sized data sets. However, as they are computationally expensive, they do not scale well with the number of training examples. SVMs are synonymously called maximum margin classifiers or kernel methods. Taking these three names together yields a very good description of the method. It is possible to transform any data (or features extracted thereof) so that the underlying classes can be separated with a linear decision boundary (Fig. 2.1a). This transformation can be performed by using any positive definite function as a kernel. During training, the optimal decision boundary is found, i.e., the line separating the classes that results in the maximum margin between the data points on either side of it. The data points that lie closest to the decision boundary are called support vectors. They are actually the most important points for our classification, as they are the ones where errors might occur and also because they are the only data points we need for defining the decision boundary. All other data points are not needed to perform the classification and can be discarded.
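A minimal kernel-SVM sketch on synthetic, non-linearly separable data; the RBF kernel is one example of a positive definite kernel function, and the data are invented:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(int)  # not linearly separable in 2-D

    # The kernel implicitly transforms the data so that a linear (maximum margin)
    # boundary can separate the classes in the transformed space.
    clf = SVC(kernel="rbf").fit(X, y)

    # Only the support vectors are needed to define the decision boundary.
    print(len(clf.support_vectors_), "of", len(X), "points retained as support vectors")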
2.6 Artificial Neural Networks

The term deep learning summarizes a group of models utilizing artificial neural networks (ANNs) at their core. Especially since 2012, they have gained more and more importance and nowadays comprise a significant share of the employed machine learning methods. One major reason for this success is grounded in the way they work. In contrast to classical ML approaches, where features are handcrafted, i.e., chosen by humans, ANNs learn to extract the features that are relevant for solving a given task. In fact, this can be confirmed visually and relates to the hierarchical structure of the networks (Fig. 2.3a). Within the lower layers, neurons are tuned during the training process to detect fundamental properties, like edges and their orientation, which are combined into more complex features in the top layers, e.g., detecting a nose, an eye, or an entire face [20]. Due to this resemblance to biological visual systems, networks can be pretrained on photos from a different domain (see above), which share the same low-level image features, and can then, for instance, be adapted to perform well on medical images. The deep layered architecture makes these models very powerful and allows them to unravel hidden high-dimensional relationships that are too complex for humans to discover [21, 22]. However, this usually comes at the cost of requiring substantial training data.

ANNs are built up out of several components. There are connections linking neurons, i.e., the output of a neuron serves as the input to a single or multiple subsequent neurons.
Fig. 2.3 (a) Simple ANN. Its hierarchical structure is composed of three layers of neurons, schematically represented by nodes arranged in vertical columns: input layer (left, 3 neurons), hidden layer (middle, 4 neurons), and output layer (right, 3 neurons). The number of hidden layers refers to the depth in DL models. (b) Drawing of a single artificial neuron overlayed on its biological role model. The activation of the neuron is calculated by a weighted sum of the incoming connections from upstream neurons. This sum is transformed using an activation function f and passed on to the neurons of the next layer. During training, the weights w are adjusted using backpropagation. Image taken from [19]
Fig. 2.4 Convolution leading to edge detection. An enlarged section of a brain MRI scan shows gray values of individual pixels (left matrix). Convolution with a filter for diagonal edges (middle matrix) results in an activation map. The filter response, i.e., the detection of an edge, is seen as the result of the convolution operation (right matrix). As an example, the value 50 is highlighted, which is the sum of the multiplication of the red marked values with the filter. During CNN training, weights are learned that compose different filters. Image taken from [19]
The individual neurons (units) (Fig. 2.3b) combine their inputs as a weighted sum. The result is transformed by an activation function introducing nonlinearities. Activation functions represent an abstraction of the dependency of a biological neuron's spiking frequency on its synaptic input currents and are also called squashing or transfer functions, because they "squash" the input by transformation into a predefined value range. Activation functions have been the focus of intensive research, as they can significantly influence the result of the computation. Hence, dozens of variants are described in the literature.
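The weighted sum followed by a squashing activation fits in a few lines of NumPy; the sigmoid is used here as one of the many possible activation functions, and all values are made up:

    import numpy as np

    def sigmoid(z):
        # Squashes any input into the (0, 1) range.
        return 1.0 / (1.0 + np.exp(-z))

    def neuron(inputs: np.ndarray, weights: np.ndarray, bias: float) -> float:
        # Weighted sum of the incoming connections, transformed by the activation f.
        return sigmoid(np.dot(weights, inputs) + bias)

    x = np.array([0.5, -1.2, 3.0])   # outputs of upstream neurons
    w = np.array([0.8, 0.1, -0.4])   # connection weights, adjusted during training
    print(neuron(x, w, bias=0.1))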
In the end, a mapping between an input and an output is learned by a neural network. It has been stated that, in theory, arbitrary functions can be approximated by feed-forward networks that have at least one hidden layer [23]. In feed-forward networks, the connections between neurons do not form cycles. In contrast, recurrent neural networks do display such cycles, leading to a form of internal memory. This is very useful for sequence learning, e.g., needed for speech processing, and thus explains why these models are especially successful in this domain.

During training, the weights of the neural network are optimized. Initially, random values are assigned. Propagating an input through the network results in a series of transformations, initially generating a random output. In supervised learning, the correct output is known, and the error between the current and desired output can be computed. The derivative of the error function (see above) can be calculated, and by using the chain rule, the weights of individual neurons can be updated proportionally to their total error contribution. This mechanism is called backpropagation, as the error is propagated back through the network. This is done with thousands of training examples until the weights of the network have converged to produce a minimal error.

Backpropagation is the key principle of most deep learning algorithms and is also used for the training of convolutional neural networks (CNNs). Especially in the field of image processing, CNNs are the top dog. In order to understand why this is the case, one must first consider what a computer "sees." The individual pixels of an image correspond to gray values that can be displayed in a matrix (Fig. 2.4). (In the case of color images, channels are added for each color component, leading to a 3-D matrix.) For example, a grayscale image of size 256 × 256 would result in 256 × 256 = 65,536 neurons connected via weights to a single neuron in the first hidden layer of a fully connected ANN. Clearly, this does not scale. This is one manifestation of the curse of dimensionality [24], as deep architectures with real color photos or radiological images would result in billions of weights (parameters), rendering the problem intractable or susceptible to overfitting.
For this purpose, in CNNs, neurons are only connected to a small region of the preceding layer. This is achieved by introducing convolutional layers. A convolution operation involves small matrices, called filters, which are shifted over the image. The values are multiplied and added together. For each of these operations, a numerical value is computed. Taken together, these values correspond to new "images" (one for each filter), referred to as activation maps. Each pixel value in these maps represents the filter response for a spatial location within the image of the previous layer. Combined, they serve as input for the next layer, and so forth. As the network learns, it creates better and better filters. The numerical values of the filters thus correspond to the weights that are optimized during training. Filters are said to be activated when they "recognize" a feature, e.g., an edge (Fig. 2.4).
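The filter operation of Fig. 2.4 can be reproduced directly; in the SciPy sketch below, the 3 × 3 diagonal-edge kernel is a hypothetical example, not the exact filter of the figure:

    import numpy as np
    from scipy.signal import convolve2d

    image = np.array([[0, 0, 0, 0],
                      [0, 9, 0, 0],
                      [0, 0, 9, 0],
                      [0, 0, 0, 9]], dtype=float)  # stand-in for MRI gray values

    kernel = np.array([[ 1, 0, -1],
                       [ 0, 1,  0],
                       [-1, 0,  1]], dtype=float)  # hypothetical diagonal-edge filter

    # Shift the filter over the image, multiply and add: one number per position,
    # and all numbers together form the activation map.
    activation_map = convolve2d(image, kernel, mode="valid")
    print(activation_map)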
Dozens of different architectures have been proposed and successfully employed for biomedical imaging tasks [25–28]. Most of them use convolutional layers, sometimes even exclusively (fully convolutional). A very popular variant, especially for medical image segmentation, is the U-Net [29, 30]. It has a U-shaped architecture, consisting of a contracting part (encoder), where spatial information is reduced while feature information is increased, followed by an expanding part (decoder). Crucially, the resolution is increased again while simultaneously incorporating high-resolution features directly from the contracting path. This allows the structural integrity of the image to be preserved, leading to superior results. Examples where this architecture has been used, and additional imaging applications, are introduced below.
Another family of very powerful methods is called generative adversarial networks (GANs) [28]. As the name implies, a GAN consists of two competing networks: one network, the generator, is trained to generate images indistinguishable from real images, whereas a second network, the discriminator, is trained to distinguish real from fake images. During training, the two adversaries compete and improve until they reach an equilibrium: the generator is able to produce realistically looking images, and the discriminator's performance is at chance level, as it is not able to tell generated and real images apart. Despite being difficult to train, GANs have proven to be very powerful, e.g., for transforming one image into another, such as an MRI scan into a CT image (see below). In contrast to hand-specified losses, e.g., the MSE, the generator is driven to learn the full distribution of the original data instead of summary statistics, such as the mean, which usually results in significantly more realistic samples.
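A heavily condensed PyTorch sketch of this adversarial game on one-dimensional toy data; the architectures, learning rates, and data distribution are placeholders and not taken from any cited work:

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # generator
    D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()

    for step in range(1000):
        real = torch.randn(64, 1) * 0.5 + 2.0  # "real" data distribution
        fake = G(torch.randn(64, 8))           # generated samples

        # Discriminator: label real samples as 1 and generated samples as 0.
        loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # Generator: try to fool the discriminator into outputting 1 for fakes.
        loss_g = bce(D(fake), torch.ones(64, 1))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

At equilibrium, the discriminator output hovers around 0.5 for both real and generated samples, which is the chance-level behavior described above.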
2.7 Radiomics and Radiogenomics

Radiomics describes the quantitative evaluation of imaging markers in radiological data and their correlation with clinical assessment, molecular data, or genetic information. Radiomics is an artificial word formed by combining radiology and omics. Originally, the term radiogenomics was introduced in radiooncology for investigating the radiation response of cells based on their genetic profile [31]. Meanwhile, the term radiogenomics is often used synonymously with radiomics when establishing a connection between the (imaging) phenotype and the genotype.

A radiomics analysis comprises several steps (the radiomics pipeline), including data acquisition and preprocessing, and image segmentation, followed by the computation and selection of imaging markers. These markers are used for the development of the radiomics model by relating them to the desired target parameters, up until now most commonly by employing classical machine learning approaches (see above). The identified imaging markers are referred to as the radiomics signature.

There are several categories of radiomics features, including shape features and first-order, second-order, and higher-order statistics, resulting in hundreds of features that can be computed for a region of interest (ROI). An example of first-order statistics would be the mean of the Hounsfield units for a delineated area within a CT scan, or the SUVmax within a PET scan. Second-order statistics, for instance, relate to texture features capturing the heterogeneity of a ROI. It should be stressed that these are usually predefined handcrafted features, originating from classical computer vision approaches, and the advantage of DL methods for automatically learning relevant features is not part of the canonical radiomics cascade. Nevertheless, recent publications do take modern DL approaches into account (e.g., [32, 33]). Also, for the automatic segmentation of the ROIs, modern AI algorithms are increasingly being utilized.
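First-order statistics of this kind reduce to simple aggregations over the ROI voxels. A NumPy/SciPy sketch with an invented CT volume and mask (the feature names are informal, not standardized definitions):

    import numpy as np
    from scipy import stats

    ct = np.random.default_rng(3).normal(40, 15, size=(32, 32, 16))  # Hounsfield units
    roi = np.zeros_like(ct, dtype=bool)
    roi[10:20, 10:20, 5:10] = True                                   # delineated region

    voxels = ct[roi]
    first_order = {
        "mean_hu": voxels.mean(),        # e.g., mean Hounsfield units in the ROI
        "max": voxels.max(),             # analogous to SUVmax on a PET scan
        "skewness": stats.skew(voxels),  # asymmetry of the intensity histogram
        "entropy": stats.entropy(np.histogram(voxels, bins=32)[0] + 1e-9),
    }
    print(first_order)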
All steps of the pipeline are susceptible to errors and must be carried out carefully. Data acquisition and preprocessing can devastatingly influence the results, as image characteristics directly impact the feature computation. In the past, different equations have been employed for feature computations, e.g., with or without normalization. To tackle this source of variation, standardization attempts by the image biomarker standardization initiative (IBSI) and the Quantitative Imaging Biomarkers Alliance (QIBA) have been put forward [34–36].

Radiomics studies often display an imbalance between the patients included (N) and the features examined (P), impeding the establishment of statistically sound claims. Increased false-positive rates have been reported in these "large-P-small-N" scenarios [37]. In a recent review, the majority of studies included less than 100 patients [38]. As the feature space grows exponentially with the number of features included, powerful ML models tend to adapt very strongly to the few existing points in this high-dimensional space and thus do not generalize well (see overfitting above). Further observed failure points for reproducibility include inadequate corrections for multiple testing as well as an improper separation of the data into training, validation, and test sets (see above).

In oncological studies, attention should also be paid to the genetic heterogeneity of tumor tissue. A biopsy contains only a sample of the tumor, and already a neighboring site within the same lesion, not to mention a metastasis at a different location, might have a deviating genomic profile. This needs to be considered when mapping the radiomics features for a given ROI to the genomic target parameters.

In the last years, radiomics-related publications have increased constantly and are considered a valuable component for achieving the goal of precision medicine. It has been conjectured that the quantitative analysis of imaging features generates more and better information than the assessment of images by a physician alone [39]. Of course, the underlying principles and processes of the radiomics pipeline are very general. They can be directly applied to any kind of medical images, e.g., stemming from pathology or nuclear medicine. For example, it has been demonstrated that radiomics features extracted from multiparametric PET/MRI images can be used to classify gliomas as well as to predict their mutational status [40].

2.8 Imaging Applications

In the previous part, the groundwork for understanding AI methods has been established. This section will look at ML imaging applications that have been proposed for hybrid imaging tasks. Broadly, two major categories can be distinguished: (1) image acquisition and processing and (2) clinical applications. Within this chapter, we focus on the first set of applications; the clinical applications will be presented in detail in the second part of the book.

In addition to methods for the segmentation of PET images and the improved quantification of SUV-related parameters, e.g., [41], ML methods for faster image acquisition, dose reduction, and the synthesis of PET contrasts from other modalities have been proposed. It has been demonstrated that CNNs can learn image reconstruction based on PET sinogram raw data [42]. This inverse problem has also been addressed by an approach coined DeepPET, leading to a drastic reconstruction speedup in comparison to iterative techniques [43]. Next to this potential speedup for PET image reconstruction, there is also ongoing work on improving the acquisition times of MR images starting with undersampled k-space data [42, 44].
Interestingly, Facebook AI Research is one of the major partners of the fastMRI challenge that addresses this problem [45].

To obtain a quantitative signal, attenuation correction (AC) is carried out during PET image reconstruction. In PET/CT, the CT data can be directly used for this purpose. However, as most MR images do not correlate with tissue density, the AC of PET/MRI examinations is a major challenge. It has been shown that, next to classical approaches, e.g., atlas-based methods, this challenge can also be solved with ML approaches. For instance, DL neural networks allow MRI scans to be transformed into synthetic CT images [46–49]. This is often done with GANs and is called image synthesis. Sometimes, the resulting synthetic CT images consist of only a few classes, e.g., bone, air, and soft tissue, which are nonetheless sufficient for a good attenuation correction. It has even been claimed that some of these methods achieve better results than the currently available commercial solutions [46, 48].
The use of MR or CT images for AC might lead to disadvantages in case of an incorrect co-registration of the hybrid image data, for example, if the patient moves during the acquisition. For this reason, DL approaches have also been proposed for generating synthetic CT images from non-attenuation-corrected (NAC) PET data and using them in a subsequent step for AC of the original data [50, 51]. Of course, this suggests that this can also be realized in a single step by transforming an NAC PET directly into its AC counterpart, as shown by Shiri et al. [52]. Applications that use only NAC PET data as input are limited to tracers that are taken up in the entire body (e.g., FDG), since otherwise not enough morphological information would be available as input for the ANN.

Due to detector characteristics and image reconstruction effects, PET images often display a suboptimal signal-to-noise ratio (SNR) [53]. Most classical approaches denoise the image data at the expense of the local or temporal resolution, resulting in a decreased image contrast [54]. Again, ML methods have been proposed for denoising PET images. One way to achieve this goal is to add artificial noise to existing high-resolution data. In turn, these noisy images and the denoised original scans can be utilized as training input and output pairs for the learning algorithm. Thereby, the method learns to remove the artificially added noise component. Within this scope, it has been demonstrated that the integration of anatomical information, i.e., CT or MRI scans, has a favorable effect on the final result [53, 55].
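The construction of such training pairs is compact. A hedged NumPy sketch, assuming Poisson-like count noise as the artificial noise model (the scaling factor is arbitrary):

    import numpy as np

    rng = np.random.default_rng(4)
    clean = rng.gamma(2.0, 2.0, size=(64, 64))  # stand-in for a high-SNR PET slice

    # Add artificial (count-statistics-like) noise: the noisy image becomes the
    # network input, and the original clean image is the training target.
    noisy = rng.poisson(clean * 10.0) / 10.0

    training_pair = (noisy.astype(np.float32), clean.astype(np.float32))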
Low tracer doses also lead to a decreased SNR. Similar to denoising an image, as described above, the image quality can be increased by learning to transform low-dose images into standard-dose images using deep neural networks. Deliberately decreasing the radiation exposure is a desired effect, as it potentially enables additional PET applications for which a beneficial risk–benefit ratio currently might not be given. Additionally, it reduces the costs and duration of an examination. It has been demonstrated that a standard-dose PET can be predicted from the combination of low-dose PET and T1-weighted MR images [56, 57]. Chen and colleagues also demonstrated that 100-fold dose-reduced amyloid (18F-florbetaben) PET images together with T1-, T2-, and FLAIR-weighted MRI images can be used to predict standard-dose images [58]. The quality of the synthesized images was assessed by specialists to be only slightly worse in comparison to the ground truth data. In addition, a quantitative comparison yielded that the amyloid status reached an accuracy of almost 90% and that this was similar to the intra-rater reproducibility determined with full-dose images. They also demonstrated that the usage of hybrid images, i.e., the integration of the structural MRI data, leads to an added value for the prediction. Nevertheless, recent work suggests that, by utilizing a GAN, a similar performance can be achieved even when omitting MR images and relying solely on low-dose PET scans as input for the neural network [59]. It should be noted that for all methods presented above, low-dose scans were artificially generated from full-dose scans, and clinical verification with actual low-dose image data is still pending. In addition, prospective studies are needed to evaluate whether, in AI-generated full-dose PET images, information is lost or artifacts are introduced that could unfavorably influence the reading of the image.
The logical next step after reducing the dose is to artificially generate PET images from other imaging modalities without the application of any tracer substance. Although this might not be deemed possible, as the molecular and functional information of PET scans should not be captured in anatomical images, initial publications explore this as well as other tasks. For multiparametric MR images, it has been demonstrated that, using a DL architecture, the gadolinium contrast enhancement of brain tumors can be predicted solely based on the pre-contrast scans [21]. It has also been shown that it is possible to reliably predict some fluorescent labels from unlabeled transmitted-light microscopy images [22]. During the reading of morphological scans, specialists often have a strong suspicion about how certain lesions will behave in PET scans. Thus, it could be possible that ML methods are able to identify phenotypic "traits" in images that are indicative of tracer uptake. This notion is also supported by a recent study that investigated CNNs for predicting the 68Ga-PSMA-PET lymph node status from the lymph node appearance and its surroundings in CT images alone [60]. The results were susceptible to the composition of the training set but nevertheless yielded a classification accuracy higher than that of radiologists. When examining the regions in the images that were pivotal to the decision of the best performing neural network, the authors found that the anatomical location in combination with the appearance of the lymph node were the key factors.
2.9 Conclusions and Perspectives

This chapter gave an overview of machine learning and its application to hybrid imaging, emphasizing data acquisition and image processing. Combining low-dose AI-enhanced imaging with faster acquisition times of PET as well as MRI scans will lead to shorter and safer examinations for the benefit of patients. Large-scale prospective multicenter studies are needed for the critical evaluation of these novel techniques. It still needs to be established whether, given the clinical context in addition to an AI-enhanced image, the same conclusions for diagnosis and therapy can be drawn. Furthermore, boundary conditions and application areas for ML algorithms need to be specified more thoroughly for real clinical applications. Entire ML subfields have evolved that investigate the explainability and out-of-distribution conditions of algorithms. The first subject aims at developing methods that make the decision plausible for the physician, whereas the second examines what data statistics are mandatory for an algorithm to deliver reliable results on unseen data. This also stresses the need for the incorporation of training data representative of the target application when training an algorithm. Luckily, an increasing number of publications provide code and sometimes also data, allowing for reproducibility of the results. This will accelerate the progress within this exciting field and also open up possibilities for novel applications. Among others, dual-tracer applications will probably benefit from the recent developments in machine learning and open up yet another area of active research in the near future, e.g., disentangling the signal from the individual tracers.

Acknowledgments The author would like to thank Kai Ueltzhöffer, Jacob Murray, and Christian Strack for their advice and comments on this manuscript.

References

1. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ, editors. Advances in neural information processing systems. Red Hook: Curran Associates, Inc.; 2012. p. 1097–105.
2. Dechter R. Learning while searching in constraint-satisfaction-problems. In: Proceedings of the fifth AAAI national conference on artificial intelligence. Palo Alto: AAAI Press; 1986. p. 178–83.
3. Dodge S, Karam L. A study and comparison of human and deep learning recognition performance under visual distortions. ArXiv170502498 Cs. 2017. Available from http://arxiv.org/abs/1705.02498
4. Sutton RS, Barto AG. Reinforcement learning: an introduction. Cambridge: MIT Press; 1998.
5. Taylor GW. A reinforcement learning framework for parameter control in computer vision applications. In: First Canadian conference on computer and robot vision, 2004 proceedings. 2004. p. 496–503.
6. Peng J, Bhanu B. Closed-loop object recognition using reinforcement learning. IEEE Trans Pattern Anal Mach Intell. 1998;20(2):139–54.
7. Sahba F, Tizhoosh HR, Salama MM. Application of opposition-based reinforcement learning in image segmentation. In: 2007 IEEE symposium on computational intelligence in image and signal processing. 2007. p. 246–51.
8. Sahba F, Tizhoosh HR, Salama MM. Application of reinforcement learning for segmentation of transrectal ultrasound images. BMC Med Imaging. 2008;8(1):8.
9. Shokri M, Tizhoosh HR. A reinforcement agent for threshold fusion. Appl Soft Comput. 2008;8(1):174–81.
10. Ghajari S, Naghibi Sistani MB. Improving the quality of image segmentation in ultrasound images using reinforcement learning. Commun Adv Comput Sci Appl. 2017;2017(1):33–40.
11. Jodogne S, Piater JH. Interactive selection of visual features through reinforcement learning. In: Bramer M, Coenen F, Allen T, editors. Research and development in intelligent systems XXI. London: Springer; 2005. p. 285–98.
12. Piñol M, Sappa AD, Toledo R. Multi-table reinforcement learning for visual object recognition. In: Kumar SS, editor. Proceedings of the fourth international conference on signal and image processing 2012 (ICSIP 2012). New Delhi: Springer; 2013. p. 469–79.
13. Liu D-R, Li H-L, Wang D. Feature selection and feature learning for high-dimensional batch reinforcement learning: a survey. Int J Autom Comput. 2015;12(3):229–42.
14. Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, et al. Mastering the game of Go without human knowledge. Nature. 2017;550(7676):354–9.
15. Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution. ArXiv160308155 Cs. 2016. Available from http://arxiv.org/abs/1603.08155
16. Belkin M, Hsu D, Ma S, Mandal S. Reconciling modern machine-learning practice and the classical bias–variance trade-off. Proc Natl Acad Sci. 2019;116(32):15849–54.
17. Taha AA, Hanbury A. Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Med Imaging. 2015;15(1):29.
18. Maier-Hein L, Eisenmann M, Reinke A, Onogur S, Stankovic M, Scholz P, et al. Why rankings of biomedical image analysis competitions should be interpreted with care. Nat Commun. 2018;9(1):1–13.
19. Kleesiek J, Murray JM, Strack C, Kaissis G, Braren R. Wie funktioniert maschinelles Lernen? Radiologe. 2020;60(1):24–31.
20. Jones N. Computer science: the learning machines. Nat News. 2014;505(7482):146.
21. Kleesiek J, Morshuis JN, Isensee F, Deike-Hofmann K, Paech D, Kickingereder P, et al. Can virtual contrast enhancement in brain MRI replace gadolinium? A feasibility study. Investig Radiol. 2019;54(10):653–60.
22. Christiansen EM, Yang SJ, Ando DM, Javaherian A, Skibinski G, Lipnick S, et al. In silico labeling: predicting fluorescent labels in unlabeled images. Cell. 2018;173(3):792–803.
23. Leshno M, Lin VY, Pinkus A, Schocken S. Multilayer feedforward networks with a nonpolynomial activation function can approximate any function. Neural Netw. 1993;6(6):861–7.
24. Bellman RE. Dynamic programming. Mineola: Dover Publications, Inc.; 2003.
25. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. ArXiv14091556 Cs. 2015. Available from http://arxiv.org/abs/1409.1556
26. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. ArXiv151203385 Cs. 2015. Available from http://arxiv.org/abs/1512.03385
27. Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely connected convolutional networks. ArXiv160806993 Cs. 2018. Available from http://arxiv.org/abs/1608.06993
28. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial networks. ArXiv14062661 Cs Stat. 2014. Available from http://arxiv.org/abs/1406.2661
29. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. ArXiv150504597 Cs. 2015. Available from http://arxiv.org/abs/1505.04597
30. Isensee F, Kickingereder P, Wick W, Bendszus M, Maier-Hein KH. No New-Net. ArXiv180910483 Cs. 2018. Available from http://arxiv.org/abs/1809.10483
31. Andreassen CN, Schack LMH, Laursen LV, Alsner J. Radiogenomics – current status, challenges and future directions. Cancer Lett. 2016;382(1):127–36.
32. Xu Y, Hosny A, Zeleznik R, Parmar C, Coroller T, Franco I, et al. Deep learning predicts lung cancer treatment response from serial medical imaging. Clin Cancer Res. 2019;25(11):3266–75.
33. Lou B, Doken S, Zhuang T, Wingerter D, Gidwani M, Mistry N, et al. An image-based deep learning framework for individualising radiotherapy dose: a retrospective analysis of outcome prediction. Lancet Digit Health. 2019;1(3):136–47.
34. Kinahan PE, Perlman ES, Sunderland JJ, Subramaniam R, Wollenweber SD, Turkington TG, et al. The QIBA profile for FDG PET/CT as an imaging biomarker measuring response to cancer therapy. Radiology. 2020;294(3):647–57.
35. Sullivan DC, Obuchowski NA, Kessler LG, Raunig DL, Gatsonis C, Huang EP, et al. Metrology standards for quantitative imaging biomarkers. Radiology. 2015;277(3):813–25.
36. Zwanenburg A, Vallières M, Abdalah MA, Aerts HJ, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295(2):328–38.
37. Park JE, Park SY, Kim HJ, Kim HS. Reproducibility and generalizability in radiomics modeling: possible strategies in radiologic and statistical perspectives. Korean J Radiol. 2019;20(7):1124–37.
38. Bodalal Z, Trebeschi S, Nguyen-Kim TDL, Schats W, Beets-Tan R. Radiogenomics: bridging imaging and genomics. Abdom Radiol. 2019;44(6):1960–84.
39. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012;48(4):441–6.
40. Haubold J, Demircioglu A, Gratz M, Glas M, Wrede K, Sure U, et al. Non-invasive tumor decoding and phenotyping of cerebral gliomas utilizing multiparametric 18F-FET PET-MRI and MR fingerprinting. Eur J Nucl Med Mol Imaging. 2020;47(6):1435–45.
41. Seifert R, Herrmann K, Kleesiek J, Schafers MA, Shah V, Xu Z, et al. Semi-automatically quantified tumor volume using Ga-68-PSMA-11-PET as biomarker for survival in patients with advanced prostate cancer. J Nucl Med. 2020;61(12):1786–92.
42. Zhu B, Liu JZ, Cauley SF, Rosen BR, Rosen MS. Image reconstruction by domain-transform manifold learning. Nature. 2018;555(7697):487–92.
43. Häggström I, Schmidtlein CR, Campanella G, Fuchs TJ. DeepPET: a deep encoder-decoder network for directly solving the PET image reconstruction inverse problem. Med Image Anal. 2019;54:253–62.
44. Eo T, Jun Y, Kim T, Jang J, Lee H-J, Hwang D. KIKI-net: cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magn Reson Med. 2018;80(5):2188–201.
45. Knoll F, Zbontar J, Sriram A, Muckley MJ, Bruno M, Defazio A, et al. fastMRI: a publicly available raw k-space and DICOM dataset of knee images for accelerated MR image reconstruction using machine learning. Radiol Artif Intell. 2020;2(1):e190007.
46. Bradshaw TJ, Zhao G, Jang H, Liu F, McMillan AB. Feasibility of deep learning–based PET/MR attenuation correction in the pelvis using only diagnostic MR images. Tomography. 2018;4(3):138–47.
47. Kläser K, Varsavsky T, Markiewicz P, Vercauteren T, Atkinson D, Thielemans K, et al. Improved MR to CT synthesis for PET/MR attenuation correction using imitation learning. In: Burgos N, Gooya A, Svoboda D, editors. Simulation and synthesis in medical imaging. Cham: Springer International Publishing; 2019. p. 13–21.
48. Ladefoged CN, Marner L, Hindsholm A, Law I, Højgaard L, Andersen FL. Deep learning based attenuation correction of PET/MRI in pediatric brain tumor patients: evaluation in a clinical setting. Front Neurosci. 2019;12:1005.
49. Liu F, Jang H, Kijowski R, Bradshaw T, McMillan AB. Deep learning MR imaging–based attenuation correction for PET/MR imaging. Radiology. 2017;286(2):676–84.
50. Dong X, Wang T, Lei Y, Higgins K, Liu T, Curran WJ, et al. Synthetic CT generation from non-attenuation corrected PET images for whole-body PET imaging. Phys Med Biol. 2019;64(21):215016.
51. Liu F, Jang H, Kijowski R, Zhao G, Bradshaw T, McMillan AB. A deep learning approach for 18F-FDG PET attenuation correction. EJNMMI Phys. 2018;5(1):24.
52. Shiri I, Ghafarian P, Geramifar P, Leung KH-Y, Ghelichoghli M, Oveisi M, et al. Direct attenuation correction of brain PET images using only emission data via a deep convolutional encoder-decoder (Deep-DAC). Eur Radiol. 2019;29(12):6867–79.
53. Liu C-C, Qi J. Higher SNR PET image prediction using a deep learning model and MRI image. Phys Med Biol. 2019;64(11):115004.
54. Klyuzhin IS, Cheng J-C, Bevington C, Sossi V. Use of a tracer-specific deep artificial neural net to denoise dynamic PET images. IEEE Trans Med Imaging. 2019;1:1.
55. Cui J, Gong K, Guo N, Wu C, Meng X, Kim K, et al. PET image denoising using unsupervised deep learning. Eur J Nucl Med Mol Imaging. 2019;46(13):2780–9.
56. Wang Y, Zhou L, Wang L, Yu B, Zu C, Lalush DS, et al. Locality adaptive multi-modality GANs for high-quality PET image synthesis. In: Frangi AF, Schnabel JA, Davatzikos C, Alberola-López C, Fichtinger G, editors. Medical image computing and computer assisted intervention – MICCAI 2018. Cham: Springer International Publishing; 2018. p. 329–37.
57. Xiang L, Qiao Y, Nie D, An L, Lin W, Wang Q, et al. Deep auto-context convolutional neural networks for standard-dose PET image estimation from low-dose PET/MRI. Neurocomputing. 2017;267:406–16.
58. Chen KT, Gong E, de Carvalho Macruz FB, Xu J, Boumis A, Khalighi M, et al. Ultra–low-dose 18F-florbetaben amyloid PET imaging using deep learning with multi-contrast MRI inputs. Radiology. 2018;290(3):649–56.
59. Ouyang J, Chen KT, Gong E, Pauly J, Zaharchuk G. Ultra-low-dose PET reconstruction using generative adversarial network with feature matching and task-specific perceptual loss. Med Phys. 2019;46(8):3555–64.
60. Hartenstein A, Lübbe F, Baur ADJ, Rudolph MM, Furth C, Brenner W, et al. Prostate cancer nodal staging: using deep learning to predict 68Ga-PSMA-positivity from CT imaging alone. Sci Rep. 2020;10(1):1–11.
3 Radiomics in Nuclear Medicine, Robustness, Reproducibility, and Standardization

Reza Rezai

Contents
3.1 Introduction 29
3.2 Robustness of Radiomic Features 30
3.3 Image Acquisition 30
3.4 Image Reconstruction 30
3.5 Segmentation 33
3.6 Image Processing 33
3.7 Discretization 34
3.8 Software 34
3.9 Pitfalls 34
3.10 Standardization 34
3.11 Discussion 34
3.12 Conclusion 35
References 35
two main characteristics of radiomic features that are necessary for clinical trial applications [1]. The objective of this chapter is to evaluate the influence of changes in nuclear medicine imaging parameters on the variability of the radiomic features extracted from these images, and to compare the results of recent articles, highlighting the most robust and the most sensitive features reported by these studies (Table 3.1).
3.2 Robustness of Radiomic Features

Like other new methodologies, difficulties and limitations accompany the usefulness; in particular, sensitivity to imaging parameters is among the most important limitations for the application of radiomic features [14]. Although repeatability and reproducibility both refer to the robustness of features, there are some differences between them. When the same features are extracted from the same subject under the same situations and imaging parameters, the features are repeatable; if the same features are extracted under different parameters and situations, they are reproducible [15]. These two properties of radiomic features are obstacles that currently limit their application and generalization in medicine on a broad scale [1]. Research dedicated to assessing these characteristics can open new paths for the employment of radiomics in precision medicine and in the clinic as well [7, 16]. The main factors that have a significant impact on radiomic features include image acquisition parameters, image reconstruction methodologies and settings, contouring and delineation processes, and image processing and feature extraction parameters [14, 17].
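Feature robustness of this kind is often summarized with a spread statistic across repeated settings. The sketch below uses the coefficient of variation with invented feature values; the intraclass correlation coefficient (ICC) reported by several studies cited in this chapter is a common alternative:

    import numpy as np

    # Hypothetical values of two radiomic features, extracted from the same lesion
    # under four different imaging/reconstruction settings.
    glrlm_sre = np.array([0.91, 0.93, 0.92, 0.90])
    gldm_dv = np.array([0.40, 0.75, 0.22, 0.61])

    def cov_percent(values: np.ndarray) -> float:
        # Coefficient of variation: relative spread across settings (lower = more robust).
        return 100.0 * values.std(ddof=1) / values.mean()

    print(cov_percent(glrlm_sre))  # small -> robust feature
    print(cov_percent(gldm_dv))    # large -> sensitive feature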
tive features like SDLGLE, LDLGLE, DV
(GLDM) have reported [13]. The stability of
3.3 Image Acquisition features included in GLRLM and GLSZM fami-
lies is proven by multiple studies that have eval-
Image acquisition parameters such as tracer uated the variability of PET/CT radiomic
uptake time or level, scan mode, number of features [3, 5, 6, 8, 13]. Among these features,
views, view matrix size, attenuation correction, RP and SRE from GLRLM represented maxi-
type of scanner, etc. can be a source of variation mum stability.
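Because discretization recurs as a variation source throughout this chapter (Sect. 3.7 and Table 3.1), a short sketch contrasts the two common schemes, fixed bin number (FBN) and fixed bin size (FBS); the formulas follow the usual IBSI-style definitions, and the parameter values are arbitrary:

    import numpy as np

    rng = np.random.default_rng(5)
    suv = rng.uniform(0.5, 12.0, size=1000)  # hypothetical SUVs inside a ROI

    def discretize_fbn(x, n_bins=64):
        # Fixed bin number: the bin width adapts to the intensity range of each ROI.
        edges = np.linspace(x.min(), x.max(), n_bins + 1)
        return np.clip(np.digitize(x, edges), 1, n_bins)

    def discretize_fbs(x, bin_width=0.25):
        # Fixed bin size: the same absolute bin width for every image/ROI.
        return np.floor((x - x.min()) / bin_width).astype(int) + 1

    # The same voxels yield different gray-level maps under the two schemes,
    # which propagates into the texture features computed from them.
    print(len(np.unique(discretize_fbn(suv))), len(np.unique(discretize_fbs(suv))))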
Table 3.1 Overview of the variation sources evaluated in the literature, with a comparison of the robustness and sensitivity of the extracted features

1. Gallivanone et al. [3]. Sources of variation: segmentation method; lesion uniformity; reconstruction parameters. Most robust features: to reconstruction: GLRLM, GLSZM, morphological features; to segmentation: intensity histogram. Most sensitive features: to reconstruction: contrast (GLCM), dissimilarity (GLCM), HGRE (GLRLM), SRHGE (GLRLM), HGZE (GLSZM); to segmentation: morphological, intensity histogram. Most affecting source: segmentation method.

2. Papp et al. [4]. Sources of variation: extraction parameters (voxel size, bin size, lesion volume change). Most robust: information correlation (GLCM); compactness, volume, and spheric dice coefficient (shape features). Most sensitive: contrast and difference variance (GLCM); contrast (NGTDM). Most affecting source: lesion volume change.

3. Pfaehler et al. [5]. Sources of variation: discretization method; noise; reconstruction method; underlying data. Most robust: GLSZM, GLRLM, GLCM. Most affecting sources: discretization, sphere size, activity uptake.

4. Pfaehler et al. [6]. Source of variation: reconstruction method. Most robust: LocINT. Most sensitive: NGLDM. Most affecting source: FBN discretization.

5. Yang et al. [7]. Source of variation: contouring. Most robust: GLNDM (for 32, 64, and 128 discretization); GLCOM (for 256 discretization). Most sensitive: most of the GLSZM features. Most affecting source: low discretization schemes.

6. Shiri et al. [8]. Source of variation: reconstruction settings. Most robust: entropy, homogeneity, dissimilarity, correlation (GLCM); SRE, LRE, RLV, RP (GLRLM); SZE, IV, ZP (GLSZM). Most sensitive: NGTDM, TS. Most affecting source: matrix size.

7. Belli et al. [9]. Source of variation: delineation (manual, semi-automatic, fully automatic). Most robust: first-order SUV families; co-occurrence matrix (higher-order). Most sensitive: VA and ISZ families. Most affecting source: automated delineation (>50%).

8. Lv et al. [10]. Source of variation: extraction parameters. Most robust: GLRLM. Most sensitive: GLSZM.

9. Guezennec et al. [11]. Sources of variation: three contouring methods; two observers. Most robust: for observers: homogeneity, correlation, entropy; for contouring methods: busyness. Most sensitive: for observers: busyness; for contouring methods: homogeneity, correlation, entropy, and LZLGE. Most affecting source: contouring methods.

10. Branchini et al. [2]. Sources of variation: count statistics reduction; discretization method. Most robust: GLCM. Most sensitive: GLRLM. Most affecting source: discretization method.

11. Vuong et al. [12]. Source of variation: segmentation methods. Most robust: shape and intensity features. Most sensitive: wavelet features. Most affecting source: threshold-based segmentation.

12. Edalat-Javid et al. [13]. Source of variation: image acquisition and reconstruction settings. Most robust: DE (GLDM); RLNUN, SRE, and RP (GLRLM); ZE (GLSZM); IDMN, IDN, and IMC2 (GLCM). Most sensitive: SDLGLE, LDLGLE, and DV (GLDM); SALGLE, LALGLE, and LGLZE (GLSZM). Most affecting sources: matrix size; number of views; FWHM of the Gaussian filter.

SUV standard uptake value, VA voxel-alignment, ISZ intensity-size zone
accelerate research in this area. Along with offering new methods and procedures in this field, the reliability and dependability of these methods are also important, as practical application is the goal. For this review, a search using the main keywords, including repeatability, reproducibility, robustness, stability, radiomic features, and nuclear medicine, was performed to collect all recent research directly relevant to the variability of radiomic features under changes of the imaging parameters. Priority was given to articles from the last three years, which was one of the reasons the number of included articles is limited.

In addition to imaging parameter changes, inter-patient and inter-scanner repeatability need to be evaluated too, so a general procedure for the execution of this kind of research is seriously needed. One of the existing problems in the field of radiomics research is the extended range of research methodologies. For example, although the use of retrospective data is prevalent in many studies, implementing multiple kinds of phantoms, including digital phantoms, anthropomorphic phantoms, and software-simulated phantoms, is another source of data. Moreover, the various approaches to feature extraction and feature robustness analysis can aggravate the condition. This dispersion of operational paths can lead to the accumulation of a large volume of new and raw data, which, without any established reliability characteristics, is not applicable and may confuse newcomers to this branch of science. It therefore seems necessary to present a coherent and integrated methodology to achieve meaningful results.

3.12 Conclusion

Radiomics is a developing field of research that connects imaging technology and statistical analysis to obtain information that may be hidden or unclear to the human eye. The repeatability and reproducibility of radiomic features are the most recent issues that have received more attention, owing to the tendency to employ radiomics in the clinic [1]. The variability of features caused by imaging parameter changes is mentioned in several articles, and it is exactly why clinicians cannot yet benefit from the abilities of radiomics. To identify which parameters can alter radiomic features, we have to investigate the influence of each parameter to attain an optimized set of radiomics features. In this chapter, we reviewed relevant papers that assessed the robustness and stability of radiomic features under different changes of the imaging parameters, such as image acquisition, segmentation, and image reconstruction; by comparing the responses of the extracted features to parameter changes, the most robust and most sensitive features were highlighted. Acquiring robust and reproducible features requires the homogeneous use of the influencing parameters (scan protocols), optimized for radiomics usage. More research and endeavors are necessary to take this field to its appropriate place in science.

References

1. Park JE, Park SY, Kim HJ, Kim HS. Reproducibility and generalizability in radiomics modeling: possible strategies in radiologic and statistical perspectives. Korean J Radiol. 2019;20(7):1124–37.
2. Branchini M, Zorz A, Zucchetta P, Bettinelli A, De Monte F, Cecchin D, et al. Impact of acquisition count statistics reduction and SUV discretization on PET radiomic features in pediatric 18F-FDG-PET/MRI examinations. Phys Med. 2019;59:117–26.
3. Gallivanone F, Interlenghi M, D'Ambrosio D, Trifirò G, Castiglioni I. Parameters influencing PET imaging features: a phantom study with irregular and heterogeneous synthetic lesions. Contrast Media Mol Imaging. 2018;2018:5324517.
4. Papp L, Rausch I, Grahovac M, Hacker M, Beyer T. Optimized feature extraction for radiomics analysis of 18F-FDG PET imaging. J Nucl Med. 2019;60(6):864–72.
5. Pfaehler E, Beukinga RJ, de Jong JR, Slart RHJA, Slump CH, Dierckx RAJO, et al. Repeatability of 18F-FDG PET radiomic features: a phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method. Med Phys. 2019;46(2):665–78.
6. Pfaehler E, van Sluis J, Merema BBJ, van Ooijen P, Berendsen RCM, van Velden FHP, et al. Experimental multicenter and multivendor evaluation of the performance of PET radiomic features using 3-dimensionally printed phantom inserts. J Nucl Med. 2020;61(3):469–76.
7. Yang F, Simpson G, Young L, Ford J, Dogan N, Wang L. Impact of contouring variability on oncological PET radiomics features in the lung. Sci Rep. 2020;10(1):369.
8. Shiri I, Rahmim A, Ghaffarian P, Geramifar P, Abdollahi H, Bitarafan-Rajabi A. The impact of image reconstruction settings on 18F-FDG PET radiomic features: multi-scanner phantom and patient studies. Eur Radiol. 2017;27(11):4498–509.
9. Belli ML, Mori M, Broggi S, Cattaneo GM, Bettinardi V, Dell'Oca I, et al. Quantifying the robustness of [18F]FDG-PET/CT radiomic features with respect to tumor delineation in head and neck and pancreatic cancer patients. Phys Med. 2018;49:105–11.
10. Lv W, Yuan Q, Wang Q, Ma J, Jiang J, Yang W, et al. Robustness versus disease differentiation when varying parameter settings in radiomics features: application to nasopharyngeal PET/CT. Eur Radiol. 2018;28(8):3245–54.
11. Guezennec C, Bourhis D, Orlhac F, Robin P, Corre J-B, Delcroix O, et al. Inter-observer and segmentation method variability of textural analysis in pre-therapeutic FDG PET/CT in head and neck cancer. PLoS ONE. 2019;14(3):e0214299.
12. Vuong D, Tanadini-Lang S, Huellner MW, Veit-Haibach P, Unkelbach J, Andratschke N, et al. Interchangeability of radiomic features between [18F]-FDG PET/CT and [18F]-FDG PET/MR. Med Phys. 2019;46(4):1677–85.
13. Edalat-Javid M, Shiri I, Hajianfar G, Abdollahi H, Arabi H, Oveisi N, et al. Cardiac SPECT radiomic features repeatability and reproducibility: a multi-scanner phantom study. J Nucl Cardiol. 2020;28(6):2730–44.
14. Mayerhoefer ME, Materka A, Langs G, Haggstrom I, Szczypinski P, Gibbs P, et al. Introduction to radiomics. J Nucl Med. 2020;61(4):488–95.
15. Traverso A, Wee L, Dekker A, Gillies R. Repeatability and reproducibility of radiomic features: a systematic review. Int J Radiat Oncol Biol Phys. 2018;102(4):1143–58.
16. Kuhl CK, Truhn D. The long route to standardized radiomics: unraveling the knot from the end. Radiology. 2020;295(2):339–41.
17. Zwanenburg A. Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis. Eur J Nucl Med Mol Imaging. 2019;46(13):2638–55.
18. Ibrahim A, Vallières M, Woodruff H, Primakov S, Beheshti M, Keek S, et al. Radiomics analysis for clinical decision support in nuclear medicine. Semin Nucl Med. 2019;49(5):438–49.
19. Reuzé S, Schernberg A, Orlhac F, Sun R, Chargari C, Dercle L, et al. Radiomics in nuclear medicine applied to radiation therapy: methods, pitfalls, and challenges. Int J Radiat Oncol Biol Phys. 2018;102(4):1117–42.
20. Foy JJ, Robinson KR, Li H, Giger ML, Al-Hallaq H, Armato SG. Variation in algorithm implementation across radiomics software. J Med Imaging. 2018;5(4):044505.
21. Zwanenburg A, Vallières M, Abdalah MA, Aerts HJ, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295(2):328–38.
22. Welch ML, McIntosh C, Haibe-Kains B, Milosevic MF, Wee L, Dekker A, et al. Vulnerabilities of radiomic signature development: the need for safeguards. Radiother Oncol. 2019;130:2–9.
23. McNitt-Gray M, Napel S, Jaggi A, Mattonen S, Hadjiiski L, Muzi M, et al. Standardization in quantitative imaging: a multicenter comparison of radiomic features from different software packages on digital reference objects and patient data sets. Tomography. 2020;6(2):118.
24. Hagiwara A, Fujita S, Ohno Y, Aoki S. Variability and standardization of quantitative imaging: monoparametric to multiparametric quantification, radiomics, and artificial intelligence. Investig Radiol. 2020;55(9):601.
4 Evolution of AI in Medical Imaging
Josh Schaefferkoetter
Contents
4.1 Disease Characterization 40
4.2 Segmentation 43
4.3 Image Generation/Reconstruction 44
4.4 Data Corrections 46
4.5 Image Registration 48
4.6 Radiology Reporting 49
4.7 Conclusion 50
References 51
J. Schaefferkoetter (*)
Siemens Medical Solutions USA, Inc., Knoxville, TN, USA
e-mail: joshua.schaefferkoetter@siemens-healthineers.com

In the field of medical imaging, the application of computer vision to solve radiologic problems has been proposed since the mid-twentieth century [1]. As computers became more prevalent and imaging became digitized, the infrastructure was in place upon which to build sophisticated analysis pipelines to be used in routine workflow—this workflow has included, and will certainly continue to include, different applications of artificial intelligence. Today, AI is fundamental in many facets of everyday life, from semantic searches on the internet to facial and voice recognition in mobile devices, and it has made remarkable progress in recent years. There are various potential applications of AI in medicine, and AI has already impacted radiology in some regards, introducing quantification into a space which was historically based purely on subjectivity [2, 3]. This however is just the beginning—it is widely recognized that medical imaging is one of the many fields in which advanced AI will cause a complete paradigm shift. Molecular imaging in particular is an especially likely candidate to benefit, and it is in a position which would allow it to readily integrate this technology.

Molecular imaging technologies have continually improved year over year.
MRI developments include higher field strength magnets, improved RF coil arrays increasing acquisition SNR, and a growing catalog of pulse sequences for various applications. Single photon emission computed tomography (SPECT) systems routinely employ advanced correction techniques, now producing quantitative images, and modern positron emission tomography (PET) scanners are using smaller crystals leading to better spatial resolution, with detection systems approaching timing resolution close to 200 ps. All of these modalities have realized concurrent progress in data processing as well, including sophisticated reconstruction and motion correction techniques. These advances have yielded extraordinary levels of image quality, but a point is approaching where it is becoming less clear how these improvements are practically realized in terms of clinical outcomes. For instance, producing images with superfine resolution for routine examinations might not significantly impact diagnostic reliability, staging, or treatment planning. In fact, the additional time taken for the data acquisition and radiologist interpretation would potentially have adverse effects on the clinical workflow. Furthermore, in recent years, the amount of medical imaging data has grown exponentially, and this has already increased the pressure on radiologists to maintain accuracy at higher throughput. While novel imaging innovations will continue to have impact on patient care and be welcomed by the medical community, it is likely that technological developments in the near future will focus on increasing efficiency, reliably standardizing care, and improving patient safety.

Artificial intelligence, by definition, is the branch of computer science that develops computer algorithms to perform jobs normally requiring human intelligence. Machine learning (ML) is a subgroup of AI connoting any algorithm which improves through experience. There are many different schemes, ranging in complexity from simple regression models and component analyses to more complex methods like random forests and support vector machines. However, most of the remarkable successes and resulting excitement of recent times belong to the class of ML known as deep learning (DL). State-of-the-art results have been achieved in the fields of object detection, classification, image segmentation, speech recognition, and image generation—in fact, DL models have matched and even surpassed human performance in certain tasks [4–6]. It is impossible to ignore that these tasks are ubiquitous components in many aspects of radiology, and novel applications for DL are immediately identified. Indeed, there are many areas of active research in medicine, and remarkable successes have been reported. Most reviews or general overviews of DL in medicine cite the growing number of related publications on PubMed, and at the time of this writing, the search phrase "deep learning" returned 5315 results for 2019. This is up from 3004 in the previous year, and for 2020, there are already 3994 results in the first 6 months. This trend is certainly a testament to the applicability and success of DL in medicine.

It is difficult to understand the evolution and future direction of AI without a basic understanding of the recent advances in AI techniques. This section gives an abbreviated overview, detailing a few specific examples. It cannot possibly cover all aspects but will instead focus on DL, since it is, without question, the dominant trend and direction of recent AI research; it has demonstrated promising improvements even over other traditional ML approaches. Almost all DL techniques are based on artificial neural networks (ANNs) comprising layers of numerical weights and "activation" nodes. More specifically, each node within a layer generally consists of a linear operation involving the summed product of its weights and input (the outputs of the previous layer), followed by a nonlinear operation, e.g., sigmoid, hyperbolic tangent, or rectified linear—there may be thousands of nodes in a given layer. By stacking many of these layers, through densely interconnected nodes, one can effectively piecewise construct complex functions which are able to be shaped throughout many degrees of freedom. In this sense, a network can be shaped to "learn" mapping functions between different domains.
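To make the node operation concrete, the following is a minimal NumPy sketch, not taken from the original text, of one densely connected layer: the summed product of weights and inputs, followed by a rectified-linear nonlinearity. The layer sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=784)            # input vector (e.g., a flattened image)
W = rng.normal(size=(1000, 784))    # one weight per input/node combination
b = np.zeros(1000)                  # bias terms

# Each node: a linear operation (summed product of weights and input),
# followed by a nonlinear "activation" (here, rectified linear).
z = W @ x + b
y = np.maximum(z, 0.0)              # ReLU; sigmoid or tanh would work as well

print(y.shape)                      # (1000,) -- one output per node
```

Stacking several such layers, each feeding the next, is what gives the network its many degrees of freedom.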
Unlike most other ML approaches, DL does not require inputs which explicitly define the discriminating features of the population; through training, it inherently learns the features which best represent the data for the current task. This data-driven approach allows DL applications to characterize more abstract features and makes these systems more generalizable, but it is predicated on the availability of large amounts of training data to enable accurate characterizations of the sample populations.

Convolutional neural networks (CNNs) are an extension of neural networks, designed to handle data with higher dimensionality, usually in 2D or 3D, and so are well suited for image-based tasks. In conventional ANNs, the weights at each layer have a single, unique value for every combination of nodes of its layer and the nodes of the previous layer, and so the corresponding total number of weights at each layer is the product of these numbers. For CNNs, instead of a single value, there is a matrix of values, which can be thought of as a weighted filter; the size of the matrices is relatively small. The filters are passed over the layer input data like a convolution kernel, resulting in output feature maps of the same dimensionality as the input. This approach exploits the spatial dependencies within the data and makes the network invariant to input translations, while at the same time significantly reducing the total number of network parameters. For example, say we have a single 2D input image with pixel dimensions 100 × 100, and this feeds a layer with 128 channels. A conventional ANN would handle each of the 10,000 input pixels independently, and so the total number of parameters would be 1,280,000 for that single layer. For a CNN, this corresponding layer would handle the whole image as a single, multidimensional input—with a filter size 3 × 3, the total number of layer parameters would then only be 1152 (1 × 3 × 3 × 128). This scheme is not only more efficient but potentially allows the same network to handle inputs of arbitrary sizes. For these reasons, CNNs are currently the AI technique of choice for image analyses and computer vision tasks.
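The parameter arithmetic in the example above can be checked directly. A small illustrative sketch, assuming the same 100 × 100 single-channel input and 128-channel layer as in the text (biases omitted, as in the worked numbers):

```python
# Densely connected layer: every input pixel connects to every node.
pixels = 100 * 100            # 10,000 input pixels
channels = 128
dense_params = pixels * channels
print(dense_params)           # 1,280,000 weights for that single layer

# Convolutional layer: one small shared filter per output channel,
# slid over the image regardless of its size.
in_channels, k = 1, 3
conv_params = in_channels * k * k * channels
print(conv_params)            # 1152 (1 x 3 x 3 x 128)
```

Because the filter weights are shared across all spatial positions, the count is independent of the image dimensions, which is why the same convolutional network can accept inputs of arbitrary sizes.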
Various CNN architectures are currently used—a few are explicitly mentioned here, but many of the basic concepts are common to many other networks. The convolution layers typically have filters with sizes between 3 and 5 pixels (for each dimension), and most networks also have multiple resolution downsampling (or encoding) layers. Many of the early uses for CNNs were focused on classification tasks and used a nonconvolutional, densely connected layer at the last layer to sort the output into scalar class probabilities [7]. Fully convolutional networks (FCNs), however, do not contain any densely connected layers and preserve the input dimensionality throughout the network—this architecture is better suited to certain analysis tasks, i.e., when requiring a dense prediction map over all pixels [8]. The U-Net architecture has become widely used in image analyses [9] and uses a dedicated encoding and decoding path to produce outputs of the same size as the inputs. A major contribution of U-Net was the introduction of skip connections between the encoding and decoding paths at each resolution level in order to preserve spatial detail throughout the network—this feature makes this architecture popular for medical image segmentation tasks. Another useful architecture is ResNet, which is built on residual blocks containing multiple convolution layers, with the block input directly connected to its output [10]. This direct connection results in an alternate identity path, and so each convolutional block needs only to learn the pixel residuals and is pre-conditioned to learn mappings which are close to identity; the ResNet architecture has facilitated training stability in some of the deepest networks. The last relevant architecture is called Inception [11]. It contains blocks of multiple streams, each with different numbers of convolutions, under the premise that explicit filter sizes need not be defined since the image is now analyzed at multiple scales at the same level, i.e., taking the network wider rather than deeper. There is also a powerful extension of this called Inception-ResNet, which, as the name implies, uses Inception blocks, rather than blocks of single-convolution streams, to calculate the block residuals.

Alongside the evolution of network architectures were concurrent advances in network training approaches. In the context of ML, training refers to the minimization of an objective loss metric corresponding to a certain task, i.e., some measure of distance between the network output and target value. In more basic terms, this means the values of the network weight parameters are gradually modified so that the desired outputs are obtained. This is usually accomplished by backpropagating the derivative of the loss through the network. Backpropagation is a computationally efficient method, combining simple mathematical operations, to generate a gradient of partial derivatives comprising the influences on the loss of every parameter in the network. After a complete backpropagation cycle, each network parameter is updated according to a predefined schedule in the direction which minimizes the loss. This process is repeated for many iterations, sometimes millions, until acceptable performance is achieved.

In general, there are two fundamental approaches to training ML systems, supervised and unsupervised. Under supervised approaches, the input data have corresponding labels, and gradient backpropagation begins with a loss calculation over every output element of the network. For example, a CNN designed for classification might predict the correct class for a given input image by finding the maximum of the discrete probabilities calculated over all possible classes—during training, it would compare this prediction to the correct label and backpropagate its error differentials. In a simple classification task, each possible class might be represented as a single node in the output layer. This concept is readily extended to FCNs, in which a classification framework might be used for organ segmentation, for example. In this situation, the loss would be calculated over each pixel, giving the likelihood that it belongs to a given tissue class. Supervised methods provide a direct objective but require manual data labeling or annotating, which is a laborious task and is often the main challenge given the large scale of data typically needed for training. Unsupervised methods, on the other hand, do not require labeled data and instead rely on the algorithm itself to extract the discriminating features within different sample populations to minimize the loss for the task at hand. There are several methods for unsupervised network training, but one approach stands out for its range of applicability and remarkable recent results, and it is designed for image-based tasks performed by CNNs. Generative adversarial networks (GANs), introduced in 2014, comprise a system of two networks [12]. The first is the primary network, the generator, which, for simplicity, can be regarded no differently than the networks discussed above—its job is to perform the desired task. However, instead of defining the training loss directly at its top layer with labels, the generator's output is fed into the second network, the discriminator, and the job of this network is to distinguish the generator's outputs from a corresponding set of real samples. During training, the discriminator learns the features that are common to the real and generated populations as a whole and uses this information to discriminate between the two sample sets. However, this same information can also be backpropagated to the generator and used to improve its own output. In this way, the two networks are adversaries in that they are each constantly trying to outperform the other, but at the same time, both networks can improve simultaneously. Deep learning systems built on the GAN framework have been tailored for specific applications in a wide range of fields and have demonstrated state-of-the-art performance, especially for image generation, translation, and transformation tasks.
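The adversarial update described above can be sketched in a few lines. This is a minimal, hedged PyTorch example: toy fully connected networks stand in for the CNNs used in practice, and the data and layer sizes are invented for illustration.

```python
import torch
from torch import nn

# Toy generator and discriminator for 2D samples; real GANs use CNNs.
G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 2.0        # stand-in "real" samples
    fake = G(torch.randn(64, 16))                # generator output

    # Discriminator: learn to separate real samples from generated ones.
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator: backpropagate the discriminator's information to make
    # its own outputs look "real" to the discriminator.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

Both networks improve together: as the discriminator sharpens its decision boundary, the gradients it passes back push the generator's samples toward the real distribution.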
Artificial intelligence has already established applications in the medical field. Novel investigations, however, particularly those based on DL, are yielding especially impressive results, and these provide a glimpse of the direction of AI and hint at its potential future role in molecular imaging. The following sections provide an abbreviated outline of its historical and current uses and also highlight some areas of emerging research.

4.1 Disease Characterization

Characterization is a general term implying the segmentation, diagnosis, and staging of disease. These tasks are achieved by identifying and measuring the imaged properties of a pathologic abnormality. A radiologist performing these analyses is therefore required to process large amounts of data for each examination, and he or she must then distill it down into a manageable, and much smaller, number of qualitative features, e.g., size, shape, heterogeneity, to serve as the basis for the final interpretation. Inevitably, some radiological information is lost throughout this process. Furthermore, every physician is different, and there will be unavoidable variability among human observers. Artificial intelligence can help to automate this procedure. It has the capacity to consider large numbers of quantitative features, potentially orders of magnitude greater than a human, and it could perform the task in a fraction of the time in a reproducible way. For example, benign and malignant pulmonary nodules have similar appearances, and hence, the status of malignancy in the lungs is difficult to assess. AI can account for many features simultaneously and automatically determine those which are most relevant to the current case. The relevant features could be treated as imaging biomarkers to be used in the malignancy prediction, along with other clinical endpoints like risk assessment and prognosis [13].

The idea to use AI for disease characterization and diagnosis dates back to the mid-twentieth century [14–17]. Many of these studies focused on the improved interpretation of electrocardiograms by computers [18–21] since these data are particularly suitable for computer analyses. Other related work included the differential diagnosis of hematological diseases [22], automatic biochemical analysis of bodily substances [23], and sclerosis prediction in the coronary arteries [24]. These efforts mostly comprised smaller pilot studies and reported some success. Although larger-scale, definitive experiments were not performed during this time, these efforts led to the general belief that automatic diagnoses by computers were not just feasible, but necessary as part of a comprehensive medical data control system [25–27]. These early studies fostered an optimistic outlook for the potential of machine-assisted diagnosis and led to many advancements in computer-aided diagnostic (CAD) programs.

Dedicated CAD programs have early roots [28], but researchers only started large-scale development toward practical solutions in the 1980s. Significant effort was made in the research arena, but the benefits to real clinical applications fell short [29], and it was not until 1998 that the FDA approved the use of CAD in screening and diagnostic mammography, as well as in plain chest radiography and CT imaging. Today, several systems are in clinical use with screening mammograms [30]. They are typically recommended to serve as a second opinion, complementing the initial radiologist assessment [31], and these led to the development of similar systems for other imaging modalities, including ultrasonography and MRI [32].

These conventional CAD systems generally consist of two components: detection of suspicious lesions and reduction of the false-positive findings. The detection system is based on radiologist-defined criteria like tumor volume, shape, texture, etc., which are translated into a pattern-recognition problem where the most robust features are fed into an algorithm to highlight suspicious objects in the image [33]. The false-positive reduction part is also based on traditional ML but can pose a bigger challenge to these algorithms. Even with sophisticated programs, the general performance of current CAD systems is not good, and this limits their extensive clinical use. Several trials have concluded that these systems, at best, deliver no benefit [34, 35]. It is more concerning, though, that these systems were actually found to reduce radiological accuracy in some cases [36], leading to higher recall and biopsy rates [37, 38].

Conventional CAD systems are built on rigid ML algorithms, mostly relying on expert knowledge, established a priori, for engineering features to be extracted from regions of interest. In contrast, new programs built on DL algorithms offer potential advantages regarding the degrees of freedom and level of abstraction in which the detection and classification tasks are defined. Furthermore, the performance of conventional CAD systems is notoriously sensitive to image noise and selected scanning protocol, and DL has demonstrated flexibility with regard to these parameters [39].
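As a schematic illustration of the two-component design described above, the sketch below pairs hypothetical handcrafted candidate features with a traditional ML classifier for false-positive reduction. The feature names and data are invented, and real CAD systems are far more elaborate.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Stage 1 (detection): candidate lesions found by rule-based criteria,
# each summarized by radiologist-defined features (values synthetic).
# Columns: volume_mm3, sphericity, mean_texture -- hypothetical names.
candidates = rng.normal(size=(200, 3))
labels = rng.integers(0, 2, size=200)   # 1 = true lesion, 0 = false positive

# Stage 2 (false-positive reduction): a traditional ML classifier trained
# to keep true detections and discard spurious ones.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(candidates[:150], labels[:150])
print(clf.predict(candidates[150:]))
```

The rigidity of such systems lies in stage 1: the features must be engineered a priori, which is precisely the step that DL replaces with learned representations.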
Largely due to the advances in computer hardware and processing technology, DL applications have emerged only recently for CAD systems—perhaps the earliest use in radiology was first reported in 1990, when a group at the University of Chicago developed an ANN for improving differential diagnosis of interstitial lung diseases using clinical and radiographic information. They claimed that the decision performance of the neural network was comparable to that of the chest radiologists and even superior to that of the senior radiology residents [40]. This led to several subsequent studies at that institution investigating neural network-aided diagnoses of lung disease [41–43]. The first object detection system using CNNs was proposed a few years later in 1995 at Georgetown University Medical Center, using a CNN with four layers to detect nodules in X-ray images [44].

Since then, DL-based CAD systems have been developed for the identification, detection, diagnosis, and risk analysis of various pathologies. Breast cancer, for example, was an obvious target since there was a historical precedent, and recent studies have demonstrated promising results regarding the performance of these next-generation systems in detecting and staging the diseases [45, 46]. In particular, it was reported that the automatic feature exploration and higher noise tolerance of DL-based CAD systems were responsible for the performance gains, which were quantified using different metrics, including sensitivity, specificity, and receiver operating characteristic analyses [47]. Lung cancer detection and screening is another attractive application, and several studies have evaluated the implementation of DL-based CAD systems for this purpose [48, 49]. These have also shown potential to effectively predict lung cancer and classify pulmonary nodules [47, 50]. In dermatology, deep convolutional networks have been used to classify skin lesions according to malignancy [51]. This large study found that AI achieved equivalent performance to all tested experts on two separate classification tasks, and further, it suggested that smartphone cameras could be used in conjunction with this technology to provide low-cost access to vital diagnoses. Other groups have also investigated DL with multi-modal imaging data. One notable study used PET and computed tomography (CT) data together in order to reduce false-positive results in lung lesion detections [52]. Simultaneous PET/CT data have also been used to classify lymph node metastases; a recent work found that this approach yielded higher sensitivities than radiologists [53]. Studies are consistently showing that the detection performance of AI in dedicated tasks is rivaling that of physicians [54], and recent interest in pursuing large-scale CAD solutions points to a future of robust, high-performance systems based on deep learning [55].

Deep learning has also demonstrated success for using radiological information not just for disease detection and characterization but for predicting patient diagnosis and prognosis. Early works in this area included survival predictions in patients with lung adenocarcinoma [56] and high-grade gliomas [57]. More recently, DL algorithms have been developed to predict the risk of lung cancer from a patient's current and prior CT volumes [58]. This work achieved a state-of-the-art predictive performance on thousands of national lung cancer screening trial cases and independent clinical validation sets. This work also noted that its AI-based model reduced many risks associated with conventional low-dose CT screening, including false positives, overdiagnoses, and radiation exposure. The computer-aided detection and diagnosis of Alzheimer's disease (AD) is another area of active DL research. SPECT and PET are both used by physicians to image the metabolism, protein aggregation, or amyloid deposition associated with AD, and a few studies have investigated DL-based CAD systems for early AD diagnoses. The flexibility of DL allows brain data from multiple modalities to be assessed together [59–61]. Two notable recent works even used 3D CNNs to classify patients having AD [62, 63]. In other functional neurological studies, Parkinson's disease has been automatically diagnosed in dopamine active transporter SPECT scans, achieving sensitivities around 95% [64, 65]. Other work has been performed with PET/CT and PET/MR data, and the inclusion of multimodal inputs, exploiting functional and structural information, has the potential to further improve the performance of AI-based disease characterization.
As DL technologies are already equivalent in many regards to radiologists' performance for segmentation [89], it is expected that the presence of DL-based segmentation algorithms in routine clinical tools will increase dramatically in the near future.

4.3 Image Generation/Reconstruction

Images are fundamental in radiology and diagnostic medicine. It was Wilhelm Roentgen who first discovered that X-rays could be used to image bone, just prior to the turn of the twentieth century. These early images were created directly, simply by exposing photographic film with the high-energy radiation. Over the next few decades, several other scanners were developed, and some became digitized. This included the first positron-annihilation coincident detection system in the 1950s. A simple rectilinear scanner with sodium iodide detectors was designed and built by Gordon Brownell at Massachusetts General Hospital to image tumors in the brain. As imaging technology advanced throughout the century, so did the methods used to process the acquired data and produce the images. Certainly, one of the most groundbreaking inventions was the CT scanner in the 1970s by Sir Godfrey Hounsfield. This achievement ushered in the era of volumetric tomography, i.e., cross-sectional imaging of a 3D body, in the medical setting. The CT scanner acquired X-ray projection data at various angles for sequential axial positions. The projection data were used to reconstruct image slices by filtered back-projection (FBP), a direct reconstruction technique which is still used even today. FBP was used to reconstruct projection data for emission modalities as well, like PET and SPECT, as they made their way into nuclear medicine departments in the 1980s and 1990s. During this time, MRI systems also became a mainstream diagnostic tool. MR is unique among these modalities in that its images are generated directly through inverse Fourier transforms of the acquired frequency and phase data. For all imaging modalities, processing methods have made great strides over recent years, and through many recent advances, the images which are routinely produced in the clinic are of unprecedented quality. Artificial intelligence has the potential to push this even higher.

Until recent times, AI had not realized an overwhelming presence in image reconstruction. Conventional approaches relied on physics and closed-form mathematics to define the acquisition process and translate the data into images. However, recent decades have seen processing schemes which have become less rigid and more adaptive. Although these may not be considered AI, per se, they incorporate some of the same components. For example, direct reconstruction methods like FBP have been replaced by iterative algorithms. The objective of these algorithms is to find the image which is the most likely source of the projections—this framework can account for data which may be incomplete, and it results in far less image noise. The optimal image may be found by maximizing some likelihood or minimizing some cost measure, a technique which is often used in clustering machine learning algorithms. Also, many MR systems are moving toward compressed sensing to perform routine examinations in fractions of the time. By combining these undersampled data with prior information, images of high fidelity can still be produced.
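The maximum-likelihood idea can be illustrated with the classic MLEM update, which iteratively re-weights the image so that its forward projection matches the measured data. This is a toy NumPy sketch with a random system matrix, not a clinical implementation, which would add scatter, randoms, and attenuation corrections:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model: projections y = A @ x for an invented system matrix A.
A = rng.uniform(0.0, 1.0, size=(60, 25))      # 60 projection bins, 25 pixels
x_true = rng.uniform(0.5, 2.0, size=25)
y = rng.poisson(A @ x_true).astype(float)     # noisy emission data

# MLEM: multiplicative update toward the maximum-likelihood image.
x = np.ones(25)                               # uniform initial estimate
sens = A.sum(axis=0)                          # sensitivity image, A^T 1
for _ in range(50):
    ratio = y / np.clip(A @ x, 1e-9, None)    # measured / estimated projections
    x *= (A.T @ ratio) / sens                 # re-weight each pixel

print(np.round(x - x_true, 2))                # residual error after 50 iterations
```

Unlike FBP, nothing here inverts the geometry in closed form; the estimate simply climbs the Poisson likelihood, which is why incomplete data and counting noise are handled gracefully.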
Deep learning algorithms based on CNNs have incredible potential for applications in image reconstruction and generation. Research in this field is rapidly increasing, with the large majority of work focusing on MRI—only a relatively small subset of studies is mentioned here. A popular area is looking to AI for acceleration of MR imaging through improving compressed sensing techniques [90, 91]. Neural networks have demonstrated the ability to learn spatio-temporal dependencies which enable them to improve the accuracy of reconstructed MR images from highly undersampled complex-valued k-space data. This concept can be applied to dynamic MR imaging and may be especially interesting for cardiac cine protocols [92]. Furthermore, this idea has been extended to various MRI acquisition strategies. Recent algorithms have proved to be flexible for treating the MR reconstruction process as a supervised learning task, mapping the scanner sensors to resultant images [93]. Deep learning has also been used to reduce the gadolinium dose in contrast-enhanced brain MRI by an order of magnitude while preserving the quality of the images [94] and for inferring advanced MRI diffusion parameters from limited data [95]. Quantitative susceptibility mapping, which aims to estimate the magnetic susceptibility of biological tissue, is currently a growing field in MRI research [96, 97]. The estimation of magnetic susceptibility from local magnetic fields is an ill-posed problem, and recent AI methods are being used here as well. One work developed a CNN based on the U-Net architecture which was able to generate high-quality susceptibility maps from single-orientation data [98]. MR fingerprinting (MRF) is another recent technique [99]. The idea is to use a pseudo-randomized acquisition that captures a unique signal from different tissues. These tissue "fingerprints" are then mapped back to standard parameters (T1, T2, proton density, etc.) by matching them to a predefined dictionary of predicted signal evolutions. This mapping is a difficult problem and has usually employed a pattern recognition approach—deep learning methodology is now being investigated for this purpose. A four-layer neural network was trained to map the recorded signal magnitudes to their corresponding tissue T1 and T2 values [100]. This group found reconstruction times using this approach were 300–5000 times faster than conventional dictionary-matching techniques in both phantom and human brain studies. Other similar approaches have been used to predict quantitative tissue parameter values from undersampled MRF data [101, 102].
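At its core, the dictionary matching described above is a nearest-neighbor search over precomputed signal evolutions. A minimal sketch, with an invented signal model and grid sizes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Precomputed dictionary: one simulated signal evolution per (T1, T2) pair.
t1_grid = np.linspace(200, 2000, 40)           # ms (hypothetical values)
t2_grid = np.linspace(20, 300, 30)             # ms
t = np.arange(1, 101)                          # acquisition time points
entries = [(t1, t2) for t1 in t1_grid for t2 in t2_grid]
D = np.array([np.exp(-t / t2) * (1 - np.exp(-t / t1)) for t1, t2 in entries])
D /= np.linalg.norm(D, axis=1, keepdims=True)  # normalize each fingerprint

# Measured fingerprint: matched by maximum inner product (correlation).
sig = D[777] + 0.05 * rng.normal(size=100)
best = np.argmax(D @ (sig / np.linalg.norm(sig)))
print(entries[best], entries[777])             # matched vs. true (T1, T2)
```

The exhaustive search over every dictionary entry is what makes conventional matching slow; the cited neural-network approach replaces this lookup with a learned direct mapping from signal to tissue parameters.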
Although MRI has so far realized the largest number of deep learning research efforts, these have potential applications extending to many areas in medical imaging on a more general scale. The last few years have seen impressive results for synthesizing photo-realistic images, especially using GANs [12, 103–105], and these techniques have also been used for biological image synthesis [106, 107]. One recent study designed a system to generate synthetic tumors in otherwise normal brain images [108]. This approach highlights a tremendously powerful use for generative networks, namely creating or augmenting training data. This is highly interesting for medical imaging, as datasets are often sparse or imbalanced, with few examples of pathological findings. Overcoming this challenge would help alleviate a huge limitation commonly encountered in training deep learning models. This approach has been used for brain tumor segmentation [109], synthesizing realistic prostate lesions [110], augmenting data for improved liver lesion classification [111], and generating synthetic retinal fundus images [112]. GANs have also been used for unsupervised generation of T1-weighted brains [113] and image synthesis for tissue recognition and computer-assisted intervention [114, 115]. Inter-modality translation has even been performed by GANs, transforming MR to CT images [116, 117] and to PET images [118]. This work even showed that the generated images can be used in CAD systems for improving the diagnosis of Alzheimer's disease when the patient data are incomplete. Artificial intelligence has provided a new paradigm for solving inverse problems in medical imaging [119–123]. Furthermore, studies have demonstrated the ability of DL to not only improve existing image reconstructions [124, 125] but also replace the reconstruction altogether, generating images directly from acquisition data [126]. This work found that a deep convolutional encoder–decoder network could be successfully used to generate quantitatively accurate PET images in a fraction of the time taken by conventional reconstruction methods. These works, and others like them, are incredibly encouraging. As a result, they have provoked a new, and necessary, avenue for research focusing solely on the potential pitfalls of DL-based reconstruction, and it has been found that deep learning can often yield unstable reconstruction methods. One recent work reported that these instabilities occur in several forms, including severe reconstruction artifacts caused by small perturbations in either the image or sampling domain; incomplete or incorrect representation of small structural changes, e.g., tumors; and poorer reconstruction performance with more training samples for several of the models investigated [127]. Numerical accuracy and stability are essential components of medical image reconstruction, and so the limitations of new technology are important to understand before it can be reliably used in the clinic. It is likely that, in the future, the image reconstruction process will be omitted altogether for certain applications, since a computer can theoretically extract any information contained in an image directly from the acquired data. For now, however, since humans perform the clinical interpretation, medical images need to be generated, and AI will continue to impact this process in unprecedented ways.

4.4 Data Corrections

As alluded to in the previous section, the methods to create medical images must be accurate and stable in order to be reliable—these requirements become even more critical when medical decisions depend on measurements of precisely quantified image values. Hence, the entire reconstruction process may comprise multiple steps to address different aspects. The backprojection algorithm, the cornerstone of tomographic reconstruction, can help to illustrate this. Data that are acquired as projections are mathematically regarded as a set of 1D line integrals, and backprojection seeks to invert this process and transform the sets of projections back to their original 2D form. However, due to the nature of the acquisition, low frequencies have a stronger latent prevalence within the projections than do the higher frequencies. So, to avoid a blurry reconstructed image dominated by low frequencies, the projection data must first be convolved with a ramp filter to boost the high frequencies. Additionally, the cylindrical geometry of the detection system results in nonuniform radial sampling, and this nonuniformity must also be accounted for in the reconstruction. This example demonstrates some of the steps necessary for a correct reconstruction approach, but backprojection is considered a direct method—newer, more sophisticated techniques usually require many additional considerations.
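The ramp-filtering step described above can be shown directly: each spatial frequency of a projection is weighted by its magnitude before backprojection. A minimal NumPy sketch for a single parallel-beam projection, with sizes chosen arbitrarily:

```python
import numpy as np

# One 1D parallel-beam projection (toy profile of a centered disk).
n = 256
s = np.linspace(-1, 1, n)
projection = np.sqrt(np.clip(0.25 - s**2, 0, None))

# Ramp filter: weight each spatial frequency by its magnitude,
# boosting high frequencies to counter the low-frequency dominance.
freqs = np.fft.fftfreq(n)
filtered = np.fft.ifft(np.fft.fft(projection) * np.abs(freqs)).real

print(projection[:4], filtered[:4])
```

Smearing the filtered (rather than raw) projections back across the image grid is what turns plain backprojection into filtered back-projection.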
In addition to the corrections needed to compensate for the limitations of the acquisition method, the acquired data themselves may not be of high inherent quality. For PET, the true data come from pure annihilation photons, detected within a small coincidence window. However, the scanner also captures coincident events arising from scattered and random photons, which must be corrected. These generally cannot be measured directly, so they must be estimated—this is currently accomplished by modeling the underlying physics. Photon scattering and absorption also lead to signal attenuation, and this requires an additional correction, usually based on an accompanying anatomical map. For MRI, the quality of the acquired data depends on the homogeneity of the static magnetic field, the linearity of the gradients, and the stability of the receiver coils. These properties are bound by engineering limitations, and many techniques are routinely used to correct anomalies; for example, shimming is used to adjust the field homogeneity, and spherical harmonic polynomial models can be used to characterize high-order gradient nonlinearities. However, sometimes these attempts are insufficient. Additionally, the MR scanner is very sensitive to environmental perturbations, and these can also lead to image noise. Artificial intelligence has proven adept at finding solutions to inference problems and should be able to help with issues related to incomplete or corrupted imaging data—indeed, it has already attained some notable successes.

Deep learning has recently been introduced to image denoising for many applications. In one study, neural networks were specifically developed to learn the implicit brain manifolds in MR images [113]. This group tested their approach by adding various levels of noise to several hundred T1-weighted brain images and reported improved performance over current denoising methods in terms of peak signal-to-noise ratios. Denoising has also been applied to dynamic contrast-enhanced MR data, using multiple networks to improve the signal quality, both spatially and temporally [128]. Emission modalities have also been a focus of AI denoising research since they are inherently noisy. For instance, each projection bin of a routine PET acquisition may contain only a few coincident events, introducing uncertainty into the reconstruction. Several works within the last few years have reported success for PET image denoising using both supervised and unsupervised training approaches [129–131]. One notable study incorporated a 2D network pretrained on millions of natural images as a perceptual loss network [132]. This group reported that image resolution and noise properties were improved by optimizing the perceptual loss in this way, rather than simply using a per-pixel supervised loss like the L1- or L2-norm. This approach has also been successfully applied for denoising CT images at various noise levels [133]. These reported successes have driven other research to investigate the potential clinical impacts of these methods. One such work reported improvements in physician lesion detectability performance when low-count PET images were denoised by a CNN [134].

Artifacts are another common nuisance in medical images—physiological or random patient motion, metal implants, and temporal or spatial aliasing all cause distortions in the reconstructions. Deep learning methods have been used for correcting these. Techniques have been applied to automatically detect and correct patient motion for both MRI [135] and PET [136]. Motion does not only compromise imaging data. It can also affect techniques like MR spectroscopy, and approaches based on DL have been developed to remove ghosting artifacts in these studies [137, 138]. Regardless of their source, artifacts degrade the reconstructed spatial resolution. This, of course, limits the value of medical images for diagnoses, since good resolution properties are required to extract fine details from small pathological foci.

Improving medical image resolution has been the sole focus of many research efforts. Super-resolution in MRI has been around for over a decade [139]. These approaches enabled the reconstruction of a 3D volume with high isotropic resolution by acquiring the data typically through regular angular sampling about a common frequency encoding axis [140] or through modulation of the longitudinal magnetization to acquire independent k-space data [141]. Studies have reported success for estimating quantitative high-resolution T1 maps from a corresponding set of low-resolution maps [142] and even using conventional machine learning techniques to generate 7T-like MR images from 3T data [143]. Within the last few years, image super-resolution has become an interesting application for DL methods. Novel methods have produced state-of-the-art results for resolution up-sampling in natural images [144], and applications specific to MRI followed closely. Deep convolutional networks have constructed super-resolution brain [145] and musculoskeletal [146] images. These networks have also been adapted to generate super-resolution images from another modality [147].

The transformational mapping between multiple image domains is yet another exciting application for DL [148]. Due in part to recent advances in unsupervised training methods [149], this concept has found applications in medical research. Deep convolutional networks have been developed for transforming FLAIR to T1 MRI [150], CT to PET [151], and T1 MRI to CT [117]. Clinical interpretations and therapy planning based on images synthesized from another, unrelated modality could have far-reaching effects in the future of diagnostic and therapeutic medicine; this should be approached cautiously, though, as synthesized images may contain incorrect pathological information and could lead to critical errors [150]. Notwithstanding this, image transformation based on DL may have the immediate potential to be a valuable tool for some technical problems. One popular current focus is related to PET/MR systems, transforming MR data to CT for PET attenuation correction. In order to produce quantitative images, photon attenuation must be corrected in all PET scans. This can be accurately estimated when an anatomical correlate of quantified attenuation values is available for directly generating a correction map, as it is with PET/CT. For PET/MR, however, this problem is more complicated since MR data do not contain information regarding photon scattering and absorption. Transforming MR images into quantified CT data has been implemented by several groups with promising results [152–154].
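The role of the correction map can be illustrated with the underlying physics: for each line of response, the attenuation correction factor is the reciprocal of the photon-pair survival probability, i.e., the exponential of the integral of the linear attenuation coefficients along the path. A toy sketch, with an invented mu-map and geometry:

```python
import numpy as np

# Toy mu-map of linear attenuation coefficients at 511 keV (1/cm):
# soft tissue ~0.096, bone ~0.17; one horizontal line of response per row.
mu = np.full((64, 64), 0.096)
mu[20:30, 25:35] = 0.17                 # a block of "bone"
pixel_cm = 0.4

# Attenuation correction factor per line of response:
# ACF = exp(+integral of mu dl), the reciprocal of the surviving fraction.
path_integrals = mu.sum(axis=1) * pixel_cm
acf = np.exp(path_integrals)
print(acf[:3], acf[25])                 # rows through "bone" are corrected more
```

With PET/CT, the mu values come directly from the CT; the PET/MR problem discussed above is exactly that no such map is natively available, which is what the MR-to-CT transformation supplies.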
Furthermore, the PET/MR attenuation correction problem has also been addressed by omitting the CT transformation step altogether, using a CNN to estimate the correction map directly from the attenuated PET data themselves [155].

4.5 Image Registration

Once accurate medical images are produced, the image data must be translated into information which can be used for clinical patient management by a physician. In certain situations, the information obtained from multiple images read concurrently may be of much greater value than that obtained from reading them independently. The frequency of these situations dramatically increased at the turn of the twenty-first century for multimodal imaging with the invention of PET/CT [156]. Multimodality imaging brought a new perspective into the field of clinical imaging. In this case, the combination of functional information with anatomical and morphological information provided an advanced medical tool, and countless studies over the past two decades have unequivocally established its diagnostic value. Other situations in which multiple images may be analyzed simultaneously include dynamic acquisitions, longitudinal comparisons, or multiparametric MRI. In each of these cases, it is helpful, or even necessary, for the images to be spatially matched. For this reason, image registration is a constant focus of research, and techniques continue to evolve.

There are many potential sources of misregistration between two images of the same object, but assuming the differences are only spatially variant, one space can be mapped to the other through linear and nonlinear transformations. It is then the job of the registration algorithm to find the optimal transformation. For rigid structures, e.g., the head, linear transformations comprising global translations and rotations may be sufficient for coregistration. However, most other natural movement contains local, elastic deformations, and more complex methods are additionally needed to characterize and compensate for it. This is conventionally handled by projecting one image onto a grid, which is then deformed in a way which increases some joint similarity measure. Many different similarity metrics have been proposed and investigated, but common ones include correlation (for single-modality data) or mutual information (for multimodal data). The optimization algorithm typically combines these approaches within some convergence framework to try to maximize the relative similarity.

The registration problem comprises a challenging combination of many factors; decisions regarding the spatial transformations, similarity metrics, optimization strategies, and numerical framework all play important roles in the performance. Machine learning techniques have been applied successfully for some specific applications in the past. However, as with other traditional ML techniques, these algorithms require explicitly handcrafting the features and have limited flexibility. In many cases, they are unable to meet the accuracy requirements of high-resolution medical imaging [157–160]. Recently, DL methods have been applied to image registration in order to improve accuracy and speed [161]. Image registration depends fundamentally on the identification of relevant information in the images, and this is a strength of deep neural networks. Convolutional stacked auto-encoder networks, for example, have demonstrated the ability to identify intrinsic features in image patches [162], and CNNs have been developed for regressing the transformation parameters of the registration for multimodal data [163]. The flexibility of DL makes it well suited to address applications involving deformable registrations [162, 164]. Many groups have reported recent successes for specific tasks, including elastic registration between 3D MRI and transrectal ultrasound for guiding prostate biopsy [165], deformable brain MRI registration [166], unsupervised CNN-based deformable registration for CT and MRI [167–169], and DL-based 2D/3D registration of preoperative 3D data and intraoperative 2D X-ray images in image-guided therapy [170].
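The mutual-information metric mentioned above can be computed from the joint intensity histogram of the two images; a registration framework would wrap a calculation like this inside an optimizer over candidate transformations. A minimal sketch on synthetic data:

```python
import numpy as np

def mutual_information(a, b, bins=32):
    # Joint histogram of corresponding intensities -> joint probability.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)     # marginal of image a
    p_b = p_ab.sum(axis=0, keepdims=True)     # marginal of image b
    nz = p_ab > 0
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

rng = np.random.default_rng(0)
img = rng.normal(size=(64, 64))
warped = np.roll(img, 5, axis=0) + 0.1 * rng.normal(size=(64, 64))

# MI is highest when the images are aligned; an optimizer would search
# over translations, rotations, or deformations to maximize this value.
print(mutual_information(img, img), mutual_information(img, warped))
```

Because mutual information measures statistical dependence rather than intensity agreement, it remains usable when the two images come from different modalities.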
As diagnostic medicine continues to evolve, more complementary and multiparametric tissue information will be acquired in space and time—accurate image registration will become increasingly critical. Methods based on AI have shown impressive results and will undoubtedly play important roles in the automated clinical workflow, enabling quantitative comparisons at multiple timepoints and across different imaging modalities.

4.6 Radiology Reporting

The underlying goal of any medical imaging examination is a noninvasive survey of pathological information. Regardless of the imaging modality, the radiologic data must be read and translated into reports which are able to be used toward guiding patient management—these reports lie at the intersection of radiology and multiple downstream clinical subspecialties. These reports are sensitive to errors in the previous steps of the imaging pipeline, and so great care must be taken to clearly and accurately outline the relevant findings. This makes it an arduous and time-consuming task. Furthermore, subjectivity and inter-reader variability may introduce communication inconsistencies between radiology and other physicians. AI presents an attractive option for increasing the speed and improving the standardization of radiology reports.

Artificial intelligence algorithms for voice recognition and text generation were first proposed nearly two decades ago [171], and today, they are used routinely for radiologic reporting. Since then, machine learning techniques have made great strides in natural language processing, and now several vendors have developed powerful tools capable of speech-to-text translation, along with compatible hardware, e.g., dictation microphones [172]. These solutions have proven themselves invaluable for automatic transcription of dictation content from radiologists without the need for typing, substantially reducing report generation times and improving clinical workflow.

Radiologic tools driven by deep learning algorithms have the potential to further streamline this process. Recently, DL has been used to automatically produce captions for natural photographic images [173], and this has led to many studies investigating potential applications for generating textual descriptions for medical images [174–181] and also for identifying findings in radiology reports [182–184]. Such AI tools could also replace the conventional qualitative nature of radiologic reporting with a more interactive, quantitative one, and this approach has been shown to improve collaboration between radiology and oncology [185]. For example, it is plausible to expect that, in the future, an AI-powered platform would be able to identify and diagnose pathological abnormalities and annotate them in a textual format that included quantified information about size, location, and probability of malignancy with associated confidence levels. These data would reduce subjective bias in decisions regarding patient management. Additionally, these well-structured reports would prove very beneficial to population sciences and big data mining initiatives. Another related avenue of DL research is using the generated radiologic reports themselves to annotate and label the imaging data. Medical PACS systems typically store thousands of free-text reports containing valuable information describing the images. Parsing this text and turning it into accurate annotations or labels requires sophisticated text-mining methods—this is a field in which DL is currently being applied. Reports with higher degrees of structure more readily lend themselves to this purpose, and there are already some emerging applications. For example, there has been work reporting success leveraging radiologists' BI-RADS categorizations for training deep neural networks for characterizing breast lesions [174]. Considering the point that labeled data can be used to improve classification accuracy, one study was motivated by the fact that large amounts of annotated data might be unobtainable. This work proposed to create semantic description labels for the data, using both images and textual reports [186]. This group reported that semantic information can increase classification accuracy for different pathologies in medical images.
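As a deliberately simple illustration of mining labels from free-text reports, the sketch below extracts BI-RADS categories with a keyword rule. The report snippets are invented, and production systems rely on NLP or DL models rather than regular expressions:

```python
import re

# Invented report snippets standing in for free-text PACS archives.
reports = [
    "Spiculated mass in the left breast. Assessment: BI-RADS 5.",
    "Scattered fibroglandular tissue, no suspicious finding. BI-RADS 2.",
    "Stable postsurgical changes. BI-RADS category 3.",
]

# Rule-based extraction of the BI-RADS category, usable as a training label.
pattern = re.compile(r"BI-RADS(?:\s+category)?\s*(\d)", re.IGNORECASE)
labels = [int(m.group(1)) if (m := pattern.search(r)) else None for r in reports]
print(labels)   # [5, 2, 3]
```

Labels harvested this way, at PACS scale, are what make it feasible to train image classifiers without prospective manual annotation.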
Advanced AI algorithms are also being applied in other ways to improve efficiency in radiology practice. Convolutional neural networks can be used to determine scanning protocols from short text classification [187] and to improve time-sensitive decisions by prioritizing urgent cases [188]. One of the most interesting recent endeavors, however, addressed the challenges of summarizing and representing patient data from electronic health records [189]. This work presented a novel unsupervised DL method for constructing general-purpose patient representations. The value of such data would be huge, since it could then potentially facilitate clinical predictive modeling on a large scale.

The applications mentioned above involved, to some degree, image interpretations based on human perception. Years of collecting data in routine clinical practice have produced an incredibly rich resource of quantified radiological data along with the associated clinical outcomes. These data are being leveraged to refine the field of radiomics. Radiomics in medicine refers to the high-throughput extraction of large amounts of features from medical images [190]. Radiomic analyses, sometimes involving high-order statistics, can be used to identify patterns related to disease characteristics—patterns which may be undetectable by a traditional observer. Radiomics emerged from the field of oncology with the hypothesis that imaged tumors may reveal distinctive features pertaining to the disease which can be useful for predicting prognoses and planning personalized therapy [191, 192]. Early work in radiomics involved analyzing large sets of images and building correlations among various predefined features characterizing, for example, tumor morphology, intensity, and texture. Following this, many efforts have successfully applied radiomic evaluations for assisting clinical decision-making in oncology. For example, radiomics has been used to predict metastatic patterns in lung adenocarcinoma [193] as well as disease recurrence [194] and prognoses [195]. Recently, deep learning has been applied in this space [161]. As with many other examples presented in this chapter, DL poses advantages over traditional methods for automatically extracting the relevant features, while simultaneously providing information regarding their clinical relevance. Deep learning and radiomics are two rapidly evolving technologies, and their symbiosis will likely lead to a single unified framework to support clinical decisions—this has the potential to completely transform the field of precision medicine [13].
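As a minimal illustration of the feature-extraction idea, the sketch below computes a few first-order intensity features over a synthetic tumor ROI. Dedicated packages compute hundreds of standardized features, including the texture and shape descriptors mentioned above; everything here, including the feature names, is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic image and tumor mask standing in for a segmented PET volume.
image = rng.normal(2.0, 0.5, size=(64, 64))
image[24:40, 24:40] += 3.0                 # a "hot" lesion
mask = np.zeros((64, 64), dtype=bool)
mask[24:40, 24:40] = True

# First-order radiomic features over the region of interest.
roi = image[mask]
features = {
    "volume_voxels": int(mask.sum()),
    "mean_intensity": float(roi.mean()),
    "max_intensity": float(roi.max()),
    "heterogeneity_std": float(roi.std()),
    "skewness": float(((roi - roi.mean())**3).mean() / roi.std()**3),
}
print(features)
```

Computed at scale across a cohort and correlated with outcomes, feature tables like this are the raw material of the radiomic signatures discussed above.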
4 Evolution of AI in Medical Imaging 51
39. Lee J-G, et al. Deep learning in medical imaging: general overview. Korean J Radiol. 2017;18(4):570–84.
40. Asada N, et al. Potential usefulness of an artificial neural network for differential diagnosis of interstitial lung diseases: pilot study. Radiology. 1990;177(3):857–60.
41. Lin J-S, et al. Reduction of false positives in lung nodule detection using a two-level neural classification. IEEE Trans Med Imaging. 1996;15(2):206–17.
42. Ashizawa K, et al. Artificial neural networks in chest radiography: application to the differential diagnosis of interstitial lung disease. Acad Radiol. 1999;6(1):2–9.
43. Ashizawa K, et al. Effect of an artificial neural network on radiologists' performance in the differential diagnosis of interstitial lung disease using chest radiographs. AJR Am J Roentgenol. 1999;172(5):1311–5.
44. Lo S-C, et al. Artificial convolution neural network techniques and applications for lung nodule detection. IEEE Trans Med Imaging. 1995;14(4):711–8.
45. Wang D, et al. Deep learning for identifying metastatic breast cancer. arXiv preprint arXiv:1606.05718, 2016.
46. Kallenberg M, et al. Unsupervised deep learning applied to breast density segmentation and mammographic risk scoring. IEEE Trans Med Imaging. 2016;35(5):1322–31.
47. Cheng J-Z, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):1–13.
48. Hua K-L, et al. Computer-aided classification of lung nodules on computed tomography images via deep learning technique. Onco Targets Ther. 2015;8:2015–22.
49. Kumar D, Wong A, Clausi DA. Lung nodule classification using deep features in CT images. In: 2015 12th conference on computer and robot vision. New York: IEEE; 2015.
50. Chen J, et al. Use of an artificial neural network to construct a model of predicting deep fungal infection in lung cancer patients. Asian Pac J Cancer Prev. 2015;16(12):5095–9.
51. Esteva A, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–8.
52. Teramoto A, et al. Automated detection of pulmonary nodules in PET/CT images: ensemble false-positive reduction using a convolutional neural network technique. Med Phys. 2016;43(6):2821–7.
53. Wang H, et al. Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from 18F-FDG PET/CT images. EJNMMI Res. 2017;7(1):11.
54. Chen T, Metaxas D. Medical image computing and computer-assisted intervention—MICCAI 2000. Vol. 1935 of Lecture notes in computer science. New York: Springer; 2000.
55. Kooi T, et al. Large scale deep learning for computer aided detection of mammographic lesions. Med Image Anal. 2017;35:303–12.
56. Paul R, et al. Deep feature transfer learning in combination with traditional features predicts survival among patients with lung adenocarcinoma. Tomography. 2016;2(4):388.
57. Nie D, et al. 3D deep learning for multi-modal imaging-guided survival time prediction of brain tumor patients. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2016.
58. Ardila D, et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat Med. 2019;25(6):954–61.
59. Suk H-I, et al. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage. 2014;101:569–82.
60. Suk H-I, Shen D. Deep learning-based feature representation for AD/MCI classification. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2013.
61. Liu S, et al. Early diagnosis of Alzheimer's disease with deep learning. In: 2014 IEEE 11th international symposium on biomedical imaging (ISBI). New York: IEEE; 2014.
62. Hosseini-Asl E, Gimel'farb G, El-Baz A. Alzheimer's disease diagnostics by a deeply supervised adaptable 3D convolutional network. arXiv preprint arXiv:1607.00556, 2016.
63. Payan A, Montana G. Predicting Alzheimer's disease: a neuroimaging study with 3D convolutional neural networks. arXiv preprint arXiv:1502.02506, 2015.
64. Choi H, et al. Refining diagnosis of Parkinson's disease with deep learning-based interpretation of dopamine transporter imaging. NeuroImage. 2017;16:586–94.
65. Kim DH, Wit H, Thurston M. Artificial intelligence in the diagnosis of Parkinson's disease from ioflupane-123 single-photon emission computed tomography dopamine transporter scans using transfer learning. Nucl Med Commun. 2018;39(10):887–93.
66. Haralick RM, Shapiro LG. Image segmentation techniques. Comput Vis Graph Image Process. 1985;29(1):100–32.
67. Pham DL, Xu C, Prince JL. Current methods in medical image segmentation. Annu Rev Biomed Eng. 2000;2(1):315–37.
68. Grau V, et al. Improved watershed transform for medical image segmentation using prior information. IEEE Trans Med Imaging. 2004;23(4):447–58.
69. Sharma N, Aggarwal LM. Automated medical image segmentation techniques. J Med Phys. 2010;35(1):3.
70. Han X, et al. Atlas-based auto-segmentation of head and neck CT images. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2008.
71. Parisot S, et al. A probabilistic atlas of diffuse WHO grade II glioma locations in the brain. PLoS One. 2016;11(1):e0144200.
72. Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39(4):640–51.
73. Middleton I, Damper RI. Segmentation of magnetic resonance images using a combination of neural networks and active contour models. Med Eng Phys. 2004;26(1):71–86.
74. Ning F, et al. Toward automatic phenotyping of developing embryos from videos. IEEE Trans Image Process. 2005;14(9):1360–71.
75. Ciresan D, et al. Deep neural networks segment neuronal membranes in electron microscopy images. In: Advances in neural information processing systems. 2012.
76. Prasoon A, et al. Deep feature learning for knee cartilage segmentation using a triplanar convolutional neural network. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2013.
77. Cernazanu-Glavan C, Holban S. Segmentation of bone structure in X-ray images using convolutional neural network. Adv Electron Comput Eng. 2013;13(1):87–94.
78. Moeskops P, et al. Automatic segmentation of MR brain images with a convolutional neural network. IEEE Trans Med Imaging. 2016;35(5):1252–61.
79. Zhu Q, et al. Deeply-supervised CNN for prostate segmentation. In: 2017 international joint conference on neural networks (IJCNN). New York: IEEE; 2017.
80. Rastgarpour M, Shanbehzadeh J. Application of AI techniques in medical image segmentation and novel categorization of available methods and tools. In: Proceedings of the international multiconference of engineers and computer scientists. Princeton: Citeseer; 2011.
81. Pereira S, et al. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans Med Imaging. 2016;35(5):1240–51.
82. Roth HR, et al. Deep learning and its application to medical image segmentation. Med Imaging Technol. 2018;36(2):63–71.
83. Tang X, Wang B, Rong Y. Artificial intelligence will reduce the need for clinical medical physicists. J Appl Clin Med Phys. 2018;19(1):6.
84. Feng X, et al. Deep convolutional neural network for segmentation of thoracic organs-at-risk using cropped 3D images. Med Phys. 2019;46(5):2169–80.
85. Dong X, et al. Automatic multiorgan segmentation in thorax CT images using U-net-GAN. Med Phys. 2019;46(5):2157–68.
86. Milletari F, Navab N, Ahmadi S-A. V-net: fully convolutional neural networks for volumetric medical image segmentation. In: Fourth international conference on 3D vision (3DV). New York: IEEE; 2016.
87. de Brebisson A, Montana G. Deep neural networks for anatomical brain segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2015.
88. Moeskops P, et al. Deep learning for multi-task medical image segmentation in multiple modalities. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2016.
89. Ghafoorian M, et al. Location sensitive deep convolutional neural networks for segmentation of white matter hyperintensities. Sci Rep. 2017;7:5110.
90. Sun J, Li H, Xu Z. Deep ADMM-Net for compressive sensing MRI. In: Advances in neural information processing systems. 2016.
91. Wang S, et al. Accelerating magnetic resonance imaging via deep learning. In: 2016 IEEE 13th international symposium on biomedical imaging (ISBI). New York: IEEE; 2016.
92. Qin C, et al. Convolutional recurrent neural networks for dynamic MR image reconstruction. IEEE Trans Med Imaging. 2018;38(1):280–90.
93. Zhu B, et al. Image reconstruction by domain-transform manifold learning. Nature. 2018;555(7697):487–92.
94. Gong E, et al. Deep learning enables reduced gadolinium dose for contrast-enhanced brain MRI. J Magn Reson Imaging. 2018;48(2):330–40.
95. Golkov V, et al. Q-space deep learning: twelve-fold shorter and model-free diffusion MRI scans. IEEE Trans Med Imaging. 2016;35(5):1344–51.
96. Deistung A, et al. Toward in vivo histology: a comparison of quantitative susceptibility mapping (QSM) with magnitude-, phase-, and R2*-imaging at ultra-high magnetic field strength. NeuroImage. 2013;65:299–314.
97. Deistung A, Schweser F, Reichenbach JR. Overview of quantitative susceptibility mapping. NMR Biomed. 2017;30(4):e3569.
98. Yoon J, et al. Quantitative susceptibility mapping using deep neural network: QSMnet. NeuroImage. 2018;179:199–206.
99. Ma D, et al. Magnetic resonance fingerprinting. Nature. 2013;495(7440):187–92.
100. Cohen O, Zhu B, Rosen MS. MR fingerprinting deep reconstruction network (DRONE). Magn Reson Med. 2018;80(3):885–94.
101. Hoppe E, et al. Deep learning for magnetic resonance fingerprinting: a new approach for predicting quantitative parameter values from time series. In: GMDS. 2017.
102. Fang Z, et al. Quantification of relaxation times in MR fingerprinting using deep learning. In: Proceedings of the International Society for Magnetic Resonance in Medicine Scientific Meeting and Exhibition. Bethesda: NIH Public Access; 2017.
103. Creswell A, et al. Generative adversarial networks: an overview. IEEE Signal Process Mag. 2018;35(1):53–65.
104. Hong Y, et al. How generative adversarial networks and their variants work: an overview. ACM Comput Surv. 2019;52(1):1–43.
105. Huang H, Yu PS, Wang C. An introduction to image synthesis with generative adversarial nets. arXiv preprint arXiv:1803.04469, 2018.
106. Osokin A, et al. GANs for biological image synthesis. In: Proceedings of the IEEE international conference on computer vision. New York: IEEE; 2017.
107. Antipov G, Baccouche M, Dugelay J-L. Face aging with conditional generative adversarial networks. In: 2017 IEEE international conference on image processing (ICIP). New York: IEEE; 2017.
108. Shin H-C, et al. Medical image synthesis for data augmentation and anonymization using generative adversarial networks. In: International workshop on simulation and synthesis in medical imaging. New York: Springer; 2018.
109. Mok TC, Chung AC. Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. In: International MICCAI brainlesion workshop. New York: Springer; 2018.
110. Kitchen A, Seah J. Deep generative adversarial neural networks for realistic prostate lesion MRI synthesis. arXiv preprint arXiv:1708.00129, 2017.
111. Frid-Adar M, et al. Synthetic data augmentation using GAN for improved liver lesion classification. In: 2018 IEEE 15th international symposium on biomedical imaging (ISBI 2018). New York: IEEE; 2018.
112. Guibas JT, Virdi TS, Li PS. Synthetic medical images from dual generative adversarial networks. arXiv preprint arXiv:1709.01872, 2017.
113. Bermudez C, et al. Learning implicit brain MRI manifolds with deep learning. In: Medical imaging 2018: image processing. Bellingham: International Society for Optics and Photonics; 2018.
114. Zhang Q, et al. Medical image synthesis with generative adversarial networks for tissue recognition. In: 2018 IEEE international conference on healthcare informatics (ICHI). New York: IEEE; 2018.
115. Nie D, et al. Medical image synthesis with context-aware generative adversarial networks. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2017.
116. Nie D, et al. Estimating CT image from MRI data using 3D fully convolutional networks. In: Deep learning and data labeling for medical applications. New York: Springer; 2016. p. 170–8.
117. Wolterink JM, et al. Deep MR to CT synthesis using unpaired data. In: International workshop on simulation and synthesis in medical imaging. New York: Springer; 2017.
118. Li R, et al. Deep learning based imaging data completion for improved brain disease diagnosis. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2014.
119. Strack R. Imaging: AI transforms image reconstruction. Nat Methods. 2018;15(5):309–10.
120. McCann MT, Jin KH, Unser M. Convolutional neural networks for inverse problems in imaging: a review. IEEE Signal Process Mag. 2017;34(6):85–95.
121. Jin KH, et al. Deep convolutional neural network for inverse problems in imaging. IEEE Trans Image Process. 2017;26(9):4509–22.
122. Schlemper J, et al. A deep cascade of convolutional neural networks for MR image reconstruction. In: International conference on information processing in medical imaging. New York: Springer; 2017.
123. Lucas A, et al. Using deep neural networks for inverse problems in imaging: beyond analytical methods. IEEE Signal Process Mag. 2018;35(1):20–36.
124. Kim K, et al. Penalized PET reconstruction using deep learning prior and local linear fitting. IEEE Trans Med Imaging. 2018;37(6):1478–87.
125. Gong K, et al. Iterative PET image reconstruction using convolutional neural network representation. IEEE Trans Med Imaging. 2018;38(3):675–85.
126. Häggström I, et al. DeepPET: a deep encoder–decoder network for directly solving the PET image reconstruction inverse problem. Med Image Anal. 2019;54:253–62.
127. Antun V, et al. On instabilities of deep learning in image reconstruction and the potential costs of AI. Proc Natl Acad Sci. 2020;117(48):30088–95.
128. Benou A, et al. Ensemble of expert deep neural networks for spatio-temporal denoising of contrast-enhanced MRI sequences. Med Image Anal. 2017;42:145–59.
129. Liu C-C, Qi J. Higher SNR PET image prediction using a deep learning model and MRI image. Phys Med Biol. 2019;64(11):115004.
130. Lu W, et al. An investigation of quantitative accuracy for deep learning based denoising in oncological PET. Phys Med Biol. 2019;64(16):165019.
131. Cui J, et al. PET image denoising using unsupervised deep learning. Eur J Nucl Med Mol Imaging. 2019;46(13):2780–9.
132. Gong K, et al. PET image denoising using a deep neural network through fine tuning. IEEE Trans Radiat Plasma Med Sci. 2018;3(2):153–61.
133. Yang Q, et al. CT image denoising with perceptive deep neural networks. arXiv preprint arXiv:1702.07019, 2017.
134. Schaefferkoetter J, Yan J, Ortega C, et al. Convolutional neural networks for improving image quality with noisy PET data. EJNMMI Res. 2020;10:105. https://doi.org/10.1186/s13550-020-00695-1.
135. Küstner T, et al. Automated reference-free detection of motion artifacts in magnetic resonance images. Magn Reson Mater Phys Biol Med. 2018;31(2):243–56.
136. Li T, et al. Motion correction of respiratory-gated PET images using deep learning based image registration framework. Phys Med Biol. 2020;65(15):155003.
137. Gurbani SS, et al. A convolutional neural network to filter artifacts in spectroscopic MRI. Magn Reson Med. 2018;80(5):1765–75.
138. Kyathanahally SP, Döring A, Kreis R. Deep learning approaches for detection and removal of ghosting artifacts in MR spectroscopy. Magn Reson Med. 2018;80(3):851–63.
139. Robinson MD, et al. New applications of super-resolution in medical imaging. Super-Resolut Imaging. 2010;2010:384–412.
140. Shilling RZ, et al. A super-resolution framework for 3-D high-resolution and high-contrast imaging using 2-D multislice MRI. IEEE Trans Med Imaging. 2008;28(5):633–44.
141. Ropele S, et al. Super-resolution MRI using microscopic spatial modulation of magnetization. Magn Reson Med. 2010;64(6):1671–5.
142. Van Steenkiste G, et al. Super-resolution T1 estimation: quantitative high resolution T1 mapping from a set of low resolution T1-weighted images with different slice orientations. Magn Reson Med. 2017;77(5):1818–30.
143. Bahrami K, et al. 7T-guided super-resolution of 3T MRI. Med Phys. 2017;44(5):1661–77.
144. Ledig C, et al. Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
145. Liu C, et al. Fusing multi-scale information in convolution network for MR image super-resolution reconstruction. Biomed Eng Online. 2018;17(1):114.
146. Chaudhari AS, et al. Super-resolution musculoskeletal MRI using deep learning. Magn Reson Med. 2018;80(5):2139–54.
147. Zeng K, et al. Simultaneous single- and multi-contrast super-resolution for brain MRI images based on a convolutional neural network. Comput Biol Med. 2018;99:133–41.
148. Isola P, et al. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
149. Zhu J-Y, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision. 2017.
150. Cohen JP, Luck M, Honari S. Distribution matching losses can hallucinate features in medical image translation. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2018.
151. Armanious K, et al. Unsupervised medical image translation using cycle-MedGAN. In: 2019 27th European signal processing conference (EUSIPCO). New York: IEEE; 2019.
152. Han X. MR-based synthetic CT generation using a deep convolutional neural network method. Med Phys. 2017;44(4):1408–19.
153. Torrado-Carvajal A, et al. Dixon-VIBE deep learning (DIVIDE) pseudo-CT synthesis for pelvis PET/MR attenuation correction. J Nucl Med. 2019;60(3):429–35.
154. Leynes AP, et al. Zero-echo-time and Dixon deep pseudo-CT (ZeDD CT): direct generation of pseudo-CT images for pelvic PET/MRI attenuation correction using deep convolutional neural networks with multiparametric MRI. J Nucl Med. 2018;59(5):852–8.
155. Spuhler KD, et al. Synthesis of patient-specific transmission data for PET attenuation correction for PET/MRI neuroimaging using a convolutional neural network. J Nucl Med. 2019;60(4):555–60.
156. Beyer T, et al. A combined PET/CT scanner for clinical oncology. J Nucl Med. 2000;41(8):1369.
157. Wohlhart P, Lepetit V. Learning descriptors for object recognition and 3D pose estimation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
158. Dollár P, Welinder P, Perona P. Cascaded pose regression. In: 2010 IEEE computer society conference on computer vision and pattern recognition. New York: IEEE; 2010.
159. Zach C, Penate-Sanchez A, Pham M-T. A dynamic programming approach for fast and robust object pose recognition from range images. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
160. Mottaghi R, Xiang Y, Savarese S. A coarse-to-fine model for 3D pose estimation and sub-category recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
161. Litjens G, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88.
162. Wu G, et al. Scalable high-performance image registration framework by unsupervised deep feature representations learning. IEEE Trans Biomed Eng. 2015;63(7):1505–16.
163. Sloan JM, Goatman KA, Siebert JP. Learning rigid image registration: utilizing convolutional neural networks for medical image registration. 2018.
164. Yang X, et al. Quicksilver: fast predictive image registration – a deep learning approach. NeuroImage. 2017;158:378–96.
165. Haskins G, et al. Learning deep similarity metric for 3D MR–TRUS image registration. Int J Comput Assist Radiol Surg. 2019;14(3):417–25.
166. Cao X, et al. Deformable image registration using a cue-aware deep regression network. IEEE Trans Biomed Eng. 2018;65(9):1900–11.
167. Shan S, et al. Unsupervised end-to-end learning for deformable medical image registration. arXiv preprint arXiv:1711.08608, 2017.
168. Kearney V, et al. An unsupervised convolutional neural network-based algorithm for deformable image registration. Phys Med Biol. 2018;63(18):185017.
169. de Vos BD, et al. A deep learning framework for unsupervised affine and deformable image registration. Med Image Anal. 2019;52:128–43.
170. Zheng J, et al. Pairwise domain adaptation module for CNN-based 2-D/3-D registration. J Med Imaging. 2018;5(2):021204.
171. Antoniol G, et al. Radiological reporting based on voice recognition. In: International conference on human-computer interaction. New York: Springer; 1993.
172. Liu Y, Wang J. PACS and digital medicine: essential principles and modern practice. Boca Raton: CRC Press; 2010.
173. Karpathy A, Fei-Fei L. Deep visual-semantic alignments for generating image descriptions. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
174. Kisilev P, et al. Medical image description using multi-task-loss CNN. In: Deep learning and data labeling for medical applications. New York: Springer; 2016. p. 121–9.
175. Shin H-C, et al. Interleaved text/image deep mining on a very large-scale radiology database. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
176. Shin H-C, et al. Learning to read chest X-rays: recurrent neural cascade model for automated image annotation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
177. Wang X, et al. Unsupervised category discovery via looped deep pseudo-task optimization using a large scale radiology image database. arXiv preprint arXiv:1603.07965, 2016.
178. Jing B, Xie P, Xing E. On the automatic generation of medical imaging reports. arXiv preprint arXiv:1711.08195, 2017.
179. Li Y, et al. Hybrid retrieval-generation reinforced agent for medical image report generation. In: Advances in neural information processing systems. 2018.
180. Moradi M, et al. Bimodal network architectures for automatic generation of image annotation from text. In: International conference on medical image computing and computer-assisted intervention. New York: Springer; 2018.
181. Zhang Y, et al. Learning to summarize radiology findings. arXiv preprint arXiv:1809.04698, 2018.
182. Pons E, et al. Natural language processing in radiology: a systematic review. Radiology. 2016;279(2):329–43.
183. Zech J, et al. Natural language–based machine learning models for the annotation of clinical radiology reports. Radiology. 2018;287(2):570–80.
184. Goff DJ, Loehfelm TW. Automated radiology report summarization using an open-source natural language processing pipeline. J Digit Imaging. 2018;31(2):185–92.
185. Folio LR, et al. Quantitative radiology reporting in oncology: survey of oncologists and radiologists. Am J Roentgenol. 2015;205(3):233–43.
186. Schlegl T, et al. Predicting semantic descriptions from medical images with convolutional neural networks. In: International conference on information processing in medical imaging. New York: Springer; 2015.
187. Lee YH. Efficiency improvement in a busy radiology practice: determination of musculoskeletal magnetic resonance imaging protocol using deep-learning convolutional neural networks. J Digit Imaging. 2018;31(5):604–10.
188. Annarumma M, et al. Automated triaging of adult chest radiographs with deep artificial neural networks. Radiology. 2019;291(1):196–202.
189. Miotto R, et al. Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep. 2016;6(1):1–10.
190. Lambin P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. 2012;48(4):441–6.
191. Aerts HJ. The potential of radiomic-based phenotyping in precision medicine: a review. JAMA Oncol. 2016;2(12):1636–42.
192. Kumar V, et al. Radiomics: the process and the challenges. Magn Reson Imaging. 2012;30(9):1234–48.
193. Coroller TP, et al. CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma. Radiother Oncol. 2015;114(3):345–50.
194. Huynh E, et al. Associations of radiomic data extracted from static and respiratory-gated CT scans with disease recurrence in lung cancer patients treated with SBRT. PLoS One. 2017;12(1):e0169172.
195. Parmar C, et al. Machine learning methods for quantitative radiomic biomarkers. Sci Rep. 2015;5:13087.
5 The Basic Principles of Machine Learning
Joshua D. Kaggie, Dimitri A. Kessler, Chitresh Bhushan, Dawei Gui, and Gaspar Delso
Contents
5.1 Introduction
5.1.1 The Task of ML
5.1.2 Supervised Learning
5.1.3 Unsupervised Learning
5.1.4 Radiomics and Texture Analysis
5.1.5 Feature Reduction
5.1.6 Scaling and Normalization
5.1.7 Training, Validation, and Testing
5.2 Linear Regression
5.2.1 Under- and Overfitting
5.2.2 Linear Regression Mathematics
5.2.3 The Neural Network
5.2.4 The Objective Function
5.2.5 Gradient Descent
5.2.6 Deep Learning with Convolutional Neural Networks
5.2.7 Advanced Deep Learning Architectures
5.2.8 Deep Learning in Medical Image Analysis
5.2.9 Federated Learning
References
5.1 Introduction
Fig. 5.1 Images of a head using (a–c) three MRI methods, (d) a pseudo-CT image derived from the MRI data using machine learning, and (e) a CT image
question to answer, knowledge to gain, or process to improve. There are several questions in medicine that can be tackled with ML, such as: does the person have any disease? If yes, what is the disease? How far along is it? Where is it located? Can it be treated? Can imaging be used in conjunction with any other data to diagnose the disease? In order to have a useful measurement, it is necessary to have a question or hypothesis.

5.1.1.2 A Computer
To perform ML, you will need hardware and software that can perform this task (Fig. 5.2). The hardware will consist of a computer made of a "central processing unit" (CPU) and memory. CPUs are incredibly versatile and can handle nearly all computing tasks. Your home computer will likely contain 2–16 CPUs for its daily computational tasks, whether that is browsing the internet or advanced image processing. The combination of thousands of low-powered CPUs for parallel computing is commonly used in "graphics processing units" (GPUs), which combine CPUs to optimize calculations for graphical displays and intensive ML tasks. GPUs can contain thousands of CPUs, called "cores," although these cores will individually be much slower than the CPUs in your home computer. The advantage of the GPU is that it performs highly parallel computations, which allows it to outperform a CPU when many similar computations can be performed at the same time, i.e., parallelized. Within your home, you may have a GPU that could perform ML; however, many tasks require higher-quality GPUs that can cost thousands of dollars. To access high-performance GPUs, there are many online websites that offer free trials on quality GPUs. High-performance computing centers exist at many universities and offer purchasable time, with the possibility of thousands of CPUs and hundreds of GPUs for parallel computing. There are also "tensor processing units" (TPUs), a proprietary form of ML-dedicated processor that uses lower numerical precision to enable faster processing.

[Fig. 5.2: Basic Computing Architecture; RAM: Operational Memory]
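As an illustration of the hardware check that typically precedes training, the short sketch below (a minimal example, assuming the freely available PyTorch library; equivalent checks exist in other ML frameworks) asks whether a CUDA-capable GPU is present and falls back to the CPU otherwise.

```python
# Minimal sketch, assuming PyTorch: select a GPU if one is available.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"ML computations will run on: {device}")
if device.type == "cuda":
    print("GPU name:", torch.cuda.get_device_name(0))
```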
liver lesions in CT images, just by changing the training data!

A significant amount of computer code with different ML algorithm implementations is readily available online for the intrepid researchers who are willing to explore them. There are increasingly more free tools available for individuals to use with their own datasets. ML methods are becoming increasingly impressive, with recent demonstrations that ML can predict what a person may be looking at (after very significant pretraining!). At present, there is no ML algorithm or model that enables generalized intelligence ("general AI"), so we work with very specific tasks, data, and models. Not all diseases will be detectable with ML methods, as even ML is subject to the sensitivity and specificity of physical systems. The more specific the data, the more specific the technique that can be used.

5.1.1.4 Data to Interpret
ML is specific to the datasets used in training, and thus requires curation of the data to perform a specific task. Data curation remains a limiting factor, as the majority of medicinal techniques require human involvement in order to establish efficacy, and few locations are willing to accept the additional risk when an ML method may misdiagnose a patient. In practice, a large amount of time for machine learning is spent in the preparation of datasets, such as curating or labeling them, and analyzing whether the outputs are meaningful.
There are two axioms presented when discussing ML: "garbage in, garbage out" and "more data is always better." The following question arises: what are the optimal numbers within a dataset to learn and then verify the task at hand? Unfortunately, there is no simple answer to the data amount. This depends largely on the quality of data and the strength of the relationship between the model and observed parameters. Sufficient dataset diversity is a requirement, but too much data diversity will waste resources on unimportant leads. A dataset with large variability in its files/parameters and with fewer individuals will require more resources than a dataset with small files/parameters and more individuals. The stronger the relationship between the input and output, the sooner any model will converge. Datasets where strong correlations exist, such as the segmentation of large features like an entire lung or brain volume, may be trainable on tens of subjects and tested on tens of other subjects. Datasets with weak correlations may require thousands of subjects or more, if they are even possible to train well.
There are several questions to ask when considering the data: is the data balanced between training types? Are the labels noisy enough to model real-world environments, but not so noisy as to be useless? Are all of the interesting values represented? Is the data structured appropriately, such as in a file format type (video, images, XML, etc.), where ML can be performed? Does the data need labels/supervision, and are there sufficient labels? Are the labels accurate? Is the data biased, and can we overcome these biases?

5.1.2 Supervised Learning

Successful ML models in medicine use datasets based on the inputs of an experienced clinician providing outlines or disease scores. These models are trained with very specific questions in mind, often on "labeled" datasets to enable "supervised" learning. Labeling a dataset refers to providing a structured raw dataset, such as images that may have a type of disease present, and a label specific to each image, such as whether that image has the disease present. Supervised learning happens when both the raw data and labels are presented in a structured format. To draw parallels with fitting the curve of a line, raw data is often referred to as "x." The goal of the algorithm would be to identify the curve or formula that best fits the labels "y" corresponding to those "x" data points. After performing supervised learning, a machine learning program will create an output of predicted "y" values, which are then tested against the observed or measured "y" values, which are labels derived from an experienced reader.
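A minimal sketch of this x-to-y workflow, assuming scikit-learn and illustrative made-up values, is below: a model is fit to labeled (x, y) pairs and then produces predicted "y" values for new "x" data.

```python
# A toy supervised-learning example: fit labels y from raw data x,
# then predict y for unseen x. Values are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[1.0], [2.0], [3.0], [4.0]])  # raw data ("x")
y = np.array([2.0, 4.1, 5.9, 8.2])          # labels from an experienced reader ("y")

model = LinearRegression().fit(x, y)        # supervised training
print(model.predict(np.array([[5.0]])))     # predicted "y" for a new "x" (~10)
```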
Fig. 5.3 Three plots showing quantifications of apples and bananas based on length, color ("yellowness"), and weight. Depending on the quantification method used, the classification can result in several clusters that can distinguish the apples and bananas or can make them indistinguishable
Neural networks allow learning very complex, task-specific textures that can be used to recognize faces [4] and numbers [5], and to segment structures in different types of bio-medical images [6, 7].
Texture measurements can have a confusing relationship with shape and volume. While some textures are shape dependent, others are not. A simple total volume or length may be more important than other texture features because large diseased areas often represent comorbidities. Due to this dependency, many different image combinations are possible. Should you use an arbitrary region outline to extract features of your image that could vary over different disease shapes and sizes, or a rigid shape (such as a square)? Outlining a disease specifically will highlight disease-specific features but will result in irregular shapes that are less repeatable than conforming to drawing a rigid shape based on anatomical landmarks; the precise outline required depends on context.

5.1.5 Feature Reduction

One of the difficulties of texture analysis, and in general ML, is that the number of descriptions can quickly outpace the data present. Let us take five different fruits: an apple, orange, banana, cherry, and pear. There are 5! = 120 different orders in which we can arrange these in a line, which is much more than the number of fruits themselves. However, it may be that one of those specific orders is the key to unlocking a mystery. Those 120 orders are not all of the possible representations of those 5 fruits, because we could place the fruits into 5 boxes instead of 1, or 4, 3, or 2. Images have much more than five features, often using pixel dimensions of 256 × 256 = 65,536 pixels, before even considering the many layouts possible or 3D layers!
This proliferation of dimensionality obscures the interpretation of the results, making it difficult to understand why an algorithm has reached a certain conclusion. The goal of ML research should be to find new mechanisms underlying the causes of diseases, which will aid future assessments (whether in the physical or computational spaces). Not all data is necessary; it does not normally make sense to look for disease correlations within the brain for a predominantly kidney disease. In order to make sense of much of the data, we can perform "ablation" to determine whether these features are primary measurements within our tests. "Ablation" is the removal of a portion of the ML pipeline to see whether the same results are obtained, whether that's removing data, features, or a portion of the ML network. For example, an ablation study might randomly remove half of the image features to see whether the same results can be obtained [8, 9].
Large numbers of features are more difficult to train and require more data. Two methods to reduce the number of features are principal component analysis (PCA) and singular value decomposition (SVD). These are unsupervised algorithms that reduce the number of features by combining them into new features based on linear relationship calculations. By reducing the number of features used before further calculations, ML methods can train more quickly and with reduced dataset sizes. PCA and SVD can be considered the same thing for all practical purposes, although PCA can differ in its operational order. These create a reduced set of features based on their linear correlations, calculated based on a linear algebra technique.
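A minimal sketch of PCA-based feature reduction, assuming scikit-learn and synthetic data, is below: ten correlated features are compressed into two new linear-combination features.

```python
# PCA feature reduction sketch with synthetic, correlated features.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))               # 2 hidden factors
features = factors @ rng.normal(size=(2, 10))     # 10 correlated features
features += 0.01 * rng.normal(size=(100, 10))     # small measurement noise

pca = PCA(n_components=2)
reduced = pca.fit_transform(features)
print(reduced.shape)                    # (100, 2)
print(pca.explained_variance_ratio_)    # nearly all variance in 2 components
```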
Feature selection can be performed by the recursive elimination of features (feature ablation) to determine whether an output remains significant. Feature selection can also be performed prior to analysis. For example, features can be removed that have strong linear correlations to each other, presupposing that they do not contain new information. This is not strictly true, as a quadratic curve could be based off of a linear curve and would not have a strong linear correlation, yet would be caused by exactly the same information. Another selection method retains features that have the highest variances, based on the assumption that independent relationships will have high variances; but this could also be the result of noise. Recursive elimination is the most likely to find significant results, but it is also the most likely to find insignificant results at the same time due to the wide range of repeated tests, which is why linear- or variance-based feature removals are used.

5.1.6 Scaling and Normalization

Scaling and normalization is not always ideal because it can reduce the quantitative nature of outputs; however, rescaling features is often necessary to obtain a meaningful result. For example, if I have two processes, one which results in 1000 ± 100 and another which results in 0.0010 ± 0.0001, then the effect of the one with the higher scale may be overestimated during any optimization process. Rescaling also normally results in faster fitting.
A (non-fruit) example might be the one below. Let us say we have the feature inputs, x, and the outputs, y, as listed:

Texture (x1) | Shape (x2) | eGFR (x3) | Gender (x4) | Disease score (y) | Disease (y_binary)
0.7 | 1000 | 65  | M (0) | 4 | Yes (1)
0.1 | 3000 | 15  | F (1) | 5 | Yes (1)
0.5 | 500  | 110 | M (0) | 0 | No (0)
0.9 | 7000 | 80  | M (0) | 3 | Yes (1)
0.5 | 2500 | 120 | F (1) | 1 | No (0)
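A brief sketch of rescaling these example features, using NumPy and a simple z-score standardization, is below; without it, the Shape feature (hundreds to thousands) would dominate the Texture feature (0–1) during optimization.

```python
# Z-score standardization of the example features from the table above.
import numpy as np

X = np.array([[0.7, 1000,  65, 0],
              [0.1, 3000,  15, 1],
              [0.5,  500, 110, 0],
              [0.9, 7000,  80, 0],
              [0.5, 2500, 120, 1]], dtype=float)  # texture, shape, eGFR, gender

X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)   # zero mean, unit variance
print(X_scaled.round(2))
```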
meaningful linking relationship. These correlations are called false positives or type I errors if there is no underlying causation. ML normally uses too many features or variables while ignoring multiple testing corrections (such as Bonferroni corrections). The high dimensionality of the problems would result in small statistical significance if such corrections were to be used in many of these tests. To work around this limitation and ensure reliable measurements, datasets are often broken up into "Training," "Validation," and "Testing" datasets.
The training dataset is used while developing an algorithm. The validation set is used after an ideal algorithm has been developed, to measure initial results on an independent set; this will result in lower scores than on the training set. The testing set is meant to be a completely independent set of data that has not been tested previously, such that the training and validation would not bias its scores. The test set is considered the measured or observed values. In practice, many datasets have been obtained from online databases that have been tested previously, so there is ambiguity in the literature over whether a test set is a "validation" or "testing" set.
These datasets might be split up equally into thirds, or into 30/20/30 splits, or even into 80/20 splits, depending on the data quality and type. For large databases, a full train-validation-test split is possible for sufficient training to obtain high scores (of any metric), while for smaller databases, such as in the case of rare diseases, an 80/20 split is required. It is also possible to do five 80/20 splits, with the 20% of each split coming from five different portions of the data, and to repeat the training/testing five times, if the dataset numbers are very low [10].
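The splits described above can be produced with a few lines of code; the sketch below, assuming scikit-learn and placeholder data, shows a single 80/20 split and the five-way repeated 80/20 scheme (equivalent to 5-fold cross-validation).

```python
# Train/test splitting sketch with placeholder data.
import numpy as np
from sklearn.model_selection import KFold, train_test_split

X, y = np.arange(100).reshape(50, 2), np.arange(50)

# A single 80/20 train/test split:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))  # 40 10

# Five 80/20 splits, each 20% taken from a different fifth of the data:
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    print(len(train_idx), len(test_idx))  # 40 10 per fold
```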
5.2 Linear Regression

While it may seem backwards to begin with linear regression, it is helpful to understand it before delving deeper into ML methods. Since the importance of textures can be separated via linear regression, and since linear regression is a fundamental component of the majority of neurons in neural networks, having a solid framework of linear regression can help inform a more intuitive understanding of ML structures. Linear regression can be used as its own ML method. The principles demonstrated while performing a simple linear regression are applicable in more advanced machine learning topics.

5.2.1 Under- and Overfitting

Imagine that two points on a line are known exactly: at x = 1, y = 5, and at x = 2, y = 10. It is easy to see that the equation for this line would be y = 5*x. We have two variables, x and y, which are unknown. We have two points of (x,y): (1,5) and (2,10). If we know those two points exactly and that the model to be fit is a line, then we can fit the curve exactly.
Now imagine that our model is not a line but that it is a second-degree polynomial: y = ax² + bx + c. This could be the case because we have nonlinear relationships within our dataset. We may not even know what this model is, which does not necessarily have to be a second-degree polynomial but could be any number of other curves. We stick with the second-degree polynomial in this hypothetical example to give an intuitive feel for fitting. Let us assume that we only have those original two points [(1,5) and (2,10)]. The second-degree curve that fits this data could be y = −5x² + 20x − 10. It could also be y = −10x² + 35x − 20. Or any number of solutions exist. We could continue to extend the parameters with higher orders of polynomials, where even more possibilities exist. For a noiseless problem, we would want the same number of input datasets to match the number of variables to be predicted. In the real world, data collection processes are messy, so we normally require significantly more datasets to average noise out, even for predicting simple linear relationships.
Underfitting occurs when the data follows a more complicated relationship than the final model can represent. Underfitting usually results in poor predictions because the model does not capture all of the complex relationships within the data. Overfitting is when the data follows a simpler relationship than the final model requires.
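A small numeric check of this example, using NumPy, is below: two points determine a line uniquely, while a second-degree polynomial through the same two points is underdetermined, so many solutions fit the data equally well.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([5.0, 10.0])

# Degree 1: two points, two parameters, one unique solution (y = 5x).
print(np.polyfit(x, y, deg=1))  # ~[5. 0.]

# Degree 2: three parameters but only two points; both curves from the
# text pass exactly through (1, 5) and (2, 10).
for coeffs in ([-5.0, 20.0, -10.0], [-10.0, 35.0, -20.0]):
    print(coeffs, "->", np.polyval(coeffs, x))  # both give [ 5. 10.]
```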
Many ML tasks will have many more variables predicted than inputs, which is overfitting. Overfitting is unavoidable because we are attempting to predict multiparametric, nonlinear relationships, such as the shape of a liver within images. The cost of overfitting is that it requires a larger number of datasets to have good predictability, it requires more advanced computational equipment (and power!), and it precludes the understanding of underlying models and mechanisms, which should be the goal of every researcher. However, an overfitted model is far from useless; if an overfitted model can be used for future predictions with statistical significance, naively as a lookup table using a very large dataset, then it may point to the existence of a simpler, underlying model or mechanism.
The goals of researchers vary: for some, finding small relationships in large datasets is important, where effects are hidden; for others, finding strong relationships in small datasets is important, as that demonstrates very significant effects. Scientific progression relies on both types. Within the ML applications presented hereafter, we primarily focus on the use of large datasets, but note that small datasets require more specific models and stronger correlations to demonstrate efficacy, whether with ML or classical statistics.
In a very strict sense, many ML applications perform "overfitting," as the amount of data present is limited by the number of patients, so that it is far lower than the number of variable parameters (Fig. 5.5). The number of variable parameters in a typical deep learning model can include 30+ layers of parameters with tens to hundreds of parameters per layer. In this context, both "underfitting" and "overfitting" are when the model cannot be generalized to a group beyond the individuals included in the study, because the model is either not trained on enough data ("overfitting") or does not have enough parameters ("underfitting"). ML generally focuses on future predictions based on past data, whereas medicinal research focuses on better patient treatments or finding new underlying causes of disease progression, which require sensitive techniques and meaningful statistical methods to avoid biases.

Fig. 5.5 When discussing fitting, the number of parameters should be representative of the underlying data and ignore noise. By increasing the number of fit parameters, any model can be fit with reduced error, but this reduces the predictability of future results without massive data increases

5.2.2 Linear Regression Mathematics

This section is included as an easy reference for those who may be required to perform these calculations. It can be skipped by those not wishing to delve deeper into the mathematics.
Linear regression tries to predict a linear relationship between a set of variables and an output. We are most familiar with linear regression in the form of

y = f_{m₀,b₀}(x) = m₀x + b₀
m = Covariance[x, y] / Variance[x] = (∑xy − (∑x ∑y)/n) / (∑x² − (∑x)²/n)

r = ∑(x − x̄)(y − ȳ) / √(∑(x − x̄)² · ∑(y − ȳ)²)
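A quick numeric check of these closed-form expressions, using NumPy and made-up data, is below.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
n = len(x)

m = (np.sum(x * y) - np.sum(x) * np.sum(y) / n) / (np.sum(x**2) - np.sum(x)**2 / n)
r = np.sum((x - x.mean()) * (y - y.mean())) / np.sqrt(
    np.sum((x - x.mean())**2) * np.sum((y - y.mean())**2))
print(m, r)  # slope ~1.94, correlation ~0.998
```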
A neuron usually has a largely linear relationship between inputs and outputs. However, by chaining neurons together with activation functions, nonlinear relationships occur that can model most complex functions. By increasing the number of layers, the model becomes more specific and less generalized, which requires more data and more specific data. A higher number of neuron layers is incredibly useful when the model has an unknown or difficult-to-model relationship, which occurs frequently in the arbitrary shapes that many lesions can cause.
We want to suppress the output of neurons that are not contributing to accuracy in the end prediction (Fig. 5.7). Within biology, an activation function is when a cell has minimal output until a set of inputs, such as a voltage or chemical concentration, surpasses a definable limit, ensuring a neuron fires only after passing the threshold of an action potential. Due to this similarity, neural networks are referred to as "artificial neural networks." In order to attenuate these computational neurons, similar to biological under-stimulation or hypersensitization, we use a similar idea encapsulated within the "activation function."
A neuron weights each of its inputs, creating a linear model, and then outputs this. This output is fed into an activation function, also known as a nonlinearity, that performs a certain fixed mathematical operation on it.
Activation functions subdue or amplify the outputs of a neuron to increase or decrease its importance in a neural network. Common activation functions are shown in Fig. 5.8. The sigmoid function (or logistic curve) is a common function to "squish" a real number between 0 and 1. Very negative inputs become close to 0, very positive inputs become close to 1, and the function steadily increases around the input 0. The sigmoid function is continuously differentiable as a smooth, nonlinear step function. This function is useful for predicting probabilities. However, it is not zero-centered, which may require more data for training and can be difficult to optimize [12]. The hyperbolic tangent (tanh) function is a sigmoid scaled to be centered around zero, and so is preferred over the sigmoid function. The most widely used activation function currently is the "rectified linear unit" or "ReLU" pictured in Fig. 5.8, used for its speed and stability. The ReLU has simpler mathematical operations than the sigmoid and tanh activation functions. No activation function is perfect; the ReLU can result in a "dying ReLU problem," in which some neurons become permanently inactive and stop learning.
[Fig. 5.8: plot of common activation functions (ReLU, leaky ReLU, sigmoid, tanh) over inputs from −3 to 3]
[Fig. 5.9: illustration of a vector v = (4, 3), whose length can be counted in blocks]
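The activation functions discussed above are one-line formulas; a minimal NumPy sketch is below.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes inputs to (0, 1)

def relu(z):
    return np.maximum(0.0, z)         # passes positives, zeroes negatives

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # [0.119 0.5 0.881]
print(np.tanh(z))   # zero-centered alternative to the sigmoid
print(relu(z))      # [0. 0. 2.]
```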
The L1 norm of the prediction error is

‖y‖₁ = |y₁,pred − y₁,obs| + |y₂,pred − y₂,obs| + … + |yₙ,pred − yₙ,obs|

The L2 norm of a vector is its classical distance from the origin and has the symbol ‖v‖₂. It can also be referred to as the Euclidean norm and is the most common norm used (Fig. 5.9). It is the square root of the sum of the squared vector values. The L2 norm of 1 + i is √2. The L2 norm of a vector with higher dimensions is

‖v‖₂ = √(v₁² + v₂² + ⋯ + vₙ²)

or, within an optimization technique:

‖y‖₂ = √((y₁,pred − y₁,obs)² + (y₂,pred − y₂,obs)² + … + (yₙ,pred − yₙ,obs)²)

These two norms are the most commonly used cost functions. Another cost function could be the p-norm, which is the L2 norm with the power of two replaced with the power of p:

‖y‖ₚ = (|y₁,pred − y₁,obs|ᵖ + |y₂,pred − y₂,obs|ᵖ + … + |yₙ,pred − yₙ,obs|ᵖ)^(1/p)

Finding the exact global minimum is generally not feasible for problems that have a large number of parameters. Hence, most optimization methods use local information like derivatives and Taylor series expansions to find a "local" minimum that is practically useful for the task [27–29]. "Gradient descent" is a widely used technique that uses local derivatives to find a local minimum for a variety of applications [30, 31].

5.2.5 Gradient Descent

Gradient descent is a widely used iterative method to optimize an objective function. That is, it is the method for updating parameters within a neural network algorithm [30, 31].
A gradient descent algorithm can be used to optimize a continuous function. Let us say our model is quadratic, that is, y_model = x², and we have a set of measured values, y₁,obs, y₂,obs, y₃,obs. We want to find the x values that minimize the error between the model and the observed or measured values. While the solution to each x is trivial because we know the model (since we could easily take the square root of y₁,obs, y₂,obs, and y₃,obs), this is enlightening as it can help us understand models with more complicated interdependencies, such as if y₁, y₂, and y₃ were not completely independent. The L2 norm in this model will be

‖y‖₂ = √((x₁² − y₁,obs)² + (x₂² − y₂,obs)² + (x₃² − y₃,obs)²)
The L2 norm in this model will be
Gradient descent updates each parameter with tion gained by comparing the network’s output
its rate of change based on the derivative of the and our observation data is propagated through
cost function. For example, with each gradient the rest of the network. The minimum is found by
descent step, an initial point x1 is moved in the repeatedly using the above update rule:
direction of steepest descent with a step-size α:
∂ || y ||2
wn +1 = wn – α ·
∂ || y ||2 ∂ wn
x1, new = x1, old – α·
∂x1, old
∂ || y ||2
bm +1 = bm – α ·
Alpha, α, is a “learning rate,” which can be stati- ∂ bm
cally set or updated throughout training. Alpha is
often determined empirically and can be consid- Neural networks will typically use “mini-batch”
ered a “hyperparameter.” Hyperparameters are gradient descent, which is a combination of two
often user-definable variables that can affect other gradient descent methods: batch and sto-
training effectiveness substantially. Large-scale chastic gradient descent. Batch gradient descent
optimization, like that in deep learning, generally calculates the descent for all parameters and
use sophisticated approaches for updating the observations but is computationally expensive
learning rate throughout the process to quickly especially with very large datasets. Stochastic
find the minimal point [32–35]. gradient descent updates parameters using an
When applied to neural networks, the gradient “estimate” of the derivative computed using only
descent algorithm is used to update the weights a random “mini” subset of the complete observa-
wn and biases bm to minimize the cost (Fig. 5.10). tion, thus saving on computational requirements,
This process by which values are updated is which results in high variability in the new
called “back-propagation” because the informa- parameter estimates [36, 37].
y
Deep learning (DL) is machine learning that
y
y
w2
w1 occurs over many learning layers. Similar to ordi-
y y nary neural networks, convolutional neural net-
w1
w2
works (CNNs) also consist of neurons that have
learnable weights and biases. The main differ-
y
y
w1
y
w2
ces/images as inputs (Fig. 5.11; see also [4, 5,
w1 12]) and are thus built accordingly to make use of
convolution filters, which is fundamental to sev-
Fig. 5.10 The goal of the optimization process is to eral image processing operations [38–40]. The
determine the weights of the system that reach the mini- layers of CNNs are made up of neurons arranged
mum value (marked with the green x). The weights can be
updated with a gradient descent algorithm. New values of
in three dimensions—height, width, and depth.
the system are based on the rate of change caused by each The core parts of any CNN are the convolu-
parameter. Parameters that drastically affect the system tional layers (CL). As with any other layer, a CL
have a higher rate of change and are updated more quickly. receives an input and outputs a transformation of
Multiple parameters can affect a system, which is the
effect shown in blue, where these might be updated each
the input to the next layer. Inputs passing through
w update shown in red. A danger of optimization tech- a convolutional layer will have a set of k small
niques is that local minima can be reached n × n filters convolve (slide) across its volume
5 The Basic Principles of Machine Learning 71
Row 1
Original 4x4 Image
N1
Row 2
N2
y1
N3
Row 3
N4
Row 4
Fig. 5.11 A 4 × 4 image is unwrapped to show how it can create 16 inputs into a convolutional layer with 4 neurons.
These four neurons can then weigh into a final output layer, which may be a prediction of the severity of a disease
and have the dot product between each filter and In addition to the convolutional layer, CNNs
input entry calculated (Fig. 5.12). Therefore, k are made up of a series of different types of layers
2D feature maps are obtained, which are stacked that either perform a transformation of the input’s
across the depth dimension. The three hyper- activations (convolutional layer, fully connected
parameters that define the output volume size are layer) or apply a fixed function (activation func-
the number of filters k (depth of the CL), the tion layer (AF), pooling layer, batch
stride with which the filter is convolved over the normalization).
input and the size of zero-padding around the A fully connected layer (FCL) simply multi-
input’s border. A stride of 1 means, the filter is plies its input by a weight matrix followed by a
slid pixel by pixel over the input. Zero-padding bias offset and an activation function transforma-
allows control over the output spatial size. By tion. All the neurons from an FCL are connected
padding the borders of the input with zeros, the to all neurons from the previous layer. FCLs are
spatial size of the output equals the size of the typically used at the end of CNNs designed for
input. performing classification tasks.
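A minimal sketch of these hyperparameters, assuming PyTorch, is below: k filters, a stride of 1, and zero-padding chosen so the output keeps the input's spatial size.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 4, 4)   # one single-channel 4 x 4 input image
conv = nn.Conv2d(in_channels=1, out_channels=8,       # k = 8 filters
                 kernel_size=3, stride=1, padding=1)  # 3 x 3, zero-padded
print(conv(x).shape)  # torch.Size([1, 8, 4, 4]): spatial size preserved
```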
Fig. 5.13 Pooling an image or layer down-samples that image to become a smaller shape. "Max pooling" selects the maximum value within a region. "Average pooling" can also be performed, where the average of a region is selected
Fig. 5.12 (Top) A demonstration of convolution using a 3 × 3 filter (or "kernel" if in 2D) that highlights diagonal elements. (Bottom) Prior to convolution, the input image is zero-padded to maintain the final image size

Pooling layers (PL) are used intermittently between CLs to reduce the spatial dimension of the feature maps [12, 41]. Common pooling operations are max pooling and average pooling of the spatial data (Fig. 5.13). Through pooling, important feature information is preserved while other, less important information is discarded [42]. This increases the robustness of feature extraction and controls overfitting while reducing the total number of parameters in the network and ultimately the computation time [43].
layer or convolutional layer → batch normaliza- ality) to reconstruct the input data. Here stems
tion layer → activation function layer stacks, the name “autoencoding” as the network is
while each stack is followed by a pooling layer, encoding its most prominent features to be able
until the input has been spatially reduced. A final to reconstruct itself. The network learns by com-
fully connected layer → activation function layer paring the reconstructed input with the original
is then applied to classify each neuron. input using for instance an L2 cost function. This
Fig. 5.14 The autoencoder consists of two neural networks, an encoder and a decoder network. The dimension of the input (brain MRI) is first reduced by the encoder to a lower-dimensional feature representation, followed by a dimension upsampling by the decoder to generate the output. Usually, the encoding and decoding paths are symmetric to generate an output identical to the input
learning is unsupervised as no external labeled mat. Within U-Net, the input is progressively
data is used during training. After training, the down-sampled (encoder) by a typical CNN archi-
decoder part of the network can be discarded tecture described in the previous section and then
while the encoder part could be used for some up-sampled (decoder) by a series of transpose
different task, such as to initialize a new super- convolutional layers. By introducing additional
vised network or to provide justification [47, 48]. skip-connections, features from the encoding
network path are concatenated to features from
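A minimal convolutional autoencoder sketch in PyTorch, assuming 64 × 64 single-channel inputs (the sizes are illustrative):

```python
import torch
import torch.nn as nn

# The encoder compresses the input; the decoder reconstructs it. The L2 (MSE)
# loss compares reconstruction and original, so no labels are needed.
encoder = nn.Sequential(
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),    # 16 -> 32
    nn.ConvTranspose2d(8, 1, 2, stride=2),                # 32 -> 64
)
x = torch.randn(4, 1, 64, 64)                             # a batch of toy images
loss = nn.MSELoss()(decoder(encoder(x)), x)               # unsupervised L2 objective
loss.backward()
```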
5.2.7.2 ResNet
The ResNet architecture was introduced in 2015 and won the ImageNet challenge in that year. It consists of so-called ResNet or residual blocks that allow easier and faster training of very deep networks [49] by mitigating the vanishing gradient problem. When the gradient is backpropagated in very deep networks, the repetitive multiplication tends to make the gradient vanishingly small. This can lead to the accuracy of deep networks saturating or even degrading once network convergence begins. The residual block uses skip or shortcut connections that skip one or more layers and create a parallel branch that reuses the activations from previous layers. By skipping layers, training time and the effect of the vanishing gradient are reduced, as the network uses fewer layers during initial training while preserving information as the input is fed through the layers.
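The skip connection is easiest to see in code; a minimal residual block sketch in PyTorch (the channel count is illustrative):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = F(x) + x: the shortcut lets the gradient bypass the convolutions."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(out + x)  # skip connection reuses the input activations

block = ResidualBlock(16)
print(block(torch.randn(1, 16, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```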
5.2.7.3 U-Net
The U-Net was also introduced in 2015 and has become one of the most widely used networks for automated image segmentation (Fig. 5.15) [7]. The name is derived from its U-like appearance when its neuron layers are shown in a graph format. Within U-Net, the input is progressively down-sampled (encoder) by a typical CNN architecture as described in the previous section and then up-sampled (decoder) by a series of transpose convolutional layers. By introducing additional skip-connections, features from the encoding network path are concatenated to features from the decoding network path, enabling high-resolution segmentations.

5.2.7.4 Generative Adversarial Networks
A generative adversarial network (GAN) consists of two competing neural networks that are trained simultaneously in a mini-max game to optimize a loss objective function [50]. A generative network, G, focuses on image generation; as an example, a noise image can be fed into G, and its output then competes against the original (real) input images presented to a classification network, D, which works on image discrimination. During training, D guides G to learn a translation of input images to realistic representations of the ground truth training data. D makes a binary prediction of whether the generated image is a true representation of the desired output or not, and feeds its prediction back to G to produce more accurate representations. Therefore, GANs and their conditional variant (cGANs) have been successfully applied in many image-to-image transformation tasks such as segmentation and image synthesis.
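One adversarial training step can be sketched as follows in PyTorch (toy fully connected G and D on flattened 28 × 28 images; GANs for medical imaging use convolutional networks, but the mini-max logic is the same):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 28 * 28), nn.Tanh())
D = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 28 * 28)   # stands in for a batch of real training images
fake = G(torch.randn(32, 16))     # G maps noise images to synthetic images

# Discriminator step: push D(real) towards 1 and D(fake) towards 0.
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: G tries to make D classify its output as real.
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```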
Fig. 5.15 The U-Net progressively down-samples an input (brain MRI) in an encoding path through multiple convolutional and pooling layers while increasing the number of feature representations. In a succeeding, symmetric decoding path, the features are up-sampled using up- or transpose convolutions to the original spatial size of the input. By concatenating features from the encoding to the decoding path through additional skip-connections, high-resolution segmentations are achieved. (Legend: Conv + AF (ReLU); (Max) Pooling; Up-Conv; Skip Connection)
Fig. 5.16 In the classification task, the network output is a discrete label or description of the object in the input image. In the localization task, the network outputs the exact location of the object, along with the class label describing the object detected
U-Net. The majority of CNNs developed for segmentation tasks are spin-offs of the U-Net architecture, such as its 3D variant, called V-Net, applying 3D convolutions [6]. GANs are also increasingly being used for automated segmentation of medical images, as they are showing great promise in overcoming a spatial continuity constraint identified in U-Net-based methods. They typically employ a U-Net-based generator network for creating the segmentation image, while the GAN's discriminator network could act as a shape regulator and increase the spatial consistency in the final segmentation image. Conventionally, pixel-wise or voxel-wise objective functions are used during training. Examples are (binary) cross entropy or variants of segmentation-based evaluation metrics such as the Sørensen–Dice Similarity Coefficient [56, 57] or the Jaccard Index [58].
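Both overlap metrics reduce to a few lines on binary masks; a minimal NumPy sketch with toy 3 × 3 masks:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Sørensen–Dice similarity between two binary segmentation masks."""
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

def jaccard_index(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union of two binary masks."""
    return np.logical_and(pred, truth).sum() / np.logical_or(pred, truth).sum()

pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
truth = np.array([[1, 1, 0], [0, 0, 0], [0, 0, 0]])
print(dice_coefficient(pred, truth))  # 0.8
print(jaccard_index(pred, truth))     # 0.666...
```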
While most segmentation strategies have been applied to single-modality images, there has been increasing interest in multi-modality segmentation, as this can provide different structural and functional information about the target simultaneously. For this purpose, the multi-modality images are usually fused as multi-channel inputs, with a single-channel segmentation map as the network output. For example, Wang et al. [59] and Zhou et al. [60] fused the four MRI modalities (T1-w, T1c-w, T2-w, and FLAIR images) from the BraTS dataset [61] as the multi-channel input to a CNN for brain tumor segmentation. Zhao et al. [62] fused PET and CT images as their multi-channel input to a 3D CNN for lung cancer segmentation and achieved higher accuracies than with the respective single-modality segmentations.

5.2.8.3 Registration
In medical image registration, a coordinate transformation is determined from one image and applied to another to spatially align both. CNNs employed for registration tasks often use traditional registration metrics, such as a similarity measure between two images from different imaging modalities, as the cost function during network training. Conditional GANs can also be used for medical image registration. In this case, the cGAN generator could determine the transformation parameters or the transformed image, while the discriminator classifies between aligned and unaligned image sets. Deep learning methods have been applied to various multi-modality image registration tasks, such as unsupervised affine and deformable image registration of CTs and MRIs [63–65]. These methods have been shown to be significantly faster than conventional image registration methods.
5.2.8.4 Image Synthesis
Deep learning networks have also shown their benefit in cross-modality image synthesis, where the desired image modality is either expensive or infeasible to acquire. For this, CNNs are used to convert an image acquired with one modality into an image of another. The pooling layer is usually absent in CNNs used for image synthesis, with down-sampling occurring through convolutions. GANs in particular have shown great promise in the field of synthetic image generation, producing realistic images in supervised and unsupervised cross-modality settings. GANs have been used to generate cross-modality medical images between MR, CT, and PET. While the majority of studies have focused on generating synthetic images of the brain (MR → CT [66]; CT → MR [67]; MR → PET [68]; PET → MR [69]), image synthesis using GANs has also been applied to musculoskeletal (MR → CT [70]), heart (MR → CT [71]), liver (CT → PET [72]), and lung (CT → MR [73]) images.

In image synthesis tasks, image quality is traditionally assessed by calculating the mean absolute error (MAE), mean squared error (MSE), (peak) signal-to-noise ratio ((P)SNR), or the structural similarity (SSIM) index between the generated and ground truth reference images. While MAE, MSE, and (P)SNR assess pixel-wise intensity differences, SSIM additionally measures contrast and structural differences between the synthetic and reference image [20].
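These pixel-wise metrics are a few lines of NumPy; a minimal sketch on toy images scaled to [0, 1] (in practice SSIM is usually taken from a library implementation such as skimage.metrics.structural_similarity):

```python
import numpy as np

def mae(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean(np.abs(a - b)))

def mse(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.mean((a - b) ** 2))

def psnr(a: np.ndarray, b: np.ndarray, data_range: float) -> float:
    """Peak signal-to-noise ratio in dB for a given intensity range."""
    return 10.0 * np.log10(data_range ** 2 / mse(a, b))

reference = np.random.rand(64, 64)                      # ground truth image
synthetic = reference + 0.05 * np.random.randn(64, 64)  # noisy "generated" image
print(mae(synthetic, reference), mse(synthetic, reference))
print(psnr(synthetic, reference, data_range=1.0))
```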
…ference on computer vision, 2016, p. 694–711. https://doi.org/10.1007/978-3-319-46475-6_43.
22. Isola P, Zhu JY, Zhou T, Efros AA. Image-to-image translation with conditional adversarial networks. In: Proceedings of the 30th IEEE conference on computer vision and pattern recognition, CVPR 2017, 2017, p. 5967–76. https://doi.org/10.1109/CVPR.2017.632.
23. Maximo A, Bhushan C. Conditional adversarial network for segmentation with simple loss function. In: Proceedings of the 27th annual meeting of the ISMRM, Montreal, Canada, vol. 4737, 2019.
24. Shrestha A, Mahmood A. Review of deep learning algorithms and architectures. IEEE Access. 2019;7:53040–65. https://doi.org/10.1109/ACCESS.2019.2912200.
25. Sun S, Cao Z, Zhu H, Zhao J. A survey of optimization methods from a machine learning perspective. IEEE Trans Cybern. 2020;50:3668–81. https://doi.org/10.1109/tcyb.2019.2950779.
26. Törn A, Zilinskas A. Global optimization. Berlin: Springer-Verlag; 1989. https://doi.org/10.1007/3-540-50871-6.
27. Heath MT. Scientific computing: an introductory survey, revised. 2nd ed. Philadelphia: Society for Industrial and Applied Mathematics; 2018.
28. Horst R, Pardalos PM. Handbook of global optimization. Boston: Springer; 1995. https://doi.org/10.1007/978-1-4615-2025-2.
29. Nocedal J, Wright SJ. Numerical optimization. Springer series in operations research and financial engineering. New York: Springer; 2006. https://doi.org/10.1007/978-0-387-40065-5.
30. Cauchy A-L. Méthode générale pour la résolution des systèmes d'équations simultanées. C R Hebd Seances Acad Sci. 1847;25:536–8.
31. Le QV, Ngiam J, Coates A, Lahiri A, Prochnow B, Ng AY. On optimization methods for deep learning. In: Proceedings of the 28th international conference on machine learning, 2011, p. 129–32.
32. Behera L, Kumar S, Patnaik A. On adaptive learning rate that guarantees convergence in feedforward networks. IEEE Trans Neural Netw. 2006;17:1116–25. https://doi.org/10.1109/TNN.2006.878121.
33. Yu C-C, Liu B-D. A backpropagation algorithm with adaptive learning rate and momentum coefficient. In: Proceedings of the 2002 international joint conference on neural networks, IJCNN'02 (Cat. No. 02CH37290). IEEE, 2002, p. 1218–23. https://doi.org/10.1109/IJCNN.2002.1007668.
34. Kingma DP, Ba JL. Adam: a method for stochastic optimization. In: 3rd international conference on learning representations (ICLR), San Diego, USA, 2015, p. 1–15.
35. Luo Z-Q. On the convergence of the LMS algorithm with adaptive learning rate for linear feedforward networks. Neural Comput. 1991;3:226–45. https://doi.org/10.1162/neco.1991.3.2.226.
36. Bottou L. Online learning and stochastic approximations. On-line Learn Neural Netw. 1998;17.
37. Sra S, Nowozin S, Wright SJ, editors. Optimization for machine learning. Cambridge: MIT Press; 2011.
38. Duda RO, Hart PE. Pattern classification and scene analysis. 1st ed. Boca Raton: Wiley; 1973.
39. Gonzalez RC, Woods RE, Eddins SL. Digital image processing using MATLAB. 3rd ed. Upper Saddle River: Pearson Prentice Hall; 2020.
40. Shapiro L. Computer vision and image processing. 1st ed. Boston: Academic Press; 1992.
41. Nagi J, Ducatelle F, Di Caro GA, Cireşan D, Meier U, Giusti A, Nagi F, Schmidhuber J, Gambardella LM. Max-pooling convolutional neural networks for vision-based hand gesture recognition. In: 2011 IEEE international conference on signal and image processing applications, ICSIPA, 2011, p. 342–7. https://doi.org/10.1109/ICSIPA.2011.6144164.
42. Boureau Y-L, Ponce J, LeCun Y. A theoretical analysis of feature pooling in visual recognition. In: Proceedings of the 27th international conference on machine learning, Haifa, Israel, 2010.
43. Guo T, Dong J, Li H, Gao Y. Simple convolutional neural network on image classification. In: IEEE 2nd international conference on big data analysis, 2017, p. 721–4. https://doi.org/10.1109/ICBDA.2017.8078730.
44. Emmert-Streib F, Yang Z, Feng H, Tripathi S, Dehmer M. An introductory review of deep learning for prediction models with big data. Front Artif Intell. 2020;3:1–23. https://doi.org/10.3389/frai.2020.00004.
45. Vincent P, Larochelle H, Bengio Y, Manzagol P-A. Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on machine learning, ACM, 2008, p. 1096–103.
46. Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res. 2010;11:3371–408.
47. An J, Cho S. Variational autoencoder based anomaly detection using reconstruction probability. Spec Lect IE. 2015;2(1):1–18.
48. Bhushan C, Yang Z, Virani N, Iyer N. Variational encoder-based reliable classification. In: IEEE international conference on image processing; 2020.
49. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. arXiv Prepr. arXiv:1512.03385v1, 2015, p. 1–17.
50. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial networks. arXiv Prepr. arXiv:1406.2661v1, 2014, p. 1–9.
51. Salakhutdinov R, Hinton G. Deep Boltzmann machines. J Mach Learn Res. 2009;5:448–55.
52. Lee J-G, Jun S, Cho Y-W, Lee H, Kim GB, Seo JB, Kim N. Deep learning in medical imaging: general overview. Korean J Radiol. 2017;18:570–84. https://doi.org/10.3348/kjr.2017.18.4.570.
53. Schlegl T, Seeböck P, Waldstein SM, Langs G, Schmidt-Erfurth U. F-AnoGAN: fast unsupervised anomaly detection with generative adversarial networks. Med Image Anal. 2019;54:30–44. https://doi.org/10.1016/j.media.2019.01.010.
54. Bermudez C, Plassard AJ, Davis LT, Newton AT, Resnick SM, Landman BA. Learning implicit brain MRI manifolds with deep learning. In: Proceedings of SPIE 10574, medical imaging 2018: image processing, vol. 56, 2018. https://doi.org/10.1117/12.2293515.
55. Küstner T, Liebgott A, Mauch L, Martirosian P, Bamberg F, Nikolaou K, Yang B, Schick F, Gatidis S. Automated reference-free detection of motion artifacts in magnetic resonance images. Magn Reson Mater Phys Biol Med. 2018;31:243–56. https://doi.org/10.1007/s10334-017-0650-z.
56. Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945;26:297–302. https://doi.org/10.2307/1932409.
57. Sørensen TJ. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biol Skr. 1948;5:1–34.
58. Jaccard P. Distribution de la Flore Alpine dans le Bassin des Dranses et dans quelques régions voisines. Bull la Société vaudoise des Sci Nat. 1901;37:241–72. https://doi.org/10.5169/seals-266440.
59. Wang G, Li W, Ourselin S, Vercauteren T. Automatic brain tumor segmentation using cascaded anisotropic convolutional neural networks. In: Lecture notes in computer science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10670 LNCS, 2018, p. 178–90. https://doi.org/10.1007/978-3-319-75238-9_16.
60. Zhou C, Ding C, Wang X, Lu Z, Tao D. One-pass multi-task networks with cross-task guided attention for brain tumor segmentation. IEEE Trans Image Process. 2020;29:4516–29. https://doi.org/10.1109/TIP.2020.2973510.
61. Menze BH, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, Burren Y, Porz N, Slotboom J, Wiest R, Lanczi L, Gerstner E, Weber MA, Arbel T, Avants BB, Ayache N, Buendia P, Collins DL, Cordier N, Corso JJ, Criminisi A, Das T, Delingette H, Demiralp Ç, Durst CR, Dojat M, Doyle S, Festa J, Forbes F, Geremia E, Glocker B, Golland P, Guo X, Hamamci A, Iftekharuddin KM, Jena R, John NM, Konukoglu E, Lashkari D, Mariz JA, Meier R, Pereira S, Precup D, Price SJ, Raviv TR, Reza SMS, Ryan M, Sarikaya D, Schwartz L, Shin HC, Shotton J, Silva CA, Sousa N, Subbanna NK, Szekely G, Taylor TJ, Thomas OM, Tustison NJ, Unal G, Vasseur F, Wintermark M, Ye DH, Zhao L, Zhao B, Zikic D, Prastawa M, Reyes M, Van Leemput K. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans Med Imaging. 2015;34:1993–2024. https://doi.org/10.1109/TMI.2014.2377694.
62. Zhao X, Li L, Lu W, Tan S. Tumor co-segmentation in PET/CT using multi-modality fully convolutional neural network. Phys Med Biol. 2019;64:015011 (15pp). https://doi.org/10.1088/1361-6560/aaf44b.
63. Balakrishnan G, Zhao A, Sabuncu MR, Dalca AV, Guttag J. An unsupervised learning model for deformable medical image registration. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, 2018, p. 9252–60. https://doi.org/10.1109/CVPR.2018.00964.
64. de Vos BD, Berendsen FF, Viergever MA, Sokooti H, Staring M, Išgum I. A deep learning framework for unsupervised affine and deformable image registration. Med Image Anal. 2019;52:128–43. https://doi.org/10.1016/j.media.2018.11.010.
65. Shan S, Yan W, Guo X, Chang EI-C, Fan Y, Xu Y. Unsupervised end-to-end learning for deformable medical image registration. arXiv Prepr. arXiv:1711.08608v2, 2018, p. 1–12.
66. Emami H, Dong M, Nejad-Davarani SP, Glide-Hurst CK. Generating synthetic CTs from magnetic resonance images using generative adversarial networks. Med Phys. 2018;45:3627–36. https://doi.org/10.1002/mp.13047.
67. Jin CB, Kim H, Liu M, Jung W, Joo S, Park E, Ahn YS, Han IH, Lee JI, Cui X. Deep CT to MR synthesis using paired and unpaired data. Sensors (Switzerland). 2019;19:1–19. https://doi.org/10.3390/s19102361.
68. Pan Y, Liu M, Lian C, Zhou T, Xia Y, Shen D. Synthesizing missing PET from MRI with cycle-consistent generative adversarial networks for Alzheimer's disease diagnosis. In: Frangi A, Schnabel J, Davatzikos C, Alberola-López C, Fichtinger G, editors. Medical image computing and computer assisted intervention – MICCAI 2018. Lecture notes in computer science, vol. 11072, 2018, p. 595–602. https://doi.org/10.1007/978-3-030-00931-1_52.
69. Choi H, Lee DS. Generation of structural MR images from amyloid PET: application to MR-less quantification. J Nucl Med. 2018;59:1111–7. https://doi.org/10.2967/jnumed.117.199414.
70. Hiasa Y, Otake Y, Takao M, Matsuoka T, Takashima K, Carass A, Prince JL, Sugano N, Sato Y. Cross-modality image synthesis from unpaired data using CycleGAN. In: Simulation and synthesis in medical imaging, SASHIMI 2018. Lecture notes in computer science, vol. 11037 LNCS, 2018, p. 31–41. https://doi.org/10.1007/978-3-030-00536-8_4.
71. Chartsias A, Joyce T, Dharmakumar R, Tsaftaris SA. Adversarial image synthesis for unpaired multi-modal cardiac data. In: Simulation and synthesis in medical imaging, SASHIMI 2017. Lecture notes in computer science, vol. 10557 LNCS, 2017. https://doi.org/10.1007/978-3-319-68127-6_1.
72. Ben-Cohen A, Klang E, Raskin SP, Soffer S, Ben-Haim S, Konen E, Amitai MM, Greenspan H. Cross-modality synthesis from CT to PET using FCN and GAN networks for improved automated lesion detection. Eng Appl Artif Intell. 2019;78:186–94. https://doi.org/10.1016/j.engappai.2018.11.013.
73. Jiang J, Hu Y-C, Tyagi N, Zhang P, Rimner A, Mageras GS, Deasy JO, Veeraraghavan H. Tumor-aware, adversarial domain adaptation from CT to MRI for lung cancer segmentation. In: Frangi A, Schnabel J, Davatzikos C, Alberola-López C, Fichtinger G, editors. Medical image computing and computer assisted intervention – MICCAI 2018. Lecture notes in computer science, vol. 11071 LNCS, 2018. https://doi.org/10.1007/978-3-030-00934-2_86.
74. Li T, Sahu AK, Talwalkar A, Smith V. Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag. 2020;37:50–60. https://doi.org/10.1109/MSP.2020.2975749.
75. Yang Q, Liu Y, Chen T, Tong Y. Federated machine learning: concept and applications. ACM Trans Intell Syst Technol. 2019;10:1–19. https://doi.org/10.1145/3298981.
76. Gentry C, Boneh D. A fully homomorphic encryption scheme. Ph.D. Diss., Stanford University, 2009. https://doi.org/10.5555/18349540.
77. Cheng K, Fan T, Jin Y, Liu Y, Chen T, Yang Q. SecureBoost: a lossless federated learning framework. arXiv Prepr. arXiv:1901.08755v1; 2019.
78. Kairouz P, McMahan HB, Avent B, Bellet A, Bennis M, Bhagoji AN, Bonawitz K, Charles Z, Cormode G, Cummings R, D'Oliveira RGL, El Rouayheb S, Evans D, Gardner J, Garrett Z, Gascón A, Ghazi B, Gibbons PB, Gruteser M, Harchaoui Z, He C, He L, Huo Z, Hutchinson B, Hsu J, Jaggi M, Javidi T, Joshi G, Khodak M, Konečný J, Korolova A, Koushanfar F, Koyejo S, Lepoint T, Liu Y, Mittal P, Mohri M, Nock R, Özgür A, Pagh R, Raykova M, Qi H, Ramage D, Raskar R, Song D, Song W, Stich SU, Sun Z, Suresh AT, Tramèr F, Vepakomma P, Wang J, Xiong L, Xu Z, Yang Q, Yu FX, Yu H, Zhao S. Advances and open problems in federated learning. arXiv Prepr. arXiv:1912.04977v1, 2019, p. 1–105.
79. Xu G, Li H, Liu S, Yang K, Lin X. VerifyNet: secure and verifiable federated learning. IEEE Trans Inf Forensics Secur. 2020;15:911–26. https://doi.org/10.1109/TIFS.2019.2929409.
Part II
Clinical Applications
6 Imaging Biomarkers and Their Meaning for Molecular Imaging
Angel Alberich-Bayarri, Ana Jiménez-Pastor, and Irene Mayorga-Ruiz
Contents
6.1 Introduction 83
6.2 Imaging Biomarkers, Paradigm Shift in Medical Imaging 84
6.3 Imaging Biomarkers in Hybrid Molecular Imaging 85
References 86
Fig. 6.1 Stepwise development of imaging biomarkers (idea → proof of concept → proof of mechanism → image acquisition → image processing → image analysis → value) to convert a clinical idea into value for clinical practice. The AI section refers to the components that can be improved with the use of convolutional neural networks (CNN), image processing, and image analysis steps
quantitative data in databases for posterior data mining and scientific research in imaging biomarkers. As an example, although the technology is already there [1], pipelines that automatically detect the lesions in lymphoma, extract their SUV values as well as their metabolic tumor volume (MTV), and store them in a structured report in the PACS are still not available today.

In this chapter, we introduce the concept of the imaging biomarker and explain the main characteristics of the development process and validation, to finally detail how the process can be applied in hybrid modalities, where it is highly relevant to combine the spatial information with the functional one.

• Detection imaging biomarkers: used as a tool to find high levels of a specific measure in a tissue or organ that can indicate the presence of a disease.
• Diagnostic imaging biomarkers: used as a tool for the identification of the specific disease suffered by the patient.
• Staging imaging biomarkers: used as a tool for grading of the disease severity or extent.
• Predictive/prognostic imaging biomarkers: used as a tool to forecast the progression of the disease and its potential relapse.
• Follow-up imaging biomarkers: used as a tool for monitoring treatment response and disease progression in the patient.
extract the biomarker must be technically adequate (signal-to-noise ratio, spatial resolution, contrast-to-noise ratio, uniformity, among others). The following preprocessing step aims to improve the image quality before the analysis (with techniques such as filtering, interpolation, registration, movement correction, and segmentation). Segmentation is one of the processes that has been significantly improved with the use of artificial intelligence approaches such as the application of convolutional neural networks (CNN). The development of network architectures such as U-Net has permitted the segmentation of organs and structures, clearly outperforming traditional computer vision algorithms [8]. The analysis and modeling of the signal is the process by which the quantitative or objective information is extracted from the images. This information can represent structural or functional properties of the tissue. Those imaging biomarkers that can be calculated voxel-wise allow for the representation of the spatial distribution in parametric maps, defined as derived (secondary) images in which the value of a specific parameter is placed as the pixel value. In general, imaging biomarkers have specific measurement units; however, due to the nature of the calculation process, some parameters may be measured in arbitrary units (a.u.). This is the case for radiomics features or parameters such as the fractal dimension. An additional layer of multivariate post-processing applied to the imaging biomarkers allows for the combination of the most relevant features into indicators representing disease status that can be plotted in new parametric images called nosological maps.

Measurements of imaging biomarkers in specific lesions or tissues must be optimized to the physiological phenomena under study. A clear example is the conventional approach in the measurement of SUV, consisting of the extraction of the maximum value (SUVmax) of the region (instead of the average, median, or other histogram descriptors). Automation and AI can allow for the seamless extraction of a wide variety of measurements for a specific imaging biomarker beyond the conventional ones. An exploratory example in molecular imaging that is demonstrating important evidence with respect to outcome in lymphoma patients consists of the extraction of metabolic heterogeneity from lesions, beyond the maximum values of SUV that are the current standard of care [9]. Finally, after the technical process for the extraction and measurement of the imaging biomarker is clear, a pilot test in the way of a Proof of Principle must be performed in a controlled cohort of subjects to evaluate potential biases related to sex, age, or others. This also serves as a preliminary validation of the method. Comprehensive proofs of efficacy and effectiveness on external, larger, and well-characterized series of subjects will show the ability of a biomarker to really measure (even if in a surrogate manner) the clinical endpoint.

6.3 Imaging Biomarkers in Hybrid Molecular Imaging

The imaging biomarkers that can be extracted in molecular imaging are related to the imaging modalities used in the examination. Generally speaking, the imaging biomarkers that can be extracted from the molecular imaging components of the modality (see Table 6.1, considering only those based on PET) are the standardized uptake value (SUV), related to the metabolic activity; the metabolic tumor volume (MTV), which is related to the size of the metabolic region within the lesion; the total lesion glycolysis (TLG), derived from the multiplication of the MTV by the average metabolic activity; and the delta (Δ) value, which calculates the difference in a given imaging biomarker between two specific time-points in the longitudinal course of the disease. Finally, lesion heterogeneity can be characterized both in the anatomical-structural component of the modality, that is, the CT or the MR images, and in the PET component. For the structural or metabolic heterogeneity estimation of a lesion, different textural (radiomics) features can be extracted by the use of standard first-order histogram analysis or more advanced second-order techniques: gray level co-occurrence matrix (GLCM), gray level run-length matrix
Table 6.1 Most relevant imaging biomarkers in molecular imaging, the objective of their quantification, and specific units

Objective | Modality | Imaging biomarker | Units
Metabolic activity | PET/CT & PET/MR | Standardized uptake value (SUV) | a.u.
Tumoral burden | PET/CT & PET/MR | Metabolic tumor volume (MTV) | mL
Tumoral burden + metabolic activity | PET/CT & PET/MR | Total lesion glycolysis (TLG) | g
Change in metabolic activity | PET/CT & PET/MR | Delta-SUV (ΔSUV), averaged or voxel-wise | a.u.
Lesion heterogeneity | CT, MR, PET/CT, & PET/MR | Textures (radiomics) | a.u.
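The PET biomarkers in Table 6.1 can be illustrated with a short sketch (toy SUV values; the fixed 2.0 threshold used to delineate the metabolic region is purely illustrative, as fixed or percentage-of-SUVmax thresholds vary in practice):

```python
import numpy as np

# Toy PET lesion: SUV values on a voxel grid, with the voxel volume in mL.
suv = np.array([[0.5, 2.4, 3.1],
                [0.8, 4.2, 2.9],
                [0.4, 1.1, 0.6]])
voxel_volume_ml = 0.5
lesion = suv >= 2.0                       # threshold-based metabolic region

suv_max = suv[lesion].max()               # conventional SUVmax (a.u.)
mtv_ml = lesion.sum() * voxel_volume_ml   # metabolic tumor volume (mL)
tlg_g = mtv_ml * suv[lesion].mean()       # TLG = MTV x average SUV (g)
print(suv_max, mtv_ml, tlg_g)             # 4.2, 2.0, 6.3
```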
Contents
7.1 Introduction 87
7.2 Classification 89
7.3 Segmentation 90
7.4 Detection and Localization 93
7.5 Applications of ML and DL in Molecular Imaging 94
7.6 Internal Department Applications 99
7.7 A Glance at Tomorrow 101
7.8 Workforce; Redundancy, Displacement, Transformation, and Opportunity 104
7.9 Summary 105
References 107
perform a task without explicit instruction, the artificial neural network (ANN) being the main platform to do so. An early example was the application of a 15-node ANN in 1993 evaluating 28 input features in ventilation perfusion lung scans against experienced physicians [1]. In molecular imaging, feature extraction and radiomic feature extraction can be integrated into ML and DL algorithms based on big data to enhance precision nuclear medicine, but this requires clinically validated models (Fig. 7.1). In visual or image-based DL, the convolutional neural network (CNN) is designed for and tasked with four basic operations: classification/object recognition, classification/localization, object detection, and instance segmentation (Fig. 7.2).

The role of AI in the general community, medicine broadly, and specifically in molecular imaging sparks considerable debate. Anecdotally, molecular imaging folk sit in one of several AI camps. There are those that believe AI will displace human resources, producing professional anarchy (dystopians), in contrast to those that think AI will improve our ability to perform our jobs, improve outcomes, and free up time from menial tasks to provide better patient care (utopians). There are also optimists who think AI is exciting and may emerge to improve our systems (poised to be fast followers), pessimists who think AI is hype or a hoax designed to raise revenue (skeptics), and realists thinking AI is a crucial part of the landscape but who also understand not everyone will be expert. In lower numbers there are also a few conspiracy theorists claiming AI is just another tool being used by the government to spy on or control us, those who worry about the emergence of AI and doubt their ability to assimilate into an AI-augmented world (metathesiophobics), and those that fear relinquishing control if AI diverts some perceived power, control, or attention from those performing amazing things without AI to those breaking new ground with AI (narcissists).

Much of the disconnection comes from a lack of understanding. AI is part of molecular imaging now and will be a growing part tomorrow.
Fig. 7.1 Schematic representation of the semantic evaluation of imaging data, the addition of radiomic feature extraction and ANN analysis to produce small data, and the potential to integrate with big data to enhance outcomes and drive precision nuclear medicine. (Reprinted with permission [2])
Fig. 7.2 Schematic representation of the difference between image recognition tasks in DL
Individuals need to upskill, not so they can conduct ML or DL research projects, but so they can implement developments and devise strategy for clinical AI from an informed position. Careful consideration needs to be given to how AI will assimilate into mainstream clinical practice. Advanced planning needs invested stakeholders who have ownership in the plan and technology. AI may see the emergence of new roles, the recrafting of some responsibilities and, potentially, role redundancy. Organizational change management is critical to manage these possibilities. This implementation needs to be done within an ethical and legal framework [3]. At the same time, improving the language used in this space may decrease misunderstanding and associated antagonism. This means recognizing AI is not new to nuclear medicine, being precise about using ML or DL instead of AI when appropriate, diverging from generalized use of terms like AI in preference for more precise and more meaningful terms like "engineered learning" or "intelligent imaging", and recognizing that AI is neither artificial nor intelligent.

7.2 Classification

Classification is an interesting problem to solve in molecular imaging. Suppose we have a simple situation where there are two features of interest. These are depicted in Fig. 7.3. For simplicity, the data is represented in a two-dimensional plot, but obviously in molecular imaging the data is significantly more dense and may be in four dimensions (three-dimensional space and time). Clear boundaries between data distributions are not always obvious. Support vector machines use vectors (purple arrows) to determine the line that best separates the known classifications. Consider this data the training set (blue and green circles) with grounded truth labels. If we introduce a new unclassified data point (yellow dot), then we classify it as a blue dot because it lies below the line. This approach, including linear regression approaches, may not work as well if the boundaries between classifications are not as obvious or linear. Another approach is K-nearest-neighbors, where K represents the number of nearest neighbors considered in the classification. As shown in Fig. 7.4, the purple circles are centered on the new unclassified data point. Using K = 1, the single nearest neighbor would see the new data point classified as blue. Using K = 9, more data points are considered, random chance is averaged, and now the yellow dot would be classified as green (6 green and 3 blue in the purple nearest neighbor circle). Clustering methods adopt an iterative approach that begins with random assignment of class to data points and determination of the geometric center of each cluster. The second iteration applies a new classification based on position relative to the geometric centers of the first iteration's clusters, and uses the new points to adjust the geometric centers.
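The K = 1 versus K = 9 behavior can be reproduced with scikit-learn; a minimal sketch on toy two-feature data (not molecular imaging measurements):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two features per study; labels 0 ("blue") and 1 ("green") from a toy rule.
rng = np.random.default_rng(0)
X = rng.random((40, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
new_point = [[0.52, 0.49]]                # the unclassified "yellow dot"

for k in (1, 9):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, clf.predict(new_point))      # the class can flip between K values
```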
Fig. 7.3 Schematic representation of linear approaches to classification with the purple arrows representing the vector
for separation and the new data point (yellow) being classified based on which side of the fit line it is located
Fig. 7.4 Schematic representation of nearest neighbor approaches to classification with the purple circles representing
the number of neighbors included in the calculation for the new data point (yellow)
Fig. 7.5 Schematic representation of nonlinear approaches to classification using neural analysis for separation and the new data point (yellow) being classified based on which side of the fit line it is located. The red dashed line represents the position of the fit line with overfitting
7.3 Segmentation

There are many approaches to segmentation. Here, several of the more common approaches encountered in molecular imaging segmentation are discussed. As already mentioned, thresholding is a very simple way to segment an image. This might be windowing or truncating the count scale (color scale) on nuclear medicine images or switching between windows (bone/soft tissue) on CT. This is referred to as region-based segmentation or threshold segmentation and uses the individual pixel values (the numerical value that represents a color, count density, attenuation, or other value). Setting a specific threshold (or more than one threshold) allows segmentation of pixels based on their position relative to the threshold (above or below). Images that contain high contrast have differences between these values that can be exploited (Fig. 7.7). A global threshold is used to segment the image into two partitions: the object or structure of interest and the background. Multiple local thresholds can be used to segment multiple objects of interest from background.
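A minimal sketch of global threshold segmentation on a toy count matrix (the threshold value is illustrative):

```python
import numpy as np

def threshold_segment(image: np.ndarray, threshold: float) -> np.ndarray:
    """Global threshold: partition pixels into object (1) and background (0)."""
    return (image > threshold).astype(int)

counts = np.array([[ 5,  7, 90],
                   [ 6, 85, 95],
                   [ 4,  6,  8]])
print(threshold_segment(counts, 50))
# [[0 0 1]
#  [0 1 1]
#  [0 0 0]]
```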
Edge detection segmentation is a convolutional process. The image is segmented based on the edges between different parts of the image. These partitions or edges may represent a change in contrast, count density, or color. Discontinuity within an image identifies an edge (e.g., the edge of myocardium and ventricular lumen, or a tumor compared to surrounding tissue). Using a filter or kernel that enhances the edges of data on horizontal planes (Fig. 7.8), combined with a similar kernel for the vertical plane, allows contouring between objects.

Clustering methods are the same approach as outlined for classification. Clustering is an iterative approach that begins with random assignment of class to data points and determination of the geometric center of each cluster. The second iteration applies a new classification based on position relative to the geometric centers of the first iteration's clusters, and uses the new points to adjust the geometric centers of the clusters. This iterative process continues until the geometric centers of the clusters no longer change. Data points are then classified based on proximity to the final geometric centers of the clusters. Each cluster might be represented as a different gray scale or color, or some clusters may be eliminated from the image (Fig. 7.9).

The last approach discussed here is the DL approach referred to as Mask R-CNN or instance segmentation. It is not the only DL approach and is used commonly in social media (it has its origins in Facebook). The R represents region, signifying object detection. The mask aspect differentiates this approach from other R-CNNs by adding in parallel a convolution branch that employs a region of interest; in essence, a small convolutional network is applied to each region of interest (Fig. 7.10). While threshold segmentation is fast and simple, there may be no significant boundary or, indeed, there may be overlap between partitions. Edge
Fig. 7.7 Schematic representation of a threshold segmentation partitioning an image into regions above or below a
predefined threshold
Fig. 7.8 Schematic representation of an edge detection segmentation partitioning an image into regions based on identifying the edges between objects within the image. The kernel (here a Sobel horizontal filter, slid over a 5 × 5 input image with a stride of 1) is applied in a weighted fashion to each pixel to create the convolution image, in this case for the horizontal edge. (Reprinted with permission [4])
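The convolution in Fig. 7.8 can be sketched with SciPy (a minimal sketch; the kernel below is the signed form of the Sobel horizontal filter, and its transpose serves as the vertical-plane kernel):

```python
import numpy as np
from scipy.ndimage import convolve

sobel_h = np.array([[ 1,  2,  1],
                    [ 0,  0,  0],
                    [-1, -2, -1]])        # horizontal-edge kernel

image = np.random.rand(64, 64)            # stands in for an image slice
edges_h = convolve(image, sobel_h)        # horizontal-edge response
edges_v = convolve(image, sobel_h.T)      # vertical-edge response
magnitude = np.hypot(edges_h, edges_v)    # combined edge strength for contouring
```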
Fig. 7.9 Schematic representation of a cluster-based segmentation partitioning an image into regions based on K-means
with clusters identified by color or eliminated from the image
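A sketch of the intensity-based clustering described above using scikit-learn's KMeans (the cluster count and the cluster kept in the mask are arbitrary choices):

```python
import numpy as np
from sklearn.cluster import KMeans

# Cluster pixel intensities into 3 classes; each pixel is assigned to the
# cluster whose final geometric center it is closest to.
image = np.random.rand(64, 64)
labels = KMeans(n_clusters=3, n_init=10).fit_predict(image.reshape(-1, 1))
segmented = labels.reshape(image.shape)   # cluster index per pixel
mask = segmented == 0                     # e.g., keep one cluster, drop the rest
```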
Fig. 7.10 Schematic representation of instance segmentation (red bounding boxes) using mask R-CNN
7.4 Detection and Localization

• Image classification predicts the class of an object in an image.
• Object localization uses a bounding box and defined spatial parameters to locate an object.
• Object detection determines the presence of objects in an image and applies a class label.

Computer aided detection (CADe) is a system for the detection of objects on medical images and consists of four main steps: segmentation of the region of interest, detection of the object of interest, analysis of object features, and classification against potential false positives (Fig. 7.11 excluding the green box). Computer aided diagnosis (CADx) is a system that extracts the image features and uses a classifier to predict what the object of interest is (Fig. 7.11).

7.5 Applications of ML and DL in Molecular Imaging

The research applications of AI, ML, and DL in molecular imaging are growing quickly. The opportunities and applications can be divided into several broad categories: potential clinical applications and physics applications. Physics and instrumentation applications include attenuation correction from pseudo-CT [5–9], scatter correction [10], motion correction, image reconstruction [11–13], co-registration, low dose imaging [14–17], noise reduction [18], and radiation dosimetry [19, 20]. Much of the recent clinical literature relates to CNNs and DL offering potential solutions in automated disease detection [21], classification [22, 23], triage, segmentation [24], guided therapy [25], and assisted diagnosis [26].
Fig. 7.11 (Panel labels: image classification, object detection, object class, segmentation)
It is important to recognize that AI and ML also have significant benefits to clinical practice without the use of CNNs and DL. At a rudimentary level, an ANN can be used in parallel to conventional statistical approaches to glean deeper insights into features or combinations of features that provide the greatest predictive power [27, 28]. Despite the breadth of AI, ML, and DL applications published in the literature, there are few that have transitioned to general implementation in clinical practice. This, in part, relates to the regulatory framework for software as a medical device (SaMD). Among the US FDA approved SaMDs are the following molecular imaging-related AI applications or AI platforms:

• Aidoc BriefCase-PE is a CNN algorithm for analysis of CTPA to triage based on the probability of pulmonary embolism, with reported sensitivity of 91% and specificity of 90%.
• AI-Rad Companion-Cardiovascular is a CNN algorithm for segmentation and coronary calcium scoring of the heart.
• cNeuro cMRI is a CNN algorithm for annotation, segmentation, and quantitation of neurological MRI.
• Arterys Cardio DL is an AI platform for post-processing analysis and quantitation of cardiac MRI.
• HealthCCS is an AI algorithm for calculating cardiac risk based on coronary artery plaque calcification on CT.
• IB Neuro is an AI algorithm for post-processing image registration of serial brain MRI with generation of parametric perfusion maps.
• Icobrain is an AI pipeline for annotation, segmentation, and quantitation of serial brain MRI.
• NeuroQuant is an AI platform for annotation, segmentation, and quantitation of brain MRI.
• Quantib Brain provides an AI driven platform for MRI segmentation, quantitation, and classification.
• SubtlePET is image processing software for data management and noise reduction for PET scans.

Among the emerging AI applications in molecular imaging, those associated with auto-contouring and segmentation, radiomic feature extraction, triage and second reporters, attenuation correction, reconstruction, dose reduction, and radiation dosimetry are perhaps the most important and most likely to transition to more widespread clinical utility.

An important area of development for molecular imaging is in pseudo-CT attenuation maps (Fig. 7.12) that could reduce radiation dose. There are a number of limitations in estimating an attenuation map from MRI for SPECT/MRI or PET/MRI hybrid systems. CNNs may overcome the limitations of maximum likelihood reconstruction of activity and attenuation (MLAA) and provide accurate attenuation maps without transmission studies. Hwang et al. [5] evaluated deep CNNs to produce an attenuation map that closely modeled the CT-based grounded truth, and this was supported in PET/MRI using a deep neural network [6]. Torrado-Carvajal et al. [7] integrated the Dixon method with a CNN to generate pseudo-CT for pelvic PET/MRI scans with less than 2% variation from the CT map. A deep CNN combined with zero-echo-time Dixon pseudo-CT was also used to produce more accurate attenuation maps than traditional MRI pseudo-CT methods [8]. DL approaches can produce pseudo-CT attenuation maps from the sinogram of 18F-FDG brain PET with less than 1% error reported over CT [9].

Despite the advances associated with iterative reconstruction algorithms, there have been a number of positive reports of CNN/DL-based reconstruction approaches in the literature. Zhu et al. [11] employed a deep neural network to produce reconstructed data direct from the sinogram of brain MRI and PET data with less noise and artefact. Haggstrom et al. [12] used a DL encoder-decoder CNN on PET data (Fig. 7.13) to reconstruct higher quality images compared to iterative and backprojection methods.
Fig. 7.12 Model for potentially using a CNN for improved pseudo-CT attenuation correction in PET/MRI. (Adapted and reprinted with permission [4])
Fig. 7.13 (Diagram: a DL encoder-decoder CNN mapping a PET sinogram to a reconstructed slice; spatial sizes step down through 288², 144², 72², 36², 18² in the encoder and back up through 26², 44², 75² to 128² in the decoder, with convolution kernels of 7², 5², and 3²)
The root mean square error was 11% lower than for ordered subset expectation maximization (OSEM) and 53% lower than filtered backprojection (FBP). The DL approach also produced a better structural similarity index and signal-to-noise ratio. Jiao et al. [13] adopted a CNN approach using the back-projection image of the sinogram data as the input tensor to reconstruct 18F-choline and 18F-florbetapir brain PET images with faster processing times. There remains much work to be done in this space.

An important consideration in medical imaging, and for nuclear medicine specifically, is dose reduction. A number of dose reduction strategies have been employed to maintain image quality and diagnostic integrity but with lower doses administered to patients. This addresses not only the issues of radiation dose and safety but also the sustainable use of scarce and expensive resources. With the emergence of hybrid imaging technology, dose reduction where feasible is critical. A number of advances have facilitated dose reduction, including more sensitive detector systems and improved reconstruction algorithms. CNNs and DL may also play a role in dose reduction, and this is an important domain for DL focus. Indeed, there are two concepts to consider: dose reduction to minimize the dose without compromising the quality of imaging, and dose optimization focused on calculating the ideal dose for diagnostic or therapeutic outcomes. Xu et al. [14] adopted a coder-decoder architecture similar to that described in Fig. 7.14, except the inputs are multiple low count PET slices. They reported superior image quality for reconstructing ultra-low dose PET through the encoder-decoder CNN than standard dose using conventional reconstruction techniques. Similar approaches have been reported by several authors using T1-weighted MRI. Kaplan and Zhu [15] used a CNN to reduce the noise associated with low dose PET scans and reported a final result with performance metrics comparable to the grounded truth (full dose scan). Ouyang et al. [16] used an encoder-decoder generative adversarial network (GAN) in amyloid brain PET to train low dose (1%) scans down-sampled from list mode data against the 100% reconstruction as grounded truth. Outside the brain, Lei et al. [17] employed a cycle-consistent GAN to estimate whole-body PET images from low count data following inverse transformation. The method improved the mean error and normalized mean square error from 5.6% and 3.5% to −0.1% and 0.5%, respectively.
Fig. 7.14 Schematic representation of the encoder-decoder GNN in the U-net architecture used with CT or MRI to denoise PET images
Excluding low dose PET and SPECT scans, molecular images are generally noisy. CNNs and DL can be used to reduce the noise in nuclear medicine images. One approach is to use an encoder-decoder architecture with a built-in graph neural network (GNN) module (Fig. 7.14). A CT or MRI can be used as an input and an iterative process undertaken comparing the loss function after each epoch until stop criteria are satisfied. Cui et al. [18] employed a process similar to this using CT or MRI to reduce noise on PET scans. The contrast-to-noise ratio (CNR) was superior to that of other methods of noise reduction (iterative reconstruction, Gaussian filtering).

CNNs and DL are used in radiation therapy for auto-contouring, auto-planning, and decision support to optimize treatment outcomes and better manage radiation dosimetry. It makes sense that radionuclide therapy and theranostics adopt CNN/DL approaches to optimize patient dose and dosimetry to target tissues versus non-target tissues. An area of particular interest in radiation dosimetry is associated with 177Lu-lutate and 177Lu-PSMA therapy. Post therapy, 177Lu allows gamma imaging for whole-body distribution and dosimetry calculations. This data can subsequently be used to measure tumor burden during therapy, dose burden to non-target tissues, and also optimization of subsequent rounds of radionuclide therapy. A trained CNN could not only automate dosimetry calculations but could also reduce the error for individual tissue calculations compared to population-based estimations. Indeed, there is potential to train a CNN against the original 68Ga PET scan and the serial 177Lu gamma distributions to provide dosimetry estimates first from the 68Ga PET (allowing immediate optimization of the therapy dose), then corrected based on the first 177Lu gamma image. This field is not advancing very fast because the first step is to develop rigorous CNN approaches for multiple lesion detection and segmentation. Zhao et al. [19] developed a U-net based deep CNN for automatic characterization of lesions on 68Ga PSMA PET/CT and calculation of tumor burden, with the intention of further developing the algorithm for optimizing radionuclide therapy. Precision was reported to be 99% in bone lesions and 94% in lymph nodes, but segmentation accuracy was lower than detection. Jackson et al. [20] trained a CNN to automatically contour kidney regions for radiation dose estimation in radionuclide therapy with 177Lu PSMA. While no differences were seen in the dosimetry estimations associated with manual versus CNN regions, automation improves the time cost. Nonetheless, the study also revealed some confounding for the CNN based on anatomical or pathological anomalies of the renal system (e.g., polycystic kidneys). With developments focused on foundations, Fig. 7.15 provides a schematic of processes that may be on the horizon.
Fig. 7.15 Schematic representation of an encoder-decoder GNN in the U-net architecture that could be used to develop dosimetry-based optimization of therapy doses of 177Lu based on the 68Ga PET/CT, with serial 177Lu gamma images at 4, 24, and 96 h informing kinetics and dosimetry estimations and adjustment of the therapy dose to optimize therapy. (PSMA images courtesy of [29])
With respect to clinical applications, an ANN was trained against six expert nuclear cardiologists to provide superior 17-segment defect scoring in myocardial perfusion scans [21]. In a multicenter trial [22], a deep CNN produced a statistically significant improvement over total perfusion defect scores. The report provided an insight into how AI outcomes could be integrated into conventional image display using a polar map display (Fig. 7.16). ML has also been used to predict major adverse cardiac events (MACE) on myocardial perfusion SPECT with superiority over expert readers and automated quantitative software [23].

Unsupervised DL was used on 18F-FDG PET to differentiate Alzheimer's disease and was able to identify abnormal patterns in 60% of studies classified as normal by expert visualization [26]. DL has also been successfully used to identify nasopharyngeal carcinoma patients most likely to benefit from induction chemotherapy on PET/CT [25]. DL on quantitative SPECT/CT has provided automated volume of interest segmentation on CT that can then be applied to the SPECT data for calculation of glomerular filtration rate [24]. There is a diverse array of emerging DL and CNN based literature in clinical molecular imaging, including radiomic feature extraction and segmentation on PET or PET/CT in a variety of tumors (e.g., lung, head/neck), brain studies (e.g., Parkinson's, beta amyloid, and 18F-FDG Alzheimer's), myocardial perfusion studies (SPECT and PET), and the thyroid. There is a very diverse array of clinical applications of DL producing a rapidly growing body of literature.

7.6 Internal Department Applications

There are also opportunities for data rich departments to train an ANN or CNN for a specific internal purpose [30, 31]. This could produce internally valid algorithms that can reliably perform the prescribed task to enhance internal processes. Clearly, commercialization of these algorithms would require navigation of regulatory frameworks associated with data sharing, privacy, and security, yet the major barrier would be local bias in the data [3]. There may also be specific parameter or equipment biases in the algorithm unique to the developing department that do not hold when parameters or equipment change. Changing the acquisition or reconstruction parameters is also likely to produce variations in performance of the trained algorithm. Over and above these technical specifics, there is likely to be a local population bias that threatens the external validity of a trained algorithm.
Fig. 7.16 Prediction of CAD with integration of DL outputs into polar maps provides an insight into how AI outcomes will be integrated into radiomic outputs. (Reprinted with permission [23])
approach to segmentation, risk scoring of individual lesions, and mapping total disease burden might be helpful for patients presenting for evaluation of metastatic spread to bone (Fig. 7.19). While these mock examples provoke ideas of what is potentially possible, the value is immediately transparent. The emergence, for example, of parametric images in PET offers a perfect opportunity to incorporate AI driven outputs into image display.

7.7 A Glance at Tomorrow

AI, ML, and DL today provide the opportunity to improve efficiency and improve efficacy [2, 31, 33, 34]. Fully realized, tomorrow this capability has the potential to optimize patient management and drive precision nuclear medicine. This may see AI initiatives move from segmentation and classification to fully integrated tools in theranostics, image guided therapy, and radiation dosimetry. Harnessed properly, DL affords the tools for improving outcomes, reducing radiation dose burden, and enhancing precision medicine. Integration of the images and radiomic features of current AI applications (e.g., myocardial perfusion SPECT software) to include DL predictions integrated into the reporting display will become the norm across all procedures undertaken in nuclear medicine (Fig. 7.16).

Increasingly, AI will play an important role in patient management and business administration. Consider the possibilities for improved outcomes of, for example, a patient presenting to nuclear medicine for 68Ga PSMA and 177Lu PSMA, where facial recognition software not only identifies the patient and registers them in the clinic, but also retrieves the patient's medical records and previous imaging as they walk through the waiting room door. DL algorithms automatically evaluate all previous scans, segmenting critical organs and target tissues, individualizing the diagnostic radiopharmaceutical dose to optimize the image quality as a trade-off against radiation dosimetry. DL/GAN based iterative reconstruction with segmentation and radiomic feature extraction would include auto-mapping all lesions in the patient's series.
Fig. 7.18 Mock summary output for a CNN-based risk algorithm for pulmonary embolism using low-dose CT and perfusion SPECT mismatch. The coronal and digital slices represent two different patients; one with mismatch consistent with a high likelihood of pulmonary embolism (left) and the other with a matching defect associated with lower likelihood of pulmonary embolism (right). (Reprinted with permission [32])
Fig. 7.19 Mock summary output for a CNN-based risk algorithm for skeletal metastases with probability classification for various outcomes and risk assessment for individual lesions. (Reprinted with permission [32])
An ML algorithm might evaluate radiomic inputs and other patient records and personalize the therapeutic approach. ML and DL algorithms built to model the specific insight and expertise of specialists (from anywhere in the world) could provide expert second reader systems for image reporting. A DL/GAN algorithm could co-register whole-body PET with gamma camera scans used to image therapy distribution, to segment and extract radiomic features and determine dosimetry.

Back-office operations could also be substantially streamlined. From the point of conception that a particular imaging study or therapy may be clinically useful for a particular patient, an integrated AI system could
assess the application of the proposed procedure against the patient's medical record, including clinical histories, laboratory values, and prior imaging, and compare with the available literature and appropriate use criteria (AUC) recommendations. Furthermore, such a system could recommend alternate imaging or therapy as justified by the clinical scenario. In addition to the potential enhancement to clinical care, such an upstream integration of AI technology would facilitate the interface between health systems and payers, both governmental and third-party, where AUC and medical need statements are being increasingly employed. AI technology focused on aiding medical justification and rationale would greatly improve the experience for referring clinicians, decreasing the time spent on administrative activities. The possibilities are limitless, but there are significant barriers to overcome.

7.8 Workforce: Redundancy, Displacement, Transformation, and Opportunity

Perhaps the greatest speculation, hysteria, and resistance around AI in radiology has concerned its impact on the workforce. At one extreme are the doomsday predictors foreshadowing the extinction of radiologists as a species, while at the opposite end of the spectrum lie those who deny the emerging capability of AI and see no role for it in radiology. The reality lies across a broad central band, depending on a variety of factors relating to work function.

The speculation around the impact of AI on the role of radiologists is perplexing and warrants discussion and consideration in relation to nuclear medicine. At best, CNN and DL programs provide fantastic triage and second reader systems that support the physician, improve efficiency, and decrease error rates. But the judgment of the physician remains essential. In some ways, automating some of the more menial tasks makes better use of a physician's time for the skill set they have trained extensively to acquire. Automation of menial tasks in nuclear medicine has been rolled out over many decades (e.g., auto-contours for region of interest identification) without a sense of doom associated with employment displacement or redundancy.

Concurrently, there has been very little discussion about the impact of AI on the technologist or physicist. It is entirely conceivable that an AI system could be designed that simply requires a "concierge" to direct the patient to the X-ray room, threatening the role of the radiographer. The nature of higher order imaging procedures in nuclear medicine represents a deep moat and high wall protecting the responsibilities of the nuclear medicine technologist from AI automation. Nonetheless, image analysis and reconstruction will have an increasing AI presence, and many of the radiopharmacy responsibilities may be automated where there is potential for robotic AI. Perhaps more importantly, the triage capability of AI is a direct threat to the role of technical staff providing interim reports.

In nuclear medicine, the emergence of the capabilities of ML and DL will challenge the patient care paradigm and drive a shift toward improved patient care (and outcomes) and greater satisfaction amongst physicians. The paradigm shift is unlikely to have any significant impact on the role and responsibilities of the nuclear medicine physician. Efficiencies created by ML and DL are more likely to have a direct impact on nuclear medicine technologists and scientists/physicists who take responsibility for data curation and stewardship. Here, there is potential for role expansion beyond current roles in PACS administration/management to new roles in
data management and data science. Indeed, the workforce of a department may expand rather than contract, with an increased research and development footprint (Table 7.1).

Table 7.1 Hypothetical nuclear medicine department workforce and the probable changes to workforce structure associated with a deep assimilation of AI

Role/position      | Number in traditional department | Workforce transformation in fully immersive AI department
Physician          | 4   | 4
Physicist          | 2   | 1
Data Scientist     | 0   | 2
Radiopharmacist    | 1   | 1
Technologist       | 9.5 | 8.5
PACS/Data Manager  | 0.5 | 1.5
Nurse              | 2   | 2
Research Fellow    | 2   | 4
PhD Candidates     | 2   | 6

A critical consideration as one imagines a future nuclear medicine department with fully integrated AI technology is the importance of the human element. Although medicine is in large part a science, there also exist elements of art and culture. Dispassionate and data-driven decisions may be laudable in the intellectual sense, but these terms are seldom used to describe the ideal way to interact with patients. Rather, terms such as caring, empathetic, and understanding paint the picture of the trusted physician or healthcare team member. Illustrative of this is the increasing focus on patient experience factors in the assessment of healthcare systems, which in some cases are beginning to lead to financial rewards or penalties. In fact, patients themselves may be significant drivers in the appropriate adoption of AI enhancements into practice. Consider the scenario in which an AI algorithm is brought online to aid in the evaluation of cardiac perfusion scans. Patient A may come to the conclusion that their hospital is using new technology to help the doctors make the best decisions to enhance their outcomes, consistent with precision medicine. Patient B may come to the conclusion that the hospital is merely using computer programs to save themselves time and effort, undermining the nature of personalized medicine. From a system perspective, the viewpoint of Patient A is obviously much more preferable: to be seen as a patient-centered system layering the latest technologies over a core of personalized care.

As a result of all these factors, AI may reshape the nuclear medicine workforce, but workforce changes themselves could be minor and redundancy rare. Uncertainty may be fueled by the memory of the emergence of digital technology, which saw the redundancy of the dark room technician. For AI, however, displacement of work functions in nuclear medicine is less likely. Yet AI is a disruptive technology, and its impact on clinical and research practice may see those with AI capability or literacy displace those without.

7.9 Summary

ML and DL are rich tools for evaluating the large volume of radiomic features extracted from molecular imaging data sets. Moreover, ML and DL can be valuable in identifying those radiomic features that should be used alone or in combination in decision making. ML and DL have the capability to uncover relationships amongst features and outcomes that may not be apparent in the standard combination of semantic reporting (Fig. 7.20). While ML and DL are unlikely to cause job redundancy, there is an opportunity to enhance patient outcomes, reporting accuracy, and efficiency. In particular, AI and DL are part of the inventory required to capitalise on high data density image sets, enhance image quality, extract abstract features, and allow advances in radiation dosimetry.
Fig. 7.21 Schematic representation of the role of big data, radiomics, and AI (ML and DL) in enhancing precision medicine
11. Zhu B, Liu JZ, Cauley SF, Rosen BR, Rosen MS. Image reconstruction by domain-transform manifold learning. Nature. 2018;555(7697):487–92.
12. Haggstrom I, Schmidtlein CR, Campanella G, Fuchs TJ. DeepPET: a deep encoder-decoder network for directly solving the PET image reconstruction inverse problem. Med Image Anal. 2019;54:253–62.
13. Jiao J, Ourselin S. Fast PET reconstruction using multi-scale fully convolutional neural networks; 2017.
14. Xu J, Gong E, Pauly J, Zaharchuk G. 200x low-dose PET reconstruction using deep learning; 2017.
15. Kaplan S, Zhu Y-M. Full-dose PET image estimation from low-dose PET image using deep learning: a pilot study. J Digit Imaging. 2019;32(5):773–8.
16. Ouyang J, Chen KT, Gong E, Pauly J, Zaharchuk G. Ultra-low-dose PET reconstruction using generative adversarial network with feature matching and task-specific perceptual loss. Med Phys. 2019;46(8):3555–64.
17. Lei Y, Dong X, Wang T, et al. Whole-body PET estimation from low count statistics using cycle consistent generative adversarial networks. Phys Med Biol. 2019;64(21):215017.
18. Cui JN, Gong K, Guo N, et al. PET image denoising using unsupervised deep learning. Eur J Nucl Med Mol Imaging. 2019;46(13):2780–9.
19. Zhao Y, Gafita A, Vollnberg B, et al. Deep neural network for automatic characterization of lesions on 68Ga-PSMA-11 PET/CT. Eur J Nucl Med Mol Imaging. 2020;47:603–13. https://doi.org/10.1007/s00259-019-04606-y.
20. Jackson P, Hardcastle N, Dawe N, Kron T, Hofman MS, Hicks RJ. Deep learning renal segmentation for fully automated radiation dose estimation in unsealed source therapy. Front Oncol. 2018;8:215. https://doi.org/10.3389/fonc.2018.00215. eCollection 2018.
21. Nakajima K, Kudo T, Nakata T, Kiso K, Kasai T, Taniguchi Y, Matsuo S, Momose M, Nakagawa M, Sarai M, Hida S, Tanaka H, Yokoyama K, Okuda K, Edenbrandt L. Diagnostic accuracy of an artificial neural network compared with statistical quantitation of myocardial perfusion images: a Japanese multicenter study. Eur J Nucl Med Mol Imaging. 2017;44(13):2280–9.
22. Betancur J, Hu LH, Commandeur F, Sharir T, Einstein AJ, Fish MB, Ruddy TD, Kaufmann PA, Sinusas AJ, Miller EJ, Bateman TM, Dorbala S, Di Carli M, Germano G, Otaki Y, Liang JX, Tamarappoo BK, Dey D, Berman DS, Slomka PJ. Deep learning analysis of upright-supine high-efficiency SPECT myocardial perfusion imaging for prediction of obstructive coronary artery disease: a multicenter trial. J Nucl Med. 2019;60(5):664–70.
23. Betancur J, Otaki Y, Motwani M, Fish MB, Lemley M, Dey D, Gransar H, Tamarappoo B, Germano G, Sharir T, Berman DS, Slomka PJ. Prognostic value of combined clinical and myocardial perfusion imaging data using machine learning. JACC Cardiovasc Imaging. 2018;11(7):1000–9. https://doi.org/10.1016/j.jcmg.2017.07.024. Epub 2017 Oct 18.
24. Park J, Bae S, Seo S, Park S, Bang JI, Han JH, Lee WW, Lee JS. Measurement of glomerular filtration rate using quantitative SPECT/CT and deep-learning-based kidney segmentation. Sci Rep. 2019;9(1):4223. https://doi.org/10.1038/s41598-019-40710-7.
25. Peng H, Dong D, Fang MJ, Li L, Tang LL, Chen L, Li WF, Mao YP, Fan W, Liu LZ, Tian L, Lin AH, Sun Y, Tian J, Ma J. Prognostic value of deep learning PET/CT-based radiomics: potential role for future individual induction chemotherapy in advanced nasopharyngeal carcinoma. Clin Cancer Res. 2019;25(14):4271–9. https://doi.org/10.1158/1078-0432.CCR-18-3065.
26. Choi H, Ha S, Kang H, Lee H, Lee DS. Deep learning only by normal brain PET identify unheralded brain anomalies. EBioMedicine. 2019;43:447–53. https://doi.org/10.1016/j.ebiom.2019.04.022. Epub 2019 Apr 16.
27. Currie G, Iqbal B, Kiat H. Intelligent imaging: radiomics and artificial neural networks in heart failure. J Med Imaging Radiat Sci. 2019;50(4):571–4.
28. Currie G, Sanchez S. Topical sensor metrics for 18F-FDG positron emission tomography dose extravasation. Radiography. 2020;27:178–86.
29. Violet J, Jackson P, Ferdinandus J, Sandhu S, Akhurst T, Iravani A, Kong G, Kumar A, Thang S, Eu P, Scalzo M, Murphy D, Williams S, Hicks R, Hofman M. Dosimetry of 177Lu-PSMA-617 in metastatic castration-resistant prostate cancer: correlations between pretherapeutic imaging and whole-body tumor dosimetry with treatment outcomes. J Nucl Med. 2019;60:517–23.
30. Currie G. Intelligent imaging: anatomy of machine learning and deep learning. J Nucl Med Technol. 2019;47(4):273–81.
31. Currie G, Hawk KE, Rohren E, Vial A, Klein R. Machine learning and deep learning in medical imaging: intelligent imaging. J Med Imaging Radiat Sci. 2019;50(4):477–87.
32. Currie G. Intelligent imaging: developing a machine learning project. J Nucl Med Technol. 2021;49(1):44–8.
33. Uribe C, et al. Machine learning in nuclear medicine: part 1—introduction. J Nucl Med. 2019;60:451–6.
34. Nensa F, Demircioglu A, Rischpler C. Artificial intelligence in nuclear medicine. J Nucl Med. 2020;60:29S–37S.
8 Imaging Biobanks for Molecular Imaging: How to Integrate ML/AI into Our Databases

Contents
8.1 Introduction
8.2 Imaging Biobanks in Molecular Imaging
8.3 Bioethical Issues
8.4 Proposed Architecture
References
The images generated in radiology and nuclear medicine are not only pictures, but quantitative data, provided in the form of imaging biomarkers that can be derived from the digital images acquired in an individual using modalities such as computed tomography (CT), magnetic resonance imaging (MRI), X-rays, and ultrasound, and, related to the topic of this chapter, also positron emission tomography (PET) and single-photon emission computed tomography (SPECT), as well as hybrid modalities (PET/CT and PET/MRI) [4, 5].

There are several technological solutions for the creation of biobanks for medical imaging in general that can be perfectly adapted to the molecular imaging space [6]. Image processing algorithms for molecular imaging have emerged to cover unmet clinical needs, but their application to clinical routine in an optimized manner is still not straightforward, since it requires frequent manual interaction. Furthermore, standalone software and other solutions have mainly been addressed to provide quantitative analysis tools on a patient-specific basis, but not to populate databases for posterior scientific research and data mining in imaging biomarkers. As an example, although the technology is available, pipelines like quantifying the metabolic tumor volume (MTV) or total lesion glycolysis (TLG) of lymphoma lesions, storing the obtained results in the PACS, and obtaining their metabolic heterogeneity in a seamless way in clinical routine are still not available today.

Artificial Intelligence, Machine Learning, and more specifically the use of convolutional neural networks (CNN), also called Deep Learning, have allowed for the development of AI models that might help to streamline organ segmentation, lesion detection, and quantification processes [7–9]. Despite the high number of research initiatives on deep learning, the integration of AI models in clinical routine requires the accomplishment of regulatory and technical challenges.

From the regulatory perspective, the model needs to be cleared as a Medical Device product by relevant organisms such as the Food and Drug Administration (FDA) and the notified bodies clearing the CE mark for Medical Devices on behalf of the European Commission. This regulatory clearance requires comprehensive validation studies, including multi-center data and large cohorts of patients in which the algorithm obtains excellent performance.

With regard to seamless integration into the current information technology (IT) infrastructures existing in hospitals, AI modules should run within a software platform that is interoperable with the current healthcare information systems (i.e., understanding standards such as DICOM communication with PACS, HL7 messaging, and XML) and that incorporates automated execution of analysis pipelines.

This chapter addresses the main specifications for the creation of imaging biobanks for molecular imaging, as well as the strategies for the integration of AI/ML models to streamline data extraction from the images.

8.2 Imaging Biobanks in Molecular Imaging

Imaging biobanks are not formed exclusively by images but also by associated data in the form of imaging biomarkers that can be extracted from them after the application of the appropriate image processing techniques. Imaging biomarkers are defined as characteristics extracted from the images of an individual that can be objectively measured and act as indicators of a normal biological process, a disease, or a response to a therapeutic intervention. Imaging biomarkers are complementary to conventional radiological readings, either to detect a specific disease or lesion, quantify its biological situation, evaluate its progression, stratify phenotypic abnormalities, or assess the treatment response [6, 10–12]. An illustrative example of a derived imaging biomarker in molecular imaging is the calculation of the standardized uptake value (SUV), which depends on the images, the injected radiotracer dose, and the patient weight. Therefore, in the design of imaging biobanks for the molecular imaging field, it is of utmost importance to collect information about the patient preparation and the doses of radiotracer that might be needed for
the calculation of imaging biomarkers (Fig. 8.1). Although the DICOM standard is designed to incorporate specific characteristics of the patient and of the examination in the metadata, much of the information needed for molecular imaging analysis is not included.
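To make the dependence of the SUV on such metadata concrete, the following Python sketch computes a body-weight SUV map for a single PET slice read with pydicom. It is a simplified illustration only: it assumes the vendor stores decay-corrected activity concentration in Bq/mL after applying the rescale slope and intercept, and it omits the residual-dose and decay-to-scan-start corrections a production implementation would need.

    import numpy as np
    import pydicom

    def suv_bw(ds: pydicom.Dataset) -> np.ndarray:
        """Body-weight SUV for one PET slice (simplified sketch)."""
        # Injected dose and patient weight come from the DICOM metadata.
        info = ds.RadiopharmaceuticalInformationSequence[0]
        injected_dose_bq = float(info.RadionuclideTotalDose)  # Bq
        weight_g = float(ds.PatientWeight) * 1000.0           # kg -> g
        # Convert stored pixel values to activity concentration (Bq/mL).
        activity = ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
        # SUVbw = concentration (Bq/mL) / (dose (Bq) / weight (g)).
        return activity * weight_g / injected_dose_bq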
Quantitative imaging biomarkers should also be linked to other information from the patient, such as next generation sequencing (NGS) data, proteomics, and blood test data, as well as clinical information [2].

With regard to the subjects and associated pathologies being registered in the imaging biobank, two different strategies exist: population-based imaging biobanks, which are created to collect data from the general population with the purpose of identifying risk factors in the development of specific diseases and helping in their early detection; and disease-oriented imaging biobanks, which are developed to collect multi-omics data from patients affected by a specific disease.

The main specifications of an imaging biobank platform include:

• Integration: The imaging biobank software platform should be adapted to current healthcare information systems (i.e., understanding standards like DICOM communication with PACS, XML, and HL7 messaging) and structured in conventional cells-and-tissues biobank data formats (Minimum Information About Biobank Data Sharing, MIABIS).
• Modularity: structured in different components (medical image visualization, inference of image analysis algorithms and AI models, database search engines and data mining capabilities, reports generator, back-end, front-end) and layers able to work as a whole infrastructure.
• Scalability: infrastructure ready to grow with peaks of demand on either the storage or the computing front through elastic architectures, allowing new storage units or servers to be brought up when increased demand exists.
• Accessibility: The imaging biobank should be built in a client-server approach so as to be reachable from any place by simply using a web browser.
• Vendor-agnostic: The imaging biobank should be able to manage images and data from any manufacturer.
• Inference of AI models and algorithms: The imaging biobank architecture should allow the integration of scripts or software components developed by researchers in languages such as Python or R and embedded in Docker containers, in order to apply analysis pipelines to the data.
• Data mining: The infrastructure should allow for Big Data management and scientific exploitation.

8.3 Bioethical Issues

Biobanks must preserve the human and legal rights of each person that offers biomaterial for research [15]. Data privacy and security is a key factor to consider in the creation of imaging biobanks. Recent initiatives in medical imaging research, such as the big consortiums on AI and medical imaging, include an open data policy in their data management plans. As an example, the European Commission aims to accelerate Europe's innovation capacity through data sharing and by following the principle of open access to research results. The availability of open, high-quality, and large-scale imaging biobanks and processing facilities in terms of data, services, and resources will radically simplify access to knowledge, improve interoperability and standardization, and will help consolidate the medical imaging research community and foster multi-disciplinary collaboration at the European level [16]. One of the keys to success in the European medical research and innovation field is to find the compromise between ensuring that medical and scientific network collaboration is not hindered while keeping a strict and high level of information security. Biomedical imaging will become one of the major data producers, and researchers working in this domain will have to face the burden of data management and analysis within shared imaging biobanks [16].

The General Data Protection Regulation (EU) 2016/679 (GDPR) is a regulation in EU law on data protection and privacy in the European Union (EU) and the European Economic Area (EEA) and represents one of the most comprehensive and strict legal guidelines existing worldwide. The GDPR sets the definitions of, and establishes the difference between, pseudonymization and anonymization of personal data, which is summarized in Fig. 8.2.

Fig. 8.2 Difference between pseudonymization and anonymization as per the General Data Protection Regulation (GDPR)

All the data incorporated into biobanks in general require the approval of the research project by an ethics committee and the corresponding informed consent in which the patient confirms whether they accept to participate in a research program. In the case of observational non-interventional projects, mainly retrospective studies based on data collection for storage in a biobank, the informed consent can be waived by the ethics committee.

Molecular images from modalities such as PET or SPECT, as in other medical imaging modalities, are stored in DICOM format, combining the pixel data component (the image) and the associated information (the metadata). A standard process in research projects and in the creation of imaging biobanks is the appropriate pseudonymization or anonymization of the images. An example of pseudonymization consists of assigning a code to the patient information that can be linked afterwards with the real patient identity (the code assignment information is stored in the source hospital and can only be linked by the healthcare professionals managing the patient). An example of anonymization consists of completely deleting all patient information in the image metadata and in the file folders, so that the images cannot be linked back to the patient identity by any person. There exists controversy on whether the images themselves are considered personal data or not, since it could be argued that these are unique and different for every patient. Nevertheless, taking into account that the effort required to identify an individual from an image is disproportionate (taking the anonymized image
and applying a brute-force correlation algorithm against the identified images of a hospital to find the match) due to the combination of steps that would need to be undertaken, most current legal advisors exclude anonymized medical images from being considered personal data. Care has to be taken when managing high-resolution medical images of the head, with modalities such as CT or MRI, since 3D reconstructions allow visualization of the face of the patient and may eventually identify them [17]. The best approach in this case is to apply facial blurring or removal techniques before storage in an imaging biobank.
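To make the distinction concrete, here is a minimal Python sketch of both operations using pydicom. The tag list is an illustrative subset, not a complete de-identification profile such as the one defined in the DICOM standard, and the key store stands in for whatever secured mapping the source hospital maintains.

    import pydicom

    # Illustrative subset of directly identifying attributes.
    IDENTIFYING_TAGS = ["PatientName", "PatientID", "PatientBirthDate"]

    def pseudonymize(ds: pydicom.Dataset, code: str, key_store: dict) -> pydicom.Dataset:
        # The code -> identity mapping stays at the source hospital only,
        # so re-identification remains possible for authorized staff.
        key_store[code] = {tag: str(ds.get(tag, "")) for tag in IDENTIFYING_TAGS}
        for tag in IDENTIFYING_TAGS:
            if tag in ds:
                setattr(ds, tag, code if tag != "PatientBirthDate" else "")
        return ds

    def anonymize(ds: pydicom.Dataset) -> pydicom.Dataset:
        # No key is kept anywhere: the link to the patient cannot be restored.
        for tag in IDENTIFYING_TAGS:
            if tag in ds:
                delattr(ds, tag)
        return ds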
8.4 Proposed Architecture

Imaging biobanks are not simple collections of medical images associated with patient data. In fact, architectures for the creation of medical imaging and also molecular imaging biobanks must incorporate advanced high performance computing capabilities where medical images, metadata, and other information associated with the images can be used for imaging biomarker extraction [18]. They must also allow analysis of the features extracted from medical images at a population level (radiomics), aiming to find prospective disease biomarkers and to combine them with other molecular biology and genomics data (radiogenomics) [19].

One of the aspects that must be clearly defined before the low-level architecture definition is the data ingestion and flow, checking whether manual and automated uploads will be allowed and the strategy for making the biobank accessible to worldwide researchers. Imaging biobanks originate in research projects in which a group of partners participate, and images are mainly managed in a pseudonymized domain. Once the biobank has been created and populated with data, it is usually released to the public domain after complete anonymization (Fig. 8.3).

The architecture of a molecular imaging biobank is organized in three different layers: the front-end, the back-end services, and the data persistence layer. The following components can be found in software platforms used for imaging biobanks such as Quibim Precision® (Quibim SL, Valencia, Spain) (Fig. 8.4):
Fig. 8.3 Data entry and flow in the creation of an imaging biobank
Fig. 8.4 Architecture of the Quibim Precision® (Quibim SL, Valencia, Spain) software platform, used for the creation of imaging biobanks
• Front-end layer: This layer is directly exposed to the final user to interact with the software using the web user interface. Users can access this user interface typically through a URL that opens the web application.
• Back-end layer: This layer is built from different services that process requests from the user interface and external applications, and handle the interaction between the services. At this stage, three main components are present:
  – Imaging biobank platform back-end: This component handles all the requests related to the front-end and serves the code of the application. Besides, it provides common data to other services such as molecular imaging analysis algorithms and external applications.
  – Automated data ingestion: This component handles the connection between the imaging biobank platform and the local repositories (PACS) using the DICOM protocol.
  – Job scheduler: This service is used to schedule the execution of image analysis modules. The platform needs to incorporate a simple and flexible orchestrator (Nomad, Kubernetes) to deploy and manage image analysis algorithms in containers (Docker).
• Persistence layer: This layer is used to persist the non-volatile information of the software, that is, the data. It is composed of:
  – Multi-omics database: a relational SQL or non-relational NoSQL database where the application persists the structured information.
  – File storage: the application persists all the files (imaging studies, results, configuration files, ...) in a local or cloud repository.

The image analysis modules that can be implemented and integrated within an imaging biobank platform architecture must include all the image processing and quantification steps desired to meet the clinical and research needs. The programming language of these algorithms will vary depending on the expertise of the developers, although the three most frequently used languages in the field of molecular imaging are Python, Matlab, and R.

As an illustrative example, given a molecular imaging biobank in which a specific project is dedicated to managing diffuse large B-cell lymphoma cases, including the PET/CT examinations together with the associated clinical data, an image analysis pipeline can be focused on applying a systematic analysis methodology to a batch of examinations (Fig. 8.5). The image analysis pipeline is embedded in a Docker container and integrated within the imaging biobank platform. A sketch of the quantification step of such a pipeline is shown below.
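As a minimal sketch, the following Python code computes the metabolic tumor volume (MTV) and total lesion glycolysis (TLG) of one examination from an SUV volume, using a fixed SUV threshold for lesion segmentation. The threshold value, array layout, and the `load_suv_volume` loader are illustrative assumptions; a production module would add lesion-wise connected-component analysis and quality control.

    import numpy as np

    def mtv_tlg(suv: np.ndarray, voxel_volume_ml: float, threshold: float = 4.0):
        """MTV (mL) and TLG from an SUV volume via fixed thresholding."""
        mask = suv > threshold                   # simple fixed-threshold segmentation
        mtv_ml = mask.sum() * voxel_volume_ml    # metabolic tumor volume
        tlg = suv[mask].sum() * voxel_volume_ml  # TLG = MTV x SUVmean
        return mtv_ml, tlg

    # Batch processing over the examinations of a biobank project
    # (`load_suv_volume` is a hypothetical loader returning the SUV
    # array and the voxel volume for one examination):
    # for exam in project_examinations:
    #     suv, vox_ml = load_suv_volume(exam)
    #     print(exam, mtv_tlg(suv, vox_ml))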
Fig. 8.5 Molecular imaging analysis pipeline dedicated to the estimation of metabolic tumor volume of individual
lesions detected as well as quantification of histogram and textural properties to measure metabolic heterogeneity
9 Artificial Intelligence/Machine Learning in Nuclear Medicine

Contents
9.1 Introduction
9.2 Classification
9.2.1 Alzheimer's Disease
9.2.2 Parkinson's Disease
9.3 Segmentation
9.4 Image Generation and Processing
9.5 Low-Dose Imaging
References
performs the calculation on the pixels of each ROI or VOI. Manual, semi-automatic, and automatic methods can be used to draw a region or volume. Although accurate, manual drawing is time-consuming, operator dependent, and less reproducible. Conversely, accurate region segmentation by automatic drawing must be guaranteed in each patient for robustness and reliability of the data analysis.

Different from traditional image analysis, machine learning, as a subset of artificial intelligence, finds patterns through big data. Based on the training data, it builds a mathematical model to make predictions. A learning method can be unsupervised, semi-supervised, or supervised. Supervised learning requires labeled data to find the pattern, whereas unsupervised learning uses unlabeled data, and semi-supervised learning needs a small labeled dataset and a large unlabeled dataset. A machine learning model is trained using a large number of input data with high reproducibility to extract features of clinical significance. After extraction, feature selection removes unnecessary features to reduce the training time and the possibility of overfitting, and to avoid dimensionality issues. Then, a classifier algorithm such as a support vector machine, random forest, or artificial neural network is applied to map the features for the classification of disease. A minimal sketch of such a pipeline is shown below.
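The following scikit-learn sketch illustrates the sequence just described, feature selection followed by a classifier; the feature matrix `X` and binary labels `y` are placeholders for a table of radiomic features, and the selector and classifier choices are examples rather than recommendations.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 100))   # placeholder: 200 patients x 100 radiomic features
    y = rng.integers(0, 2, size=200)  # placeholder: binary disease labels

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Feature selection reduces training time and the risk of overfitting;
    # the classifier then maps the retained features to the disease class.
    model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), SVC())
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))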
As a part of machine learning, deep learning consists of artificial neural networks with multiple convolutional layers and nodes. Unlike traditional machine learning, deep learning performs the feature extraction and learning by itself. For feature extraction and transformation, the techniques of deep learning are based on a cascade of multiple layers of nonlinear processing units. High-quality data and labels are most important to train and test deep learning models. A dataset is typically composed of training, validation, and test sets. The training data are used to train the network: the loss function calculates loss values in the forward propagation, and the learnable parameters are updated via backpropagation. The validation data are used to fine-tune hyper-parameters, and the test data to evaluate the performance of the model. This chapter will focus on artificial intelligence used for neuroimaging in nuclear medicine, including classification of diseases, segmentation of ROIs or VOIs, denoising, image reconstruction, and low-dose imaging.

9.2 Classification

9.2.1 Alzheimer's Disease

Alzheimer's disease (AD) is a neurodegenerative disease characterized by a decline in cognitive function. It mostly affects older people, so the prevalence of AD is increasing with the growth of the elderly population. Early diagnosis of AD before the symptoms become severe is of utmost clinical importance since it may provide opportunities for effective treatment. 18F-FDG PET/CT is one of the most useful modalities to support the clinical diagnosis of dementia, including AD. It shows changes in the glucose metabolism of the brain over the various disease entities related to dementia with high sensitivity and specificity. In patients with AD, a reduction of glucose metabolism is expected, starting from the mesial temporal and posterior cingulate cortex (PCC) and extending to the lateral temporal, inferior parietal, and prefrontal regions, which helps diagnosis [5].

Deep learning methods have been studied for the evaluation of patients with AD. Several auto-encoders with multi-layered neural networks to combine multimodal features have been applied for AD classification [6]. In a study with a stacked auto-encoder to extract high-level features of multimodal ROIs and an SVM classifier, the proposed method was 95.9%, 85.0%, and 75.8% accurate for AD, MCI, and MCI-converter diagnosis, respectively, using the ADNI dataset [7]. Recently, CNN methods with 2D or 3D volume data of PET/CT or MRI scans have been applied for AD classification [8–11]. In 2D CNN models, the features from specific slices of axial, coronal, and sagittal scans were concatenated and used for AD classification. Using MRI volume data, skull stripping and gray matter segmentation were performed, and the slices with gray matter information were used as CNN model input. Compared to 2D CNN models, studies have used 3D volume
data with promising results. Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) MRI dataset without skull-stripping preprocessing, Hosseini-Asl et al. built a deep 3D convolutional neural network (3D-CNN) upon a convolutional auto-encoder, which was pre-trained to capture anatomical shape variations in structural brain MRI scans for the source domain [8]. Then, the fully connected upper layers of the 3D-CNN were fine-tuned for each task-specific AD classification in the target domain. The proposed 3D deeply supervised adaptable CNN outperformed several proposed approaches, including the 3D-CNN model, other CNN-based methods, and conventional classifiers, in accuracy and robustness. Liu et al. used cascaded convolutional neural networks (CNNs) to learn the multi-level and multimodal features of MRI and PET brain images for AD classification [10]. In this method, multiple deep 3D-CNNs were applied on different local image patches to transform the local brain image into more compact high-level features. Then, an upper high-level 2D-CNN followed by a softmax layer was cascaded to ensemble the high-level features and generate the latent multimodal correlation features for the classification task. Finally, a fully connected layer followed by a softmax layer combined these learned features for AD classification. Without image segmentation and rigid registration, the method could automatically learn the generic multi-level and multimodal features from multiple imaging modalities for classification. With an ADNI MRI and PET dataset from 397 subjects, including 93 AD patients, 204 with mild cognitive impairment (MCI; 76 MCI converters + 128 MCI non-converters), and 100 normal controls (NC), the proposed method demonstrated promising performance, with an accuracy of 93.26% for classification of AD vs. NC and 82.95% for classification of MCI converters vs. NC.

Although studies have shown that various deep learning methods are effective for AD classification, the model performance on external validation compared to the training dataset is an issue to be resolved. In fact, the qualities and properties of medical images can be affected by the image-acquisition environment, including the imaging acquisition system, acquisition protocol, reconstruction method, etc. Therefore, there is a need for models with enhanced generalization performance to improve the clinical utility of a proposed method. In a recent study using FDG PET/CT, instead of 3D volume data, slice-selective learning using a BEGAN-based model was constructed to solve the above (Fig. 9.1) [9]. The model was trained with an ADNI dataset and then underwent external validation with the authors' own dataset. A range was set to cover the most important AD-related regions, and the most appropriate slices for classification were searched. The model learned the generalized features of AD and NC for external validation when appropriate slices were selected. The slice range that covered the PCC using double slices showed the best performance. The accuracy, sensitivity, and specificity were 94.33%, 91.78%, and 97.06% using their own dataset, and 94.82%, 92.11%, and 97.45% using the ADNI dataset. The performance on the two independent datasets showed no statistical difference. The study showed the feasibility of a model with consistent performance when tested using datasets acquired from a variety of image-acquisition environments.

Despite the remarkable diagnostic accuracy of deep learning, the correlation between the features extracted by a deep learning model and diseases is hard to explain. Several studies have proposed methods to address this problem by providing the feature map and the input data responsible for the prediction. The class activation map (CAM) has been widely used to understand where the deep learning model looks when assigning classes and to explain how deep learning models predict their outputs [12–14]. Choi et al. used a CAM method, which can generate a heat map of the probability of AD, to demonstrate the brain regions that the CNN model evaluated for AD with decreased cognitive function [15]. However, CAM-based interpretation should be treated with caution, because deep learning models may classify diseases by regions that cannot be explained by existing knowledge.
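For intuition, the original CAM computation is a weighted sum of the final convolutional feature maps, with weights taken from the fully connected layer of the predicted class. The NumPy sketch below assumes a network with global average pooling before the final linear layer; `feature_maps` and `fc_weights` are placeholder arrays extracted from such a model.

    import numpy as np

    def class_activation_map(feature_maps: np.ndarray,
                             fc_weights: np.ndarray,
                             class_idx: int) -> np.ndarray:
        """CAM for one class.

        feature_maps: (C, H, W) output of the last convolutional layer.
        fc_weights:   (num_classes, C) weights of the final linear layer
                      applied after global average pooling.
        """
        # Weighted sum over channels -> (H, W) activation map.
        cam = np.tensordot(fc_weights[class_idx], feature_maps, axes=([0], [0]))
        cam = np.maximum(cam, 0)          # keep class-positive evidence
        return cam / (cam.max() + 1e-8)   # normalize to [0, 1] for display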
Fig. 9.1 Architecture of slice-selective learning for Alzheimer's disease classification using a GAN network
SBR and circularity: 0.995, AUC for circularity only: 0.990, and AUC for SBR: 0.973).

FDG PET/CT is also actively used for the evaluation of patients with parkinsonism, especially for the differentiation between idiopathic PD and atypical parkinsonism [29]. Wu et al. used a support vector machine to classify PD patients and NC using radiomics features on 18F-FDG PET [21]. The proposed method showed that the accuracy of classification between PD and NC was 90.97 ± 4.66% and 88.08 ± 5.27% in the Huashan and Wuxi test sets, respectively. In addition, several studies showed that deep learning methods were also effective for classification between PD patients and NC [30, 31]. Zhao et al. developed a 3D deep residual CNN for automated differential diagnosis of idiopathic PD (IPD) and atypical parkinsonism (APD) [30]. With a dataset from 920 patients, including 502 IPD patients, 239 multiple system atrophy (MSA) patients, and 179 progressive supranuclear palsy (PSP) patients, the proposed method demonstrated a performance of 97.7% sensitivity, 94.1% specificity, 95.5% PPV, and 97.0% NPV for the classification of IPD, versus 96.8%, 99.5%, 98.7%, and 98.7% for the classification of MSA, and 83.3%, 98.3%, 90.0%, and 97.8% for the classification of PSP, respectively.

9.3 Segmentation

Although the sensitivity of PET/CT is usually much higher than that of conventional structural imaging such as CT or MRI, it is considered difficult to extract anatomical information from PET/CT because anatomical structures are not well distinguishable in the low-resolution images of PET/CT [32]. So far, there are limited studies segmenting anatomical structures on PET images using deep learning methods, especially in diseases related to the brain. A 3D U-Net-shaped CNN has been used to segment cerebral gliomas on F-18 fluoroethyltyrosine (18F-FET) PET [33]. Of the deep learning methods, the generative adversarial network (GAN) model has received great attention due to its ability to generate data without explicitly modeling probability density functions. It has been applied to many tasks with excellent performance, such as image-to-image translation, semantic segmentation, and resolution translation from low to high [34]. In particular, GAN models have been promising in the field of segmentation. Of the PET/CT studies, only one study has applied the pix2pix framework of GAN to segment normal white matter (WM) on 18F-FDG PET/CT [35]. The DSC of segmenting WM from 18F-FDG PET/CT was 0.82 on average. Despite the low resolution of 18F-FDG PET/CT, the results were similar to those obtained with MRI [36, 37]. The study showed the feasibility of using 18F-FDG PET/CT for segmenting WM volumes.

In the WM, there are foci or areas called white matter hyper-intensities (WMH), since they show increased signal intensity on T2-weighted fluid attenuated inversion recovery (FLAIR) MRI. Despite being seen in healthy elderly subjects, WMH are associated with greater hippocampal atrophy in the non-demented elderly and with cognitive decline in patients with CI [38–40]. Therefore, MRI has been invaluable in the assessment of WMH [41]. As mentioned, 18F-FDG PET/CT is useful in assessing the glucose metabolism in the cortex or subcortical neurons. However, the low spatial resolution and low glucose metabolism have limited the evaluation of the WM and WMH on 18F-FDG PET/CT. In our group, we applied a GAN framework to segment WMH on 18F-FDG PET/CT (Fig. 9.2, unpublished data). A dataset of mild, moderate, and severe groups of WMH according to the Fazekas scoring system was used to train and test a deep learning model. Using WMH on FLAIR MRI as the gold standard, a GAN method was used to segment WMH on MRI. The dice similarity coefficient (DSC) values were closely dependent on the WMH volumes on MRI. With more than 60 mL of volume, the DSC values were above 0.7, with a mean value of 0.751 ± 0.048. With a volume of 60 mL or less, the mean DSC value was only 0.362 ± 0.263. For WMH volume estimation, the GAN showed excellent correlation with WMH volume on MRI (r = 0.998 in the severe group, 0.983 in the moderate group, and 0.908 in the mild group). Although it is limited to evaluate WMH on 18F-FDG PET/CT by visual analysis, WMH are an important vascular component contributing to dementia.
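Since the comparison above relies on the Dice similarity coefficient, a two-line NumPy sketch of its definition on binary masks may help; `pred` and `truth` are placeholder boolean arrays of identical shape.

    import numpy as np

    def dice(pred: np.ndarray, truth: np.ndarray) -> float:
        """Dice similarity coefficient: 2|A intersect B| / (|A| + |B|)."""
        intersection = np.logical_and(pred, truth).sum()
        return 2.0 * intersection / (pred.sum() + truth.sum())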
Fig. 9.2 Deep learning (GAN)-based FLAIR image synthesized using PET/CT: 18F-FDG PET/CT (a), T2-weighted FLAIR image (b), predicted WMH volume (c), and manually segmented WMH volume (d)
Our GAN method showed the feasibility of automatically segmenting and estimating the volumes of WMH on 18F-FDG PET/CT, which will increase the value of 18F-FDG PET/CT in evaluating patients with CI.

9.4 Image Generation and Processing

Artificial intelligence in nuclear medicine is also widely used in image processing technology, such as image reconstruction and attenuation correction. For PET/MRI, attenuation correction by making pseudo-CT images from MRI has been compared to CT-based methods [42–46]. In methods using the Dixon sequence, PET activity in bone structures is underestimated in the attenuation map [43, 44]. Despite many approaches, MR-based attenuation correction methods are considered lower in performance than the CT-based method of PET/CT. Recently, deep learning methods have been applied to attenuation correction for PET/MRI. Hwang et al. [47] proposed a deep learning-based whole-body PET/MRI attenuation correction, which is more accurate than the Dixon-based 4-segment method. The proposed deep learning method used activity and attenuation maps estimated with the maximum-likelihood reconstruction of activity and attenuation (MLAA) algorithm as inputs to a CNN to learn a CT-derived attenuation map. The attenuation map generated by the CNN showed better bone identification than MLAA, and the average DSC for the bone region was 0.77, significantly higher than that of the MLAA-derived attenuation map (0.36). Liu et al. also demonstrated that a deep learning approach to generate pseudo-CT from MR images reduced PET reconstruction error compared to the CT-based method [48]. With retrospective T1-weighted MR images from 40 subjects, a deep convolutional auto-encoder (CAE) network was trained with 30 datasets and then evaluated on 10 datasets by comparing the generated pseudo-CT to a ground-truth CT scan. The results of this study showed a DSC of 0.97 for the air region, 0.94 for soft tissue, and 0.80 for bone.

Generation of MRI from CT, or of CT from MRI, has been performed by many researchers, but very few studies have been carried out on the generation of MR images from PET/CT. Choi et al. [49] built a GAN model, based on image-to-image translation, to generate MR images from florbetapir PET images. The generated MR images were used for quantification of florbetapir PET, and the measured values were highly correlated with the real MR-based quantification method. Although there was a high structural similarity of 0.91 ± 0.04 between the real MR image and the generated MR image, the differentiation between gray and white matter was difficult, and there was blurring of the detailed structures in the generated MR. In our group, a cycle-GAN-based deep learning method was applied for generating FLAIR images from 18F-FDG PET/CT. As shown in Fig. 9.3 (unpublished data), the FLAIR images generated by our method had excellent visual quality.
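The structural similarity reported above can be reproduced with scikit-image; a minimal sketch for two 2D arrays follows. The variables `real_mr` and `generated_mr` are placeholders, and windowing choices are left at the library defaults.

    import numpy as np
    from skimage.metrics import structural_similarity

    rng = np.random.default_rng(0)
    real_mr = rng.random((256, 256))       # placeholder for the real MR slice
    generated_mr = rng.random((256, 256))  # placeholder for the synthesized slice

    # data_range is the value span of the images (here [0, 1] arrays).
    ssim = structural_similarity(real_mr, generated_mr, data_range=1.0)
    print("SSIM:", ssim)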
Fig. 9.3 Representative images of 18F-FDG PET/CT as an input to the deep learning model (a), real FLAIR (b), and the FLAIR image generated by the deep learning model (c) (unpublished data)
In amyloid imaging, amyloid positivity can be measured by quantitative analysis of the SUVR, normalized to the mean value in the cerebellar cortex. The results of our ROC analyses showed that the cut-off values for amyloid positivity deduced from the images predicted by the CNN models using low-dose images from 1 to 5 min remained unchanged compared with those obtained from the ground-truth images.
those obtained from the ground-truth images. rior to those of the short scan images for all scan
Scan time reduction using low-dose imaging times. As the scan time increased from 10 to
has been tried for 18F-FDG PET/CT imaging. 120 s, the PSNRs and NRMSEs of the synthe-
Kim et al. [59] proposed that deep learning sized 18F-FDG PET images were improved by
method to synthesize the PET images with high an average of 21.6 ± 3.8% and 47.0 ± 5.5%,
SNR acquired for typical scan durations from respectively.
Fig. 9.5 A schematic of the encoder-decoder convolutional neural network for predicting the full-time scan from a short-time scan of 18F-FDG PET/CT
Fig. 9.6 Representative 18F-FDG PET/CT images in a 62-year-old female normal control: short-time scan (10 s acquisition, left), images predicted by the CNN with a residual learning framework (middle), and full-time scan (15 min acquisition, right)
As shown in Fig. 9.6, the high-quality PET images generated using a deep learning model from low-count data and/or short scan times can have a practical impact on reducing radiation exposure. This will provide new opportunities for PET/CT in patients such as children, pregnant women, and patients prone to motion artifacts.
49. Choi H, Lee DS. Generation of structural MR images from amyloid PET: application to MR-less quantification. J Nucl Med. 2018;59(7):1111–7. https://doi.org/10.2967/jnumed.117.199414.
50. Badawi RD, Shi H, Hu P, Chen S, Xu T, Price PM, et al. First human imaging studies with the EXPLORER total-body PET scanner. J Nucl Med. 2019;60(3):299–303. https://doi.org/10.2967/jnumed.119.226498.
51. Cherry SR, Jones T, Karp JS, Qi J, Moses WW, Badawi RD. Total-body PET: maximizing sensitivity to create new opportunities for clinical research and patient care. J Nucl Med. 2018;59(1):3–12. https://doi.org/10.2967/jnumed.116.184028.
52. Caribe P, Koole M, D'Asseler Y, Van Den Broeck B, Vandenberghe S. Noise reduction using a Bayesian penalized-likelihood reconstruction algorithm on a time-of-flight PET-CT scanner. EJNMMI Phys. 2019;6(1):22. https://doi.org/10.1186/s40658-019-0264-9.
53. Ashrafinia S, Mohy-Ud-Din H, Karakatsanis NA, Jha AK, Casey ME, Kadrmas DJ, et al. Generalized PSF modeling for optimized quantitation in PET imaging. Phys Med Biol. 2017;62(12):5149–79. https://doi.org/10.1088/1361-6560/aa6911.
54. Liu CC, Qi J. Higher SNR PET image prediction using a deep learning model and MRI image. Phys Med Biol. 2019;64(11):115004. https://doi.org/10.1088/1361-6560/ab0dc0.
55. Kaplan S, Zhu YM. Full-dose PET image estimation from low-dose PET image using deep learning: a pilot study. J Digit Imaging. 2019;32(5):773–8. https://doi.org/10.1007/s10278-018-0150-3.
56. Xu J, Gong E, Pauly J, Zaharchuk G. 200x low-dose PET reconstruction using deep learning. arXiv preprint arXiv:171204119. 2017.
57. Chen KT, Gong E, de Carvalho Macruz FB, Xu J, Boumis A, Khalighi M, et al. Ultra-low-dose (18)F-florbetaben amyloid PET imaging using deep learning with multi-contrast MRI inputs. Radiology. 2019;290(3):649–56. https://doi.org/10.1148/radiol.2018180940.
58. Ouyang J, Chen KT, Gong E, Pauly J, Zaharchuk G. Ultra-low-dose PET reconstruction using generative adversarial network with feature matching and task-specific perceptual loss. Med Phys. 2019;46(8):3555–64. https://doi.org/10.1002/mp.13626.
59. Kim J, Kang S, Lee K, Jung JH, Kim G, Lim HK, et al. Effect of scan time on neuro 18F-fluorodeoxyglucose positron emission tomography image generated using deep learning. J Med Imag Health In. 2020;10:1–7. https://doi.org/10.1166/jmihi.2020.3316.
10 AI/ML Imaging Applications in Body Oncology

Robert Seifert and Peter Herhaus
Contents
10.1 General Principles
10.2 Brain
10.2.1 Glioma
10.3 Neck
10.3.1 Head and Neck Cancer
10.3.2 Thyroid Cancer
10.4 Thorax
10.4.1 Lung Cancer
10.5 Abdomen
10.5.1 Esophageal Cancer
10.5.2 Liver Tumor
10.5.3 Prostate Cancer
10.6 Skeleton
10.6.1 Bone Metastases
10.7 Hematopoietic System
10.7.1 Lymphoma
10.7.2 Multiple Myeloma
References
The utilization of AI for detecting diseases in medical image data is rapidly emerging [1]. Consequently, AI in nuclear medicine has been widely employed for image data, and also for electronic health record data [2]. When applied to image data, AI may be used to determine the stage according to an existing staging system (like the bone scan index), to improve an existing staging system (e.g., by simplification of TIRADS), to generate new staging systems that are too complex or too time-consuming to be performed by medical experts (e.g., whole-body tumor volume quantification in PET-CTs), or to directly predict a clinically relevant endpoint (e.g., estimate the grading of a tumor, predict overall survival time). When applied to electronic health record data, AI may be used to predict endpoints as well. Additional approaches seem promising, like the utilization of artificial intelligence to form real-world control groups for image-centric trials, as has been demonstrated for therapeutic trials [3].

An organ-wise structure is chosen to organize this chapter, as it focuses on the application of AI to oncological imaging. However, as AI is emerging in the field of nuclear medicine, two underlying trends can be observed: whole-body tumor volume quantification and individual lesion delineation. Quantification of the molecular whole-body tumor volume (e.g., 18F-FDG or PSMA avid tumor parts, in contrast to morphological tumor volume) is feasible using semi-automated approaches that facilitate the quantification by AI methods. Yet, medical expert interaction is still needed to obtain valid results. Such quantification approaches are clinically needed, as the whole-body tumor volume might be a more precise parameter to assess the extent of an oncological disease [4]. Moreover, quantifying the whole-body tumor volume might enable more precise therapy response monitoring. The second trend is to automatically delineate and grade malignancy-suspicious lesions in nuclear medicine imaging by employing AI. This is a more complex and error-prone task, compared to just providing assistance to medical experts. However, several studies presented here could demonstrate extremely promising results (e.g., fully automatic delineation of all malignancy-suspicious lesions). Therefore, both the tumor volume quantification trend and the individual lesion delineation trend will ultimately merge when lesion-wise classification becomes even better and is thus suited for tumor volume quantification.

There are some unsolved issues regarding the application of AI in the field of nuclear medicine, especially in oncological settings. As outlined, the quantification of the tumor volume is coming into the focus of many software tools that analyze PET-CT data. Yet, there is no consensus on how to determine a reference standard for tumor volume quantification. It may be evident that morphological information (e.g., obtained from the CT component) is not ideal as a reference to assess the molecular volume. However, there are several strategies for the segmentation of the PET volume as well, like applying a fixed threshold (e.g., every voxel >6 SUV is tumor), applying relative thresholding (e.g., 50% of the local SUVmax), or others. Future studies have to evaluate which tumor segmentation method is closest to the actual tumor volume and should therefore be used as the reference standard for AI algorithms. To this end, it might be warranted to employ the concept of probabilistic segmentations, which addresses issues arising from inter- and intra-rater variance in tumor segmentations [5]. Finally, one has to bear in mind that it is at least as difficult to develop AI for a specific task as it is to prove its incremental benefit for the patient and to implement it in the clinical routine [6, 7].
Such quantification approaches are clinically
needed, as the whole-body tumor volume might 10.2 Brain
be a more precise parameter to assess the extent
of an oncological disease [4]. Moreover, quanti- 10.2.1 Glioma
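As an illustration of these two thresholding strategies, the following minimal sketch (not taken from any of the cited tools) assumes a NumPy array of SUV values and, for the relative variant, a coarse reader-drawn region; the cutoff values are the illustrative ones mentioned above.

```python
import numpy as np

def segment_fixed_threshold(suv: np.ndarray, cutoff: float = 6.0) -> np.ndarray:
    """Fixed thresholding: every voxel above the SUV cutoff is labeled tumor."""
    return suv > cutoff

def segment_relative_threshold(suv: np.ndarray, region: np.ndarray,
                               fraction: float = 0.5) -> np.ndarray:
    """Relative thresholding: voxels above a fraction of the local SUVmax,
    evaluated inside a coarse region of interest (e.g. a reader-drawn VOI)."""
    local_max = suv[region].max()
    return region & (suv >= fraction * local_max)
```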
Finally, one has to bear in mind that developing AI for a specific task is at least as difficult as proving its incremental benefit for the patient and implementing it in the clinical routine [6, 7].

10.2 Brain

10.2.1 Glioma

The characterization of cerebral gliomas has moved from a morphology-based classification to molecular profiling, comprising markers like IDH1 mutation status [8]. This is due to the heterogeneity of gliomas, which cannot sufficiently be differentiated by conventional imaging. Therefore, molecular imaging approaches together with machine learning methods have been proposed to enable improved noninvasive glioma profiling.
Kebir et al. could show that 11C-MET PET and machine learning enabled the noninvasive diagnosis of the IDH1 status of gliomas; an area under the curve (AUC) of 0.79 was reached [9]. However, the analyzed patient collective was relatively small (n = 39) and future corroborating studies are needed.

Haubold et al. employed multiparametric 18F-FET PET-MR to noninvasively estimate the grading and molecular profiles of gliomas [10]. Interestingly, the integration of 18F-FET features (like SUVmax) into the multiparametric MRI features improved the estimation of neither grading nor molecular profiling. For example, the estimation of IDH1 status had an AUC of 88% (excluding PET features). Yet again, the patient collective was relatively small (n = 42), especially given the large number of 19,284 features that were extracted for each patient.
10.3 Neck

10.3.1 Head and Neck Cancer

18F-FDG PET-CT is a reference standard examination for the detection of cervical lymph node metastases in patients with head and neck cancer, especially if subsequent radiotherapy is planned [11]. However, the differentiation between physiological lymph nodes and suspicious lymph node metastases in 18F-FDG PET-CT might be challenging. To this end, Chen et al. have proposed a tool which combines radiomics and 3D convolutional neural networks for the characterization of cervical lymph node metastases using PET-CT [12]. Unfortunately, the patient collective was small (n = 59) and the reference standard for nodal involvement was an expert rating.

Huang et al. proposed a method for the automated delineation of head and neck cancer using PET-CT data and demonstrated its feasibility [13]. Yet, despite the use of bicentric data, the generalizability of the presented approach still needs to be proven. Zhao et al. have followed a similar approach and aimed at the automated delineation of nasopharyngeal carcinoma on PET-CT data [14]. The authors adopted the U-Net design, which used both PET and CT images as input, and achieved a Dice score (a measure of segmentation accuracy, sketched below) of 87.5%.
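Since the Dice score is the segmentation metric used throughout this chapter, a minimal sketch of its computation on binary masks may be helpful (a hypothetical helper, not code from the cited studies):

```python
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    return 2.0 * np.logical_and(pred, truth).sum() / denom if denom else 1.0
```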
10.3.2 Thyroid Cancer

Thyroid nodules are frequently seen on ultrasound examinations; however, only a small fraction of thyroid nodules is caused by thyroid cancer [15]. To facilitate the characterization of thyroid nodules as either malignancy-suspicious or benign, the ACR TI-RADS system has been proposed [16]. ACR TI-RADS comprises five categories (like echogenicity or shape) and allocates a score to the degree of each category. The sum of all five category scores stratifies the likelihood of the presence of thyroid cancer, which is in turn graded in five levels (1, benign, to 5, highly suspicious); this point-based mapping is sketched after this paragraph. Despite good reasons for the individual categories, no study could corroborate the given scores (e.g. in the echogenicity category, the hyperechoic criterion has a score of 1, whereas hypoechoic has a score of 2). Therefore, Wildman-Tobriner et al. used AI to evaluate whether the individual scores of ultrasound features were appropriate and whether ACR TI-RADS could be simplified. Interestingly, the scores of their revised system, called AI TI-RADS, were indeed simplified (e.g. the hyperechoic criterion got a score of 0 and was therefore negligible, whereas hypoechoic remained with a score of 2). Moreover, the authors could corroborate that the sensitivity of AI TI-RADS remained high compared to conventional ACR TI-RADS (93%), whereas the specificity of AI TI-RADS increased (65% vs. 47%). This interesting work could facilitate the use of this manual classification system and might be expanded to other classifications as well.
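The point-based logic of ACR TI-RADS can be sketched as follows; the five categories and the point-to-level thresholds follow the published white paper [16], while the example points for a single nodule are made up:

```python
# Hypothetical category points for one nodule (illustrative values only).
points = {"composition": 2, "echogenicity": 2, "shape": 0,
          "margin": 2, "echogenic_foci": 1}
total = sum(points.values())

# Map total points to a TR level per the ACR TI-RADS white paper [16].
if total == 0:
    level = "TR1 (benign)"
elif total <= 2:
    level = "TR2 (not suspicious)"
elif total == 3:
    level = "TR3 (mildly suspicious)"
elif total <= 6:
    level = "TR4 (moderately suspicious)"
else:
    level = "TR5 (highly suspicious)"
```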
Instead of training a neural network to estimate an ACR TI-RADS score (or a similar classification), some groups directly used the histological classification as the ground truth for training and evaluation. Ko et al. could show that a convolutional neural network obtained high AUC results (0.835–0.850) and did not
differ statistically from radiologists (AUC: 0.805–0.860) [17]. Importantly, histological ground truth was present for all patients. There have also been reports on optimized network architectures dedicated to ultrasound images of thyroid cancer [18]. Li et al. presented a retrospective multicenter study evaluating the performance of a neural network in detecting thyroid cancer on ultrasound images, which comprised 45,644 patients [19]. Importantly, external validation cohorts were present as well. For the internal validation cohort, both sensitivity (93.4%) and specificity (86.1%) were remarkably high. The authors concluded that the sensitivity was similar to that of a group of skilled radiologists, while the specificity was statistically significantly improved.

10.4 Thorax

10.4.1 Lung Cancer

[...] only its immediate vicinity is presented to the network. Because of that, the input of the entire PET as a maximum intensity projection (MIP) significantly improved the accuracy; the MIP operation itself is sketched below. Similar to human perception, the MIP and other reformations may facilitate the recognition of global uptake patterns, e.g. caused by brown adipose tissue activation. Additionally, CT information used in conjunction with the PET as input for the neural network significantly improved the accuracy compared to PET-only inputs. Future studies have to evaluate the predictive potential of the automatically determined 18F-FDG tumor volume.
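A maximum intensity projection is nothing more than the per-ray maximum of the volume; a minimal sketch (assuming a NumPy volume ordered z, y, x) is:

```python
import numpy as np

def maximum_intensity_projection(pet: np.ndarray, axis: int = 1) -> np.ndarray:
    """Collapse a 3D PET volume along one axis by taking the maximum,
    yielding the familiar 2D MIP view that exposes global uptake patterns."""
    return pet.max(axis=axis)
```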
10.5 Abdomen
[...] 127 subjects, where it was able to discriminate bone marrow infiltration in patients with MM from healthy controls with an AUC of 0.996. A limitation of their study is that the method was only validated for bone marrow infiltration of the femur. However, the lesion distribution in MM patients ranges from a single lesion to multiple lesions with a disseminated pattern, and those lesions do not necessarily affect the femur.

An automated approach to determine whole-body bone lesions in MM patients was conducted by Xu et al. [41]. This study combined 68Ga-Pentixafor PET, which registers elevated CXCR4 expression within MM lesions, with anatomical features from the CT scan. Two CNNs (V-Net and W-Net) were used for the segmentation and detection of MM lesions. Their study, first verified in digital phantoms (n = 120) and further validated in a small patient cohort (n = 12), revealed that the W-Net architecture with the combination of PET and CT data was most accurate in lesion detection and achieved a Dice score of 73%. However, this study was mainly conducted on digital phantoms, and further validation in a bigger patient cohort, as well as correlation to clinical parameters such as treatment response or overall survival, remains to be evaluated.

References

1. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1:e271–97.
2. Nensa F, Demircioglu A, Rischpler C. Artificial intelligence in nuclear medicine. J Nucl Med. 2019;60:29S–37S.
3. Feld E, Harton J, Meropol NJ, et al. Effectiveness of first-line immune checkpoint blockade versus carboplatin-based chemotherapy for metastatic urothelial cancer. Eur Urol. 2019;76:524–32.
4. Cottereau AS, Becker S, Broussais F, et al. Prognostic value of baseline total metabolic tumor volume (TMTV0) measured on FDG-PET/CT in patients with peripheral T-cell lymphoma (PTCL). Ann Oncol. 2016;27:719–24.
5. Kohl SAA, Romera-Paredes B, Maier-Hein KH, Rezende DJ, Eslami SMA, Kohli P, Zisserman A, Ronneberger O. A hierarchical probabilistic U-net for modeling multi-scale ambiguities. 2019. p. 1–25.
6. Rajkomar A, Dean J, Kohane I, et al. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25:15–8.
7. Wiens J, Saria S, Sendak M, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med. 2019;25:1337–40.
8. Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, Ohgaki H, Wiestler OD, Kleihues P, Ellison DW. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. 2016;131:803–20.
9. Kebir S, Weber M, Lazaridis L, et al. Hybrid 11C-MET PET/MRI combined with "machine learning" in glioma diagnosis according to the revised glioma WHO classification 2016. Clin Nucl Med. 2019;44:214–20.
10. Haubold J, Demircioglu A, Gratz M, et al. Non-invasive tumor decoding and phenotyping of cerebral gliomas utilizing multiparametric 18F-FET PET-MRI and MR fingerprinting. Eur J Nucl Med Mol Imaging. 2019;47(6):1435–45. https://doi.org/10.1007/s00259-019-04602-2.
11. Grégoire V, Lefebvre JL, Licitra L, Felip E. Squamous cell carcinoma of the head and neck: EHNS-ESMO-ESTRO clinical practice guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2010;21:184–6.
12. Chen L, Zhou Z, Sher D, Zhang Q, Shah J, Pham N-L, Jiang S, Wang J. Combining many-objective radiomics and 3D convolutional neural network through evidential reasoning to predict lymph node metastasis in head and neck cancer. Phys Med Biol. 2019;64:075011.
13. Huang B, Chen Z, Wu PM, et al. Fully automated delineation of gross tumor volume for head and neck cancer on PET-CT using deep learning: a dual-center study. Contrast Media Mol Imaging. 2018;2018:8923028. https://doi.org/10.1155/2018/8923028.
14. Zhao L, Lu Z, Jiang J, Zhou Y, Wu Y, Feng Q. Automatic nasopharyngeal carcinoma segmentation using fully convolutional networks with auxiliary paths on dual-modality PET-CT images. J Digit Imaging. 2019;32:462–70.
15. Smith-Bindman R, Lebda P, Feldstein VA, Sellami D, Goldstein RB, Brasic N, Jin C, Kornak J. Risk of thyroid cancer based on thyroid ultrasound imaging characteristics. JAMA Intern Med. 2013;173:1788.
16. Tessler FN, Middleton WD, Grant EG, et al. ACR thyroid imaging, reporting and data system (TI-RADS): white paper of the ACR TI-RADS committee. J Am Coll Radiol. 2017;14:587–95.
17. Ko SY, Lee JH, Yoon JH, et al. Deep convolutional neural network for the diagnosis of thyroid nodules on ultrasound. Head Neck. 2019;41:885–91.
18. Li H, Weng J, Shi Y, Gu W, Mao Y, Wang Y, Liu W, Zhang J. An improved deep learning approach for detection of thyroid papillary cancer in ultrasound images. Sci Rep. 2018;8:1–12.
19. Li X, Zhang S, Zhang Q, et al. Diagnosis of thyroid cancer using deep convolutional neural network models applied to sonographic images: a retrospective, multicohort, diagnostic study. Lancet Oncol. 2019;20:193–201.
20. Planchard D, Popat S, Kerr K, et al. Metastatic non-small cell lung cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2018;29:iv192–237.
21. Sibille L, Seifert R, Avramovic N, Vehren T, Spottiswoode B, Zuehlsdorff S, Schäfers M. 18F-FDG PET/CT uptake classification in lymphoma and lung cancer by using deep convolutional neural networks. Radiology. 2019;294(2):445–52.
22. Beukinga RJ, Hulshoff JB, Mul VEM, Noordzij W, Kats-Ugurlu G, Slart RHJA, Plukker JTM. Prediction of response to neoadjuvant chemotherapy and radiation therapy with baseline and restaging 18F-FDG PET imaging biomarkers in patients with esophageal cancer. Radiology. 2018;287:983–92.
23. Ingrisch M, Schöppe F, Paprottka K, Fabritius M, Strobl FF, De Toni EN, Ilhan H, Todica A, Michl M, Paprottka PM. Prediction of 90Y radioembolization outcome from pretherapeutic factors with random survival forests. J Nucl Med. 2018;59:769–73.
24. Morgan TM, Lange PH, Porter MP, Lin DW, Ellis WJ, Gallaher IS, Vessella RL. Disseminated tumor cells in prostate cancer patients after radical prostatectomy and without evidence of disease predicts biochemical recurrence. Clin Cancer Res. 2009;15:677–83.
25. Hofman MS, Lawrentschuk N, Francis RJ, et al. Prostate-specific membrane antigen PET-CT in patients with high-risk prostate cancer before curative-intent surgery or radiotherapy (proPSMA): a prospective, randomised, multi-centre study. Lancet. 2020;395(10231):1208–16. https://doi.org/10.1016/S0140-6736(20)30314-7.
26. Fendler WP, Calais J, Eiber M, et al. Assessment of 68Ga-PSMA-11 PET accuracy in localizing recurrent prostate cancer: a prospective single-arm clinical trial. JAMA Oncol. 2019;5:856–63.
27. Zhao Y, Gafita A, Vollnberg B, Tetteh G, Haupt F, Afshar-Oromieh A, Menze B, Eiber M, Rominger A, Shi K. Deep neural network for automatic characterization of lesions on 68Ga-PSMA-11 PET/CT. Eur J Nucl Med Mol Imaging. 2020;47:603–13.
28. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. In: Medical image computing and computer assisted intervention – MICCAI 2015, Lecture notes in computer science; 2015. p. 234–41.
29. Gafita A, Bieth M, Krönke M, Tetteh G, Navarro F, Wang H, Günther E, Menze B, Weber WA, Eiber M. qPSMA: semiautomatic software for whole-body tumor burden assessment in prostate cancer using 68Ga-PSMA11 PET/CT. J Nucl Med. 2019;60:1277–83.
30. Bieth M, Peter L, Nekolla SG, Eiber M, Langs G, Schwaiger M, Menze B. Segmentation of skeleton and organs in whole-body CT images via iterative trilateration. IEEE Trans Med Imaging. 2017;36:2276–86.
31. Seifert R, Herrmann K, Kleesiek J, Schafers MA, Shah V, Xu Z, Chabin G, Garbic S, Spottiswoode B, Rahbar K. Semi-automatically quantified tumor volume using Ga-68-PSMA-11-PET as biomarker for survival in patients with advanced prostate cancer. J Nucl Med. 2020;61(12):1786–92. https://doi.org/10.2967/jnumed.120.242057.
32. Armstrong AJ, Al-Adhami M, Lin P, et al. Association between new unconfirmed bone lesions and outcomes in men with metastatic castration-resistant prostate cancer treated with enzalutamide: secondary analysis of the PREVAIL and AFFIRM randomized clinical trials. JAMA Oncol. 2020;6:217–25.
33. Erdi YE, Humm JL, Imbriaco M, Yeung H, Larson SM. Quantitative bone metastases analysis based on image segmentation. J Nucl Med. 1997;38:1401–6.
34. Imbriaco M, Larson SM, Yeung HW, Mawlawi OR, Erdi Y, Venkatraman ES, Scher HI. A new parameter for measuring metastatic bone involvement by prostate cancer: the bone scan index. Clin Cancer Res. 1998;4:1765–72.
35. Ulmert D, Kaboteh R, Fox JJ, et al. A novel automated platform for quantifying the extent of skeletal tumour involvement in prostate cancer patients using the bone scan index. Eur Urol. 2012;62:78–84.
36. Sadik M, Jakobsson D, Olofsson F, Ohlsson M, Suurkula M, Edenbrandt L. A new computer-based decision-support system for the interpretation of bone scans. Nucl Med Commun. 2006;27:417–23.
37. Armstrong AJ, Anand A, Edenbrandt L, et al. Phase 3 assessment of the automated bone scan index as a prognostic imaging biomarker of overall survival in men with metastatic castration-resistant prostate cancer: a secondary analysis of a randomized clinical trial. JAMA Oncol. 2018;4:944–51.
38. Bieth M, Krönke M, Tauber R, Dahlbender M, Retz M, Nekolla SG, Menze B, Maurer T, Eiber M, Schwaiger M. Exploring new multimodal quantitative imaging indices for the assessment of osseous tumor burden in prostate cancer using 68Ga-PSMA PET/CT. J Nucl Med. 2017;58:1632–7.
39. Hammes J, Täger P, Drzezga A. EBONI: a tool for automated quantification of bone metastasis load in PSMA PET/CT. J Nucl Med. 2017;59:1070–5.
40. Martínez-Martínez F, Kybic J, Lambert L, Mecková Z. Fully automated classification of bone marrow infiltration in low-dose CT of patients with multiple myeloma based on probabilistic density model and supervised learning. Comput Biol Med. 2016;71:57–66.
41. Xu L, Tetteh G, Lipkova J, Zhao Y, Li H, Christ P, Piraud M, Buck A, Shi K, Menze BH. Automated whole-body bone lesion detection for multiple myeloma on 68Ga-Pentixafor PET/CT imaging using deep learning methods. Contrast Media Mol Imaging. 2018;2018:1–11.
11 Artificial Intelligence/Machine Learning in Nuclear Medicine and Hybrid Imaging

R. J. H. Miller, J. Kwiecinski, D. Dey, and P. J. Slomka
Contents

11.1 Introduction to AI 138
11.2 AI to Improve Image Quality and Processing 138
11.2.1 Image Denoising 138
11.2.2 Image Reconstruction 139
11.2.3 AI Applications in Attenuation Correction 139
11.2.4 Image Segmentation 140
11.2.5 CT Segmentation: Coronary Artery Calcium 141
11.2.6 CT Segmentation: Epicardial Adipose Tissue 143
11.3 AI to Improve Physician Interpretation 145
11.3.1 Structured Reporting 145
11.3.2 Disease Diagnosis 145
11.3.3 Risk Prediction 146
11.4 Protocol Optimization: Application to Rest Scan Cancellation 151
11.5 Explainable AI 151
11.6 Summary 152
References 152
R. J. H. Miller
Department of Imaging (Division of Nuclear Medicine), Medicine (Division of Artificial Intelligence in Medicine), Cardiology, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Department of Cardiac Sciences, University of Calgary and Libin Cardiovascular Institute, Calgary, AB, USA

J. Kwiecinski
Department of Imaging (Division of Nuclear Medicine), Medicine (Division of Artificial Intelligence in Medicine), Cardiology, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Department of Interventional Cardiology and Angiology, Institute of Cardiology, Warsaw, Poland

D. Dey · P. J. Slomka (*)
Department of Imaging (Division of Nuclear Medicine), Medicine (Division of Artificial Intelligence in Medicine), Cardiology, and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
e-mail: [email protected]
[...] in 1052 subjects that higher-quality images could be generated from low-dose SPECT MPI using a 3D CNN [10]. The CNN was formed with stacked autoencoders designed to predict full-dose images from low-dose image reconstructions. In simulations, images denoised with the CNN using 1/16th of the standard dose achieved similar image quality to simulated images with 1/8th of the dose denoised with a standard filtering approach [10]. Song et al. demonstrated that spatial resolution can be improved by using a 3D convolutional residual network; the authors compared their predicted images with standard dose and Gaussian post-filtering, showing a reduction of the normalized mean square error by 6.13% and 11.05%, respectively [11]. Ladefoged et al. evaluated the potential of denoising 18F-fluorodeoxyglucose cardiac PET images with a deep learning model, trained in 146 patients and tested in 20 patients, simulating dose reductions as low as 1% of the injected dose [12]. Their denoising models were able to recover the PET signal for both the static and gated images at these low doses, showing that a significant dose reduction can be achieved for myocardial 18F-fluorodeoxyglucose PET images, used for viability testing in patients with ischemic heart disease, without significant loss of diagnostic accuracy: both 1% and 10% dose reductions are possible and provide quantitative metrics clinically comparable to those obtained with a full dose [12]. Our group performed a preliminary study of image denoising in order to substantially reduce the length of coronary 18F-sodium fluoride acquisitions [13]. In this study we obtained similar quantitative metrics from images reconstructed from 3 min of list-mode data with CNN processing as from traditional reconstruction of the original 30-min acquisitions. These kinds of post-processing techniques are easy to implement clinically and can be rapidly applied to reconstructed data; the residual pattern several of these networks share is sketched below.
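A simplified PyTorch example of that residual post-processing pattern (not the networks of the cited papers): the CNN predicts a correction that is added back to the low-dose input, and training minimizes the error against paired full-dose volumes.

```python
import torch
import torch.nn as nn

class Denoiser3D(nn.Module):
    """Minimal residual 3D CNN for low-dose emission volumes."""
    def __init__(self, channels: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, 1, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Predicted full-dose volume = low-dose input + learned residual.
        return x + self.net(x)

model = Denoiser3D()
loss_fn = nn.MSELoss()  # compared against the paired full-dose volume
```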
11.2.2 Image Reconstruction

The aforementioned approaches have utilized DL as a post-processing method; however, it is also possible to apply DL directly within the iterative reconstruction framework. For example, Shiri et al. developed a DL model to achieve image quality comparable to standard SPECT images from incomplete datasets, namely counts obtained after a reduction of the acquisition time per projection or a reduction of the number of angular projections [14]. They demonstrated in 363 patients that the DL model was able to effectively recover image quality and reduced the bias in quantification metrics as compared to a standard iterative ordered-subsets expectation maximization approach; the underlying MLEM update is sketched below. The resulting images provided similar automatic quantitation of perfusion (stress total perfusion deficit) and function (volume, eccentricity, and shape index) compared to conventional full-acquisition studies [14]. DL-based reconstructions are being researched extensively in general PET imaging [15–19], and it is just a matter of time until they are tested for cardiovascular applications. Given these encouraging data, it appears that DL algorithms will increasingly be used both during image reconstruction and as a post-processing technique to improve image quality and reduce radiation exposure and acquisition times.
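For orientation, the ordered-subsets approach referenced above is an accelerated form of the MLEM update, one iteration of which can be sketched as follows (a didactic dense-matrix version; real systems use projector operators rather than an explicit matrix):

```python
import numpy as np

def mlem_update(image: np.ndarray, A: np.ndarray, counts: np.ndarray) -> np.ndarray:
    """One MLEM iteration: x <- x / (A^T 1) * A^T (y / A x).
    A maps image voxels to projection bins; counts y are the measured data."""
    expected = A @ image                          # forward projection
    ratio = counts / np.maximum(expected, 1e-12)  # measured / expected
    return image / A.sum(axis=0) * (A.T @ ratio)  # multiplicative update
```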
11.2.3 AI Applications in Attenuation Correction

During image reconstruction of SPECT or PET, attenuation correction is a mechanism that enables adjustment for the amount of tissue between the source of radiation (the myocardium) and the detectors of the scanner. This is typically achieved by CT on a hybrid PET/CT scanner or, less commonly, by segmented MR on a PET/MR system. Such patient-specific attenuation correction improves the specificity and the accuracy of myocardial perfusion imaging in the diagnosis of CAD. However, the attenuation maps on the hybrid PET/CT or SPECT/CT need to be properly registered to the emission data in order for the correction to be accurate. Misregistration between the perfusion and CT attenuation-corrected images often results in artifacts that affect the diagnostic accuracy of PET/CT [20, 21].

[...]
Fig. 11.1 Automatic valve plane localization (top). Bland-Altman difference plots show that global stress TPD (red) and rest TPD (blue) for valve plane positions were very similar between the results from two experts and the automatic valve plane localization procedure (SVM) (bottom). This research was originally published in JNM. Betancur et al. Automatic Valve Plane Localization in Myocardial Perfusion SPECT/CT by Machine Learning: Anatomic and Clinical Validation. J Nucl Med. 2017;58:961–967. © SNMMI
11.2.5 CT Segmentation: Coronary Artery Calcium

Coronary artery calcium (CAC) is an unequivocal marker of atherosclerosis. Evidence to date has consistently shown that CAC accurately predicts cardiovascular events [28–36]. Non-contrast CT can reliably detect CAC [37, 38], complementing MPI [39], and hybrid PET/CT or SPECT/CT systems are capable of obtaining CAC CT scans. Quantification of CAC by CT provides an accurate measure of atherosclerotic burden [40]. We have shown the complementary role of CAC and PET scans (Fig. 11.2) and developed a combined PET+CAC score, which increased the diagnostic performance of PET [41, 42]. It has also been shown that myocardial flow reserve with PET and CAC provide complementary stratification of cardiac risk [43, 44]. A CAC scan is low-cost and acquired without contrast, but it does involve additional radiation and imaging time and is therefore not always performed. However, all current PET/CT MPI scans are acquired with ungated, low-dose CT scans for attenuation correction, which could also be used to estimate the CAC burden.
[Fig. 11.2: panel (a) bar chart of obstructive CAD prevalence (%) across TPD and CAC score categories; panel (b) ROC curves with AUC (95% CI): ITPD 0.81 (0.76–0.85), CAC 0.73 (0.68–0.78), combined 0.85 (0.81–0.89); combined per-vessel ITPD & CAC vs. ITPD alone, P = 0.02]
Fig. 11.2 (a) Prevalence of obstructive coronary artery disease (CAD) across ischemic total perfusion deficit (ITPD) categories and coronary artery calcium (CAC) score categories on a per-vessel basis (n = 456). A "zero" risk of obstructive disease was seen in vessels with either ITPD 0% or a CAC score of 0, while the highest risk was seen in vessels with either ITPD ≥5% or a CAC score ≥400, P < 0.0001. (b) Tenfold cross-validated receiver operating characteristic (ROC) analysis comparing combined per-vessel ITPD and per-vessel CAC score versus ITPD alone in predicting obstructive CAD. *Asterisk indicates the AUC of the combined analysis (ITPD with per-vessel log CAC). This research was originally published in JNM. Brodov et al. Combined Quantitative Assessment of Myocardial Perfusion and Coronary Artery Calcium Score by Hybrid 82Rb PET/CT Improves Detection of Coronary Artery Disease. J Nucl Med. 2015;56:1345–50. © SNMMI
Fig. 11.3 Deep learning calcium detection on a CT attenuation scan from a hybrid scanner, shown for stress and rest acquisitions. Fully automated CNN (left) agrees with conventional manual calcium scoring (middle) of CTAC (red/green: coronary; yellow: aorta) in a patient with high-risk multivessel disease (by invasive angiography) and a visually and quantitatively normal SPECT MPI scan (right)
Currently, CAC scoring is performed semi-automatically: an operator selects areas of calcification, which are then segmented automatically. AI has been used extensively to develop automatic CAC scoring methods for diverse CT scans, beyond dedicated gated cardiac CAC scans. One approach involved a two-stage CNN developed to detect CAC [45], which was then evaluated in a large clinical study incorporating a wide range of CT scans [46]. Besides CAC scoring, DL methods have been used to detect calcium in the thoracic aorta and heart valves on low-dose, non-contrast chest CT [47]. Other methods have been proposed using DL with a dual CNN to process scan-rescan datasets simultaneously, utilizing 5075 datasets for training and testing [48]; this approach achieved a classification accuracy of 93% compared with expert interpretation. DL has also been applied to detect calcium on CT attenuation correction scans obtained on hybrid scanners: in 133 consecutive patients undergoing myocardial perfusion 82Rubidium PET/CT, Isgum et al. showed good correlation between CAC scores derived from non-contrast CT attenuation maps obtained on a hybrid PET/CT scanner and dedicated gated CAC scans [49]. An example of CAC segmentation performed by a visual observer and by DL is shown in Fig. 11.3. Such automatic segmentation of CT attenuation maps can provide valuable clinical information during the reporting of hybrid imaging; the classical thresholding step on which these methods build is sketched below.
interpretation. This method has also been applied scans, but typically requires long manual pro-
to detect calcium from CT attenuation correc- cessing times to measure. Fully automated meth-
tions scans obtained on hybrid scanners. In 133 ods for segmentation of pericardial fat from
consecutive patients undergoing myocardial per- cardiac CT have been proposed using DL meth-
fusion 82Rubidium PET/CT, Isgum et al., showed ods. Investigators developed fully automated
good correlation between CAC scores derived CNN for EAT segmentation (QFAT) from non-
from non-contrast CT attenuation maps obtained contrast CT [55, 56]. This DL algorithm for EAT
Fig. 11.4 Algorithm embedded in research software QFAT. Three-dimensional representations of epicardial adipose tissue (EAT) (EAT in pink overlaid on the heart rendered in red), (a) as manually identified by the expert and (b) as automatically identified by the algorithm. (c) Screenshot of QFAT software with the integrated deep learning approach. The pericardium is automatically identified (yellow arrow) by selecting a new operator (1, red) and by using the "extract contours" option (2, green). This research was originally published in Radiology: Artificial Intelligence. Commandeur et al. Fully Automated CT Quantification of Epicardial Adipose Tissue by Deep Learning: A Multicenter Study. Radiol Artif Intell. 2019;1(6):e190045. © Radiological Society of North America
This DL algorithm for EAT identification and quantification from non-contrast calcium scoring CT datasets was trained and validated with tenfold cross-validation in 250 CT image sets by Commandeur et al. [56]. The agreement between the proposed approach and an expert reader performing manual segmentation of EAT was high, with no bias and correlations of 0.945 and 0.926 in the training and validation datasets, respectively [56]. Additionally, the variation of DL vs. expert interpretation was equivalent to the variation between two experienced observers. These results were further validated in a multicenter cohort of 850 cases. Automated segmentation of EAT was performed in a mean time of 1.57 ± 0.49 s, compared with approximately 15 min for experts (Fig. 11.4) [56]. Therefore, DL allowed for fast, robust, and fully automated quantification of EAT from non-contrast calcium scoring CT, on par with an expert reader, and could be integrated easily into clinical practice for cardiovascular risk assessment. DL algorithms for EAT quantification from CT have been implemented in the research tool QFAT at Cedars-Sinai and have recently been used in a novel combination of DL and classical ML for cardiovascular risk assessment. Such EAT measurements could potentially be integrated into comprehensive evaluations using hybrid imaging data on PET/CT or SPECT/CT hybrid scanners; once a segmentation mask exists, the reported volume itself is a trivial computation, as sketched below.
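The volume of a binary EAT mask is simply the voxel count scaled by the voxel volume; a trivial sketch (the spacing values are placeholders to be read from the CT header):

```python
import numpy as np

def eat_volume_ml(mask: np.ndarray, spacing_mm=(3.0, 0.68, 0.68)) -> float:
    """Volume of a binary segmentation in milliliters."""
    voxel_ml = float(np.prod(spacing_mm)) / 1000.0  # mm^3 per voxel -> mL
    return float(mask.sum()) * voxel_ml
```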
[Fig. 11.5: multicenter ROC analysis, n = 1272/3480 (37%); overall per-vessel AUC 0.774 for deep learning (DL) vs. 0.729 for combined upright-supine TPD (cTPD), P < 0.001, with per-center external validation (Centers 1–4)]
Fig. 11.5 Deep learning (DL) prediction of CAD from upright and supine MPS images (red) outperformed the current method, combined upright-supine TPD (cTPD) (blue). DL had an overall estimated per-vessel performance (left) externally validated (right) in four DL models (one per center), each trained with data from the other three centers. The red dotted line shows the overall multicenter AUC. This research was originally published in JNM. Betancur et al. Deep Learning Analysis of Upright-Supine High-Efficiency SPECT Myocardial Perfusion Imaging for Prediction of Obstructive Coronary Artery Disease: A Multicenter Study. J Nucl Med. 2019;60:664–670. © SNMMI
[...] and per-patient basis [64]. With matched specificity, DL improved the per-vessel sensitivity to 69.8% from 64.4% with TPD (p < 0.01) [64]. Subsequently, the same group demonstrated that a modified algorithm, utilizing both upright and supine imaging data from solid-state SPECT scanners, improved the diagnostic accuracy compared to the combined upright-supine quantitative analysis developed previously by the same investigators, in a rigorous multisite external evaluation [65] (Fig. 11.5). These CNN algorithms have recently been expanded to demonstrate the possibility of image-based explanation with image attention maps implemented in a clinical prototype for CAD diagnosis [66]. For example, this technique can be applied to SPECT MPI to highlight the regions of perfusion polar maps which contribute most to the final DL predictions [67].

Apart from the diagnosis of CAD, CNNs have also been applied to diagnose cardiac sarcoidosis from PET MPI, demonstrating improved sensitivity and specificity compared to two methods for quantification [68]. In a study based on a total of 85 patients (33 cardiac sarcoidosis patients and 52 patients without cardiac sarcoidosis), Togo et al. demonstrated that a deep CNN with high-level features extracted from the polar maps (through the Inception-v3 network) achieved a high diagnostic accuracy for detecting cardiac sarcoidosis (sensitivity and specificity of 0.839 and 0.870, respectively) [68]. This study focused on cardiac sarcoidosis but highlights how AI can benefit cardiac imaging beyond CAD; a sketch of this transfer-learning pattern follows below.
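A generic torchvision example of the transfer-learning pattern (not the authors' code): a pretrained Inception-v3 backbone supplies the high-level features while only a new two-class head is trained; polar maps would need to be resized to 299 × 299 and replicated to three channels.

```python
import torch.nn as nn
from torchvision import models

# Pretrained Inception-v3 with a new head: sarcoidosis vs. no sarcoidosis.
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

# Freeze the backbone; train only the new classification head
# (the auxiliary head is left frozen and ignored here).
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("fc")
```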
11.3.3 Risk Prediction

AI algorithms also have a potential role in refining risk prediction for future adverse outcomes following nuclear cardiac imaging. The classical ML approach has a particular strength in its ability to combine large amounts of clinical, stress, and imaging data (with variables quantified by standard software) in an efficient and objective fashion. Arsanjani et al. trained an ML model to predict post-MPI revascularization in a cohort of 713 patients who underwent dual-isotope SPECT MPI and subsequent invasive angiography [69]. The ML model was compared to the visual scoring of two expert readers. The ML approach had a superior area under the receiver operating characteristic curve (AUC) (0.81 ± 0.2) for predicting
revascularization compared to one of the readers (0.72 ± 0.02, p < 0.01) and to quantitative assessment of perfusion (0.77 ± 0.2, p < 0.01). The study showed that the automatic ML approach, integrating a wide range of variables, is comparable or better than experienced readers in the prediction of early revascularization and is significantly better than standalone measures of perfusion. In a subsequent multicenter study, Hu et al. demonstrated that a similar ML architecture could train a model which outperformed current methods for quantitative analysis of perfusion for the prediction of revascularization on a per-patient and per-vessel basis [70]. In this study, the overall feature importance graphs demonstrate that revascularization prediction can be obtained primarily from imaging variables (Fig. 11.6). This study also proposed initial methods for explaining the prediction to physicians by showing the relative importance of each feature.
AI models have also been developed to predict the risk of major adverse cardiovascular events (MACE). An ML model was developed using single-center SPECT MPI data (n = 2619) to determine the benefit of combining clinical, stress, and imaging features [71]. The ML model, trained and tested with tenfold cross-validation, had a higher area under the receiver operating characteristic curve (AUC) for MACE compared to either stress TPD or ischemic TPD (AUC: 0.81 vs. 0.73 vs. 0.71, respectively; p < 0.01) [71]. Importantly, almost 20% of patients in the highest MACE risk group (ninety-fifth or higher percentile) by the ML score had a "normal" expert visual interpretation of myocardial perfusion, highlighting an additional potential role of AI. Results from this study are shown in Fig. 11.7. ML could potentially be used after expert interpretation to ensure that high-risk patients are not missed; the cross-validated workflow is sketched below.
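The workflow of such studies can be sketched generically (the cited work used a boosted-ensemble classifier; here scikit-learn's gradient boosting stands in, and the data are placeholders):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2619, 28))    # clinical + stress + imaging features
y = rng.integers(0, 2, size=2619)  # MACE during follow-up (placeholder)

# Tenfold cross-validated risk scores, evaluated by AUC as in the text.
proba = cross_val_predict(GradientBoostingClassifier(), X, y,
                          cv=10, method="predict_proba")[:, 1]
print("10-fold cross-validated AUC:", roc_auc_score(y, proba))
```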
DL models have also been developed to predict MACE from rest and stress myocardial blood flow as well as myocardial flow reserve PET polar maps with high predictive accuracy [72]. In a study based on 1185 patients who underwent 13N-ammonia PET, Juarez-Orozco et al. have shown that DL applied directly to polar maps outperforms a comprehensive clinical model which includes baseline characteristics (sex, age, body mass index, family CAD history, smoking, diabetes, dyslipidemia, and hypertension), left ventricular systolic function (rest and stress left ventricular ejection fraction), and perfusion variables (regional rest and stress myocardial blood flow and myocardial perfusion reserve). AI models could potentially be combined with a Cox proportional hazards model in order to provide time-to-event analyses, as sketched below. Such methods could provide more precise period-specific risk prediction.
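Coupling an AI score to a time-to-event analysis is straightforward; a minimal sketch with the lifelines package (toy data, hypothetical column names):

```python
import pandas as pd
from lifelines import CoxPHFitter

# One row per patient: AI risk score, follow-up time (years), event indicator.
df = pd.DataFrame({"ml_score": [0.12, 0.55, 0.80, 0.05],
                   "time":     [5.0, 3.2, 1.1, 5.0],
                   "event":    [0, 1, 1, 0]})

cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
print(cph.hazard_ratios_)  # hazard ratio per unit increase of the AI score
```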
Risk prediction can also be performed from CT scans automatically segmented using DL methods. In a study of 20,084 patients from several clinical trials, a DL-derived automatic CAC score was demonstrated to be a strong predictor of cardiovascular events, independent of other risk factors (multivariable-adjusted hazard ratios up to 4.3). Potentially, DL can be applied to predicting cardiovascular mortality directly, without the need to derive calcium scores: in a set of 1583 participants of the National Lung Screening Trial, this approach achieved good prognostic performance with an AUC of 0.73. Both of these studies demonstrated the feasibility of obtaining important cardiovascular risk information from ungated chest CT scans. CT scans obtained for the purpose of attenuation correction are typically lower radiation and ungated, which can reduce image quality. However, it has been demonstrated that CAC scores obtained automatically from PET CTAC correlate well with scores from separately obtained, same-day, ECG-gated CAC scans [49] and can be applied for risk prediction. Thus, it should be feasible to apply AI-based techniques to the CT attenuation maps of cardiac SPECT and PET to automatically extract useful diagnostic and prognostic information. The prognostic utility of such information was recently evaluated in a cohort of 747 patients with chest pain who underwent 82Rubidium PET/CT. Dekker et al. showed that a DL model for quantification of CAC from non-contrast low-dose CT acquired for attenuation correction purposes predicts MACE independent of perfusion findings.
[Fig. 11.7: left, histogram of patients with a normal visual diagnosis by percentile of ML score; right, observed and predicted MACE rate by percentile of ML score]
Fig. 11.7 Prediction of MACE by machine learning (ML). The composite ML risk score from imaging and clinical data can be presented to physicians as an annualized event risk. Frequency of patients with a normal visual diagnosis versus ML score (left); 19% of patients with a normal visual diagnosis (red arrow) were in the ≥95th percentile of MACE risk computed by ML. Observed (pink bars) and predicted (green curve) MACE rate according to percentile of ML score (right). Reprinted from JACC Cardiovascular Imaging, Vol 11, Betancur et al., Prognostic Value of Combined Clinical and Myocardial Perfusion Imaging Data Using Machine Learning, Pages 1000–1009, 2018, with permission from Elsevier
Importantly, the addition of coronary calcium information resulted in a net reclassification improvement of 0.13 (0.02–0.25) [73]. Routine implementation of deep learning for interrogating non-contrast CT data could provide clinicians with additional cardiovascular information, with no processing overhead, when interpreting a nuclear cardiology scan.

The DL segmentation of EAT can also be utilized for risk prediction models. These automatically derived EAT measures have demonstrated independent prognostic utility [55, 56, 74–77]. EAT may be particularly relevant for improving risk prediction in patients with cardiometabolic risk factors [78]. An ML risk score, integrating circulating biomarkers and computed tomography (CT) measures including the CAC score and EAT derived by deep learning, has been developed for the long-term prediction of hard cardiac events in 1069 asymptomatic subjects. The ML risk score (AUC 0.81) outperformed the CAC score (0.75) and the ASCVD risk score (0.74; both p = 0.02) for the prediction of hard cardiac events [79] (Fig. 11.8). With such tools, important prognostic data can be obtained from existing CT scans which would not otherwise be obtained clinically due to the tedious manual processing. This technique could potentially be used for hybrid (SPECT/CT and PET/CT) nuclear cardiology scans, where CT is obtained as an auxiliary scan for attenuation correction or as an additional dedicated gated CT scan.
Fig. 11.6 The machine learning (ML) algorithm evaluated all 55 variables independently to determine the IGR for each variable in each fold. 49 of 55 variables had IGR > 0 and were selected, and ML models were built with these selected variables. Most variables in the ranking are imaging variables (blue and light blue bars), with regional imaging variables (blue bars) leading, while clinical and stress-test variables also play roles in the prediction. Reprinted from EHJCI, Hu et al., Machine learning predicts per-vessel early coronary revascularization after fast myocardial perfusion SPECT: results from multicentre REFINE SPECT registry. European Heart Journal Cardiovascular Imaging, 21:549–559, 2020, by permission of Oxford University Press
Fig. 11.8 [...] events: A prospective study, Pages 76–82, 2021, with permission from Elsevier
One issue in risk prediction with a large number of variables is the necessity of obtaining those variables from health records, which is not always possible in real time during clinical interpretation. Therefore, it is important to establish whether reduced-variable models provide similar prognostic stratification. Haro Alonso et al. demonstrated that an ML model utilizing only 6 variables provided superior risk prediction compared to a logistic regression model with 14 variables [80]. Such feature reduction would simplify the implementation of ML algorithms as modules in reporting software by limiting the additional work required by physicians or support staff to enable AI predictions.
[...] imaging protocol optimization. For example, in SPECT MPI, stress-only imaging is associated with up to a 60% reduction in effective radiation [82, 83]. It was recently demonstrated that ML models could also be used clinically to automatically identify patients for rest scan cancellation. These algorithms could potentially be used to identify patients with a low likelihood of having [...]. The prediction of 5-year risk of MACE was higher for the AI-based approach than for any of the previously proposed clinical approaches (Fig. 11.9a). This approach could potentially be implemented as a module in interpretation or reporting software to reduce the proportion of patients requiring rest imaging, obviating the need for an on-site physician and guaranteeing a high degree of safety; in its simplest form, such a module reduces to a thresholded decision rule, as sketched below. Such streamlined automated AI selection of patients for stress-only protocols would lead to significant reductions in radiation exposure for patients and technical staff, as well as a reduction in associated costs [86].
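A hypothetical form of such a rule (the threshold itself would need prospective validation, as in [86]):

```python
def cancel_rest_scan(ml_risk: float, threshold: float = 0.05) -> bool:
    """Return True if the predicted event risk from the stress-only study is
    low enough that the rest acquisition can be safely skipped."""
    return ml_risk < threshold
```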
[Fig. 11.9: panel (a) sensitivity/specificity plot comparing ML decision thresholds with TPD and clinical reading; panel (b) personalized feature contributions for MACE risk (diabetes, peak heart rate 96 bpm, inpatient status, remaining features), ML risk = 18%]

Fig. 11.9 (a) Machine learning (ML) with imaging and clinical data had a higher area under the curve (AUC) for MACE risk prediction compared to total perfusion deficit (TPD), and superior risk decision thresholds (ML1–3) vs. the visual physician's reading with clinical data (MD, Clinical, Stringent clinical). (b) Personalized risk explanation: a 60-y/o male with normal stress perfusion (TPD) decreasing risk (blue bars), but clinical features (diabetes, past revascularization [PCI]) increasing risk (red bars); AI-based 5-year MACE risk 18%. Reprinted from EHJCI, Hu et al., Prognostically safe stress-only single-photon emission computed tomography myocardial perfusion imaging guided by machine learning: report from REFINE SPECT, jeaa134, 2020, by permission of Oxford University Press
11.5 Explainable AI

[...]
References

[...] International conference on medical image computing and computer-assisted intervention. 2016. p. 424–32.
8. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. 2015. p. 234–41.
9. Chen H, Zhang Y, Kalra MK, Lin F, Chen Y, Liao P, Zhou J, Wang G. Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Trans Med Imaging. 2017;36:2524–35.
10. Ramon AJ, Yang Y, Pretorius PH, Johnson KL, King MA, Wernick MN. Initial investigation of low-dose SPECT-MPI via deep learning. In: 2018 IEEE nuclear science symposium and medical imaging conference proceedings (NSS/MIC). 2018. p. 1–3.
11. Song C, Yang Y, Wernick MN, Pretorius PH, King MA. Low-dose cardiac-gated SPECT studies using a residual convolutional neural network. In: 2019 IEEE 16th international symposium on biomedical imaging (ISBI 2019). 2019. p. 653–6.
12. Ladefoged C, Hasbak P, Hansen J, Kjer A, Hejgaard L, Andersen F. Low-dose PET reconstruction using deep learning: application to cardiac imaged with FDG. J Nucl Med. 2019;60:573.
13. Lassen ML, Commandeur F, Kwiecinski J, Dey D, Cadet S, Germano G, Berman D, Dweck M, Newby D, Slomka P. 10-fold reduction of scan time with deep learning reconstruction of coronary PET images. J Nucl Med. 2019;60:244.
14. Shiri I, AmirMozafari Sabet K, Arabi H, Pourkeshavarz M, Teimourian B, Ay MR, Zaidi H. Standard SPECT myocardial perfusion estimation from half-time acquisitions using deep convolutional residual neural networks. J Nucl Cardiol. 2021;28(6):2761–79.
15. Whiteley W, Luk WK, Gregor J. DirectPET: full-size neural network PET reconstruction from sinogram data. J Med Imaging. 2020;7:032503.
16. Reader AJ, Corda G, Mehranian A, da Costa-Luis C, Ellis S, Schnabel JA. Deep learning for PET image reconstruction. IEEE Trans Radiat Plasma Med Sci. 2020;5:1–25.
17. Liu Z, Chen H, Liu H. Deep learning based framework for direct reconstruction of PET images. In: International conference on medical image computing and computer-assisted intervention. 2019. p. 48–56.
18. Häggström I, Schmidtlein CR, Campanella G, Fuchs TJ. DeepPET: a deep encoder–decoder network for directly solving the PET image reconstruction inverse problem. Med Image Anal. 2019;54:253–62.
19. Gong K, Catana C, Qi J, Li Q. PET image reconstruction using deep image prior. IEEE Trans Med Imaging. 2018;38:1655–65.
20. Slomka PJ, Diaz-Zamudio M, Dey D, Motwani M, Brodov Y, Choi D, Hayes S, Thomson L, Friedman J, Germano G, Berman D. Automatic registration of misaligned CT attenuation correction maps in Rb-82 PET/CT improves detection of angiographically significant coronary artery disease. J Nucl Cardiol. 2015;22:1285–95.
21. Gould KL, Pan T, Loghin C, Johnson NP, Guha A, Sdringola S. Frequent diagnostic errors in cardiac PET/CT due to misregistration of CT attenuation and emission PET images: a definitive analysis of causes, consequences, and corrections. J Nucl Med. 2007;48:1112–21.
22. Ko C-L, Cheng M-F, Yen R-F, Chen C-M, Lee W-J, Wang T-D. Automatic alignment of CZT myocardial perfusion SPECT and external non-contrast CT by deep-learning model and dynamic data generation. J Nucl Med. 2019;60:570.
23. Dong X, Lei Y, Wang T, Higgins K, Liu T, Curran WJ, Mao H, Nye JA, Yang X. Deep learning-based attenuation correction in the absence of structural information for whole-body positron emission tomography imaging. Phys Med Biol. 2020;65:055011.
24. Liu F, Jang H, Kijowski R, Bradshaw T, McMillan AB. Deep learning MR imaging-based attenuation correction for PET/MR imaging. Radiology. 2018;286:676–84.
25. Shi L, Onofrey JA, Liu H, Liu YH, Liu C. Deep learning-based attenuation map generation for myocardial perfusion SPECT. Eur J Nucl Med Mol Imaging. 2020;47:2383–95.
26. Betancur J, Rubeaux M, Fuchs TA, Otaki Y, Arnson Y, Slipczuk L, Benz DC, Germano G, Dey D, Lin CJ, Berman DS, Kaufmann PA, Slomka PJ. Automatic valve plane localization in myocardial perfusion SPECT/CT by machine learning: anatomic and clinical validation. J Nucl Med. 2017;58:961–7.
27. Wang T, Lei Y, Tang H, He Z, Castillo R, Wang C, Li D, Higgins K, Liu T, Curran WJ, Zhou W, Yang X. A learning-based automatic segmentation and quantification method on left ventricle in gated myocardial perfusion SPECT imaging: a feasibility study. J Nucl Cardiol. 2020;27(3):976–87.
28. Arad Y, Goodman KJ, Roth M, Newstein D, Guerci AD. Coronary calcification, coronary disease risk factors, C-reactive protein, and atherosclerotic cardiovascular disease events: the St. Francis Heart Study. J Am Coll Cardiol. 2005;46:158–65.
29. Shaw LJ, Raggi P, Schisterman E, Berman DS, Callister TQ. Prognostic value of cardiac risk factors and coronary artery calcium screening for all-cause mortality. Radiology. 2003;228:826–33.
30. Park R, Detrano R, Xiang M, Fu P, Ibrahim Y, LaBree L, Azen S. Combined use of computed tomography coronary calcium scores and C-reactive protein levels in predicting cardiovascular events in nondiabetic individuals. Circulation. 2002;106:2073–7.
31. Wong ND, Hsu JC, Detrano RC, Diamond G, Eisenberg H, Gardin JM. Coronary artery calcium evaluation by electron beam computed tomography and its relation to new cardiovascular events. Am J Cardiol. 2000;86:495–8.
32. Kondos GT, Hoff JA, Sevrukov A, Daviglus ML, Garside DB, Devries SS, Chomka EV, Liu K. Electron-beam tomography coronary artery calcium and cardiac events: a 37-month follow-up of 5635 initially asymptomatic low- to intermediate-risk adults. Circulation. 2003;107:2571–6.
33. Greenland P, LaBree L, Azen SP, Doherty TM, Detrano RC. Coronary artery calcium score combined with Framingham score for risk prediction in asymptomatic individuals. JAMA. 2004;291:210–5.
34. Taylor AJ, Bindeman J, Feuerstein I, Cao F, Brazaitis M, O'Malley PG. Coronary calcium independently predicts incident premature coronary heart disease over measured cardiovascular risk factors: mean three-year outcomes in the Prospective Army Coronary Calcium (PACC) project. J Am Coll Cardiol. 2005;46:807–14.
35. Vliegenthart R, Oudkerk M, Hofman A, Oei HH, van Dijck W, van Rooij FJ, Witteman JC. Coronary calcification improves cardiovascular risk prediction in the elderly. Circulation. 2005;112:572–7.
36. Detrano R, Guerci AD, Carr JJ, Bild DE, Burke G, Folsom AR, Liu K, Shea S, Szklo M, Bluemke DA, O'Leary DH, Tracy R, Watson K, Wong ND, Kronmal RA. Coronary calcium as a predictor of coronary events in four racial or ethnic groups. N Engl J Med. 2008;358:1336–45.
37. Erbel R, Mohlenkamp S, Moebus S, Schmermund A, Lehmann N, Stang A, Dragano N, Gronemeyer D, Seibel R, Kalsch H, Brocker-Preuss M, Mann K, Siegrist J, Jockel KH, Heinz Nixdorf Recall Study Investigative Group. Coronary risk stratification, discrimination, and reclassification improvement based on quantification of subclinical coronary atherosclerosis: the Heinz Nixdorf Recall study. J Am Coll Cardiol. 2010;56:1397–406.
38. Greenland P, Blaha MJ, Budoff MJ, Erbel R, Watson KE. Coronary calcium score and cardiovascular risk. J Am Coll Cardiol. 2018;72:434–47.
39. Engbers EM, Timmer JR, Ottervanger JP, Mouden M, Knollema S, Jager PL. Prognostic value of coronary artery calcium scoring in addition to single-photon emission computed tomographic myocardial perfusion imaging in symptomatic patients. Circ Cardiovasc Imaging. 2016;9:e003966.
40. Budoff MJ, Young R, Burke G, Jeffrey Carr J, Detrano RC, Folsom AR, Kronmal R, Lima JAC, Liu KJ, McClelland RL, Michos E, Post WS, Shea S, Watson KE, Wong ND. Ten-year association of coronary artery calcium with atherosclerotic cardiovascular disease (ASCVD) events: the multi-ethnic study of atherosclerosis (MESA). Eur Heart J. 2018;39:2401–8.
41. Brodov Y, Gransar H, Dey D, Shalev A, Germano G, Friedman JD, Hayes SW, Thomson LE, Rogatko A, Berman DS, Slomka PJ. Combined quantitative assessment of myocardial perfusion and coronary artery calcium score by hybrid 82Rb PET/CT improves detection of coronary artery disease. J Nucl Med. 2015;56:1345–50.
42. Zampella E, Acampa W, Assante R, Nappi C, Gaudieri V, Mainolfi CG, Green R, Cantoni V, Panico M, Klain M, Petretta M, Slomka PJ, Cuocolo A. Combined evaluation of regional coronary artery calcium and myocardial perfusion by (82)Rb PET/CT in the identification of obstructive coronary artery disease. Eur J Nucl Med Mol Imaging. 2018;45:521–9.
43. Naya M, Murthy VL, Foster CR, Gaber M, Klein J, Hainer J, Dorbala S, Blankstein R, Di Carli MF. Prognostic interplay of coronary artery calcification and underlying vascular dysfunction in patients with suspected coronary artery disease. J Am Coll Cardiol. 2013;61:2098–106.
44. Schenker MP, Dorbala S, Hong EC, Rybicki FJ, Hachamovitch R, Kwong RY, Di Carli MF. Interrelation of coronary calcification, myocardial ischemia, and outcomes in patients with intermediate likelihood of coronary artery disease: a combined positron emission tomography/computed tomography study. Circulation. 2008;117:1693–700.
45. Lessmann N, van Ginneken B, Zreik M, de Jong PA, de Vos BD, Viergever MA, Isgum I. Automatic calcium scoring in low-dose chest CT using deep neural networks with dilated convolutions. IEEE Trans Med Imaging. 2018;37:615–25.
46. van Velzen SG, Lessmann N, Velthuis BK, Bank IE, van den Bongard DH, Leiner T, de Jong PA, Veldhuis WB, Correa A, Terry JG. Deep learning for automatic calcium scoring in CT: validation using multiple cardiac CT and chest CT protocols. Radiology. 2020;295:66–79.
47. Šprem J, De Vos BD, Lessmann N, Van Hamersvelt RW, Greuter MJ, De Jong PA, Leiner T, Viergever MA, Išgum I. Coronary calcium scoring with partial volume correction in anthropomorphic thorax phantom and screening chest CT images. PLoS One. 2018;13:e0209318.
48. Huo Y, Terry JG, Wang J, Nath V, Bermudez C, Bao S, Parvathaneni P, Carr JJ, Landman BA. Coronary calcium detection using 3D attention identical dual deep network based on weakly supervised learning. Proc SPIE Int Soc Opt Eng. 2019;10949:1094917.
49. Isgum I, de Vos BD, Wolterink JM, Dey D, Berman DS, Rubeaux M, Leiner T, Slomka PJ. Automatic determination of cardiovascular risk by CT attenuation correction maps in Rb-82 PET/CT. J Nucl Cardiol. 2018;25:2133–42.
50. Iacobellis G, Pistilli D, Gucciardo M, Leonetti F, Miraldi F, Brancaccio G, Gallo P, di Gioia CR. Adiponectin expression in human epicardial adipose tissue in vivo is lower in patients with coronary artery disease. Cytokine. 2005;29:251–5.
51. Ding J, Kritchevsky SB, Harris TB, Burke GL, Detrano RC, Szklo M, Carr JJ. The association of pericardial fat with calcified coronary plaque. Obesity. 2008;16:1914–9.
52. Rosito GA, Massaro JM, Hoffmann U, Ruberg FL, Mahabadi AA, Vasan RS, O'Donnell CJ, Fox CS. Pericardial fat, visceral abdominal fat, cardiovascular disease risk factors, and vascular calcification in a community-based sample: the Framingham Heart Study. Circulation. 2008;117:605–13.
11 Artificial Intelligence/Machine Learning in Nuclear Medicine and Hybrid Imaging 155
Hoffmann U. Association of pericardial fat, intratho- Jeffries NO, Harrell FE Jr, Rockhold FW, Broderick S,
racic fat, and visceral abdominal fat with cardiovascu- Ferguson TB Jr, Williams DO, Harrington RA, Stone
lar disease burden: the Framingham Heart Study. Eur GW, Rosenberg Y, ISCHEMIA Research Group.
Heart J. 2009;30(7):850–6. Initial invasive or conservative strategy for stable
54. Mahabadi AA, Berg MH, Lehmann N, Kalsch H, coronary disease. N Engl J Med. 2020;382:1395–407.
Bauer M, Kara K, Dragano N, Moebus S, Jockel 64. Betancur J, Commandeur F, Motlagh M, Sharir
KH, Erbel R, Mohlenkamp S. Association of epi- T, Einstein AJ, Bokhari S, Fish MB, Ruddy TD,
cardial fat with cardiovascular risk factors and inci- Kaufmann P, Sinusas AJ, Miller EJ, Bateman
dent myocardial infarction in the general population: TM, Dorbala S, Di Carli M, Germano G, Otaki
the Heinz Nixdorf Recall Study. J Am Coll Cardiol. Y, Tamarappoo BK, Dey D, Berman DS, Slomka
2013;61:1388–95. PJ. Deep learning for prediction of obstructive disease
55. Commandeur F, Goeller M, Razipour A, Cadet S, Hell from fast myocardial perfusion SPECT: a multicenter
MM, Kwiecinski J, Chen X, Chang HJ, Marwan M, study. J Am Coll Cardiol Img. 2018;11:1654–63.
Achenbach S, Berman DS, Slomka PJ, Tamarappoo 65. Betancur J, Hu LH, Commandeur F, Sharir T, Einstein
BK, Dey D. Fully automated CT quantification of epi- AJ, Fish MB, Ruddy TD, Kaufmann PA, Sinusas
cardial adipose tissue by deep learning: a multicenter AJ, Miller EJ, Bateman TM, Dorbala S, Di Carli M,
study. Radiol Artif Intell. 2019;1:e190045. Germano G, Otaki Y, Liang JX, Tamarappoo BK, Dey
56. Commandeur F, Goeller M, Betancur J, Cadet S, Doris D, Berman DS, Slomka PJ. Deep learning analysis
M, Chen X, Berman DS, Slomka PJ, Tamarappoo of upright-supine high-efficiency SPECT myocardial
BK, Dey D. Deep learning for quantification of epi- perfusion imaging for prediction of obstructive coro-
cardial and thoracic adipose tissue from non-contrast nary artery disease: a multicenter study. J Nucl Med.
CT. IEEE Trans Med Imaging. 2018;37:1835–46. 2019;60:664–70.
57. Garcia EV, Klein JL, Moncayo V, Cooke CD, Del’Aune 66. Otaki Y, Singh A, Kavanagh P, Miller RJ, Parekh T,
C, Folks R, Moreiras LV, Esteves F. Diagnostic per- Tamarappoo BK, Sharir T, Einstein AJ, Fish MB,
formance of an artificial intelligence-driven cardiac- Ruddy TD, Kaufmann PA, Sinusas AJ, Miller EJ,
structured reporting system for myocardial perfusion Bateman TM, Sharmila Dorbala M, Carli MD, Cadet
SPECT imaging. J Nucl Cardiol. 2020;27:1652–64. S, Liang JX, Dey D, Berman DS, Slomka PJ. Clinical
58. Chang CC, Lin CJ. LIBSVM: a library for support deployment of explainable artificial intelligence for
vector machines. ACM Trans Intell Syst Technol diagnosis of coronary artery disease. JACC Cardiovas
(TIST). 2011;2:27. Imag. 2021. (In review).
59. Arsanjani R, Xu Y, Dey D, Fish M, Dorbala S, Hayes 67. Otaki Y, Tamarappoo B, Singh A, Sharir T, Hu LH,
S, Berman D, Germano G, Slomka P. Improved accu- Gransar H, Einstein A, Fish M, Ruddy T, Kaufmann
racy of myocardial perfusion SPECT for the detec- P, Sinusas A, Miller E, Bateman T, Dorbala S, Di Carli
tion of coronary artery disease using a support vector M, Liang J, Dey D, Berman D, Slomka P. Diagnostic
machine algorithm. J Nucl Med. 2013;54:549–55. accuracy of deep learning for myocardial perfusion
60. Arsanjani R, Xu Y, Dey D, Vahistha V, Shalev A, imaging in men and women with a high-efficiency
Nakanishi R, Hayes S, Fish M, Berman D, Germano parallel-hole-collimated cadmium-zinc-telluride cam-
G, Slomka PJ. Improved accuracy of myocardial per- era: multicenter study. J Nucl Med. 2020;61:92.
fusion SPECT for detection of coronary artery dis- 68. Togo R, Hirata K, Manabe O, Ohira H, Tsujino I,
ease by machine learning in a large population. J Nucl Magota K, Ogawa T, Haseyama M, Shiga T. Cardiac
Cardiol. 2013;20:553–62. sarcoidosis classification with deep convolutional
61. Spier N, Nekolla S, Rupprecht C, Mustafa M, Navab neural network-based features using polar maps.
N, Baust M. Classification of polar maps from cardiac Comput Biol Med. 2019;104:81–6.
perfusion imaging with graph-convolutional neural 69. Arsanjani R, Dey D, Khachatryan T, Shalev A, Hayes
networks. Sci Rep. 2019;9:7569. SW, Fish M, Nakanishi R, Germano G, Berman DS,
62. Liu H, Wu J, Miller EJ, Liu C, Liu Y-H. Diagnostic Slomka P. Prediction of revascularization after myo-
accuracy of stress-only myocardial perfusion SPECT cardial perfusion SPECT by machine learning in a
improved by deep learning. Eur J Nucl Med Mol large population. J Nucl Cardiol. 2015;22:877–84.
Imaging. 2021:1–8. 70. Hu LH, Betancur J, Sharir T, Einstein AJ, Bokhari S,
63. Maron DJ, Hochman JS, Reynolds HR, Bangalore Fish MB, Ruddy TD, Kaufmann PA, Sinusas AJ, Miller
S, O’Brien SM, Boden WE, Chaitman BR, Senior R, EJ, Bateman TM, Dorbala S, Di Carli M, Germano G,
Lopez-Sendon J, Alexander KP, Lopes RD, Shaw LJ, Commandeur F, Liang JX, Otaki Y, Tamarappoo BK,
Berger JS, Newman JD, Sidhu MS, Goodman SG, Dey D, Berman DS, Slomka PJ. Machine learning
Ruzyllo W, Gosselin G, Maggioni AP, White HD, predicts per-vessel early coronary revascularization
Bhargava B, Min JK, GBJ M, Berman DS, Picard MH, after fast myocardial perfusion SPECT: results from
Kwong RY, Ali ZA, Mark DB, Spertus JA, Krishnan multicentre REFINE SPECT registry. Eur Heart J
MN, Elghamaz A, Moorthy N, Hueb WA, Demkow Cardiovasc Imaging. 2020;21:549–59.
M, Mavromatis K, Bockeria O, Peteiro J, Miller TD, 71. Betancur J, Otaki Y, Motwani M, Fish MB, Lemley
Szwed H, Doerr R, Keltai M, Selvanayagam JB, Steg M, Dey D, Gransar H, Tamarappoo B, Germano G,
PG, Held C, Kohsaka S, Mavromichalis S, Kirby R, Sharir T, Berman DS, Slomka PJ. Prognostic value of
156 R. J. H. Miller et al.
combined clinical and myocardial perfusion imaging PJ, Berman DS, Dey D. Machine learning integration
data using machine learning. J Am Coll Cardiol Img. of circulating and imaging biomarkers for explainable
2018;11:1000–9. patient-specific prediction of cardiac events: a pro-
72. Juarez-Orozco LE, Martinez-Manzanera O, van der spective study. Atherosclerosis. 2021;318:76–82.
Zant FM, Knol RJJ, Knuuti J. Deep learning in quan- 80. Haro Alonso D, Wernick MN, Yang Y, Germano G,
titative PET myocardial perfusion imaging: a study Berman DS, Slomka P. Prediction of cardiac death
on cardiovascular event prediction. JACC Cardiovasc after adenosine myocardial perfusion SPECT based on
Imaging. 2020;13:180–2. machine learning. J Nucl Cardiol. 2019;26:1746–54.
73. Dekker M, Waissi F, Bank IE, Isgum I, Scholtens AM, 81. Mercuri M, Pascual TNB, Mahmarian JJ, Shaw LJ,
Velthuis BK, Pasterkamp G, de Winter RJ, Mosterd A, Dondi M, Paez D, Einstein AJ. Estimating the reduc-
Timmers L. The prognostic value of automated coro- tion in the radiation burden from nuclear cardiology
nary calcium derived by a deep learning approach on through use of stress-only imaging in the United States
non-ECG gated CT images from 82Rb-PET/CT myo- and worldwide. JAMA Intern Med. 2016;176:269–73.
cardial perfusion imaging. Int J Cardiol. 2021;329:9– 82. Einstein AJ, Pascual TN, Mercuri M, Karthikeyan G,
15. https://doi.org/10.1016/j.ijcard.2020.12.079. Vitola JV, Mahmarian JJ, Better N, Bouyoucef SE,
74. Commandeur F, Slomka PJ, Goeller M, Chen X, Cadet Hee-Seung Bom H, Lele V, Magboo VP, Alexanderson
S, Razipour A, McElhinney P, Gransar H, Cantu S, E, Allam AH, Al-Mallah MH, Flotats A, Jerome S,
Miller RJH, Rozanski A, Achenbach S, Tamarappoo Kaufmann PA, Luxenburg O, Shaw LJ, Underwood
BK, Berman DS, Dey D. Machine learning to predict SR, Rehani MM, Kashyap R, Paez D, Dondi
the long-term risk of myocardial infarction and car- M. Current worldwide nuclear cardiology practices
diac death based on clinical risk, coronary calcium, and radiation exposure: results from the 65 country
and epicardial adipose tissue: a prospective study. IAEA Nuclear Cardiology Protocols Cross-Sectional
Cardiovasc Res. 2020;116:2216–25. Study (INCAPS). Eur Heart J. 2015;36:1689–96.
75. Lin A, Wong ND, Razipour A, McElhinney PA, 83. Jerome SD, Tilkemeier PL, Farrell MB, Shaw
Commandeur F, Cadet SJ, Gransar H, Chen X, Cantu LJ. Nationwide Laboratory adherence to myocardial
S, Miller RJH, Nerlekar N, Wong DTL, Slomka perfusion imaging radiation dose reduction practices: a
PJ, Rozanski A, Tamarappoo BK, Berman DS, Dey report from the intersocietal accreditation commission
D. Metabolic syndrome, fatty liver, and artificial data repository. J Am Coll Cardiol Img. 2015;8:1170–6.
intelligence-based epicardial adipose tissue measures 84. Eisenberg E, Betancur J, Hu LH, Sharir T, Einstein
predict long-term risk of cardiac events: a prospective A, Ruddy T, Kaufmann P, Sinusas A, Miller E,
study. Cardiovasc Diabetol. 2021;20:27. Bateman T, Dorbala S, Di Carli M, Germano G, Otaki
76. Tamarappoo B, Dey D, Shmilovich H, Nakazato R, Y, Tamarappoo B, Dey D, Berman D, Slomka P. The
Gransar H, Cheng VY, Friedman JD, Hayes SW, diagnostic accuracy of machine learning from stress
Thomson LE, Slomka PJ, Rozanski A, Berman only fast-MPS. J Nucl Med. 2018;59:508.
DS. Increased pericardial fat volume measured from 85. Hu LH, Miller RJH, Sharir T, Commandeur F, Rios
noncontrast CT predicts myocardial ischemia by R, Einstein AJ, Fish MB, Ruddy TD, Kaufmann PA,
SPECT. J Am Coll Cardiol Img. 2010;3:1104–12. Sinusas AJ, Miller EJ, Bateman TM, Dorbala S, Di
77. Goeller M, Achenbach S, Marwan M, Doris MK, Carli M, Liang JX, Eisenberg E, Dey D, Berman DS,
Cadet S, Commandeur F, Chen X, Slomka PJ, Gransar Slomka PJ. Prognostically safe stress-only single-
H, Cao JJ, Wong ND, Albrecht MH, Rozanski A, photon emission computed tomography myocar-
Tamarappoo BK, Berman DS, Dey D. Epicardial adi- dial perfusion imaging guided by machine learning:
pose tissue density and volume are related to subclini- report from REFINE SPECT. European Heart Journal
cal atherosclerosis, inflammation and major adverse Cardiovascular Imaging. 2020;22(6):705–14.
cardiac events in asymptomatic subjects. J Cardiovasc 86. Mearns BM. Stress-only SPECT reduces radia-
Comput Tomogr. 2018;12:67–73. tion exposure but does not affect mortality. Nat Rev
78. Neeland IJ, Ross R, Despres JP, Matsuzawa Y, Cardiol. 2010;7:178.
Yamashita S, Shai I, Seidell J, Magni P, Santos RD, 87. Selvaraju RR, Cogswell M, Das A, Vedantam R,
Arsenault B, Cuevas A, Hu FB, Griffin B, Zambon Parikh D, Batra D. Grad-cam: visual explanations
A, Barter P, Fruchart JC, Eckel RH, International from deep networks via gradient-based localization.
Atherosclerosis Society, International Chair on In: Proceedings of the IEEE international conference
Cardiometabolic Risk Working Group on Visceral on computer vision. 2017. p. 618–26.
Obesity. Visceral and ectopic fat, atherosclerosis, and 88. Zeiler MD, Fergus R. Visualizing and understanding
cardiometabolic disease: a position statement. Lancet convolutional networks. In: European conference on
Diabetes Endocrinol. 2019;7:715–25. computer vision 2014. p. 818–833.
79. Tamarappoo BK, Lin A, Commandeur F, McElhinney 89. de Vos BD, Wolterink JM, Leiner T, de Jong PA,
PA, Cadet S, Goeller M, Razipour A, Chen X, Gransar Lessmann N, Isgum I. Direct automatic coronary
H, Cantu S, Miller RJ, Achenbach S, Friedman J, calcium scoring in cardiac and chest CT. IEEE Trans
Hayes S, Thomson L, Wong ND, Rozanski A, Slomka Med Imaging. 2019;38:2127–38.
Part III
Impact of AI and ML on Molecular Imaging and Theranostics
12 Artificial Intelligence Will Improve Molecular Imaging, Therapy and Theranostics. Which Are the Biggest Advantages for Therapy?

G. Kaissis and R. Braren

Contents
12.1 Introduction 159
12.2 Literature Review 162
12.2.1 Morphological and Metabolic Tumor Volume Tracking 162
12.3 Quantitative Image and Texture Analysis in Oncological Therapy Response Monitoring 163
12.3.1 Neuro-Oncology 163
12.3.2 Head and Neck Cancers 163
12.3.3 Lung Cancer 164
12.3.4 Prostate Cancer 164
12.3.5 Breast Cancer 164
12.3.6 Gastrointestinal Oncology 165
12.4 Discussion and Outlook 166
References 167
The trend of large dataset accrual has increasingly also manifested in the medical field, with large databases of medical imaging data being assembled as national efforts attempting to provide a cross-sectional assessment of large populations of both healthy volunteers and patients. The German National Cohort Health Study (NAKO Gesundheitsstudie, www.nako.de) and the United Kingdom Biobank [4] are examples of this development, providing access to thousands of imaging data sets to researchers and practitioners in the field, which can be used for the development of machine learning algorithms. These efforts supplement initiatives such as that described in [5], representing curated collections of oncology-specific material including medical imaging, but also digital histopathology or genomic sequence data. The increasing roll-out of partially or fully electronic patient records signifies a further important step towards the collection of relevant metadata, which can be included in predictive models alongside image-based information. However, such data repositories are not without specific challenges: large-scale data collection signifies an increased importance of privacy protection, for which next-generation methods have only recently been introduced [6]. Moreover, data quality is paramount for the development of predictive algorithms; thus care needs to be taken that images and clinical metadata are generated and expertly curated with high standards of quality assurance. Algorithms need to be trained and validated on diverse and representative patient collectives to ascertain not only their validity when applied to unseen data from new sources, but also to assert their fairness, control their bias and render them reproducible and interpretable. The deployment of machine learning algorithms to clinical routine poses great challenges of its own, necessitating interdisciplinary cooperation and continuous monitoring and improvements. Finally, the reimbursement of algorithm-based diagnostic services remains largely unresolved. Issues such as these represent but a limited subset of the parameters which need to be taken into account in the design of artificial intelligence algorithms for medical use and are discussed in other parts of this book, as well as touched upon later in this chapter.

As expected for a novel field, most of the literature published on the topic of artificial intelligence applications in medical imaging has focused on diagnostic applications in the field of oncology, such as the prediction of tumor subtypes, genetic features, metastatic behavior or patient survival. Algorithms targeted at diagnosis often provide objectively verifiable outputs (e.g. by comparison of the algorithm's prediction to a histopathologic result), and can be compared to the performance of human experts (e.g. true/false positive/negative rates), facilitating their validation. The field of therapy monitoring and theranostics, that is, the image-based expression quantification of relevant therapeutic targets, has however not yet witnessed the same level of research activity. Several reasons emerge, such as the following:

1. Treatment represents a heterogeneous clinical process characterized by the application of several therapeutic approaches, often simultaneously. For instance, oncologic therapy consists of surgical, pharmacologic, radiotherapeutic, and other supplemental interventions. Establishing causal relationships between a certain treatment and its effect is therefore often a difficult undertaking.
2. The interplay between treatment and disease is hard to accurately quantify. For example, tumors demonstrate therapy escape phenomena leading to treatment resistance, which can be hard to distinguish from inefficacy or primary failure of the treatment.
3. Cancer imaging is influenced by systemic effects such as individual toxicity or comorbidities that can have a modulating effect on local findings (e.g. perfusion effects of anti-vascular agents versus decrease in cardiovascular output causing tissue mal-perfusion) and which are hard to deconvolve from specific treatment outcomes.
4. Effects mediating treatment response are also functions of the complex genetic, transcriptomic, epigenetic, and environmental tumor landscape, in which causes and effects can be impossible to distinguish.
5. Novel treatments are continuously introduced; thus retrospectively collected data, often the bedrock of oncological machine learning applications, might not be applicable as algorithm training material.
6. Finally, cancer is insufficiently understood and represents a disease as individual as the patients themselves. Intra- and inter-tumoral heterogeneity thus pose hindrances to the applicability of algorithmic tools aimed foremost at generalization, drastically increasing the difficulty of training such algorithms.

In attempting to taxonomically classify the current literature about machine learning and artificial intelligence approaches for treatment response prediction and assessment as well as theranostics, two patterns emerge:

• The majority of studies focus on the prediction of therapy response from a single timepoint and single surrogate. Such studies attempt to capture information from a singular imaging study, often the baseline examination, to predict differences in treatment outcome by characterizing a specific tumor phenotype.
• Studies focusing on longitudinal/integrative monitoring of findings, for example, integrating the features of the tumor alongside relevant metadata and/or their evolution over the treatment period to predict the course of therapy.

With respect to the defining tumor features, research can be stratified into studies aiming at the quantification of tumor volume, either purely morphological or morphological and metabolic, for example, by the definition and automated tracking of metabolic tumor volume, and into studies concerned with higher-order descriptors of disease features or treatment targets. Such features can be derived from the tumor itself, for example, histogram metrics, texture features etc., and/or incorporate other data, such as clinical record information.

Finally, from a methodological point of view, research can be divided into studies applying traditional computer vision techniques, utilizing predefined mathematical descriptors of the image (features) alongside machine learning methods typically used for tabular data analysis, such as regression models, tree-based algorithms etc., and studies applying deep neural networks directly to the imaging data. For the former, the term radiomics is often used. We would like to point out that this distinction is not formal, and the term radiomics is used for deep-neural-network-based algorithms as well. Due to its ill definition, we eschew the usage of this term altogether and refer instead to the techniques and algorithms in question by their technical description, which we believe to be both clearer and more informative.

The methodological concerns applied to a study are also a function of the data used for algorithm development. Unlike pure anatomic imaging, which typically takes the form of a three-dimensional stack of images in black and white, hybrid and functional imaging usually provides at least two congruent images for the same anatomical location. In case of dynamic acquisitions, such as multiple contrast media phases, the dimensionality of the data further increases. This data is often heterogeneous with respect to its spatial resolution (e.g. the technical resolution of the scanner, or the effects resulting from interactions of radionuclides with the tissue leading to, for example, the actual resolution of PET differing from the nominal resolution of the detector elements). These factors need to be taken into account and potentially corrected for in quantitative imaging studies.

In the following sections we will highlight and contrast relevant literature findings regarding the application of machine learning to therapy response evaluation with a focus on hybrid oncological imaging and provide recommendations and future directions for practitioners and researchers in the field.
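To make the first branch of this methodological division concrete, the sketch below extracts a handful of predefined first-order (histogram) descriptors from a volume of interest and feeds them to a regression model, mirroring the feature-plus-tabular-learner pattern described above. It is a minimal illustration with synthetic placeholder data, not a feature set or model from any study cited in this chapter.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def first_order_features(volume, mask):
    """Histogram descriptors of the voxel intensities inside a volume of interest."""
    voxels = volume[mask > 0].astype(float)
    return np.array([
        voxels.mean(),              # mean intensity (e.g. mean SUV on PET)
        voxels.std(),               # spread of the intensity histogram
        skew(voxels),               # histogram asymmetry
        kurtosis(voxels),           # heaviness of the histogram tails
        np.percentile(voxels, 90),  # robust summary of high uptake
    ])

# Synthetic stand-ins for co-registered volumes, VOI masks and binary response labels.
rng = np.random.default_rng(0)
volumes = [rng.gamma(2.0, 1.5, size=(32, 32, 32)) for _ in range(40)]
masks = []
for _ in range(40):
    m = np.zeros((32, 32, 32), dtype=np.uint8)
    m[12:20, 12:20, 12:20] = 1  # a synthetic cubic VOI
    masks.append(m)
labels = rng.integers(0, 2, size=40)

X = np.stack([first_order_features(v, m) for v, m in zip(volumes, masks)])
model = LogisticRegression(max_iter=1000)
print(cross_val_score(model, X, labels, cv=5, scoring="roc_auc"))
```

A real study would add texture and shape descriptors, harmonize them across scanners, and validate the model on independent cohorts, for exactly the reasons raised above.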
information captured at multiple scales and the transmission of high spatial frequency (i.e. high detail) image information from early to late parts of the network with corresponding feature map sizes. Encoder–decoder architectures have dominated the segmentation literature since ca. 2015, and can be applied both in two and three dimensions. Fully automatic segmentation has been proposed as a solution to the aforementioned standardization problem [20] and has been successfully applied both to treatment response assessment, for example, in breast cancer [21], where it has been shown to outperform dynamic contrast-enhanced MRI, and to treatment planning, for example, for brain tumor radiotherapy [22].
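The encoder–decoder pattern with skip connections can be illustrated with a deliberately tiny two-level PyTorch sketch (cf. the fully convolutional networks of [17] and U-Net of [18]); the channel counts, input size, and two-channel input (e.g. a PET and a CT slice) are arbitrary illustrative choices, not an architecture from the studies cited in this chapter.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU: the basic feature-extraction unit
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    """Minimal two-level encoder-decoder with a single skip connection."""
    def __init__(self, in_ch=2, n_classes=2):
        super().__init__()
        self.enc1 = conv_block(in_ch, 16)   # high-resolution features
        self.down = nn.MaxPool2d(2)         # encoder: halve the resolution
        self.enc2 = conv_block(16, 32)      # coarser, more semantic features
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)      # 16 skip channels + 16 upsampled
        self.head = nn.Conv2d(16, n_classes, kernel_size=1)

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(self.down(f1))
        up = self.up(f2)
        # Skip connection: forward high-spatial-frequency detail from the
        # early encoder stage directly to the decoder stage of the same size.
        return self.head(self.dec1(torch.cat([up, f1], dim=1)))

# A batch of one two-channel 64x64 slice -> per-pixel class logits of the same size.
logits = TinyUNet()(torch.randn(1, 2, 64, 64))
print(logits.shape)  # torch.Size([1, 2, 64, 64])
```

Practical segmentation networks stack several such levels, in two or three dimensions; the principle of downsampling for semantic context plus skip connections for spatial detail is the same.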
12.3 Quantitative Image and Texture Analysis in Oncological Therapy Response Monitoring

The advent of quantitative image analysis workflows within the past 5 years has generated significant interest in the utilization of image-derived data for tumor characterization. Such approaches rely on either the bulk extraction of tumor-related image features, their preprocessing and modeling using machine learning (also termed radiomics), or the end-to-end analysis of image data using neural networks. As discussed above, we will not terminologically differentiate between these approaches, believing them to not be mutually exclusive. However, it is expected that the numerous shortcomings of the so-called radiomics workflow will eventually lead to its replacement by algorithms based on more robust techniques and models, which are not susceptible to the same technical limitations we will describe below. The typical workflow of quantitative image analysis studies is common to both approaches, consisting of a volume of interest definition step and a modeling step. For volume of interest definition, i.e. segmentation, both manual and all of the above-mentioned automatic methods are applicable and commonly used. For details on the various techniques, we refer to the chapters in Part I of this book.

The research developments in the field of treatment supervision in hybrid imaging have closely followed the main oncologic application areas of PET.

12.3.1 Neuro-Oncology

In neuro-oncologic applications, for example, studies have focused on the identification of molecular phenotypes with relevance for therapy and prognosis, such as isocitrate dehydrogenase status [23], from amino acid (fluoroethyl tyrosine, FET) PET scans in gliomas. The authors found that the inclusion of radiomic parameters improved diagnostic accuracy compared to PET-derived metrics alone. Similarly, a recent study by Hotta et al. found image texture parameters derived from 11C-methionine PET to yield excellent discriminative performance between recurrence of malignant brain tumors and radiation necrosis [24], a topic of critical relevance for steering treatment decisions. A multitude of works (see e.g. the overview in [25]) have focused on brain metastases, amongst others for the differentiation of primary brain tumors from metastases, for pinpointing the origin of metastatic lesions to the brain, and for differentiating treatment-related changes from recurrence. Recent studies have also focused specifically on treatment, with studies by Cha et al. demonstrating strong performance of convolutional neural network ensembles in the prediction of metastatic lesion response to radiotherapy [26] from baseline imaging examinations.

12.3.2 Head and Neck Cancers

In head and neck cancers, several studies have demonstrated the benefits of integrating quantitative imaging features with morphological tumor descriptors for predictive modeling workflows. For instance, Fujima et al. showed that in patients who underwent chemoradiation treatment for pharyngeal cancer, tumor shape and texture features were highly predictive of progression-free and overall patient survival [27]. They note that clinical parameters alone were not sufficient for discriminating survival subgroups in their study. Feliciani et al. employed texture metrics derived from pretherapeutic FDG-PET and found these imaging biomarkers highly predictive of local chemoradiation therapy failure [28]. Crispin-Ortuzar and colleagues aimed at predicting head and neck tumor hypoxia, which is usually assessed, for example, with specific hypoxia radiotracers such as 18F-FMISO, using FDG-PET-derived texture parameters. They report substantial improvements over baseline FDG-PET performance alone and note that quantitative imaging biomarkers can provide an alternative to hypoxia-specific radiotracers where such are unavailable [29].

12.3.3 Lung Cancer

In lung cancer, the relevance of including FDG-PET into patient workup was shown in the 2002 PLUS trial [30], demonstrating a 20% reduction in unnecessary surgical interventions. Consequently, several studies have investigated quantitative imaging features, for example, in the prediction of histological subtypes [31] or post-treatment survival [32]. Oikonomou et al. studied the association of quantitative image features with several outcomes, including local and distant disease control, recurrence-free probability and survival metrics, and found image-derived features to represent the only predictors of overall survival, disease-specific survival and regional disease control [33]. A recent multicenter trial by Dissaux et al. demonstrated that FDG-PET-derived texture features predict local disease control in patients undergoing stereotactic radiotherapy for early-stage non-small-cell lung cancer and highlighted the potential value of such algorithms for therapeutic decision-making. The large body of research into machine learning and quantitative imaging biomarker applications in lung cancer has also provided insight into key challenges associated with such applications. Yang et al. note that the widespread application of texture-derived image features as prognostic predictors is impeded by a lack of quality control and robustness, and proceed to demonstrate high inter-rater variability impacting the reproducibility of texture parameters [34]. Such challenges are of course not inherent to thoracic imaging workflows alone and have been repeatedly noted in previous studies irrespective of the imaging modality applied [35, 36], with PET-specific solutions recently proposed [37].

12.3.4 Prostate Cancer

The role of hybrid imaging in prostate cancer is continuously evolving and expanding with the application of Gallium- or Fluorine-labeled PSMA, supported by recent meta-analyses [38, 39] and having been demonstrated to impact patient management in a majority of cases [40]. The first randomized prospective trial testing the influence of PSMA PET/CT on prostate cancer patient outcome was announced in early 2019 [41]. Quantitative imaging feature studies have recently provided promising results applied to PSMA PET. For example, Zamboglou et al. demonstrate PSMA-PET-derived quantitative features to discriminate between cancer- and non-cancer-affected prostate tissue, as well as differentiate between Gleason scores of 7 and ≥8 and between patients with and without nodal involvement [42]. PSMA expression is an excellent example of a theranostic application, i.e. the specific expression monitoring of a therapy-relevant target: since Lutetium-PSMA can be used for radioligand treatment in advanced prostate cancer [43], machine-learning applications predicting response to such therapy directly from the images could hence represent a promising next step.

12.3.5 Breast Cancer

The field of breast cancer research has witnessed some of the strongest advances in the utilization of quantitative imaging workflows and the application of machine intelligence, likely due to the high quality of image acquisition (owing to the lack of motion artifacts), the universal implementation of standardized reporting in the form of BIRADS, and the high incidence of the disease. Hence, several studies have proposed image-derived features for the noninvasive characterization of breast cancer. For example, Antunovic et al. utilized pretreatment FDG-PET/CT of breast cancer and found histogram features to be associated with histopathological, molecular, and receptor expression subtypes [44]. Similarly, Huang et al. found image features derived from PET/MRI data to be associated with tumor grading, stage, subtype, recurrence, and survival [45]. Ou et al. utilized machine learning to differentiate between breast carcinoma and breast lymphoma based on texture features derived from FDG-PET/CT [46]. Focused on therapy response prediction, Antunovic and colleagues noted the association of molecular breast cancer subtypes with distinct responses to neoadjuvant chemotherapy and developed machine learning algorithms on FDG-PET/CT to predict pathological complete response in locally advanced breast cancer [47]. Ha et al. also utilized FDG-PET/CT to develop machine learning-derived metabolic signatures of breast cancer associated with Ki67 gene expression, pathological complete response to neoadjuvant chemotherapy, and recurrence risk [48]. As noted above, however, such workflows are not without challenges, and it was recently noted in the work by Sollini et al. that "most evidence on the utility [...] is at the feasibility level". The authors recommend harmonization, validation on representative datasets, and the establishment of guidelines for the application of quantitative imaging parameters in breast imaging [49].

12.3.6 Gastrointestinal Oncology

The largest body of work regarding therapy prediction using quantitative image-derived parameters in hybrid imaging has arguably been produced in the area of gastrointestinal oncology. In esophageal cancer, for instance, several studies on radiomics workflows have highlighted the significance of heterogeneity-related image features and have derived models predictive of prognosis and therapy response [50–52]. Yip et al. included longitudinally acquired datasets in their model and found a decrease in tumor heterogeneity-related texture and histogram features to be associated with tumor response and patient survival [53]. Ypsilantis et al. employed convolutional neural networks on PET scans and found them to outperform radiomics models in the prediction of therapy response in esophageal cancer [54]. Furthermore, sub-regional analyses, taking into account intra-tumoral heterogeneity, are being assessed for their impact on the survival of esophageal cancer patients treated with chemoradiation, shown, for example, in the study by Xie et al. [55].

In pancreatic cancer, multiparametric imaging and machine learning have been investigated for the differentiation of inflammatory and neoplastic processes [56]. The added utility of hybrid fusion imaging for the delineation of tumors has been noted by Belli et al. in a recent study [57], with applications in quantitative imaging workflows. In our own work, we note the importance and potential benefits of multiparametric data integration for accurate prognostic prediction in the field of pancreatic cancer [58]. Cui et al. identified quantitative parameters prognostic of stereotactic radiation therapy in pancreatic cancer from FDG-PET/CT imaging [59]. With the evolving role of hybrid imaging for therapy planning in pancreatic cancer [60, 61], especially with respect to neoadjuvant treatment regimens, as well as the advances in molecular subtyping, including the distinction of differentially activated metabolic pathways [62–64], it must be assumed that the scope of quantitative imaging workflows will soon expand further to hybrid imaging.

In rectal cancer, several studies have investigated the utility of pretreatment quantitative imaging biomarkers in the prediction of therapy response. The study by Lovinfosse and colleagues found texture parameters derived from pretreatment FDG-PET/CT predictive of survival in a cohort of patients with locally advanced rectal cancer treated with neoadjuvant chemoradiation, noting that these features outperformed volume-based parameters in predictive performance [65]. Amorim et al. compared FDG-PET- and diffusion-weighted MRI-derived image features in rectal cancer [66].
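The texture parameters recurring throughout Sect. 12.3 are commonly derived from a gray level co-occurrence matrix. The following numpy sketch shows the core idea for a single 2D slice and one pixel offset; it is a simplified illustration on a synthetic image, not the exact feature definitions used in the studies cited above.

```python
import numpy as np

def glcm(img, levels=8, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset."""
    # Quantize intensities into `levels` discrete gray levels.
    edges = np.linspace(img.min(), img.max(), levels + 1)[1:-1]
    q = np.digitize(img, edges)
    p = np.zeros((levels, levels))
    h, w = q.shape
    src = q[: h - dy, : w - dx]   # reference pixels
    dst = q[dy:, dx:]             # neighbors at offset (dx, dy)
    np.add.at(p, (src.ravel(), dst.ravel()), 1)
    p = p + p.T                   # symmetrize: count both directions
    return p / p.sum()

def texture_features(p):
    i, j = np.indices(p.shape)
    return {
        "contrast": float((p * (i - j) ** 2).sum()),           # local intensity variation
        "homogeneity": float((p / (1 + np.abs(i - j))).sum()),
        "energy": float((p ** 2).sum()),                       # uniformity of the matrix
    }

slice_2d = np.random.default_rng(1).gamma(2.0, 1.0, size=(64, 64))  # stand-in tumor slice
print(texture_features(glcm(slice_2d)))
```

Radiomics toolkits compute many such statistics over multiple offsets and angles; the quantization and offset settings are precisely the kind of parameters whose standardization the harmonization work cited above (e.g. [37]) addresses.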
References

15. [...]colysis in esophageal carcinoma patients treated with definitive chemoradiotherapy. Nucl Med Commun. 2018;39:553–63.
16. Gallamini A, Kostakoglu L. Metabolic tumor volume: we still need a platinum-standard metric. J Nucl Med. 2016;58:196–7.
17. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. arXiv e-prints [Internet]. 2014. Available from: http://arxiv.org/abs/1411.4038v2
18. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. arXiv e-prints [Internet]. 2015. Available from: http://arxiv.org/abs/1505.04597v1
19. Wu X, Sahoo D, Zhang D, Zhu J, Hoi SC. Single-shot bidirectional pyramid networks for high-quality object detection. Neurocomputing. 2020;401:1–9.
20. Beichel RR, Smith BJ, Bauer C, Ulrich EJ, Ahmadvand P, Budzevich MM, et al. Multi-site quality and variability analysis of 3D FDG PET segmentations based on phantom and clinical image data. Med Phys. 2017;44:479–96.
21. Andreassen MMS, Goa PE, Sjøbakk TE, Hedayati R, Eikesdal HP, Deng C, et al. Semi-automatic segmentation from intrinsically-registered 18F-FDG PET/MRI for treatment response assessment in a breast cancer cohort: comparison to manual DCE-MRI. MAGMA. 2019;33:317–28.
22. Rundo L, Stefano A, Militello C, Russo G, Sabini MG, D'Arrigo C, et al. A fully automatic approach for multimodal PET and MR image segmentation in gamma knife treatment planning. Comput Methods Prog Biomed. 2017;144:77–96.
23. Lohmann P, Lerche C, Bauer EK, Steger J, Stoffels G, Blau T, et al. Predicting IDH genotype in gliomas using FET PET radiomics. Sci Rep. 2018;8:13328.
24. Hotta M, Minamimoto R, Miwa K. 11C-methionine-PET for differentiating recurrent brain tumor from radiation necrosis: radiomics approach with random forest classifier. Sci Rep. 2019;9:15666.
25. Lohmann P, Kocher M, Ruge MI, Visser-Vandewalle V, Shah NJ, Fink GR, et al. PET/MRI radiomics in patients with brain metastases. Front Neurol. 2020;11:1.
26. Cha YJ, Jang WI, Kim M-S, Yoo HJ, Paik EK, Jeong HK, et al. Prediction of response to stereotactic radiosurgery for brain metastases using convolutional neural networks. Anticancer Res. 2018;38:5437–45.
27. Fujima N, Hirata K, Shiga T, Li R, Yasuda K, Onimaru R, et al. Integrating quantitative morphological and intratumoural textural characteristics in FDG-PET for the prediction of prognosis in pharynx squamous cell carcinoma patients. Clin Radiol. 2018;73(1059):e1–8.
28. Feliciani G, Fioroni F, Grassi E, Bertolini M, Rosca A, Timon G, et al. Radiomic profiling of head and neck cancer: 18F-FDG PET texture analysis as predictor of patient survival. Contrast Media Mol Imaging. 2018;2018:1–8.
29. Crispin-Ortuzar M, Apte A, Grkovski M, Oh JH, Lee NY, Schöder H, et al. Predicting hypoxia status using a combination of contrast-enhanced computed tomography and [18F]-fluorodeoxyglucose positron emission tomography radiomics features. Radiother Oncol. 2018;127:36–42.
30. van Tinteren H, Hoekstra OS, Smit EF, van den Bergh JH, Schreurs AJ, Stallaert RA, et al. Effectiveness of positron emission tomography in the preoperative assessment of patients with suspected non-small-cell lung cancer: the PLUS multicentre randomised trial. Lancet. 2002;359:1388–92.
31. Hyun SH, Ahn MS, Koh YW, Lee SJ. A machine-learning approach using PET-based radiomics to predict the histological subtypes of lung cancer. Clin Nucl Med. 2019;44:956–60.
32. Ahn H, Lee H, Kim S, Hyun S. Pre-treatment 18F-FDG PET-based radiomics predict survival in resected non-small cell lung cancer. Clin Radiol. 2019;74:467–73.
33. Oikonomou A, Khalvati F, Tyrrell PN, Haider MA, Tarique U, Jimenez-Juan L, et al. Radiomics analysis at PET/CT contributes to prognosis of recurrence and survival in lung cancer treated with stereotactic body radiotherapy. Sci Rep. 2018;8:4003.
34. Yang F, Simpson G, Young L, Ford J, Dogan N, Wang L. Impact of contouring variability on oncological PET radiomics features in the lung. Sci Rep. 2020;10:369.
35. van Timmeren JE, Carvalho S, Leijenaar RTH, Troost EGC, van Elmpt W, de Ruysscher D, et al. Challenges and caveats of a multi-center retrospective radiomics study: an example of early treatment response assessment for NSCLC patients using FDG-PET/CT radiomics. PLoS One. 2019;14:e0217536.
36. Rizzo S, Botta F, Raimondi S, Origgi D, Fanciullo C, Morganti AG, et al. Radiomics: the facts and the challenges of image analysis. Eur Radiol Exp. 2018;2:36.
37. Orlhac F, Boughdad S, Philippe C, Stalla-Bourdillon H, Nioche C, Champion L, et al. A postreconstruction harmonization method for multicenter radiomic studies in PET. J Nucl Med. 2018;59:1321–8.
38. Hope TA, Goodman JZ, Allen IE, Calais J, Fendler WP, Carroll PR. Meta-analysis of 68Ga-PSMA-11 PET accuracy for the detection of prostate cancer validated by histopathology. J Nucl Med. 2018;60:786–93.
39. Perera M, Papa N, Christidis D, Wetherell D, Hofman MS, Murphy DG, et al. Sensitivity, specificity, and predictors of positive 68Ga prostate-specific membrane antigen positron emission tomography in advanced prostate cancer: a systematic review and meta-analysis. Eur Urol. 2016;70:926–37.
40. Calais J, Fendler WP, Eiber M, Gartmann J, Chu F-I, Nickols NG, et al. Impact of 68Ga-PSMA-11 PET/CT on the management of prostate cancer patients with biochemical recurrence. J Nucl Med. 2017;59:434–41.
41. Calais J, Czernin J, Fendler WP, Elashoff D, Nickols NG. Randomized prospective phase III trial of 68Ga-PSMA-11 PET/CT molecular imaging for prostate cancer salvage radiotherapy planning [PSMA-SRT]. BMC Cancer. 2019;19:18.
42. Zamboglou C, Carles M, Fechter T, Kiefer S, Reichel K, Fassbender TF, et al. Radiomic features from PSMA PET for non-invasive intraprostatic tumor discrimination and characterization in patients with intermediate- and high-risk prostate cancer: a comparison study with histology reference. Theranostics. 2019;9:2595–605.
43. Hofman MS, Violet J, Hicks RJ, Ferdinandus J, Thang SP, Akhurst T, et al. [177Lu]-PSMA-617 radionuclide treatment in patients with metastatic castration-resistant prostate cancer (LuPSMA trial): a single-centre, single-arm, phase 2 study. Lancet Oncol. 2018;19:825–33.
44. Antunovic L, Gallivanone F, Sollini M, Sagona A, Invento A, Manfrinato G, et al. [18F]FDG PET/CT features for the molecular characterization of primary breast tumors. Eur J Nucl Med Mol Imaging. 2017;44:1945–54.
45. Huang S-y, Franc BL, Harnish RJ, Liu G, Mitra D, Copeland TP, et al. Exploration of PET and MRI radiomic features for decoding breast cancer phenotypes and prognosis. npj Breast Cancer. 2018;4:24.
46. Ou X, Zhang J, Wang J, Pang F, Wang Y, Wei X, et al. Radiomics based on 18F-FDG PET/CT could differentiate breast carcinoma from breast lymphoma using machine-learning approach: a preliminary study. Cancer Med. 2019;9:496–506.
47. Antunovic L, Sanctis RD, Cozzi L, Kirienko M, Sagona A, Torrisi R, et al. PET/CT radiomics in breast cancer: promising tool for prediction of pathological response to neoadjuvant chemotherapy. Eur J Nucl Med Mol Imaging. 2019;46:1468–77.
48. Ha S, Park S, Bang J-I, Kim E-K, Lee H-Y. Metabolic radiomics for pretreatment 18F-FDG PET/CT to characterize locally advanced breast cancer: histopathologic characteristics, response to neoadjuvant chemotherapy, and prognosis. Sci Rep. 2017;7:1556.
49. Sollini M, Cozzi L, Ninatti G, Antunovic L, Cavinato L, Chiti A, et al. PET/CT radiomics in breast cancer: mind the step. Methods. 2021;188:122–32.
50. Yip C, Landau D, Kozarski R, Ganeshan B, Thomas R, Michaelidou A, et al. Primary esophageal cancer: heterogeneity as potential prognostic biomarker in patients treated with definitive chemotherapy and radiation therapy. Radiology. 2014;270:141–8.
51. van Rossum PSN, Fried DV, Zhang L, Hofstetter WL, van Vulpen M, Meijer GJ, et al. The incremental value of subjective and quantitative assessment of 18F-FDG PET for the prediction of pathologic complete response to preoperative chemoradiotherapy in esophageal cancer. J Nucl Med. 2016;57:691–700.
52. Xiong J, Yu W, Ma J, Ren Y, Fu X, Zhao J. The role of PET-based radiomic features in predicting local control of esophageal cancer treated with concurrent chemoradiotherapy. Sci Rep. 2018;8:9902.
53. Yip C, Davnall F, Kozarski R, Landau DB, Cook GJR, Ross P, et al. Assessment of changes in tumor heterogeneity following neoadjuvant chemotherapy in primary esophageal cancer. Dis Esophagus. 2014;28:172–9.
54. Ypsilantis P-P, Siddique M, Sohn H-M, Davies A, Cook G, Goh V, et al. Predicting response to neoadjuvant chemotherapy with PET imaging using convolutional neural networks. PLoS One. 2015;10:e0137036.
55. Xie C, Yang P, Zhang X, Xu L, Wang X, Li X, et al. Sub-region based radiomics analysis for survival prediction in oesophageal tumours treated by definitive concurrent chemoradiotherapy. EBioMedicine. 2019;44:289–97.
56. Zhang Y, Cheng C, Liu Z, Wang L, Pan G, Sun G, et al. Radiomics analysis for the differentiation of autoimmune pancreatitis and pancreatic ductal adenocarcinoma in 18F-FDG PET/CT. Med Phys. 2019;46:4520–30.
57. Belli ML, Mori M, Broggi S, Cattaneo GM, Bettinardi V, Dell'Oca I, et al. Quantifying the robustness of [18F]FDG-PET/CT radiomic features with respect to tumor delineation in head and neck and pancreatic cancer patients. Phys Med. 2018;49:105–11.
58. Kaissis GA, Jungmann F, Ziegelmayer S, Lohöfer FK, Harder FN, Schlitter AM, et al. Multiparametric modelling of survival in pancreatic ductal adenocarcinoma using clinical, histomorphological, genetic and image-derived parameters. J Clin Med. 2020;9:1250.
59. Cui Y, Song J, Pollom E, Shirato H, Chang D, Koong A, et al. Radiomic analysis of FDG-PET identifies novel prognostic imaging biomarkers in locally advanced pancreatic cancer patients treated with SBRT. Int J Radiat Oncol Biol Phys. 2015;93:S4–5.
60. Dholakia AS, Chaudhry M, Leal JP, Chang DT, Raman SP, Hacker-Prietz A, et al. Baseline metabolic tumor volume and total lesion glycolysis are associated with survival outcomes in patients with locally advanced pancreatic cancer receiving stereotactic body radiation therapy. Int J Radiat Oncol Biol Phys. 2014;89:539–46.
61. Mellon EA, Jin WH, Frakes JM, Centeno BA, Strom TJ, Springett GM, et al. Predictors and survival for pathologic tumor response grade in borderline resectable and locally advanced pancreatic cancer treated with induction chemotherapy and neoadjuvant stereotactic body radiotherapy. Acta Oncol. 2016;56:391–7.
62. Kaissis G, Ziegelmayer S, Lohöfer F, Steiger K, Algül H, Muckenhuber A, et al. A machine learning algorithm predicts molecular subtypes in pancreatic ductal adenocarcinoma with differential response to gemcitabine-based versus FOLFIRINOX chemotherapy. PLoS One. 2019;14:e0218642.
63. Kaissis GA, Ziegelmayer S, Lohöfer FK, Harder FN, Jungmann F, Sasse D, et al. Image-based molecular phenotyping of pancreatic ductal adenocarcinoma. J Clin Med. 2020;9:724.
64. Kaissis G, Ziegelmayer S, Lohöfer F, Algül H, Eiber M, Weichert W, et al. A machine learning model for the prediction of survival and tumor subtype in pancreatic ductal adenocarcinoma from preoperative diffusion-weighted imaging. Eur Radiol Exp. 2019;3:41.
65. Lovinfosse P, Polus M, Daele DV, Martinive P, Daenen F, Hatt M, et al. FDG PET/CT radiomics for predicting the outcome of locally advanced rectal cancer. Eur J Nucl Med Mol Imaging. 2017;45:365–75.
66. Amorim BJ, Torrado-Carvajal A, Esfahani SA, Marcos SS, Vangel M, Stein D, et al. PET/MRI radiomics in rectal cancer: a pilot study on the correlation between PET- and MRI-derived image features with a clinical interpretation. Mol Imag Biol. 2020;22(5):1438–45.
67. Bundschuh RA, Dinges J, Neumann L, Seyfried M, Zsoter N, Papp L, et al. Textural parameters of tumor heterogeneity in 18F-FDG PET/CT for therapy response assessment and prognosis in patients with locally advanced rectal cancer. J Nucl Med. 2014;55:891–7.
68. Jeon SH, Song C, Chie EK, Kim B, Kim YH, Chang W, et al. Delta-radiomics signature predicts treatment outcomes after preoperative chemoradiotherapy and surgery in rectal cancer. Radiat Oncol. 2019;14:43.
69. Werner RA, Ilhan H, Lehner S, Papp L, Zsótér N, Schatka I, et al. Pre-therapy somatostatin receptor-based heterogeneity predicts overall survival in pancreatic neuroendocrine tumor patients undergoing peptide receptor radionuclide therapy. Mol Imaging Biol. 2018;21:582–90.
70. Mapelli P, Partelli S, Salgarello M, Doraku J, Pasetto S, Rancoita PM, et al. Dual tracer 68Ga-DOTATOC and 18F-FDG PET/computed tomography radiomics in pancreatic neuroendocrine neoplasms: an endearing tool for preoperative risk assessment. Nucl Med Commun. 2020;41(9):896–905.
71. Hindié E. The NETPET score: combining FDG and somatostatin receptor imaging for optimal management of patients with metastatic well-differentiated neuroendocrine tumors. Theranostics. 2017;7:1159–63.
72. Werner RA, Thackeray JT, Pomper MG, Bengel FM, Gorin MA, Derlin T, et al. Recent updates on molecular imaging reporting and data systems (MI-RADS) for theranostic radiotracers: navigating pitfalls of SSTR- and PSMA-targeted PET/CT. J Clin Med. 2019;8:1060.
73. van der Meel R, Sulheim E, Shi Y, Kiessling F, Mulder WJM, Lammers T. Smart cancer nanomedicine. Nat Nanotechnol. 2019;14:1007–17.
74. Gillies RJ, Kinahan PE, Hricak H. Radiomics: images are more than pictures, they are data. Radiology. 2016;278:563–77.
75. Cook GJ, Azad G, Owczarczyk K, Siddique M, Goh V. Challenges and promises of PET radiomics. Int J Radiat Oncol Biol Phys. 2018;102:1083–9.
76. Prenosil GA, Weitzel T, Fürstner M, Hentschel M, Krause T, Cumming P, et al. Towards guidelines to harmonize textural features in PET: Haralick textural features vary with image noise, but exposure-invariant domains enable comparable PET radiomics. PLoS One. 2020;15:e0229560.
77. Kuhl CK, Truhn D. The long route to standardized radiomics: unraveling the knot from the end. Radiology. 2020;295:339–41.
78. Truhn D, Schrading S, Haarburger C, Schneider H, Merhof D, Kuhl C. Radiomic versus convolutional neural networks analysis for classification of contrast-enhancing lesions at multiparametric breast MRI. Radiology. 2019;290:290–7.
79. Lou B, Doken S, Zhuang T, Wingerter D, Gidwani M, Mistry N, et al. An image-based deep learning framework for individualising radiotherapy dose: a retrospective analysis of outcome prediction. Lancet Digit Health. 2019;1:e136–47.
80. Sun Q, Lin X, Zhao Y, Li L, Yan K, Liang D, et al. Deep learning vs. radiomics for predicting axillary lymph node metastasis of breast cancer using ultrasound images: don't forget the peritumoral region. Front Oncol. 2020;10:53.
81. Haskins G, Kruger U, Yan P. Deep learning in medical image registration: a survey. Mach Vis Appl. 2020;31:8.
82. Bak SH, Park H, Sohn I, Lee SH, Ahn M-J, Lee HY. Prognostic impact of longitudinal monitoring of radiomic features in patients with advanced non-small cell lung cancer. Sci Rep. 2019;9:8730.
83. Leijenaar RTH, Carvalho S, Velazquez ER, van Elmpt WJC, Parmar C, Hoekstra OS, et al. Stability of FDG-PET radiomics features: an integrated analysis of test-retest and inter-observer variability. Acta Oncol. 2013;52:1391–7.
84. Katzmann A, Mühlberg A, Sühling M, Nörenberg D, Maurus S, Holch JW, et al. Computed tomography image-based deep survival regression for metastatic colorectal cancer using a non-proportional hazards model. Predict Intell Med. 2019:73–80.
13 Integrative Computational Biology, AI, and Radiomics: Building Explainable Models by Integration of Imaging, Omics, and Clinical Data

I. Jurisica
Contents
13.1 Introduction 172
13.2 Artificial Intelligence and Data-Driven Science 172
13.3 Multimodal Imaging and Radiomics 175
13.4 Integrative Computational Biology 176
13.5 Patient-Centric Medicine: Preventive and Data-Driven 179
References 182
factor journals, and get hundreds or even thousands of citations. Some recent COVID-research-related paper retractions were dealt with fairly quickly [3, 4], but the process frequently takes months or even years [2] (not surprisingly, errors are usually identified twice as fast as fraud). This poses a substantial challenge for implementing evidence-based medicine: it may take a few years before we realize the evidence is wrong, by which time we not only use the knowledge from these papers, but also use their data to train and test AI algorithms. The consequences of training or validating AI systems on flawed data may not be immediately obvious. Open and reproducible science requires open publications as well as open data; without them, it would be hard or impossible to even identify papers with errors or suspected fraud. In addition, open data enables improved and richer curation efforts, such as the IMEx consortium [5–9]. Well-organized, rich, curated portals ensure we can implement data-driven medicine, and analyze, model, validate and interpret results correctly.

Managing combined molecular, imaging, and patient data requires the support of the basic knowledge management functions:

1. Knowledge acquisition: identifying possible information sources (i.e., high-throughput platforms, instruments and their biases, wearable devices), and integrating such information into a knowledge base;
2. Effective knowledge representation: storing the knowledge in structures that support fast and accurate access, and provide multiple "networks" for linking semantics and relationships and for guiding analysis, modeling and interpretation;
3. Effective analysis, visualization, and interpretation: supporting scalable access to relevant knowledge, multi-dimensional analysis with diverse tools, and scalable, intuitive, and interactive visualization to support visual data mining and interpretation.

Successful integration of such algorithms into efficient and effective workflows must consider the specifics of individual application areas. Medical AI applications range from computer vision and robotics in computer-assisted surgeries, through planning and scheduling of radiology and other treatments, data mining and machine learning from omics and imaging data, and natural language understanding for patient records, reports and conversational systems, to knowledge representation and reasoning, planning, scheduling, and modeling. The requirements of specific areas introduce additional constraints for AI systems. Usability questions are especially important in critical applications such as medicine, but they are frequently ignored or handled only as an afterthought, rather than planned from the initial design. This is one of the main reasons why, despite many published papers in this area, real applications in medicine are not frequent. The most important issues, especially in biomedical applications, include privacy and security [10–15], trust and robustness [16], ethics and fairness [17], uncertainty and reliability [18, 19], reproducibility [20–22], explainability [23–27], and effective human–computer interaction interfaces [25].

Completely removing human experts from the biomedical workflow is not feasible due to legal considerations, but AI-based optimization of the workflow can improve quality and reduce costs [28]. Human-in-the-loop operation ensures that cases substantially different from the training data, or otherwise complex cases, will be handled properly by human experts. In addition, future innovation and progress also require the human-in-the-loop [25]. While efforts in the democratization of AI enable broader application of successful algorithms, they can also lead to suboptimal or completely incorrect use and to overinterpretation of results if users are not properly trained and do not understand the required assumptions, existing biases, constraints, and proper use of such tools.

It is critical to know the end-user of the AI system, not only to provide an appropriate interface, but also because false positives and false negatives need to be tailored towards the application use case. For example, a phone-based classifier of skin lesions would likely be used by the general public as a first step in screening, and thus false negatives need to be low, at the expense of higher false positives. The workflow would need to account for this bias, and the second-level assessment would need to reduce false positives with additional assessments.
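One way to make this tailoring concrete is to fix the classifier's operating point rather than retrain it. The sketch below, a minimal illustration on synthetic, hypothetical scores and labels, enforces a minimum sensitivity (thereby capping false negatives) and then picks the threshold with the fewest false positives:

```python
import numpy as np
from sklearn.metrics import roc_curve

def screening_threshold(y_true, scores, min_sensitivity=0.95):
    """Operating point that keeps false negatives low for a screening tool.

    Among all thresholds whose sensitivity is at least `min_sensitivity`,
    return the one with the lowest false positive rate.
    """
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    ok = tpr >= min_sensitivity   # candidates with few false negatives
    best = np.argmin(fpr[ok])     # among them, the fewest false positives
    return thresholds[ok][best]

# Synthetic scores for a hypothetical screening population.
rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=500)
scores = np.clip(y * 0.35 + rng.normal(0.4, 0.25, size=500), 0.0, 1.0)
t = screening_threshold(y, scores)
print(f"flag for second-level assessment when score >= {t:.2f}")
```

Everything flagged at this deliberately permissive threshold would then go to the second-level assessment described above.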
13 Integrative Computational Biology, AI, and Radiomics: Building Explainable Models by Integration… 175
built for the expert radiologists should perfectly analysis, wavelet, quad tree decomposition), and
classify “simpler” cases automatically, and char- shape (sphericity, compactness).
acterize complex cases for “discussion” with Same as for AI algorithms discussed above,
human experts (as in clinical rounds) by using an reproducibility and robustness of radiomic features
ensemble AI-system that combines multiple are essential for generating useful results [69].
algorithms (with diverse biases) and is trained Standardizing acquisition protocols and “normaliz-
on different data sets (to increase diver- ing” instruments by using phantoms [70–76] are
sity). Thus, properly using such systems in the essential for integrating larger datasets, validating
more complex pipeline could optimize the cost, signatures and models on independent datasets, and
reduce false negatives and false positives, and in in turn leading to improving clinical outcomes
turn improve patient outtcomes. and results.
Explainability, while often neglected, is growing in importance in AI research [29]. Creating explainable models is essential in medical applications to ensure trust in recommendations and decision support [24, 30]. It is also vital for system evolution: as cohorts drift and data and equipment change over time, explanations and models help address false positives and false negatives, and identify possible trends, patterns, or outliers. Explainable models help prioritize validation and separate signal from noise [31].

13.3 Multimodal Imaging and Radiomics

Different imaging modalities offer new insights [32–36]. Multimodal imaging thus provides more accurate and robust biomarkers [36–42]. However, computed tomography (CT), ultrasound, magnetic resonance imaging (MRI), MR spectroscopy, positron emission tomography (PET), and optical imaging have varied availability, reproducibility, cost-efficiency, acquisition time, and resolution, and thus their applications need to be tailored to a specific workflow. Regardless of the imaging modality, extracting useful features by signal processing [43–51] can be enhanced by using AI algorithms, which in turn can substantially improve data interpretation and patient care [32, 52–68]. In both cases, one can focus on global or local features and implement (semi)manual or fully automated image segmentation. Important characteristics to extract and characterize include intensity (histogram, skewness, kurtosis), texture (gray-level co-occurrence matrix (GLCM), fractal analysis, wavelet, quad-tree decomposition), and shape (sphericity, compactness).
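A sketch of how such intensity and texture features could be computed for a segmented region, assuming a 2D image and mask as NumPy arrays; the quantization and the contrast/homogeneity summaries follow common conventions but are illustrative rather than a standards-compliant radiomics implementation:

```python
import numpy as np
from scipy import stats

def first_order_features(image, mask):
    """Histogram-based intensity features inside the segmented region."""
    voxels = image[mask > 0].astype(float)
    return {"mean": voxels.mean(),
            "skewness": stats.skew(voxels),
            "kurtosis": stats.kurtosis(voxels)}

def glcm_features(image, mask, levels=8):
    """Gray-level co-occurrence matrix for horizontal neighbor pairs
    inside the mask, summarized by simple contrast/homogeneity stats."""
    cuts = np.quantile(image[mask > 0], np.linspace(0, 1, levels + 1)[1:-1])
    img = np.digitize(image, cuts)            # quantize to `levels` gray levels
    glcm = np.zeros((levels, levels))
    for r in range(image.shape[0]):
        for c in range(image.shape[1] - 1):
            if mask[r, c] and mask[r, c + 1]:
                glcm[img[r, c], img[r, c + 1]] += 1
    glcm /= max(glcm.sum(), 1)
    i, j = np.indices(glcm.shape)
    return {"contrast": ((i - j) ** 2 * glcm).sum(),
            "homogeneity": (glcm / (1.0 + np.abs(i - j))).sum()}

rng = np.random.default_rng(1)
image = rng.normal(size=(64, 64))
mask = np.zeros((64, 64), bool)
mask[16:48, 16:48] = True                     # toy segmentation
print(first_order_features(image, mask), glcm_features(image, mask))
```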
As for the AI algorithms discussed above, the reproducibility and robustness of radiomic features are essential for generating useful results [69]. Standardizing acquisition protocols and “normalizing” instruments by using phantoms [70–76] are essential for integrating larger datasets and validating signatures and models on independent datasets, in turn improving clinical results and outcomes.

While the applications cover diverse medical areas, the predominant focus is on cancer, as determined by word frequency analysis of published radiomics-related papers (Fig. 13.2). Imaging features/biomarkers can be used individually or in combination with other clinical information to diagnose and characterize tumors and metastases [77–81], and to predict immunotherapy targets [45, 64, 82–87]. For example, frequently used imaging biomarkers in lung cancer include radiomic features, followed by CT features and neural network-analyzed PET parameters [88]. While many papers show the benefits of fully automated AI imaging systems [89–95], there is an advantage in keeping a human-in-the-loop to handle complex cases [96].

Fig. 13.2 Analyzing the frequency of words in the radiomics literature published so far highlights that clinical applications focus mostly on cancer (lung and breast), with the goal to predict treatment and survival, as highlighted by word cloud analysis. Large font size corresponds to more frequent terms. Paper titles were obtained from PubMed and data was processed using the Matlab R2019b wordcloud package
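The word-frequency analysis behind figures like Fig. 13.2 and Fig. 13.7 reduces to counting words across exported titles. A sketch, assuming titles from a PubMed query were saved one per line to a (hypothetical) file radiomics_titles.txt:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "for", "and", "with", "to", "on",
             "using", "based"}

def title_word_frequencies(path):
    """Count non-trivial words across titles stored one per line."""
    counts = Counter()
    with open(path, encoding="utf-8") as fh:
        for title in fh:
            words = re.findall(r"[a-z][a-z-]+", title.lower())
            counts.update(w for w in words if w not in STOPWORDS)
    return counts

# counts = title_word_frequencies("radiomics_titles.txt")  # hypothetical export
# print(counts.most_common(20))   # frequencies drive the word-cloud font sizes
```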
Combining ultrasound imaging with RNA sequencing data helped to identify and characterize, at the gene and pathway levels, three sub-phenotypes in patients with active psoriatic arthritis. Considering that the sub-phenotypes show distinct clinical features, the characterization at the biological pathway level helps identify possible mechanisms leading to the clinical differences, and potential prevention and treatment [97].

Integration can improve confidence in findings [98–100] and reduce both type I and type II errors. For example, in Alzheimer's disease (AD), different studies provide individual insights into early diagnosis as well as disease progression and treatment strategies, such as imaging [101–108], electroencephalography (EEG) [109], circulating RNA [110] or miRNA [111], psychological tests [112, 113], and sleep patterns (using wearables) [114]. Combining individual data sets leads to improved diagnostics, e.g., different imaging modalities and early diagnosis [115–118], integrated miRNA and piRNA [119], combined genetic and biochemical markers [120–124], and combined biochemical and cognitive markers [125].

Further integration of omics methods and algorithms could lead to an ensemble system, confirming some findings and thus increasing confidence and reducing false positives and false negatives. Omics-based biomarkers may also provide a less invasive and cheaper alternative to imaging for early detection of AD [126].

13.4 Integrative Computational Biology

In systems biology research, we must ensure that discovered patterns and proposed models are both statistically sound and biologically and clinically meaningful. When analyzing data and building predictive models, it is essential to critically evaluate and interpret the prediction and to create plausible and explainable models. This will help to ensure that seemingly important patterns in the data are not artifacts of, for example, literature or data collection bias. To achieve this, one has to carefully design what data is used to compute the background distribution, how to handle missing values, how to preprocess and normalize data, and how to adjust parameters for confounding variables.
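A minimal sketch of two of these design choices, median imputation of missing values and per-feature normalization, with the parameters estimated on the training cohort only so they can be reapplied, unchanged, to validation cohorts:

```python
import numpy as np

def preprocess(X):
    """Impute missing values with per-feature medians, then z-score.

    X: (n_samples, n_features) array with NaN for missing entries.
    Returns the normalized matrix plus the fitted parameters, so the
    *same* medians/means/stds can be reused on independent cohorts
    (never re-estimated on them, to avoid information leakage)."""
    medians = np.nanmedian(X, axis=0)
    X = np.where(np.isnan(X), medians, X)
    mu, sd = X.mean(axis=0), X.std(axis=0)
    sd[sd == 0] = 1.0                       # constant features stay constant
    return (X - mu) / sd, {"medians": medians, "mean": mu, "std": sd}

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))
X[0, 1] = np.nan                            # one missing measurement
Xn, params = preprocess(X)
print(Xn.round(2))
```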
Systems biology research uses methods from multiple disciplines, including a variety of biochemical assays for omics platforms and diverse statistical, signal processing, and AI-based algorithms. As outlined in Fig. 13.3, on the one hand, translational research influences what methods are needed; on the other hand, new method development drives progress in clinical research.

Fig. 13.4 Data from TCGA LUAD integrated using microRNA-gene [135] and protein–protein interaction [136] networks. Highlighted are known up−/downregulated and prognostic genes. Network was visualized in NAViGaTOR 3.0.13 [137], and the SVG output file was processed in Adobe Illustrator 2020 to produce the final PNG image
Data from The Cancer Genome Atlas lung adenocarcinoma (LUAD) data set [134] were integrated using protein interaction networks, microRNA-gene networks, and gene ontology annotation (Fig. 13.4). While there are almost 400 shared gene targets of the nine differential microRNAs, too many to interpret or validate, only those highlighted have multiple support, and thus higher confidence for further functional studies and validation. Importantly, a substantial fraction of the microRNA targets has been shown to be deregulated and prognostic in LUAD, as highlighted by node outline and protein names. This not only confirms the importance of the identified deregulated microRNAs in LUAD but also provides a possible mechanism and an explainable model of how they relate to clinically relevant targets. A more comprehensive analysis later showed that integrating copy number aberrations, gene expression, and microRNAs with networks not only explains paradoxical expression patterns but also identifies and validates prognostic genes in LUAD [132].

Analogously, even if all the markers and features from radiomics were sufficient for diagnosis or prognosis, molecular profiles, AI algorithms, and integrative computational biology are needed to identify possible new drug targets and combination therapies, and to select the right patient for the right treatment at the right time. Recently, a pathway-based patient model combined with the multi-scale Bayesian network model TransPRECISE provided useful information on predicting specific treatment options [138].

Additional benefits can be achieved by combining imaging and network analysis algorithms. For example, fMRI data can be represented as graphs, such as structural brain networks [139], neuro-connectivity after brain injury [140, 141], or functional connectivity in schizophrenia [142]. Once constructed, these networks can be analyzed by graph theory algorithms (e.g., [143–150]) to identify network structure–function relationships.
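A sketch of such graph-theoretic analysis, assuming a connectivity network is already available as an edge list; the toy edges below are illustrative, standing in for thresholded fMRI connectivity or protein interactions from a database such as IID:

```python
import networkx as nx

# Toy connectivity network; in practice, edges come from thresholded
# fMRI correlation matrices or curated protein-interaction databases.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]
G = nx.Graph(edges)

# Hubs and bottlenecks: high degree and high betweenness centrality
degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)
for node in sorted(G, key=betweenness.get, reverse=True):
    print(f"{node}: degree={degree[node]}, betweenness={betweenness[node]:.2f}")

# Local structure: clustering coefficients, related to network motifs
print(nx.clustering(G))
```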
As an example, focusing on a possible connection between BTK and AKT1 (Fig. 13.5), the network highlights the link between musculoskeletal (MSK; red edges and red node outlines) and neurodegenerative (NDD) diseases (blue lines and blue node outlines), in addition to identifying known drug targets (large font).

Fig. 13.5 Exploring the connection between BTK and AKT1, highlighting unique and common disease relationships (neurodegenerative (NDD) and musculoskeletal (MSK) diseases) and drug targets, as highlighted by the color and font size, as per the legend. Protein interactions, disease annotation, and drug information were obtained from IID v.2018 [152]. Network was created in NAViGaTOR 3.0.13 [137], and the SVG output file was processed in Adobe Illustrator 2020 to produce the final PNG file
13.5 Patient-Centric Medicine: Preventive and Data-Driven

Understanding and successfully treating multigenic diseases requires systems-oriented research focused on the implications of disease-perturbed molecular interaction networks and pathways. These networks represent crucial relationships among genes and proteins, their mutations, chromosomal aberrations, microRNA deregulation, and other epigenetic and metabolic changes.

Decision support systems in medicine need to be robust across assays, instruments, and patient cohorts, handling uncertainty and missing data gracefully. However, the quality of data and literature-based evidence may be questionable [2], leading to low reproducibility and errors. Due to mistakes and fraud in data and literature, these systems must also be able to handle incorrect information and data, again highlighting the value of an integrative approach and of explainable models that help separate signal from noise by combining evidence or exploring explanatory relationships.

Genomic medicine enhanced with cognitive analytics provides the necessary platform for precision patient care. Expanding genomic medicine with network biology, quantified patient assessment, and computational modeling leads to new opportunities for translational research and provides patient-centric treatment strategies.

Characterizing a patient's lifestyle is vital for assessing predisposition, prognosis, and treatment outcome. Providing measurable feedback on lifestyle change may alter risk, increase compliance, and improve treatment effect. Importantly, we also need to understand the implications of the environment, by studying and quantifying the effects of the exposome on the human condition [151].

In precision medicine, achieving high accuracy/precision evaluated on a specific patient cohort is the minimal required condition; ideally, the predictor is validated on multiple independent cohorts, and there is a clear understanding of the boundaries of the model, i.e., where it can be safely applied and where the standard of care should be used instead. To ensure patient-centric medicine and increase the level of trust in AI systems, they must be able to determine the confidence of a prediction for a given patient despite uncertainties. Optimal use of the system would also require knowing the costs and consequences of false positives and false negatives, and adjusting training, validation, and performance accordingly.

While many papers have been published and multiple AI approaches have passed diverse validations, their translation to medical practice in general remains low. Some of the reasons stem from strict privacy constraints that limit access to samples and thus prevent broad (and proper) training and validation. The system also has to integrate into existing workflows, and thus it is important to determine what baseline performance we strive to achieve, and who will use the system and how. Are we striving to build systems that are superior to general practitioners, or systems that improve the performance of experts by automatically solving simpler cases and augmenting human expert performance on complex cases by providing multiple views, classifications, and annotations? For example, as shown in [153], an AI system is on par with human experts in the UK (also because the UK system uses two radiologists to make a decision), while the same AI system is superior to a radiology reader in the US, where a single expert is sufficient to make a decision.

One size does not fit all. Unusually complex or rare cases are often discussed at clinical rounds, to ensure the best possible treatment and continuous learning. Thus, using an ensemble of AI systems would help ensure high accuracy and lower false positive and false negative rates. The systems should have clear boundaries based on training/validation cohort characteristics, defer the analysis to human experts when dealing with outliers, and provide
confidence in, and an explanation of, the suggested decision. Considering the risks and costs of the care path, such ensemble systems would further enable prioritizing and optimizing decision recommendations.
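A sketch of one simple way to encode such boundaries, assuming the training cohort's per-feature quantile envelope is stored with the model; cases falling outside it are deferred to human experts instead of being scored automatically:

```python
import numpy as np

def cohort_envelope(X_train, lo=0.005, hi=0.995):
    """Record per-feature quantile bounds of the training cohort."""
    return np.quantile(X_train, lo, axis=0), np.quantile(X_train, hi, axis=0)

def within_boundaries(x, envelope, max_violations=0):
    """True if a new patient's features lie inside the model's validated
    range; otherwise the case should be deferred to a human expert."""
    low, high = envelope
    violations = np.sum((x < low) | (x > high))
    return violations <= max_violations

rng = np.random.default_rng(3)
X_train = rng.normal(size=(500, 4))
env = cohort_envelope(X_train)
print(within_boundaries(np.array([0.1, -0.2, 0.3, 0.0]), env))  # True: in range
print(within_boundaries(np.array([9.0, -0.2, 0.3, 0.0]), env))  # False: defer
```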
While we characterize diseases with the latest molecular technologies, e.g., (single-cell) RNA sequencing and proteomic and metabolomic platforms, we continue collecting other patient data unreliably and sporadically, much of it using questionnaires or snapshots of sampled measures. We know that obesity correlates with the risk of many diseases, including heart conditions, diabetes, and cancer. To give just one example of many: we know that body mass index (BMI; weight in kilograms divided by the square of height in meters) is an imprecise and limited estimate of overall fat percentage and fitness, yet it continues to be used in clinical studies and to be linked to disease risk. It was introduced 200 years ago for tracking obesity, but its value is diminished when measuring health at the individual level. Alternative approaches could be as simple as a measuring tape (for assessing waist or neck circumference), or more advanced, using wearable devices.

Molecular profiles from omics datasets provide detailed information about complex diseases at the gene, protein, and metabolite level. If we add rich and temporal data streams from wearable devices to each patient's record, we will create unprecedented opportunities to better understand how lifestyle affects disease and healthy states. Devices such as Skulpt (http://www.skulpt.me) will provide details about fat deposition and overall fitness, replacing the two-century-old BMI. Based on electrical impedance myography (EIM) measurements, it provides a more detailed and accurate measurement of fat and muscle quality, and thus of an individual's overall fitness level and its quantitative change over time.

We need more precise measures of fitness and body mass—we need to move from evidence-based medicine to data-driven medicine. Knowing the heart rate, real-time electrocardiogram (ECG), breathing rate and volume, sleeping pattern and quality, and overall activity may also provide valuable insights into our health; yet measuring them in the clinic, or for a few days using a Holter monitor, provides only a limited (often biased) snapshot, not delivering sufficient value for the new data-driven medicine.

Even simple devices, such as bracelets and trackers, provide more detailed information about sleep patterns, steps, or overall activity, replacing vague statements such as “150 minutes of moderate activity a week is beneficial” with precise measures of the duration, intensity, type, and frequency of activities and, most importantly, tracking and adjusting them for each individual patient.
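A sketch of this personalized, calibrated trend idea on a wearable stream, assuming a daily step-count series; each individual's own rolling baseline, rather than a population guideline, defines what counts as a deviation worth flagging:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
days = pd.date_range("2024-01-01", periods=90, freq="D")
steps = pd.Series(8000 + 1500 * rng.normal(size=90), index=days)
steps.iloc[70:] -= 4000                        # simulated activity drop

# Personal baseline: 28-day rolling median and spread
baseline = steps.rolling(28).median()
spread = steps.rolling(28).std()

# Flag days far below the individual's own recent norm
flags = steps < baseline - 2 * spread
print(steps[flags].round(0))                   # candidate days for follow-up
```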
The challenges are to provide “simple” recommendations and to empower the user and health provider; however, this can also create unforeseen problems, such as depression or eating disorders [154]. Increasingly, even with appropriate measures taken [155], hacker attacks and misuse of such personal information have been growing. Despite this, there is a steady increase in companies developing such devices and delivering their potential for scientific use (Fig. 13.6).
However, having precise measurements is not sufficient. Humans are known to ignore recommendations and follow their bad habits. To fully and maximally engage most individuals, we need to create smart social networking that will motivate and empower patients and increase compliance [156–158].

At present, the data collected by wearable devices focuses on the consumer market; however, it is inevitable that, after the necessary approvals, applications in medical settings will prevail. The frequency of words in PubMed titles related to wearable devices already suggests these trends, as highlighted in Fig. 13.7.

Fig. 13.7 Word clouds from PubMed titles of relevant wearable articles (the more frequent the word in the title, the larger it is). Created using Wordle (http://www.wordle.net). While “monitoring” is the main focus at present, “patients,” “disease,” and “health” are clearly gaining in importance

We will need to adapt the computing infrastructure to handle such streams of data, to find ways to integrate them with imaging and omics platforms, and to use AI to analyze and interpret them wisely and effectively. This will enable a transformative change from reactive to preventive and predictive medicine.

However, just as proper knowledge management is essential in AI and omics, it is equally crucial for wearable data streams, to ensure that translational research does not chase statistically significant patterns that are biologically and clinically useless. Since such data are challenging to share, and may be easily altered by mistake or fraud, ensuring reproducible science and reducing errors in data handling will be paramount. Privacy issues cannot be stressed enough, as they help to build trust in using such devices, and thus principles of privacy by design have to be integral to product development and data management [11].

The role of exercise within the scope of cancer rehabilitation has been studied over the last decade [159–188]. The combination of aerobic exercise and resistance training has been shown to provide many physiological and psychological health benefits for cancer prevention, concurrent treatment, and prevention of recurrence, such as maintained and improved muscle mass, strength, cardiorespiratory fitness, body function, and physical activity levels and capacity; increased immune function; less psychological and emotional stress and improved mood; reduced depression and anxiety; increased quality of life; less severe and less frequent symptoms or side effects of other treatments; shortened hospitalization; reduced likelihood of cardiovascular disease and diabetes; and a lower chance of cancer relapse.

Combined, these approaches will change the way we consider disease. We will start moving from diagnosing and treating disease to preventing it. Importantly, we could transform healthcare to care about health, and not just sickness and disease, and move from cohort-based to patient-centric medicine. Wearable devices will monitor individuals precisely and longitudinally, and they will assess an increasing number of features, similarly to how we moved from cDNA arrays
to Affymetrix platforms, to RNAseq, and to scRNAseq. This will lead to a 24/7 model of data collection—precisely and continuously tracking our activity, sleep, inactivity, food and calorie intake, and pollutants in the environment. Data mining and machine learning algorithms will identify trends and interesting patterns both at the population level and for each individual, with personalized, calibrated trends generated and used as preventive measures for each patient. The data-driven aspect will change how we consider evidence, recommendations, and belief in guidelines. No longer will imprecise, “one size fits all” guidelines suffice—we need to provide a customized “dose” prescribed for individuals, moving towards person-driven approaches. Clearly, “the future is connected”, and through collective data handling and analysis [163], we will eventually manage even the most complex diseases.

As highlighted before, though, patient-centric, data-driven medicine requires high-quality, comprehensive data and multiple levels of independent validation, with explainable models helping to increase trust in, and the usability of, such systems. Importantly, it is also essential to determine who will use the system and how, and to optimize it accordingly, as even useful applications may result in negative outcomes when used improperly or suboptimally [154].
References

1. Andrea D, Xueqing Z, Tauanne DA, Andrea M-M, Gene Ching CK, João MLD, et al. Substitution mutational signatures in whole-genome sequenced cancers in the UK population. Science. 2022;376(6591):abl9283.
2. Fang FC, Steen RG, Casadevall A. Misconduct accounts for the majority of retracted scientific publications. Proc Natl Acad Sci U S A. 2012;109(42):17028–33.
3. Mehra MR, Desai SS, Kuy S, Henry TD, Patel AN. Retraction: cardiovascular disease, drug therapy, and mortality in Covid-19. N Engl J Med [Letter Retraction of Publication]. 2020;382(26):2582. https://doi.org/10.1056/NEJMoa2007621.
4. Mehra MR, Ruschitzka F, Patel AN. Retraction: hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. Lancet [Retraction of Publication]. 2020;395(10240):1820.
5. Hermjakob H, Montecchi-Palazzi L, Bader G, Wojcik J, Salwinski L, Ceol A, et al. The HUPO PSI's molecular interaction format—a community standard for the representation of protein interaction data. Nat Biotechnol. 2004;22(2):177–83.
6. Orchard S, Kerrien S, Abbani S, Aranda B, Bhate J, Bidwell S, et al. Protein interaction data curation: the International Molecular Exchange (IMEx) consortium. Nat Methods [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2012;9(4):345–50.
7. Del-Toro N, Duesbury M, Koch M, Perfetto L, Shrivastava A, Ochoa D, et al. Capturing variation impact on molecular interactions in the IMEx Consortium mutations data set. Nat Commun [Dataset Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2019;10(1):10.
8. Perfetto L, Pastrello C, Del-Toro N, Duesbury M, Iannuccelli M, Kotlyar M, et al. The IMEx coronavirus interactome: an evolving map of Coronaviridae-host molecular interactions. Database (Oxford). 2020:baaa096.
9. Porras P, Barrera E, Bridge A, Del-Toro N, Cesareni G, Duesbury M, et al. Towards a unified open access dataset of molecular interactions. Nat Commun. 2020;11(1):6144.
10. Barth-Jones D, El Emam K, Bambauer J, Cavoukian A, Malin B. Assessing data intrusion threats. Science [Letter Comment]. 2015;348(6231):194–5.
11. Cavoukian A. Safeguarding health information. Health Law Can. 1998;18(4):115–7.
12. Moore W, Frye S. Review of HIPAA, Part 1: history, protected health information, and privacy and security rules. J Nucl Med Technol [Review]. 2019;47(4):269–72.
13. Torkzadehmahani R, Nasirigerdeh R, Blumenthal DB, Kacprowski T, List M, Matschinske J, et al. Privacy-preserving artificial intelligence techniques in biomedicine. arXiv:2007.11621. 2020.
14. Zerka F, Barakat S, Walsh S, Bogowicz M, Leijenaar RTH, Jochems A, et al. Systematic review of privacy-preserving distributed machine learning from federated databases in health care. JCO Clin Cancer Inform [Research Support, Non-US Gov't]. 2020;4:184–200.
15. Kaissis GA, Makowski MR, Rückert D, Braren RF. Secure, privacy-preserving and federated machine learning in medical imaging. Nat Mach Intell. 2020;2:305–11.
16. Zwanenburg A. Radiomics in nuclear medicine: robustness, reproducibility, standardization, and how to avoid data analysis traps and replication crisis. Eur J Nucl Med Mol Imaging. 2019;46(13):2638–55.
17. Lavery JV, IJsselmuiden C. The research fairness initiative: filling a critical gap in global research ethics. Gates Open Res. 2018;2:58.
18. Krynski TR, Tenenbaum JB. The role of causality in judgment under uncertainty. J Exp Psychol Gen. 2007;136(3):430–50.
19. Begley CG, Ellis LM. Drug development: raise standards for preclinical cancer research. Nature. 2012;483(7391):531–3.
20. Prinz F, Schlange T, Asadullah K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov [Letter Comment]. 2011;10(9):712.
21. Clements JC. Is the reproducibility crisis fuelling poor mental health in science? Nature [News Comment]. 2020;582(7811):300.
22. Collins FS, Tabak LA. Policy: NIH plans to enhance reproducibility. Nature. 2014;505(7485):612–3.
23. Yang Z, Zhang A, Sudjianto A. Enhancing explainability of neural networks through architecture constraints. IEEE Trans Neural Netw Learn Syst. 2021;32(6):2610–21.
24. Windisch P, Weber P, Furweger C, Ehret F, Kufeld M, Zwahlen D, et al. Implementation of model explainability for a basic brain tumor detection using convolutional neural networks on MRI slices. Neuroradiology. 2020;62(11):1515–8.
25. Holzinger A, Langs G, Denk H, Zatloukal K, Muller H. Causability and explainability of artificial intelligence in medicine. Wiley Interdiscip Rev Data Min Knowl Discov [Review]. 2019;9(4):e1312.
26. Coeckelbergh M. Artificial intelligence, responsibility attribution, and a relational justification of explainability. Sci Eng Ethics. 2020;26(4):2051–68.
27. Petkovic D, Altman R, Wong M, Vigil A. Improving the explainability of random forest classifier—user centered approach. In: Pacific Symposium on Biocomputing [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2018;23:204–15.
28. Kalra A, Chakraborty A, Fine B, Reicher J. Machine learning for automation of radiology protocols for quality and efficiency improvement. J Am Coll Radiol. 2020;17(9):1149–58.
29. Montani S, Striani M. Artificial intelligence in clinical decision support: a focused literature survey. Yearb Med Inform [Review]. 2019;28(1):120–7.
30. Diprose WK, Buist N, Hua N, Thurier Q, Shand G, Robinson R. Physician understanding, explainability, and trust in a hypothetical machine learning risk calculator. J Am Med Inform Assoc. 2020;27(4):592–600.
31. Lohmann P, Kocher M, Ruge MI, Visser-Vandewalle V, Shah NJ, Fink GR, et al. PET/MRI radiomics in patients with brain metastases. Front Neurol [Review]. 2020;11:1.
32. Sollini M, Antunovic L, Chiti A, Kirienko M. Towards clinical application of image mining: a systematic review on artificial intelligence and radiomics. Eur J Nucl Med Mol Imaging. 2019;46(13):2656–72.
33. Ibrahim A, Vallieres M, Woodruff H, Primakov S, Beheshti M, Keek S, et al. Radiomics analysis for clinical decision support in nuclear medicine. Semin Nucl Med [Review]. 2019;49(5):438–49.
34. Lee G, Lee HY, Park H, Schiebler ML, van Beek EJR, Ohno Y, et al. Radiomics and its emerging role in lung cancer research, imaging biomarkers and clinical management: state of the art. Eur J Radiol [Review]. 2017;86:297–307.
35. Anthony GJ, Cunliffe A, Castillo R, Pham N, Guerrero T, Armato SG 3rd, et al. Incorporation of pre-therapy (18)F-FDG uptake data with CT texture features into a radiomics model for radiation pneumonitis diagnosis. Med Phys. 2017;44(7):3686–94.
36. Yin Q, Hung SC, Rathmell WK, Shen L, Wang L, Lin W, et al. Integrative radiomics expression predicts molecular subtypes of primary clear cell renal cell carcinoma. Clin Radiol. 2018;73(9):782–91.
37. Li ZY, Wang XD, Li M, Liu XJ, Ye Z, Song B, et al. Multi-modal radiomics model to predict treatment response to neoadjuvant chemotherapy for locally advanced rectal cancer. World J Gastroenterol. 2020;26(19):2388–402.
38. Zhuo EH, Zhang WJ, Li HJ, Zhang GY, Jing BZ, Zhou J, et al. Radiomics on multi-modalities MR sequences can subtype patients with non-metastatic nasopharyngeal carcinoma (NPC) into distinct survival subgroups. Eur Radiol. 2019;29(10):5590–9.
39. Zhuo EH, Zhang WJ, Li HJ, Zhang GY, Jing BZ, Zhou J, et al. Correction to: Radiomics on multi-modalities MR sequences can subtype patients with non-metastatic nasopharyngeal carcinoma (NPC) into distinct survival subgroups. Eur Radiol [Published Erratum]. 2019;29(7):3957.
40. Lv W, Ashrafinia S, Ma J, Lu L, Rahmim A. Multi-level multi-modality fusion radiomics: application to PET and CT imaging for prognostication of head and neck cancer. IEEE J Biomed Health Inform. 2020;24(8):2268–77.
41. Bagher-Ebadian H, Janic B, Liu C, Pantelic M, Hearshen D, Elshaikh M, et al. Detection of dominant intra-prostatic lesions in patients with prostate cancer using an artificial neural network and MR multi-modal radiomics analysis. Front Oncol. 2019;9:1313.
42. Zhong QZ, Long LH, Liu A, Li CM, Xiu X, Hou XY, et al. Radiomics of multiparametric MRI to predict biochemical recurrence of localized prostate cancer after radiation therapy. Front Oncol. 2020;10:731.
43. Wagner MW, Bilbily A, Beheshti M, Shammas A, Vali R. Artificial intelligence and radiomics in pediatric molecular imaging. Methods [Review]. 2021;188:37–43.
44. Dercle L, Lu L, Schwartz LH, Qian M, Tejpar S, Eggleton P, et al. Radiomics response signature for identification of metastatic colorectal cancer sensitive to therapies targeting EGFR pathway. J Natl Cancer Inst. 2020;112(9):902–12.
45. Ha S. Perspectives in radiomics for personalized medicine and theranostics. Nucl Med Mol Imaging. 2019;53(3):164–6.
46. Chen C, Guo X, Wang J, Guo W, Ma X, Xu J. The diagnostic value of radiomics-based machine learning in predicting the grade of meningiomas using conventional magnetic resonance imaging: a preliminary study. Front Oncol. 2019;9:1338.
47. Leijenaar RT, Carvalho S, Velazquez ER, van Elmpt WJ, Parmar C, Hoekstra OS, et al. Stability of FDG-PET radiomics features: an integrated analysis of test-retest and inter-observer variability. Acta Oncol [Research Support, Non-US Gov't]. 2013;52(7):1391–7.
48. Jack CR Jr, Barkhof F, Bernstein MA, Cantillon M, Cole PE, Decarli C, et al. Steps to standardization and validation of hippocampal volumetry as a biomarker in clinical trials and diagnostic criterion for Alzheimer's disease. Alzheimers Dement [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2011;7(4):474–85.e4.
49. Rizzo S, Botta F, Raimondi S, Origgi D, Buscarino V, Colarieti A, et al. Radiomics of high-grade serous ovarian cancer: association between quantitative CT features, residual tumour and disease progression within 12 months. Eur Radiol. 2018;28(11):4849–59.
50. Sala E, Mema E, Himoto Y, Veeraraghavan H, Brenton JD, Snyder A, et al. Unravelling tumour heterogeneity using next-generation imaging: radiomics, radiogenomics, and habitat imaging. Clin Radiol [Review]. 2017;72(1):3–10.
51. Vargas HA, Veeraraghavan H, Micco M, Nougaret S, Lakhman Y, Meier AA, et al. A novel representation of inter-site tumour heterogeneity from pre-treatment computed tomography textures classifies ovarian cancers by clinical outcome. Eur Radiol [Evaluation Studies]. 2017;27(9):3991–4001.
52. Xie C, Du R, Ho JW, Pang HH, Chiu KW, Lee EY, et al. Effect of machine learning re-sampling techniques for imbalanced datasets in (18)F-FDG PET-based radiomics model on prognostication performance in cohorts of head and neck cancer patients. Eur J Nucl Med Mol Imaging. 2020;47(12):2826–35.
53. Wildeboer RR, Mannaerts CK, van Sloun RJG, Budaus L, Tilki D, Wijkstra H, et al. Automated multiparametric localization of prostate cancer based on B-mode, shear-wave elastography, and contrast-enhanced ultrasound radiomics. Eur Radiol. 2020;30(2):806–15.
54. Wang X, Wan Q, Chen H, Li Y, Li X. Classification of pulmonary lesion based on multiparametric MRI: utility of radiomics and comparison of machine learning methods. Eur Radiol. 2020;30(8):4595–605.
55. Wang H, Song B, Ye N, Ren J, Sun X, Dai Z, et al. Machine learning-based multiparametric MRI radiomics for predicting the aggressiveness of papillary thyroid carcinoma. Eur J Radiol [Evaluation Study]. 2020;122:108755.
56. Wang H, Chen H, Duan S, Hao D, Liu J. Radiomics and machine learning with multiparametric preoperative MRI may accurately predict the histopathological grades of soft tissue sarcomas. J Magn Reson Imaging. 2020;51(3):791–7.
57. Song J, Yin Y, Wang H, Chang Z, Liu Z, Cui L. A review of original articles published in the emerging field of radiomics. Eur J Radiol [Review]. 2020;127:108991.
58. Rogers W, Thulasi Seetha S, Refaee TAG, Lieverse RIY, Granzier RWY, Ibrahim A, et al. Radiomics: from qualitative to quantitative imaging. Br J Radiol [Review]. 2020;93(1108):20190948.
59. Peng A, Dai H, Duan H, Chen Y, Huang J, Zhou L, et al. A machine learning model to precisely immunohistochemically classify pituitary adenoma subtypes with radiomics based on preoperative magnetic resonance imaging. Eur J Radiol. 2020;125:108892.
60. Mokrane FZ, Lu L, Vavasseur A, Otal P, Peron JM, Luk L, et al. Radiomics machine-learning signature for diagnosis of hepatocellular carcinoma in cirrhotic patients with indeterminate liver nodules. Eur Radiol. 2020;30(1):558–70.
61. Haider SP, Mahajan A, Zeevi T, Baumeister P, Reichel C, Sharaf K, et al. PET/CT radiomics signature of human papilloma virus association in oropharyngeal squamous cell carcinoma. Eur J Nucl Med Mol Imaging. 2020;47(13):2978–91.
62. Varghese B, Chen F, Hwang D, Palmer SL, De Castro Abreu AL, Ukimura O, et al. Objective risk stratification of prostate cancer using machine learning and radiomics applied to multiparametric magnetic resonance images. Sci Rep [Research Support, NIH, Extramural Research Support, US Gov't, Non-PHS Research Support, Non-US Gov't]. 2019;9(1):1570.
63. Oyama A, Hiraoka Y, Obayashi I, Saikawa Y, Furui S, Shiraishi K, et al. Hepatic tumor classification using texture and topology analysis of non-contrast-enhanced three-dimensional T1-weighted MR images with a radiomics approach. Sci Rep [Research Support, Non-US Gov't]. 2019;9(1):8764.
64. Sun R, Limkin EJ, Vakalopoulou M, Dercle L, Champiat S, Han SR, et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol [Research Support, Non-US Gov't]. 2018;19(9):1180–91.
65. Oakden-Rayner L, Carneiro G, Bessen T, Nascimento JC, Bradley AP, Palmer LJ. Precision radiology: predicting longevity using feature engineering and deep learning methods in a radiomics framework. Sci Rep. 2017;7(1):1648.
66. Stone JR, Wilde EA, Taylor BA, Tate DF, Levin H, Bigler ED, et al. Supervised learning technique for the automated identification of white matter hyperintensities in traumatic brain injury. Brain Inj. 2016;30(12):1458–68.
67. Aerts H. Data science in radiology: a path forward. Clin Cancer Res [Letter Research Support, NIH, Extramural]. 2018;24(3):532–4.
68. Vukicevic AM, Milic V, Zabotti A, Hocevar A, De Lucia O, Filippou G, et al. Radiomics-based assessment of primary Sjogren's syndrome from salivary gland ultrasonography images. IEEE J Biomed Health Inform. 2020;24(3):835–43.
69. Berenguer R, Pastor-Juan MDR, Canales-Vazquez J, Castro-Garcia M, Villas MV, Mansilla Legorburo F, et al. Radiomics of CT features may be nonreproducible and redundant: influence of CT acquisition parameters. Radiology. 2018;288(2):407–15.
70. Zwanenburg A, Vallieres M, Abdalah MA, Aerts H, Andrearczyk V, Apte A, et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295(2):328–38.
71. Nardone V, Reginelli A, Guida C, Belfiore MP, Biondi M, Mormile M, et al. Delta-radiomics increases multicentre reproducibility: a phantom study. Med Oncol. 2020;37(5):38.
72. Zhovannik I, Bussink J, Traverso A, Shi Z, Kalendralis P, Wee L, et al. Learning from scanners: bias reduction and feature correction in radiomics. Clin Transl Radiat Oncol. 2019;19:33–8.
73. Saeedi E, Dezhkam A, Beigi J, Rastegar S, Yousefi Z, Mehdipour LA, et al. Radiomic feature robustness and reproducibility in quantitative bone radiography: a study on radiologic parameter changes. J Clin Densitom. 2019;22(2):203–13.
74. Orlhac F, Frouin F, Nioche C, Ayache N, Buvat I. Validation of a method to compensate multicenter effects affecting CT radiomics. Radiology [Multicenter Study Validation Study]. 2019;291(1):53–9.
75. Kalendralis P, Traverso A, Shi Z, Zhovannik I, Monshouwer R, Starmans MPA, et al. Multicenter CT phantoms public dataset for radiomics reproducibility tests. Med Phys. 2019;46(3):1512–8.
76. Baessler B, Weiss K, Pinto Dos Santos D. Robustness and reproducibility of radiomics in magnetic resonance imaging: a phantom study. Investig Radiol. 2019;54(4):221–8.
77. Shofty B, Artzi M, Shtrozberg S, Fanizzi C, DiMeco F, Haim O, et al. Virtual biopsy using MRI radiomics for prediction of BRAF status in melanoma brain metastasis. Sci Rep. 2020;10(1):6623.
78. Huang CY, Lee CC, Yang HC, Lin CJ, Wu HM, Chung WY, et al. Radiomics as prognostic factor in brain metastases treated with gamma knife radiosurgery. J Neuro-Oncol. 2020;146(3):439–49.
79. Wu G, Chen Y, Wang Y, Yu J, Lv X, Ju X, et al. Sparse representation-based radiomics for the diagnosis of brain tumors. IEEE Trans Med Imaging. 2018;37(4):893–905.
80. Peng L, Parekh V, Huang P, Lin DD, Sheikh K, Baker B, et al. Distinguishing true progression from radionecrosis after stereotactic radiation therapy for brain metastases with machine learning and radiomics. Int J Radiat Oncol Biol Phys. 2018;102(4):1236–43.
81. Kang D, Park JE, Kim YH, Kim JH, Oh JY, Kim J, et al. Diffusion radiomics as a diagnostic model for atypical manifestation of primary central nervous system lymphoma: development and multicenter external validation. Neuro-Oncology. 2018;20(9):1251–61.
82. Yu H, Meng X, Chen H, Han X, Fan J, Gao W, et al. Correlation between mammographic radiomics features and the level of tumor-infiltrating lymphocytes in patients with triple-negative breast cancer. Front Oncol. 2020;10:412.
83. Park KJ, Lee JL, Yoon SK, Heo C, Park BW, Kim JK. Radiomics-based prediction model for outcomes of PD-1/PD-L1 immunotherapy in metastatic urothelial carcinoma. Eur Radiol. 2020;30(10):5392–403.
84. Mu W, Tunali I, Gray JE, Qi J, Schabath MB, Gillies RJ. Radiomics of (18)F-FDG PET/CT images predicts clinical benefit of advanced NSCLC patients to checkpoint blockade immunotherapy. Eur J Nucl Med Mol Imaging. 2020;47(5):1168–82.
85. Hectors SJ, Lewis S, Besa C, King MJ, Said D, Putra J, et al. MRI radiomics features predict immuno-oncological characteristics of hepatocellular carcinoma. Eur Radiol. 2020;30(7):3759–69.
86. Avanzo M, Stancanello J, Pirrone G, Sartor G. Radiomics and deep learning in lung cancer. Strahlenther Onkol [Review]. 2020;196(10):879–87.
87. Tang C, Hobbs B, Amer A, Li X, Behrens C, Canales JR, et al. Development of an immune-pathology informed radiomics model for non-small cell lung cancer. Sci Rep. 2018;8(1):1922.
88. Ninatti G, Kirienko M, Neri E, Sollini M, Chiti A. Imaging-based prediction of molecular therapy targets in NSCLC by radiogenomics and AI approaches: a systematic review. Diagnostics (Basel) [Review]. 2020;10(6):359.
89. Wang X, Zhang L, Yang X, Tang L, Zhao J, Chen G, et al. Deep learning combined with radiomics may optimize the prediction in differentiating high-grade lung adenocarcinomas in ground glass opacity lesions on CT scans. Eur J Radiol. 2020;129:109150.
90. Kakileti ST, Madhu HJ, Manjunath G, Wee L, Dekker A, Sampangi S. Personalized risk prediction for breast cancer pre-screening using artificial intelligence and thermal radiomics. Artif Intell Med. 2020;105:101854.
91. Wang K, Qiao Z, Zhao X, Li X, Wang X, Wu T, et al. Individualized discrimination of tumor recurrence from radiation necrosis in glioma patients using an integrated radiomics-based model. Eur J Nucl Med Mol Imaging. 2020;47(6):1400–11.
92. Currie G, Hawk KE, Rohren E, Vial A, Klein R. Machine learning and deep learning in medical imaging: intelligent imaging. J Med Imaging Radiat Sci. 2019;50(4):477–87.
93. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging [Review]. 2020;51(5):1310–24.
94. Castiglioni I, Gallivanone F, Soda P, Avanzo M, Stancanello J, Aiello M, et al. AI-based applications in hybrid imaging: how to build smart and truly
multi-parametric decision models for radiomics. Eur J Nucl Med Mol Imaging. 2019;46(13):2673–99.
95. Veit-Haibach P, Buvat I, Herrmann K. EJNMMI supplement: bringing AI and radiomics to nuclear medicine. Eur J Nucl Med Mol Imaging [Editorial]. 2019;46(13):2627–9.
96. Bejnordi BE, Litjens G, van der Laak JA. Machine learning compared with pathologist assessment—reply. JAMA [Letter Comment]. 2018;319(16):1726.
97. Eder L, Li Q, Rahmati S, Rahman P, Jurisica I, Chandran V. Defining imaging subphenotypes of psoriatic arthritis: integrative analysis of imaging data and gene expression in a PsA patient cohort. Rheumatology (Oxford). 2022:keac078.
98. Viswanath SE, Tiwari P, Lee G, Madabhushi A. Dimensionality reduction-based fusion approaches for imaging and non-imaging biomedical data: concepts, workflow, and use-cases. BMC Med Imaging [Research Support, Non-US Gov't Research Support, NIH, Extramural Research Support, US Gov't, Non-PHS]. 2017;17(1):2.
99. Joaquim HPG, Costa AC, Talib LL, Dethloff F, Serpa MH, Zanetti MV, et al. Plasma metabolite profiles in first episode psychosis: exploring symptoms heterogeneity/severity in schizophrenia and bipolar disorder cohorts. Front Psych. 2020;11:496.
100. Lee E, Choi JS, Kim M, Suk HI. Toward an interpretable Alzheimer's disease diagnostic model with regional abnormality representation via deep learning. Neuroimage [Research Support, Non-US Gov't]. 2019;202:116113.
101. Hamelin L, Lagarde J, Dorothee G, Leroy C, Labit M, Comley RA, et al. Early and protective microglial activation in Alzheimer's disease: a prospective study using 18F-DPA-714 PET imaging. Brain [Research Support, Non-US Gov't]. 2016;139(Pt 4):1252–64.
102. Scheltens P. Imaging in Alzheimer's disease. Dialogues Clin Neurosci [Review]. 2009;11(2):191–9.
103. Magnin B, Mesrob L, Kinkingnehun S, Pelegrini-Issac M, Colliot O, Sarazin M, et al. Support vector machine-based classification of Alzheimer's disease from whole-brain anatomical MRI. Neuroradiology. 2009;51(2):73–83.
104. Gauthier S, Dubois B, Feldman H, Scheltens P. Revised research diagnostic criteria for Alzheimer's disease. Lancet Neurol [Comment Letter]. 2008;7(8):668–70.
105. Dubois B, Feldman HH, Jacova C, Dekosky ST, Barberger-Gateau P, Cummings J, et al. Research criteria for the diagnosis of Alzheimer's disease: revising the NINCDS-ADRDA criteria. Lancet Neurol [Research Support, Non-US Gov't Review]. 2007;6(8):734–46.
106. Claus JJ, Dubois EA, Booij J, Habraken J, de Munck JC, van Herk M, et al. Demonstration of a reduction in muscarinic receptor binding in early Alzheimer's disease using iodine-123 dexetimide single-photon emission tomography. Eur J Nucl Med. 1997;24(6):602–8.
107. Deweer B, Lehericy S, Pillon B, Baulac M, Chiras J, Marsault C, et al. Memory disorders in probable Alzheimer's disease: the role of hippocampal atrophy as shown with MRI. J Neurol Neurosurg Psychiatry [Research Support, Non-US Gov't]. 1995;58(5):590–7.
108. Blin J, Baron JC, Dubois B, Crouzel C, Fiorelli M, Attar-Levy D, et al. Loss of brain 5-HT2 receptors in Alzheimer's disease. In vivo assessment with positron emission tomography and [18F]setoperone. Brain. 1993;116(Pt 3):497–510.
109. Rossini PM, Di Iorio R, Vecchio F, Anfossi M, Babiloni C, Bozzali M, et al. Early diagnosis of Alzheimer's disease: the role of biomarkers including advanced EEG signal analysis. Report from the IFCN-sponsored panel of experts. Clin Neurophysiol [Review]. 2020;131(6):1287–310.
110. Ludwig N, Fehlmann T, Kern F, Gogol M, Maetzler W, Deutscher S, et al. Machine learning to detect Alzheimer's disease from circulating non-coding RNAs. Genom Proteom Bioinform [Research Support, Non-US Gov't]. 2019;17(4):430–40.
111. McKeever PM, Schneider R, Taghdiri F, Weichert A, Multani N, Brown RA, et al. MicroRNA expression levels are altered in the cerebrospinal fluid of patients with young-onset Alzheimer's disease. Mol Neurobiol. 2018;55(12):8826–41.
112. Matias-Guiu JA, Cabrera-Martin MN, Curiel RE, Valles-Salgado M, Rognoni T, Moreno-Ramos T, et al. Comparison between FCSRT and LASSI-L to detect early stage Alzheimer's disease. J Alzheimers Dis. 2018;61(1):103–11.
113. Bechard LE, Beaton D, McGilton K, Tartaglia MC, Black S. Physical activity perceptions, experiences, and beliefs of older adults with mild cognitive impairment or Alzheimer's disease and their care partners. Appl Physiol Nutr Metab. 2020;45(11):1216–24.
114. Ettore E, Bakardjian H, Sole M, Levy Nogueira M, Habert MO, Gabelle A, et al. Relationships between objective sleep parameters and brain amyloid load in subjects at risk for Alzheimer's disease: the INSIGHT-preAD study. Sleep [Research Support, Non-US Gov't]. 2019;42(9):zsz137.
115. Ortner M, Drost R, Hedderich D, Goldhardt O, Muller-Sarnowski F, Diehl-Schmid J, et al. Amyloid PET, FDG-PET or MRI?—the power of different imaging biomarkers to detect progression of early Alzheimer's disease. BMC Neurol. 2019;19(1):264.
116. Oliveira PP Jr, Nitrini R, Busatto G, Buchpiguel C, Sato JR, Amaro E Jr. Use of SVM methods with surface-based cortical and volumetric subcortical measurements to detect Alzheimer's disease. J Alzheimers Dis. 2010;19(4):1263–72.
117. Ebrahimighahnavieh MA, Luo S, Chiong R. Deep learning to detect Alzheimer's disease from neuroimaging: a systematic literature review. Comput Methods Prog Biomed. 2020;187:105242.
118. Sabri O, Sabbagh MN, Seibyl J, Barthel H, Akatsu H, Ouchi Y, et al. Florbetaben PET imaging to detect amyloid beta plaques in Alzheimer's disease: phase 3 study. Alzheimers Dement [Clinical Trial, Phase III Multicenter Study Research Support, Non-US Gov't]. 2015;11(8):964–74.
119. Jain G, Stuendl A, Rao P, Berulava T, Pena Centeno T, Kaurani L, et al. A combined miRNA-piRNA signature to detect Alzheimer's disease. Transl Psychiatry [Multicenter Study Research Support, Non-US Gov't]. 2019;9(1):250.
120. Lannfelt L. Biochemical diagnostic markers to detect early Alzheimer's disease. Neurobiol Aging [Review]. 1998;19(2):165–7.
121. Hampel H, Toschi N, Baldacci F, Zetterberg H, Blennow K, Kilimann I, et al. Alzheimer's disease biomarker-guided diagnostic workflow using the added value of six combined cerebrospinal fluid candidates: Abeta1–42, total-tau, phosphorylated-tau, NFL, neurogranin, and YKL-40. Alzheimers Dement [Multicenter Study Research Support, Non-US Gov't]. 2018;14(4):492–501.
122. Kunkle BW, Grenier-Boley B, Sims R, Bis JC, Damotte V, Naj AC, et al. Genetic meta-analysis of diagnosed Alzheimer's disease identifies new risk loci and implicates Abeta, tau, immunity and lipid processing. Nat Genet [Meta-Analysis Research Support, Non-US Gov't]. 2019;51(3):414–30.
123. Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer's disease. Nat Genet [Meta-Analysis Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2013;45(12):1452–8.
124. Thijssen EH, La Joie R, Wolf A, Strom A, Wang P, Iaccarino L, et al. Diagnostic value of plasma phosphorylated tau181 in Alzheimer's disease and frontotemporal lobar degeneration. Nat Med [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2020;26(3):387–97.
125. Edwards M, Balldin VH, Hall J, O'Bryant S. Combining select neuropsychological assessment with blood-based biomarkers to detect mild Alzheimer's disease: a molecular neuropsychology approach. J Alzheimers Dis [Research Support, NIH, Extramural]. 2014;42(2):635–40.
126. Xicota L, Ichou F, Lejeune FX, Colsch B, Tenenhaus A, Leroy I, et al. Multi-omics signature of brain amyloid deposition in asymptomatic individuals at-risk for Alzheimer's disease: the INSIGHT-preAD study. EBioMedicine. 2019;47:518–28.
127. Leung CK, Braun P, Cuzzocrea A. AI-based sensor information fusion for supporting deep supervised learning. Sensors (Basel). 2019;19(6):1345.
128. Zizzo AN, Erdman L, Feldman BM, Goldenberg A. Similarity network fusion: a novel application to making clinical diagnoses. Rheum Dis Clin North Am [Review]. 2018;44(2):285–93.
129. Wang B, Mezlini AM, Demir F, Fiume M, Tu Z, Brudno M, et al. Similarity network fusion for aggregating data types on a genomic scale. Nat Methods [Research Support, Non-US Gov't Research Support, US Gov't, Non-PHS]. 2014;11(3):333–7.
130. Kennedy SA, Jarboui MA, Srihari S, Raso C, Bryan K, Dernayka L, et al. Extensive rewiring of the EGFR network in colorectal cancer cells expressing transforming levels of KRAS(G13D). Nat Commun [Research Support, Non-US Gov't]. 2020;11(1):499.
131. Enfield KSS, Marshall EA, Anderson C, Ng KW, Rahmati S, Xu Z, et al. Epithelial tumor suppressor ELF3 is a lineage-specific amplified oncogene in lung adenocarcinoma. Nat Commun [Research Support, Non-US Gov't]. 2019;10(1):5438.
132. Tokar T, Pastrello C, Ramnarine VR, Zhu CQ, Craddock KJ, Pikor LA, et al. Differentially expressed microRNAs in lung adenocarcinoma invert effects of copy number aberrations of prognostic genes. Oncotarget. 2018;9(10):9137–55.
133. Martinez VD, Vucic EA, Thu KL, Pikor LA, Lam S, Lam WL. Disruption of KEAP1/CUL3/RBX1 E3-ubiquitin ligase complex components by multiple genetic mechanisms: association with poor prognosis in head and neck cancer. Head Neck [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2015;37(5):727–34.
134. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2014;511(7511):543–50.
135. Shirdel EA, Xie W, Mak TW, Jurisica I. NAViGaTing the micronome—using multiple microRNA prediction databases to identify signalling pathway-associated microRNAs. PLoS One [Comparative Study Evaluation Studies Research Support, Non-US Gov't]. 2011;6(2):e17429.
136. Brown KR, Jurisica I. Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biol. 2007;8(5):R95.
137. Brown KR, Otasek D, Ali M, McGuffin MJ, Xie W, Devani B, et al. NAViGaTOR: network analysis, visualization and graphing Toronto. Bioinformatics. 2009;25(24):3327–9.
138. Bhattacharyya R, Ha MJ, Liu Q, Akbani R, Liang H, Baladandayuthapani V. Personalized network modeling of the pan-cancer patient and cell line interactome. JCO Clin Cancer Inform. 2020;4:399–411.
139. Ingalhalikar M, Smith A, Parker D, Satterthwaite TD, Elliott MA, Ruparel K, et al. Sex differences in the structural connectome of the human brain. Proc Natl Acad Sci U S A [Comparative Study Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2014;111(2):823–8.
140. Bigler ED. Default mode network, connectivity, traumatic brain injury and post-traumatic amnesia. Brain [Research Support, NIH, Extramural Research
Support, US Gov't, Non-PHS Comment]. 2016;139(Pt 12):3054–7.
141. Bigler ED, Abildskov TJ, Goodrich-Hunsaker NJ, Black G, Christensen ZP, Huff T, et al. Structural neuroimaging findings in mild traumatic brain injury. Sports Med Arthrosc Rev [Review]. 2016;24(3):e42–52.
142. Cui LB, Liu L, Wang HN, Wang LX, Guo F, Xi YB, et al. Disease definition for schizophrenia by functional connectivity using radiomics strategy. Schizophr Bull. 2018;44(5):1053–9.
143. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs: simple building blocks of complex networks. Science. 2002;298:824–7.
144. Shen-Orr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet. 2002;31(1):64–8.
145. Mangan S, Alon U. Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci U S A. 2003;100(21):11980–5.
146. Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet. 2007;8(6):450–61.
147. Milenkovic T, Lai J, Przulj N. GraphCrunch: a tool for large network analyses. BMC Bioinform. 2008;9:70.
148. Geraci J, Liu G, Jurisica I. Algorithms for systematic identification of small subgraphs. Methods Mol Biol [Research Support, Non-US Gov't]. 2012;804:219–44.
149. Przulj N, Wigle DA, Jurisica I. Functional topology in a network of protein interactions. Bioinformatics. 2004;20(3):340–8.
150. Barrios-Rodiles M, Brown KR, Ozdamar B, Bose R, Liu Z, Donovan RS, et al. High-throughput mapping of a dynamic signaling network in mammalian cells. Science. 2005;307(5715):1621–5.
151. Zhang X, Gao P, Snyder MP. The exposome in the era of the quantified self. Annu Rev Biomed Data Sci. 2021;4:255–77.
152. Kotlyar M, Pastrello C, Malik Z, Jurisica I. IID 2018 update: context-specific physical protein-protein interactions in human, model organisms and domesticated species. Nucleic Acids Res. 2019;47(D1):D581–9.
153. McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, et al. International evaluation of an AI system for breast cancer screening. Nature [Evaluation Study Research Support, NIH, Extramural]. 2020;577(7788):89–94.
154. Rich E, Lewis S, Lupton D, Miah A, Piwek L. Digital health generation? Young people's use of 'healthy lifestyle' technologies. Bath: University of Bath; 2020.
155. Cilliers L. Wearable devices in healthcare: privacy and information security issues. Health Inf Manag. 2020;49(2–3):150–6.
156. Rejeski WJ, Brawley LR, Ettinger W, Morgan T, Thompson C. Compliance to exercise therapy in older participants with knee osteoarthritis: implications for treating disability. Med Sci Sports Exerc [Clinical Trial Randomized Controlled Trial]. 1997;29(8):977–85.
157. Bonato P. Advances in wearable technology for rehabilitation. Stud Health Technol Inform. 2009;145:145–59.
158. Shallwani S, Dalzell MA, Sateren W, O'Brien S. Exercise compliance among patients with multiple myeloma undergoing chemotherapy: a retrospective study. Supportive Care Cancer. 2015;23(10):3081–8.
159. Davies NJ, Batehup L, Thomas R. The role of diet and physical activity in breast, colorectal, and prostate cancer survivorship: a review of the literature. Br J Cancer [Research Support, Non-US Gov't Review]. 2011;105(Suppl 1):S52–73.
160. Norman A, Moradi T, Gridley G, Dosemeci M, Rydh B, Nyren O, et al. Occupational physical activity and risk for prostate cancer in a nationwide cohort study in Sweden. Br J Cancer. 2002;86(1):70–5.
161. Wannamethee SG, Shaper AG, Walker M. Physical activity and risk of cancer in middle-aged men. Br J Cancer. 2001;85(9):1311–6.
162. Abioye AI, Odesanya MO, Ibrahim NA. Physical activity and risk of gastric cancer: a meta-analysis of observational studies. Br J Sports Med. 2015;49(4):224–9.
163. Contrepois K, Wu S, Moneghetti KJ, Hornburg D, Ahadi S, Tsai MS, et al. Molecular choreography of acute exercise. Cell. 2020;181(5):1112–30.e16.
164. McTiernan A, Stanford JL, Weiss NS, Daling JR, Voigt LF. Occurrence of breast cancer in relation to recreational exercise in women age 50–64 years. Epidemiology [Research Support, Non-US Gov't Research Support, US Gov't, PHS]. 1996;7(6):598–604.
165. Goh J, Kirk EA, Lee SX, Ladiges WC. Exercise, physical activity and breast cancer: the role of tumor-associated macrophages. Exerc Immunol Rev [Review]. 2012;18:158–76.
166. Basen-Engquist K, Carmack C, Brown J, Jhingran A, Baum G, Song J, et al. Response to an exercise intervention after endometrial cancer: differences between obese and non-obese survivors. Gynecol Oncol [Comparative Study Observational Study Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2014;133(1):48–55.
167. Cannioto RA, Moysich KB. Epithelial ovarian cancer and recreational physical activity: a review of the epidemiological literature and implications for exercise prescription. Gynecol Oncol Rev. 2015;137(3):559–73.
168. Spector D, Deal AM, Amos KD, Yang H, Battaglini CL. A pilot study of a home-based motivational exercise program for African American breast cancer survivors: clinical and quality-of-life outcomes. Integr Cancer Ther [Research Support, NIH, Extramural]. 2014;13(2):121–32.
169. Gil-Rey E, Quevedo-Jerez K, Maldonado-Martin S, Herrero-Roman F. Exercise intensity guidelines for
cancer survivors: a comparison with reference values. Int J Sports Med. 2014;35(14):e1–9.
170. Irwin ML, McTiernan A, Baumgartner RN, Baumgartner KB, Bernstein L, Gilliland FD, et al. Changes in body fat and weight after a breast cancer diagnosis: influence of demographic, prognostic, and lifestyle factors. J Clin Oncol. 2005;23(4):774–82.
171. Granger CL, McDonald CF, Berney S, Chao C, Denehy L. Exercise intervention to improve exercise capacity and health related quality of life for patients with non-small cell lung cancer: a systematic review. Lung Cancer [Review]. 2011;72(2):139–53.
172. Pettapiece-Phillips R, Narod SA, Kotsopoulos J. The role of body size and physical activity on the risk of breast cancer in BRCA mutation carriers. Cancer Causes Control. 2015;26(3):333–44.
173. Friedenreich CM, McGregor SE, Courneya KS, Angyalfi SJ, Elliott FG. Case-control study of lifetime total physical activity and prostate cancer risk. Am J Epidemiol. 2004;159(8):740–9.
174. Buffart LM, Galvao DA, Chinapaw MJ, Brug J, Taaffe DR, Spry N, et al. Mediators of the resistance and aerobic exercise intervention effect on physical and general health in men undergoing androgen deprivation therapy for prostate cancer. Cancer [Randomized Controlled Trial Research Support, Non-US Gov't]. 2014;120(2):294–301.
175. Moore SC, Peters TM, Ahn J, Park Y, Schatzkin A, Albanes D, et al. Age-specific physical activity and prostate cancer risk among white men and black men. Cancer [Research Support, NIH, Extramural]. 2009;115(21):5060–70.
176. Singh AA, Jones LW, Antonelli JA, Gerber L, Calloway EE, Shuler KH, et al. Association between exercise and primary incidence of prostate cancer: does race matter? Cancer. 2013;119(7):1338–43.
177. Magbanua MJ, Richman EL, Sosa EV, Jones LW, Simko J, Shinohara K, et al. Physical activity and prostate gene expression in men with low-risk prostate cancer. Cancer Causes Control [Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2014;25(4):515–23.
178. Richman EL, Kenfield SA, Stampfer MJ, Paciorek A, Carroll PR, Chan JM. Physical activity after diagnosis and risk of prostate cancer progression: data from the cancer of the prostate strategic urologic research endeavor. Cancer Res [Research Support, NIH, Extramural Research Support, Non-US Gov't Research Support, US Gov't, Non-PHS]. 2011;71(11):3889–95.
179. Exercise, sleep quality, and mediators of sleep in breast and prostate cancer patients receiving radiation therapy. Community Oncol. 2010;7(10):463–71.
180. Galvao DA, Taaffe DR, Spry N, Joseph D, Newton RU. Combined resistance and aerobic exercise program reverses muscle loss in men undergoing androgen suppression therapy for prostate cancer without bone metastases: a randomized controlled trial. J Clin Oncol. 2010;28(2):340–7.
181. Lakoski SG, Willis BL, Barlow CE, et al. Midlife cardiorespiratory fitness, incident cancer, and survival after cancer in men: the Cooper Center Longitudinal Study. JAMA Oncol. 2015;1(2):231–7.
182. Santa Mina D, Alibhai SM, Matthew AG, Guglietti CL, Pirbaglou M, Trachtenberg J, et al. A randomized trial of aerobic versus resistance exercise in prostate cancer survivors. J Aging Phys Act. 2013;21(4):455–78.
183. Demark-Wahnefried W, Clipp EC, Lipkus IM, Lobach D, Snyder DC, Sloane R, et al. Main outcomes of the FRESH START trial: a sequentially tailored, diet and exercise mailed print intervention among breast and prostate cancer survivors. J Clin Oncol [Clinical Trial, Phase II Multicenter Study Randomized Controlled Trial Research Support, NIH, Extramural Research Support, Non-US Gov't]. 2007;25(19):2709–18.
184. Gardner JR, Livingston PM, Fraser SF. Effects of exercise on treatment-related adverse effects for patients with prostate cancer receiving androgen-deprivation therapy: a systematic review. J Clin Oncol [Review]. 2014;32(4):335–46.
185. Parsons JK. Prostate cancer and the therapeutic benefits of structured exercise. J Clin Oncol [Comment Editorial]. 2014;32(4):271–2.
186. Winters-Stone KM, Beer TM. Review of exercise studies in prostate cancer survivors receiving androgen deprivation therapy calls for an aggressive research agenda to generate high-quality evidence and guidance for exercise as standard of care. J Clin Oncol [Comment Letter]. 2014;32(23):2518–9.
187. Mennen-Winchell LJ, Grigoriev V, Alpert P, Dos Santos H, Tonstad S. Self-reported exercise and bone mineral density in prostate cancer patients receiving androgen deprivation therapy. J Am Assoc Nurse Pract [Research Support, Non-US Gov't]. 2014;26(1):40–8.
188. Antonelli JA, Jones LW, Banez LL, Thomas JA, Anderson K, Taylor LA, et al. Exercise and prostate cancer risk in a cohort of veterans undergoing
179. Sprod LK, Palesh OG, Janelsins MC, Peppone LJ, prostate needle biopsy. J Urol [Research Support,
Heckler CE, Adams MJ, et al. Exercise, sleep qual- Non-US Gov’t]. 2009;182(5):2226–31.
14 Legal and Ethical Aspects of Machine Learning: Who Owns the Data?

B. Prainsack and E. Steindl
Contents
14.1 Introduction 191
14.2 Opening the “Ethics Bubble”: What Are the Concerns? 192
14.3 Going Beyond FAT: Beyond Medical Ethics 195
14.4 Who Owns Patient Data? 196
14.5 Conclusion 199
References 200
…development and use of AI complies with ethical standards.

But there is also a more sinister reason for the current ethics bubble. Corporations that use AI to develop new services, increase market shares, and expand their global reach are currently pitching "ethics" against "regulation". Strict regulation of AI, and in particular machine learning, they argue, puts Europe, North America and other world regions at risk of falling further behind the AI capabilities of China, and is thus problematic. They suggest that rather than putting up "red tape" for technology, societies should ensure the creation of good ethics guidelines that ensure that AI is "trustworthy" ([9], and in reference to [10]). Such playing out of ethics against regulation is, of course, not only politically problematic but also factually flawed: Ethics and regulation take different forms and are issued by different institutions, but they mutually influence and enable each other. Ethical considerations are always part of regulatory processes and guidelines, and regulation, in turn, is necessary to enforce ethical norms and commitments. Also in this chapter, ethical, regulatory and legal aspects are treated as closely intertwined, and not as something that can, or should be, strictly separated.

Before we look at the legal and regulatory aspects of AI in imaging—and zoom into the question of who owns the data that is used for this purpose—let us first look at the issues that ethics scholarship has identified in this context.

14.2 Opening the "Ethics Bubble": What Are the Concerns?

There has recently been a terminological shift in discussions of the ethics of AI. Until about mid-2019, the term "artificial intelligence" was widely used as an umbrella term for all computational processes that mimic human intelligence. More recently, following criticism of the unduly vague and wide use of the term in ethical and regulatory discussions, the terms that are used have become more specific: Policy and academic papers alike increasingly use the term "machine learning" to denote applications of AI that improve with only very little, or even no, input from humans. Also in this chapter, the term machine learning is used to refer to processes and technologies whereby machines discern patterns in data with only little steering from humans, while "AI" is used to denote instances in which debates refer to even wider areas of machine "intelligence", or to the attempt to make machines act like humans.

Although AI has a history of many decades (e.g. [11]), there has been an increase in AI technologies in recent years. This is mostly due to increasing computational power and increasing opportunities for automation and digitisation. These, in turn, have been made possible by "datafication", which means the capturing and storing of information about people's lives, their bodies, and their environments that was previously unrecorded. For example, even a decade ago, the only way to learn about people's exercise levels was by asking them what type of exercise they had done within a specific period of time, and how much of it. Today, this information is, for many of us, automatically captured by activity trackers built into our smartphones, or measured in other, often remote and unobtrusive ways. The legal scholar Harry Surden called this the end of structural privacy [12], meaning that the domains of our lives and bodies that remain unseen and "uncounted" are becoming smaller and smaller. There is ever less of us and our lives that is not datafied.

For healthcare, the availability of data about various aspects of the lives and bodies of patients, often over a long period of time, is seen as an unprecedented opportunity. Here, AI is portrayed as an answer to the problem of data interpretation: While the production of data has become relatively cheap, and greater amounts of data are being produced each day, making sense of these data has remained expensive [13]. To bridge this "interpretation gap", machine learning in particular has been suggested as a solution. Moreover, in many aspects of healthcare, AI is already in use: from telemedicine to supporting communication with patients to billing and insurance. In medical imaging, molecular imaging is expected to benefit significantly from machine learning, and deep-learning based interpretation is hoped to help reduce interobserver variability in nuclear imaging (e.g. [14]; see also [15]).

What are the key ethical challenges related to AI? Over the last years, ethicists and other experts have raised a range of concerns related to AI that can be largely grouped in three clusters: fairness, accountability, and transparency (FAT). The paradigmatic challenge for fairness is biased training data (see [16, p. 176]): This is the case when a specific population group, such as elderly people, members of minorities, or the uninsured, is underrepresented in, or entirely missing from, a data set. It is not always straightforward to know, however, when bias exists, or when it is problematic [17]. For example, in the context of the training of an algorithm to classify pulmonary tuberculosis (e.g. [18]), what constitutes a non-biased dataset: A dataset that is representative of people who typically suffer from TB? One that reflects the demographic composition of the patient population treated in a specific hospital? Or a dataset that represents the demographic composition of the city? Of the entire nation even? Moreover, if it is known, for example, that minority populations have been underrepresented in training data for machine learning for years, would it be mandated for ethical reasons to oversample members of the minority populations in question to make up for previous discrimination? There are no definitive answers to these questions; instead, they illustrate the intricacies of knowing when a bias exists, and when a bias is problematic, that is, when it has a negative impact on equity.1

1 It is necessary here to clarify the difference between inequality and inequity. The two terms are often conflated in common parlance, but they mean different things. Inequality means that resources or benefits are distributed unequally over different groups. Using the example of health outcomes, if women and men have different life expectancies, that is an inequality. Not all inequalities, however, are also unfair: if the different outcome can be explained by voluntary actions, for example. If Laura and Amir, who are married, and who grew up in similar social strata and in the same town, have different health status because Amir likes to tend to the garden in his spare time while Laura goes paragliding, and due to multiple sporting accidents she now suffers chronic pain, then the difference in health status between them is not an inequity. As a rule of thumb, if we cannot find any factor that justifies different outcomes, then we should treat different outcomes as inequities.

While fairness ultimately pertains to questions about equity, the second criterion within the FAT paradigm, accountability, relates to the question of who can be held responsible for outcomes. Here, legal questions about liability also come into play. Very broadly speaking (and without consideration of specific configurations in particular jurisdictions; for more details on these, see [16, 19]), liability for harm caused by machine learning applications can only kick in when someone has been negligent, either a physician or a company. Negligence on the side of physicians or healthcare workers, in turn, requires that there is a duty of care towards patients that was breached. As Schönberger emphasises, not all erroneous predictions by an AI system that caused harm to a patient mean that physicians or the healthcare organisations they work for are liable; they can only be held accountable if they used the AI in a way that they should not have [16, p. 197].

The other type of liability besides that of physicians and healthcare providers is product liability. This becomes relevant when patients suffer harm from products that were defective in their design, manufacturing, or warning—in other words, products that did not operate as they should have. The legal concept of liability was developed with the idea in mind that those held liable would be people, not machines. Liability laws were written for people who have a sense of responsibility, which machines do not have. Moreover, machines would not be affected by any of the conventional sanctions (e.g. fines) that our law system applies. Algorithms, in contrast to book titles that suggest otherwise (e.g. [20]), do not "want" things—they are not human. This raises a few issues when liability laws are applied to machines: First, if AI works in the form of non-embedded software (meaning that the software is not built into other machines such as phones, cars, or pacemakers), then it is not clear whether it is covered by existing liability legislation such as
the European Union's Product Liability Directive, for example. Second, current approaches bypass the problem that the legal concept of liability was designed to apply to humans by holding the people who build or use the machines liable for the actions of the machines. As Schönberger argues [16], the more "autonomous" machines become, that is, the less their actions can be traced back to decisions taken by humans, the more difficult it becomes to hold the humans "behind" the machines accountable. Scholars are discussing a number of ways to address these problems. These include giving some kind of personhood status to intelligent machines (e.g. [21])2; another solution that is discussed is to hold the healthcare professionals that are using AI even more strictly accountable for the "decisions" of the machine than at present. For example, doctors would then be responsible for harm if they did not take adequate measures to evaluate how accurate the algorithm is that they are using [16].

2 The European Parliament adopted a resolution in 2017 with recommendations to the Commission on Civil Law Rules on Robotics, suggesting the creation of a legal status for robots (https://www.europarl.europa.eu/doceo/document/TA-8-2017-0051_EN.html?redirect#BKMD-12). The Commission, however, did not follow this recommendation in its recent strategies addressing AI.

The last notion in the FAT paradigm is transparency. At times, transparency is a precondition of liability, and at other times it goes beyond it. While liability refers to the consequences for someone who bears responsibility for something in the case of harm (i.e. in the case of negligence or even intentional wrongdoing), a certain level of transparency is required for the assessment of whether any wrongdoing took place. Especially in the context of unsupervised machine learning, where no function is associated with the input,3 it is often difficult, if not impossible, to know how the software arrived at a specific outcome, because the path to achieving the outcome was not designed into the system and is impossible for observers to understand. It is because of this lack of transparency that some ethicists have argued that the use of unsupervised machine learning in healthcare is ethically more problematic than supervised machine learning [22]. Such proposals, however, neglect the question of where in healthcare machine learning is put to use. If it is used in core medical contexts, such as for diagnosis and treatment decisions, then the lack of transparency seems much more concerning than if unsupervised machine learning is used within an application to enable video consultations. For this reason, we propose a graded scale of ethical scrutiny of machine learning in healthcare (Table 14.1) that distinguishes between three levels of ethical sensitivity: At the lowest level of concern are uses of machine learning (and other AI) for non-medical aspects, such as appointment scheduling or videoconferencing. At the intermediate level are applications of machine learning in key medical activities such as the establishment of a diagnosis or a treatment decision, but where machine learning is only aiding human decision making without suggesting a final decision ("thinking AI"). At the highest level of ethical sensitivity is the use of machine learning for key medical activities where the software makes the decision, e.g. a machine that automatically classifies a disease and gives a binding treatment decision ("acting AI"), which is so far not part of routine clinical care.

3 Within supervised machine learning, the machine is told by a human what to look for: e.g. it is shown pictures of dogs and then asked to look for dogs in other images. Within unsupervised machine learning, the machine is not told what to look for, but just commanded to look for patterns.

Table 14.1 A graded scale of ethical scrutiny of machine learning in healthcare

  Level of ethical sensitivity   Use of AI
  Low                            AI to support non-medical aspects (e.g. scheduling, video-conferencing)
  Intermediate                   AI to support diagnosis or treatment choice ("thinking AI")
  High                           AI to make decisions ("acting AI")

Other factors that are to be considered include whether machine learning is supervised (which is less ethically problematic because of the higher level of transparency) or unsupervised (more ethically sensitive due to lower levels of
transparency), whether the tool has been validated, and whether the people using the tool are conscious of the possibility and consequences of potential bias ("fairness through awareness", [23]).

14.3 Going Beyond FAT: Beyond Medical Ethics

Ethics guidelines, ethics codes, as well as papers addressing ethical concerns in connection with AI in healthcare regularly discuss phenomena that map against the FAT paradigm—even if they discuss these issues under different labels. But there are also contributions that raise bigger questions. A statement by the European Group on Ethics on AI, robotics and "autonomous" systems (2018), for example, draws attention to the need for AI to be put in the service of broader societal and ethical values, including human dignity, responsibility, democracy, justice, equity, solidarity, sustainability, and deliberation. Moreover, scholars such as Karen Yeung use the term "ethics washing" to refer to situations where AI ethics serves mostly as an empty vessel that can be filled with any content that seems suitable, and where ethics lacks the necessary tools to enforce its own claims [9]. Taken together, these points of critique call for an ethics that does not accept current institutional arrangements and configurations of power as they are and merely try, within these, to make AI "more ethical". Instead, they call for a political ethics that is concerned also with how new technological practices affect the distribution of entitlements, duties, and resources within and across populations. The FAT paradigm goes some way in that direction, but not far enough.

An important underpinning of such a more political ethics of AI is to leave the specificities of medical ethics behind, and instead treat AI ethics as a form of data ethics. A key argument in favour of the latter is that many ethical issues in connection with machine learning emerge due to the integration and use of large amounts of personal data. But such a move from medical to data ethics may not be as easy to do as it may seem. It would require a fundamental shift in the points of reference used by ethics frameworks—most prominently, the focus only on individual rights. As many scholars have argued, most of the risks in connection with data use are personal and collective, and they cannot be broken down into individual bits (e.g. [24]). Moreover, many of the scholars and approaches that are populating the rapidly growing field of AI ethics were trained in medical ethics or bioethics. It will be difficult to expand (and, in some cases, change) the reference points and institutional structures that these experts are operating with and within.

What is the problem with the categories and focus points of medical ethics—why can they not be transposed to AI ethics? The main reason is that the key reference point of medical ethics is the human body; the early codifications of medical ethics established that people have a right to be informed about, and consent to, what happens to their bodies. This framework emerged partly in response to the horrific human rights infringements of the Nazi period and other instances when harmful or even torturous "experiments" were imposed on people under the guise of science. Data ethics, on the other hand, does not take the physical body as its reference point, but the "data body"—which is of a very different nature. First of all, the data body does not have clear borders and boundaries; the data that represents a person, namely the data capturing her behaviour, her diseases, etc., is spread over many places and can be accessed by many people at the same time. This means, also, that the frame of an intervention that medical ethics operates with does not work for data ethics. An intervention into a person's data is not comparable to a body that is operated on to take out a gallbladder, or to test a new drug. There is often no clear beginning and no clear end to an "operation" on a dataset—data is interrogated continuously [25]. In addition, in traditional medical ethics, it is normally clearly apparent who carries out the procedure and who is at risk: The latter is normally the patient. In data ethics, "procedures" can be carried out by many different people in different places at the same time—primary and secondary data users (the latter are researchers, for example, who reuse datasets from other research teams, or
even from the clinic), commercial enterprises, etc. The people at risk from these procedures can be totally unrelated to those who have given their data. In other words, risks in data ethics are not limited to specific individuals, but they are collective.

Understanding AI ethics as a kind of data ethics, and not as a field of application for medical ethics, also affects how we think about data ownership.

14.4 Who Owns Patient Data?

This simple question is not easy to answer. It will concern us for the rest of this chapter. The problem starts with defining ownership. While the related term "property" has a clearly definable legal meaning, ownership can relate to legal entitlements, but it can also refer to a moral claim on something. People who say that they own their personal data do not always mean to express a legal opinion. Rather than implying that they have the right to destroy or sell their data, which are some of the key characteristics that distinguish property rights from other entitlements, what they often mean to say is: "I should have a say in who uses my data, what they do with it, and who benefits from it". In other words, ownership is a very broad concept that includes moral and legal elements.

But let's start at the beginning. Can we legally own data? In other words, is it possible to own something that is (at least in part) immaterial—as digital data is (see [26])? The law answers this question affirmatively; intellectual property rights protection is an example. It gives people or organisations the right to control intellectual resources that are in part, or even entirely, immaterial.

Within the European Union, the EU General Data Protection Regulation (GDPR) grants special protections to so-called personal data, that is, data that refers to a specific identified or identifiable natural person. Names and addresses are clearly personal data; but IP addresses or genomes are too [27]. Personal data is seen as disclosing things about people and their lives that they may want to be confidential or even private, and people may suffer harm if this data and information are known or used by others. For these reasons, not only the GDPR but most jurisdictions place restrictions on the collection and use of personal data. But there are crucial differences in how personal data is protected. To put it very generally, in Europe, the predominant view has been to see personal data and information as belonging to people in a moral sense, without being considered property in the legal sense. This means that personal data is not seen as something to be sold, or something that has a market value. The protection of personal data is ensured through privacy rights.

According to European law, the question of whether data can be owned has multiple layers. One layer refers to the fact that any data has to be categorised as either personal or non-personal data. Personal data is protected by a number of fundamental individual rights, such as the right to be informed, the right of access, the right to rectification, the right to erasure, the right to restrict processing, the right to data portability, the right to object, and rights in relation to automated decision-making and profiling (see Chap. 3 GDPR). These individual rights continue to exist as long as the data has not been anonymised—this means that, taking into account all the means reasonably likely to be used, the data no longer relates to an identified or identifiable person (i.e. all links to do so have been destroyed). In other words, the conception of personal data within the GDPR cannot be aligned with a third party owning somebody else's personal data.4

4 The question of the lawfulness of processing of special categories of personal data according to Article 9 GDPR has to be seen apart from any kind of possible ownership and is therefore not discussed here.

It also means that, in the European context, the question of ownership only arises regarding non-personal data. And this is where the next layer comes in: as data does not easily fit into either one of the traditional legal categories of material or immaterial, it cannot be subsumed under property that is moveable or intellectual property. The European Commission itself stressed that current
intellectual property laws are not a suitable tool for data governance [28].

In the United States, debates about whether personal information should or could be viewed as property have been complex. Some authors see property rights as the best way of protecting personal data [29]. Partially, this notion is rooted in the important role that property rights play in American self-conception. Property rights, understood—in William Blackstone's deliberately provocative description—as 'that sole and despotic dominion which one man claims and exercises over the external things of the world, in total exclusion of the right of any other individual in the universe' [30], are woven into the very foundations of American society and legal culture. Even for those scholars who say that this ideal has never been implemented in actual practice, property rights have nevertheless played a much more important role in the United States than in Europe. In U.S. discourse, treating personal data as property has served the important purpose of overcoming the shortcomings of U.S. data protection systems [31, pp. 507–508]. In contrast to the European Union, which has a data protection law that applies to the processing of all personal data and extends its territorial scope even beyond European borders, American privacy laws are sector-specific; they are tailored to specific fields such as healthcare or financial services. This has led some scholars to argue that, because American privacy laws are relatively weak, property rights are the best, or even the only, way to ensure people's control over their data.

Other authors (e.g. [32, p. 1295]) disagree with this stance. They argue that "the raison d'etre of property is alienability" [32, p. 1295]. The meaning of this statement becomes clear only if we take a closer look at how property rights are organised: Property is best conceived as a bundle of entitlements, rather than as one single right. It is the bundle of rights, rather than one specific characteristic, that sets property rights apart from other entitlements to things. Within that bundle, there are some "stand-out" rights that characterise the bundle.

To use an example from the physical world: When someone has borrowed a book from a library, the book is in her possession. She is entitled to do a lot of things: to read the book, to control who else gets to read it, and she can use it for other purposes, such as placing a laptop on top of it for a videoconference. She can exclude other people from even looking at it. But there are things that this person who has taken a book from the library is not entitled to do: She must not sell or destroy the book. These additional entitlements are reserved to the person or entity that holds property rights. In other words, the bundle of rights granted to a person due to mere possession (e.g. having the book in your house after having taken it from the library) is less "thick" than the bundle of property rights. Property rights include all rights that other forms of possession include (the right to possession, income, etc., as listed below) plus the right of alienation (selling or destroying).

Another example of the difference between weaker forms of possession on the one hand, and property rights on the other, is renting a flat. As a lawful tenant I am entitled to determine who can enter the flat, how it is decorated, and what is done inside. But only the owner (here: the holder of property rights) holds the additional rights that are also in the bundle, such as selling the flat. (The fact that I am not normally allowed to destroy my flat, even if I hold property rights, illustrates that even property rights are not unlimited—even they can be restricted to protect important other rights and interests. In the interest of public safety and security I am not allowed to burn down my flat, or to neglect it to such an extent that it becomes a public nuisance.)

Back to digital data: how does this difference between property rights and "weaker" forms of possession that apply to tangible goods such as books or flats work with intangible things such as data? As noted, although data has a tangible, material element, including the technical infrastructures that enable its collection, storage, and use, at least a part of it is immaterial.

In order to answer this question it is helpful to unpack the bundle of rights and entitlements that make up property rights. Denise Johnson [33], drawing upon Honore's famous work in the 1960s [34], names the following entitlements as part of the bundle of property rights:
1. The right to possess. As in the examples of the library book and the rented flat above, the person who rightfully possesses has exclusive control of a thing. When the thing that is owned is intangible, then, as Honore put it, possession is the right to exclude others from using or benefitting from the thing. Moving to the digital realm, for data in the healthcare domain, such as imaging data and lab results, it is very difficult to conceive what such "exclusive" control would look like. When an imaging department that does a cardiac perfusion scan on a patient owns the imaging data (because the patient may have agreed to this when signing the consent form for the procedure), "exclusive control" means that they can share the data with third parties—they can even sell the data. But does it mean that they can exclude the patient from accessing their own perfusion scan? Wherever the GDPR is applicable, this stance would be difficult to argue—because as long as the perfusion scan is seen as personal data—i.e. as data that is linked to an identified or identifiable person (note that this includes pseudonymised data)—then the patient has a right to access—or even initiate the erasure of—her own data even though she does not hold property rights to it [35].

2. The right to manage gives people the right to decide who can use the thing that is possessed, and how. It includes the right of lending or contracting out (see also [36]). This right seems relatively unproblematic in connection with digital data, except that it may be difficult to exclude patients from using their own data as long as this data is considered personal data—as explained in point (1). Referring to our example of the perfusion scan above, this means that the entity that holds property rights to the perfusion scan data can decide who gets access to it, for what purpose it can be used, and who can commercialise it. They may not, however, be able to refuse patients access as long as the imaging data can be linked to an identified or identifiable person.

3. The right to income allows the property rights holder to allow others to use the thing and to pay her for this use. This right is closely related to the previous one, namely the right to manage; the difference between the two is that the right to income focuses on the money that one receives in return—for other people using the thing, for example (see also [36]). This seems no more difficult to enforce in the case of digital data than it is with owning a physical object.

4. The right to capital—which is the right that allows a person to alienate the thing, namely to give it away, to consume it, to change it, or to destroy it. The problem here is that it is not so easy to decide what "consuming" or "destroying" data means. Physical things are consumable and rivalrous: They can be 'used up', and the use of the good by one person affects the use of the good by others. Many authors argue that the same cannot be said for digital data, as it is considered to be neither consumable nor rivalrous: The perfusion scan data does not disappear, or deteriorate, if lots of people use it; and one research group using it does not detract from the utility of the data for another. Having said this, whereas the data itself is not consumable or rivalrous, its value can be: the value of a dataset can be highest for those who have exclusive use; and it can, of course, be affected by many people using it. Think of proprietary information such as search algorithms, or information on commercial mergers that are likely to affect stock prices, for example. For these reasons, digital data is best described as simultaneous [26]: It can be in more places than one at the same time, it can be copied and used by several people at the same time, independent of what the others are doing, and it leaves traces even when it is deleted. Because the value of data can be rivalrous, it is arguably this multiplicity of data that is the key difference between physical entities and digital data with regard to the right to capital. In situations where those holding property rights to data cannot
control all copies of the dataset (or do not even know where all the different copies are), the right to capital may be difficult to enforce.

5. The right to security protects the rights holder from expropriation. In Quigley's words [36, p. 633], it is "the assurance that a person […] will not be forced to give it up without adequate recompense." It is not difficult to conceive of this right with respect to digital data.

6. The power of transmissibility means that the rights holder can give the thing that s/he owns to somebody else, either before or after his/her death. Also here, it is not difficult to imagine this right being applied to digital data (for the instrument of post-mortem data donation specifically, see [26, 37]).

7. The absence of term: This means that the length of ownership is not time-limited.

8. Now we are moving into the provisions within the bundle of property rights that are duties and liabilities rather than entitlements: The first one is the prohibition of harmful use, meaning that even the person who owns a thing is not free to do with it whatever she pleases; the boundaries of her freedom are the rights of others. In the physical world this is best described with a knife: Even if I hold all entitlements of the bundle of property rights to the knife, I am not allowed to use it to cut into another person. With regard to data, the prohibition of harmful use raises really interesting questions: Does this only mean that the data owner herself is not allowed to use the data in a harmful way? Or does it include a duty to actively prevent others from using the data in a harmful way? Does this mean that restrictions on data sharing may be required as a preventive measure? These questions remain open.

9. Those who hold property rights are also liable to execution, which means that the thing that is owned can be taken away for the repayment of a debt, for example. It is conceivable that this would apply to digital data: if the data has commercial value, ownership of a dataset could be taken away to pay for something that the rights holder owes.

10. Last but not least, property rights have a residuary character: This means that, even if the property rights holder has given away many entitlements within the bundle (e.g. she has leased her property to someone else), she still holds whatever is left of the bundle. To the extent that the bundle of property rights can be applied to digital data, the residuary character does not pose any additional complications.

In sum, many of the entitlements and duties within the bundle of rights that constitute property rights—which were originally developed for physical things—cannot be neatly transposed to digital data. Because of the multiple nature of digital data (the ability of digital data to be at several places at the same time), it is more useful to speak about the right to control data in the context of medical imaging than about data ownership. Because of the complexities laid out in this chapter, and because of the moral and legal connotations of the term, the notion of ownership tends to confuse more than it clarifies when applied to digital data.

14.5 Conclusion

This chapter started with the diagnosis that we are amidst an "AI ethics bubble", where especially corporate interest in the ethics of AI and machine learning is extremely high. Technology corporations and other businesses provide funding for ethics institutes and endowed chairs on AI ethics at leading universities, and co-opt academics into the ethics governance of their own companies. The pitching of "ethics" against "regulation" has been part of this process.

Taking the stance that ethics and regulation, albeit having different emphases, complement and require each other, rather than being clearly separable, this chapter then opened up the "ethics bubble" of AI. Our diagnosis was that most of the ethical concerns identified and discussed in this
context map against the so-called FAT paradigm. It orders concerns in several clusters, including fairness, accountability, and transparency. While this typology is extremely helpful, we proposed to take a step further and go beyond the FAT paradigm. In order to do so, we suggested going beyond the toolbox of medical ethics and drawing more strongly upon the instruments in the growing field of data ethics. This is necessary, we argued, because the reference point of medical ethics is the physical body, which has clear boundaries. The same does not apply to people's data bodies, which are far from clearly bounded: Data is multiple in the sense that it can be in several places at the same time.

What, then, does this mean for the question of data ownership? Who owns the data that medical imaging departments work with? The final section of this chapter seeks to answer this question by discussing how the "bundle of rights" that makes up property rights can be applied to digital data. We conclude that because of the multiple nature of digital data, some of the entitlements and duties within the bundle of property rights can be applied to digital data only with difficulty.

References

1. Moss E, Metcalf J. The ethical dilemma at the heart of big tech companies. Harvard Business Rev. 2019. https://hbr.org/2019/11/the-ethical-dilemma-at-the-heart-of-big-tech-companies. Accessed 24 Apr 2020.
2. Ochigame R. The invention of "ethical AI". The Intercept. 2019. https://theintercept.com/2019/12/20/mit-ethical-ai-artificial-intelligence/?comments=1. Accessed 24 Apr 2020.
3. O'Neil C. Weapons of math destruction: how big data increases inequality and threatens democracy. Crown; 2016.
4. Ford M. Rise of the robots: technology and the threat of a jobless future. New York: Basic Books; 2015.
5. Frey CB, Osborne MA. The future of employment: how susceptible are jobs to computerisation? Technol Forecast Soc Chang. 2017;114:254–80.
6. Chockley K, Emanuel E. The end of radiology? Three threats to the future practice of radiology. J Am Coll Radiol. 2016;13(12):1415–20.
7. Grace K, Salvatier J, Dafoe A, Zhang B, Evans O. When will AI exceed human performance? Evidence from AI experts. J Artif Intell Res. 2018;62:729–54.
8. Obermeyer Z, Emanuel EJ. Predicting the future—big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216.
9. Yeung K, Howes A, Pogrebna G. AI governance by human rights-centred design, deliberation and oversight: an end to ethics washing. In: The Oxford handbook of AI ethics. Oxford: Oxford University Press; 2019.
10. European Commission. Ethics guidelines for trustworthy AI. 2019. https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai. Accessed 16 May 2020.
11. Haenlein M, Kaplan A. A brief history of artificial intelligence: on the past, present, and future of artificial intelligence. Calif Manag Rev. 2019;61(4):5–14.
12. Surden H. Structural rights in privacy. SMUL Rev. 2007;60:1605.
13. Prainsack B. Precision medicine needs a cure for inequality. Curr Hist. 2019;118(804):11–5.
14. Choi H. Deep learning in nuclear medicine and molecular imaging: current perspectives and future directions. Nucl Med Mol Imaging. 2018;52(2):109–18.
15. Choi H, Ha S, Im HJ, Paek SH, Lee DS. Refining diagnosis of Parkinson's disease with deep learning-based interpretation of dopamine transporter imaging. Neuroimage Clin. 2017;16:586–94. https://doi.org/10.1016/j.nicl.2017.09.010.
16. Schönberger D. Artificial intelligence in healthcare: a critical analysis of the legal and ethical implications. Int J Law Inform Technol. 2019;27(2):171–203.
17. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–8.
18. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574–82.
19. Pesapane F, Volonté C, Codari M, Sardanelli F. Artificial intelligence as a medical device in radiology: ethical and regulatory issues in Europe and the United States. Insights Imaging. 2018;9(5):745–53.
20. Finn E. What algorithms want: imagination in the age of computing. Cambridge, MA: MIT Press; 2017.
21. Vladeck DC. Machines without principals: liability rules and artificial intelligence. Washington Law Rev. 2014;89:117.
22. Jannes M, Friele M, Jannes C, Woopen C. Algorithms in digital healthcare. An interdisciplinary analysis. Gütersloh: Bertelsmann Stiftung; 2019.
23. Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R. Fairness through awareness. In: Proceedings of the 3rd innovations in theoretical computer science conference; 2012. p. 214–26.
24. Taylor M. Genetic data and the law: a critical perspective on privacy protection. Cambridge: Cambridge University Press; 2012.
25. Metcalf J, Crawford K. Where are human subjects in big data research? The emerging ethics divide. Big Data Soc. 2016;3(1):1–14. https://doi.
15 Artificial Intelligence and the Nuclear Medicine Physician: Clever Is as Clever Does

Roland Hustinx
Division of Nuclear Medicine and Oncological Imaging, University Hospital of Liège, GIGA-CRC In Vivo Imaging, University of Liège, Liege, Belgium
e-mail: [email protected]
Contents
15.1 I Am Looking Forward to More A.I. in My Practice Because… 204
15.1.1 The Images Will Look Prettier 204
15.1.2 My Life Will Be Easier 204
15.1.3 My Patients Will Be Better Off 205
15.2 I Am Wary of More A.I. Because… 206
15.2.1 I Don’t Understand It 206
15.2.2 I Don’t Trust It 207
15.2.3 I Don’t Want It 207
15.3 How to Proceed? Let’s Be Practical! 208
References 210
For several years now, the role and place of artificial intelligence (A.I.) in radiology have been discussed and debated in all strata of the radiological field. From university hospitals to private centers, from large companies to countless start-ups, from scientific societies to medical associations, all are very actively and vocally involved. The U.S. Centers for Medicare and Medicaid Services' (CMS) decision in September 2020 to provide its first-ever reimbursement of a radiology A.I. algorithm is expected to open the door to broader coverage of imaging A.I. software in the clinics. The feeling in radiology is that A.I. is no longer a prospect, it is a reality. The physician's attitude has shifted from the fear that "A.I. will replace radiologists" to the belief that "radiologists who use A.I. will replace those who don't."

A.I. has been much less present in the field of nuclear medicine (NM), which is distinct from radiology as a medical specialty in most countries. However, they share similar technologies, in particular the cross-sectional techniques used in hybrid imaging, e.g. CT and MRI. There is no reason that the advances, solutions, and new problems highlighted by A.I. in the radiological field should not be observed sooner or later in the NM field. Some of our practical specificities, such as the complication of dealing with short-lived isotopes for scheduling the clinical activity, or the complexities of individual dosimetry in treatments with radiopharmaceuticals, should, on the contrary, constitute excellent fields where A.I. helps our practice. Nonetheless, it is indisputable that NM is lagging behind radiology in the clinical implementation of A.I. Whatever the reasons (increased susceptibility of NM techniques to local methodological variables, difficulty in gathering large curated datasets, or perhaps a smaller market that is less attractive for industry), we do not seem close to seeing any reimbursement of an A.I. add-on in our field. It is only a matter of time, however, and it should give NM physicians the opportunity to better prepare and contribute more actively to shaping how A.I. will be integrated into our practice. The question is essentially twofold: what would be the role of NM physicians in a medical era where A.I. is more and more present, and what must we learn and do to shape this future?

In this chapter we shall consider successively the benefits of A.I., the threats and the obstacles that accompany its implementation, and finally the possible steps that need to be taken for a successful and mutually satisfactory embedment of A.I. in clinical nuclear medicine. These questions shall be considered looking at the three axes of involvement of A.I. in the field of NM: physics, i.e. how A.I. will impact image acquisition and reconstruction; operational, i.e. how A.I. will optimize health care delivery through improved scheduling and overall organization; and clinical, which encompasses all applications aiming at improving the interpretation of the studies (not limited to the images) in terms of diagnostic accuracy, prognostic and predictive value, or individual pre-treatment dosimetry.

15.1 I Am Looking Forward to More A.I. in My Practice Because…

15.1.1 The Images Will Look Prettier

In theory, we nuclear medicine physicians should benefit from the introduction of A.I. in all three fields, and the physics applications are probably the most obviously welcome. Indeed, we will be looking at images obtained with lower injected activity, i.e. lower patient exposure [1]. Studies will be shorter to acquire, leading to improved patient comfort and experience, fewer movement artifacts, and also increased throughput. X-ray exposure may also be reduced by using deep learning (DL) for attenuation correction, hence removing the need for low-dose, attenuation-correction-only CTs [2]. A.I. has the potential to further enhance the image quality through improvements in the co-registration of the CT and SPECT/PET parts of hybrid studies. This may have major implications in particular in studies where misregistrations may have significant clinical implications. This is the case, for instance, when using the diagnostic CT study along with the [99mTc]MAA SPECT/CT study for determining the activity of [90Y]-labeled microspheres to inject during selective intra-arterial radiation therapy. In summary, considering the images and their content as a product, we will be working with better-quality material, and nobody would argue against that.

Furthermore, improved, faster, and more robust automated A.I.-based segmentation algorithms will streamline the data analysis. For instance, [18F]FDG PET/CT is key in the management of diffuse large B cell lymphomas (DLBCL), and the metabolic tumor volume (MTV) appears to be a metric that further improves its prognostic value. The current consensus tends towards using a fixed maximum standardized uptake value (SUVmax) threshold of 4, but even when semi-automated, the process is tedious, time-consuming, and imperfectly reproducible [3, 4]. Automated algorithms based on DL have been proposed for this task [5], and in all likelihood most of us should see those as a welcome addition to our daily routine (a short sketch of the fixed-threshold baseline that such algorithms aim to replace is given at the end of Sect. 15.1.3).

15.1.2 My Life Will Be Easier

The introduction of A.I. into the operation of the NM department should also benefit the physicians, through optimization of resources. This has been demonstrated in radiology departments [6], and it should prove even more relevant in NM, which is dealing with isotopes, including short-lived ones. Patient scheduling, radiopharmaceutical preparation, and report generation are operational activities all susceptible to benefit from A.I., provided that the physicians, radiopharmacists, and administrative staff strongly contribute to framing the A.I. intervention and fully stay on top of the processes. The worst-case scenario would be an A.I.-supported take-over by non-medical, bureaucratic supervisors who would consider that A.I. provides them with all the insight needed to optimally manage an NM department, without a significant contribution from the physicians. A basic task, often overlooked, but which is responsible for a significant waste of time for the NM physician, is to recover and organize previous studies, not only in NM but also in other modalities. It is often difficult to streamline a process that involves different providers, for the PACS and the different viewers that may coexist in a department. Operational A.I. would be of great value in this setting.

15.1.3 My Patients Will Be Better Off

More generally, NM physicians are used to looking at images but also at data. Radiomics and A.I. will provide more data, more reliable data, and new ways of interpreting these data. NM should therefore be a fertile ground for these developments in diagnostic and prognostic applications in general. However, we must first study the terrain before attempting to consider the practical impacts that can be expected in clinical NM. Activity profiles are very different in academic centers and public and private services. They also vary from country to country, in Europe and across the world. Some services work primarily with single-photon NM, i.e. bone scan, myocardial perfusion scintigraphy, and a range of studies performed less frequently, such as kidney, thyroid, or parathyroid scans. These studies, when added together, constitute a significant contribution to the production of these services. The relative contribution of hybrid imaging (SPECT/CT) also varies considerably from center to center. In yet other departments, most of the activity relates to PET/CT, and some regularly perform a large number of non-FDG studies, such as radiolabeled PSMA ligands. In addition, theranostic approaches, with the accompanying treatment procedures, also occupy very different places in NM centers. Therefore, it is clear that considering the potential impact of A.I. in the field of NM involves first trying to understand the major trends in the future development of the specialty itself. A systematic review published in 2019 showed a strong imbalance in A.I. applications towards oncology, which accounted for 86% of all publications in the A.I. and radiomics fields [7]. Hence, one may infer that those centers where oncology, and more specifically high-end, tertiary- or quaternary-care oncology, is more prevalent will experience the most immediate impact of A.I. on their clinical practice. Neurology and cardiology are probably next in line in terms of clinical implementation. From the physician's perspective, the initial steps in this clinical implementation process should be quite exciting. We can expect to benefit from a growing number of A.I. toolkits designed to perform dedicated and highly focused tasks, such as characterizing lung nodules using [18F]FDG PET and CT, or recognizing normal patterns, e.g. non-pathological studies in whole-body bone scans with [99mTc]-labeled diphosphonates. Such tasks should prove to be of great benefit to the specialty, and our patients, by improving the quality and reliability of the diagnostic information contained in our reports. We would always maintain a holistic, human-centered approach to the NM imaging field, as we would use these A.I. tools to merely complement an otherwise unchanged process of interpreting images and the quantitative data that supports them. Personalized dosimetry may also be helped by A.I. and thus gain further acceptance in the clinical field. For instance, similar to diagnostic studies, A.I. may lead to shorter acquisition times for the [177Lu] SPECT studies, or better model and predict voxel-wise dosimetry measurements. Again, the final decision, i.e. should we treat the patient and, if yes, the activity to be administered, would remain in the physician's hands, albeit better armed for making those decisions.
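The fixed-threshold MTV measurement discussed in Sect. 15.1.1 is a useful illustration of the kind of narrowly scoped, quantitative task that such toolkits address. The sketch below makes that baseline concrete. It is a minimal sketch only, assuming a hypothetical NumPy array suv that is already calibrated in SUV units and a known voxel volume; it is not a clinical implementation, and the DL-based algorithms cited in [5] aim to replace precisely this brittle thresholding step with a learned segmentation.

    import numpy as np

    def mtv_fixed_threshold(suv, voxel_volume_ml, threshold=4.0):
        """Estimate metabolic tumor volume (mL) with a fixed SUV threshold.

        suv             : 3D NumPy array of voxel values already calibrated in
                          SUV units (hypothetical input; in practice derived
                          from a DICOM PET series restricted to a lesion VOI).
        voxel_volume_ml : volume of one voxel in millilitres.
        threshold       : fixed SUV cut-off; 4.0 follows the consensus cited above.
        """
        mask = suv >= threshold          # binary segmentation by thresholding
        return float(mask.sum()) * voxel_volume_ml

    # Toy usage on a synthetic volume (4 x 4 x 4 mm voxels = 0.064 mL each)
    rng = np.random.default_rng(0)
    suv = rng.gamma(shape=2.0, scale=0.5, size=(128, 128, 128))  # background-like values
    suv[60:70, 60:70, 60:70] = 8.0                               # synthetic "lesion"
    print(f"MTV = {mtv_fixed_threshold(suv, voxel_volume_ml=0.064):.1f} mL")

Even this toy example shows why the manual version is tedious and imperfectly reproducible: the result depends entirely on how the volume of interest is delineated and on stray background voxels that happen to exceed the cut-off.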
With all of these largely positive elements, the transition to AI-augmented nuclear medicine should be smooth and easy. All we have to do is first learn how to use the new tools, and then how extensively to trust them. We already use quantitative algorithms that compare individual studies to population-based normality, such as the Cedars-Sinai program in myocardial perfusion imaging (MPI) or Statistical Parametric Mapping (SPM) in FDG brain PET studies, among many others. These are useful tools, fully integrated into the clinics, but their conclusions do not replace those of the NM physician. Obviously, however, this is not the full story. Indeed, A.I. undoubtedly contains threats to the practice of nuclear medicine as we know it, and as some of us might want to keep it. And other obstacles stand in the way of a smooth implementation of A.I. in clinical NM.

15.2 I Am Wary of More A.I. Because…

15.2.1 I Don't Understand It

This represents perhaps the greatest obstacle on A.I.'s path towards clinical nuclear medicine. As stated previously, we as NM physicians are used to dealing with data, numbers, values, and quantitative measurements in addition to looking at images. We understand the relationship between these numbers and results and the physiological, biological, or biochemical processes that underlie them. We easily translate time/activity curves into a glomerular filtration rate. We understand how to translate counts per pixel into the SUV, as a semi-quantitative measurement of glucose metabolism. We also understand, and know very well, all the factors that affect the variability of the SUV. We also know that we could, if we wanted to, obtain absolute measurements such as the glucose metabolic rate in mmol/min/g of tissue.
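As a minimal illustration of how transparent this quantitation is, body-weight SUV can be written out in a few lines. The Python sketch below is ours, not a standard library routine: the function name and inputs are assumptions, and it presumes the activity-concentration image is in Bq/mL and that the injected activity is decay-corrected from injection to scan time.

import numpy as np

def suv_bw(activity_conc_bq_ml, injected_bq, weight_kg,
           half_life_s=6588.0, delay_s=3600.0):
    """Body-weight SUV. Default half-life is that of 18F (~109.8 min);
    delay_s is the injection-to-scan interval."""
    decayed = injected_bq * 0.5 ** (delay_s / half_life_s)
    # Assuming unit tissue density, 1 kg ~ 1000 mL, so dividing Bq/mL
    # by Bq/g yields the familiar dimensionless SUV.
    return activity_conc_bq_ml / (decayed / (weight_kg * 1000.0))

pet_slice = np.full((128, 128), 5000.0)   # placeholder image, Bq/mL
suv = suv_bw(pet_slice, injected_bq=370e6, weight_kg=75.0)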
Every nuclear medicine physician knows the difference between filtered back-projection and iterative reconstruction. We have been trained to master the basics of physics and instrumentation, and we are able to speak with, or at least listen to, our fellow physicists and engineers. However, our training in computational science and our understanding of probabilistic learning are quite limited. For many of us, the leap to radiomics is reasonably doable, because radiomic features are quantitative, defined by explicit formulas, and we can assess their confounders. Basically, the good old SUV is nothing more than a basic radiomic feature. The more advanced features remain conceptually similar, whether they measure signal heterogeneity, shape, or intensity, i.e. aspects of the biological phenomenon responsible for the accumulation or distribution of the tracer.
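As an example of such an explicit formula, a first-order heterogeneity feature such as intensity entropy can be computed directly from the voxel values inside a segmented lesion. The sketch below uses our own naming and a simple fixed-bin discretization; real pipelines (e.g. those following the IBSI definitions) specify the preprocessing far more strictly.

import numpy as np

def intensity_entropy(voxels, n_bins=64):
    """First-order (histogram) entropy of the SUVs within a lesion
    mask: a basic measure of tracer-uptake heterogeneity."""
    hist, _ = np.histogram(voxels, bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins
    return -np.sum(p * np.log2(p))     # entropy in bits

lesion_suvs = np.random.lognormal(mean=1.0, sigma=0.4, size=500)
print(intensity_entropy(lesion_suvs))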
The leap to A.I. is much more difficult, because our scientific background has not prepared us for it. We do not have the mental tools to fully understand the basics of a U-Net architecture. Even without considering DL, the more basic machine learning algorithms, such as random forests and support vector machines, are not entirely part of our natural domain of competence.
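For readers curious about what "the basics of a U-Net" amount to, the sketch below is a deliberately tiny encoder-decoder with a single skip connection, written in PyTorch purely as an illustration; real U-Nets stack several such levels, and none of the layer sizes here are meaningful.

import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """One-level U-Net: encode, downsample, decode, then concatenate
    the skip connection so fine spatial detail is preserved."""
    def __init__(self, in_ch=1, out_ch=2, width=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(width, width, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(2 * width, width, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(width, out_ch, 1)

    def forward(self, x):
        skip = self.enc(x)               # full-resolution features
        mid = self.mid(self.down(skip))  # coarse, downsampled features
        up = self.up(mid)                # back to full resolution
        return self.head(self.dec(torch.cat([up, skip], dim=1)))

logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # output shape: (1, 2, 64, 64)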
Furthermore, the relationship between the images, the quantitative features abstracted from the images, and the biology is lost after going through the DL process. Moreover, with A.I. in medicine, high performance is often associated with high opacity. Hence the call for explainable and interpretable A.I. Some authors have gone further in distinguishing explainability and causability [8]. The former "highlights decision-relevant parts of the used representations of the algorithms and active parts in the algorithmic model, that either contribute to the model accuracy on the training set, or to a specific prediction for one particular observation." The latter refers to "the extent to which an explanation of a statement to a human expert achieves a specified level of causal understanding with effectiveness, efficiency and satisfaction in a specified context of use." In other words, an algorithm is explainable if we understand the effect of the variables on all the moving parts that constitute the algorithm, and it fits the causability criterion if the end result, i.e. the conclusion at the end of the computation, is efficiently and transparently actionable.
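One widely used family of explainability tools does literally what the quoted definition describes: it highlights the input regions that drive a particular prediction. The sketch below computes a simple gradient-based saliency map, reusing the TinyUNet toy model defined above; it illustrates the idea only and is not the method of any specific product.

import torch

def saliency_map(model, image, target_class):
    """Gradient of the target-class score with respect to the input:
    pixels with large gradients are 'decision-relevant' here."""
    image = image.clone().requires_grad_(True)
    model(image)[0, target_class].sum().backward()
    return image.grad.abs().squeeze()

model = TinyUNet()                       # toy model from the sketch above
pet_slice = torch.randn(1, 1, 64, 64)    # placeholder input
sal = saliency_map(model, pet_slice, target_class=1)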
15.2.2 I Don't Trust It

Obviously, it is difficult to trust processes that are poorly understood, which is why explainability and causability are prerequisites for trust. Beyond that, A.I. is not free of risk; in particular, it can generate errors. For example, image reconstruction with DL can lead to artifacts and alterations that could have clinical impact [9]. Machine learning algorithms, even the smartest, can be fooled by minute alterations to the input data and completely mishandle the data, in a way that humans are not subject to [10]. This is the so-called "adversarial machine learning" well known in the A.I. community, and the concept has been extended to the field of radiomics [11]. This raises the specter of an initially effective and fully validated A.I. algorithm turning into a mill generating misleading interpretations and erroneous decisions. The validation process itself needs to be validated. The medical literature is not devoid of papers that, although peer-reviewed in a seemingly appropriate fashion, are severely impaired methodologically. Many questions arise concerning the statistical methods for assessing the performance of an algorithm. Most articles in NM use the area under the receiver operating characteristic curve (AUC ROC) as the main metric for assessing the performance of the model when the outcome is binary, i.e. recurrence/no recurrence, malignant/not malignant, etc. Yet in the presence of unbalanced data, the AUC artificially inflates the performance of the model [12]. At the very least, the most appropriate metric, e.g. the AUC or the F-score, should be chosen according to the sample distribution and the hypothesis, and more specific tests probably need to be developed [13].
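The imbalance problem is easy to demonstrate. In the hedged sketch below (scikit-learn on synthetic data with roughly 5% positives), a model can post a flattering ROC AUC while its F1-score on the minority class remains modest; the dataset and classifier are arbitrary stand-ins, chosen only to make the contrast visible.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic, heavily unbalanced binary problem: ~5% "recurrence".
X, y = make_classification(n_samples=5000, weights=[0.95],
                           flip_y=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

print("ROC AUC:", roc_auc_score(y_te, proba))                # typically looks high
print("F1 (positives):", f1_score(y_te, clf.predict(X_te)))  # often far less flattering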
Further improving and perfecting the A.I. should be accompanied by further safeguards. Current typical A.I. models are essentially static, in that they have been trained using samples corresponding to a population that was fully validated at the time the model was built. They are efficient on test sets that correspond to their training sets. Those static algorithms may be subject to concept drift, which means that even though a task was at first efficiently and reliably fulfilled, this is no longer the case when the patient population evolves or when the technique changes. So ideally, the algorithms should not stop learning, i.e. they should adapt along with the modifications introduced in the sets of data to analyze. This is continuous learning, or continual A.I. [14]. The algorithm learns to learn: it incrementally adapts to new characteristics found in the input data, constantly updating its feature selection to better fit its changing environment. Intuitively we may realize the advantages of such a process, but we also realize that it should be associated with a constant "revalidation process." Indeed, catastrophic interference, or forgetting, may occur when extreme outliers wreak havoc in an autonomously relearning algorithm. To put it simply, even A.I. algorithms that were fully validated and trustworthy at the time of marketing and clinical implementation need to go through extremely stringent, continuous quality controls.
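A pragmatic building block for such a quality-control loop is distribution monitoring: flag when incoming data stop resembling the data the model was validated on. The sketch below uses a two-sample Kolmogorov-Smirnov test on a single input feature (here imagined as lesion SUVmax) as a deliberately simple drift alarm; the feature choice, threshold, and cohort sizes are assumptions, and production systems would use richer multivariate checks.

import numpy as np
from scipy.stats import ks_2samp

def drift_alarm(reference, incoming, alpha=0.01):
    """Two-sample KS test between the validated reference cohort and a
    recent batch of the same feature; a small p-value suggests the
    input distribution has shifted and revalidation is warranted."""
    stat, p_value = ks_2samp(reference, incoming)
    return p_value < alpha, p_value

rng = np.random.default_rng(0)
reference = rng.lognormal(1.5, 0.5, size=2000)   # e.g. SUVmax at validation time
incoming = rng.lognormal(1.8, 0.5, size=200)     # new scanner or new population
print(drift_alarm(reference, incoming))          # -> (True, very small p-value)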
15.2.3 I Don't Want It

The ultimate, and most compelling, question is "where does the physician fit in this puzzle?" Say we end up with a multitude of A.I. algorithms dedicated to a multitude of specific tasks, possibly running in parallel and selected depending on the patient's medical profile and the issue at hand. Say those algorithms are constantly learning and, one way or another, the process is safeguarded by multiple checkpoints. Once we get there, the role of the physician could go either way. Either the physicians remain in charge of the patient's care, responsible before the law; they keep receiving the medical fees, and thus decide when and how to use the A.I. tools. Or the physicians do not have the knowledge and expertise to correct the A.I. tools when they are wrong; they do not even know when an A.I. tool is wrong, and they are surrounded by so many effective A.I. tools that the gestalt, which was the heart of the medical profession, is no more than the vestige of a bygone era, so much so that the physicians no longer enjoy the confidence of the public and of health care providers. The debate remains very vivid in the radiology community.
The prophecy G. Hinton playfully made in 2016 ("People should stop training radiologists now. It's just completely obvious within five years that DL is going to do better than radiologists") has not been verified yet, but the question still circulates in decision-making circles. The Dutch Finance Minister Wopke Hoekstra very recently commented that "The work of the radiologist to a significant extent has become redundant, because … a machine can read the images better than humans who studied 10 years for it" [15]. The answers coming from medical and scientific organizations are only half-convincing. They argue that, as the medical demand is increasing, A.I. will take care of the automated, time-consuming tasks, always in support of the physicians, whose number will remain stable, hence improving the cost-effectiveness ratio of the radiological profession. They add that "AI will still make mistakes, which can be easily corrected by a human, by a radiologist. But will not be possible for AI to correct itself" [15], which, as we have seen, represents more wishful thinking than hard truth. Furthermore, considering the balance of "who corrects whom," past experience with computer-assisted diagnosis is not uniformly encouraging as, in some instances, radiologists tend to ignore or overturn the computer prompts even when they are correct [16]. Needless to say, implementation of A.I. in the clinics has massive implications in terms of legal responsibilities, but this topic would deserve a full chapter.

15.3 How to Proceed? Let's Be Practical!

Radiology is ahead of nuclear medicine, and seems caught in a circular argument: A.I. is here to stay, it is going to be faster, more powerful, and more reliable for organizing the departments and providing the clinicians with the most relevant information, yet radiologists need to remain totally in charge and in full control.

The key issues are probably the validation of the A.I. algorithm and its endpoint. A typical approach is to compare the A.I. with the human truth. A good example is provided by Sibille et al., who identified, located, and segmented over 12,000 regions in 629 FDG PET/CT studies performed in lymphoma and NSCLC patients [5]. A DL algorithm using both the PET and the CT data performed very well for these tasks, with 87.1% sensitivity and 99% specificity in classifying the lung cancer patients, and 88.6% localization accuracy in the same population. Similar results were obtained in the lymphoma patients. In this case, the network is trained to do as well as the physician. It does not reach this level of performance, but it comes close enough, and is thus proposed as an adjunct to the physician's interpretation. Here we do not know the ground truth, and we do not know who is right in the discrepant cases (the human "gold standard" or DL?), but it does not matter, as the product is designed to help the physician accomplish this task, including its potential flaws. This is a very marketable product, because it does not change the paradigm: the physician remains in charge, and the product is a tool that automates and accelerates a process. It has been trained to replicate the human's process, and it is designed to be checked by humans.
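Reports like these rest on a handful of confusion-matrix quantities, which are worth being able to recompute oneself. The sketch below derives sensitivity, specificity, and accuracy from raw true/false positive and negative counts; the counts shown are placeholders, not figures from the cited studies.

def diagnostic_metrics(tp, fp, tn, fn):
    """Standard metrics from a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn)   # recall among the diseased
    specificity = tn / (tn + fp)   # recall among the healthy
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Placeholder counts for a hypothetical reader study:
print(diagnostic_metrics(tp=88, fp=2, tn=198, fn=12))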
Following this approach does not fully exploit the capacities of A.I. Zhao et al. recently went further with their report on DL for diagnosing metastatic involvement on bone scintigraphy [17]. They studied over 12,000 cases, and the endpoint was clear-cut, i.e. the presence or absence of bone metastases on the scintigraphy. They showed an overall accuracy of 93.4%, with 92.6% sensitivity, 93.9% specificity, and an AUC of 0.964, consistent across cancer types. This compared favorably with the performance of experienced NM physicians: in 13 of 200 cases read in parallel, the A.I. was correct and all three physicians were wrong, compared with only 6 cases where the reverse was true. And this was obtained at lightning speed, as only 11 seconds were needed to interpret 400 cases, which is…fast! As a comparison, it took the NM physicians an average of 136 minutes to read those 400 studies, i.e. almost 3 studies per minute, which, for a human being, is also quite fast. This paper is a good case study. Published in a prestigious journal, its conclusion is unequivocal: A.I. is faster, better, and cheaper than the physicians. Case closed.
In this model, there is no need for a physician in control, no A.I. at the service of the physician, and no A.I. as a complement or support to the physician. A.I. wins, period. Yet in order to go further and implement such an algorithm in the clinic, one must first answer a few questions. The study deals with planar scintigraphy, although SPECT is recommended and routinely performed. That is relevant because the benefit of the A.I. was primarily in terms of sensitivity. Also, adding the CT further improves the diagnostic accuracy. The ground truth is also debatable, as explained in the methods. And finally, the algorithm is the perfect example of a black box. Hence, this tremendous amount of work (over 12,000 studies!), published in a high-level journal, has very little chance of effective clinical translation, if NM physicians are asked to give their opinion: the imaging technique is not up to date, the gold standard is weak, the method is questionable, and the algorithm is opaque. Somewhat similarly, major critiques were raised after the publication of a paper reporting on a DL algorithm outperforming radiologists in interpreting mammograms, even though that study was methodologically very solid [18, 19]. One may wonder whether A.I., to be accepted, must be clamped and its power limited.

In order to get out of this labyrinth and reach the situation where nuclear medicine physicians not only coexist with A.I. but patients also truly benefit from this development, a multistep approach is required. First, physicians must identify unmet clinical needs, taking into account the bigger picture. This means identifying the weak points of our techniques, in terms of accuracy or reproducibility, in diseases and clinical situations where it makes a difference for patients. [18F]FDG PET/CT is quite effective in identifying residual disease at the end of treatment for diffuse large B-cell lymphoma; the advantage of developing A.I. for this task would be marginal at best, and difficult to establish. The impact would be quite different were it to predict or assess early response to immunotherapies, which can be very effective but only in a limited number of patients and at significant cost, both monetary and in terms of morbidity. Theranostics is a major field for the development of A.I. in nuclear medicine: to help the physicians identify those who would benefit from the treatment based upon the diagnostic companion study, to tailor the treatment through fast personalized dosimetry, and finally to assess treatment success, or failure, reliably and rapidly. Second, we need to acquire the minimal knowledge necessary to get on speaking terms with those who will actually develop and build A.I. This goes through changing how research teams are organized, developing strong collaborations outside the faculty of medicine, and probably partnering with industry. It also implies revamping the education and training of residents to account for this evolution. We have to get better at statistics and computational sciences. Third, we need to build multicenter networks. It is very unlikely that single-center protocols will manage to gather the amount and diversity of data necessary to develop A.I. algorithms directly applicable to routine clinical practice. We need to account for the diversity in hardware performance, acquisition and reconstruction algorithms, and population types. And finally, we need to set the highest standards for validation, not only regarding the methodology surrounding the development and testing of the A.I. model but also regarding the clinical relevance of the question being solved and the clinical appropriateness of the population sample being investigated.
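Pooling multicenter data usually requires explicit harmonization of quantitative features across sites. The sketch below shows a deliberately simplified location-scale adjustment that aligns each center's feature distribution with a reference center; it is a crude stand-in for dedicated batch-effect methods such as ComBat, and the center labels, sizes, and feature values are entirely hypothetical.

import numpy as np

def harmonize_to_reference(values, centers, reference):
    """Shift and rescale each center's feature values so their mean and
    standard deviation match the reference center's. A simplified
    stand-in for dedicated harmonization methods such as ComBat."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(centers)
    ref = values[centers == reference]
    out = values.copy()
    for c in np.unique(centers):
        v = values[centers == c]
        out[centers == c] = (v - v.mean()) / v.std() * ref.std() + ref.mean()
    return out

# Hypothetical SUVmax-like feature measured at three centers:
rng = np.random.default_rng(1)
vals = np.concatenate([rng.normal(8, 2, 50), rng.normal(10, 3, 50), rng.normal(7, 1, 50)])
site = np.array(["A"] * 50 + ["B"] * 50 + ["C"] * 50)
harmonized = harmonize_to_reference(vals, site, reference="A")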
If we can fulfill these criteria, i.e. if we identify the need, comprehend the methods, and put ourselves in a position to produce reliable and reproducible results, then and only then will we be fully prepared for the next phase, i.e. enthusiastically promoting and advocating A.I.-augmented nuclear medicine to the clinical world.

Acknowledgements The author wishes to thank Nadia Withofs, MD, PhD, for fruitful discussions.