
European Radiology

https://doi.org/10.1007/s00330-021-07858-1

IMAGING INFORMATICS AND ARTIFICIAL INTELLIGENCE

Detection of liver cirrhosis in standard T2-weighted MRI using deep transfer learning
Sebastian Nowak 1 & Narine Mesropyan 1 & Anton Faron 1 & Wolfgang Block 1 & Martin Reuter 2,3,4 & Ulrike I. Attenberger 1 &
Julian A. Luetkens 1 & Alois M. Sprinkart 1

Received: 11 November 2020 / Revised: 12 February 2021 / Accepted: 10 March 2021


© The Author(s) 2021

Abstract
Objectives To investigate the diagnostic performance of deep transfer learning (DTL) to detect liver cirrhosis from clinical MRI.
Methods The dataset for this retrospective analysis consisted of 713 (342 female) patients who underwent liver MRI between
2017 and 2019. In total, 553 of these subjects had a confirmed diagnosis of liver cirrhosis, while the remainder had no history of
liver disease. T2-weighted MRI slices at the level of the caudate lobe were manually exported for DTL analysis. Data were
randomly split into training, validation, and test sets (70%/15%/15%). A ResNet50 convolutional neural network (CNN) pre-
trained on the ImageNet archive was used for cirrhosis detection with and without upstream liver segmentation. Classification
performance for detection of liver cirrhosis was compared to two radiologists with different levels of experience (4th-year
resident, board-certified radiologist). Segmentation was performed using a U-Net architecture built on a pre-trained ResNet34
encoder. Differences in classification accuracy were assessed by the χ2-test.
Results Dice coefficients for automatic segmentation were above 0.98 for both validation and test data. The classification
accuracy of liver cirrhosis on validation (vACC) and test (tACC) data for the DTL pipeline with upstream liver segmentation
(vACC = 0.99, tACC = 0.96) was significantly higher compared to the resident (vACC = 0.88, p < 0.01; tACC = 0.91, p = 0.01)
and to the board-certified radiologist (vACC = 0.96, p < 0.01; tACC = 0.90, p < 0.01).
Conclusion This proof-of-principle study demonstrates the potential of DTL for detecting cirrhosis based on standard T2-weighted
MRI. The presented method for image-based diagnosis of liver cirrhosis demonstrated expert-level classification accuracy.
Key Points
• A pipeline consisting of two convolutional neural networks (CNNs) pre-trained on an extensive natural image database
(ImageNet archive) enables detection of liver cirrhosis on standard T2-weighted MRI.
• High classification accuracy can be achieved even without altering the pre-trained parameters of the convolutional neural
networks.
• Other abdominal structures apart from the liver were relevant for detection when the network was trained on unsegmented
images.

Keywords Deep learning . Neural networks, computer . Magnetic resonance imaging . Liver cirrhosis

Abbreviations
ACC Accuracy
AP Average precision
AUC Area under the curve
CNN Convolutional neural network
DTL Deep transfer learning

Sebastian Nowak and Narine Mesropyan contributed equally to this work.

* Alois M. Sprinkart
[email protected]

1 Department of Diagnostic and Interventional Radiology, Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn (Universitätsklinikum Bonn), Venusberg-Campus 1, 53127 Bonn, Germany
2 Image Analysis, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
3 A.A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Boston, MA, USA
4 Department of Radiology, Harvard Medical School, Boston, MA, USA

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s00330-021-07858-1.
Eur Radiol

Introduction

Liver cirrhosis is the end stage of chronic liver disease and a major global health condition, especially due to its variety of severe complications caused by portal hypertension such as variceal bleeding, ascites, and hepatic encephalopathy [1]. Although liver biopsy is the gold standard for the detection of cirrhosis, imaging has a particularly important role in the evaluation of the disease [2]. Imaging is primarily used to characterize the morphologic manifestations of cirrhosis, evaluate the presence and the effects of portal hypertension, and screen for hepatocellular carcinoma. However, morphologic characteristics of cirrhosis are often detected incidentally in patients with unsuspected cirrhosis. It is therefore not unusual that radiologists presume an initial diagnosis of cirrhosis [3].

To assume a diagnosis of liver cirrhosis, different morphological criteria have been described for standard imaging modalities [2]. However, most of these findings are subjective, susceptible to inter-observer variability, and often lack high overall accuracy for the detection of cirrhosis [4]. Therefore, quantitative analyses, which could improve the objectivity and reading performance in the identification of liver cirrhosis, are of great interest [5].

A method that could objectively assess relevant features automatically within radiological images could support the radiologist in diagnosing liver cirrhosis, leading to greater accuracy and less variation in reading performance. Since 2012, when a deep learning technique showed superior performance in the prominent ImageNet challenge for the first time, CNNs in particular have become the gold standard for image classification and segmentation [6]. Deep learning methods have been continuously improved and successfully applied in various disciplines, including medical imaging [7–12].

However, a disadvantage of CNNs is the requirement of a large number of pre-classified images, which serve as training data. Instead of training a neural network from scratch with a small data set, it has proven advantageous to use a technique called transfer learning [13]. The basic idea is to use a CNN pre-trained, e.g., on a large natural image dataset, which has already been trained to recognize complex patterns, and then adapt it to a different task. This technique has recently been successfully applied to a variety of segmentation and classification problems of medical image data [14–16].

The aim of this study was to investigate the capabilities of deep transfer learning (DTL) to identify liver cirrhosis in standard T2-weighted MRI and to evaluate the diagnostic performance against radiologists with different levels of experience.

Materials and methods

This retrospective study was approved by the institutional review board with a waiver of written informed consent. Patients who underwent liver MRI at our institution for standard diagnostic purposes between 2017 and 2019 were included. Two groups of patients were identified and included in the final study cohort:

i. Patients with known liver cirrhosis of any stage: Inclusion criterion was the presence of histologically or clinically defined liver cirrhosis of any clinical disease severity. Exclusion criteria were the presence of focal liver lesions at the level of portal vein bifurcation or a past medical history of hepatic surgery (Fig. 1).
ii. Patients without known liver disease: From the same period, a randomly selected control group was recruited, which consisted of patients without known liver disease. Exclusion criteria for the control group were the same as those applied for the cirrhosis group.

Patient characteristics were retrieved from the clinical information management system of the referring institution. An overview of the MRI indications for the two groups is provided in Supplement S1.

As this study aimed to determine the diagnostic utility of DTL to detect liver cirrhosis based on morphological hallmarks of liver cirrhosis, T2-weighted imaging was used for analysis. In detail, images of a standard T2-weighted respiratory triggered multi-slice turbo spin echo sequence with non-Cartesian k-space filling with radial rectangular blades (Multi Vane XD) were used. For each patient, a single-slice image at the level of the caudate lobe was exported for DTL analysis (N.M. with 1 year of experience in the field of clinical abdominal imaging). All examinations were performed on clinical whole-body MRI systems (Philips, Ingenia 1.5 T and 3 T). Detailed imaging parameters are listed in Supplement S2.

Image data were randomly divided into training data (70%), validation data (15%), and test data (15%) using a custom Matlab script (MathWorks). Details of the preprocessing prior to training are listed in Supplement S3.

Images were analyzed using two different processing pipelines. In the first pipeline, an image segmentation network was applied prior to the classification task. In the second pipeline, the classification was performed directly on the unsegmented images.

For segmentation, a CNN following the principal architecture of a U-net model was implemented [17]. Its descending encoder part is identical to a CNN with residual connections

Fig. 1 Flowchart illustrating the inclusion and exclusion criteria for the group of patients with liver cirrhosis for this study

known as ResNet34 that was pre-trained on the ImageNet database [18]. The ground truth for the training of the segmentation CNN was generated by a radiology resident (N.M.) by manually delineating the liver using in-house tools developed in Matlab and verified by a board-certified radiologist (J.A.L.).

ResNet50 as a well-established CNN with 50 trainable layers and residual connections was used for the classification

Fig. 2 Details of the presented deep transfer learning (DTL) pipeline for detection of liver cirrhosis. The segmentation network (left) is based on a U-net architecture, with a ResNet34 convolutional neural network (CNN) as encoder, pre-trained on the ImageNet archive. For the classification task (right), a pre-trained ResNet50 CNN was employed. The classification performance of the DTL pipeline including liver segmentation (A) was compared to a classification based on the original, unsegmented images (B)

task in both pipelines. The model was pre-trained on the ImageNet archive and implemented in PyTorch’s torchvision package [19]. Detailed descriptions of the segmentation and classification CNN architectures can be found in Fig. 2 and Supplement S4.

The DTL methods developed in this work were trained in two phases. First, only non-pretrained layers were trained and all pre-trained parameters of the convolutional layers were kept constant. To further investigate whether varying the pre-trained parameters may improve the reading performance of the CNN, the parameters of the pre-trained convolutional layers were made variable in a second phase. The one cycle learning rate policy was applied for fine-tuning of the pre-trained models for liver segmentation and classification of liver cirrhosis [20]. All experiments and evaluations were performed with Python and fastai, a deep learning application programming interface for PyTorch [21]. Further details of the experimental design and the hyper-parameters used for training are given in Supplement S5.

To compare the performance of the DTL analyses to the performance of healthcare professionals at different experience levels, validation and test data were also classified independently by a radiology resident (A.F.) with 4 years of experience in abdominal imaging and a board-certified radiologist (J.A.L.) with 8 years of experience in abdominal imaging.

The 95% confidence interval of the DTL-based classification accuracy was determined by the Clopper-Pearson method and a χ²-test was performed to test for significant differences in accuracy between the DTL-based classification and the readers in SPSS Statistics 24 (IBM). For the test set, calculations of balanced accuracy, receiver operating characteristic, and precision-recall analyses were performed with scikit-learn 0.23.2 [22–24].

In order to assess the classification performance of the entire first pipeline (including prior segmentation), the segmentations of the CNN (instead of manual segmentations) were used for the validation and test set of the classification network. In addition to evaluating the method by its performance on the validation and test data set, gradient-weighted class activation maps (Grad-CAMs) were generated [25]. This technique is proposed to add visual information to radiological images, describing areas of the image that affect the prediction of the CNN [26]. These colored prediction maps were visually inspected and the image areas contributing to the CNN’s prediction of cirrhosis were quantified separately for both patient groups.

Results

A total of 713 patients (342 female, mean age: 58 ± 14 years) were included. Of those, examinations of 572 patients were acquired at a field strength of 1.5 T. The remainder were examined on 3.0 T. A total of 553 patients (248 female, mean age: 60 ± 12 years) with a confirmed diagnosis of liver cirrhosis based on clinical or histopathological criteria were included (Fig. 1). The control group consisted of 160 subjects (94 female, mean age: 49 ± 18 years) without history of liver disease. A training set with 505 subjects (244 female, mean age: 58 ± 14 years), a validation set with 104 subjects (49 female, mean age: 57 ± 14 years), and a test set with 104 subjects (49 female, mean age: 58 ± 15 years) were compiled by random selection, while maintaining the proportion of control patients to patients with cirrhosis. The DTL method for segmentation of the liver in the transverse T2-weighted MRI images developed on the training set showed Dice values of 0.984 for the validation set and 0.983 for the test set.

In the subsequent training of the classification network ResNet50 for the identification of cirrhosis based on segmented images, an accuracy (ACC) of 0.99 (95% confidence interval: 0.95–1.00) for validation data (vACC) and 0.96 (0.90–0.99) for test data (tACC) was achieved. For the classification on unsegmented images, vACC was 0.97 (0.92–0.99) and tACC was 0.95 (0.89–0.98). The accuracy of the DTL pipeline for classification of cirrhosis with prior segmentation of the organ was significantly higher compared to the resident (vACC = 0.88, p < 0.01; tACC = 0.91, p = 0.01) as well as the board-certified radiologist (vACC = 0.96, p < 0.01; tACC = 0.90, p < 0.01) (Table 1). Modifications of pre-trained parameters did not improve segmentation and classification accuracy significantly (Table 2). On the test set, a balanced accuracy value of 0.90 was observed for the DTL method based on unsegmented images. Balanced accuracy values of 0.92 were observed for the DTL method based on segmented images, as well as for the radiology resident and board-certified radiologist. For the DTL method, the balanced accuracy of 0.92 is derived from a sensitivity of 1, which was higher than that of the radiology resident and board-certified radiologist (0.91, 0.89) and a specificity of 0.83, which was lower than that of the radiology resident and board-certified radiologist (0.92, 0.96).

Receiver operating characteristic and precision-recall curves for the test data set are shown in Fig. 3. For the DTL method trained on segmented images, an area under the curve (AUC) of 0.99 and an average precision (AP) of 0.97 and for the DTL method trained on the unsegmented images, an AUC of 0.95, and an AP of 0.93 were determined.

Figure 4 shows exemplary images from the test set with colored maps indicating areas which were particularly relevant for the decision of the classifier. The results of the visual inspection are presented in Table 3. In the first pipeline with upstream segmentation, the caudate lobe was highlighted in 47.5% of the images classified as cirrhosis and in 25% of the images classified as no cirrhosis. In every fifth (20.8%) of the segmented images classified as no cirrhosis, the transition zone of the caudate lobe to the image background was highlighted.
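
The confidence interval and balanced accuracy reported above for the test set can be checked with a short calculation. The following Python sketch is illustrative only: it is not the SPSS or scikit-learn code used in the study, the function names are hypothetical, and the test-set counts (80 cirrhosis cases, all detected; 24 controls, 20 of them correctly classified) are an assumption chosen to be consistent with the reported sensitivity of 1, specificity of 0.83, and a test set of 104 subjects.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""

    def solve(cdf_k, target):
        # binom_cdf(cdf_k, n, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(cdf_k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Assumed test-set counts (not stated explicitly in the text).
tp, fn, tn, fp = 80, 0, 20, 4
low, high = clopper_pearson(tp + tn, tp + fn + tn + fp)
print(f"accuracy 95% CI: {low:.2f}-{high:.2f}")
print(f"balanced accuracy: {balanced_accuracy(tp, fn, tn, fp):.2f}")
```

Under these assumed counts, the interval should land close to the reported 0.90–0.99 and the balanced accuracy close to the reported 0.92.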

Table 1 Accuracy (ACC), balanced accuracy (BACC), sensitivity (Sens), and specificity (Spec) for identification of liver cirrhosis for validation (vACC, vBACC, vSens, vSpec) and test (tACC, tBACC, tSens, tSpec) of the deep transfer learning (DTL) method based on unsegmented images and based on images with prior segmentation of the liver. The accuracy of the DTL approaches was also compared to a radiological resident and a board-certified radiologist. Statistical difference was assessed by χ²-test

Reader/method | vACC | p value (vACC) | tACC | p value (tACC) | vBACC | tBACC | vSens | tSens | vSpec | tSpec
ResNet50 (segmented liver) | 0.99 | - | 0.96 | - | 0.99 | 0.92 | 0.99 | 1 | 1 | 0.83
ResNet50 (full image) | 0.97 | p = 0.04 | 0.95 | p = 0.61 | 0.97 | 0.90 | 0.98 | 1 | 0.96 | 0.79
Board-certified radiologist | 0.96 | p < 0.01 | 0.90 | p < 0.01 | 0.98 | 0.92 | 0.95 | 0.89 | 1 | 0.96
Radiology resident (4th year) | 0.88 | p < 0.01 | 0.91 | p = 0.01 | 0.93 | 0.92 | 0.85 | 0.91 | 1 | 0.92
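
The text states only that a χ²-test was run in SPSS and does not describe its exact configuration. As a hedged illustration of how such a comparison could be set up, the Python sketch below assumes a one-sample goodness-of-fit formulation in which a reader's correct and incorrect counts are compared against the proportions expected from the DTL accuracy. This formulation, the function name, and the counts (derived from rounded test-set accuracies and n = 104) are all assumptions, not the authors' documented procedure.

```python
from math import erfc, sqrt


def chi2_goodness_of_fit(observed_correct, n, expected_accuracy):
    """Chi-square statistic and p-value (1 degree of freedom) for a
    correct/incorrect split tested against an expected accuracy."""
    exp_correct = n * expected_accuracy
    exp_wrong = n - exp_correct
    obs_wrong = n - observed_correct
    stat = ((observed_correct - exp_correct) ** 2 / exp_correct
            + (obs_wrong - exp_wrong) ** 2 / exp_wrong)
    # Survival function of the chi-square distribution with 1 df.
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


n_test = 104
dtl_accuracy = 100 / n_test  # assumed: 100 of 104 correct (tACC = 0.96)

# Assumed correct counts, back-calculated from the rounded tACC values.
readers = {
    "ResNet50 (full image)": 99,
    "Board-certified radiologist": 94,
    "Radiology resident": 95,
}
for name, correct in readers.items():
    stat, p = chi2_goodness_of_fit(correct, n_test, dtl_accuracy)
    print(f"{name}: chi2 = {stat:.2f}, p = {p:.3f}")
```

With these assumed counts, the p-values come out near 0.6 for the full-image network and near 0.01 for the resident, which is in line with Table 1, but this agreement does not confirm that the study used this exact procedure.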

In the second pipeline, based on unsegmented images, additional highlighted areas outside of the liver were identified. In images classified as cirrhosis, the spleen area was highlighted in 6%, the stomach area in 22.5%, and the gastroesophageal junction in 12.5%. In 29.2% of the CNN’s negative predictions, spinal musculature was highlighted.

Discussion

This proof-of-principle study demonstrates the feasibility of automatic detection of liver cirrhosis by DTL based on a standard T2-weighted MRI. The deep learning approach with prior segmentation of the liver provides classification accuracy at expert level.

To date, no other work has investigated the use of a DTL approach for the detection of liver cirrhosis in standard T2-weighted MRI sequences. There are recent studies based on gadoxetic acid–enhanced MRI that classify fibrotic pathologies of the liver by methods of deep learning and radiomics [27, 28]. However, these methods are trained from scratch and they require a manual definition of regions of interest. In contrast to that, the method proposed in the current study does not require manual segmentation since the liver is segmented automatically with high precision.
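
Segmentation quality in this work is reported as a Dice value. For readers unfamiliar with the measure, the following minimal Python sketch shows how a Dice coefficient can be computed for two binary masks. The function name and the toy masks are hypothetical and are not taken from the study's code.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks (nested lists)."""
    flat_pred = [v for row in pred for v in row]
    flat_truth = [v for row in truth for v in row]
    overlap = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2 * overlap / total


# Toy example: the prediction covers the true region plus two extra pixels.
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(dice_coefficient(pred, truth))  # 2 * 4 / (6 + 4) = 0.8
```

A value of 1 means perfect overlap; the reported values above 0.98 therefore indicate that automatic and manual liver outlines differ in only a small fraction of pixels.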

Table 2 Dice values of the segmentation convolutional neural network (CNN) and classification accuracy of liver cirrhosis of the classification CNN at different stages of the training experiments. In the first stage of training the segmentation CNN, a Dice score of 0.9828 was achieved by optimizing the convolutional layers of the randomly initialized decoder and leaving the parameters of the pre-trained ResNet34 encoder unchanged. In the following three stages that started from the model state of the previous stage, only minor improvements of 0.001 of the Dice score were achieved. In these stages, the convolutional layers of the pre-trained ResNet34 encoder were made variable, whereby the learning rate (LR) increased linearly from the first to the last layer of the CNN. In the first stage of training the classification CNN, an accuracy of 0.99 for the segmented images and 0.97 for the unsegmented images was achieved by optimizing the output layer of the ResNet50 CNN only. The following stages that started from the best previous model state did not lead to an improvement in accuracy and showed only minor improvements of the cross-entropy loss. Also in the last three stages, where the convolutional layers of the pre-trained ResNet50 were made variable with learning rates increased linearly from the first to the last layer of the CNN, no improvement in accuracy could be observed. Detailed descriptions of the training experiments can be found in Supplement S5

Segmentation network (U-net like with ResNet34 encoder)

Training stage | Epochs | Max LR last layer decoder | Max LR first layer encoder | Dice on validation set
1 | 80 | 0.001 | Frozen | 0.9828
2 | 40 | 0.0005 | 0.000005 | No improvement
3 | 40 | 0.0005 | 0.00005 | 0.9837
4 | 40 | 0.0005 | 0.0005 | 0.9838

Classification network (ResNet50)

Training stage | Epochs | Max LR output layer | Max LR first layer | Accuracy and cross-entropy loss (segmented image) | Accuracy and cross-entropy loss (full image)
1 | 80 | 0.1 | Frozen | 0.99, 0.1452 | 0.97, 0.325
2 | 40 | 0.01 | Frozen | No improvement | 0.97, 0.2151
3 | 40 | 0.001 | Frozen | No improvement | No improvement
4 | 40 | 0.0001 | 0.000001 | No improvement | 0.97, 0.2025
5 | 40 | 0.0001 | 0.00001 | 0.99, 0.1450 | No improvement
6 | 40 | 0.0001 | 0.0001 | 0.99, 0.1339 | No improvement
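
The caption of Table 2 describes maximum learning rates that increase linearly from the first to the last layer once the pre-trained layers are unfrozen. A minimal Python sketch of that idea follows. The function name and the number of layer groups (five) are hypothetical; how the layers were actually grouped is described in Supplement S5 and is not reproduced here.

```python
def layerwise_max_lr(lr_first, lr_last, n_groups):
    """Linearly interpolate maximum learning rates across layer groups,
    from the first (earliest) group to the last (output) group."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        return [lr_last]
    step = (lr_last - lr_first) / (n_groups - 1)
    return [lr_first + i * step for i in range(n_groups)]


# Classification network, stage 4 of Table 2: first layer 0.000001,
# output layer 0.0001. Five layer groups is an assumed value.
for group, lr in enumerate(layerwise_max_lr(1e-6, 1e-4, 5), start=1):
    print(f"group {group}: max LR = {lr:.7f}")
```

Giving the earliest layers the smallest learning rate keeps the general-purpose features learned on ImageNet close to their pre-trained values, while later, more task-specific layers are allowed to change more.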

Fig. 3 Liver cirrhosis classification performance of the deep transfer learning (DTL) methods trained on the segmented images (DTL A) or unsegmented images (DTL B) and of the radiology resident (rater A) and the board-certified radiologist (rater B) on the test set, illustrated by receiver operating characteristic and precision-recall curves and area under the curve (AUC) and average precision (AP) values

Recent studies based on ultrasound imaging also used DTL methods pre-trained on the ImageNet archive [29, 30]. Of note, in both mentioned studies, the pre-trained parameters were not kept constant during training. Particularly the first few layers of the pre-trained CNNs have learned to recognize very general image features such as edges and shapes during the training with the ImageNet data set [31]. The ability to extract these features is a benefit of transfer learning, and therefore, other groups proposed to first optimize only the output layer of the network prior to changing the pre-trained parameters of the CNN [15, 32].

In order to examine whether altering the pre-trained parameters of the DTL methods is beneficial for the identification of cirrhosis, the CNNs were trained in two phases in this work, with frozen and unfrozen pre-trained parameters. Interestingly, the accuracy on the validation data set of both methods did not further increase by unfreezing the pre-trained parameters. Hence, the learned feature extraction capability from the training on the natural image data set of, e.g., cars, animals, and buildings was generalized to identify liver cirrhosis on an expert level in standard T2-weighted MRI.

A further aim of our study was to investigate whether prior segmentation of the liver is beneficial for this classification task. Interestingly, both variants (with and without prior segmentation) achieved high accuracy. However, the accuracy for the detection of liver cirrhosis was slightly higher for the DTL pipeline with prior segmentation. This result may be attributed to the following advantages of upstream segmentation:

i. The network is forced to focus on the area where pathological alterations are primarily expected.
ii. Image areas that are not in focus of the analysis are prevented from having an impact on the normalization step [33].
iii. Using only the image areas of the organ allows training the classification model with smaller image matrices and thus larger batch size, which is considered beneficial for the applied learning rate policy [20].
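
Advantages i and iii of upstream segmentation can be illustrated with a small masking step. The Python sketch below is a hypothetical example, not the preprocessing used in the study (which is described in Supplement S3): it sets all pixels outside a binary liver mask to zero and crops the image to the bounding box of the mask, which yields a smaller image matrix.

```python
def mask_and_crop(image, mask):
    """Zero out pixels outside the mask and crop to the mask's bounding box.

    image and mask are nested lists of equal shape; mask holds 0/1 values.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return []  # empty mask: nothing to keep
    r0, r1, c0, c1 = rows[0], rows[-1], cols[0], cols[-1]
    return [
        [image[r][c] if mask[r][c] else 0 for c in range(c0, c1 + 1)]
        for r in range(r0, r1 + 1)
    ]


# Toy 4 x 4 "image" with a 2 x 2 organ region in the center.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(mask_and_crop(image, mask))  # [[6, 7], [10, 11]]
```

In this toy case the matrix shrinks from 16 to 4 pixels, which is the effect that allows larger batches to fit into memory.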

Table 3 Evaluation of the gradient-weighted class activation maps of the test set. The maps of the predictions of the deep transfer learning method, trained on segmented images and images without liver segmentation, were visually inspected and it was recorded which image areas were highlighted, separately for both patient groups. Note that several areas of the image were highlighted, so the percentages of the different image areas do not add up to 100% within a patient group. The liver areas were divided into left, right hepatic, and caudate lobe. For the segmented images, it was also noted whether image areas at the transition zone of the caudate lobe to the image background were highlighted. For the full images, highlighted areas near the stomach, spleen, gastroesophageal junction, and spinal muscles were observed

Unsegmented images

Patient group | Right hepatic | Left hepatic | Caudate lobe | Spleen | Stomach | Gastroesophageal junction | Spinal musculature
Cirrhosis | 53.8% | 35% | 22.5% | 6.3% | 22.5% | 12.5% | 2.5%
No cirrhosis | 83.3% | 16.7% | 0 | 0 | 8.3% | 0 | 29.2%

Segmented images

Patient group | Right hepatic | Left hepatic | Caudate lobe | Border caudate lobe/background
Cirrhosis | 53.8% | 28.8% | 47.5% | 2.5%
No cirrhosis | 58.3% | 20.8% | 25% | 20.8%

Fig. 4 Gradient-weighted class activation maps for unsegmented and segmented images from the test set. The overlays highlight regions that had high impact on classification in patients without cirrhosis (a) and patients with cirrhosis (b). Patients with and without cirrhosis that were correctly classified by the DTL methods but incorrectly classified by the certified radiologist are shown in c. Examples of images with a disagreeing classification of the two DTL methods, where the image was only correctly classified with prior liver segmentation are shown in d. Images that were misclassified by both DTL methods, but correctly classified by the certified radiologist are shown in e
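
The overlays in Fig. 4 are gradient-weighted class activation maps. As a hedged illustration of the underlying computation, the Python sketch below implements the core Grad-CAM step on toy data: each feature map is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and negative values are clipped. The function name and the tiny 2 x 2 inputs are hypothetical; in practice, activations and gradients are taken from a convolutional layer of the trained network, and the map is upsampled to the image size.

```python
def grad_cam(activations, gradients):
    """Core Grad-CAM step for K feature maps of size H x W (nested lists).

    activations[k][i][j]: feature map k at position (i, j)
    gradients[k][i][j]:   gradient of the class score w.r.t. that value
    """
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weights: global average of the gradients per feature map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum over channels followed by ReLU.
    cam = [
        [max(0.0, sum(wk * a[i][j] for wk, a in zip(weights, activations)))
         for j in range(w)]
        for i in range(h)
    ]
    peak = max(max(row) for row in cam)
    # Scale to [0, 1] so the map can be shown as a colored overlay.
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam


# Toy input: two 2 x 2 feature maps with constant gradients +1 and -1.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
print(grad_cam(activations, gradients))  # [[0.5, 0.0], [0.0, 1.0]]
```

Positions where positively weighted feature maps are strongly active receive high values, which is what the colored regions in the overlays represent.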

For both methods, image areas relevant for the CNN’s decision were investigated applying the Grad-CAM method [25]. The results indicate that the caudate lobe area is important for the DTL methods for the detection of liver cirrhosis trained on either segmented or unsegmented images. Interestingly, the Grad-CAM evaluations of the DTL method based on the unsegmented images showed that in some cases, image areas outside of the liver were relevant. This indicates that the CNN might also base the prediction of cirrhosis on accompanying signs of cirrhosis, such as spleen hypertrophy, venous alterations like fundus varices, or the general vital status of the patient according to muscle structure. This observation motivates further studies to investigate if deep learning methods may also reliably detect accompanying effects of cirrhosis.

Future work should also address whether a multi-task-learning architecture, which would simultaneously optimize segmentation and classification performance, has advantages over the presented pipeline. In addition, the method could be extended by an automated selection of the 2D slice at the level of the caudate lobe to allow fully automated prediction of cirrhosis based on T2-weighted imaging.

Our study has several limitations. First, the DTL model has been trained for the identification of liver cirrhosis only and

does not support the detection of very early signs of tissue fibrosis, which might be present in early hepatopathy. However, this was not the aim of this proof-of-principle study, which was instead to investigate the hypothesis that ImageNet pre-trained models are generalizable to T2-weighted MRI and allow the assessment of imaging features of liver cirrhosis. The investigation of an automated classification of early signs of tissue fibrosis and different stages of fibrosis will be the next step in the evaluation of deep transfer learning–based approaches based on standard T2-weighted MRI.

Our study cohort included a broad range of cirrhosis severities (according to the Child-Pugh score) and different etiologies of cirrhosis. To account for the difference in the number of patients with liver cirrhosis and patients without liver disease, additional performance measures were assessed. According to the balanced accuracy, the method trained on segmented images performs at expert level. However, the DTL method shows a higher sensitivity and a lower specificity compared to the board-certified radiologist, which may be a result of the class imbalance of the dataset. An expert level classification performance of the DTL method trained on segmented images is furthermore underlined by the precision-recall analysis.

Another limitation is that the classification was based solely on T2-weighted images. In contrast to that, additional pieces of information such as different MRI sequences as well as clinical and laboratory parameters are typically available for diagnosis in clinical routine. However, in our study, high diagnostic accuracy was shown for both the classifier and clinical experts, even if the diagnosis was based on only one anatomical sequence. Future studies may evaluate whether a multi-parametric approach or the inclusion of clinical parameters can further improve diagnostic performance.

Conclusion

This proof-of-principle study demonstrates the potential of DTL for the detection of cirrhosis based on standard T2-weighted MRI. The DTL pipeline for the image-based diagnosis of liver cirrhosis demonstrated classification accuracy at expert level. An application of the pipeline could support radiologists in the diagnosis of liver cirrhosis and has the potential to improve consistency of reading performance.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s00330-021-07858-1.

Funding Open Access funding enabled and organized by Projekt DEAL. The study was supported by a grant from the BONFOR research program of the University of Bonn (Application number 2020-2A-04). The funders had no influence on the conceptualization and design of the study, data analysis and data collection, preparation of the manuscript, and the decision to publish.

Compliance with ethical standards

Guarantor The scientific guarantor of this publication is PD Dr.med. Julian A. Luetkens.

Conflict of interest The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Statistics and biometry No complex statistical methods were necessary for this paper.

Informed consent Written informed consent was waived by the Institutional Review Board.

Ethical approval This retrospective study was approved by the institutional review board with a waiver of written informed consent.

Methodology
• Retrospective
• Diagnostic or prognostic study
• Performed at one institution

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Volk ML, Tocco RS, Bazick J, Rakoski MO, Lok AS (2012) Hospital re-admissions among patients with decompensated cirrhosis. Am J Gastroenterol 107(2):247–252
2. Procopet B, Berzigotti A (2017) Diagnosis of cirrhosis and portal hypertension: imaging, non-invasive markers of fibrosis and liver biopsy. Gastroenterol Rep (Oxf) 5(2):79–89
3. Brown JJ, Naylor MJ, Yagan N (1997) Imaging of hepatic cirrhosis. Radiology 202(1):1–16
4. Rustogi R, Horowitz J, Harmath C et al (2012) Accuracy of MR elastography and anatomic MR imaging features in the diagnosis of severe hepatic fibrosis and cirrhosis. J Magn Reson Imaging 35(6):1356–1364
5. House MJ, Bangma SJ, Thomas M et al (2015) Texture-based classification of liver fibrosis using MRI. J Magn Reson Imaging 41(2):322–328
6. Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60(6):84–90
7. Nowak S, Faron A, Luetkens JA et al (2020) Fully automated segmentation of connective tissue compartments for CT-based body composition analysis: a deep learning approach. Invest Radiol 55(6):357–366
8. Zhu Y, Fahmy AS, Duan C, Nakamori S, Nezafat R (2020) Automated myocardial T2 and extracellular volume quantification in cardiac MRI using transfer learning–based myocardium segmentation. Radiol Artif Intell 2(1):e190034
9. Krogue JD, Cheng KV, Hwang KM et al (2020) Automatic hip fracture identification and functional subclassification with deep learning. Radiol Artif Intell 2(2):e190023
10. Wang K, Mamidipalli A, Retson T et al (2019) Automated CT and MRI liver segmentation and biometry using a generalized convolutional neural network. Radiol Artif Intell 1(2):180022
11. Estrada S, Lu R, Conjeti S et al (2020) FatSegNet: A fully automated deep learning pipeline for adipose tissue segmentation on abdominal dixon MRI. Magn Reson Med 83(4):1471–1483
12. Henschel L, Conjeti S, Estrada S, Diers K, Fischl B, Reuter M (2020) FastSurfer - a fast and accurate deep learning based neuroimaging pipeline. Neuroimage 219:117012
13. Kornblith S, Shlens J, Le QV (2019) Do Better ImageNet models transfer better? Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2661–2671
14. Shin H-C, Roth HR, Gao M et al (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298
15. Mormont R, Geurts P, Maree R (2018) Comparison of deep transfer learning strategies for digital pathology. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp 2262–2271
16. Ravishankar H, Sudhakar P, Venkataramani R et al (2016) Understanding the mechanisms of deep transfer learning for medical images. In: Deep Learning and Data Labeling for Medical Applications. Springer, Cham, pp 188–196
17. Ronneberger O, Fischer P, Brox T (2015) U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, pp 234–241
18. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
19. Paszke A, Gross S, Massa F et al (2019) PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, pp 8026–8037
20. Smith LN (2018) A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay. arXiv preprint arXiv:1803.09820
21. Howard J, Gugger S (2020) Fastai: a layered API for deep learning. Information 11(2):108
22. Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
23. Brodersen KH, Ong CS, Stephan KE, Buhmann JM (2010) The balanced accuracy and its posterior distribution. The 20th International Conference on Pattern Recognition, IEEE, pp 3121–3124
24. Saito T, Rehmsmeier M (2015) The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One 10(3):e0118432
25. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-CAM: visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE international conference on computer vision, pp 618–626
26. Reyes M, Meier R, Pereira S et al (2020) On the interpretability of artificial intelligence in radiology: challenges and opportunities. Radiol Artif Intell 2(3):e190043
27. Yasaka K, Akai H, Kunimatsu A, Abe O, Kiryu S (2018) Liver fibrosis: deep convolutional neural network for staging by using gadoxetic acid–enhanced hepatobiliary phase MR images. Radiology 287(1):146–155
28. Park HJ, Lee SS, Park B et al (2019) Radiomics analysis of gadoxetic acid–enhanced MRI for staging liver fibrosis. Radiology 290(2):380–387
29. Xue LY, Jiang ZY, Fu TT et al (2020) Transfer learning radiomics based on multimodal ultrasound imaging for staging liver fibrosis. Eur Radiol 30:2973–2983
30. Lee JH, Joo I, Kang TW et al (2020) Deep learning with ultrasonography: automated classification of liver fibrosis using a deep convolutional neural network. Eur Radiol 30(2):1264–1273
31. Qin Z, Yu F, Liu C, Chen X (2018) How convolutional neural network see the world - a survey of convolutional neural network visualization methods. arXiv preprint arXiv:1804.11191
32. Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European conference on computer vision. Springer, Cham, pp 818–833
33. Collewet G, Strzelecki M, Mariette F (2004) Influence of MRI acquisition protocols and image intensity normalization methods on texture classification. Magn Reson Imaging 22(1):81–91

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
