
Image Synthesis in Multi-Contrast MRI With Conditional Generative Adversarial Networks

Salman U. H. Dar, Student Member, IEEE, Mahmut Yurt, Levent Karacan, Aykut Erdem, Erkut Erdem, and Tolga Çukur, Senior Member, IEEE

Abstract— Acquiring images of the same anatomy with multiple different contrasts increases the diversity of diagnostic information available in an MR exam. Yet, scan time limitations may prohibit the acquisition of certain contrasts, and some contrasts may be corrupted by noise and artifacts. In such cases, the ability to synthesize unacquired or corrupted contrasts can improve diagnostic utility. For multi-contrast synthesis, current methods learn a nonlinear intensity transformation between the source and target images, either via nonlinear regression or deterministic neural networks. These methods can, in turn, suffer from the loss of structural details in synthesized images. Here, we propose a new approach for multi-contrast MRI synthesis based on conditional generative adversarial networks. The proposed approach preserves intermediate-to-high frequency details via an adversarial loss, and it offers enhanced synthesis performance via pixel-wise and perceptual losses for registered multi-contrast images and a cycle-consistency loss for unregistered images. Information from neighboring cross-sections is utilized to further improve synthesis quality. Demonstrations on T1- and T2-weighted images from healthy subjects and patients clearly indicate the superior performance of the proposed approach compared to the previous state-of-the-art methods. Our synthesis approach can help improve the quality and versatility of multi-contrast MRI exams without the need for prolonged or repeated examinations.

Index Terms— Generative adversarial network, image synthesis, multi-contrast MRI, pixel-wise loss, cycle-consistency loss.

Manuscript received January 7, 2019; revised February 19, 2019; accepted February 22, 2019. Date of publication February 26, 2019; date of current version October 1, 2019. The work of T. Çukur was supported by a European Molecular Biology Organization Installation Grant (IG 3028), by a TUBITAK 1001 Grant (118E256), by a BAGEP fellowship, by a TUBA GEBIP fellowship, and by an Nvidia Corporation GPU grant. The work of E. Erdem was supported by a separate TUBA GEBIP fellowship. (Corresponding author: Tolga Çukur.)
S. U. Dar and M. Yurt are with the Department of Electrical and Electronics Engineering, Bilkent University, TR-06800 Ankara, Turkey, and also with the National Magnetic Resonance Research Center, Bilkent University, TR-06800 Ankara, Turkey.
L. Karacan, A. Erdem, and E. Erdem are with the Department of Computer Engineering, Hacettepe University, TR-06800 Ankara, Turkey.
T. Çukur is with the Department of Electrical and Electronics Engineering, Bilkent University, TR-06800 Ankara, Turkey, also with the National Magnetic Resonance Research Center, Bilkent University, TR-06800 Ankara, Turkey, and also with the Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, TR-06800 Ankara, Turkey (e-mail: [email protected]).
This article has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the author.
Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMI.2019.2901750

I. INTRODUCTION

MAGNETIC resonance imaging (MRI) is pervasively used in clinical applications due to the diversity of contrasts it can capture in soft tissues. Tailored MRI pulse sequences enable the generation of distinct contrasts while imaging the same anatomy. For instance, T1-weighted brain images clearly delineate gray and white matter tissues, whereas T2-weighted images delineate fluid from cortical tissue. In turn, multi-contrast images acquired in the same subject increase the diagnostic information available in clinical and research studies. However, it may not be possible to collect a full array of contrasts given considerations related to the cost of prolonged exams and uncooperative patients, particularly in pediatric and elderly populations [1]. In such cases, acquisition of contrasts with relatively shorter scan times might be preferred. Even then, a subset of the acquired contrasts can be corrupted by excessive noise or artifacts that prohibit subsequent diagnostic use [2]. Moreover, cohort studies often show significant heterogeneity in terms of imaging protocol and the specific contrasts that they acquire [3]. Thus, the ability to synthesize missing or corrupted contrasts from other successfully acquired contrasts has potential value for enhancing multi-contrast MRI by increasing the availability of diagnostically relevant images, and for improving analysis tasks such as registration and segmentation [4].

Cross-domain synthesis of medical images has recently been gaining popularity in medical imaging. Given a subject's image x in X (source domain), the aim is to accurately estimate the respective image of the same subject y in Y (target domain). Two main synthesis approaches are registration-based [5]–[7] and intensity-transformation-based methods [8]–[24]. Registration-based methods start by generating an atlas based on a co-registered set of images, x1 and y1, respectively acquired in X and Y [5]. These methods further make the assumption that within-domain images from separate subjects are related to each other through a geometric warp. For synthesizing y2 from x2, the warp that transforms x1 to x2 is estimated, and this warp is then applied

on y1. Since they only rely on geometric transformations, registration-based methods that rely on a single atlas can suffer from across-subject differences in underlying morphology [23]. For example, inconsistent pathology across a test subject and the atlas can cause failure. Multi-atlas registration in conjunction with intensity fusion can alleviate this limitation, and has been successfully used in synthesizing CT from MR images [6], [7]. Nevertheless, within-domain registration accuracy might still be limited even in normal subjects [23].

An alternative is to use intensity-based methods that do not rely on a strict geometric relationship among different subjects' anatomies [8]–[24]. One powerful approach for multi-contrast MRI is based on the compressed sensing framework, where each patch in the source image x2 is expressed as a sparse linear combination of patches in the atlas image x1 [10], [22]. The learned sparse combinations are then applied to estimate patches in y2 from patches in y1. To improve matching of patches across domains, generative models were also proposed that use multi-scale patches and tissue segmentation labels [16], [18]. Instead of focusing on linear models, recent studies aimed to learn more general non-linear mappings that express individual voxels in y1 in terms of patches in x1, and then predict y2 from x2 based on these mappings. Nonlinear mappings are learned on training data via techniques such as nonlinear regression [8], [9], [23] or location-sensitive neural networks [19]. An important example is Replica, which performs random forest regression on multiresolution image patches [23]. Replica demonstrates great promise in multi-contrast MR image synthesis. However, dictionary construction at different spatial scales is independent, and the predictions from separate random forest trees are averaged during synthesis. These may lead to loss of detailed structural information and suboptimal synthesis performance.

Recently, an end-to-end framework for MRI image synthesis, Multimodal, has been proposed based on deep neural networks [21]. Multimodal trains a neural network that receives as input images in multiple source contrasts and predicts the image in the target contrast. This method performs multiresolution dictionary construction and image synthesis in a unified framework, and it was demonstrated to yield higher synthesis quality compared to non-network-based approaches even when only a subset of the source contrasts is available. That said, Multimodal assumes the availability of spatially-registered multi-contrast images. In addition, Multimodal uses mean absolute error loss functions that can perform poorly in capturing errors towards higher spatial frequencies [25]–[27].

Here we propose a novel approach for image synthesis in multi-contrast MRI based on generative adversarial network (GAN) architectures. Adversarial loss functions have recently been demonstrated for various medical imaging applications with reliable capture of high-frequency texture information [28]–[48]. In the domain of cross-modality image synthesis, important applications include CT to PET synthesis [29], [40], MR to CT synthesis [28], [33], [38], [42], [48], CT to MR synthesis [36], and retinal vessel map to image synthesis [35], [41]. Inspired by this success, here we introduce conditional GAN models for synthesizing images of distinct contrasts from a single modality, with demonstrations on multi-contrast brain MRI in normal subjects and glioma patients. For improved accuracy, the proposed method also leverages correlated information across neighboring cross-sections within a volume. Two implementations are provided for use when multi-contrast images are spatially registered (pGAN) and when they are unregistered (cGAN). For the first scenario, we train pGAN with pixel-wise loss and perceptual loss between the synthesized and true images (Fig. 1) [25], [49]. For the second scenario, we train cGAN after replacing the pixel-wise loss with a cycle loss that enforces the ability to reconstruct back the source image from the synthesized target image (Fig. 2) [50]. Extensive evaluations are presented on multi-contrast MRI images (T1- and T2-weighted) from healthy normals and glioma patients. The proposed approach yields visually and quantitatively enhanced accuracy in multi-contrast MRI synthesis compared to state-of-the-art methods (Replica and Multimodal) [21], [23].

Fig. 1. The pGAN method is based on a conditional adversarial network with a generator G, a pre-trained VGG16 network V, and a discriminator D. Given an input image in a source contrast (e.g., T1-weighted), G learns to generate the image of the same anatomy in a target contrast (e.g., T2-weighted). Meanwhile, D learns to discriminate between synthetic (e.g., T1–G(T1)) and real (e.g., T1–T2) pairs of multi-contrast images. Both subnetworks are trained simultaneously, where G aims to minimize a pixel-wise, a perceptual and an adversarial loss function, and D tries to maximize the adversarial loss function.

II. METHODS

A. Image Synthesis via Adversarial Networks

Generative adversarial networks are neural-network architectures that consist of two sub-networks: G, a generator, and D, a discriminator. G learns a mapping from a latent variable z (typically random noise) to an image y in a target domain, and D learns to discriminate the generated image G(z) from the real image y [51]. During training of a GAN, both G and D are learned simultaneously, with G aiming to generate images that are indistinguishable from the real images, and D aiming to tell apart generated and real images. To do this, the following adversarial loss function (L_GAN) can be used:

    L_{GAN}(G, D) = E_y[log D(y)] + E_z[log(1 - D(G(z)))],   (1)

where E denotes expected value. G tries to minimize and D tries to maximize the adversarial loss, which improves the modeling of high-spatial-frequency information [26]. Both G and D are trained simultaneously. Upon convergence, G is capable of producing realistic counterfeit images that D cannot recognize [51]. To further stabilize the training process, the negative log-likelihood cost for the adversarial loss in (1) can be replaced by a squared loss [52]:

    L_{GAN}(D, G) = -E_y[(D(y) - 1)^2] - E_z[D(G(z))^2].   (2)
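As a concrete illustration of the alternating game in (1)–(2), the sketch below performs one least-squares update of D and one of G in PyTorch. It is a minimal sketch under assumed module and optimizer names (net_G, net_D, opt_G, opt_D are hypothetical placeholders), not the authors' released implementation; the generator update already uses the (D(G(z)) - 1)^2 form adopted later in the text.

    import torch

    def gan_step(net_G, net_D, opt_G, opt_D, z, y_real):
        """One alternating least-squares GAN update in the spirit of Eq. (2)."""
        # Discriminator update: push D(y_real) toward 1 and D(G(z)) toward 0.
        y_fake = net_G(z).detach()                      # stop gradients into G
        d_loss = ((net_D(y_real) - 1) ** 2).mean() + (net_D(y_fake) ** 2).mean()
        opt_D.zero_grad(); d_loss.backward(); opt_D.step()

        # Generator update: push D(G(z)) toward 1 (least-squares, non-saturating form).
        g_loss = ((net_D(net_G(z)) - 1) ** 2).mean()
        opt_G.zero_grad(); g_loss.backward(); opt_G.step()
        return d_loss.item(), g_loss.item()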

Recent studies in computer vision have demonstrated that GANs are very effective in image-to-image translation tasks [49], [50]. Image-to-image translation concerns transformations between different representations of the same underlying visual scene [49]. These transformations can be used to convert an image between separate domains, e.g., generating semantic segmentation maps from images, colored images from sketches, or maps from aerial photos [49], [53], [54]. Traditional GANs learn to generate samples of images from noise. However, in image-to-image translation, the synthesized image has statistical dependence on the source image. To better capture this dependency, conditional GANs can be employed that receive the source image as an additional input [55]. The resulting network can then be trained based on the following adversarial loss function:

    L_{condGAN}(D, G) = -E_{x,y}[(D(x, y) - 1)^2] - E_{x,z}[D(x, G(x, z))^2],   (3)

where x denotes the source image.

An analogous problem to image-to-image translation tasks in computer vision exists in MR imaging, where the same anatomy is acquired under multiple different tissue contrasts (e.g., T1- and T2-weighted images). Inspired by the recent success of adversarial networks, here we employed conditional GANs to synthesize MR images of a target contrast given an alternate contrast as input. For a comprehensive solution, we considered two distinct scenarios for multi-contrast MR image synthesis. First, we assumed that the images of the source and target contrasts are perfectly registered. For this scenario, we propose pGAN, which incorporates a pixel-wise loss into the objective function as inspired by the pix2pix architecture [49]:

    L_{L1}(G) = E_{x,y,z}[||y - G(x, z)||_1],   (4)

where L_{L1} is the pixel-wise L1 loss function. Since the generator G was observed to ignore the latent variable in pGAN, the latent variable was removed from the model.

Recent studies suggest that incorporation of a perceptual loss during network training can yield visually more realistic results in computer vision tasks. Unlike loss functions based on pixel-wise differences, perceptual loss relies on differences in higher feature representations that are often extracted from networks pre-trained for more generic tasks [25]. A commonly used network is VGG-net trained on the ImageNet dataset [56] for object classification. Here, following [25], we extracted feature maps right before the second max-pooling operation of VGG16 pre-trained on ImageNet. The resulting loss function can be written as:

    L_{Perc}(G) = E_{x,y}[||V(y) - V(G(x))||_1],   (5)

where V is the set of feature maps extracted from VGG16.

To synthesize each cross-section y from x, we also leveraged correlated information across neighboring cross-sections by conditioning the networks not only on x but also on the neighboring cross-sections of x. By incorporating the neighboring cross-sections, (3), (4) and (5) become:

    L_{condGAN-k}(D, G) = -E_{x_k,y}[(D(x_k, y) - 1)^2] - E_{x_k}[D(x_k, G(x_k))^2],   (6)
    L_{L1-k}(G) = E_{x_k,y}[||y - G(x_k)||_1],   (7)
    L_{Perc-k}(G) = E_{x_k,y}[||V(y) - V(G(x_k))||_1],   (8)

where x_k = [x_{-⌊k/2⌋}, ..., x_{-2}, x_{-1}, x, x_{+1}, x_{+2}, ..., x_{+⌊k/2⌋}] is a vector consisting of k consecutive cross-sections ranging from -⌊k/2⌋ to ⌊k/2⌋, with the cross-section x in the middle, and L_{condGAN-k} and L_{L1-k} are the corresponding adversarial and pixel-wise loss functions. This yields the following aggregate loss function:

    L_{pGAN} = L_{condGAN-k}(D, G) + λ L_{L1-k}(G) + λ_{perc} L_{Perc-k}(G),   (9)

where L_{pGAN} is the complete loss function, λ controls the relative weighting of the pixel-wise loss, and λ_{perc} controls the relative weighting of the perceptual loss.
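The sketch below assembles the aggregate generator objective of (6)–(9) for one training pair, assuming that the k neighboring cross-sections are stacked along the channel dimension and that a torchvision VGG16 supplies the perceptual features. The module names, the feature-extraction index, and the omission of ImageNet input normalization are illustrative assumptions rather than the authors' code.

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16

    # Keep VGG16 layers up to, but not including, the second max-pool (assumed index).
    vgg_features = vgg16(pretrained=True).features[:9].eval()
    for p in vgg_features.parameters():
        p.requires_grad = False

    def pgan_generator_loss(net_G, net_D, x_k, y, lam=100.0, lam_perc=100.0):
        """Aggregate pGAN generator loss of Eq. (9): adversarial + L1 + perceptual.
        x_k: (B, k, H, W) stack of neighboring source cross-sections; y: (B, 1, H, W) target."""
        y_hat = net_G(x_k)
        # Conditional least-squares adversarial term (generator side of Eq. (6)).
        adv = ((net_D(torch.cat([x_k, y_hat], dim=1)) - 1) ** 2).mean()
        # Pixel-wise L1 term of Eq. (7).
        l1 = F.l1_loss(y_hat, y)
        # Perceptual term of Eq. (8); grayscale slices replicated to 3 channels for VGG16.
        perc = F.l1_loss(vgg_features(y_hat.repeat(1, 3, 1, 1)),
                         vgg_features(y.repeat(1, 3, 1, 1)))
        return adv + lam * l1 + lam_perc * perc

The default weights of 100 correspond to the values selected by cross-validation later in the text.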

Fig. 2. The cGAN method is based on a conditional adversarial network with two generators (GT1, GT2) and two discriminators (DT1, DT2). Given a T1-weighted image, GT2 learns to generate the respective T2-weighted image of the same anatomy that is indiscriminable from real T2-weighted images of other anatomies, whereas DT2 learns to discriminate between synthetic and real T2-weighted images. Similarly, GT1 learns to generate a realistic T1-weighted image of an anatomy given the respective T2-weighted image, whereas DT1 learns to discriminate between synthetic and real T1-weighted images. Since the discriminators do not compare target images of the same anatomy, a pixel-wise loss cannot be used. Instead, a cycle-consistency loss is utilized to ensure that the trained generators enable reliable recovery of the source image from the generated target image.

In the second scenario, we did not assume any explicit registration between the images of the source and target contrasts. In this case, the pixel-wise and perceptual losses cannot be leveraged since images of different contrasts are not necessarily spatially aligned. To limit the number of potential solutions for the synthesized image, here we propose cGAN, which incorporates a cycle-consistency loss as inspired by the cycleGAN architecture [50]. The cGAN method consists of two generators (G_x, G_y) and two discriminators (D_x, D_y). G_y tries to generate G_y(x) that looks similar to y, and D_y tries to distinguish G_y(x) from the images y. On the other hand, G_x tries to generate G_x(y) that looks similar to x, and D_x tries to distinguish G_x(y) from the images x. This architecture incorporates an additional loss to ensure that the input and target images are consistent with each other, called the cycle-consistency loss L_{cycle}:

    L_{cycle}(G_x, G_y) = E_x[||x - G_x(G_y(x))||_1] + E_y[||y - G_y(G_x(y))||_1].   (10)

This loss function enforces the property that, after projecting the source images onto the target domain, the source image can be re-synthesized with minimal loss from the projection. Lastly, by incorporating the neighboring cross-sections, the cycle-consistency and adversarial loss functions become:

    L_{cycle-k}(G_x, G_y) = E_{x_k}[||x_k - G_x(G_y(x_k))||_1] + E_{y_k}[||y_k - G_y(G_x(y_k))||_1],   (11)
    L_{GAN-k}(D_y, G_y) = -E_{y_k}[(D_y(y_k) - 1)^2] - E_{x_k}[D_y(G_y(x_k))^2].   (12)

This yields the following aggregate loss function for training:

    L_{cGAN}(D_x, D_y, G_x, G_y) = L_{GAN-k}(D_x, G_x) + L_{GAN-k}(D_y, G_y) + λ_{cycle} L_{cycle-k}(G_x, G_y),   (13)

where L_{cGAN} is the complete loss function, and λ_{cycle} controls the relative weighting of the cycle-consistency loss.

While training both pGAN and cGAN, we made a minor modification in the adversarial loss function. As implemented in [50], the generator was trained to minimize E_{x_k}[(D(x_k, G(x_k)) - 1)^2] instead of -E_{x_k}[(D(x_k, G(x_k)))^2].
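A minimal sketch of the cycle-consistency term in (10)–(11) is given below, assuming two hypothetical generators G_x (target to source) and G_y (source to target) operating on stacked cross-sections; the adversarial terms of (12)–(13) would be added as in the earlier sketch.

    import torch.nn.functional as F

    def cycle_consistency_loss(G_x, G_y, x_k, y_k):
        """Cycle-consistency loss of Eq. (11): source -> target -> source, and vice versa."""
        forward_cycle = F.l1_loss(G_x(G_y(x_k)), x_k)   # x_k -> target domain -> back to source
        backward_cycle = F.l1_loss(G_y(G_x(y_k)), y_k)  # y_k -> source domain -> back to target
        return forward_cycle + backward_cycle

    # The aggregate cGAN objective (Eq. (13)) combines this term with the two least-squares
    # adversarial terms, weighted by lambda_cycle (selected as 100 in this work).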
B. MRI Datasets

For registered images, we trained both pGAN and cGAN models. For unregistered images, we only trained cGAN models. The experiments were performed on three separate datasets: the MIDAS dataset [57], the IXI dataset (http://brain-development.org/ixi-dataset/) and the BRATS dataset (https://sites.google.com/site/braintumorsegmentation/home/brats2015). The MIDAS and IXI datasets contained data from healthy subjects, whereas the BRATS dataset contained data from patients with structural abnormality (i.e., brain tumor). For each dataset, subjects were sequentially selected in the order that they were shared on the public databases. Subjects with images containing severe motion artifacts across the volume were excluded from selection. The selected set of subjects was then sequentially split into training, validation and testing sets. Protocol information for each dataset is described below.

1) MIDAS Dataset: T1- and T2-weighted images from 66 subjects were analyzed, where 48 subjects were used for training, 5 were used for validation and 13 were used for testing. From each subject, approximately 75 axial cross-sections that contained brain tissue and that were free of major artifacts were manually selected. T1-weighted images: 3D gradient-echo FLASH sequence, TR=14ms, TE=7.7ms, flip angle=25°, matrix size=256×176, 1 mm isotropic resolution, axial orientation. T2-weighted images: 2D spin-echo sequence, TR=7730ms, TE=80ms, flip angle=90°, matrix size=256×192, 1 mm isotropic resolution, axial orientation.

2) IXI Dataset: T1- and T2-weighted images from 40 subjects were analyzed, where 25 subjects were used for training, 5 were used for validation and 10 were used for testing. When T1-weighted images were registered onto T2-weighted images, nearly 90 axial cross-sections per subject that contained brain tissue and that were free of major artifacts were selected. When T2-weighted images were registered onto T1-weighted images, nearly 110 cross-sections were selected; in this case, due to poor registration quality, we had to remove a test subject. T1-weighted images: TR=9.813ms, TE=4.603ms, flip angle=8°, volume size=256×256×150, voxel dimensions=0.94mm×0.94mm×1.2mm, sagittal orientation. T2-weighted images: TR=8178ms, TE=100ms, flip angle=90°, volume size=256×256×150, voxel dimensions=0.94×0.94×1.2 mm³, axial orientation.

3) BRATS Dataset: T1- and T2-weighted images from 41 low-grade glioma patients with visible lesions were analyzed, where 24 subjects were used for training, 2 were used for validation and 15 were used for testing. From each subject, approximately 100 axial cross-sections that contained brain tissue and that were free of major artifacts were manually selected. Different scanning protocols were employed at separate sites.

Note that each dataset comprises a different number of cross-sections per subject, and we only retained cross-sections that contained brain tissue and that were free of major artifacts. As such, we varied the number of subjects across datasets to balance the total number of images used, resulting in approximately 4000–5000 images per dataset.

Control analyses were performed to rule out biases due to the specific selection or number of subjects. To do this, we performed model comparisons using an identical number of subjects (40) within each dataset. This selection included nonoverlapping training, validation and testing sets, such that 25 subjects were used for training, 5 for validation and 10 for testing. In IXI, we sequentially selected a completely independent set of subjects from those reported in the main analyses. This selection was then sequentially split into training/validation/testing sets via a 4-fold cross-validation procedure. Since the number of subjects available was smaller in MIDAS and BRATS, we performed 4-fold cross-validation


by randomly sampling nonoverlapping training, validation and testing sets in each fold. No overlap was allowed among testing sets across separate folds, or among the training, testing and validation sets within each fold.

4) Data Normalization: To prevent suboptimal model training and bias in quantitative assessments, datasets were normalized to ensure comparable ranges of voxel intensities across subjects. The multi-contrast MRI images in the IXI and MIDAS datasets were acquired using a single scan protocol. Therefore, for each contrast, voxel intensity was normalized within each subject to a scale of [0 1] via division by the maximum intensity within the brain volume. The protocol variability in the BRATS dataset was observed to cause large deviations in image intensity and contrast across subjects. Thus, for normalization, the mean intensity across the brain volume was normalized to 1 within individual subjects. To attain an intensity scale in [0 1], three standard deviations above the mean intensity of voxels pooled across subjects was then mapped to 1.
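The two normalization schemes above can be written compactly as follows. The brain_mask argument and the final clipping are assumptions about implementation details that the text does not spell out.

    import numpy as np

    def normalize_midas_ixi(vol, brain_mask):
        """IXI/MIDAS-style: scale each subject by its maximum intensity within the brain."""
        return vol / vol[brain_mask].max()

    def normalize_brats(vol, brain_mask, pooled_std):
        """BRATS-style: set the within-brain mean to 1, then map mean + 3*(pooled std) to 1.
        pooled_std is the standard deviation of mean-normalized voxels pooled across subjects."""
        vol = vol / vol[brain_mask].mean()
        return np.clip(vol / (1.0 + 3.0 * pooled_std), 0.0, 1.0)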
C. Image Registration

For the first scenario, multi-contrast images from a given subject were assumed to be registered. Note that the images contained in the MIDAS and IXI datasets are unregistered. Thus, the T1- and T2-weighted images in these datasets were registered prior to network training. In the MIDAS dataset, the voxel dimensions for T1- and T2-weighted images were identical, so a rigid transformation based on a mutual information cost function was observed to yield high quality registration. In the IXI dataset, however, voxel dimensions for T1- and T2-weighted images were quite distinct. For improved registration accuracy, we therefore used an affine transformation with higher degrees of freedom based on a mutual information cost in this case. No registration was needed for the BRATS dataset that was already registered. No registration was performed for the second scenario. All registrations were implemented in FSL [58], [59].
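Registrations of this kind can be run, for example, with FSL's FLIRT. The exact command used in the study is not given in the text, so the call below is an illustrative assumption (rigid, 6 degrees of freedom, mutual-information cost, as described for MIDAS), wrapped in Python to keep a single scripting language.

    import subprocess

    def register_rigid_mi(moving_nii, reference_nii, out_nii, out_mat):
        """Rigid (6-DOF) registration with a mutual-information cost using FSL FLIRT."""
        subprocess.run([
            "flirt",
            "-in", moving_nii,        # e.g., the T2-weighted volume
            "-ref", reference_nii,    # e.g., the T1-weighted volume
            "-out", out_nii,
            "-omat", out_mat,
            "-dof", "6",              # "12" would give the affine registration used for IXI
            "-cost", "mutualinfo",
        ], check=True)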
D. Network Training

Since we consider two different scenarios for multi-contrast MR image synthesis, network training procedures were distinct. In the first scenario, we assumed perfect alignment between the source and target images, and we then used pGAN to learn the mapping from the source to the target contrast. In a first variant of pGAN (k=1), the input image was a single cross-section of the source contrast, and the target was the respective cross-section of the desired contrast. Note that neighboring cross-sections in MR images are expected to show significant correlation. Thus, we reasoned that additional information from adjacent cross-sections in the source contrast should improve synthesis. To do this, a second variant of pGAN was implemented where multiple consecutive cross-sections (k=3, 5, 7) of the source contrast were given as input, with the target corresponding to the desired contrast at the central cross-section.

For the pGAN network, we adopted the generator architecture from [25], and the discriminator architecture from [50] (see Supp. Methods for details). Tuning hyperparameters in deep neural networks, especially in complex models such as GANs, can be computationally intensive [60], [61]. Thus, it is quite common in deep learning research to perform one-fold cross-validation [30], [35] or even to directly adopt hyperparameter selection from published work [24], [28], [29], [38], [48], [62]. For computational efficiency, here we selected the optimum weightings of the loss functions and the number of epochs by performing one-fold cross-validation. We partitioned the datasets into training, validation and test sets, each set containing images from distinct subjects. Multiple models were trained for a varying number of epochs (in the range [100 200]) and relative weightings of the loss functions (λ in the set {10, 100, 150}, and λ_perc in the set {10, 100, 150}). Parameters were selected based on the validation set, and performance was then assessed on the test set. Among the datasets here, IXI contains the highest-quality images, with visibly lower noise and artifact levels compared to MIDAS and visibly sharper images compared to BRATS. To prevent overfitting to noise, artifacts or blurry images, we therefore performed cross-validation of GAN models on IXI, and used the selected parameters in the remaining datasets. Weightings of both the pixel-wise and perceptual loss were selected as 100 and the number of epochs was set to 100 (the benefits of perceptual loss on synthesis performance are demonstrated in MIDAS and IXI; Supp. Table IV). Remaining hyperparameters were adopted from [50], where the Adam optimizer was used with a minibatch size of 1 [63]. In the first 50 epochs, the learning rates for the generator and discriminator were 0.0002. In the last 50 epochs, the learning rate was linearly decayed from 0.0002 to 0. During each iteration, the discriminator loss function was halved to slow down the learning process of the discriminator. Decay rates for the first and second moments of gradient estimates were set as β1 = 0.5 and β2 = 0.999, respectively. Instance normalization was applied [64]. All weights were initialized using a normal distribution with 0 mean and 0.02 std.
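The optimization recipe above maps onto a few lines of PyTorch. The schedule and initialization below are a sketch of that recipe (Adam with β1=0.5, β2=0.999, a constant 2e-4 learning rate for the first half of training followed by linear decay to 0, and N(0, 0.02) weight initialization), not a verbatim excerpt of the released code.

    import torch
    from torch import nn, optim

    def make_optimizer_and_scheduler(net, total_epochs=100):
        """Adam with beta1=0.5, beta2=0.999; lr fixed for the first half, then decayed linearly to 0."""
        opt = optim.Adam(net.parameters(), lr=2e-4, betas=(0.5, 0.999))
        half = total_epochs // 2
        lr_lambda = lambda e: 1.0 if e < half else max(0.0, 1.0 - (e - half + 1) / (total_epochs - half))
        sched = optim.lr_scheduler.LambdaLR(opt, lr_lambda=lr_lambda)
        return opt, sched

    def init_weights(m):
        """Initialize convolution weights from N(0, 0.02), as described above."""
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
            nn.init.normal_(m.weight, mean=0.0, std=0.02)

    # Usage: net_G.apply(init_weights); opt_G, sched_G = make_optimizer_and_scheduler(net_G)
    # The discriminator loss is additionally multiplied by 0.5 at each iteration.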

In the second scenario, we did not assume any alignment between the source and target images, and so we used cGAN to learn the mapping between unregistered source and target images (cGANunreg). Similar to pGAN, two variants of cGAN were considered that worked on a single cross-section (k=1) and on multiple consecutive cross-sections. Because training of cGAN brings a substantial computational burden compared to pGAN, we only examined k=3 for cGAN. This latter cGAN variant was implemented with multiple consecutive cross-sections of the source contrast. Although cGAN does not assume alignment between the source and target domains, we wanted to examine the effects of the loss functions used in cGAN and pGAN. For comparison purposes, we also trained separate cGAN networks on registered multi-contrast data (cGANreg). The cross-validation procedures and the architectures of the generator and discriminator were identical to those for pGAN. Multiple models were trained for a varying number of epochs (in the range [100 200]) and λ_cycle in the set {10, 100, 150}. Model parameters were selected based on performance on the validation set, and model performance was then assessed on the test set. The relative weighting of the cycle-consistency loss function was selected as λ_cycle=100, and the model was trained for 200 epochs. In the first 100 epochs, the learning rate for both networks was set to 0.0002, and in the remaining 100 epochs, the learning rate was linearly decayed from 0.0002 to 0. During each iteration, the discriminator loss function was divided by 2 to slow down the learning process of the discriminator.

E. Competing Methods

To demonstrate the proposed approach, two state-of-the-art methods for MRI image synthesis were implemented. The first method was Replica, which estimates a nonlinear mapping from image patches in the source contrast onto individual voxels in the target contrast [23]. Replica extracts image features at different spatial scales, and then performs a multi-resolution analysis via random forests. The learned nonlinear mapping is then applied on test images. Code posted by the authors of the Replica method was used to train the models, based on the procedures/parameters described in [23].

The second method was Multimodal, which uses an end-to-end neural network to estimate the target image given the source image as input. A neural-network implementation implicitly performs multi-resolution feature extraction and synthesis based on these features. Trained networks can then be applied on test images. Code posted by the authors of the Multimodal method was used to train the models, based on the procedures/parameters described in [21].

The proposed approach and the competing methods were compared on the same training and test data. Since the proposed models were implemented for unimodal mapping between two separate contrasts, Replica and Multimodal implementations were also performed with only two contrasts.

F. Experiments

1) Comparison of GAN-Based Models: Here we first questioned whether the direction of registration between multi-contrast images affects the quality of synthesis. In particular, we generated multiple registered datasets from T1- and T2-weighted images. In the first set, T2-weighted images were registered onto T1-weighted images (yielding T2#). In the second set, T1-weighted images were registered onto T2-weighted images (yielding T1#). In addition to the direction of registration, we also considered the two possible directions of synthesis (T2 from T1; T1 from T2).

For MIDAS and IXI, the above-mentioned considerations led to four distinct cases: a) T1→T2#, b) T1#→T2, c) T2→T1#, d) T2#→T1. Here, T1 and T2 are unregistered images, T1# and T2# are registered images, and → corresponds to the direction of synthesis. For each case, pGAN and cGAN were trained based on two variants, one receiving a single cross-section and the other receiving multiple (3, 5 and 7) consecutive cross-sections as input. This resulted in a total of 32 pGAN and 12 cGAN models. Note that the single cross-section cGAN contains generators for both contrasts, and trains a model that can synthesize in both directions. For the multi cross-section cGAN, however, a separate model was trained for each synthesis direction. For BRATS, no registration was needed, and this resulted in only two distinct cases for consideration: a) T1→T2 and d) T2→T1. A single variant of pGAN (k=3) and cGAN (k=1) was considered.

2) Comparison to State-of-the-Art Methods: To investigate how well the proposed methods perform with respect to state-of-the-art approaches, we compared the pGAN and cGAN models with Replica and Multimodal. Models were compared using the same training and testing sets, and these sets comprised images from different groups of subjects. The synthesized images were compared with the true target images as reference. Both the synthesized and the reference images were normalized to a maximum intensity of 1. To assess the synthesis quality, we measured the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) [65] metrics between the synthesized image and the reference.
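PSNR and SSIM between a synthesized image and its reference can be computed with scikit-image; the snippet assumes both images have already been normalized to a maximum intensity of 1, as described above.

    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def evaluate_pair(synth, ref):
        """PSNR and SSIM for one synthesized/reference pair (float arrays scaled to [0, 1])."""
        psnr = peak_signal_noise_ratio(ref, synth, data_range=1.0)
        ssim = structural_similarity(ref, synth, data_range=1.0)
        return psnr, ssim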
3) Spectral Density Analysis: While PSNR and SSIM serve as common measures to evaluate overall quality, they primarily capture characteristics dominated by lower spatial frequencies. To examine synthesis quality across a broader range of frequencies, we used a spectral density similarity (SDS) metric. The rationale for SDS is similar to that for the error spectral plots demonstrated in [66], where error distribution is analyzed across spatial frequencies. To compute SDS, synthesized and reference images were transformed into k-space and separated into four separate frequency bands: low (0–25%), intermediate (25–50%), high-intermediate (50–75%), and high (75–100% of the maximum spatial frequency in k-space). Within each band, SDS was taken as the Pearson's correlation between vectors of magnitude k-space samples of the synthesized and reference images. To avoid bias from background noise, we masked out background regions to zero before calculating the quality measures.
4) Generalizability: To examine the generalizability of the proposed methods, we trained pGAN, cGAN, Replica and Multimodal on the IXI dataset and tested the trained models on the MIDAS dataset. The following cases were examined: T1→T2#, T1#→T2, T2→T1#, and T2#→T1. During testing, ten sample images were synthesized for a given source image, and the results were averaged to mitigate nuisance variability in individual samples. When T1-weighted images were registered onto T2-weighted images, within-cross-section voxel dimensions were isotropic for both datasets and no extra pre-processing step was needed. However, when T2-weighted images were registered, voxel dimensions were anisotropic for IXI yet isotropic for MIDAS. To avoid spatial mismatch, voxel dimensions were matched via trilinear interpolation. Because a mismatch of voxel thickness in the cross-sectional dimension can deteriorate synthesis performance, single cross-section models were considered.
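Matching voxel dimensions via trilinear interpolation can be done, for example, with scipy.ndimage.zoom (order=1 corresponds to trilinear interpolation for 3-D volumes); the spacing values in the usage note are the IXI voxel sizes quoted earlier, while the function itself is an illustrative assumption.

    from scipy.ndimage import zoom

    def resample_to_spacing(vol, spacing, new_spacing):
        """Trilinearly resample a 3-D volume from its current voxel spacing to new_spacing (in mm)."""
        factors = [s / ns for s, ns in zip(spacing, new_spacing)]
        return zoom(vol, zoom=factors, order=1)   # order=1 -> (tri)linear interpolation

    # Example: bring an IXI volume with 0.94 x 0.94 x 1.2 mm voxels onto a 1 mm isotropic grid:
    # resampled = resample_to_spacing(vol, (0.94, 0.94, 1.2), (1.0, 1.0, 1.0))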


5) Reliability Against Noise: To examine the reliability of synthesis against image noise, we trained pGAN and Multimodal on noisy images. The IXI dataset was selected since it contains high-quality images with relatively low noise levels. Two separate sets of noisy images were then generated by adding Rician noise to the source and target contrast images, respectively. The noise level was fixed within subjects and randomly varied across subjects by changing the Rician shape parameter in [0 0.2]. For noise-added target images, background masking was performed prior to training and no perceptual loss was used in pGAN to prevent overfitting to noise. Separate models were trained using noise-added source and original target images, and using original source and noise-added target images.

Statistical significance of differences among methods was assessed with nonparametric Wilcoxon signed-rank tests across test subjects. Neural network training and evaluation were performed on NVIDIA Titan X Pascal and Xp GPUs. Implementation of pGAN and cGAN was carried out in Python using the PyTorch framework [67]. Code for replicating the pGAN and cGAN models will be available at http://github.com/icon-lab/mrirecon. Replica was based on a MATLAB implementation, and a Keras implementation [68] of Multimodal with the Theano backend [69] was used.

TABLE I
QUALITY OF SYNTHESIS IN THE MIDAS DATASET (SINGLE CROSS-SECTION MODELS)

TABLE II
QUALITY OF SYNTHESIS IN THE MIDAS DATASET (MULTI CROSS-SECTION MODELS, k = 3)

III. RESULTS

A. Comparison of GAN-Based Models

We first evaluated the proposed models on T1- and T2-weighted images from the MIDAS and IXI datasets. We considered two cases for T2 synthesis (a. T1→T2#, b. T1#→T2, where # denotes the registered image), and two cases for T1 synthesis (c. T2→T1#, d. T2#→T1). Table I lists PSNR and SSIM for pGAN, cGANreg trained on registered data, and cGANunreg trained on unregistered data in the MIDAS dataset. We find that pGAN outperforms cGANunreg and cGANreg in all cases (p<0.05). Representative results for T1→T2# are displayed in Fig. 3a and for T2#→T1 in Supp. Fig. Ia, respectively. pGAN yields higher synthesis quality compared to cGANreg. Although cGANunreg was trained on unregistered images, it can faithfully capture fine-grained structure in the synthesized contrast. Overall, both pGAN and cGAN yield synthetic images of remarkable visual similarity to the reference. Supp. Tables II and III (k=1) list PSNR and SSIM across test images for T2 and T1 synthesis with both directions of registration in the IXI dataset. Note that there is a substantial mismatch between the voxel dimensions of the source and target contrasts in the IXI dataset, so cGANunreg must map between the spatial sampling grids of the source and the target. Since this yielded suboptimal performance, measurements for cGANunreg are not reported. Overall, similar to the MIDAS dataset, we observed that pGAN outperforms the competing methods (p<0.05). On average, across the two datasets, pGAN achieves 1.42dB higher PSNR and 1.92% higher SSIM compared to cGAN. These improvements can be attributed to pixel-wise and perceptual losses compared to cycle-consistency loss on paired images.

In MR images, neighboring voxels can show structural correlations, so we reasoned that synthesis quality can be improved by pooling information across cross-sections. To examine this issue, we trained multi cross-section pGAN (k = 3, 5, 7), cGANreg and cGANunreg models (k = 3; see Methods) on the MIDAS and IXI datasets. PSNR and SSIM measurements for pGAN are listed in Supp. Table II, and those for cGAN are listed in Supp. Table III. For pGAN, multi cross-section models yield enhanced synthesis quality in all cases. Overall, k=3 offers optimal or near-optimal performance while maintaining relatively low model complexity, so k=3 was considered thereafter for pGAN. The results are more variable for cGAN, with the multi cross-section model yielding a modest improvement only in some cases. To minimize model complexity, k=1 was considered for cGAN.

Table II compares PSNR and SSIM of multi cross-section pGAN and cGAN models for T2 and T1 synthesis in the MIDAS dataset. Representative results for T1→T2# are shown in Fig. 3b and for T2#→T1 in Supp. Fig. Ib. Among multi cross-section models, pGAN outperforms the alternatives in PSNR and SSIM (p<0.05), except for SSIM in T2#→T1. Moreover, compared to the single cross-section pGAN, the multi cross-section pGAN improves PSNR and SSIM values. These measurements are also affirmed by improvements in visual quality for the multi cross-section model in Fig. 3 and Supp. Fig. I. In contrast, the benefits are less clear for cGAN. Note that, unlike pGAN, which works on paired images, the discriminators in cGAN work on unpaired images from the source and target domains. In turn, this can render incorporation of correlated information across cross-sections less effective. Supp. Tables II and III compare PSNR and SSIM of multi cross-section pGAN and cGAN models for T2 and T1 synthesis in the IXI dataset. The multi cross-section pGAN outperforms cGANreg in all cases (p<0.05). Moreover, the multi cross-section pGAN outperforms the single cross-section pGAN in all cases (p<0.05), except in T1→T2#. On average, across the two datasets, multi cross-section


pGAN achieves 0.63dB higher PSNR and 0.89% higher SSIM compared to the single cross-section pGAN.

TABLE III
QUALITY OF SYNTHESIS IN THE MIDAS DATASET

Fig. 3. The proposed approach was demonstrated for synthesis of T2-weighted images from T1-weighted images in the MIDAS dataset. Synthesis was performed with pGAN, cGAN trained on registered images (cGANreg), and cGAN trained on unregistered images (cGANunreg). For pGAN and cGANreg, training was performed using T2-weighted images registered onto T1-weighted images (T1→T2#). Synthesis results for (a) the single cross-section and (b) multi cross-section models are shown along with the true target image (reference) and the source image (source). Zoomed-in portions of the images are also displayed. While both pGAN and cGAN yield synthetic images of striking visual similarity to the reference, pGAN is the top performer. Synthesis quality is improved as information across neighboring cross-sections is incorporated, particularly for the pGAN method.

B. Comparison to State-of-the-Art Methods

Next, we demonstrated the proposed methods against two state-of-the-art techniques for multi-contrast MRI synthesis, Replica and Multimodal. We trained pGAN, cGANreg, Replica, and Multimodal on T1- and T2-weighted brain images in the MIDAS and IXI datasets. Note that Replica performs ensemble averaging across random forest trees and Multimodal uses mean-squared error measures that can lead to overemphasis of low-frequency information. In contrast, conditional GANs use loss functions that can more effectively capture details in the intermediate to high spatial frequency range. Thus, pGAN should synthesize sharper and more realistic images as compared to the competing methods. Table III lists PSNR and SSIM for pGAN, Replica and Multimodal (cGANreg listed in Supp. Table I) in the MIDAS dataset. Overall, pGAN outperforms the competing methods in all examined cases (p<0.05), except for SSIM in T2 synthesis, where pGAN and Multimodal perform similarly. The proposed method is superior in depiction of detailed tissue structure as visible in Supp. Fig. II (for comparisons in coronal and sagittal cross-sections see Supp. Figs. IV, V). Table IV lists PSNR and SSIM across test images synthesized via pGAN, Replica and Multimodal (cGANreg listed in Supp. Table I) for the IXI dataset. Overall, pGAN outperforms the competing methods in all examined cases (p<0.05). The proposed method is superior in depiction of detailed tissue structure as visible in Fig. 4 and Supp. Fig. III (see also Supp. Figs. IV, V).

Fig. 4. The proposed approach was demonstrated for synthesis of T1-weighted images from T2-weighted images in the IXI dataset. T2→T1# and T2#→T1 synthesis were performed with pGAN, Multimodal and Replica. Synthesis results for (a) T2→T1# and (b) T2#→T1, along with their corresponding error maps, are shown together with the true target image (reference) and the source image (source). The proposed method outperforms competing methods in terms of synthesis quality. Regions that are inaccurately synthesized by the competing methods are reliably depicted by pGAN (marked with arrows). The use of adversarial loss enables improved accuracy in synthesis of intermediate-spatial-frequency texture compared to Multimodal and Replica, which show some degree of blurring.

Following assessments on datasets comprising healthy subjects, we demonstrated the performance of the proposed methods on patients with pathology. To do this, we trained and tested pGAN, cGANreg, Replica, and Multimodal on T1- and T2-weighted brain images from the BRATS dataset. Similar to the previous evaluations, here we expected that the proposed method would synthesize more realistic images with improved preservation of fine-grained tissue structure. Table V lists PSNR and SSIM across test images synthesized via pGAN, Replica and Multimodal (cGANreg listed in Supp. Table I; for measurements on background-removed images in MIDAS, IXI and BRATS see Supp. Table V). Overall, pGAN is the


top performing method in all cases (p<0.05), except for SSIM in T1→T2 where pGAN and Multimodal perform similarly. Moreover, cGAN performs favorably in PSNR over competing methods. Representative images for T2 and T1 synthesis are displayed in Fig. 5 (see also Supp. Figs. IV, V). It is observed that regions near pathologies are inaccurately synthesized by Replica and Multimodal. Meanwhile, the pGAN method enables reliable synthesis with visibly improved depiction of structural details. Across the datasets, pGAN outperforms the state-of-the-art methods by 2.85dB PSNR and 1.23% SSIM.

TABLE IV
QUALITY OF SYNTHESIS IN THE IXI DATASET

TABLE V
QUALITY OF SYNTHESIS IN THE BRATS DATASET

Fig. 5. The proposed approach was demonstrated on glioma patients for synthesis of T2-weighted images from T1-weighted images, and of T1-weighted images from T2-weighted images, in the BRATS dataset. Synthesis results for (a) T1→T2 and (b) T2→T1, along with their corresponding error maps, are shown together with the true target image (reference) and the source image (source). Regions of inaccurate synthesis with Replica and Multimodal are observed near pathologies (marked with arrows). Meanwhile, the pGAN method enables reliable synthesis with visibly improved depiction of intermediate spatial frequency information.

Fig. 6. The T1-weighted image of a sample cross-section from the MIDAS dataset was processed with an ideal filter in k-space. The filter was broadened sequentially to include higher frequencies (0–25%, 0–50%, 0–75%, 0–100% of the maximum spatial frequency). The filtered images respectively show the contribution of the low, intermediate, high-intermediate and high frequency bands. The bulk shape and contrast of the imaged object is captured in the low frequency band, whereas fine structural details such as edges are captured in the intermediate and partly high-intermediate frequency bands. There is no apparent contribution from the high frequency band.

Next, we performed additional control analyses via 4-fold cross-validation to rule out potential biases due to subject selection. Supp. Tables IX–XI list PSNR and SSIM across test images synthesized via pGAN and Multimodal separately for all 4 folds. We find that there is minimal variability in pGAN performance across folds. Across the datasets, pGAN variability is merely 0.70% in PSNR and 0.37% in SSIM, compared to Multimodal variability of 2.26% in PSNR and 0.46% in SSIM. The results of these control analyses are also highly consistent with those in the original set of subjects reported in Supp. Table I. We find that there is minimal variability in pGAN performance between the main and control analyses. Across the datasets, pGAN variability is 1.42% in PSNR and 0.73% in SSIM, compared to Multimodal variability of 2.98% in PSNR and 0.97% in SSIM.

C. Spectral Density Analysis

To corroborate visual observations regarding improved depiction of structural details, we measured spectral density similarity (SDS) between synthesized and reference images across low, intermediate, high-intermediate and high spatial frequencies (see Methods). Fig. 6 shows filtered versions of a T1-weighted image in the MIDAS dataset, where the filter is broadened sequentially to include higher frequencies so as to visualize the contribution of individual bands. Intermediate and high-intermediate frequencies primarily correspond to edges and other structural details in MR images, so we expected pGAN to outperform competing methods in these bands. Fig. 7 shows representative synthesis results in the image and spatial frequency (k-space) domains. Supp. Table VI lists SDS across the test images synthesized via pGAN, cGANreg, Replica and Multimodal in all datasets. In the MIDAS dataset, pGAN outperforms the competing methods at low and intermediate frequencies (p<0.05), except in T1 synthesis where it performs similarly to Multimodal.


images, pGAN outperforms Multimodal in all examined cases


(p<0.05) except for SSIM in T1 →T2# . On average, pGAN
achieves 1.74dB higher PSNR and 2.20% higher SSIM than
Multimodal. For noisy target images, pGAN is the top per-
former in PSNR in T1# →T2 , T2 →T1# (p<0.05) and
performs similarly to Multimodal in the remaining cases.
On average, pGAN improves PSNR by 0.61dB. (Note, how-
ever, that for noisy target images, reference-based quality
measurements are biased by noise particularly towards higher
frequency bands; see Supp. Fig. VII.) Naturally, synthesis
Fig. 7. Synthesis results are shown for a sample cross section from
the IXI dataset along with the true target (reference) and the source
performance is lowered in the presence of noise. We assessed
image (source). Images shown in (a) the spatial domain (b) the spatial- the performance degradation when the models were trained
frequency (k-space) domain. White circular boundaries in the k-space on noise-added images as compared to when the models were
representation of the source delineate the boundaries of the low, interme-
diate, high-intermediate and high frequency bands. The pGAN method
trained on original images. Overall, pGAN and Multimodal
more accurately synthesizes the target image as evidenced by the better show similar performance degradation with noise. For noisy
match in energy distribution across k-space. source images, degradation is 5.27% in PSNR and 2.17% in
SSIM for pGAN, and 3.77% in PSNR, 2.66% in SSIM for
Multimodal. For noisy target images, degradation is 16.70%
In the IXI dataset, pGAN yields superior performance to com- in PSNR and 12.91% in SSIM for pGAN, and 15.19% in
peting methods in all frequency bands (p<0.05). In the BRATS PSNR, 10.06% in SSIM for Multimodal.
dataset, pGAN achieves higher SDS than the competing meth-
ods at low, intermediate and high-intermediate frequencies in
T2 synthesis and at low frequencies in T1 synthesis (p<0.05). IV. D ISCUSSION
Across the datasets, pGAN outperforms the state-of-the-art A multi-contrast MRI synthesis approach based on
methods by 0.056 at low, 0.061 at intermediate and 0.030 at conditional GANs was demonstrated against state-of-the-art
high-intermediate frequencies. methods in three publicly available brain MRI datasets.
The proposed pGAN method uses adversarial loss functions
D. Generalizability and correlated structure across neighboring cross-sections for
Next, we examined synthesis methods in terms of their improved synthesis. While many previous methods require
generalization performance. Supp. Table VII lists SSIM and registered multi-contrast images for training, a cGAN method
PSNR for pGAN, cGANreg , Replica and Multimodal trained was presented that uses cycle-consistency loss for learning
on the IXI dataset and tested on the MIDAS dataset. Overall, to synthesize from unregistered images. Comprehensive eval-
the proposed methods are the top performers. In T1 →T2# , uations were performed for two distinct scenarios where
Multimodal is the leading performer with 1.9% higher SSIM training images were registered and unregistered. Overall,
SSIM (p<0.05) than pGAN. In T1# →T2 , pGAN outperforms both proposed methods yield synthetic images of remarkable
competing methods in PSNR (p<0.05). In T2 →T1# , pGAN is visual similarity to reference images, and pGAN visually and
again the leading performer with 1.9% higher SSIM (p<0.05) quantitatively improves synthesis quality compared to state-
than Multimodal. In T2# →T1 , cGANreg is the leading per- of-the-art methods [21], [23]. These promising results warrant
former with 1.22dB higher PSNR (p<0.05) SSIM than pGAN. future studies on broad clinical populations to fully examine
E. Reliability Against Noise

Lastly, we examined the reliability of synthesis against noise (Supp. Fig. VI). Supp. Table VIII lists SSIM and PSNR for pGAN and Multimodal trained on noise-added source and target images from IXI, respectively. For noisy source

IV. DISCUSSION

A multi-contrast MRI synthesis approach based on conditional GANs was demonstrated against state-of-the-art methods in three publicly available brain MRI datasets. The proposed pGAN method uses adversarial loss functions and correlated structure across neighboring cross-sections for improved synthesis. While many previous methods require registered multi-contrast images for training, a cGAN method was presented that uses a cycle-consistency loss for learning to synthesize from unregistered images. Comprehensive evaluations were performed for two distinct scenarios where training images were registered and unregistered. Overall, both proposed methods yield synthetic images of remarkable visual similarity to reference images, and pGAN visually and quantitatively improves synthesis quality compared to state-of-the-art methods [21], [23]. These promising results warrant future studies on broad clinical populations to fully examine the diagnostic quality of synthesized images in pathological cases.

Several previous studies proposed the use of neural networks for multi-contrast MRI synthesis tasks [13], [19]–[21], [24]. A recent method, Multimodal, was demonstrated to yield higher quality compared to conventional methods in brain MRI datasets [21]. Unlike conventional neural networks, the GAN architectures proposed here are generative networks that learn the conditional probability distribution of the target contrast given the source contrast. The incorporation of an adversarial loss, as opposed to the typical squared or absolute error loss, leads to enhanced capture of detailed texture information about the target contrast, thereby enabling higher synthesis quality.

While our synthesis approach was primarily demonstrated for multi-contrast brain MRI here, architectures similar to pGAN and cGAN have been proposed in other medical image synthesis applications such as cross-modality synthesis or data augmentation [28], [29], [33]–[36], [38]–[42], [48]. The discussions below highlight key differences between the current study and previous work:
(1) [29], [40], [42], [48] proposed conditional GANs for cross-modality synthesis applications. One important proposed application is CT-to-PET synthesis [29], [40]. For instance, [29] fused the output of GANs and convolutional networks to enhance tumor detection performance from synthesized images, and [40] demonstrated competitive tumor detection results from synthesized versus real images. Another important application is MR-to-CT synthesis [42], [48]. In [42] and [48], patch-based GANs were used for locally aware synthesis, and contextual information was incorporated by training an ensemble of GAN models recurrently. Our approach differs in the following aspects: (i) Rather than cross-modality image synthesis, we focus on within-modality synthesis in multi-contrast MRI. MRI provides excellent delineation among soft tissues in the brain and elsewhere, with the diversity of contrasts that it can capture [70]. Therefore, synthesizing a specific MRI contrast given another poses a different set of challenges than performing MR-CT or CT-PET synthesis, where CT/PET shows relatively limited contrast among soft tissues [71]. (ii) We demonstrate multi-cross-section models to leverage correlated information across neighboring cross-sections within a volume. (iii) We demonstrate pGAN based on both pixel-wise and perceptual losses to enhance synthesis quality.

(2) Architectures similar to cGAN with a cycle-consistency loss were recently proposed to address the scarcity of paired training data in MR-CT synthesis tasks [28], [33], [36], [38], [39]. Reference [33] also utilized a gradient-consistency loss to enhance the segmentation performance on CT images synthesized from MR data. Reference [36] performed data augmentation for enhanced segmentation performance using MR images synthesized from CT data. Reference [39] coupled synthesis and segmentation networks to perform improved segmentation on synthesized CT images using MR labels. Our work differs in the following aspects: (i) As aforementioned, we consider within-modality synthesis as opposed to cross-modality synthesis. (ii) We consider paired image synthesis with cGAN to comparatively evaluate its performance against two state-of-the-art methods (Replica and Multimodal) for paired image synthesis.

(3) An architecture resembling pGAN was proposed for synthesizing retinal images acquired with fundus photography given tabular structural annotations [41]. Similar to pGAN, this previous study incorporated a perceptual loss to improve synthesis quality. Our work differs in the following aspects: (i) Synthesis of vascular fundus images in the retina given annotations is a distinct task from synthesis of a target MR contrast given another source MR contrast in the brain. Unlike the relatively focused delineation between vascular structures and background in retinal images, in our case there are multiple distinct types of brain tissue that appear at divergent signal levels in separate MR contrasts [71]. (ii) We demonstrate multi-cross-section models to leverage correlated information across neighboring cross-sections within an MRI volume.

(4) A recent study suggested the use of multiple cross-sections during MR-to-CT synthesis [72]. In comparison to [72], our approach is different in that: (i) We incorporate an adversarial loss function to better preserve intermediate-to-high frequency details in the synthesized images. (ii) We perform task- and model-specific optimization of the number of cross-sections, considering both computational complexity and performance. (iii) As aforementioned, we consider within-modality synthesis as opposed to cross-modality synthesis.

A few recent studies have independently proposed GAN models for multi-contrast MRI synthesis [62], [73], [74]. Perhaps the closest to our approach are [62] and [73], where conditional GANs with a pixel-wise loss were used for improved segmentation based on synthesized FLAIR, T1- and T2-weighted images. Our work differs from these studies in the following aspects: (i) We demonstrate improved multi-contrast MRI synthesis via a cycle-consistency loss to cope with unregistered images. (ii) We demonstrate improved multi-contrast synthesis performance via the inclusion of a perceptual loss in pGAN. (iii) We demonstrate multiple cross-section models to leverage correlated information across neighboring cross-sections within multi-contrast MRI volumes. (iv) We quantitatively demonstrate that conditional GANs better preserve detailed tissue structure in synthesized multi-contrast images compared to conventional methods [21], [23].

The proposed approach might be further improved by considering several lines of development. Here we presented multi-contrast MRI results while considering two potential directions for image registration (T1→T2# and T1#→T2 for T2 synthesis). We observed that the proposed methods yielded high-quality synthesis regardless of the registration direction. Comparisons between the two directions based on reference-based metrics are not informative because the references are inevitably distinct (e.g., T2# versus T2), so determining the optimal direction is challenging. Yet, with a substantial mismatch between the voxel sizes of the source and target contrasts, the cGAN method learns to interpolate between the spatial sampling grids of the source and the target. To alleviate the resulting performance loss, a simple solution is to resample each contrast separately to match the voxel dimensions (a minimal resampling sketch is given after this paragraph). Alternatively, the spatial transformation between the source and target images can first be estimated via multi-modal registration [75]. The estimated transformation can then be cascaded to the output of cGAN. A gradient cycle-consistency loss can also be incorporated to prevent the network from learning the spatial transformation between the source and the target [33]. Another cause of performance loss arises when MR images of a given contrast are corrupted by higher levels of noise than typical. Our analyses on noise-added images imply a certain degree of reliability against moderate noise in T1- or T2-weighted images. However, an additional denoising network could be incorporated into earlier layers of the GAN models when source images have higher noise, and into later layers when target images have elevated noise [76].
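The following sketch illustrates the resampling remedy mentioned above by interpolating a volume onto a desired voxel grid with scipy; the voxel spacings, array names, and interpolation order are hypothetical choices for illustration rather than settings taken from this work.

```python
# Minimal sketch (illustrative only): resample a 3D volume so that its voxel
# dimensions match a desired spacing before training or synthesis.
from scipy.ndimage import zoom

def resample_to_spacing(volume, current_spacing, target_spacing, order=3):
    """Interpolate `volume` (3D array) from `current_spacing` (mm) to `target_spacing` (mm)."""
    factors = [c / t for c, t in zip(current_spacing, target_spacing)]
    return zoom(volume, zoom=factors, order=order)

# Hypothetical usage: bring a 1.0 x 1.0 x 1.2 mm volume onto a 1 mm isotropic grid.
# t2_iso = resample_to_spacing(t2_volume, (1.0, 1.0, 1.2), (1.0, 1.0, 1.0))
```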
Synthesis accuracy can also be improved by generalizing the current approach to predict the target based on multiple source contrasts. In principle, both pGAN and cGAN can receive multiple source contrasts as input, in addition to the multiple cross-sections demonstrated here. In turn, this generalization can offer improved performance when a subset of the source contrasts is unavailable. The performance of conditional GAN architectures in the face of missing inputs warrants further investigation. Alternatively, an initial fusion step can be incorporated that combines the multi-contrast source images into a single fused image fed as input to the GAN [77].
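To make this input convention concrete, the sketch below stacks neighboring cross-sections, optionally from several source contrasts, along the channel dimension of the network input. The array names, number of neighbors, and edge handling are hypothetical and are not taken from the authors' implementation.

```python
# Minimal sketch (illustrative only): build a multi-channel input by stacking
# k neighboring cross-sections, optionally from several source-contrast volumes.
import numpy as np

def neighboring_sections(volume, index, k=3):
    """Return k consecutive cross-sections of `volume` (slices, H, W) centered on `index`."""
    half = k // 2
    idx = np.clip(np.arange(index - half, index + half + 1), 0, volume.shape[0] - 1)
    return volume[idx]                        # shape: (k, H, W)

def build_input(volumes, index, k=3):
    """Concatenate neighboring sections from each source contrast along the channel axis."""
    channels = [neighboring_sections(v, index, k) for v in volumes]
    return np.concatenate(channels, axis=0)   # shape: (k * len(volumes), H, W)

# Hypothetical usage: T1 and T2 volumes as sources, 3 neighboring sections each,
# yielding a 6-channel input for the cross-section at index 64.
# x = build_input([t1_volume, t2_volume], index=64, k=3)
```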

Our analyses on noise-added images indicate that, for target contrasts that are inherently noisier, a downweighting of the perceptual loss might be necessary. The proposed models include a hyperparameter for adjusting the relative weighting of the perceptual loss against the other loss terms. Thus, a cross-validation procedure can be performed for the specific set of source-target contrasts at hand to optimize the model parameters. It remains important future work to assess the optimal weighting of the perceptual loss as a function of noise level for specific contrasts. Alternatively, denoising can be included as a preprocessing step to improve reliability against noise. Note that such denoising has recently been proposed for learning-based sampling pattern optimization in MRI [78].
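The sketch below shows one way such a weighted composite generator objective can be written in PyTorch, with separate coefficients for the pixel-wise and perceptual terms. The weights, the least-squares adversarial formulation, and the feature extractor are placeholders for illustration and are not the exact settings used in this work.

```python
# Minimal sketch (illustrative only): composite generator loss with tunable
# weights for the adversarial, pixel-wise (L1), and perceptual terms.
import torch
import torch.nn.functional as F

def generator_loss(d_fake, y_fake, y_real, feature_net,
                   lambda_pix=100.0, lambda_perc=100.0):
    """d_fake: discriminator scores for synthesized images; y_fake/y_real: synthesized
    and reference target images; feature_net: a frozen, pretrained feature extractor."""
    adv = F.mse_loss(d_fake, torch.ones_like(d_fake))    # least-squares adversarial term
    pix = F.l1_loss(y_fake, y_real)                      # pixel-wise term (registered images)
    with torch.no_grad():
        feat_real = feature_net(y_real)
    perc = F.l1_loss(feature_net(y_fake), feat_real)     # perceptual term
    return adv + lambda_pix * pix + lambda_perc * perc
```

Cross-validating lambda_perc (and lambda_pix) over held-out subjects is one way to adapt the weighting to noisier target contrasts, as discussed above.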
An important concern regarding neural-network-based methods is the availability of large datasets for successful training. The cGAN method facilitates network training by permitting the use of unregistered and unpaired multi-contrast datasets. While here we performed training on paired images for unbiased comparison, cGAN permits the use of unpaired images from distinct sets of subjects. As such, it can facilitate the compilation of the large datasets that would be required for improved performance via deeper networks. Yet, further performance improvements may be viable by training networks based on a mixture of paired and unpaired training data [15].
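For completeness, the following sketch shows the form of a cycle-consistency objective that permits training with unpaired source and target images; the generator names and the loss weight are hypothetical. In a full model, this term is combined with adversarial losses on both mapping directions.

```python
# Minimal sketch (illustrative only): cycle-consistency loss for unpaired training.
# G_st maps source -> target, G_ts maps target -> source.
import torch.nn.functional as F

def cycle_consistency_loss(G_st, G_ts, x_source, x_target, lambda_cyc=10.0):
    """Penalize the reconstruction error after mapping each image to the other
    contrast and back, so unpaired/unregistered images can supervise training."""
    recon_source = G_ts(G_st(x_source))    # source -> target -> source
    recon_target = G_st(G_ts(x_target))    # target -> source -> target
    return lambda_cyc * (F.l1_loss(recon_source, x_source) +
                         F.l1_loss(recon_target, x_target))
```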
Recently, cross-modality synthesis with GANs was leveraged as a pre-processing step to enhance various medical imaging tasks such as segmentation, classification, or tumor detection [29], [33], [36], [39], [40], [79], [80]. For instance, [29] fused the output of GANs and convolutional networks to enhance tumor detection from synthesized PET images, and [40] demonstrated competitive detection performance with real versus synthesized PET images. Reference [33] trained GANs based on a cycle-consistency loss to enhance segmentation performance on synthesized CT images. Reference [36] showed that incorporating synthesized MR images alongside the real ones can improve the performance of a segmentation network [39]. GANs also showed enhanced performance in liver lesion classification in synthetic CT [79], and in chest pathology classification in synthetic X-ray images [80]. These previous reports suggest that the multi-contrast MRI synthesis methods proposed here might also improve similar post-processing tasks. It remains future work to assess to what extent improvements in synthesis quality translate to tasks such as segmentation or detection.

V. CONCLUSION

We proposed a new multi-contrast MRI synthesis method based on conditional generative adversarial networks. Unlike most conventional methods, the proposed method performs end-to-end training of GANs that synthesize the target contrast given images of the source contrast. The use of adversarial loss functions improves accuracy in the synthesis of detailed structural information in the target contrast. Synthesis performance is further improved by incorporating pixel-wise and perceptual losses in the case of registered images, and a cycle-consistency loss for unregistered images. Finally, the proposed method leverages information across neighboring cross-sections within each volume to increase the accuracy of synthesis. The proposed method outperformed state-of-the-art synthesis methods in multi-contrast brain MRI datasets from healthy subjects and glioma patients. Given the prohibitive costs of prolonged exams due to repeated acquisitions, only a subset of contrasts might be collected with adequate quality, particularly in pediatric and elderly patients and in large cohorts [1], [3]. Multi-contrast MRI synthesis might be helpful in those worst-case situations by offering a substitute for highly corrupted or even unavailable contrasts. Therefore, our GAN-based approach holds great promise for improving the diagnostic information available in clinical multi-contrast MRI.

REFERENCES

[1] B. B. Thukral, “Problems and preferences in pediatric imaging,” Indian J. Radiol. Imag., vol. 25, no. 4, pp. 359–364, Oct. 2015.
[2] K. Krupa and M. Bekiesińska-Figatowska, “Artifacts in magnetic resonance imaging,” Polish J. Radiol., vol. 80, pp. 93–106, Feb. 2015.
[3] C. M. Stonnington et al., “Interpreting scan data acquired from multiple scanners: A study with Alzheimer’s disease,” NeuroImage, vol. 39, no. 3, pp. 1180–1185, Feb. 2008.
[4] J. E. Iglesias, E. Konukoglu, D. Zikic, B. Glocker, K. van Leemput, and B. Fischl, “Is synthesizing MRI contrast useful for inter-modality analysis?” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2013, pp. 631–638.
[5] M. I. Miller, G. E. Christensen, Y. Amit, and U. Grenander, “Mathematical textbook of deformable neuroanatomies,” Proc. Natl. Acad. Sci. USA, vol. 90, no. 24, pp. 11944–11948, Dec. 1993.
[6] N. Burgos et al., “Attenuation correction synthesis for hybrid PET-MR scanners: Application to brain studies,” IEEE Trans. Med. Imag., vol. 33, no. 12, pp. 2332–2341, Dec. 2014.
[7] J. Lee, A. Carass, A. Jog, C. Zhao, and J. L. Prince, “Multi-atlas-based CT synthesis from conventional MRI with patch-based refinement for MRI-based radiotherapy planning,” Proc. SPIE, vol. 10133, Feb. 2017, Art. no. 101331I.
[8] A. Jog, A. Carass, S. Roy, D. L. Pham, and J. L. Prince, “MR image synthesis by contrast learning on neighborhood ensembles,” Med. Image Anal., vol. 24, no. 1, pp. 63–76, Aug. 2015.
[9] A. Jog, S. Roy, A. Carass, and J. L. Prince, “Magnetic resonance image synthesis through patch regression,” in Proc. IEEE Int. Symp. Biomed. Imaging, Apr. 2013, pp. 350–353.
[10] S. Roy, A. Carass, and J. Prince, “A compressed sensing approach for MR tissue contrast synthesis,” in Proc. Biennial Int. Conf. Inf. Process. Med. Imaging, 2011, pp. 371–383.
[11] S. Roy, A. Jog, A. Carass, and J. L. Prince, “Atlas based intensity transformation of brain MR images,” in Proc. Int. Workshop Multimodal Brain Image Anal., 2013, pp. 51–62.
[12] Y. Huang, L. Shao, and A. F. Frangi, “Simultaneous super-resolution and cross-modality synthesis of 3D medical images using weakly-supervised joint convolutional sparse coding,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jul. 2017, pp. 5787–5796.
[13] V. Sevetlidis, M. V. Giuffrida, and S. A. Tsaftaris, “Whole image synthesis using a deep encoder-decoder network,” in Proc. Int. Workshop Simul. Synth. Med. Imaging, 2016, pp. 127–137.
[14] R. Vemulapalli, H. van Nguyen, and S. K. Zhou, “Unsupervised cross-modal synthesis of subject-specific scans,” in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2015, pp. 630–638.
[15] Y. Huang, L. Shao, and A. F. Frangi, “Cross-modality image synthesis via weakly coupled and geometry co-regularized joint dictionary learning,” IEEE Trans. Med. Imaging, vol. 37, no. 3, pp. 815–827, Mar. 2018.
[16] D. H. Ye, D. Zikic, B. Glocker, A. Criminisi, and E. Konukoglu, “Modality propagation: Coherent synthesis of subject-specific scans with data-driven regularization,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2013, pp. 606–613.
[17] S. Roy, A. Carass, N. Shiee, D. L. Pham, and J. L. Prince, “MR contrast synthesis for lesion segmentation,” in Proc. IEEE Int. Symp. Biomed. Imaging, Apr. 2010, pp. 932–935.
[18] N. Cordier, H. Delingette, M. Lê, and N. Ayache, “Extended modality propagation: Image synthesis of pathological cases,” IEEE Trans. Med. Imaging, vol. 35, no. 12, pp. 2598–2608, Dec. 2016.
[19] H. van Nguyen, K. Zhou, and R. Vemulapalli, “Cross-domain synthesis of medical images using efficient location-sensitive deep network,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2015, pp. 677–684.
[20] T. Joyce, A. Chartsias, and S. A. Tsaftaris, “Robust multi-modal MR image synthesis,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2017, pp. 347–355.
[21] A. Chartsias, T. Joyce, M. V. Giuffrida, and S. A. Tsaftaris, “Multimodal MR synthesis via modality-invariant latent representation,” IEEE Trans. Med. Imaging, vol. 37, no. 3, pp. 803–814, Mar. 2018.
[22] S. Roy, A. Carass, and J. L. Prince, “Magnetic resonance image example-based contrast synthesis,” IEEE Trans. Med. Imaging, vol. 32, no. 12, pp. 2348–2363, Dec. 2013.
[23] A. Jog, A. Carass, S. Roy, D. L. Pham, and J. L. Prince, “Random forest regression for magnetic resonance image synthesis,” Med. Image Anal., vol. 35, pp. 475–488, Jan. 2017.
[24] C. Zhao, A. Carass, J. Lee, Y. He, and J. L. Prince, “Whole brain segmentation and labeling from CT using synthetic MR images,” in Machine Learning in Medical Imaging. Cham, Switzerland: Springer, 2017, pp. 291–298.
[25] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in Computer Vision—ECCV. Cham, Switzerland: Springer, Sep. 2016, pp. 694–711.
[26] C. Ledig et al., “Photo-realistic single image super-resolution using a generative adversarial network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Aug. 2017, pp. 105–114.
[27] A. Dosovitskiy and T. Brox, “Generating images with perceptual similarity metrics based on deep networks,” in Proc. Adv. Neural Inf. Process. Syst., 2016, pp. 658–666.
[28] J. M. Wolterink, A. M. Dinkla, M. H. F. Savenije, P. R. Seevinck, C. A. T. van den Berg, and I. Isgum, “Deep MR to CT synthesis using unpaired data,” in Proc. Int. Workshop Simul. Synth. Med. Imaging, 2017, pp. 14–23.
[29] A. Ben-Cohen, E. Klang, S. P. Raskin, M. M. Amitai, and H. Greenspan, “Virtual PET images from CT data using deep convolutional networks: Initial results,” in Proc. Int. Workshop Simul. Synth. Med. Imaging, 2017, pp. 49–57.
[30] F. Mahmood, R. Chen, and N. J. Durr, “Unsupervised reverse domain adaptation for synthetic medical images via adversarial training,” IEEE Trans. Med. Imaging, vol. 37, no. 12, pp. 2572–2581, Dec. 2018.
[31] H. Huang, P. S. Yu, and C. Wang. (2018). “An introduction to image synthesis with generative adversarial nets.” [Online]. Available: https://arxiv.org/abs/1803.04469
[32] Y. Hu et al., “Freehand ultrasound image simulation with spatially-conditioned generative adversarial networks,” in Proc. Int. Workshop Reconstruction Anal. Moving Body Organs, 2017, pp. 105–115.
[33] Y. Hiasa et al., “Cross-modality image synthesis from unpaired data using CycleGAN,” in Proc. Int. Workshop Simul. Synth. Med. Imaging, Sep. 2018, pp. 31–41.
[34] J. T. Guibas, T. S. Virdi, and P. S. Li. (2017). “Synthetic medical images from dual generative adversarial networks.” [Online]. Available: https://arxiv.org/abs/1709.01872
[35] P. Costa et al., “End-to-end adversarial retinal image synthesis,” IEEE Trans. Med. Imag., vol. 37, no. 3, pp. 781–791, Mar. 2018.
[36] A. Chartsias, T. Joyce, R. Dharmakumar, and S. A. Tsaftaris, “Adversarial image synthesis for unpaired multi-modal cardiac data,” in Proc. Int. Workshop Simul. Synth. Med. Imaging, Sep. 2017, pp. 3–13.
[37] F. Calimeri, A. Marzullo, C. Stamile, and G. Terracina, “Biomedical data augmentation using generative adversarial neural networks,” in Proc. Int. Conf. Artif. Neural Netw., Oct. 2017, pp. 626–634.
[38] J. M. Wolterink, A. M. Dinkla, M. H. F. Savenije, P. R. Seevinck, and C. A. T. van den Berg, “MR-to-CT synthesis using cycle-consistent generative adversarial networks,” in Proc. Neural Inf. Process. Syst. (NIPS), Long Beach, CA, USA, 2017.
[39] Y. Huo, Z. Xu, S. Bao, A. Assad, R. G. Abramson, and B. A. Landman, “Adversarial synthesis learning enables segmentation without target modality ground truth,” in Proc. IEEE 15th Int. Symp. Biomed. Imaging, Apr. 2018, pp. 1217–1220.
[40] L. Bi, J. Kim, A. Kumar, D. Feng, and M. Fulham, “Synthesis of positron emission tomography (PET) images via multi-channel generative adversarial networks (GANs),” in Proc. Int. Workshop Reconstruction Anal. Moving Body Organs, Sep. 2017, pp. 43–51.
[41] H. Zhao, H. Li, S. Maurer-Stroh, and L. Cheng, “Synthesizing retinal and neuronal images with generative adversarial nets,” Med. Image Anal., vol. 49, pp. 14–26, Jul. 2018.
[42] D. Nie, R. Trullo, C. Petitjean, S. Ruan, and D. Shen, “Medical image synthesis with deep convolutional adversarial networks,” in Proc. Med. Image Comput. Comput.-Assist. Intervent., May 2017, pp. 417–425.
[43] M. Mardani et al., “Deep generative adversarial neural networks for compressive sensing MRI,” IEEE Trans. Med. Imaging, vol. 38, no. 1, pp. 167–179, Jan. 2019.
[44] T. M. Quan, T. Nguyen-Duc, and W.-K. Jeong, “Compressed sensing MRI reconstruction with cyclic loss in generative adversarial networks,” IEEE Trans. Med. Imaging, vol. 37, no. 6, pp. 1488–1497, Aug. 2018.
[45] G. Yang et al., “DAGAN: Deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1310–1321, Jun. 2018.
[46] O. Shitrit et al., “Accelerated magnetic resonance imaging by adversarial neural network,” in Proc. Int. Workshop Deep Learn. Med. Image Anal., Sep. 2017, pp. 30–38.
[47] Y. Wang et al., “3D conditional generative adversarial networks for high-quality PET image estimation at low dose,” NeuroImage, vol. 174, pp. 550–562, Jul. 2018.
[48] D. Nie et al., “Medical image synthesis with deep convolutional adversarial networks,” IEEE Trans. Biomed. Eng., vol. 65, no. 12, pp. 2720–2730, Dec. 2018.
[49] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jul. 2017, pp. 1125–1134.
[50] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., Oct. 2017, pp. 2223–2232.
[51] I. J. Goodfellow et al., “Generative adversarial networks,” in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[52] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, Z. Wang, and S. P. Smolley, “Least squares generative adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., Oct. 2017, pp. 2813–2821.
[53] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Sep. 2015, pp. 3431–3440.
[54] T. Chen, M.-M. Cheng, P. Tan, A. Shamir, and S.-M. Hu, “Sketch2Photo: Internet image montage,” ACM Trans. Graph., vol. 28, no. 5, p. 124, Dec. 2009.
[55] M. Mirza and S. Osindero. (2014). “Conditional generative adversarial nets.” [Online]. Available: https://arxiv.org/abs/1411.1784
[56] O. Russakovsky et al., “ImageNet large scale visual recognition challenge,” Int. J. Comput. Vis., vol. 115, no. 3, pp. 211–252, Dec. 2015.
[57] E. Bullitt et al., “Vessel tortuosity and brain tumor malignancy: A blinded study,” Acad. Radiol., vol. 12, no. 10, pp. 1232–1240, Oct. 2005.
[58] M. Jenkinson, P. Bannister, M. Brady, and S. Smith, “Improved optimization for the robust and accurate linear registration and motion correction of brain images,” NeuroImage, vol. 17, no. 2, pp. 825–841, Oct. 2002.
[59] M. Jenkinson and S. Smith, “A global optimisation method for robust affine registration of brain images,” Med. Image Anal., vol. 5, no. 2, pp. 143–156, Jun. 2001.
[60] P. Murugan. (2017). “Hyperparameters optimization in deep convolutional neural network / Bayesian approach with Gaussian process prior.” [Online]. Available: https://arxiv.org/abs/1712.07233
[61] T. Hinz, N. Navarro-Guerrero, S. Magg, and S. Wermter, “Speeding up the hyperparameter optimization of deep convolutional neural networks,” Int. J. Comput. Intell. Appl., vol. 17, no. 2, Jun. 2018, Art. no. 1850008.
[62] B. Yu, L. Zhou, L. Wang, J. Fripp, and P. Bourgeat, “3D cGAN based cross-modality MR image synthesis for brain tumor segmentation,” in Proc. IEEE 15th Int. Symp. Biomed. Imaging, Apr. 2018, pp. 626–630.
[63] D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Represent., Aug. 2015, pp. 12–24.
[64] D. Ulyanov, A. Vedaldi, and V. Lempitsky. (2016). “Instance normalization: The missing ingredient for fast stylization.” [Online]. Available: https://arxiv.org/abs/1607.08022
[65] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.
[66] T. H. Kim and J. P. Haldar, “The Fourier radial error spectrum plot: A more nuanced quantitative evaluation of image reconstruction quality,” in Proc. IEEE 15th Int. Symp. Biomed. Imaging, Apr. 2018, pp. 61–64.
[67] A. Paszke et al., “Automatic differentiation in PyTorch,” in Proc. Neural Inf. Process. Syst. (NIPS), Long Beach, CA, USA, 2017.
[68] F. Chollet, Keras. San Francisco, CA, USA: GitHub, 2015.
[69] T. D. Team et al. (2016). “Theano: A Python framework for fast computation of mathematical expressions.” [Online]. Available: https://arxiv.org/abs/1605.02688
[70] M. A. Bernstein, K. F. King, and X. J. Zhou, Handbook of MRI Pulse Sequences. New York, NY, USA: Academic, 2004.
[71] D. G. Nishimura, Principles of Magnetic Resonance Imaging. Stanford, CA, USA: Stanford Univ., 1996.
[72] L. Xiang, Q. Wang, D. Nie, Y. Qiao, and D. Shen, “Deep embedding convolutional neural network for synthesizing CT image from T1-weighted MR image,” Med. Image Anal., vol. 47, pp. 31–44, Jul. 2018.
[73] Q. Yang, N. Li, Z. Zhao, X. Fan, E. I.-C. Chang, and Y. Xu. (2018). “MRI image-to-image translation for cross-modality image registration and segmentation.” [Online]. Available: https://arxiv.org/abs/1801.06940
[74] C. Han et al., “GAN-based synthetic brain MR image generation,” in Proc. IEEE 15th Int. Symp. Biomed. Imaging, Apr. 2018, pp. 734–738.
[75] X. Yang, R. Kwitt, M. Styner, and M. Niethammer, “Quicksilver: Fast predictive image registration—A deep learning approach,” NeuroImage, vol. 158, pp. 378–396, Sep. 2017.
[76] D. Jiang, W. Dou, L. Vosters, X. Xu, Y. Sun, and T. Tan, “Denoising of 3D magnetic resonance images with multi-channel residual learning of convolutional neural network,” Jpn. J. Radiol., vol. 36, no. 9, pp. 566–574, Sep. 2018.
[77] S. Qi et al., “Multimodal fusion with reference: Searching for joint neuromarkers of working memory deficits in schizophrenia,” IEEE Trans. Med. Imaging, vol. 37, no. 1, pp. 93–105, Jan. 2018.
[78] B. Gözcü et al., “Learning-based compressive MRI,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1394–1406, Jun. 2018.
[79] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, and H. Greenspan, “GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification,” Neurocomputing, vol. 321, pp. 321–331, Dec. 2018.
[80] H. Salehinejad, S. Valaee, T. Dowdell, E. Colak, and J. Barfett, “Generalization of deep neural networks for chest pathology classification in X-rays using generative adversarial networks,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., Apr. 2018, pp. 990–994.