
Graphical Abstract: Intensity Inhomogeneity Correction of MRI Images Using InhomoNet

This document presents a deep learning approach called InhomoNet for correcting intensity inhomogeneity in MRI images. Intensity inhomogeneity is an artifact that causes inconsistencies in pixel intensities within tissues and reduces image quality. InhomoNet uses a novel multi-scale architecture to capture features at multiple scales without losing neighborhood information. It also uses attention-driven skip connections to optimally transfer contextual and spatial information. Evaluation on synthetic and real MRI data shows InhomoNet can accurately perform intensity inhomogeneity correction compared to other state-of-the-art methods. Novel loss functions like histogram correlation and 3D pixel losses are also proposed to further enhance results.


Graphical Abstract

Intensity inhomogeneity correction of MRI images using InhomoNet


Vishal Venkatesh, Neeraj Sharma, Munendra Singh
Highlights

• The present paper proposes a deep learning approach for intensity inhomogeneity correction. The method is end-to-end trainable and can be incorporated into the pre-processing pipeline of downstream computer vision algorithms.
• The generator employs a novel multi-scale local information module that helps to capture features at multiple scales without loss of neighbourhood information, while the attention-driven skip connections help to transfer optimal contextual and spatially localized information across the encoder-decoder blocks.
• The proposed method yields a robust reconstruction because the prediction from each decoder block is required to be consistent with the ground truth. This guides the upsampling process and ensures that the final prediction resembles the ground truth.
• The incorporation of novel loss functions, namely the histogram correlation loss and the 3D pixel loss, in the objective function further enhances the results.
Intensity inhomogeneity correction of MRI images using InhomoNet
Vishal Venkatesh^a, Neeraj Sharma^b and Munendra Singh^a
^a Department of Mechatronics Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
^b School of Biomedical Engineering, Indian Institute of Technology (Banaras Hindu University), Varanasi 221005, India

ARTICLE INFO

Keywords: bias correction, image enhancement, intensity non-uniformity, deep learning

ABSTRACT

Intensity inhomogeneity is one of the major artifacts in magnetic resonance imaging (MRI). The bias field present in MRI images alters the true pixel values and produces spuriously varying pixel intensities. This artifact affects the diagnostic ability of the radiologist and also degrades the performance of computer-aided diagnosis algorithms such as segmentation. The present work proposes a novel network called InhomoNet for intensity inhomogeneity correction of MRI images. The generator architecture of InhomoNet contains a new multi-scale local information module at each encoder block that helps to capture features at multiple scales. The horizontal and vertical kernels help to reduce problems such as loss of neighbourhood information and gridding issues caused by large dilated convolution operations. The attention-driven skip connections in the generator network are used to transfer optimal semantic and spatial localization information from the encoder to the decoder blocks. Further, the present work proposes two new loss functions, i.e. the histogram correlation loss and the 3D pixel loss. These losses help to achieve pixel consistency across different regions of the brain MRI. The incorporation of the L1 loss provides guidance to the upsampling process, as it compares the prediction from each decoder block with the ground truth. The proposed method is evaluated on synthetic and real MRI data, and the comparative analysis with other state-of-the-art methods demonstrates the ability of the proposed method to perform intensity inhomogeneity correction accurately.

1. Introduction
Intensity inhomogeneity is a spurious, smoothly varying bias field that causes pixel inconsistency within the same tissue in an MRI image. This artifact reduces tissue contrast, making gray-white matter differentiation difficult, and also affects the interpretation of lesions and structural information. The variation in the true pixel intensity values due to the bias field causes semantic inconsistency and adversely affects subsequent computer-aided diagnosis and other image processing algorithms such as segmentation and registration. The effect of the intensity inhomogeneity artifact on a brain MRI image is evident from Fig. 1: it drastically lowers the accuracy of the segmentation results, which makes an image enhancement method that overcomes the artifact essential.

Figure 1: Effect of the intensity inhomogeneity artefact on the quality of the MRI image and the segmentation results. The first row shows (a) the inhomogeneity-corrupted MRI image, (b) the MRI image after correction of inhomogeneity, and (c) the inhomogeneity-free ground truth image; the second row shows the respective segmentation results.

The mathematical model for the intensity inhomogeneity artifact is given as:

$$v(x, y) = u(x, y)\, b(x, y) + n(x, y) \tag{1}$$
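As a minimal sketch of Eq. (1), the snippet below corrupts a toy image u with a multiplicative bias field b and additive noise n. The linear ramp used for the bias field, the Gaussian noise model and the function names `linear_bias_field` and `corrupt` are illustrative assumptions, not the simulation procedure used in this work.

```python
import numpy as np

def linear_bias_field(height, width, low=0.7, high=1.3):
    """Hypothetical smooth bias field b(x, y): a left-to-right linear ramp."""
    ramp = np.linspace(low, high, width)
    return np.tile(ramp, (height, 1))

def corrupt(u, b, noise_sigma=0.0, seed=0):
    """Eq. (1): v = u * b + n, with n assumed to be zero-mean Gaussian noise."""
    rng = np.random.default_rng(seed)
    n = rng.normal(0.0, noise_sigma, size=u.shape)
    return u * b + n

u = np.full((4, 5), 100.0)      # toy homogeneous tissue region
b = linear_bias_field(4, 5)     # multiplicative bias from 0.7 to 1.3
v = corrupt(u, b)               # noise_sigma=0.0, so the noise term is zero
```

In the noise-free case the clean image is recovered exactly by dividing by the bias field (v / b), which is why correction methods concentrate on estimating b or, as here, on predicting u directly.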

In Eq. (1), v(x, y) denotes the intensity inhomogeneity-corrupted image, u(x, y) the intensity inhomogeneity-free image, b(x, y) the bias field and n(x, y) the noise component. The main causes of the intensity inhomogeneity artifact are: (i) improper positioning or coupling of the coils, (ii) non-uniform excitation of hydrogen atoms, which can be observed when scanning is done with flip angles other than 90° and 180°, and (iii) an uneven magnetic field within the scanned region. This artifact can be reduced to an extent with the help of proper coil size, tuning of the scanning parameters and shimming [14]. However, these methods largely rely on the user and are specific to the situation. In recent years, retrospective image processing techniques have shown potential to minimize the intensity inhomogeneity artefact [35], [31].
The main contributions of the proposed work are as follows:
• The present paper proposes a deep learning approach for intensity inhomogeneity correction. The method is end-to-end trainable and can be incorporated into the pre-processing pipeline of downstream computer vision algorithms.
• The generator employs a novel multi-scale local information module that helps to capture features at multiple scales without loss of neighbourhood information, while the attention-driven skip connections help to transfer optimal contextual and spatially localized information across the encoder-decoder blocks.
• The proposed method yields a robust reconstruction because the prediction from each decoder block is required to be consistent with the ground truth. This guides the upsampling process and ensures that the final prediction resembles the ground truth.
• The incorporation of novel loss functions, namely the histogram correlation loss and the 3D pixel loss, in the objective function further enhances the results.

∗ Corresponding author: [email protected]
V. Venkatesh et al.: Preprint submitted to Elsevier Page 1 of 9

2. Related Work
Initially, Guillemaud et al. proposed a high-pass homomorphic filter to reduce the intensity inhomogeneity present as a slowly varying bias field in MRI data [11]. This method was simple, robust and independent of tissue class. An improved version of the homomorphic filtering technique was developed by Y. Lee et al., who utilized a B-spline smoothing function to address the intensity inhomogeneity [16]. In a recent study, an unsharp-masking-based enhanced homomorphic filtering method was proposed to correct inhomogeneity in brain MRI data [27]. The inhomogeneity correction techniques discussed above are based on a filtering approach, and the problem associated with most filtering-based approaches is the loss of low-frequency information along with the removal of the bias field from the MRI data. In order to retain low-frequency information, dynamic stochastic resonance was used to reduce the intensity inhomogeneity of the diffusion-weighted imaging MRI sequence of the neonatal brain [29]. This technique reduced the bias field and improved the overall quality of the MRI data. One of the most popular methods for inhomogeneity correction is the Nonparametric intensity Nonuniformity Normalization (N3) method, which uses an iterative approach to estimate the multiplicative bias field and the true intensity values [30]. In general, the bias field is assumed to be a smoothly varying field; Snehashis et al. proposed a patch-based intensity inhomogeneity correction that performs well even in the case of a non-smooth bias field [26]. Further, Osadebey et al. corrected the bias field present in MRI using a region of interest and an anatomic structural map [23].
The techniques discussed above enhance the details of MRI data, which improves diagnostic confidence. Further, the accuracy of segmentation algorithms is highly influenced by the presence of inhomogeneity in MRI data. In this view, a few studies have proposed segmentation techniques led by intensity inhomogeneity correction. Wells et al. proposed adaptive segmentation and removed the bias field using expectation maximization [37]. C. Li et al. proposed an efficient approach based on multiplicative intrinsic component optimization (MICO) for bias field correction and segmentation of MRI data [17], which produced very good results. Liu et al. proposed edge-preserved intensity inhomogeneity correction and segmented the liver in MRI using a level set method [19]. Recently, Kumar et al. presented a novel N3T-spline approach to correct the intensity variations, in combination with a 3D convolutional neural network, for automatic brain tumor segmentation [15].
In recent years, developments in deep learning have shown huge potential in the field of medical image processing [24][21][2]. The prominent encoder-decoder structure is utilized for many computer vision problems such as semantic segmentation, denoising and deblurring. Several recent works apply convolutions with different receptive fields in parallel to capture multi-scale contextual information [4][38]. This leads to better feature extraction and has achieved remarkable results. The downsampling and upsampling operations result in a loss of information. This information loss is reduced by employing skip connections between the encoder and decoder blocks. Different variants of skip connections [33][9][32] have been proposed to efficiently transfer information from the encoder to the decoder side and have proven to be beneficial for biomedical image analysis as well [6][12]. Attention serves as an important element [3][18][7], and the principle of providing channel and spatial attention to the feature maps helps in the appropriate selection of the features necessary to obtain results with high semantic as well as spatial localisation accuracy. The incorporation of attention and other similar advancements is becoming more prevalent in recent research. We explore similar techniques and advancements in the proposed work as well.

3. Proposed Method
3.1. Network Architecture
Multi-scale local information modules: In order to ensure that the pixel intensities are consistent in different tissue regions of the MRI image, the method should essentially determine the true pixel intensity of the whole region and correct the spurious intensities of individual pixels based on the semantics of the region in which they are located. The multi-scale approach of kernels offers different fields of view and helps in determining the true pixel intensity across different regions. Direct use of larger kernels to obtain the macroscopic view usually results in a higher number of parameters, making the methods computationally expensive and time intensive. The use of dilated convolutions results in fewer parameters, but as the rate of dilation increases, problems like loss of neighbourhood information, gridding issues and semantic inconsistency [36] are introduced. This is caused by the padding of zeroes between the kernel values, which do not capture any information.
In the context of the intensity inhomogeneity problem, information at both the macroscopic and the local level is essential to predict the true pixel intensity. The proposed module includes horizontal and vertical convolutions to reduce the effect of the problems that arise from dilated convolutions. The dilated convolutions help to acquire the
macroscopic contextual information, while the 2D horizontal and vertical kernels help to acquire the local contextual and neighbourhood information. The proposed multi-scale local information module is shown in Fig. 3 and consists of: (i) four different 3 × 3 kernels with dilation rates of 1, 2, 4 and 6, and (ii) 2D horizontal and vertical kernels. The kernel dimensions after dilation are 3 × 3, 5 × 5, 9 × 9 and 13 × 13, and the corresponding horizontal and vertical kernels have dimensions 1 × 3, 1 × 5, 1 × 9, 1 × 13 and 3 × 1, 5 × 1, 9 × 1, 13 × 1, respectively. The outputs of the dilated convolutions are concatenated channel-wise. The average (F_avg) is computed over the output feature maps of matching horizontal and vertical kernels. Further, channel-wise concatenation is applied to all averaged output feature maps. The dilated convolutions and the horizontal-vertical convolutions perform feature extraction in parallel, hence capturing maximum information from the input feature map. The summation of the concatenations from the dilated convolutions and the horizontal-vertical convolutions is resized by a 1 × 1 convolution kernel. Hence, if the input feature map has channel depth M, the output channel depth is maintained as M. Residual learning is also incorporated to make the flow of features across the network more efficient.

Figure 2: Overview of the proposed method and representation of InhomoNet. The encoder blocks contain the multi-scale local information module, and optimal semantic and spatial information is transferred to the decoder by means of the attention-driven skip connections. The overall objective function contains four loss functions. The red arrows depict downsampling, the blue arrows depict upsampling, the orange and black arrows depict the transfer of features, and the brown arrow depicts the upsampling operation used to realise the actual image dimensions.

Figure 3: Representation of the multi-scale local information module.

Attention-driven skip connections: The downsampling process at the encoder blocks results in an increase in the semantic and contextual information and a decrease in the spatial localization information. Incorporating skip connections across corresponding encoder and decoder blocks benefits from the spatial information of the initial encoder blocks and the contextual information of the later encoder blocks. In order to transfer optimal contextual as well as spatial information from the encoder blocks to the decoder blocks, attention-driven skip connections based on channel and spatial attention methods are proposed. This is essential for intensity inhomogeneity artifact correction, as it helps to produce results with accurate spatial localization of the pixels and also brings about consistency in the intensity values based on the semantics of the region. Fig. 4
represents the overview of the attention-driven skip connections, and Fig. 2 depicts the utilization of these connections in the generator architecture. The channel and spatial attention blocks take in a feature map of dimension C × H × W and output attention feature maps of dimension C × 1 × 1 and 1 × H × W, capturing the inter-channel and inter-spatial relationships between the features, respectively. The channel and spatial attention provide contextual and spatial localization attention to the features, respectively.
The skip connections from the initial encoder blocks transfer good spatial information but lack contextual information. This insufficiency of contextual information can be overcome by utilizing the information from the later encoder blocks, which contain good contextual information; the converse applies to skip connections from the later encoder blocks, which lack spatial information. Keeping this fundamental property of encoder-decoder structures in mind, the attention module takes two feature maps as input: one is taken as the main feature map (represented by the black arrow) and the other as the enhancer feature map (represented by the orange arrow), which adds the lacking information to the main feature map. The attention maps produced from the enhancer feature map by the channel and spatial attention blocks scale the main feature map, thereby providing attention to the less informative features. The main feature map is then summed with its scaled versions from the channel and spatial attention blocks and concatenated as represented in Fig. 4. The output feature map is rich in features that represent optimal contextual and spatial information, and it is transferred to the decoder side.

Figure 4: Representation of the attention module.

Generator Architecture: The generator architecture consists of an encoder-decoder structure. There are four encoder and decoder blocks and a bottleneck deep representation. Each encoder block consists of a multi-scale local information module and then downsamples the feature map by a factor of two using the inverse operation of pixel shuffle [28]. The encoder blocks perform feature extraction and result in a deep representation. Each decoder block upsamples the incoming feature map by the pixel shuffle method and concatenates it with the attention-rich feature map from the attention module. The decoder block has two outputs: one proceeds to the next decoder block, while the other produces a prediction of the intensity inhomogeneity corrected image. The dimension of the prediction is matched with that of the ground truth after pixel shuffle and convolution operations, represented by the brown arrow in Fig. 2. The final prediction of the intensity inhomogeneity corrected image is obtained after convolution by a 3 × 3 kernel.

Discriminator Architecture: The discriminator tries to distinguish the prediction from the ground truth, i.e. whether the prediction seems real or fake with respect to the ground truth. The discriminator architecture relies on 70 × 70 overlapping patches and classifies each of them as real or fake. The discriminator used is the same as the one proposed in [39].

3.2. Loss Functions
Guidance L1 loss: The L1 loss minimizes the sum of the absolute differences between the values of the ground truth and the prediction. In order to provide guidance to the upsampling process, the prediction from each decoder block is compared with the ground truth by means of the L1 loss. The loss is given as follows:

$$L_{GL1} = \sum_{j}^{L} \sum_{i}^{n} \left\| y^{g}_{j,i} - y^{p}_{j,i} \right\| \tag{2}$$

where y^g and y^p are the ground truth and predicted pixel intensity values, respectively, and L represents the decoder level.

Adversarial loss: The game-theoretic approach between the generator and discriminator networks helps to produce results that are realistic and visually appealing, making the adversarial loss a powerful loss [10]. The generator produces intensity inhomogeneity corrected images that are similar to the intensity inhomogeneity-free ground truth images, while the discriminator aims to distinguish between the two. The generator aims to minimize the discriminator's ability to distinguish the predicted samples from the ground truth samples, while the discriminator aims to maximize its ability to identify the ground truth samples among the predicted samples. The loss is given as follows:

$$\min_{G} \max_{D} L_{Adv}(G, D) = E_{y \sim p_{data}(Y^{g})}[\log D(y)] + E_{x \sim p_{data}(Y^{p})}[\log(1 - D(G(x)))] \tag{3}$$

Histogram correlation loss: A histogram graphically represents the distribution of pixels over the intensity values. An intensity inhomogeneity-free image shows distinct, consistent intensity values for different regions, whereas an intensity inhomogeneity-corrupted image shows spurious variations in the intensity values; this is evident in the histogram visualization in Fig. 5. This loss tries to maximize the correlation between the histograms of the ground truth and predicted images. The correlation and the loss are given in Eq. (4).
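The histogram correlation loss described above can be sketched with NumPy as one minus the Pearson correlation between the two 256-bin intensity histograms. This is an evaluation-style illustration only: the helper names are hypothetical, and hard binning with `np.histogram` is not differentiable, so a training implementation would presumably need a differentiable histogram, which is not shown here.

```python
import numpy as np

def histogram(img, bins=256):
    """Counts of pixels per intensity value for an 8-bit grayscale image."""
    hist, _ = np.histogram(img, bins=bins, range=(0, bins))
    return hist.astype(np.float64)

def histogram_correlation(h_g, h_p):
    """Pearson correlation between two histograms (0.0 if either is flat)."""
    d_g = h_g - h_g.mean()
    d_p = h_p - h_p.mean()
    denom = np.sqrt(np.sum(d_g ** 2) * np.sum(d_p ** 2))
    return float(np.sum(d_g * d_p) / denom) if denom > 0 else 0.0

def histogram_correlation_loss(gt, pred):
    """Loss = 1 - correlation, so identical histograms give a loss of 0."""
    return 1.0 - histogram_correlation(histogram(gt), histogram(pred))

rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(32, 32))
loss_same = histogram_correlation_loss(gt, gt)
loss_inverted = histogram_correlation_loss(gt, 255 - gt)
```

Because the correlation lies in [-1, 1], the loss lies in [0, 2]; it only compares intensity distributions and carries no spatial information, which is presumably why it is paired with pixel-wise losses in the overall objective.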


Figure 5: Ablation for the multi-scale local information module. The numerical values denote the SSIM. a: intensity inhomogeneity corrupted image, b: use of convolution kernels, c: use of dilated and horizontal, vertical kernels, d: ground truth.

Table 1
Quantitative results of the ablation for the multi-scale local information module. PropN denotes the proposed network. The Parameters column denotes the number of parameters contributed by each technique if utilized in the method.
Method                       Parameters      Hist_corr  SSIM    PSNR
PropN w/ Conv                79.41 Million   0.5632     0.9459  29.091
PropN w/ Dil_conv            10.07 Million   0.5869     0.9630  30.552
PropN w/ Dil_conv+HV_conv    25.73 Million   0.6277     0.9813  34.978
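The ordering of the parameter counts in Table 1 can be illustrated with a small calculation. A 3 × 3 kernel with dilation rate d covers a spatial extent of k + (k − 1)(d − 1) while keeping only nine weights per channel pair, whereas a dense kernel of the same extent needs extent × extent weights. The sketch below assumes a hypothetical channel depth of 64 for every branch and ignores bias terms, so it reproduces only the ordering of Table 1, not its absolute values.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def conv_weight_count(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (bias terms ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels

dilations = [1, 2, 4, 6]
extents = [effective_kernel_size(3, d) for d in dilations]

channels = 64  # hypothetical channel depth, not taken from the paper

# Dense kernels with the same extents as the dilated ones.
dense = sum(conv_weight_count(e, e, channels, channels) for e in extents)
# Four dilated 3 x 3 kernels.
dilated = sum(conv_weight_count(3, 3, channels, channels) for _ in dilations)
# Matching horizontal (1 x e) and vertical (e x 1) kernels.
hv = sum(2 * conv_weight_count(1, e, channels, channels) for e in extents)
```

Under these assumptions the dilated branch needs the fewest weights, adding the horizontal and vertical kernels raises the count moderately, and dense kernels of the same extent are the most expensive, which matches the ordering of the three rows in Table 1.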

$$d(H_g, H_p) = \frac{\sum_{I} (H_g(I) - \bar{H}_g)(H_p(I) - \bar{H}_p)}{\sqrt{\sum_{I} (H_g(I) - \bar{H}_g)^2 \sum_{I} (H_p(I) - \bar{H}_p)^2}}$$

$$L_{HC} = 1 - d(H_g, H_p) \tag{4}$$

Figure 6: Representation of the 2D gray-scale image in 3D matrix form.

3D Pixel loss: The 2D grayscale image can be visualized in 3D matrix form, as represented in Fig. 6. The depth of the 3D matrix corresponds to the pixel intensity range of 0 to 255. The 3D matrix consists of a stack of 2D binary maps populated by 1 and 0, one for each intensity value from 0 to 255, where each binary map represents a particular pixel intensity value and its co-ordinates correspond to the co-ordinates of the 2D grayscale image. For example, each co-ordinate (x, y) in the 2D grayscale image with pixel intensity p_i is represented in the 3D matrix by means of 1 and 0, i.e. the value is 1 at co-ordinate (x, y, p_i) and 0 elsewhere. The ground truth and the predictions are represented as 3D matrices and the mean absolute difference of the two matrices is taken. The 3D pixel loss is given in Eq. (5).
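A minimal NumPy sketch of the 3D representation and the resulting loss follows. It assumes integer-valued 8-bit images and uses hard one-hot encoding, which is not differentiable; how the encoding is made trainable is not specified here, and the function names `to_3d` and `pixel_loss_3d` are hypothetical.

```python
import numpy as np

def to_3d(img, depth=256):
    """One-hot encode an integer image of shape (H, W) into (H, W, depth)."""
    img = np.asarray(img, dtype=np.int64)
    # Entry (x, y, k) is 1 where the pixel at (x, y) has intensity k, else 0.
    return (img[..., None] == np.arange(depth)).astype(np.float64)

def pixel_loss_3d(gt, pred, depth=256):
    """Mean absolute difference between the two 3D binary matrices."""
    p_g = to_3d(gt, depth)
    p_p = to_3d(pred, depth)
    return float(np.mean(np.abs(p_g - p_p)))

gt = np.array([[0, 10], [200, 255]])
pred = np.array([[0, 11], [200, 255]])   # one pixel is off by one level
loss_equal = pixel_loss_3d(gt, gt)
loss_one_off = pixel_loss_3d(gt, pred)
```

With this hard encoding, every mismatched pixel contributes exactly two non-zero entries regardless of how far off the intensity is, so the loss counts wrongly valued pixels rather than measuring the size of the error.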



$$L_{3DP} = \frac{1}{H \times W \times D} \sum_{i}^{H} \sum_{j}^{W} \sum_{k}^{D} \left\| P^{g}_{i,j,k} - P^{p}_{i,j,k} \right\| \tag{5}$$

where H and W are the height and width of the image and D is the depth, i.e. the pixel intensity range 0 to 255. P^g and P^p represent the 3D matrices of the ground truth and the prediction.

Overall objective function: The complete loss function is given as follows:

$$L = L_{Adv} + \alpha \times L_{GL1} + \beta \times L_{HC} + \gamma \times L_{3DP} \tag{6}$$

The terms α, β and γ are positive weights that help control the performance of the network.

4. Experimental Results
4.1. Dataset
The dataset used in the present work consists of simulated and real MR images of the brain. The simulated dataset was obtained from the BrainWeb dataset [5]. Five different intensity inhomogeneity (bias) fields were simulated [22] using discrete polynomials in the x and y directions, i.e. (2, 1), (3, 2), (4, 6), (7, 2) and (5, 5). Each generated inhomogeneity field was multiplied with the eighty-one ground truth images (slice numbers 55 to 135) obtained from the BrainWeb dataset to generate intensity inhomogeneity images. The total synthetic dataset generated for the work was 405 image pairs of intensity inhomogeneity images and their corresponding intensity inhomogeneity-free ground truth images. The dimension of the gray-scale images was maintained as 192 × 192. Fifty randomly selected images were set aside as the validation set and the remaining 355 images were used for training. Further, we tested the proposed network on simulated as well as real MRI data. The fifty simulated MRI images for testing contain inhomogeneity fields of (2, 5), (4, 1), (2, 7), (3, 1) and (1, 2) in the x and y directions. The real MRI images for testing were collected from [1].
The segmentation dataset required to train the UNet [25] for the segmentation task consists of 81 synthetic ground truth images and their corresponding segmentation masks with three labels, i.e. grey matter, white matter and cerebrospinal fluid (CSF). Data augmentation techniques such as horizontal and vertical flips and rotations of 5, 15, 30, 45, 60, 75 and 90 degrees are used to enlarge the dataset.

4.2. Implementation details
Least squares GANs show stability in the learning process in comparison to regular GANs [20]. Hence, instead of the log-likelihood adversarial loss for the generator and discriminator, the least-squares loss is used. We set the batch size to 1 and use the ADAM optimizer with a learning rate of 0.0001 for the first 50 epochs, after which the learning rate is linearly decayed to zero until epoch 100. The terms α, β and γ were set to 10, 1 and 100, respectively, and were selected after a series of comparison experiments. The standard Structural Similarity Index (SSIM) and Peak Signal-to-Noise Ratio (PSNR) metrics were used to evaluate the performance of the proposed method. The TensorFlow framework and an NVidia Tesla K80 GPU were used in the process. We utilize a UNet pre-trained on the 3-label segmentation dataset to perform segmentation of the brain MRI image.

4.3. Ablation Study
The ablation study is performed on the validation dataset, and the quantitative results denote the mean of the results obtained. Fig. 5 and Table 1 show the utility of the multi-scale local information module, which helps to obtain images whose histograms resemble those of the ground truth images; this is evident in the correlation value as well. Also, the number of parameters is lower than with standard convolution kernels performing the same multi-scale feature extraction, yet the module attains the best metric values. The overall number of parameters of the proposed method is 32.61 million. Hence, this module efficiently brings about pixel consistency in different regions. The utility of the attention-driven skip connections is shown in Fig. 8 and Table 2. The results from the network without skip connections have blurred, less distinctive regions. The addition of skip connections improves the image quality due to the transfer of information from the encoder to the decoder blocks, but the use of attention-driven skip connections helps to produce images with accurate intensity values and better spatial localization of the pixels, thereby obtaining high metric values.

Table 2
Quantitative results of ablation on attention-driven skip connections.
Method               SSIM    PSNR
PropN w/o skip       0.9227  27.269
PropN w/ skip        0.9564  29.867
PropN w/ Attn_skip   0.9808  34.886

Table 3 shows the performance of the network with incremental addition of new losses. The incorporation of different losses helps to improve the results, as is quantitatively evident in Table 3. The guidance L1 loss helps in the reconstruction of the image but does not take semantic criteria into account while predicting the pixel intensity. The adversarial loss offers meta-supervision, as it tries to map the generated images to the domain of the ground truth images; hence there is an improvement in the metric values. The histogram correlation loss enables the network to generate images that have histograms similar to that of the ground truth, hence encouraging every intensity value in the gray-scale range to have the same number of pixels as in the ground truth. The 3D pixel loss provides accurate spatial localization as well as the true intensity value to the pixels in the generated image and helps arrive at the highest metric values. Hence, each loss in the overall objective function brings about an improvement in the performance of the network.

Table 3
Quantitative results for ablation on loss functions.
Method                   SSIM    PSNR
PropN+L1                 0.8777  25.552
PropN+GL1                0.9343  27.913
PropN+GL1+Adv            0.9576  30.373
PropN+GL1+Adv+HC         0.9788  34.271
PropN+GL1+Adv+HC+3DP     0.9826  35.297

4.4. Results
Fig. 7 shows the qualitative results of the comparative analysis of our method with other state-of-the-art conventional and deep learning image-to-image translation methods. The conventional methods are MICO [17], N4 [34] and Homo [11], while the deep learning methods are Pix2Pix [13] and CycleGAN [39]. It is observed that our method effectively rectifies the intensity inhomogeneity artifact; this is evident as the predicted segmentation mask is nearly identical to the ground truth mask. The conventional methods fail to bring about pixel consistency, which adversely affects the results from the segmentation network, while the deep learning methods, being data-driven, perform comparatively better than the conventional methods. In the quantitative results, the proposed method achieves the best SSIM and PSNR values, and the high mean intersection-over-union (mIOU) and dice coefficient (DC) values of the segmentation results on the intensity inhomogeneity corrected images demonstrate the proposed method's ability to generate images similar to the ground truth. Hence, the proposed method outperforms conventional and deep learning methods for the task of intensity inhomogeneity correction on synthetic images. We further show the performance of the proposed method on real brain MRI images in Fig. 9. Since the ground truth image is not available, quantitative results cannot be expressed in terms of SSIM and PSNR. The correction of intensity inhomogeneity is instead measured in terms of the coefficient of joint variation (CJV) [8] between gray matter and white matter. A lower CJV value corresponds to better inhomogeneity correction. The CJV values are given at the top of each MRI image in Fig. 9. In comparison to the real MRI image, the processed MRI image shows a lower CJV, i.e. less intra-tissue intensity variance. The

Figure 7: Qualitative comparison results on synthetic test data. a: intensity inhomogeneity corrupted image; intensity inhomogeneity corrected images from b: MICO, c: N4, d: Homo, e: Pix2Pix, f: CycleGAN, g: InhomoNet; and h: ground truth. Rows 1, 3 and 5 show the MRI images and rows 2, 4 and 6 show the segmentation results of the MRI images above them.

V. Venkatesh et al.: Preprint submitted to Elsevier Page 7 of 9


Intensity inhomogeneity correction of MRI images using InhomoNet

5. Conclusion
In this work, we proposed a novel deep learning method
to address the intensity inhomogeneity in MRI images. The
proposed InhomoNet consists of new generator architecture
that contains multi-scale local information module that cap-
tures features in different receptive fields and tries to over-
come problems like loss of neighbourhood information, grid-
ding issues. The inculcation of attention-driven skip con-
nection builds on the concept of enabling the transfer of op-
timal spatial and contextual information from the encoder
to decoder side, this provides attention such that accurate
spatial localization and true pixel intensity is realized. The
method also consists of new losses like the guidance L1 loss
that helps in providing guidance in the upsampling process,
hence enhancing the reconstruction ability. The Histogram
Figure 8: Qualitative results on ablation for attention-driven correlation loss tries to bring about pixel consistency in dif-
skip connections. The numerical values denote the SSIM. a : ferent regions, while the 3D pixel loss ensures the spatial lo-
Intensity inhomogeneity corrupted image, b : no skip connec- calization of the pixels in the generated image are accurate.
tions, c : with skip connections, d : with attention driven skip The combination of these new loss functions with the ad-
connections, e : ground truth. verasrial loss makes the proposed method an efficient,accurate
method to perform intensity inhomogeneity artifact correc-
tion.
Table 4 The future work would be pertaining to developing a method
Quantitative results for comparative analysis on synthetic test
that is both robust to intensity inhomogeneity and noise ar-
data. The SSIM, PSNR are denoted for predicted MRI images
tifact, try to inculcate more semantically-driven image en-
and mIOU,DC for the segmentation results.
hancement and develop techniques that are deployable on
Method SSIM PSNR mIOU DC real MRI images.
MICO 0.9127 22.699 35.8 0.422
N4 0.9122 20.595 51.7 0.658
Homo 0.8216 18.839 33.0 0.441 References
Pix2pix 0.9342 26.167 67.1 0.787
[1] , . real mri dataset. https://openneuro.org/public/datasets. Ac-
CycleGAN 0.9324 26.947 69.3 0.805 cessed: 2019-07-03.
InhomoNet 0.9515 29.129 80.4 0.887 [2] Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D.L., Erickson, B.J.,
2017. Deep learning for brain mri segmentation: state of the art and
future directions. Journal of digital imaging 30, 449–459.
[3] Anwar, S., Barnes, N., 2019. Real image denoising with feature at-
tention. arXiv preprint arXiv:1904.07396 .
[4] Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H., 2018.
Encoder-decoder with atrous separable convolution for semantic im-
age segmentation, in: Proceedings of the European conference on
computer vision (ECCV), pp. 801–818.
[5] Cocosco, C.A., Kollokian, V., Kwan, R.K.S., Pike, G.B., Evans,
A.C., 1997. Brainweb: Online interface to a 3d mri simulated brain
database, in: NeuroImage, Citeseer.
[6] Drozdzal, M., Vorontsov, E., Chartrand, G., Kadoury, S., Pal, C.,
2016. The importance of skip connections in biomedical image seg-
mentation, in: Deep Learning and Data Labeling for Medical Appli-
cations. Springer, pp. 179–187.
[7] Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., Lu, H., 2019. Dual at-
tention network for scene segmentation, in: Proceedings of the IEEE
Figure 9: Results on real data. a and b represent: Intensity Conference on Computer Vision and Pattern Recognition, pp. 3146–
inhomogeneity corrupted and corrected images respectively. 3154.
[8] Ganzetti, M., Wenderoth, N., Mantini, D., 2016. Intensity inhomo-
geneity correction of structural mr images: a data-driven approach to
define input algorithm parameters. Frontiers in neuroinformatics 10,
qualitative results clearly show that the network is able to 10.
rectify the bias field and bring about contrast enhancement [9] Gao, H., Tao, X., Shen, X., Jia, J., 2019. Dynamic scene deblur-
ring with parameter selective sharing and nested skip connections, in:
and pixel intensity correction based on the semantics of the
Proceedings of the IEEE Conference on Computer Vision and Pattern
region. Recognition, pp. 3848–3856.
[10] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets, in: Advances in neural information processing systems, pp. 2672–2680.
[11] Guillemaud, R., 1998. Uniformity correction with homomorphic filtering on region of interest, in: Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No. 98CB36269), IEEE. pp. 872–875.
[12] Ibtehaz, N., Rahman, M.S., 2020. Multiresunet: rethinking the u-net architecture for multimodal biomedical image segmentation. Neural Networks 121, 74–87.
[13] Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A., 2017. Image-to-image translation with conditional adversarial networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1125–1134.
[14] Jones, R.W., Witte, R.J., 2000. Signal intensity artifacts in clinical mr imaging. Radiographics 20, 893–901.
[15] Kumar, G.A., Sridevi, P., 2019. Intensity inhomogeneity correction for magnetic resonance imaging of automatic brain tumor segmentation, in: Microelectronics, Electromagnetics and Telecommunications. Springer, pp. 703–711.
[16] Lee, Y., Wang, Z., Zhu, Y.s., 2003. An improved homomorphic filtering method for nonuniformity correction of mr images. IFAC Proceedings Volumes 36, 119–122.
[17] Li, C., Gore, J.C., Davatzikos, C., 2014. Multiplicative intrinsic component optimization (mico) for mri bias field estimation and tissue segmentation. Magnetic resonance imaging 32, 913–923.
[18] Li, H., Xiong, P., An, J., Wang, L., 2018. Pyramid attention network for semantic segmentation. arXiv preprint arXiv:1805.10180.
[19] Liu, H., Tang, P., Guo, D., Liu, H., Zheng, Y., Dan, G., 2018. Liver mri segmentation with edge-preserved intensity inhomogeneity correction. Signal, Image and Video Processing 12, 791–798.
[20] Mao, X., Li, Q., Xie, H., Lau, R.Y., Wang, Z., Paul Smolley, S., 2017. Least squares generative adversarial networks, in: Proceedings of the IEEE International Conference on Computer Vision, pp. 2794–2802.
[21] Milletari, F., Ahmadi, S.A., Kroll, C., Plate, A., Rozanski, V., Maiostre, J., Levin, J., Dietrich, O., Ertl-Wagner, B., Bötzel, K., et al., 2017. Hough-cnn: deep learning for segmentation of deep brain regions in mri and ultrasound. Computer Vision and Image Understanding 164, 92–102.
[22] O’Leary, P., Harker, M., 2012. A framework for the evaluation of inclinometer data in the measurement of structures. IEEE Transactions on Instrumentation and Measurement 61, 1237–1251.
[23] Osadebey, M., Bouguila, N., Arnold, D., 2016. Brain mri intensity inhomogeneity correction using region of interest, anatomic structural map, and outlier detection, in: Applied Computing in Medicine and Health. Elsevier, pp. 79–98.
[24] Pawar, K., Chen, Z., Shah, N.J., Egan, G.F., 2018. Motion correction in mri using deep convolutional neural network, in: Proceedings of the ISMRM Scientific Meeting & Exhibition, Paris.
[25] Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical image computing and computer-assisted intervention, Springer. pp. 234–241.
[26] Roy, S., Carass, A., Bazin, P.L., Prince, J.L., 2011. Intensity inhomogeneity correction of magnetic resonance images using patches, in: Medical Imaging 2011: Image Processing, International Society for Optics and Photonics. p. 79621F.
[27] SA, P.S.B., 2019. Enhanced homomorphic unsharp masking method for intensity inhomogeneity correction in brain mr images. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 1–9.
[28] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., Wang, Z., 2016. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1874–1883.
[29] Singh, M., Sharma, S., Verma, A., Sharma, N., 2017. Enhancement and intensity inhomogeneity correction of diffusion-weighted mr images of neonatal and infantile brain using dynamic stochastic resonance. Journal of Medical and Biological Engineering 37, 508–518.
[30] Sled, J.G., Zijdenbos, A.P., Evans, A.C., 1998. A nonparametric method for automatic correction of intensity nonuniformity in mri data. IEEE transactions on medical imaging 17, 87–97.
[31] Song, S., Zheng, Y., He, Y., 2017. A review of methods for bias correction in medical images. Biomedical Engineering Review 1.
[32] Song, Y., Zhu, Y., Du, X., 2019. Dynamic residual dense network for image denoising. Sensors 19, 3809.
[33] Tong, T., Li, G., Liu, X., Gao, Q., 2017. Image super-resolution using dense skip connections, in: Proceedings of the IEEE International Conference on Computer Vision, pp. 4799–4807.
[34] Tustison, N.J., Avants, B.B., Cook, P.A., Zheng, Y., Egan, A., Yushkevich, P.A., Gee, J.C., 2010. N4itk: improved n3 bias correction. IEEE transactions on medical imaging 29, 1310.
[35] Vovk, U., Pernus, F., Likar, B., 2007. A review of methods for correction of intensity inhomogeneity in mri. IEEE transactions on medical imaging 26, 405–421.
[36] Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., Cottrell, G., 2018. Understanding convolution for semantic segmentation, in: 2018 IEEE winter conference on applications of computer vision (WACV), IEEE. pp. 1451–1460.
[37] Wells, W.M., Grimson, W.E.L., Kikinis, R., Jolesz, F.A., 1996. Adaptive segmentation of mri data. IEEE transactions on medical imaging 15, 429–442.
[38] Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J., 2017. Pyramid scene parsing network, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2881–2890.
[39] Zhu, J.Y., Park, T., Isola, P., Efros, A.A., 2017. Unpaired image-to-image translation using cycle-consistent adversarial networks, in: Proceedings of the IEEE international conference on computer vision, pp. 2223–2232.
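
Supplementary sketch: coefficient of joint variation. The real-data results (Fig. 9) are scored with the coefficient of joint variation (CJV) [8], but the text does not spell the measure out. The sketch below assumes the commonly used definition, CJV = (sigma_GM + sigma_WM) / |mu_GM - mu_WM|, computed from the gray matter and white matter intensities. The function name cjv, the boolean tissue masks and the synthetic row-wise bias field are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cjv(image, gm_mask, wm_mask):
    """Coefficient of joint variation between gray and white matter.

    Assumed definition: (std_GM + std_WM) / |mean_GM - mean_WM|.
    Lower values mean less intra-tissue variance relative to the
    contrast between the two tissues.
    """
    image = np.asarray(image, dtype=np.float64)
    gm = image[np.asarray(gm_mask, dtype=bool)]
    wm = image[np.asarray(wm_mask, dtype=bool)]
    if gm.size == 0 or wm.size == 0:
        raise ValueError("both tissue masks must select at least one pixel")
    mean_gap = abs(gm.mean() - wm.mean())
    if mean_gap == 0.0:
        raise ValueError("tissue means coincide; CJV is undefined")
    return float((gm.std() + wm.std()) / mean_gap)


def make_demo():
    """Toy two-tissue image with and without a multiplicative bias field."""
    rng = np.random.default_rng(0)
    gm_mask = np.zeros((64, 64), dtype=bool)
    gm_mask[:, :32] = True          # left half plays the role of gray matter
    wm_mask = ~gm_mask              # right half plays the role of white matter
    clean = np.where(gm_mask, 100.0, 150.0) + rng.normal(0.0, 2.0, size=(64, 64))
    bias = np.linspace(0.7, 1.3, 64)[:, None]   # smooth gain varying across rows
    corrupted = clean * bias
    return clean, corrupted, gm_mask, wm_mask


clean, corrupted, gm_mask, wm_mask = make_demo()
cjv_clean = cjv(clean, gm_mask, wm_mask)
cjv_corrupted = cjv(corrupted, gm_mask, wm_mask)
print(f"CJV clean: {cjv_clean:.3f}, corrupted: {cjv_corrupted:.3f}")
```

On this toy image the bias field spreads the intensities inside each tissue, so the corrupted image gets the higher CJV, which matches the reading in the text that a lower CJV indicates better correction.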

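
Supplementary sketch: segmentation overlap scores. Table 4 reports mIOU and DC for the segmentation results without stating how they are averaged. The sketch below assumes integer label maps and a plain per-class (macro) average that skips classes absent from both maps. The function name dice_and_miou and the toy label maps are hypothetical. Note that Table 4 appears to list mIOU as a percentage, whereas this sketch returns a fraction.

```python
import numpy as np


def dice_and_miou(pred, target, num_classes):
    """Return (mean Dice, mean IoU) over classes for two integer label maps.

    Classes that occur in neither map are skipped so they do not
    distort the average (an assumption of this sketch).
    """
    pred = np.asarray(pred)
    target = np.asarray(target)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    dices, ious = [], []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        inter = np.logical_and(p, t).sum()
        total = p.sum() + t.sum()
        union = total - inter
        if union == 0:
            continue  # class c is absent from both maps
        dices.append(2.0 * inter / total)
        ious.append(inter / union)
    if not dices:
        raise ValueError("no class in range(num_classes) occurs in either map")
    return float(np.mean(dices)), float(np.mean(ious))


# Toy 2x2 label maps with two classes.
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
mean_dice, mean_iou = dice_and_miou(pred, target, num_classes=2)
print(f"mean Dice: {mean_dice:.4f}, mean IoU: {mean_iou:.4f}")
```

Dice and IoU are monotonically related for a single class (Dice = 2 IoU / (1 + IoU)), so the two columns of Table 4 are expected to rank the methods in broadly the same way.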
