Conference Paper · March 2018


DOI: 10.1117/12.2293240



Blind CT image quality assessment via deep learning
strategy: Initial Study
Sui Li, Ji He, Yongbo Wang, Yuting Liao, Dong Zeng, Zhaoying Bian, and Jianhua Ma
Department of Biomedical Engineering, Southern Medical University, Guangzhou, Guangdong
510515, China

Guangzhou Key Laboratory of Medical Radiation Imaging and Detection Technology,
Guangzhou 510515, China

ABSTRACT
Computed tomography (CT) is one of the most important medical imaging modalities. CT images can be used
to assist in the detection and diagnosis of lesions and to facilitate follow-up treatment. However, CT images are
vulnerable to noise. There are two major sources that intrinsically cause noise in CT data: X-ray photon
statistics and the electronic noise background. Therefore, image quality assessment (IQA) is necessary in CT
imaging before diagnosis and treatment. Most existing CT IQA methods are based on human observer studies;
however, these methods are impractical in the clinic because they are complex and time-consuming.
In this paper, we present a blind CT image quality assessment method based on a deep learning strategy. A database of 1500
CT images was constructed, containing 300 high-quality images and 1200 corresponding noisy images. Specifically,
the high-quality images were used to simulate the corresponding noisy images at four different dose levels. The
images were then scored by experienced radiologists on the following attributes using a five-point scale: image noise,
artifacts, edge and structure, overall image quality, and tumor size and boundary estimation. We trained
a network to learn the nonlinear mapping from CT images to subjective evaluation scores, and then loaded the
pre-trained model to predict a score for each test image. To evaluate the performance of the deep
learning network for IQA, two correlation coefficients were used: the Pearson linear correlation coefficient (PLCC) and the Spearman
rank-order correlation coefficient (SROCC). The experimental results demonstrate that the
presented deep learning based IQA strategy can be used for CT image quality assessment.
Keywords: Blind image quality assessment, deep learning, five-point scale IQA, CT images

1. INTRODUCTION
Computed tomography (CT) is a widely used imaging modality that allows visualization of anatomical structures
with high spatial and temporal resolution. CT images are subject to noise and artifacts during acquisition,
processing, and reconstruction. Therefore, there is a need to assess CT image quality before diagnosis and
treatment. IQA is an effective approach to assessing the quality of perceived visual stimuli. IQA methods fall into
two categories: subjective assessment by human observers and objective assessment by algorithms designed to mimic the
subjective judgment. Subjective assessment is generally regarded as the gold standard for evaluating images; however,
subjective judgment is not always suitable in practice because of its well-known drawbacks of being time-
and labor-consuming, especially in clinical applications. According to the availability of a reference image,
objective IQA metrics can be classified as full-reference (FR), reduced-reference (RR), and no-reference
(NR, or blind) methods.1 In CT imaging, there is no perfect image to serve as a reference for IQA. Therefore, NR-IQA
is needed for CT image quality assessment.
Recently, deep learning, in particular the convolutional neural network (ConvNet), has had an enormous impact on com-
puter vision research and practice. The success of ConvNets can be attributed to many factors, including deep
architectures, batch normalization (BN), the rectified linear unit (ReLU), residual learning, and high-performance
graphics processing units (GPUs).2 Deep structures are capable of extracting features from data to build
increasingly abstract representations, replacing the traditional approach of carefully hand-crafting features and
algorithms. This has encouraged researchers to explore the potential of ConvNets for IQA prob-
lems.3–6 The corresponding experimental results demonstrate significant improvements over
previous hand-crafted approaches.7–9 It is worth noting that most of these IQA models were proposed for natural image
quality assessment, and few have been used in clinical applications.
Correspondence: J.M.: E-mail: [email protected]
Inspired by recent advances in deep learning methods for IQA, in this paper we present a blind CT
image quality assessment method based on a deep learning strategy. A database of 1500 CT images was constructed, containing
300 high-quality images and 1200 corresponding noisy images. Specifically, the high-quality images were used to
simulate the corresponding noisy images at four different dose levels. The images were then scored by experienced
radiologists on the following attributes using a five-point scale: image noise, artifacts, edge and structure, overall image quality, and
tumor size and boundary estimation. We trained a network to learn the nonlinear mapping
from CT images to subjective evaluation scores, and then loaded the pre-trained model to predict a score
for each test image. The predicted scores were validated using the Pearson linear correlation coefficient (PLCC) and the Spearman
rank-order correlation coefficient (SROCC).

2. METHODOLOGY
2.1 Image Database
In this work, clinical data authorized by the Mayo Clinic were used to validate and evaluate the
performance of the NR-IQA model. Distorted CT images were simulated from the high-quality
CT images using the simulation technique of Ref. 10: each high-quality image was used to simulate noisy images at
four different dose levels. Fig. 1 shows a subset of the images used in the study. All CT images are in gray-scale,
and the images in this database are 512×512 pixels in size. The images were scored by experienced radiologists
on the following attributes: image noise, artifacts, edge and structure, overall image quality, and tumor size and
boundary estimation, each on a five-point scale: Excellent (5), Good (4), Fair (3), Poor (2), Bad (1). There are 1500
images in the database.

[Image panel omitted: eight example CT slices, each labeled with a numeric score.]

Figure 1. Some source images used in the study
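The cited simulation technique of Ref. 10 injects dose-dependent noise into projection data; as a rough illustrative sketch only (not the authors' actual pipeline), the core idea that noise magnitude grows as dose decreases can be mimicked in the image domain. The dose factors, the base noise level sigma0, and the electronic-noise level below are hypothetical values chosen for illustration.

```python
import numpy as np

def simulate_noisy_images(high_quality_img, dose_factors=(0.5, 0.25, 0.125, 0.0625),
                          sigma0=10.0, seed=0):
    """Generate one noisy image per simulated dose level.

    The noise standard deviation grows roughly as 1/sqrt(dose), loosely
    mimicking photon statistics in low-dose CT, plus a fixed electronic
    noise background. All noise levels here are assumed, not from the paper.
    """
    rng = np.random.default_rng(seed)
    noisy = {}
    for dose in dose_factors:
        sigma = sigma0 / np.sqrt(dose)                      # quantum noise term
        quantum = rng.normal(0.0, sigma, high_quality_img.shape)
        electronic = rng.normal(0.0, 2.0, high_quality_img.shape)  # electronic background
        noisy[dose] = high_quality_img + quantum + electronic
    return noisy

img = np.full((512, 512), 100.0)   # stand-in for one high-quality CT slice
noisy = simulate_noisy_images(img)
print(len(noisy))                  # one noisy image per dose level
```

One high-quality image thus yields four noisy counterparts, matching the 300-to-1200 ratio of the database.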

2.2 Implementation Details

The presented NR-IQA model was implemented using a typical network architecture, AlexNet.11 For AlexNet,
we changed only the training image size and the number of outputs, since our objective has five scores. The
flowchart of our approach is shown in Fig. 2.

[Flowchart omitted: input images Q1 to Qn pass through the AlexNet convolution layers and fully-connected layers to produce a quality score.]

Figure 2. Overview of the presented NR-IQA framework

For IQA, we assess a whole image instead of patches, because visual quality is a holistic property
of an image. We use the TensorFlow framework.12 For training, we use the mini-batch stochastic gradient descent
(SGD) algorithm. The batch size, momentum, and weight decay for mini-batch SGD were set to 16, 0.99,
and 0.005, respectively. The filter weights of each layer were initialized from a zero-mean Gaussian distribution
with standard deviation 0.01, and the initial biases of each convolution layer were set to 0.
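The mini-batch SGD update with momentum and weight decay described above can be sketched as a generic update rule (this is an illustration of the optimizer's arithmetic, not TensorFlow internals; the learning rate is an assumed value, since the paper does not report it):

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.99, weight_decay=0.005):
    """One parameter update of SGD with momentum and L2 weight decay.

    momentum=0.99 and weight_decay=0.005 follow the paper; lr is assumed.
    """
    grad = grad + weight_decay * w              # L2 weight decay term
    velocity = momentum * velocity - lr * grad  # accumulate velocity
    w = w + velocity
    return w, velocity

# initialize weights as in the paper: zero-mean Gaussian, std 0.01
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.01, size=(256,))
v = np.zeros_like(w)
w, v = sgd_momentum_step(w, grad=np.ones_like(w), velocity=v)
```

With momentum this high, the velocity accumulates gradients over many steps, so a small learning rate is typically required for stability.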

2.3 Evaluation Protocols


Two evaluation metrics are traditionally used to evaluate the performance of IQA algorithms: the Pearson linear
correlation coefficient (PLCC)13 and the Spearman rank-order correlation coefficient (SROCC).14 PLCC is
a measure of the linear correlation between the ground-truth and predicted quality scores, and it reflects
prediction accuracy. The PLCC is computed as:

PLCC = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^{N} (\hat{y}_i - \bar{\hat{y}})^2}},   (1)

where N is the number of test images, y_i denotes the ground-truth score of the i-th image, \hat{y}_i denotes the
score of the i-th image predicted by the network, and \bar{y} and \bar{\hat{y}} are the means of the ground-truth and
predicted scores, respectively.
The SROCC measures the prediction monotonicity. Given N test images, the SROCC is computed as:

SROCC = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N (N^2 - 1)},   (2)

where d_i is the difference between the ranks of the i-th image in the subjective scores and in the predicted scores.
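Equations (1) and (2) translate directly into code. The following sketch implements both metrics with NumPy (the rank-based SROCC form assumes no tied scores; the sample score vectors are made up for illustration):

```python
import numpy as np

def plcc(y, y_hat):
    """Pearson linear correlation coefficient, Eq. (1)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    yc, yhc = y - y.mean(), y_hat - y_hat.mean()
    return np.sum(yc * yhc) / (np.sqrt(np.sum(yc ** 2)) * np.sqrt(np.sum(yhc ** 2)))

def srocc(y, y_hat):
    """Spearman rank-order correlation coefficient, Eq. (2) (assumes no ties)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)

    def ranks(a):
        return np.argsort(np.argsort(a))  # 0-based ranks; differences are unaffected

    d = (ranks(y) - ranks(y_hat)).astype(float)
    n = len(y)
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))

subjective = [1, 2, 3, 4, 5]          # hypothetical radiologist scores
predicted = [1.2, 2.1, 2.9, 4.2, 4.8]  # hypothetical network outputs
print(round(plcc(subjective, predicted), 3))   # close to 1: high linear agreement
print(round(srocc(subjective, predicted), 3))  # exactly 1: perfectly monotonic
```

Because SROCC depends only on ranks, a prediction that preserves ordering scores 1.0 even when the absolute values are off; PLCC additionally penalizes deviations from a linear relationship.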

3. RESULTS
We randomly chose 1350 images from the database as training data and used the remaining 150 images as testing
data. The accuracy of the presented AlexNet-IQA-model is 98%. The consistency between the predicted quality
scores and the subjective evaluation results is one of the most important criteria for evaluating the performance
of IQA algorithms. Fig. 3 is the scatter plot of subjective scores versus scores predicted by the AlexNet-IQA-model
in one trial on the entire test set; the x-coordinate is the subjective score, and the y-coordinate is the predicted
score. Fig. 3 thus shows the monotonicity and consistency between predicted and subjective scores: the predicted
results of the AlexNet-IQA-model are highly consistent with the subjective evaluation scores, demonstrating that
the model agrees closely with human perception.

[Scatter plot omitted: predicted scores (y-axis, 1 to 5) vs. subjective scores (x-axis, 1 to 5).]

Figure 3. Scatter plot of the predicted result vs. subjective score for the AlexNet-IQA-model

Two performance indexes are used to evaluate the efficacy of IQA methods: PLCC, which measures prediction
accuracy, and SROCC, which measures prediction monotonicity. We randomly selected 80% of the images as the
training dataset and the remaining 20% as the test dataset; the performance is shown in Table 1.
Table 1. SROCC and PLCC measurements on the proposed AlexNet-IQA-model

AlexNet-IQA-model
PLCC 0.9953
SROCC 0.9952

4. CONCLUSION
In this paper, we presented a blind CT image quality assessment via deep learning strategy. A database of
1500 CT images is constructed, containing 300 high-quality images and 1200 corresponding noisy images. The
predicted scores were validate using Pearson Linear Coefficient (PLCC) and Spearman Rank Order Correlation
Coefficient (SROCC). And the experimental result demonstrate that the presented deep learning based IQA
strategy can be used in the CT image quality assessment. As we all know, for medical images, the measure of
unitary image quality is not a reliable measure of diagnostic accuracy. Radiologists always focus on a single focal
lesion of whole medical image, which may not have the best image quality overall. Therefore, in further study,
not only we shall elaborate our database and NR-IQA model but also we will focus on the focal lesion image
quality to model a new local NR-IQA methods.

ACKNOWLEDGMENTS
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 81371544
and 61571214, the China Postdoctoral Science Foundation funded project under Grant Nos. 2016M602489 and
2016M602488, the Guangdong Natural Science Foundation under Grant No. 2015A030313271, the Science
and Technology Program of Guangdong, China under Grant No. 2015B020233008, and the Science and Technology
Program of Guangzhou, China under Grant No. 201510010039.

REFERENCES
[1] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to
structural similarity," IEEE Transactions on Image Processing 13(4), 600-612 (2004).
[2] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in
Proceedings of AISTATS 9, 249-256 (2010).
[3] S. Bianco, L. Celona, P. Napoletano, and R. Schettini, "On the use of deep learning for blind image quality
assessment," arXiv preprint arXiv:1602.05531 (2016).
[4] S. Bosse, D. Maniry, T. Wiegand, and W. Samek, "A deep neural network for image quality assessment," in
Image Processing (ICIP), 2016 IEEE International Conference on, 3773-3777 (2016).
[5] L. Kang, P. Ye, Y. Li, and D. Doermann, "Convolutional neural networks for no-reference image quality
assessment," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1733-
1740 (2014).
[6] L. Kang, P. Ye, Y. Li, and D. Doermann, "Simultaneous estimation of image quality and distortion via
multi-task convolutional neural networks," in Image Processing (ICIP), 2015 IEEE International Conference
on, 2791-2795 (2015).
[7] N. Joshi and A. Kapoor, "Learning a blind measure of perceptual image quality," in IEEE Conference on Computer
Vision and Pattern Recognition, 305-312 (2011).
[8] M. A. Saad, A. C. Bovik, and C. Charrier, "A DCT statistics-based blind image quality index," IEEE Signal
Processing Letters 17(6), 583-586 (2010).
[9] M. A. Saad, A. C. Bovik, and C. Charrier, "Blind image quality assessment: A natural scene statistics approach
in the DCT domain," IEEE Transactions on Image Processing 21(8), 3339-3352 (2012).
[10] D. Zeng, J. Huang, Z. Bian, S. Niu, H. Zhang, Q. Feng, Z. Liang, and J. Ma, "A simple low-dose X-ray CT
simulation from high-dose scan," IEEE Transactions on Nuclear Science 62, 2226-2233 (2015).
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks,"
in International Conference on Neural Information Processing Systems, Curran Associates Inc., 1097-1105 (2012).
[12] https://github.com/tensorflow/tensorflow
[13] X.-K. Song, “Correlated data analysis: modeling, analytics, and applications,” Springer Science and Busi-
ness Media (2007).
[14] T. D. Gauthier, "Detecting trends using Spearman's rank correlation coefficient," Environmental Foren-
sics 2(4), 359-362 (2001).
