Blind CT Image Quality Assessment Via Deep Learning Strategy: Initial Study
ABSTRACT
Computed tomography (CT) is one of the most important medical imaging modalities. CT images can be used to assist in the detection and diagnosis of lesions and to facilitate follow-up treatment. However, CT images are vulnerable to noise. There are two major sources that intrinsically cause noise in CT data, i.e., the X-ray photon statistics and the electronic noise background. Therefore, it is necessary to perform image quality assessment (IQA) in CT imaging before diagnosis and treatment. Most existing CT image IQA methods are based on human observer studies. However, these methods are impractical in clinical settings because they are complex and time-consuming. In this paper, we present a blind CT image quality assessment method based on a deep learning strategy. A database of 1500 CT images was constructed, containing 300 high-quality images and 1200 corresponding noisy images. Specifically, the high-quality images were used to simulate the corresponding noisy images at four different dose levels. The images were then scored by experienced radiologists on the following attributes: image noise, artifacts, edge and structure, overall image quality, and tumor size and boundary estimation, each on a five-point scale. We trained a network to learn the non-linear mapping from CT images to subjective evaluation scores, and the pre-trained model was then used to predict the score of each test image. To evaluate the performance of the deep learning network for IQA, two correlation coefficients are used: the Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank Order Correlation Coefficient (SROCC). The experimental results demonstrate that the presented deep learning based IQA strategy can be used for CT image quality assessment.
Keywords: Blind image quality assessment, deep learning, five-point scale IQA, CT images
Correspondence: J.M.: E-mail: [email protected]
1. INTRODUCTION
Computed tomography (CT) is a widely used imaging modality, allowing visualization of anatomical structures with high spatial and temporal resolution. CT images are subject to noise and artifacts during acquisition, processing, and reconstruction. Therefore, there is a need to assess CT image quality before diagnosis and treatment. IQA is an effective approach to assess the quality of perceived visual stimuli. IQA methods fall into two categories: subjective assessment by humans and objective assessment by algorithms designed to mimic the subjective judgment. Subjective assessment is generally regarded as the gold standard for evaluating images; however, it is not always suitable in practice because it is time- and labor-consuming, especially for clinical applications. According to the availability of a reference image, objective IQA metrics can be classified as full-reference (FR), reduced-reference (RR), and no-reference (NR/blind) IQA methods.1 In CT imaging, there is no perfect image available as a reference for IQA. Therefore, NR-IQA is needed for CT image quality assessment.
Recently, deep learning, in particular convolutional neural networks (ConvNets), has had an enormous impact on computer vision research and practice. The success of ConvNets can be attributed to many factors, including deep architectures, batch normalization (BN), the Rectified Linear Unit (ReLU), residual learning, and high-performance graphics processing units (GPUs).2 Deep structures are capable of extracting features from data to build increasingly abstract representations, replacing the traditional approach of carefully hand-crafting features and algorithms. This has encouraged researchers to explore the potential of ConvNets for IQA problems.3–6 The corresponding experimental results demonstrate significant improvements over previous hand-crafted approaches.7–9 It is noted that most of these IQA models were proposed for natural image quality assessment, and few have been used in clinical applications.
Inspired by the recent advances in deep learning methods for IQA, in this paper we present a blind CT image quality assessment method based on a deep learning strategy. A database of 1500 CT images was constructed, containing 300 high-quality images and 1200 corresponding noisy images. Specifically, the high-quality images were used to simulate the corresponding noisy images at four different dose levels. The images were then scored by experienced radiologists on the following attributes: image noise, artifacts, edge and structure, overall image quality, and tumor size and boundary estimation, each on a five-point scale. We trained a network to learn the non-linear mapping from CT images to subjective evaluation scores, and the pre-trained model was then used to predict the score of each test image. The predicted scores were validated using the Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank Order Correlation Coefficient (SROCC).
2. METHODOLOGY
2.1 Image Database
In this work, clinical data authorized by the Mayo Clinic were used to validate and evaluate the performance of the NR-IQA model. In the experiments, we simulated distorted CT images from the high-quality CT images using an established low-dose simulation technique.10 The high-quality images were used to simulate noisy images at four different dose levels. Fig. 1 shows a subset of the images used in this study. All CT images are in gray scale, and the images in this database are 512×512 pixels in size. The images were scored by experienced radiologists on the following attributes: image noise, artifacts, edge and structure, overall image quality, and tumor size and boundary estimation, each on a five-point scale: Excellent (5), Good (4), Fair (3), Poor (2), Bad (1). There are 1500 images in the database.
Figure 1. A subset of the CT images used in this study, with their subjective scores.
Figure 2. Architecture of the AlexNet-based IQA network (AlexNet-IQA-model): a stack of convolution layers followed by fully-connected layers maps an input CT image to a predicted quality score.
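The paper does not describe how the images and their scores are stored, so the following is only a minimal sketch of one possible arrangement, assuming a hypothetical CSV label file (ct_iqa_labels.csv) that pairs each 512×512 image with its dose level and the five attribute scores described above.

```python
import csv

# Hypothetical label file: one row per CT image with its dose level and the
# five subjective attributes, each rated on the 1-5 scale described above.
# Columns: path,dose_level,noise,artifacts,edge_structure,overall,tumor_estimation
def load_labels(csv_path):
    """Read the (assumed) label table into a list of dicts."""
    records = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            records.append({
                "path": row["path"],
                "dose_level": int(row["dose_level"]),  # 0 = full dose, 1-4 = simulated low doses
                "scores": {k: int(row[k]) for k in
                           ("noise", "artifacts", "edge_structure",
                            "overall", "tumor_estimation")},
            })
    return records

labels = load_labels("ct_iqa_labels.csv")  # hypothetical file name
print(len(labels))                          # expected: 1500 (300 high-quality + 1200 noisy)
```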
For IQA, we assess a whole image instead of patches from the image, because visual quality is a holistic property of an image. We used the TensorFlow framework.12 For training, we used the mini-batch Stochastic Gradient Descent (SGD) algorithm. The batch size, momentum, and weight decay for the mini-batch SGD were set to 16, 0.99, and 0.005, respectively. The filter weights of each layer were initialized from a zero-mean Gaussian distribution with standard deviation 0.01. The initial biases of each convolution layer were set to 0.
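The full layer configuration of the network is not listed in the text, so the following is only a minimal sketch of how the quoted settings (TensorFlow, mini-batch SGD, batch size 16, momentum 0.99, weight decay 0.005, zero-mean Gaussian weight initialization with standard deviation 0.01, zero biases) might be wired around an AlexNet-style regressor in tf.keras; the learning rate and the exact layer sizes are assumptions.

```python
import tensorflow as tf

# Initializers and L2 penalty matching the quoted training settings.
init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.01)
reg = tf.keras.regularizers.l2(0.005)  # weight decay 0.005 realized as an L2 penalty

def conv(filters, size, stride=1):
    return tf.keras.layers.Conv2D(filters, size, strides=stride, padding="same",
                                  activation="relu", kernel_initializer=init,
                                  bias_initializer="zeros", kernel_regularizer=reg)

# AlexNet-style stack of convolution layers plus fully-connected layers ending
# in a single quality score (the exact configuration is an assumption).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(512, 512, 1)),          # whole CT image, not patches
    conv(64, 11, stride=4), tf.keras.layers.MaxPool2D(3, 2),
    conv(192, 5),           tf.keras.layers.MaxPool2D(3, 2),
    conv(384, 3), conv(256, 3), conv(256, 3), tf.keras.layers.MaxPool2D(3, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4096, activation="relu", kernel_initializer=init,
                          bias_initializer="zeros", kernel_regularizer=reg),
    tf.keras.layers.Dense(4096, activation="relu", kernel_initializer=init,
                          bias_initializer="zeros", kernel_regularizer=reg),
    tf.keras.layers.Dense(1),                      # predicted quality score
])

# Mini-batch SGD with momentum 0.99; the learning rate is an assumed value.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.99),
              loss="mse")
# model.fit(train_images, train_scores, batch_size=16, epochs=...)
```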
The PLCC measures the prediction accuracy. Given N test images, the PLCC is computed as:

PLCC = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^{N} (\hat{y}_i - \bar{\hat{y}})^2}}, (1)

where N is the number of test images, the ground-truth score of the i-th image is denoted by y_i, and the score of the i-th image predicted by the network is denoted by \hat{y}_i. \bar{y} and \bar{\hat{y}} are the means of the ground-truth scores and the predicted scores, respectively.
The SROCC measures the prediction monotonicity. Given N test images, the SROCC is computed as:

SROCC = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N(N^2 - 1)}, (2)

where d_i is the difference between the ranks of the i-th image in the subjective scores and in the predicted scores.
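For reference, PLCC and SROCC as defined in Eqs. (1) and (2) can be computed directly; the sketch below uses SciPy's pearsonr and spearmanr, which implement the same definitions (spearmanr additionally handles tied ranks). The score arrays here are toy, hypothetical values.

```python
import numpy as np
from scipy import stats

def plcc(y, y_hat):
    """Pearson linear correlation coefficient, cf. Eq. (1)."""
    return stats.pearsonr(y, y_hat)[0]

def srocc(y, y_hat):
    """Spearman rank-order correlation coefficient, cf. Eq. (2)."""
    return stats.spearmanr(y, y_hat)[0]

# Toy example with hypothetical subjective and predicted scores.
y = np.array([5, 4, 3, 2, 1, 4, 3])
y_hat = np.array([4.8, 4.1, 3.2, 1.9, 1.2, 3.9, 2.7])
print(plcc(y, y_hat), srocc(y, y_hat))
```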
3. RESULTS
We randomly chose 1350 images of the database as training data and used the remaining 150 images as test data. The accuracy of the presented AlexNet-IQA-model is 98%. The consistency between the predicted quality scores and the subjective evaluation results is one of the most important criteria for evaluating the performance of IQA algorithms. Fig. 3 is the scatter plot of the subjective scores versus the scores predicted by the AlexNet-IQA-model in one trial on the entire test set. The x-coordinate is the subjective score, and the y-coordinate is the score predicted by the AlexNet-IQA-model. Fig. 3 thus shows the monotonicity and consistency between the predicted scores and the subjective scores: the results predicted by the AlexNet-IQA-model are highly consistent with the subjective evaluation scores. This demonstrates that the AlexNet-IQA-model has an impressive consistency with human perception.
Figure 3. Scatter plot of the predicted results vs. subjective scores for the AlexNet-IQA-model.
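A scatter plot in the style of Fig. 3 can be reproduced with matplotlib; this is a minimal sketch, assuming `subjective` and `predicted` are arrays holding the 150 test-set scores.

```python
import matplotlib.pyplot as plt

def plot_predictions(subjective, predicted, out_path="fig3_scatter.png"):
    """Scatter plot of predicted vs. subjective scores (cf. Fig. 3)."""
    plt.figure(figsize=(5, 5))
    plt.scatter(subjective, predicted, marker="o", alpha=0.6)
    plt.plot([1, 5], [1, 5], "k--", linewidth=1)  # perfect-agreement reference line
    plt.xlabel("subjective scores")
    plt.ylabel("predicted scores")
    plt.xlim(1, 5)
    plt.ylim(1, 5)
    plt.title("AlexNet-IQA-model")
    plt.savefig(out_path, dpi=150)
```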
Two performance indices are used to evaluate the efficacy of IQA methods: the Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank Order Correlation Coefficient (SROCC). The PLCC measures the prediction accuracy and the SROCC measures the prediction monotonicity. We randomly selected 80% of the images as the training dataset and the remaining 20% as the test dataset; the resulting performance is shown in Table 1.
Table 1. PLCC and SROCC measurements of the proposed AlexNet-IQA-model

         AlexNet-IQA-model
PLCC     0.9953
SROCC    0.9952
4. CONCLUSION
In this paper, we presented a blind CT image quality assessment method based on a deep learning strategy. A database of 1500 CT images was constructed, containing 300 high-quality images and 1200 corresponding noisy images. The predicted scores were validated using the Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank Order Correlation Coefficient (SROCC). The experimental results demonstrate that the presented deep learning based IQA strategy can be used for CT image quality assessment. For medical images, however, a single global measure of image quality is not a reliable measure of diagnostic accuracy. Radiologists often focus on a single focal lesion within the whole image, and that region may not have the best image quality overall. Therefore, in future work, we will not only enlarge our database and refine the NR-IQA model but also focus on focal-lesion image quality to develop new local NR-IQA methods.
ACKNOWLEDGMENTS
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 81371544
and 61571214, the China Postdoctoral Science Foundation funded project under Grant Nos. 2016M602489 and
2016M602488, the Guangdong Natural Science Foundation under Grant No. 2015A030313271, the Science
and Technology Program of Guangdong, China under Grant No. 2015B020233008, the Science and Technology
Program of Guangzhou, China under Grant No. 201510010039.
REFERENCES
[1] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing 13(4), 600-612 (2004).
[2] X. Glorot and Y. Bengio, "Understanding the difficulty of training deep feedforward neural networks," in Proceedings of AISTATS 9, 249-256 (2010).
[3] S. Bianco, L. Celona, P. Napoletano, and R. Schettini, "On the use of deep learning for blind image quality assessment," arXiv preprint arXiv:1602.05531 (2016).
[4] S. Bosse, D. Maniry, T. Wiegand, and W. Samek, "A deep neural network for image quality assessment," in 2016 IEEE International Conference on Image Processing (ICIP), 3773-3777 (2016).
[5] L. Kang, P. Ye, Y. Li, and D. Doermann, "Convolutional neural networks for no-reference image quality assessment," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1733-1740 (2014).
[6] L. Kang, P. Ye, Y. Li, and D. Doermann, "Simultaneous estimation of image quality and distortion via multi-task convolutional neural networks," in 2015 IEEE International Conference on Image Processing (ICIP), 2791-2795 (2015).
[7] N. Joshi and A. Kapoor, "Learning a blind measure of perceptual image quality," in IEEE Conference on Computer Vision and Pattern Recognition, 305-312 (2011).
[8] M. A. Saad, A. C. Bovik, and C. Charrier, "A DCT statistics-based blind image quality index," IEEE Signal Processing Letters 17(6), 583-586 (2010).
[9] M. A. Saad, A. C. Bovik, and C. Charrier, "Blind image quality assessment: A natural scene statistics approach in the DCT domain," IEEE Transactions on Image Processing 21(8), 3339-3352 (2012).
[10] D. Zeng, J. Huang, Z. Bian, S. Niu, H. Zhang, Q. Feng, Z. Liang, and J. Ma, "A simple low-dose X-ray CT simulation from high-dose scan," IEEE Transactions on Nuclear Science 62, 2226-2233 (2015).
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the International Conference on Neural Information Processing Systems, Curran Associates Inc., 1097-1105 (2012).
[12] https://github.com/tensorflow/tensorflow
[13] X.-K. Song, "Correlated Data Analysis: Modeling, Analytics, and Applications," Springer Science and Business Media (2007).
[14] T. D. Gauthier, "Detecting trends using Spearman's rank correlation coefficient," Environmental Forensics 2(4), 359-362 (2001).