No-Reference Blurred Image Quality Assessment by Structural Similarity Index

Zhang, Haopeng; Yuan, Bo; Dong, Bo; Jiang, Zhiguo

doi:10.3390/app8102003

¹ Image Processing Center, School of Astronautics, Beihang University, Beijing 100191, China

² Beijing Institute of Radio Measurement, Beijing 100854, China

³ Beijing Key Laboratory of Digital Media, Beihang University, Beijing 100191, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2018, 8(10), 2003; https://doi.org/10.3390/app8102003

Submission received: 9 September 2018 / Revised: 12 October 2018 / Accepted: 15 October 2018 / Published: 22 October 2018

Abstract

No-reference (NR) image quality assessment (IQA) objectively measures image quality, consistently with subjective evaluations, using only the distorted image. In this paper, we focus on the problem of NR IQA for blurred images and propose a new no-reference structural similarity (NSSIM) metric based on re-blur theory and the structural similarity index (SSIM). We extract blurriness features and define image blurriness by grayscale distribution. NSSIM scores image quality by calculating image luminance, contrast, structure and blurriness. The proposed NSSIM metric can evaluate image quality immediately without prior training or learning. Experimental results on four popular datasets show that the proposed metric outperforms SSIM and is well matched to state-of-the-art NR IQA models. Furthermore, we apply NSSIM together with known IQA approaches to blurred image restoration and demonstrate that NSSIM is statistically superior to peak signal-to-noise ratio (PSNR) and SSIM, and consistent with state-of-the-art NR IQA models.

Keywords: no-reference; re-blur; structural similarity; image quality assessment

1. Introduction

Advances in digital techniques enable people to capture, store and send large numbers of digital images easily, which sharply accelerates the rate of information transfer. Images captured by cameras in the real world are usually subject to distortions during acquisition, compression, transmission, processing, and reproduction. The information that an image conveys is related to image quality, so image quality is very significant for image acquisition and processing systems. If the quality of an image is bad, its processing result is usually bad as well. Therefore, image acquisition and processing systems in real applications need image quality assessment (IQA) to objectively and automatically identify and quantify these image quality degradations. In recent decades, many IQA methods have been proposed to solve this problem. IQA methods can be classified as full-reference (FR), no-reference (NR) and reduced-reference (RR) methods. FR methods, such as the mean squared error (MSE), the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [1], require the original undistorted images as references. RR methods need prior information about the original images. In practice, however, neither the original undistorted images nor such prior information is usually available. NR methods assess the quality of a distorted image using only the image itself, and are therefore more suitable for actual applications. In this paper, we confine our work to NR methods and focus on blur degradation.

The popular FR metric SSIM [1] offers mathematical convenience and matches known characteristics of the human visual system (HVS), but cannot be applied when pristine reference images are unavailable. Recent general NR models, such as [2,3,4], agree well with the HVS but lose computational convenience because they must be trained on distorted images in advance. In this paper, we propose an NR IQA method for blurred images. Our proposed metric, called no-reference structural similarity (NSSIM), is based on re-blur theory and SSIM, and can be regarded as an extension of SSIM from FR to NR. The first step is the re-blur process, i.e., blurring the distorted image with a Gaussian low-pass filter. The second is to quantify the blurriness of the image by its grayscale distribution. Finally, we combine luminance, contrast, structure and blurriness into a quality score. Our NSSIM can be easily computed using the input image and its re-blurred image, and no previous training is needed. Experimental results on four popular datasets validate that NSSIM performs better than the compared FR and NR methods. In addition, we apply NSSIM to evaluate the performance of blurred image restoration. The results demonstrate that NSSIM performs as well as state-of-the-art NR IQA metrics and yields better image quality evaluation than the existing FR IQA metrics PSNR and SSIM. The contribution of the paper is two-fold. First, we propose a novel definition of image blurriness based on the grayscale distribution of the image, and verify its effectiveness for blurriness measurement and its agreement with human subjective quality judgments. Second, we extend the well-known FR IQA metric SSIM to a no-reference setting, achieving state-of-the-art IQA performance without previous training.

The rest of this paper is organized as follows. In Section 2, we review previous works in IQA. In Section 3, we describe the blurriness definition and computation, and the procedure of our model. In Section 4, we evaluate the performance of the proposed approach by comparing it with the state of the art and applying it to blurred image restoration. Section 5 concludes the paper.

2. Related Works

FR IQA methods are usually used to obtain a quantitative evaluation of image quality. For example, PSNR is a classical FR IQA metric that relates the maximum possible pixel value to the mean squared error between the reference and distorted images; it is simple to compute but cannot accurately simulate the HVS. Another popular FR metric, SSIM [1], offers mathematical convenience and matches known characteristics of the HVS, but cannot be used when pristine reference images are unavailable. To avoid the requirement for undistorted reference images, NR IQA algorithms have been developed for applications in which such references are unavailable. According to their capability, NR IQA algorithms can be divided into distortion-specific and holistic models. In the following, we survey NR IQA algorithms that target blur and compression, and several holistic models.

2.1. Distortion-Specific NR IQA Algorithms

Distortion-specific NR IQA algorithms assume that the distortion type is known. Popular blur IQA algorithms model edge spreads and relate these spreads to perceived quality. Sang [5] proposes a blur IQA model that uses singular value decomposition (SVD) to evaluate image structural similarity. Caviedes [6] computes sharpness using the average 2D kurtosis of the 8 × 8 DCT blocks and spatial edge extent information. Ferzli [2] evaluates image quality by Just Noticeable Blur (JNB). Joshi [7] presents an NR IQA method based on the continuous wavelet transform. Similarly, the general approach to NR JPEG IQA is to measure edge strength at block boundaries and relate this strength, possibly together with some measure of image activity, to perceived quality. Feng [8] measures the visual impact of ringing artifacts in JPEG images. Meesters [9] detects the low-amplitude edges that result from blocking and estimates the edge amplitudes. Wang [10] evaluates image quality by designing a computationally inexpensive and memory-efficient feature extraction method and estimating the activity of the image signal. JPEG2000 ringing artifacts in an image are normally modeled by measuring edge spread using an edge-detection-based approach. For example, Sazzad [11] computes simple features in the spatial domain, Sheikh [12] assesses image quality with natural scene statistics (NSS) models, and Marziliano [13] calculates edge width by finding the start and end positions of each corresponding edge in the processed image.

2.2. Holistic NR IQA Algorithms

Holistic IQA algorithms are designed to measure distortions of unknown type. Holistic models extract common features of various distortions or establish separate models for different distortions. BIQI [14] subjects an image to a wavelet transform over three scales and three orientations using the Daubechies 9/7 wavelet basis [15], and assesses image quality with a two-step framework that first estimates the presence of a set of distortions and then evaluates the quality of the image along each of these distortions. BLIIND-II [3] is a multiscale but single-stage machine learning algorithm that operates in the DCT domain, where a number of features, i.e., scale- and orientation-selective statistics, correlations across scales, spatial correlation and across-orientation statistics, are computed from a natural scene statistics (NSS) model of block DCT coefficients. BRISQUE [4] models the statistics of locally normalized luminance coefficients in the spatial domain. Considering that images are naturally multiscale, BRISQUE captures 36 features from two scales to identify image distortions and perform distortion-specific quality assessment. A support vector regressor (SVR) [16] is used to build a regression module that predicts the quality score. NIQE [17] is founded on perceptually relevant spatial-domain NSS features extracted from local image patches that capture the essential low-order statistics of natural images. NIQE takes the distance between the quality-aware NSS feature model and the multivariate Gaussian (MVG) fit to the features extracted from the distorted image as the quality score.

The up-to-date NR IQA algorithm NR-CSR [18] applies convolutional sparse representation (CSR) to represent the entire image as a sum over a set of convolutions of coefficient maps, each of which has the same size as the image. NR-CSR uses a low-pass filter to obtain the sparse coefficients and calculates the gradient value to score the sharpness. Meanwhile, artificial neural networks have been used in recent NR IQA algorithms. For example, Fan [19] proposed an NR IQA algorithm based on multi-expert convolutional neural networks (MCNN), and Li [20] proposed an IQA model for realistic blurred images based on semantic feature aggregation (SFA).

It should be noted that state-of-the-art NR IQA algorithms such as BRISQUE and BLIIND-II agree well with the HVS but lose computational convenience because they must be trained on distorted images in advance. Moreover, deep learning methods such as MCNN incur heavy time costs and demanding hardware requirements.

3. The Proposed IQA Metric

Our proposed NR IQA metric for blurred images is designed to be applied without advance training on distorted or undistorted images. It can be regarded as an extension of SSIM from FR to NR by using re-blur theory, so we call it NSSIM. In this section, we describe NSSIM in four parts. First, we revisit the FR structural similarity index (SSIM). We then introduce re-blur, which produces the twice-blurred image. The third part describes feature extraction, where we define the image blurriness d from the grayscale histogram distribution of the image. Finally, we compare luminance, contrast, structure and blurriness between the original image and its re-blurred image to obtain a quality score, i.e., our NSSIM.

3.1. Structural Similarity

The FR metric SSIM [1] compares the luminance, contrast and structure of a distorted image x and a pristine reference image y, i.e.,

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} [c(x, y)]^{\beta} [s(x, y)]^{\gamma} \tag{1}$$

where $l(x, y)$, $c(x, y)$ and $s(x, y)$ represent the luminance, contrast and structure comparison functions, respectively, and $\alpha > 0$, $\beta > 0$ and $\gamma > 0$ are parameters that adjust the relative weight of the three components.

Weber’s Law [21] indicates that the magnitude of a just-noticeable luminance change $\Delta I$ is approximately proportional to the background luminance I for a wide range of luminance values. Thus the luminance comparison function is defined as

$$l(x, y) = l(\mu_{x}, \mu_{y}) = \frac{2 \mu_{x} \mu_{y} + C_{1}}{\mu_{x}^{2} + \mu_{y}^{2} + C_{1}} \tag{2}$$

where $\mu_{x} = \frac{1}{N} \sum_{i=1}^{N} x_{i}$ and $\mu_{y} = \frac{1}{N} \sum_{i=1}^{N} y_{i}$ represent the mean intensities of image x and image y, respectively (N is the total number of image pixels, and $x_{i}$ and $y_{i}$ are single pixels in x and y). $C_{1}$ is a positive constant that avoids instability when $\mu_{x}^{2} + \mu_{y}^{2}$ is very close to zero: $C_{1} = (K_{1} L)^{2}$, where $K_{1} \ll 1$ is a small constant and L is the dynamic range of the pixel values (e.g., 255 for 8-bit grayscale images).

Similarly, the contrast comparison function is

$$c(x, y) = c(\sigma_{x}, \sigma_{y}) = \frac{2 \sigma_{x} \sigma_{y} + C_{2}}{\sigma_{x}^{2} + \sigma_{y}^{2} + C_{2}} \tag{3}$$

where $C_{2} = (K_{2} L)^{2}$, $K_{2} \ll 1$, and $\sigma_{x} = \left(\frac{1}{N - 1} \sum_{i=1}^{N} (x_{i} - \mu_{x})^{2}\right)^{1/2}$ and $\sigma_{y} = \left(\frac{1}{N - 1} \sum_{i=1}^{N} (y_{i} - \mu_{y})^{2}\right)^{1/2}$ are the standard deviations of x and y, respectively.

The structure comparison function is defined as

$$s(x, y) = s\left(\frac{x - \mu_{x}}{\sigma_{x}}, \frac{y - \mu_{y}}{\sigma_{y}}\right) = \frac{\sigma_{xy} + C_{3}}{\sigma_{x} \sigma_{y} + C_{3}} \tag{4}$$

where $C_{3}$ is a small positive constant that avoids instability when $\sigma_{x} \sigma_{y}$ is close to zero, and $\sigma_{xy} = \frac{1}{N - 1} \sum_{i=1}^{N} (x_{i} - \mu_{x}) (y_{i} - \mu_{y})$.
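As a rough illustration of Equations (1) to (4), the following sketch computes the three comparison functions for two small, flattened grayscale images using only the Python standard library. The function names, the defaults K1 = 0.01 and K2 = 0.03 (values commonly used with SSIM) and the choice C3 = C2/2 are assumptions made for this example rather than details fixed by this subsection.

```python
import math

def ssim_components(x, y, L=255, K1=0.01, K2=0.03):
    """Return (l, c, s) for two equally sized, flattened grayscale images."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased (N - 1) estimates, matching Equations (3) and (4).
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sigma_x, sigma_y = math.sqrt(var_x), math.sqrt(var_y)
    C1 = (K1 * L) ** 2
    C2 = (K2 * L) ** 2
    C3 = C2 / 2  # assumption: a common choice, not stated in this subsection
    l = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    c = (2 * sigma_x * sigma_y + C2) / (var_x + var_y + C2)
    s = (cov_xy + C3) / (sigma_x * sigma_y + C3)
    return l, c, s

def ssim(x, y, alpha=1.0, beta=1.0, gamma=1.0, **kw):
    """Equation (1): weighted product of the three comparison functions."""
    l, c, s = ssim_components(x, y, **kw)
    # With the default exponent of 1 a negative s is handled without issue.
    return (l ** alpha) * (c ** beta) * (s ** gamma)

reference = [10, 50, 90, 200]
distorted = [60, 70, 65, 75]
score_same = ssim(reference, reference)   # identical images
score_diff = ssim(reference, distorted)   # lower-contrast version
```

Comparing an image with itself gives a score of 1 (up to rounding), while any difference in mean, spread or structure pulls the product below 1.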

3.2. Re-Blur

It is known that sharp images contain more high-frequency components, so grayscale variations between adjacent pixels in sharp images are more distinct than those in blurred images. The re-blur theory [17] explains that blurring a sharp image changes its quality more than blurring an already blurred image, which is also demonstrated in Figure 1. Given the input image x, we can take its re-blurred image y as the reference, and the image quality can be assessed by the quantity of high-frequency components measured between x and y.

The re-blur procedure is shown in Figure 2. Considering Gaussian blur as the distortion type in this paper, we apply a Gaussian kernel $k_{g}$ (i.e., a Gaussian low-pass filter) to the distorted image x to obtain a re-blurred image y, which is formulated as

$$y = x * k_{g} \tag{5}$$

where * is the convolution operator, and the Gaussian kernel $k_{g}$ is sampled from a two-dimensional Gaussian function

$$G(u, v) = \frac{1}{2 \pi \sigma^{2}} e^{-(u^{2} + v^{2}) / (2 \sigma^{2})} \tag{6}$$

It should be noted that $k_{g}$ is parameterized by the kernel size and the standard deviation $\sigma$. The impact of these parameters will be discussed in Section 4.
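A minimal sketch of Equations (5) and (6) in plain Python may make the re-blur step more concrete. It samples the Gaussian on an integer grid and convolves a small 2D image with it. Normalizing the kernel to sum to 1 and handling the border by mirror reflection are assumptions of this example (the text does not specify either for the re-blur step), and the helper names are hypothetical.

```python
import math

def gaussian_kernel(size=11, sigma=1.5):
    """Sample Equation (6) on a size x size grid centred at the origin."""
    half = size // 2
    k = [[math.exp(-(u * u + v * v) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
          for v in range(-half, half + 1)]
         for u in range(-half, half + 1)]
    total = sum(map(sum, k))
    # Assumption: normalise so that re-blurring preserves mean brightness.
    return [[w / total for w in row] for row in k]

def reflect(i, n):
    """Map an out-of-range index into [0, n) by mirror reflection."""
    while i < 0 or i >= n:
        i = -i - 1 if i < 0 else 2 * n - i - 1
    return i

def reblur(img, kernel):
    """Equation (5): y = x * k_g, returning an image of the same size."""
    rows, cols = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for du in range(-half, half + 1):
                for dv in range(-half, half + 1):
                    acc += (kernel[du + half][dv + half]
                            * img[reflect(r - du, rows)][reflect(c - dv, cols)])
            out[r][c] = acc
    return out

# A single bright pixel is spread out by the re-blur.
impulse = [[0.0] * 7 for _ in range(7)]
impulse[3][3] = 255.0
blurred = reblur(impulse, gaussian_kernel(5, 1.0))
```

The nested loops are slow for real images; an array library would normally be used, but the arithmetic is the same.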

We decompose an $M \times N \times 3$ image into the high-frequency part $I^{H}$ and the low-frequency part $I^{L}$. $I^{H}$ represents the drastically varying part while $I^{L}$ represents the mildly varying part; the image grayscale variation is sharper in areas of drastic change. In order to focus on the main part that contributes to image quality and to reduce the time cost, the input multi-dimensional image is down-sampled using a simple low-pass filter with factor $f = \max(1, \mathrm{round}(\min(M, N) / 256))$. If $f > 1$, the filter is defined as $\frac{1}{f \times f} \times lpf$, where

$$lpf = \begin{bmatrix} 1 & \dots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \dots & 1 \end{bmatrix}_{f \times f}$$

Values outside the bounds of the image are computed by mirror-reflecting the image across the border. The filtered image keeps the same size as the original image by restricting the points of the filtering template to the original image. In addition, we convert both the multi-dimensional original image and the filtered, i.e., re-blurred, image into two-dimensional form.
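The down-sampling factor and averaging filter described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and rounding halves upward is an assumption, since the text does not say which rounding convention is intended (Python's built-in `round` rounds halves to the nearest even integer, which would give a different factor for some image sizes).

```python
import math

def downsample_factor(M, N):
    """f = max(1, round(min(M, N) / 256)) with halves rounded upward."""
    # Assumption: round-half-up; round(2.5) in Python would give 2, not 3.
    return max(1, math.floor(min(M, N) / 256 + 0.5))

def mean_filter_kernel(f):
    """The f x f all-ones matrix lpf scaled by 1 / (f * f)."""
    return [[1.0 / (f * f)] * f for _ in range(f)]

f = downsample_factor(512, 768)   # a 768 x 512 image, as used later in the paper
kernel = mean_filter_kernel(f)
```

For images whose shorter side is below 384 pixels the factor is 1, in which case no averaging filter is applied.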

3.3. Feature Extraction

Considering the abundant information that images possess, the complexity of images can be represented based on their structure, noise and diversity [22], on fuzzy measures of entropy [23], or on discrete wavelet transform decomposition [24], etc. In this paper, to represent the input image and the re-blurred image, we extract the luminance, contrast, structure and blurriness of each of the down-sampled 2D images. Then, we extend the traditional structural similarity by combining blurriness with luminance, contrast and structure, as shown in Figure 3. This subsection concentrates on image blurriness.

The image histogram reports the grayscale distribution. As seen in Figure 4, sharp images have a broader grayscale range. In contrast, the grayscale distributions of blurred images are narrower and tend to concentrate around the mean value. As shown in Figure 4, grayscales close to the mean value $\mu$ of the blurred image (e.g., pixels whose grayscale is close to 132.92 in the right histogram in Figure 4) take up the largest proportion of the histogram. Thus we describe image blurriness by assigning different weights to different pixel values: heavy weights to pixels close to the image mean grayscale value and lighter weights to those far from it. We define the image blurriness as

$$d = \sum_{g_{i} = 0}^{L} p(g_{i}) w(g_{i}) \tag{7}$$

where d represents the image blurriness, $g_{i}$ is a gray value ranging from 0 to the dynamic range L (e.g., $L = 255$ for an 8-bit image), $p(g_{i})$ is the proportion of pixels with value $g_{i}$ in the whole image, and $w(g_{i})$ is the weight of $g_{i}$, which is calculated as

$$w(g_{i}) = \begin{cases} \frac{g_{i}}{\mu}, & g_{i} < \mu \\ \frac{L - g_{i}}{L - \mu}, & g_{i} \geqslant \mu \end{cases} \tag{8}$$
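To make Equations (7) and (8) concrete, the sketch below computes d from the histogram of a flattened 8-bit image. The function name is hypothetical, and the treatment of a fully saturated image (where $L - \mu = 0$) is an assumption added only to avoid division by zero.

```python
def blurriness(pixels, L=255):
    """Equation (7): histogram-weighted blurriness d of integer gray values."""
    n = len(pixels)
    mu = sum(pixels) / n
    hist = [0] * (L + 1)
    for g in pixels:
        hist[g] += 1
    d = 0.0
    for g, count in enumerate(hist):
        if count == 0:
            continue
        if g < mu:
            w = g / mu                      # Equation (8), lower branch
        elif mu < L:
            w = (L - g) / (L - mu)          # Equation (8), upper branch
        else:
            w = 1.0  # assumption: every pixel equals L, so treat it as at the mean
        d += (count / n) * w                # p(g_i) * w(g_i)
    return d

high_contrast = [0, 255] * 50     # values far from the mean
low_contrast = [127, 128] * 50    # values concentrated near the mean
d_sharp = blurriness(high_contrast)
d_blur = blurriness(low_contrast)
```

The weight peaks at 1 when a gray value equals the mean and falls linearly to 0 at the ends of the range, so an image whose values cluster around the mean receives a d close to 1, while a two-level black-and-white image receives a d of 0.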

Using the proposed image blurriness index, we test the images in Figure 1 to verify the impact of the re-blur process. With SSIM and image blurriness as image quality indexes, the results are shown in Table 1: the quality of a sharp image declines more drastically than that of a blurred image. We then analyze the image quality trend induced by re-blur, i.e., we blur the sharp image of Figure 1 repeatedly, with the number of blur passes varying from 0 to 11. Experimental results are shown in Figure 5. The proposed blurriness index grows approximately linearly, while the SSIM index declines progressively more slowly as the number of blur passes increases. This demonstrates that a high-quality image has a relatively small SSIM index when its blurred image is taken as the reference.

In order to verify the consistency between the proposed image blurriness metric and natural blurred images, we selected five Gaussian blur images from each of the four datasets, CSIQ [25], LIVE II [26], IVC [27] and TID2013 [28], as shown in Figure 6. Table 2 indicates that the proposed image blurriness metric has a positive correlation with the difference mean opinion scores (DMOS), i.e., the subjective quality scores, in the CSIQ and LIVE II datasets, and a negative correlation with the DMOS of the IVC and TID2013 datasets. It can be concluded that our blurriness metric represents the blurriness of natural images well and fits the subjective quality assessment of the HVS.

3.4. NSSIM Index

Similar to the definitions of luminance, contrast and structure in SSIM, we define the blurriness comparison function as

$$h(x, y) = \frac{2 d_{x} d_{y} + C_{4}}{d_{x}^{2} + d_{y}^{2} + C_{4}} \tag{9}$$

where $d_{x}$ and $d_{y}$ represent the blurriness of the distorted image x and its re-blurred image y, respectively, and $C_{4}$ avoids instability when $d_{x}^{2} + d_{y}^{2}$ is close to zero. Thus, we can combine luminance, contrast, structure and blurriness to obtain a new metric, $\mathrm{SSIM}_{r}$, as

$$\mathrm{SSIM}_{r}(x, y) = [l(x, y)]^{\alpha} [c(x, y)]^{\beta} [s(x, y)]^{\gamma} [h(x, y)]^{\lambda} \tag{10}$$

where $\lambda$ is the exponent coefficient of $h(x, y)$. To better capture the blurriness in local areas of an image, we partition the image into $P \times P$ patches of the same size and compute the mean $\mathrm{SSIM}_{r}$ as

$$\mathrm{MSSIM}_{r}(x, y) = \frac{1}{M} \sum_{i = 1}^{M} \mathrm{SSIM}_{r}(x^{i}, y^{i}) \tag{11}$$

where $M = P \times P$ is the number of patches (not the image height used in Section 3.2), and $x^{i}$ and $y^{i}$ are the i-th patches in x and y, respectively. Finally, to make the quality score accord with subjective impression, i.e., a high-quality image gets a high IQA score, we define our proposed NSSIM index as

$$\mathrm{NSSIM}(x, y) = 1 - \mathrm{MSSIM}_{r}(x, y) \tag{12}$$
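Equations (9) to (12) can be sketched end to end in plain Python as below. This is a simplified illustration, not the authors' implementation: it assumes 2D grayscale inputs given as nested lists, unit exponents, the constants reported in Section 4.3.3 taken at face value, and patches formed by integer division of the image size (any leftover rows or columns are ignored). All function names are hypothetical.

```python
import math

def blurriness(p, L=255):
    """Equations (7)-(8) as a per-pixel average of the weights."""
    mu = sum(p) / len(p)
    total = 0.0
    for g in p:
        if g < mu:
            total += g / mu
        elif mu < L:
            total += (L - g) / (L - mu)
        else:
            total += 1.0  # assumption: saturated flat patch gets weight 1
    return total / len(p)

def ssim_r(px, py, L=255, C1=0.01, C2=0.03):
    """Equation (10) with alpha = beta = gamma = lambda = 1."""
    n = len(px)
    mx, my = sum(px) / n, sum(py) / n
    vx = sum((a - mx) ** 2 for a in px) / (n - 1)
    vy = sum((b - my) ** 2 for b in py) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(px, py)) / (n - 1)
    C3, C4 = C2 / 2, C2
    l = (2 * mx * my + C1) / (mx * mx + my * my + C1)
    c = (2 * math.sqrt(vx * vy) + C2) / (vx + vy + C2)
    s = (cxy + C3) / (math.sqrt(vx * vy) + C3)
    dx, dy = blurriness(px, L), blurriness(py, L)
    h = (2 * dx * dy + C4) / (dx * dx + dy * dy + C4)   # Equation (9)
    return l * c * s * h

def nssim(x, y, P=16, L=255):
    """Equations (11)-(12): one minus the mean SSIM_r over P x P patches."""
    rows, cols = len(x), len(x[0])
    ph, pw = rows // P, cols // P   # patch height and width
    total = 0.0
    for i in range(P):
        for j in range(P):
            px = [x[r][c] for r in range(i * ph, (i + 1) * ph)
                  for c in range(j * pw, (j + 1) * pw)]
            py = [y[r][c] for r in range(i * ph, (i + 1) * ph)
                  for c in range(j * pw, (j + 1) * pw)]
            total += ssim_r(px, py, L)
    return 1.0 - total / (P * P)

# Toy check: a checkerboard versus a completely flattened "re-blur" of it.
board = [[0, 255, 0, 255], [255, 0, 255, 0], [0, 255, 0, 255], [255, 0, 255, 0]]
flat = [[127.5] * 4 for _ in range(4)]
score = nssim(board, flat, P=2)
```

In this toy case the checkerboard loses all of its contrast under the flattened reference, so the score is close to 1; comparing an image with an identical copy gives a score close to 0.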

4. Performance Evaluation

4.1. Datasets

We tested the proposed NSSIM on four popular datasets for IQA: CSIQ [25], LIVE II [26], IVC [27] and TID2013 [28]. All datasets consist of several subsets of different distortion types. In this paper, we used the Gaussian blur distortion for experiments. In particular, the CSIQ dataset contains 30 reference images and 150 Gaussian blur images, the LIVE II dataset contains 29 reference images and 145 Gaussian blur images, the IVC dataset contains four reference images and 20 Gaussian blur images, and the TID2013 dataset contains 25 reference images and 125 Gaussian blur images. All blur images in these datasets were used together with their DMOS as subjective quality scores. Some samples of Gaussian blur images used in the experiments are shown in Figure 6 together with their DMOS. It should be noted that DMOS scores provided by CSIQ [25] and LIVE II [26] have a positive correlation with blurriness scores, while DMOS scores provided by IVC [27] and TID2013 [28] change in the opposite direction. As shown in Figure 6, the datasets used for performance evaluation contain blur images of different kinds of natural scenes that are broadly distributed, and are thus suitable for statistical analysis of the evaluation results.

4.2. Indexes for Evaluation

In order to provide a quantitative measurement of the performance of our proposed model, we follow the performance evaluation procedures employed by the video quality experts group (VQEG) [29]. We test the proposed IQA metric using Spearman’s rank order correlation coefficient (SROCC), the Pearson linear correlation coefficient (PLCC) and the root mean square error (RMSE) as evaluation indexes. SROCC, PLCC and RMSE are respectively defined as

$$\mathrm{SROCC} = 1 - \frac{6 \sum_{i = 1}^{N} (R_{s_{i}} - R_{p_{i}})^{2}}{N (N^{2} - 1)} \tag{13}$$

$$\mathrm{PLCC} = \frac{\sum_{i = 1}^{N} (s_{i} - \bar{s}) (p_{i} - \bar{p})}{\sqrt{\sum_{i = 1}^{N} (s_{i} - \bar{s})^{2} \sum_{i = 1}^{N} (p_{i} - \bar{p})^{2}}} \tag{14}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (p_{i} - s_{i})^{2}} \tag{15}$$

where N is the number of images, $s_{i}$ and $p_{i}$ represent the i-th scores given by subjective evaluation and objective evaluation, $\bar{s}$ and $\bar{p}$ represent the mean subjective quality score and the mean objective predicted score, and $R_{s_{i}}$ and $R_{p_{i}}$ represent the rank order of $s_{i}$ and $p_{i}$, respectively. SROCC is a nonparametric measure of rank correlation that statistically assesses how well the relationship between the subjective quality score and the objective predicted score can be described by a monotonic function, while PLCC is a measure of the linear correlation between them. RMSE is the square root of the mean of the squared differences between the subjective quality scores and the objective predicted scores, and is a frequently used measure. Using these three statistical measures, we can easily analyze the consistency between the subjective quality score and the objective predicted score, which indicates the capability of IQA methods.
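The three evaluation indexes can be computed with a few lines of standard-library Python, as sketched below. The function names are hypothetical. The SROCC function uses the rank-difference form of Equation (13), which assumes there are no tied scores, and the PLCC function uses the standard Pearson form in which the two sums of squares are multiplied under the square root.

```python
import math

def ranks(values):
    """Rank positions (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def srocc(s, p):
    """Spearman's rank order correlation via squared rank differences."""
    n = len(s)
    rs, rp = ranks(s), ranks(p)
    return 1 - 6 * sum((a - b) ** 2 for a, b in zip(rs, rp)) / (n * (n * n - 1))

def plcc(s, p):
    """Pearson linear correlation coefficient."""
    n = len(s)
    ms, mp = sum(s) / n, sum(p) / n
    num = sum((a - ms) * (b - mp) for a, b in zip(s, p))
    den = math.sqrt(sum((a - ms) ** 2 for a in s) * sum((b - mp) ** 2 for b in p))
    return num / den

def rmse(s, p):
    """Root mean square error between subjective and predicted scores."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(s, p)) / len(s))

# Hypothetical scores, made up purely to exercise the functions.
subjective = [1, 2, 3, 4, 5]
predicted = [2, 4, 6, 8, 10]
results = (srocc(subjective, predicted), plcc(subjective, predicted),
           rmse(subjective, predicted))
```

In practice RMSE is only meaningful when the predicted scores are on the same scale as the subjective scores, which is why IQA studies commonly fit a mapping between the two before computing it; that step is outside this sketch.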

4.3. Parameter Setting

In this subsection, we discuss the re-blur method, including the blur type and filter parameters. We also discuss the exponent coefficients of luminance, contrast, structure and blurriness, i.e., $\alpha$, $\beta$, $\gamma$ and $\lambda$ in Equation (10), to achieve the best performance of our proposed metric.

4.3.1. Filter Type and Parameter for Re-Blur

As mentioned in Section 3.2, we apply a Gaussian low-pass filter to obtain the re-blurred image of the input image. We compared three types of filters for re-blur, i.e., Gaussian blur, motion blur and defocus blur. The results on the LIVE II dataset are given in Table 3. We can see that Gaussian blur leads to the highest SROCC. This result is easy to understand, since the filter used for re-blur is then of the same type as the distortion of the images in the LIVE II dataset. Thus we chose the Gaussian filter for re-blur in the experiments. Furthermore, the size and standard deviation of the Gaussian filter affect the running time. According to the experimental results shown in Table 4, we use an 11 × 11 Gaussian filter with standard deviation 1.5.

4.3.2. Patch Quantity

Since we partition an image into $P \times P$ patches when calculating NSSIM, the size and number of the patches affect the processing time. In Table 5, we list the running time (in seconds) taken to compute the quality score of an image of resolution 768 × 512 with 24-bit color from the LIVE II dataset on a 2.6 GHz single-core PC with 4 GB of RAM. When $P = 16$, each patch is 48 × 32 × 3 and evaluation takes 2.4373 s on average, which is the shortest running time. Thus, we set $P = 16$ in the experiments in this paper.

4.3.3. Exponent Coefficient of Blurriness Comparison Function

In this section, we evaluate the contribution of $\lambda$, the exponent coefficient of the blurriness comparison function $h(x, y)$. The impact of $\lambda$ tested on the LIVE II [26] dataset is shown in Figure 7, with the other parameters the same as in SSIM [1]. SROCC, PLCC and RMSE reach their best values when $\lambda = 1$, and when $\lambda = 0$ our NSSIM degrades to the traditional SSIM [1]. To achieve the best performance and simplify the expression, we set $\alpha = \beta = \gamma = \lambda = 1$, $C_{1} = 0.01$, $C_{2} = 0.03$, $C_{3} = C_{2} / 2$, $C_{4} = C_{2}$.

4.4. Comparison with the State of the Art

We compared the performance of NSSIM against PSNR, the original SSIM [1], and several state-of-the-art NR IQA models such as BRISQUE [4], BLIIND-II [3], MCNN [19], IQA-CWT [7], SFA [20], etc. In order to evaluate whether the differences between the proposed metric and the existing IQA algorithms are statistically significant, we performed paired-sample t-tests and report the p-values. The null hypothesis in our t-tests is that the pairwise difference between the proposed metric and another metric has a mean equal to zero, i.e., the differences in performance presented in the results are not statistically significant. A p-value < 0.05 indicates rejection of the null hypothesis at the 5% significance level, meaning that the differences are statistically significant. It should be noted that the results of some IQA algorithms, such as NR-CSR [18], MCNN [19], IQA-CWT [7], and SFA [20], are collected from the corresponding references, so their p-values are not shown in the result tables.
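The paired-sample t-test described above can be sketched with the standard library as follows. The numbers are hypothetical values invented for this example and are not results from the paper. Because the standard library has no Student's t distribution, the sketch compares the statistic with the well-known two-sided 5% critical value for 9 degrees of freedom (2.262) instead of computing a p-value; the function name is also an assumption.

```python
import math
import statistics

def paired_t(a, b):
    """Return the paired-sample t statistic and its degrees of freedom."""
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical SROCC values of two metrics on 10 random subsets
# (made up for illustration only).
metric_a = [0.91, 0.90, 0.92, 0.89, 0.93, 0.91, 0.90, 0.92, 0.91, 0.90]
metric_b = [0.88, 0.89, 0.90, 0.87, 0.90, 0.89, 0.88, 0.89, 0.90, 0.88]

t_stat, dof = paired_t(metric_a, metric_b)
T_CRIT_DF9 = 2.262                          # two-sided, 5% level, 9 degrees of freedom
reject_null = abs(t_stat) > T_CRIT_DF9
```

Rejecting the null hypothesis here means the mean pairwise difference is unlikely to be zero; it says nothing about which metric is preferable beyond the sign of the statistic.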

As seen from Table 6, NSSIM achieves a PLCC of 0.8971 and an RMSE of 0.1266 on the CSIQ [25] dataset, better than the SSIM and MCNN metrics. For the t-test, we randomly sampled 100 images from the CSIQ dataset 10 times, so that 10 samples of SROCC, PLCC and RMSE were obtained for each algorithm in the comparison. The p-values in Table 6 show that the t-tests reject the null hypothesis at the 5% significance level, i.e., the alternative hypothesis is accepted that the pairwise difference between NSSIM and the other metrics does not have a mean equal to zero. This confirms that the differences between the metrics are statistically significant.

Table 7 shows that NSSIM achieves a PLCC of 0.9689, better than the other ten algorithms on the LIVE II [26] dataset, and its SROCC of 0.9464 ranks third among the 11 metrics. A paired-sample t-test was applied in the same way as on the CSIQ [25] dataset. The p-values demonstrate the statistically significant improvement of the proposed metric in terms of PLCC.

Table 8 indicates that NSSIM achieves a PLCC of 0.9239 and an RMSE of 0.4367, which are the best among all nine algorithms tested on the IVC [27] dataset. We also ran a paired-sample t-test by randomly sampling 15 images from the IVC dataset 10 times. The p-values show a statistically significant improvement of the proposed metric in terms of PLCC and RMSE.

As seen from Table 9, the SROCC of NSSIM is 0.8995, which beats the other FR and NR IQA algorithms on the TID2013 [28] dataset. A paired-sample t-test was performed in the same way as on the CSIQ and LIVE II datasets. The p-values demonstrate that the SROCC performance of NSSIM is robustly superior to that of the other algorithms.

Furthermore, Table 10 gives the means and standard deviations of SROCC, PLCC, and RMSE of the tested IQA algorithms on the four datasets. Only IQA algorithms tested on all four datasets are included. It can be seen from Table 10 that our NSSIM metric achieves the highest mean SROCC and the third-best mean PLCC and RMSE. Meanwhile, the t-test results shown in Table 10 lead to acceptance of the null hypothesis at the 5% significance level, indicating that the differences in average performance between NSSIM and the other metrics across datasets are not statistically significant. This is understandable, since our NSSIM cannot achieve a significant improvement on every dataset in terms of all indexes. However, considering the differences in image complexity between datasets, such as image size, contrast and diversity, the experimental results validate that the proposed NSSIM can be adapted to different image categories. Noting that [3,4,17] all require a prior training procedure, our NSSIM offers the best balance between IQA performance and time efficiency.

These test results also validate that NSSIM demonstrates the relationship between quantified image naturalness and perceptual image quality. NSSIM establishes a simple method to identify image quality without a reference or prior training on human judgments of blurred images. Besides, compared with the up-to-date NR IQA metrics NR-CSR [18], MCNN [19] and SFA [20], NSSIM is less time-consuming since it needs no training or learning procedure.

4.5. Consistency with Subjective DMOS Scores

We analyzed the consistency between NSSIM scores and the subjective DMOS scores on the four datasets. Scatter plots of NSSIM and DMOS are shown in Figure 8. For the CSIQ and LIVE II datasets, our NSSIM has a negative correlation with DMOS because NSSIM decreases as image blurriness increases (a high-quality image gets a high NSSIM score), while the DMOS in these datasets increases with image blurriness. For the IVC and TID2013 datasets, whose subjective scores change in the opposite direction, NSSIM and the subjective scores both decrease as image blurriness increases. The experimental results demonstrate that our NSSIM has good consistency with the HVS and can thus be used for IQA effectively.

4.6. IQA for Blurred Image Restoration

The purpose of image restoration is to reduce or remove the image degradation introduced during acquisition, compression, transmission, processing, and reproduction. IQA can be used to evaluate an image restoration algorithm by assessing the quality of the distorted image and of the restored image. Sroubek [30] presented a deconvolution algorithm for decomposition and approximation of space-variant blur using the alternating direction method of multipliers. Kotera [31] proposed a blind deconvolution algorithm using the variational Bayesian approximation with the automatic relevance determination model on the likelihood and on the image and blur priors. In this section, we use the proposed NSSIM to evaluate the performance of image restoration. Two groups of images, each including the original image, the blurred image and the restorations, are evaluated by the proposed IQA algorithm NSSIM, by PSNR and SSIM, and by several state-of-the-art NR IQA algorithms. The experimental results are shown in Figure 9 and Figure 10 and in Table 11 and Table 12.

It can be seen from Figure 9 and Table 11 that PSNR and SSIM [1] fail to evaluate the quality of both restorations, since their quality scores are smaller than that of the blurred image. BIQI [14] and NIQE [17] succeed in identifying the restored images, but the difference between the two restorations is slight. In contrast, BRISQUE [4], BLIIND-II [3], SFA [20] and the proposed NSSIM metric achieve better precision, and the differences between the two restorations are distinct. Moreover, the NSSIM-predicted score of the blurred image is $0.0709 \times 10^{-4}$, showing an obvious difference between the blurred image and the original image. This demonstrates that NSSIM is extremely sensitive to blur. Similarly, Figure 10 and Table 12 also demonstrate that NSSIM is suitable for blur IQA.

5. Conclusions

IQA is important and useful for image acquisition and processing systems in many applications. In this paper, we focused on IQA for blurred images and proposed a novel NR IQA metric called NSSIM, based on SSIM and re-blur theory. NSSIM retains the mathematical convenience of SSIM and extends it from FR to NR: we blur the distorted image and take the re-blurred image as the reference. Image blurriness is defined by evaluating the grayscale distribution. We score image quality using four image features: luminance, contrast, structure and blurriness. We discussed the impact of the algorithm's parameters on performance and tested NSSIM on four datasets. The experimental results show that NSSIM achieves promising performance and is consistent with the HVS. Compared to existing IQA models, NSSIM needs neither a reference image nor a prior training or learning procedure, which makes it more time-efficient and convenient to apply. We also applied the proposed metric to IQA for image restoration, which indicates that it is practically useful. We believe that NSSIM has great potential for application in unconstrained environments.

Author Contributions

Conceptualization, H.Z., B.Y. and B.D.; Methodology, H.Z., B.Y. and B.D.; Software, B.Y. and B.D.; Validation, H.Z., B.Y. and B.D.; Formal Analysis, H.Z. and B.Y.; Writing—Original Draft Preparation, H.Z., B.Y. and B.D.; Visualization, B.Y.; Supervision, H.Z. and Z.J.; Project Administration, H.Z. and Z.J.; Funding Acquisition, H.Z. and Z.J.

Funding

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61501009, 61771031, and 61371134), the National Key Research and Development Program of China (2016YFB0501300 and 2016YFB0501302), and the Fundamental Research Funds for the Central Universities.

Conflicts of Interest

The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From error measurement to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
2. Ferzli, R.; Karam, L.J. A no-reference objective image sharpness metric based on the notion of Just Noticeable Blur (JNB). IEEE Trans. Image Process. 2009, 18, 717–728.
3. Saad, M.A.; Bovik, A.C.; Charrier, C. Blind image quality assessment: A natural scene statistics approach in the DCT domain. IEEE Trans. Image Process. 2012, 21, 3339–3352.
4. Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
5. Sang, Q.; Liang, D.; Wu, X.; Li, C. No reference quality assessment algorithm for blur and noise image using support vector regression. J. Optoelectron. Laser 2014, 25, 595–601.
6. Caviedes, J.; Oberti, F. A new sharpness metric based on local kurtosis, edge and energy information. Signal Process. Image Commun. 2003, 19, 147–161.
7. Joshi, P.; Prakash, S. Continuous wavelet transform based no-reference image quality assessment for blur and noise distortions. IEEE Access 2018, 6, 33871–33882.
8. Feng, X.; Allebach, J.P. Measurement of Ringing Artifacts in JPEG Images; School of Electrical and Computer Engineering, Purdue University: West Lafayette, IN, USA, 2006.
9. Meesters, L.; Martens, J.B. A single-ended blockiness measure for JPEG-coded images. Signal Process. 2002, 82, 369–387.
10. Wang, Z.; Sheikh, H.R.; Bovik, A.C. No-reference perceptual quality assessment of JPEG compressed image. In Proceedings of the IEEE International Conference on Image Processing, Rochester, NY, USA, 22–25 September 2002; pp. 477–480.
11. Sazzad, Z.M.P.; Kawayoke, Y.; Horita, Y. No reference image quality assessment for JPEG2000 based on spatial features. Signal Process. Image Commun. 2008, 23, 257–268.
12. Sheikh, H.R.; Bovik, A.C.; Cormack, L. No-reference quality assessment using natural scene statistics: JPEG2000. IEEE Trans. Image Process. 2005, 14, 1918–1927.
13. Marziliano, P.; Dufaux, F.; Winkler, S.; Ebrahimi, T. Perceptual blur and ringing metrics: Application to JPEG2000. Signal Process. Image Commun. 2004, 19, 163–172.
14. Moorthy, A.K.; Bovik, A.C. A two-step framework for constructing blind image quality indices. IEEE Signal Process. Lett. 2010, 17, 513–516.
15. Daubechies, I. Ten Lectures on Wavelets; SIAM: Philadelphia, PA, USA, 1992; ISBN 978-0-898712-74-2.
16. Scholkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New support vector algorithms. Neural Comput. 2000, 21, 515–519.
17. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212.
18. Wu, J.; Zhang, P.; Fei, C.; Lu, S.; Niu, W. No-reference image sharpness assessment with convolutional sparse representation. In Proceedings of the 2017 14th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 15–17 December 2017; pp. 189–192.
19. Fan, C.L.; Zhang, Y.; Feng, L.B.; Jiang, Q.S. No reference image quality assessment based on multi-expert convolutional neural networks. IEEE Access 2018, 6, 8934–8943.
20. Li, D.Q.; Jiang, T.T.; Jiang, M. Exploiting high-level semantics for no-reference image quality assessment of realistic blur images. In Proceedings of the 2017 ACM on Multimedia Conference, Mountain View, CA, USA, 23–27 October 2017; pp. 378–386.
21. Strobel, N.; Mitra, S.K. Quadratic filters for image contrast enhancement. In Proceedings of the 1994 28th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 31 October–2 November 1994.
22. Le, P.Q.; Iliyasu, A.M.; Sanchez, J.A.G.; Dong, F.Y.; Hirota, K. Representing Visual Complexity of Images Using a 3D Feature Space Based on Structure, Noise, and Diversity. J. Adv. Comput. Intell. Intell. Inform. 2012, 16, 631–640.
23. Cardaci, M.; Gesù, V.D.; Petrou, M.; Tabacchi, M.E. A fuzzy approach to the evaluation of image complexity. Fuzzy Sets Syst. 2009, 160, 1474–1484.
24. Iliyasu, A.M.; Al-Asmari, A.K.; Salama, A.S.; Al-Qodah, M.A.; Elwahab, M.A.A.; Le, P.Q. A Visual Complexity-sensitive DWT Ordering Scheme for Hiding Data in Images. Res. J. Appl. Sci. Eng. Technol. 2014, 7, 3286–3297.
25. Larson, E.C.; Chandler, D.M. Most apparent distortion: Full-reference image quality assessment and the role of strategy. J. Electron. Imag. 2010, 19, 011006.
26. Sheikh, H.R.; Wang, Z.; Cormack, L.; Bovik, A.C. LIVE Image Quality Assessment Database Release 2. 2003. Available online: http://live.ece.utexas.edu/research/quality/ (accessed on 14 September 2018).
27. Callet, P.L.; Autrusseau, F. Subjective Quality Assessment-IVC Database. 2006. Available online: http://www2.irccyn.ec-nantes.fr/ivcdb/ (accessed on 14 September 2018).
28. Ponomarenko, N.; Jin, L.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Astola, J.; Vozel, B.; Chehdi, K.; Carli, M.; Battisti, F.; et al. Image database TID2013: Peculiarities, results and perspectives. Signal Process. Image Commun. 2015, 30, 57–77.
29. Video Quality Experts Group (VQEG). Final Report from the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment. 2000. Available online: http://www.its.bldrdoc.gov/vqeg/vqeg-home.aspx (accessed on 8 October 2018).
30. Sroubek, F.; Kamenicky, J.; Lu, Y.M. Decomposition of space-variant blur in image deconvolution. IEEE Signal Process. Lett. 2016, 23, 346–350.
31. Kotera, J.; Smidl, V.; Sroubek, F. Blind deconvolution with model discrepancies. IEEE Trans. Image Process. 2017, 26, 2533–2544.

Figure 1. A sharp image (left) suffers a more conspicuous quality decline than a blurred image (middle) after blur processing (right).

Figure 2. Re-blur Process.

Figure 3. Feature extraction and structural similarity measurement. We extract four features (luminance, contrast, structure and blurriness) from the down-sampled 2D mode images and compute the structural similarity index by combining the four features.

Figure 4. Sharp (left) and Gaussian blurred (right) images with their grayscale histograms.

Figure 5. Trend of image quality and blurriness as the number of Gaussian blur passes varies from 0 to 11.

Figure 6. Sample images from the four datasets together with their subjective DMOS scores. Each set contains five Gaussian-blurred images with graded DMOS scores.

Figure 7. Analysis of performance of SROCC, PLCC and RMSE with λ varying from 0 to 5 on the LIVE II dataset (145 Gaussian blur images), where λ represents the exponent coefficient of the blurriness comparison function in Equation (10).

Figure 8. Scatter plots of DMOS vs NSSIM predicted scores on four datasets.

Figure 9. Group 1 restorations: The original image (480 × 720 × 3) is from the LIVE II dataset, and the blurred image is produced by an 11 × 11 Gaussian low-pass filter with deviation 1.5. The restorations are produced by Sroubek [30] and Kotera [31], respectively.

Figure 10. Group 2 restorations: The original image (512 × 512 × 3) is from the IVC dataset, and the blurred image is produced by an 11 × 11 Gaussian low-pass filter with deviation 1.5. The restorations are produced by Sroubek [30] and Kotera [31], respectively.

Table 1. Relationship between SSIM and image blurriness of Figure 1.

Image	SSIM	Blurriness
Sharp	1.0000	1016.6353
Blur	0.8442	1025.8226
Re-blur	0.7839	1034.1684
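
The re-blur effect summarized in Table 1 can be reproduced on a toy image. The following is a minimal sketch, not the authors' implementation: it uses a synthetic step-edge image, a 5 × 5 Gaussian kernel with deviation 1.0, and a single-window (global) form of the standard SSIM formula from [1] computed with population statistics. The function names and all parameter values are assumptions chosen for illustration, and the blurriness term of NSSIM is not included.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size Gaussian low-pass kernel."""
    r = size // 2
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
          for x in range(-r, r + 1)] for y in range(-r, r + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]


def blur(img, kernel):
    """2D filtering with edge replication at the image borders."""
    h, w = len(img), len(img[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                yy = min(max(y + dy, 0), h - 1)
                for dx in range(-r, r + 1):
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * img[yy][xx]
            out[y][x] = acc
    return out


def global_ssim(a, b, dynamic_range=255.0):
    """SSIM over one window covering the whole image (illustrative only)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((v - mx) ** 2 for v in xs) / n
    vy = sum((v - my) ** 2 for v in ys) / n
    cxy = sum((p - mx) * (q - my) for p, q in zip(xs, ys)) / n
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2))


# Toy 16 x 16 image with a vertical step edge (assumed test pattern).
sharp = [[0.0] * 8 + [255.0] * 8 for _ in range(16)]
kernel = gaussian_kernel(5, 1.0)

blurred = blur(sharp, kernel)        # one blur pass
reblurred = blur(blurred, kernel)    # a second pass on the blurred image

ssim_sharp = global_ssim(sharp, blurred)        # sharp vs. its re-blur
ssim_blurred = global_ssim(blurred, reblurred)  # blurred vs. its re-blur
print(f"sharp vs re-blur:   {ssim_sharp:.4f}")
print(f"blurred vs re-blur: {ssim_blurred:.4f}")
```

Under these assumptions, the sharp image loses more similarity after one blur pass than the already blurred image does, which is the behavior the re-blur approach relies on.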

Table 2. Blurriness of four sets of images shown in Figure 6, demonstrating the consistency between the proposed image blurriness metric and DMOS, i.e., the subjective scores given by the datasets.

CSIQ		LIVE II		IVC		TID2013
DMOS	Blurriness	DMOS	Blurriness	DMOS	Blurriness	DMOS	Blurriness
0.043	1899.433	0.562467	4047.479	4.4	2574.040	5.18919	2142.768
0.142	1916.220	0.963510	4061.419	3.3	2612.444	4.27222	2171.592
0.341	1939.591	1.450490	4073.350	2.6	2636.649	3.48649	2209.623
0.471	1976.500	2.510388	4092.591	1.9	2658.148	3.00000	2250.459
0.750	2044.500	3.541641	4107.935	1.4	2671.689	2.11111	2285.712
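
The rank agreement claimed for Table 2 can be checked directly from the listed values. The sketch below computes Spearman's rank order correlation coefficient (SROCC) between the subjective scores and the blurriness values in each dataset column. The helper names (`ranks`, `pearson`, `srocc`) are hypothetical and written from the textbook definitions; they are not taken from the authors' code.

```python
def ranks(values):
    """1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def srocc(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# (subjective scores, blurriness) copied from Table 2.
table2 = {
    "CSIQ": ([0.043, 0.142, 0.341, 0.471, 0.750],
             [1899.433, 1916.220, 1939.591, 1976.500, 2044.500]),
    "LIVE II": ([0.562467, 0.963510, 1.450490, 2.510388, 3.541641],
                [4047.479, 4061.419, 4073.350, 4092.591, 4107.935]),
    "IVC": ([4.4, 3.3, 2.6, 1.9, 1.4],
            [2574.040, 2612.444, 2636.649, 2658.148, 2671.689]),
    "TID2013": ([5.18919, 4.27222, 3.48649, 3.00000, 2.11111],
                [2142.768, 2171.592, 2209.623, 2250.459, 2285.712]),
}

result = {name: srocc(scores, blurriness)
          for name, (scores, blurriness) in table2.items()}
for name, rho in result.items():
    print(f"{name}: SROCC = {rho:+.3f}")
```

The ordering is perfectly monotonic in every column. The sign differs between datasets because the listed subjective scores rise with blurriness for CSIQ and LIVE II but fall with blurriness for IVC and TID2013.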

Table 3. SROCC of three blur distortion types tested on LIVE II dataset. The best SROCC score is marked in boldface.

Type	Motion Blur	Gaussian Blur	Defocus Blur
SROCC	0.9031	0.9464	0.8848

Table 4. The impact of size and deviation of the Gaussian low-pass filter on running time (seconds), which is tested on LIVE II dataset. The minimum running time is marked in boldface.

Deviation\Size(Pixels)	5	7	9	11	13	15
0.5	2.2956	2.2740	2.2875	2.3679	2.3840	2.3569
1.0	2.2553	2.3779	2.2955	2.2769	2.3800	2.3440
1.5	2.2537	2.3232	2.2743	2.2495	2.3779	2.3457
2.0	2.2560	2.4236	2.2846	2.2967	2.4144	2.3379
2.5	2.2574	2.4542	2.3055	2.3547	2.4531	2.4553

Table 5. Comparison of running time in LIVE II dataset with different patch quantity from 4 × 4 to 64 × 64. The minimum running time is marked in boldface.

Patch Quantity	4 × 4	8 × 8	12 × 12	16 × 16	20 × 20	32 × 32	64 × 64
Running Time (seconds)	3.4769	3.8733	3.6547	2.4373	4.5235	5.4144	18.3218

Table 6. Comparison with ten existing IQA algorithms on CSIQ dataset (150 Gaussian blur images). We take Spearman’s rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm	SROCC (p-Values)	PLCC (p-Values)	RMSE (p-Values)
PSNR	0.9366 (0.000068)	0.9249 (0.002634)	0.1090 (0.001652)
SSIM [1]	0.9089 (0.003128)	0.8861 (0.014146)	0.1328 (0.007789)
BIQI [14]	0.9182 (0.001193)	0.8974 (0.000621)	0.2851 (0.006433)
BRISQUE [4]	0.9033 (0.015126)	0.9279 (0.004072)	0.1068 (0.000032)
TIP [2]	0.8996 (0.007822)	0.9014 (0.018125)	0.2678 (0.005821)
NIQE [17]	0.8951 (0.005408)	0.9222 (0.009684)	0.1108 (0.000019)
BLIIND-II [3]	0.8915 (0.000362)	0.9011 (0.000067)	0.1243 (0.005118)
NR-CSR [18]	0.8822 (-)	0.8492 (-)	0.1513 (-)
MCNN [19]	0.8751 (-)	0.8882 (-)	0.1317 (-)
IQA-CWT [7]	0.8010 (-)	- (-)	0.1626 (-)
NSSIM	0.8546	0.8971	0.1266

Table 7. Comparison with ten existing IQA algorithms on LIVE II dataset (145 Gaussian blur images). We take Spearman’s rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm	SROCC (p-Values)	PLCC (p-Values)	RMSE (p-Values)
PSNR	0.7180 (0.000031)	0.6427 (0.000004)	1.6266 (0.000001)
SSIM [1]	0.8727 (0.005351)	0.8570 (0.000026)	1.8174 (0.000522)
BIQI [14]	0.9119 (0.004789)	0.7144 (0.000756)	2.0347 (0.000011)
BRISQUE [4]	0.9710 (0.010362)	0.9680 (0.004423)	1.5331 (0.001605)
TIP [2]	0.9011 (0.009542)	0.8811 (0.007826)	1.4768 (0.001412)
NIQE [17]	0.9721 (0.017612)	0.9561 (0.010149)	0.1884 (0.005590)
BLIIND-II [3]	0.9177 (0.001637)	0.9111 (0.000488)	0.8750 (0.007369)
MCNN [19]	0.9358 (-)	0.9459 (-)	6.4538 (-)
IQA-CWT [7]	0.9169 (-)	- (-)	7.8650 (-)
SFA [20]	0.9166 (-)	0.8305 (-)	0.7055 (-)
NSSIM	0.9464	0.9689	0.8669

Table 8. Comparison with eight existing IQA algorithms on IVC dataset (20 Gaussian blur images). We take Spearman’s rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm	SROCC (p-Values)	PLCC (p-Values)	RMSE (p-Values)
PSNR	0.7893 (0.000959)	0.8938 (0.001219)	0.5119 (0.005339)
SSIM [1]	0.8080 (0.000672)	0.7821 (0.000021)	0.8348 (0.000059)
BIQI [14]	0.8600 (0.015124)	0.6603 (0.000482)	1.0738 (0.000041)
BRISQUE [4]	0.8239 (0.005389)	0.9009 (0.009336)	0.4960 (0.010058)
TIP [2]	0.8847 (0.012782)	0.8847 (0.007216)	1.1021 (0.001250)
NIQE [17]	0.8638 (0.010644)	0.8994 (0.008812)	0.4990 (0.001229)
BLIIND-II [3]	0.8715 (0.000956)	0.8246 (0.000372)	0.6458 (0.006883)
NR-CSR [18]	0.9239 (-)	0.8775 (-)	0.5478 (-)
NSSIM	0.8886	0.9239	0.4367

Table 9. Comparison with eight existing IQA algorithms on TID2013 dataset (125 Gaussian blur images). We take Spearman’s rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm	SROCC (p-Values)	PLCC (p-Values)	RMSE (p-Values)
PSNR	0.8406 (0.000027)	0.9609 (0.000003)	0.3448 (0.000203)
SSIM [1]	0.8646 (0.012710)	0.8737 (0.001305)	0.6071 (0.000518)
BIQI [14]	0.8065 (0.010284)	0.8352 (0.000208)	1.0273 (0.009078)
BRISQUE [4]	0.8505 (0.001421)	0.8630 (0.000932)	0.6287 (0.000455)
TIP [2]	0.8531 (0.002316)	0.8352 (0.005613)	1.5324 (0.000062)
NIQE [17]	0.8325 (0.009727)	0.8639 (0.000686)	0.6267 (0.000188)
BLIIND-II [3]	0.8555 (0.008715)	0.8577 (0.000384)	0.6415 (0.000512)
NR-CSR [18]	0.8520 (-)	0.8339 (-)	0.6476 (-)
NSSIM	0.8995	0.7357	1.0471

Table 10. Average IQA performance (mean ± standard deviation) on four datasets. We take Spearman’s rank order correlation coefficient (SROCC), Pearson linear correlation coefficient (PLCC) and root mean square error (RMSE) as evaluation indexes. In each column, the best performance value is marked in boldface.

Algorithm	SROCC (p-Values)	PLCC (p-Values)	RMSE (p-Values)
PSNR	0.8211 ± 0.0796 (0.319167)	0.8556 ± 0.1252 (0.835436)	0.6481 ± 0.5828 (0.929265)
SSIM [1]	0.8636 ± 0.0361 (0.356459)	0.8497 ± 0.0404 (0.650212)	0.8480 ± 0.6143 (0.493018)
BIQI [14]	0.8742 ± 0.0451 (0.526389)	0.7768 ± 0.0354 (0.335915)	1.1052 ± 0.3834 (0.164435)
BRISQUE [4]	0.8872 ± 0.0538 (0.738890)	0.9150 ± 0.0383 (0.385953)	0.6912 ± 0.5226 (0.766324)
TIP [2]	0.8847 ± 0.0193 (0.412921)	0.8756 ± 0.0246 (0.310203)	1.0948 ± 0.5053 (0.309981)
NIQE [17]	0.8909 ± 0.0519 (0.811307)	0.9104 ± 0.0336 (0.464863)	0.3562 ± 0.2133 (0.230560)
BLIIND-II [3]	0.8841 ± 0.0232 (0.296689)	0.8736 ± 0.0347 (0.882062)	0.5717 ± 0.2750 (0.741406)
NSSIM	0.8973 ± 0.0328	0.8814 ± 0.0879	0.6193 ± 0.3621
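
As a worked check of how the NSSIM row of Table 10 relates to Tables 6 to 9, the sketch below averages the per-dataset NSSIM scores. It assumes that the reported spread is the population standard deviation over the four datasets; that assumption is inferred from the numbers and is not stated in the text. Only the SROCC and PLCC columns are checked here.

```python
from statistics import mean, pstdev

# NSSIM scores from Tables 6-9, in the order CSIQ, LIVE II, IVC, TID2013.
per_dataset = {
    "SROCC": [0.8546, 0.9464, 0.8886, 0.8995],
    "PLCC": [0.8971, 0.9689, 0.9239, 0.7357],
}

# Values reported in the NSSIM row of Table 10 as (mean, standard deviation).
reported = {
    "SROCC": (0.8973, 0.0328),
    "PLCC": (0.8814, 0.0879),
}

summary = {}
for index, scores in per_dataset.items():
    # pstdev divides by N (population form), not by N - 1.
    summary[index] = (mean(scores), pstdev(scores))

for index, (m, s) in summary.items():
    rep_m, rep_s = reported[index]
    agrees = abs(m - rep_m) < 1e-4 and abs(s - rep_s) < 1e-4
    print(f"{index}: {m:.4f} ± {s:.4f} (matches Table 10: {agrees})")
```

Dividing by N − 1 instead (`statistics.stdev`) gives a visibly larger spread, which is why the population form is assumed here.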

Table 11. IQA algorithms performance on Group 1 restorations. Original image is the reference (sharp) image given by LIVE II [26] dataset. NSSIM represents the quality score given by the proposed metric.

IQA Algorithm	Original Image	Blurred Image	Sroubek	Kotera
PSNR	\	23.4926	18.5741	16.7602
SSIM [1]	1.0000	0.7255	0.6571	0.5106
BIQI [14]	66.2057	34.4890	79.9299	79.7829
BRISQUE [4]	84.5863	112.1240	87.1534	84.8248
NIQE [17]	18.8170	19.7543	14.9457	14.9441
BLIIND-II [3]	73.5000	54.0000	93.0000	88.5000
SFA [20]	3.2221	2.4976	2.4577	2.6608
NSSIM (× 10^{-4})	38.9533	0.0709	34.1906	39.0565

Table 12. IQA algorithms performance on Group 2 restorations. Original image is the reference (sharp) image given by IVC [27] dataset. NSSIM represents the quality score given by the proposed metric.

IQA Algorithm	Original Image	Blurred Image	Sroubek	Kotera
PSNR	\	23.3116	25.1542	25.0633
SSIM [1]	1.0000	0.8132	0.8904	0.8951
BIQI [14]	46.8936	23.1284	49.8244	49.2987
BRISQUE [4]	32.3916	41.5441	36.6743	35.2054
NIQE [17]	21.5543	27.3656	19.2333	19.1788
BLIIND-II [3]	45.8900	30.5677	51.4986	51.6889
SFA [20]	12.8285	6.2353	6.9934	11.0588
NSSIM (× 10^{-4})	50.0684	16.7567	39.5872	42.2391
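
The criterion used to read Tables 11 and 12, namely whether a metric scores both restorations above the blurred input, can be written as a small check. This sketch covers only PSNR, SSIM and NSSIM, for which a higher score indicates better quality (for NSSIM this is read off from the Original and Blurred columns). The function name `restorations_beat_blurred` is a hypothetical helper introduced for illustration.

```python
# Scores copied from Tables 11 and 12 as (blurred, Sroubek, Kotera).
# Assumption: for these three metrics a higher score means better quality.
groups = {
    "Group 1": {
        "PSNR": (23.4926, 18.5741, 16.7602),
        "SSIM": (0.7255, 0.6571, 0.5106),
        "NSSIM": (0.0709, 34.1906, 39.0565),   # in units of 1e-4
    },
    "Group 2": {
        "PSNR": (23.3116, 25.1542, 25.0633),
        "SSIM": (0.8132, 0.8904, 0.8951),
        "NSSIM": (16.7567, 39.5872, 42.2391),  # in units of 1e-4
    },
}


def restorations_beat_blurred(blurred, *restored):
    """True if every restoration scores higher than the blurred input."""
    return all(score > blurred for score in restored)


verdict = {
    group: {metric: restorations_beat_blurred(*row)
            for metric, row in rows.items()}
    for group, rows in groups.items()
}

for group, result in verdict.items():
    print(group, result)
```

On these numbers, PSNR and SSIM rank both Group 1 restorations below the blurred image, while NSSIM ranks the restorations above the blurred image in both groups.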

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

