
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 14, NO. 12, DECEMBER 2005

An Information Fidelity Criterion for Image Quality Assessment Using Natural Scene Statistics

Hamid Rahim Sheikh, Member, IEEE, Alan Conrad Bovik, Fellow, IEEE, and Gustavo de Veciana, Senior Member, IEEE

Abstract—Measurement of visual quality is of fundamental importance to numerous image and video processing applications. The goal of quality assessment (QA) research is to design algorithms that can automatically assess the quality of images or videos in a perceptually consistent manner. Traditionally, image QA algorithms interpret image quality as fidelity or similarity with a "reference" or "perfect" image in some perceptual space. Such "full-reference" QA methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychovisual features of the human visual system (HVS), or by arbitrary signal fidelity criteria. In this paper, we approach the problem of image QA by proposing a novel information fidelity criterion that is based on natural scene statistics. QA systems are invariably involved with judging the visual quality of images and videos that are meant for "human consumption." Researchers have developed sophisticated models to capture the statistics of natural signals, that is, pictures and videos of the visual environment. Using these statistical models in an information-theoretic setting, we derive a novel QA algorithm that provides clear advantages over the traditional approaches. In particular, it is parameterless and outperforms current methods in our testing. We validate the performance of our algorithm with an extensive subjective study involving 779 images. We also show that, although our approach distinctly departs from traditional HVS-based methods, it is functionally similar to them under certain conditions, yet it outperforms them due to improved modeling. The code and the data from the subjective study are available at [1].

Index Terms—Image information, image quality assessment (QA), information fidelity, natural scene statistics (NSS).

I. INTRODUCTION

THE field of digital image and video processing deals, in large part, with signals that are meant to convey reproductions of visual information for human consumption, and many image and video processing systems, such as those for acquisition, compression, restoration, enhancement, and reproduction, operate solely on these visual reproductions. These systems typically involve tradeoffs between system resources and the visual quality of the output. In order to make these tradeoffs efficiently, we need a way of measuring the quality of images or videos that come from a system running under a given configuration. The obvious way of measuring quality is to solicit the opinion of human observers. However, such subjective evaluations are not only cumbersome and expensive, but they also cannot be incorporated into automatic systems that adjust themselves in real time based on the feedback of output quality. The goal of quality assessment (QA) research is, therefore, to design algorithms for objective evaluation of quality in a way that is consistent with subjective human evaluation. Such QA methods would prove invaluable for testing, optimizing, benchmarking, and monitoring applications.

Traditionally, researchers have focused on measuring signal fidelity as a means of assessing visual quality. Signal fidelity is measured with respect to a reference signal that is assumed to have "perfect" quality. During the design or evaluation of a system, the reference signal is typically processed to yield a distorted (or test) image, which can then be compared against the reference using so-called full-reference (FR) QA methods. Typically, this comparison involves measuring the "distance" between the two signals in a perceptually meaningful way. This paper presents an FR QA method for images.

A simple and widely used fidelity measure is the peak signal-to-noise ratio (PSNR), or the corresponding distortion metric, the mean-squared error (MSE). The MSE is the $\ell_2$ norm of the arithmetic difference between the reference and the test signals. It is an attractive measure for the (loss of) image quality due to its simplicity and mathematical convenience. However, the correlation between MSE/PSNR and human judgement of quality is not tight enough for most applications, and the goal of QA research over the past three decades has been to improve upon the PSNR.

For FR QA methods, modeling of the human visual system (HVS) has been regarded as the most suitable paradigm for achieving better quality predictions. The underlying premise is that the sensitivities of the visual system are different for different aspects of the visual signal that it perceives, such as brightness, contrast, frequency content, and the interaction between different signal components, and it makes sense to compute the strength of the error between the test and the reference signals once the different sensitivities of the HVS have been accurately accounted for. Other researchers have explored signal fidelity criteria that are not based on assumptions about HVS models, but are motivated instead by the need to capture the loss of structure in the signal, structure that the HVS hypothetically extracts for cognitive understanding.

In this paper, we explore a novel information-theoretic criterion for image fidelity using natural scene statistics (NSS).

Manuscript received September 24, 2003; revised January 5, 2005. This work was supported by a grant from the National Science Foundation. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Reiner Eschbach.
H. R. Sheikh was with the Laboratory for Image and Video Engineering, Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712-1084 USA. He is now with Texas Instruments, Inc., Dallas, TX 75243 USA (e-mail: [email protected]).
A. C. Bovik and G. de Veciana are with the Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712-1084 USA (e-mail: [email protected]; [email protected]).
Digital Object Identifier 10.1109/TIP.2005.859389
1057-7149/$20.00 © 2005 IEEE
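The MSE/PSNR baseline discussed in the Introduction is straightforward to compute. The sketch below is a minimal illustration of our own (not code from the paper), assuming 8-bit grayscale images held as NumPy arrays with a peak value of 255:

```python
import numpy as np

def mse(reference: np.ndarray, test: np.ndarray) -> float:
    """Mean-squared error: mean of the squared arithmetic difference."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / err)
```

As the text notes, such a measure is attractive for its simplicity, but it treats all errors identically regardless of their perceptual visibility, which is precisely the shortcoming the IFC is designed to address.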
Images and videos of the three-dimensional (3-D) visual environment come from a common class: the class of natural scenes. Natural scenes form a tiny subspace in the space of all possible signals, and researchers have developed sophisticated models to characterize these statistics. Most real-world distortion processes disturb these statistics and make the image or video signals unnatural. We propose to use natural scene models in conjunction with distortion models to quantify the statistical information shared between the test and the reference images, and posit that this shared information is an aspect of fidelity that relates well with visual quality.

The approaches discussed above describe three ways in which one could look at the image QA problem. One viewpoint is structural, from the image-content perspective, in which images are considered to be projections of objects in the 3-D environment that could come from a wide variety of lighting conditions. Such variations constitute nonstructural distortions that should be treated differently from structural ones, e.g., blurring or blocking, that could hamper cognition. The second viewpoint is psychovisual, from the human visual receiver perspective, in which researchers simulate the processing of images by the HVS and predict the perceptual significance of errors. The third viewpoint, the one that we take in this paper, is the statistical viewpoint that considers natural images to be signals with certain statistical properties. These three views are fundamentally connected with each other by the following hypothesis: the physics of image formation of the natural 3-D visual environment leads to certain statistical properties of the visual stimulus, in response to which the visual system has evolved over eons. However, different aspects of each of these views may have different complexities when it comes to analysis and modeling. In this paper, we show that the statistical approach to image QA requires few assumptions, is simple and methodical to derive, and yet is competitive with the other two approaches in that it outperforms them in our testing. Also, we show that the statistical approach to QA is a dual of the psychovisual approach to the same problem; we demonstrate this duality toward the end of this paper.

Section II presents some background work in the field of FR QA algorithms as well as an introduction to NSS models. Section III presents our development of the information fidelity criterion (IFC). Implementation and subjective validation details are provided in Sections IV and V, while the results are discussed in Section VI. In Section VII, we compare and contrast our method with HVS-based methods, and we conclude the paper in Section VIII.

II. BACKGROUND

FR QA techniques proposed in the literature can be divided into two major groups: those based on the HVS and those based on arbitrary signal fidelity criteria (a detailed review of the research on FR QA methods can be found in [2]–[5]).

A. HVS Error-Based QA Methods

HVS-based QA methods come in different flavors based on tradeoffs between accuracy in modeling the HVS and computational feasibility. A detailed discussion of these methods can be found in [3]–[5]. A number of HVS-based methods have been proposed in the literature. Some representative methods include [6]–[13].

B. Arbitrary Signal Fidelity Criteria

Researchers have also attempted to use arbitrary signal fidelity criteria in the hope that they would correlate well with perceptual quality. In [14] and [15], a number of these are evaluated for the purpose of QA. In [16], a structural similarity metric (SSIM) was proposed to capture the loss of image structure. SSIM was derived by considering hypothetically what constitutes a loss in signal structure. It was claimed that distortions in an image that come from variations in lighting, such as contrast or brightness changes, are nonstructural distortions, and that these should be treated differently from structural ones. It was claimed that one could capture image quality with three aspects of information loss that are complementary to each other: correlation distortion, contrast distortion, and luminance distortion.

C. Limitations

A number of limitations of HVS-based methods are discussed in [16]. In summary, these have to do with the extrapolation of the vision models that have been proposed in the visual psychology literature to image processing problems. In [16], it was claimed that structural QA methods avoid some of the limitations of HVS-based methods since they are not based on threshold psychophysics or the HVS models derived thereof. However, they have some limitations of their own. Specifically, although the structural paradigm for QA is an ambitious one, there is no widely accepted way of defining structure and structural distortion in a perceptually meaningful manner. In [16], the SSIM was constructed by hypothesizing the functional forms of structural and nonstructural distortions and the interaction between them. In this paper, we take a new approach to the QA problem. As mentioned in the Introduction, the third alternative for QA, apart from HVS-based and structural approaches, is the statistical approach, which we use in an information-theoretic setting. Needless to say, even our approach will make certain assumptions, but once assumptions regarding the source and distortion models and the suitability of mutual information as a valid measure of perceptual information fidelity are made, the components of our algorithm and their interactions fall into place without resorting to arbitrary formulations.

Due to the importance of the QA problem to researchers and developers in the image and video processing community, a consortium of experts, the video quality experts group (VQEG), was formed in 1997 to develop, validate, and recommend objective video QA methods [17]. VQEG Phase I testing reported that all of the proponent methods tested, which contained some of the most sophisticated video QA methods of the time, were statistically indistinguishable from PSNR under their testing conditions [18]. Phase II of the testing, which consisted of new proponents under different testing configurations, is also complete, and the final report has recommended an FR QA method, although it has been reported that none of the methods tested were comparable to the "null model," a hypothetical model that predicts quality exactly [19], meaning that QA methods need to be improved further.
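The three complementary aspects of information loss attributed to [16] above can be illustrated schematically. The sketch below is our own illustration, not the formulation of [16]: it computes global (non-windowed) luminance, contrast, and correlation (structure) comparison terms between two signals, and the stabilizing constants `c1`, `c2`, `c3` are arbitrary illustrative values:

```python
import numpy as np

def similarity_components(x: np.ndarray, y: np.ndarray,
                          c1: float = 1e-4, c2: float = 1e-4, c3: float = 5e-5):
    """Schematic luminance, contrast, and structure (correlation) terms.

    Each term is near 1 when the corresponding attribute of the two signals
    matches; the small constants stabilize the ratios when denominators are
    close to zero (illustrative values, not tuned).
    """
    x = x.astype(np.float64).ravel()
    y = y.astype(np.float64).ravel()
    mx, my = x.mean(), y.mean()          # mean intensities (luminance)
    sx, sy = x.std(), y.std()            # standard deviations (contrast)
    sxy = ((x - mx) * (y - my)).mean()   # cross-covariance (structure)
    luminance = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    contrast = (2 * sx * sy + c2) / (sx ** 2 + sy ** 2 + c2)
    structure = (sxy + c3) / (sx * sy + c3)
    return luminance, contrast, structure
```

Note how a pure contrast stretch leaves the structure term near 1 while lowering the contrast term, which is the separation of nonstructural from structural change that the structural paradigm relies on.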
D. Natural Scene Statistics

Images and videos of the visual environment captured using high-quality capture devices operating in the visual spectrum are broadly classified as natural scenes. This differentiates them from text, computer-generated graphics, cartoons and animations, paintings and drawings, random noise, or images and videos captured from nonvisual stimuli such as radar and sonar, X-rays, ultrasounds, etc. Natural scenes form an extremely tiny subset of the set of all possible images. Many researchers have attempted to understand the structure of this subspace of natural images by studying their statistics (a review of natural scene models can be found in [20]). Researchers believe that the visual stimulus emanating from the natural environment drove the evolution of the HVS, and that modeling natural scenes and the HVS are essentially dual problems [21]. While many aspects of the HVS have been studied and incorporated into QA algorithms, a usefully comprehensive (and feasible) understanding is still lacking. NSS modeling may serve to fill this gap.

NSS have been explicitly incorporated into a number of image processing algorithms: in compression algorithms [22]–[25], denoising algorithms [26]–[28], image modeling [29], image segmentation [30], and texture analysis and synthesis [31]. While the characteristics of the distortion processes have been incorporated into some QA algorithms (such as those designed for the blocking artifact), the assumptions about the statistics of the images that they afflict are usually quite simplistic. Specifically, most QA algorithms assume that the input images are smooth and low pass in nature. In [32], an NSS model was used to design a no-reference image QA method for images distorted with JPEG2000 compression artifacts. In this paper, we use NSS models for FR QA, and model natural images in the wavelet domain using Gaussian scale mixtures (GSM) [28]. Scale-space-orientation analysis (loosely referred to as wavelet analysis in this paper) of images has been found to be useful for natural image modeling. It is well known that the coefficients of a subband in a wavelet decomposition are neither independent nor identically distributed, though they may be approximately second-order uncorrelated [33]. A coefficient is likely to have a large variance if its neighborhood has a large variance. The marginal densities are sharply peaked around zero with heavy tails, which are typically modeled as Laplacian density functions, while the localized statistics are highly space varying. Researchers have characterized this behavior of natural images in the wavelet domain by using GSMs [28], a more detailed introduction to which will be given in the next section.

III. INFORMATION FIDELITY CRITERION FOR IMAGE QUALITY ASSESSMENT

In this paper, we propose to approach the QA problem as an information fidelity problem, where a natural image source communicates with a receiver through a channel. The channel imposes fundamental limits on how much information could flow from the source (the reference image), through the channel (the image distortion process), to the receiver (the human observer). Fig. 1 shows the scenario graphically. A standard way of dealing with such problems is to analyze them in an information-theoretic framework, in which the mutual information between the input and the output of the channel (the reference and the test images) is quantified using a model for the source and a distortion model. Thus, our assertion in proposing this framework is that the statistical information that a test image has of the reference is a good way of quantifying fidelity that could relate well with visual quality.

Fig. 1. The QA problem could be analyzed using an information theoretic framework in which a source transmits information through a channel to a receiver. The mutual information between the input of the channel (the reference image) and the output of the channel (the test image) quantifies the amount of information that could ideally be extracted by the receiver (the human observer) from the test image.

A. Source Model

As mentioned in Section II-D, the NSS model that we use is the GSM model in the wavelet domain. It is convenient to deal with one subband of the wavelet decomposition at this point and later generalize this for multiple subbands. We model one subband of the wavelet decomposition of an image as a GSM RF $C = \{C_i : i \in I\}$, where $I$ denotes the set of spatial indices for the RF. $C$ is a product of two stationary RFs that are independent of each other [28]

$$C = S \cdot U = \{S_i \cdot U_i : i \in I\} \quad (1)$$

where $S = \{S_i : i \in I\}$ is an RF of positive scalars and $U = \{U_i : i \in I\}$ is a Gaussian scalar RF with mean zero and variance $\sigma_U^2$. Note that, for the GSM defined in (1), while the marginal distribution of $C_i$ may be sharply peaked and heavy-tailed, such as those of natural scenes in the wavelet domain, the $C_i$ conditioned on $S_i$ are normally distributed, that is

$$p(C_i \mid S_i = s_i) = \mathcal{N}(0, s_i^2 \sigma_U^2) \quad (2)$$

where $\mathcal{N}(\mu, \sigma^2)$ denotes a Gaussian density with mean $\mu$ and variance $\sigma^2$. Another observation is that, given $S_i$, the $C_i$ are independent of $S_j$, $j \neq i$, meaning that the variance $s_i^2 \sigma_U^2$ of the coefficient specifies its distribution completely. Additionally, if the RF $U$ is white, then the elements of $C$ are conditionally independent given $S$. The GSM framework can model the marginal statistics of the wavelet coefficients of natural images, the nonlinear dependencies that are present between the coefficients, as well as the space-varying localized statistics, through appropriate modeling of the RF $S$ [28].

B. Distortion Model

The distortion model that we use in this paper is also described in the wavelet domain. It is a simple signal attenuation and additive Gaussian noise model in each subband

$$D = GC + V = \{g_i C_i + V_i : i \in I\} \quad (3)$$

where $C$ denotes the RF from a subband in the reference signal, $D = \{D_i : i \in I\}$ denotes the RF from the corresponding subband from the test (distorted) signal, $G = \{g_i : i \in I\}$ is a deterministic scalar attenuation field, and
$V = \{V_i : i \in I\}$ is a stationary additive zero-mean Gaussian noise RF with variance $\sigma_V^2$. The RF $V$ is white and is independent of $S$ and $U$. This model captures two important, and complementary, distortion types: blur and additive noise. We will assume that most distortion types that are prevalent in real-world systems can be roughly described locally by a combination of these two. In our model, the attenuation factors $g_i$ can capture the loss of signal energy in a subband to the blur distortion, while the process $V$ can capture additive noise separately. Additionally, changes in image contrast that result from variations in ambient lighting are not modeled as noise since they, too, can be incorporated into the attenuation field $G$.

The choice of a proper distortion model is crucial for image fidelity assessments that are expected to reflect perceptual quality. In essence, we want the distortion model to characterize what the HVS perceives as distortion. Based on our experience with different distortion models, we are inclined to hypothesize that the visual system has evolved over time to optimally estimate natural signals embedded in natural distortions: blur, white noise, and brightness and contrast stretches due to changes in ambient lighting. The visual stimulus that is encoded by the human eyes is blurred by the optics of the eye as well as the spatially varying sampling in the retina. It is therefore natural to expect evolution to have worked toward near-optimal processing of blurry signals, say for controlling the focus of the lens, or guiding visual fixations. Similarly, white noise arising due to photon noise or internal neuron noise (especially in low-light conditions) affects all visual signals. Adaptation in the HVS to changes in ambient lighting has been known to exist for a long time [34]. Thus, HVS signal estimators would have evolved in response to natural signals corrupted by natural distortions, and would be near-optimal for them, but suboptimal for other distortion types (such as blocking or colored noise) or signal sources. Hence, "over-modeling" the signal source or the distortion process is likely to fail for QA purposes, since it imposes assumptions of the existence of near-optimal estimators in the HVS for the chosen signal and distortion models, which may not be true. In essence, distortion modeling combined with NSS source modeling is a dual of HVS signal estimator modeling.

Another hypothesis is that the field $G$ could account for the case when the additive noise is linearly correlated with $C$. Previously, researchers have noted that as the correlation of the noise with the reference signal increases, MSE becomes poorer at predicting perceptual quality [35]. While the second hypothesis could be a corollary to the first, we feel that both of these hypotheses (and perhaps more) need to be investigated further with psychovisual experiments so that the exact contribution of a distortion model to the quality prediction problem can be understood properly. For the purpose of image QA presented in this paper, the distortion model of (3) is adequate, and works well in our simulations.

C. Information Fidelity Criterion

Given a statistical model for the source and the distortion (channel), the obvious IFC is the mutual information between the source and the distorted images. We first derive the mutual information for one subband and later generalize for multiple subbands.

Let $C^N = (C_1, \ldots, C_N)$ denote $N$ elements from $C$. In this section, we will assume that the underlying RF $U$ is uncorrelated (and, hence, $C$ is an RF with conditionally independent elements given $S$), and that the distortion model parameters $G$ and $\sigma_V^2$ are known a priori. Let $D^N = (D_1, \ldots, D_N)$ denote the corresponding elements from $D$. The mutual information between these is denoted as $I(C^N; D^N)$. Due to the nonlinear dependence among the $C_i$ by way of $S$, it is much easier to analyze the mutual information assuming $S$ is known. This conditioning "tunes" the GSM model for the particular reference image and, thus, models the source more specifically. Thus, the IFC that we propose in this paper is the conditional mutual information $I(C^N; D^N \mid S^N = s^N)$, where $S^N = (S_1, \ldots, S_N)$ are the corresponding elements of $S$, and $s^N$ denotes a realization of $S^N$. In this paper, we will denote $I(C^N; D^N \mid S^N = s^N)$ as $I(C^N; D^N \mid s^N)$. With the stated assumptions on $C$ and the distortion model (3), one can show

$$I(C^N; D^N \mid s^N) = \sum_{i=1}^{N} I(C_i; D^N \mid C^{i-1}, s^N) \quad (4)$$
$$= \sum_{i=1}^{N} I(C_i; D_i \mid s^N) \quad (5)$$
$$= \sum_{i=1}^{N} I(C_i; D_i \mid s_i) \quad (6)$$

where we get (4) by the chain rule [36], and (5) and (6) by conditional independence of the $C_i$ given $S$, independence of the noise $V$, the fact that the distortion model keeps $D_i$ independent of $C_j$, $j \neq i$, and that, given $s_i$, $C_i$ and $D_i$ are independent of $s_j$, $j \neq i$. Using the fact that the $C_i$ are Gaussian given $s_i$, and the $V_i$ are also Gaussian with variance $\sigma_V^2$, we get

$$\sum_{i=1}^{N} I(C_i; D_i \mid s_i) = \sum_{i=1}^{N} \left( h(D_i \mid s_i) - h(D_i \mid C_i, s_i) \right) \quad (7)$$
$$= \sum_{i=1}^{N} \left( h(D_i \mid s_i) - h(V_i) \right) \quad (8)$$
$$= \sum_{i=1}^{N} \left( \frac{1}{2} \log_2\!\big(2\pi e\,(g_i^2 s_i^2 \sigma_U^2 + \sigma_V^2)\big) - \frac{1}{2} \log_2\!\big(2\pi e\,\sigma_V^2\big) \right) \quad (9)$$
$$= \frac{1}{2} \sum_{i=1}^{N} \log_2\!\left(1 + \frac{g_i^2 s_i^2 \sigma_U^2}{\sigma_V^2}\right) \quad (10)$$

where $h(X)$ denotes the differential entropy of a continuous random variable $X$, and $h(X) = \frac{1}{2}\log_2(2\pi e \sigma^2)$ for $X$ distributed as $\mathcal{N}(\mu, \sigma^2)$ [36].

Equation (10) was derived for one subband. It is straightforward to use separate GSM RFs for modeling each subband of interest in the image. We will denote the RF modeling the wavelet coefficients of the reference image in the $j$th subband as $C_j$, and in the test (distorted) image as $D_j$, and assume that the $C_j$ are independent of each other. We will further assume that each subband
is distorted independently. Thus, the RFs $D_j$ are also independent of each other. The IFC is then obtained by summing over all subbands

$$\mathrm{IFC} = \sum_{j} I(C^{N_j, j}; D^{N_j, j} \mid s^{N_j, j}) \quad (11)$$

where $C^{N_j, j}$ denotes $N_j$ coefficients from the RF $C_j$ of the $j$th subband, and similarly for $D^{N_j, j}$ and $s^{N_j, j}$.

Equation (11) is our IFC that quantifies the statistical information that is shared between the source and the distorted images. An attractive feature of our criterion is that, like MSE and some other mathematical fidelity metrics, it does not involve parameters associated with display device physics, data from visual psychology experiments, viewing configuration information, or stabilizing constants, which dictate the accuracy of HVS-based FR QA methods (and some structural ones, too). The IFC does not require training data either. However, some implementation parameters will obviously arise once (11) is implemented. We will discuss implementation in the next section.

The IFC is not a distortion metric, but a fidelity criterion. It theoretically ranges from zero (no fidelity) to infinity (perfect fidelity within a nonzero multiplicative constant in the absence of noise).¹ Perfect fidelity within a multiplicative constant is something that is in contrast with the approach in SSIM [16], in which contrast distortion (multiplicative constant) was one of the three attributes of distortion that was regarded as a visual degradation, albeit one that has a different (and "orthogonal") contribution toward perceptual fidelity than noise and local-luminance distortions. In this paper, we view multiplicative constants (contrast stretches) as signal gains or attenuations interacting with additive noise. Thus, with this approach, the same noise variance would be perceptually less annoying if it were added to a contrast-stretched image than if it were added to a contrast-attenuated image. Since each subband has its own multiplicative constant, blur distortion can also be captured by this model, as the finer scale subbands would be attenuated more than coarser scale subbands.

¹Differential entropy is invariant to translation, and so the IFC is infinite for perfect fidelity within an additive constant in the absence of noise as well. However, since we are applying the IFC in the wavelet domain on "AC" subbands only, to which the GSM model applies, the zero-mean assumptions on $U$ and $V$ imply that this case will not happen.

IV. IMPLEMENTATION ISSUES

In order to implement the fidelity criterion in (11), a number of assumptions are required about the source and the distortion models. We outline them in this section.

A. Assumptions About the Source Model

Note that mutual information (and, hence, the IFC) can only be calculated between RFs and not their realizations, that is, a particular reference and test image under consideration. We will assume ergodicity of the RFs, and that reasonable estimates for the statistics of the RFs can be obtained from their realizations. We then quantify the mutual information between the RFs having statistics obtained from particular realizations.

For the scalar GSM model, estimates of $s_i^2$ can be obtained by localized sample variance estimation, since for natural images $S$ is known to be a spatially correlated field, and $\sigma_U^2$ can be assumed to be unity without loss of generality.

B. Assumptions About the Distortion Model

The IFC assumes that the distortion model parameters $G$ and $\sigma_V^2$ are known a priori, but these would need to be estimated in practice. We propose to partition the subbands into blocks and assume that the field $G$ is constant over such blocks, as are the noise statistics $\sigma_V^2$. The value of the field $G$ over block $l$, which we denote as $g_l$, and the variance of the RF $V$ over block $l$, which we denote as $\sigma_{V,l}^2$, are fairly easy to estimate (by linear regression), since both the input (the reference signal) as well as the output (the test signal) of the system (3) are available

$$\hat{g}_l = \frac{\widehat{\mathrm{Cov}}(C, D)}{\widehat{\mathrm{Cov}}(C, C)} \quad (12)$$
$$\hat{\sigma}_{V,l}^2 = \widehat{\mathrm{Cov}}(D, D) - \hat{g}_l\,\widehat{\mathrm{Cov}}(C, D) \quad (13)$$

where the covariances are approximated by sample estimates using sample points from the corresponding blocks in the reference and test signals.

C. Wavelet Bases and Inter-Coefficient Correlations

The derivation leading to (10) assumes that $U$ is uncorrelated and, hence, $C$ is independent given $S$. In practice, if the wavelet decomposition is orthogonal, the underlying $U$ could be approximately uncorrelated. In such cases, one could use (10) for computing the IFC. However, real cartesian-separable orthogonal wavelets are not good for image analysis since they have poor orientation selectivity and are not shift invariant. In our implementation, we chose the steerable pyramid decomposition with six orientations [37]. This gives better orientation selectivity than is possible with real cartesian-separable wavelets. However, the steerable pyramid decomposition is over-complete, and the neighboring coefficients from the same subband are linearly correlated. In order to deal with such correlated coefficients, we propose two simple approximations that work well for QA purposes.

1) Vector GSM: Our first approximation is to partition the subband into nonoverlapping block-neighborhoods and assume that the neighborhoods are uncorrelated with each other. One could then use a vector form of the IFC by modeling each neighborhood as a vector random variable. This "blocking" of coefficients results in an upper bound

$$I(C^N; D^N \mid s^N) \leq \sum_{i} I(\bar{C}_i; \bar{D}_i \mid s_i)$$

where $\bar{C}_i = (C_{i,1}, \ldots, C_{i,M})^T$ is a vector of $M$ wavelet coefficients that form the $i$th neighborhood. All such vectors, associated with nonoverlapping neighborhoods, are assumed to be uncorrelated with each other. We now model the wavelet coefficient neighborhood as a vector GSM. Thus, the vector RF $\bar{C} = \{\bar{C}_i : i \in I\}$ on a lattice $I$ is a product of a scalar RF $S = \{S_i : i \in I\}$ and a zero-mean Gaussian vector RF $\bar{U} = \{\bar{U}_i : i \in I\}$ of covariance $\mathbf{C}_U$. The noise $\bar{V}$ is also a zero-mean vector Gaussian RF of the same dimensionality as $\bar{U}$, and has covariance $\mathbf{C}_V$. If we
assume that is independent of , it is quite easy to In (21) and (22), is assumed to be unity without
show (by using differential entropy for Gaussian vectors) that loss of generality [38].
2) Downsampling: Our second approximation is to use a
subset of the coefficients by downsampling . Downsampling
(14) reduces the correlation between coefficients. We will assume
that the downsampled subband is approximately uncorrelated,
and then use (10) for scalar GSM on the downsampled subband.
(15) The underlying assumption in the downsampling approach is
that the quality prediction from the downsampled subbands
should be approximately the same as the prediction from the
where the differential entropy of a continuous vector
complete subband. This downsampling approach has an addi-
random vector distributed as a multivariate Gaussian tional advantage that it makes it possible to substantially reduce
where denotes the complexity of computing the wavelet decomposition since
the determinant, and is the dimension of [36]. Recalling only a fraction of the subband coefficients need to be computed.
that is symmetric and can be factorized as with In our simulations we discovered that the wavelet decompo-
sition is the most computationally expensive step. Significant
orthonormal and eigenvalues , and that for a distortion
speedups are possible with the typical downsampling factors of
model where , the IFC simplifies as follows:2
twelve or fifteen in our simulations. We downsample a subband
along and across the principal orientations of the respective
filters. In our simulations, the downsampling was done using
(16)
nearest-neighbor interpolation.
Further specifics of the estimation methods used in our testing
are given in Section VI.

(17)
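As a concrete illustration of how (20)-(22) fit together, the following is a minimal NumPy sketch of the vector GSM IFC for one subband. The scalar distortion gain g and noise variance sigma_v2 are assumed known here (in practice they are estimated blockwise from the distortion model), the neighborhoods are nonoverlapping 3 × 3 blocks, and the synthetic subband is purely illustrative; this is a sketch of the computation, not the authors' implementation.

```python
import numpy as np

def vector_gsm_ifc(ref_band, g=0.9, sigma_v2=0.1, patch=3):
    """Sketch of the vector GSM IFC of (20) for one subband.

    Assumes a known scalar distortion gain g and noise variance sigma_v2
    (in practice these are estimated blockwise from the distortion model).
    """
    h, w = ref_band.shape
    M = patch * patch
    hh, ww = h - h % patch, w - w % patch
    # Vectors C_i from nonoverlapping patch x patch neighborhoods.
    blocks = ref_band[:hh, :ww].reshape(hh // patch, patch, ww // patch, patch)
    C = blocks.transpose(0, 2, 1, 3).reshape(-1, M)               # N x M
    C_U = C.T @ C / C.shape[0]                                    # (22), with E[S^2] = 1
    lam = np.linalg.eigvalsh(C_U)                                 # eigenvalues of C_U
    s2 = np.einsum('ij,jk,ik->i', C, np.linalg.inv(C_U), C) / M   # (21), ML estimate
    # (20): sum over coefficients i and eigenvalues j (natural log here).
    return 0.5 * np.sum(np.log1p(np.outer(g * g * s2, lam) / sigma_v2))

rng = np.random.default_rng(0)
band = rng.standard_normal((96, 96)) * np.exp(0.2 * rng.standard_normal((96, 96)))
ifc = vector_gsm_ifc(band)
```

As expected from (20), the value decreases monotonically as the noise variance grows.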
V. SUBJECTIVE EXPERIMENTS FOR VALIDATION

In order to calibrate and test the algorithm, an extensive psychometric study was conducted. In these experiments, a number of human subjects were asked to assign each image a score indicating their assessment of the quality of that image, defined as the extent to which the artifacts were visible and annoying. Twenty-nine high-resolution, 24-bits/pixel RGB color images (typically 768 × 512) were distorted using five distortion types: JPEG2000, JPEG, white noise in the RGB components, Gaussian blur, and transmission errors in the JPEG2000 bit stream using a fast-fading Rayleigh channel model. A database was derived from the 29 images such that each image had test versions with each distortion type, and for each distortion type the perceptual quality roughly covered the entire quality range. Observers were asked to provide their perception of quality on a continuous linear scale that was divided into five equal regions marked with the adjectives "Bad," "Poor," "Fair," "Good," and "Excellent," which was mapped linearly onto a 1–100 range. About 20–25 human observers rated each image. Each distortion type was evaluated by different subjects in different experiments using the same equipment and viewing conditions. In this way, a total of 982 images, out of which 203 were the reference images, were evaluated by human subjects in seven experiments. The raw scores were converted to difference scores (between the test and the reference) [18], then converted to Z-scores [39], scaled back to the 1–100 range, and finally averaged across subjects to yield a difference mean opinion score (DMOS) for each distorted image. The average RMSE for the DMOS was 5.92, with an average 95% confidence interval of width 5.48. The database is available at [1].
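The score-processing pipeline described above (raw scores, difference scores, Z-scores, rescaling, DMOS) can be sketched as follows. The score matrix, the reference-image bookkeeping, and the Z-score-to-range mapping below are illustrative assumptions, not the study's exact procedure.

```python
import numpy as np

# Hypothetical layout: rows are subjects, columns are images; columns 4-7
# play the role of reference images here. This is an illustrative sketch of
# the raw score -> difference score -> Z-score -> DMOS pipeline, not the
# study's exact bookkeeping.
rng = np.random.default_rng(1)
raw = rng.uniform(20.0, 95.0, size=(22, 8))     # 22 subjects, 8 images
ref_of = np.array([4, 5, 6, 7, 4, 5, 6, 7])     # reference column for each image

diff = raw[:, ref_of] - raw                     # difference scores (higher = worse)
z = (diff - diff.mean(axis=1, keepdims=True)) / diff.std(axis=1, keepdims=True)
scaled = (z + 3.0) * (100.0 / 6.0)              # one plausible map back to a 1-100 range
dmos = scaled.mean(axis=0)                      # DMOS per image, averaged over subjects
```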
SHEIKH et al.: INFORMATION FIDELITY CRITERION FOR IMAGE QUALITY ASSESSMENT 2123

TABLE I
VALIDATION SCORES FOR DIFFERENT QUALITY ASSESSMENT METHODS. THE METHODS TESTED WERE PSNR, SARNOFF JND-METRIX 8.0 [40], MSSIM [16], IFC FOR SCALAR GSM WITHOUT DOWNSAMPLING, IFC FOR SCALAR GSM WITH DOWNSAMPLING BY 3 ALONG ORIENTATION AND 5 ACROSS, IFC FOR VECTOR GSM, IFC FOR VECTOR GSM USING HORIZONTAL AND VERTICAL ORIENTATIONS ONLY, AND IFC FOR VECTOR GSM AND HORIZONTAL/VERTICAL ORIENTATIONS WITH ONLY THE SMALLEST EIGENVALUE IN (20). THE METHODS WERE TESTED AGAINST DMOS FROM THE SUBJECTIVE STUDY AFTER A NONLINEAR MAPPING. THE VALIDATION CRITERIA ARE: CORRELATION COEFFICIENT (CC), MEAN ABSOLUTE ERROR (MAE), ROOT MEAN SQUARED ERROR (RMS), OUTLIER RATIO (OR), AND SPEARMAN RANK-ORDER CORRELATION COEFFICIENT (SROCC)

VI. RESULTS

In this section, we present results on validation of the IFC on the database presented in Section V, and comparisons with other QA algorithms. Specifically, we will compare the performance of our algorithm against PSNR, SSIM [16], and the well-known Sarnoff model (Sarnoff JND-Metrix 8.0 [40]). We present results for five versions of the IFC: scalar GSM, scalar GSM with downsampling by three along the principal orientation and five across, vector GSM, vector GSM using the horizontal and vertical orientations only, and vector GSM using horizontal and vertical orientations and only one eigenvalue in the summation of (20). Table I summarizes the validation results.

A. Simulation Details

Some additional simulation details are as follows. Although full color images were distorted in the subjective evaluation, the QA algorithms (except JND-Metrix) operated upon the luminance component only. For the scalar GSM with no downsampling, a 5 × 5 moving window was used for local variance estimation of s_i², and 16 × 16 nonoverlapping blocks were used for estimating the distortion parameters g and σ_V². The blocking was done in order for the stationarity assumptions on the distortion model to approximately hold. For the scalar GSM with downsampling, all parameters were estimated on the downsampled signals. A 3 × 3 window was used for variance estimation, while 8 × 8 blocks were used for the distortion model estimation. For vector GSM, vectors were constructed from nonoverlapping 3 × 3 neighborhoods, and the distortion model was estimated with 18 × 18 nonoverlapping blocks. In all versions of the IFC, only the subbands at the finest level were used in the summation of (11). Since the sizes of the images in the database were different, the IFC was normalized by the number of pixels in each image. Mean SSIM (MSSIM) was calculated on the luminance component after decimating (filtering and downsampling) it by a factor of 4 [16].
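The localized variance estimation mentioned above (a moving-window sample variance over a subband) can be sketched as follows; the window handling (valid region only) is an assumption of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_variance(band, win=5):
    """Sample variance over a win x win moving window (valid region only);
    a sketch of the localized s_i^2 estimation step, not the exact code."""
    patches = sliding_window_view(band, (win, win))
    return patches.var(axis=(-2, -1))

rng = np.random.default_rng(2)
band = rng.standard_normal((64, 64))   # stand-in for a zero-mean wavelet subband
s2 = local_variance(band)              # shape (60, 60) for a 5 x 5 window
```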
B. Calibration of the Objective Score

It is generally acceptable for a QA method to stably predict subjective quality within a monotonic nonlinear mapping, since the mapping can be compensated for easily. Moreover, since the mapping is likely to depend upon the subjective validation/application scope and methodology, it is best to leave it to the final application, and not to make it part of the QA algorithm. Thus, in both the VQEG Phase-I and Phase-II testing and validation, a monotonic nonlinear mapping between the objective and the subjective scores was allowed, and all the performance validation metrics were computed after compensating for it [18]. This is true for the results in Table I, where a five-parameter nonlinearity (a logistic function with an additive linear term) is used for all methods except for the IFC, for which we used the mapping on the logarithm of the IFC. The quality predictions, after compensating for the mapping, are shown in Fig. 2. The mapping function used is given in (23), while the fitting was done using MATLAB's fminsearch:

    Quality(x) = β₁ logistic(β₂, x − β₃) + β₄ x + β₅    (23)

    logistic(τ, x) = 1/2 − 1/(1 + e^{τx})    (24)
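A sketch of evaluating the five-parameter mapping of (23) and (24) follows. The parameter values below are purely illustrative; in practice the betas are fitted to the subjective scores, e.g., with a Nelder-Mead search such as MATLAB's fminsearch.

```python
import numpy as np

def logistic(tau, x):
    # (24): logistic(tau, x) = 1/2 - 1/(1 + exp(tau * x))
    return 0.5 - 1.0 / (1.0 + np.exp(tau * x))

def quality_map(x, b1, b2, b3, b4, b5):
    # (23): logistic term plus an additive linear term (five parameters)
    return b1 * logistic(b2, x - b3) + b4 * x + b5

# Illustrative parameters only; they are chosen so the fitted curve is
# monotonic, as required of the calibration mapping.
x = np.linspace(0.0, 10.0, 50)          # stand-in objective scores (e.g., log IFC)
q = quality_map(x, b1=-30.0, b2=1.0, b3=5.0, b4=-2.0, b5=80.0)
```

With both b1 and b4 negative, the mapping is strictly decreasing, as would be appropriate when regressing a fidelity score against DMOS (higher DMOS = worse quality).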
C. Discussion

Table I shows that the IFC, even in its simplest form, is competitive with all state-of-the-art FR QA methods presented in this paper. The comparative results between MSSIM and Sarnoff's JND-Metrix are qualitatively similar to those reported in [16], except that both of these methods perform more poorly in the presence of a wider range of distortion types than reported in [16]. However, MSSIM still outperforms JND-Metrix by a sizeable margin using any of the validation criteria in Table I. The IFC also performs demonstrably better than Sarnoff's JND-Metrix under all of the alternative implementations of the IFC. The vector GSM form of the IFC outperforms even MSSIM. Note that the downsampling approximation performs better than the scalar IFC without downsampling, even though the downsampled version operates on signals that are fifteen times smaller; hence, it is a computationally more feasible alternative to other IFC implementations at reasonably good performance. Also note that the IFC as well as MSSIM use only the luminance components of the images to make quality predictions, whereas the JND-Metrix uses all color information. Extending the IFC to incorporate color could further improve performance.

An interesting observation is that when only the smaller eigenvalues are used in the summation of (20), the performance increases dramatically. The last row in Table I and Fig. 2 show results when only the smallest eigenvalue is used in the summation in (20). The performance is relatively unaffected up to an inclusion of the five smallest eigenvalues (out of nine).
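The eigenvalue-restricted variant discussed above can be sketched as follows, with stand-in s_i² values, eigenvalues, and distortion parameters (all hypothetical):

```python
import numpy as np

def ifc_eig_subset(s2, lam, g=0.9, sigma_v2=0.1, keep=1):
    """(20) restricted to the `keep` smallest eigenvalues of C_U; a sketch
    of the variant in the last row of Table I (illustrative inputs only)."""
    lam_small = np.sort(lam)[:keep]
    return 0.5 * np.sum(np.log1p(np.outer(g * g * s2, lam_small) / sigma_v2))

s2 = np.abs(np.random.default_rng(3).standard_normal(1000)) + 0.1   # stand-in s_i^2 field
lam = np.array([0.05, 0.1, 0.2, 0.4, 0.8, 1.0, 1.5, 2.0, 3.0])      # stand-in eigenvalues
full = ifc_eig_subset(s2, lam, keep=9)
smallest_only = ifc_eig_subset(s2, lam, keep=1)
```

Dropping the larger eigenvalues removes the terms with the highest signal strength from the summation, which is exactly the restriction hypothesized to suppress measurement noise.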
Fig. 2. Scatter plots for the quality predictions by the four methods after compensating for quality calibration: PSNR, Sarnoff's JND-Metrix, MSSIM, and IFC for vector GSM. The IFC shown here uses only the horizontal and vertical subbands at the finest scale, and only the smallest eigenvalue in (20). The distortion types are: (x) JPEG2000, (+) JPEG, (o) white noise in RGB space, (box) Gaussian blur, and (diamond) transmission errors in the JPEG2000 stream over a fast-fading Rayleigh channel.

One hypothesis that could explain this observation is that a measurement noise could be present in the IFC whose strength depends upon the strength of the signal used in the computation of the IFC. Thus, ignoring components with high signal strength [corresponding to summing over low eigenvalues only in (20)] could lower the noise if the relationship between the noise variance and the signal variance is super-linear, for which an increase in signal strength would cause a decrease in the signal-to-noise ratio.

Another interesting observation is that when only the horizontal and vertical subbands are used in the computation of the IFC in (11) for the vector GSM IFC, the performance increases.³ We first thought that this was due to the presence of JPEG-distorted images in the database, since the blocking artifact is represented more in the horizontal and vertical subbands than at other orientations. However, we discovered that the performance increase was consistent for all distortion types present in the database, and most notably for the JPEG2000 distortion. Also, we do not get this increase in performance when we sum over other subbands; the performance in fact worsens. Table II gives the performance change of the IFC on individual distortion types for horizontal and vertical subbands and the corresponding performance change when orientations of ±60 degrees were summed in (11). We feel that this performance increase is due to the importance that the HVS gives to horizontal and vertical edge information in images in comparison with other orientations [34].

³It does so for other IFC forms, but we will not report those results here since they are mirrored by the ones presented.

TABLE II
VALIDATION SCORES FOR THE VECTOR GSM IFC USING ALL ORIENTATIONS VERSUS USING: ONLY THE HORIZONTAL AND VERTICAL ORIENTATIONS AND THE SUBBANDS ORIENTED AT ±60°. ONLY THE SMALLEST EIGENVALUE HAS BEEN USED IN (20) FOR GENERATING THIS TABLE

In our MATLAB implementation, the scalar GSM version of the IFC (without downsampling) takes about 10 s for a 512 × 768 color image on a Pentium III 1-GHz machine. The vector GSM version (with horizontal and vertical subbands only) takes about 15 s.
Fig. 3. HVS-based quality measurement system. We show that this HVS model is the dual of the scalar GSM-based IFC of (11).

This includes the time required to perform color conversions, which is roughly 10% of the total time. We noted that about 40% to 50% of the time is needed for the computation of the wavelet decomposition.

We would like to point out the most salient feature of the IFC: it does not require any parameters from the HVS or viewing configuration, training data, or stabilizing constants. In contrast, the JND-Metrix requires a number of parameters for calibration, such as viewing distance, display resolution, screen phosphor type, ambient lighting conditions, etc. [40], and even SSIM requires two hand-optimized stabilizing constants. Despite being parameterless, the IFC outperforms both of these methods. It is reasonable to say that the performance of the IFC could improve further if these parameters, which are known to affect perceptual quality, were incorporated as well.

VII. SIMILARITIES WITH HVS-BASED QA METHODS

We will now compare and contrast the IFC with HVS-based QA methods. Fig. 3 shows an HVS-based quality measurement system that computes the error signal between the processed reference and test signals, and then processes the error signal before computing the final perceptual distortion measure. A number of key similarities with most HVS-based QA methods are immediately evident. These include a scale-space-orientation channel decomposition, response exponent, masking effect modeling, localized error pooling, suprathreshold effect modeling, and a final pooling into a quality score.

In the Appendix we show the following relationship between the scalar version of the IFC in (10) and the HVS model of Fig. 3 for one subband:

    IFC ≈ −(1/2) Σ_i log(MSE_i)    (25)

where MSE_i is the localized MSE between the two divisively normalized signals shown in Fig. 3. The MSE computation in Fig. 3 and (25) is a localized error strength measure. The logarithm term can be considered to be modeling of the suprathreshold effect. The suprathreshold effect is the name given to the fact that the same amount of distortion becomes perceptually less significant as the overall distortion level increases. Thus, a change in MSE of, say, 1.0 to 2.0 would be more annoying than the same change from 10.0 to 11.0. Researchers have previously modeled suprathreshold effects using visual impairment scales that map error strength measures through concave nonlinearities, qualitatively similar to the logarithm mapping, so that they emphasize the error at higher quality [41]. Also, the pooling in (25) can be seen to be Minkowski pooling with exponent 1.0. Hence, with the stated components, the IFC can be considered to be a particular HVS-based QA algorithm, the perceptual distortion criterion (PDC), within multiplicative and additive constants that could be absorbed into the calibration curve:

    PDC = Σ_{k=1}^{K} Σ_i log(MSE_{k,i})    (26)

    IFC ≈ α PDC + β    (27)

where k denotes the index of the kth subband, and K is the number of subbands used in the computation.

We can make the following observations regarding the PDC of (26), which is the HVS dual of the IFC (using the scalar GSM model), in comparison with other HVS-based FR QA methods.

• Some components of the HVS are not modeled in Fig. 3 and (27), such as the optical point spread function and the contrast sensitivity function.
• The masking effect is modeled differently from some HVS-based methods. While the divisive normalization mechanism for masking effect modeling has been employed by some QA methods [11]–[13], most methods divisively normalize the error signal with visibility thresholds that are dependent on neighborhood signal strength.
• Minkowski error pooling occurs in two stages: first, a localized pooling in the computation of the localized MSE (with exponent 2), and then a global pooling after the suprathreshold modeling with an exponent of unity. Thus, the perceptual error calculation is different from most methods, in that it happens in two stages with suprathreshold effects in between.
• In (26), the nonlinearity that maps the MSE to a suprathreshold MSE is a logarithmic nonlinearity, and it maps the MSE to a suprathreshold distortion that is later pooled into a quality score. Watson et al. have used threshold power functions to map objective distortion into subjective JND by use of two-alternative forced-choice experiments [41]. However, their method applies the suprathreshold nonlinearity after pooling, as if the suprathreshold effect only comes into play at the global quality judgement level. The formulation in (26) suggests that the suprathreshold modeling should come before a global pooling stage but after localized pooling, and that it affects visual quality at a local level.
• One significant difference is that the IFC using the scalar GSM model, or the PDC of (26), which are duals of each other, is notably inferior to the vector GSM-based IFC. We believe that this is primarily due to the underlying assumption about the uncorrelated nature of the wavelet coefficients being inaccurate. This dependence of perceptual quality on the correlation among coefficients is hard to investigate or model using HVS error sensitivities, but the task is greatly simplified by approaching the same problem with NSS modeling. Thus, we feel that HVS-based QA methods need to account for the fact that natural scenes are correlated within subbands, and that this inter-coefficient correlation in the reference signal affects human perception of quality.⁴
• Another significant difference between the IFC/PDC and other HVS-based methods is the distinct modeling of signal attenuation. Other HVS-based methods ignore signal gains and attenuations, constraining g to be unity, and treat such variations as additive signal errors as well. In contrast, a generalized gain g in the IFC/PDC ensures that signal gains are handled differently from additive noise components.
• One could conjecture that the conditioning on s in the IFC is paralleled in the HVS by the computation of the local variance and divisive normalization. Note that the high degree of self-correlation present in S enables its adequate estimation from C by local variance estimation. Since this divisive normalization occurs quite early in the HVS model⁵ and since the visual signal is passed to the rest of the HVS after it has been conditioned by divisive normalization by the estimated s, we could hypothesize that the rest of the HVS analyzes the visual signal conditioned on the prior knowledge of s, just as the IFC analyzes the mutual information between the test and the reference conditioned on the prior knowledge of s.
• One question that should arise when one compares the IFC against the HVS error model is regarding HVS model parameters. Specifically, one should notice that while functionally the IFC captures HVS sensitivities, it does so without using actual HVS model parameters. We believe that some of the HVS model parameters were either incorporated into the calibration curve, or they did not affect performance significantly enough under the testing and validation experiments reported in this paper. Parameters such as the characteristics of the display devices or viewing configuration information could easily be understood to have an approximately similar effect on all images for all subjects, since the experimental conditions were approximately the same. Other parameters and model components, such as the optical point spread function or the contrast sensitivity function, which depend on viewing configuration parameters as well, are perhaps less significant for the scope and range of quality of our validation experiments. It is also reasonable to say that incorporating these parameters could further enhance the performance of the IFC. We are continuing efforts into developing an IFC for a unified model that consists of source, distortion, and HVS models, and we feel that deeper insights into perception of quality would be gained.
• We would like to remind the readers at this point that although the IFC is similar to an HVS-based distortion measure, it has not been derived using any HVS knowledge, and its derivation is completely independent. The similarities exist due to the similarities between NSS and HVS models. The difference is subtle, but profound!

⁴Equation (20) suggests that the same noise variance would cause a greater loss of information fidelity if the wavelet coefficients of the reference image were correlated than if they were uncorrelated.

⁵Divisive normalization has been discovered to be operational in the HVS [21].

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we presented an IFC for image QA using NSS. We showed that using signal source and distortion models, one could quantify the mutual information between the reference and the test images, and that this quantification, the IFC, quantifies perceptual quality. The IFC was demonstrated to be better than a state-of-the-art HVS-based method, Sarnoff's JND-Metrix, as well as a state-of-the-art structural fidelity criterion, the SSIM index, in our testing. We showed that despite its competitive performance, the IFC is parameterless. We also showed that the IFC, under certain conditions, is quantitatively similar to an HVS-based QA method, and we compared and contrasted the two approaches and hypothesized directions in which HVS-based methods could be refined and improved.

We are continuing efforts into improving the IFC by combining HVS models with distortion and signal source models, incorporating color statistics, and inter-subband correlations. We are hopeful that this new approach will give new insights into visual perception of quality.

APPENDIX

In this Appendix, we shall quantify the similarities between the scalar GSM version of the IFC of (10) and the HVS-based QA method shown in Fig. 3. The model in Fig. 3 is based on calculating MSE in the perceptual space and then processing it further to yield the final perceptual distortion measure. Here we will only deal with coefficients in one subband and a scalar GSM model.

We start by giving the formulation for the divisive normalization stage, which divides the input by its localized average. Considering the input to the squaring block, this turns out to be normalization by the estimated local variance of the input of the squaring block:

    C′_i = C_i / ŝ_i    (28)

    D′_i = D_i / (g_i² ŝ_i² + σ_V²)^{1/2}    (29)

Here, we have assumed that ŝ_j ≈ ŝ_i for j ∈ N_i, that is, the variance is approximately constant over the pixel's neighborhood of i, which we denote by N_i.
SHEIKH et al.: INFORMATION FIDELITY CRITERION FOR IMAGE QUALITY ASSESSMENT 2127

Also note that the term inside the parentheses in (29) is an estimate of the conditional local variance of D_i given s_i, which could be approximated by the actual value. We have also assumed, without loss of generality, that σ_U² = 1, since any nonunity variance of U could be absorbed into S. The MSE between C′ and D′ given s could now be analyzed:

    MSE_i = E[(C′_i − D′_i)² | s_i]    (30)

    = E[(C_i/ŝ_i − D_i/(g_i² ŝ_i² + σ_V²)^{1/2})² | s_i]    (31)

    = E[(U_i − (g_i s_i U_i + V_i)/(g_i² s_i² + σ_V²)^{1/2})² | s_i]    (32)

where we have used C_i = s_i U_i, the approximation ŝ_i ≈ s_i, and that D_i = g_i C_i + V_i given s_i. Expanding the above expression and taking expectation, and using independence between U and V, the fact that C, U, and V are all zero-mean, and the fact that for zero-mean Gaussian variables E[X²] = σ_X², where σ_X² is the variance of X, we get

    MSE_i = 2(1 − g_i s_i / (g_i² s_i² + σ_V²)^{1/2})    (33)

The goal of this derivation is to compare the IFC of (10) and the HVS-based MSE criterion. For one subband, with σ_U² = 1,

    IFC = (1/2) Σ_i log(1 + g_i² s_i² / σ_V²)    (34)

For σ_V² ≪ g_i² s_i², a first-order expansion of (33) gives

    MSE_i ≈ σ_V² / (g_i² s_i²)    (35)

so that

    IFC ≈ (1/2) Σ_i log(g_i² s_i² / σ_V²) ≈ −(1/2) Σ_i log(MSE_i)    (36)

Hence, we have an approximate relation between the IFC and the HVS-based MSE

    IFC ≈ α Σ_i log(MSE_i) + β    (37)

where α and β are constants.
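The high-SNR behavior claimed in the derivation above can be checked numerically; the specific values of g, s, and the noise variance below are illustrative:

```python
import numpy as np

# Numerical check (sketch): at high SNR the localized MSE of (33),
# MSE = 2(1 - g s / sqrt(g^2 s^2 + sigma_v^2)), approaches
# sigma_v^2 / (g^2 s^2), so the per-coefficient IFC term
# (1/2) log(1 + g^2 s^2 / sigma_v^2) approaches -(1/2) log(MSE),
# consistent with the constants-only relation of (37).
g, s, sigma_v2 = 1.0, 2.0, 1e-4          # illustrative values
snr = g * g * s * s / sigma_v2

mse = 2.0 * (1.0 - g * s / np.sqrt(g * g * s * s + sigma_v2))  # (33)
approx = sigma_v2 / (g * g * s * s)                            # first-order approximation
ifc_term = 0.5 * np.log(1.0 + snr)                             # scalar IFC term
```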
ACKNOWLEDGMENT

The authors would like to thank Dr. E. Simoncelli and Dr. Z. Wang at the Center for Neural Science, New York University, for insightful comments.

REFERENCES

[1] LIVE Image Quality Assessment Database, Release 2, H. R. Sheikh, Z. Wang, L. Cormack, and A. C. Bovik. (2005). [Online]. Available: http://live.ece.utexas.edu/research/quality
[2] M. P. Eckert and A. P. Bradley, "Perceptual quality metrics applied to still image compression," Signal Process., vol. 70, no. 3, pp. 177–200, Nov. 1998.
[3] A. Bovik, Ed., Handbook of Image and Video Processing. New York: Academic, 2000.
[4] S. Winkler, "Issues in vision modeling for perceptual video quality assessment," Signal Process., vol. 78, pp. 231–252, 1999.
[5] Z. Wang, H. R. Sheikh, and A. C. Bovik, "Objective video quality assessment," in The Handbook of Video Databases: Design and Applications, B. Furht and O. Marques, Eds. Boca Raton, FL: CRC, 2003.
[6] S. Daly, "The visible difference predictor: An algorithm for the assessment of image fidelity," Proc. SPIE, vol. 1616, pp. 2–15, 1992.
[7] J. Lubin, "A visual discrimination model for image system design and evaluation," in Visual Models for Target Detection and Recognition, E. Peli, Ed. Singapore: World Scientific, 1995, pp. 207–220.
[8] A. B. Watson, "DCTune: A technique for visual optimization of DCT quantization matrices for individual images," Soc. Inf. Display Dig. Tech. Papers, vol. XXIV, pp. 946–949, 1993.
[9] A. P. Bradley, "A wavelet visible difference predictor," IEEE Trans. Image Process., vol. 8, no. 5, pp. 717–730, May 1999.
[10] Y. K. Lai and C.-C. J. Kuo, "A Haar wavelet approach to compressed image quality measurement," J. Vis. Commun. Image Represent., vol. 11, pp. 17–40, Mar. 2000.
[11] P. C. Teo and D. J. Heeger, "Perceptual image distortion," Proc. SPIE, vol. 2179, pp. 127–141, 1994.
[12] D. J. Heeger and P. C. Teo, "A model of perceptual image fidelity," in Proc. IEEE Int. Conf. Image Processing, 1995, pp. 343–345.
[13] A. M. Pons, J. Malo, J. M. Artigas, and P. Capilla, "Image quality metric based on multidimensional contrast perception models," Displays, vol. 20, pp. 93–110, 1999.
[14] A. M. Eskicioglu and P. S. Fisher, "Image quality measures and their performance," IEEE Trans. Commun., vol. 43, no. 12, pp. 2959–2965, Dec. 1995.
[15] I. Avcibaş, B. Sankur, and K. Sayood, "Statistical evaluation of image quality measures," J. Electron. Imag., vol. 11, no. 2, pp. 206–223, Apr. 2002.
[16] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error measurement to structural similarity," IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.
[17] The Video Quality Experts Group. [Online]. Available: http://www.vqeg.org/
[18] A. M. Rohaly et al., "Video quality experts group: Current results and future directions," Proc. SPIE Visual Commun. Image Process., vol. 4067, pp. 742–753, Jun. 2000.
[19] Final Report From the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment, Phase II (2003, Aug.). [Online]. Available: ftp://ftp.its.bldrdoc.gov/dist/ituvidq/frtv2_final_report/VQEGII_Final_Report.pdf
[20] A. Srivastava, A. B. Lee, E. P. Simoncelli, and S.-C. Zhu, "On advances in statistical modeling of natural images," J. Math. Imag. Vis., vol. 18, pp. 17–33, 2003.
[21] E. P. Simoncelli and B. A. Olshausen, "Natural image statistics and neural representation," Annu. Rev. Neurosci., vol. 24, pp. 1193–1216, May 2001.
[22] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3445–3462, Dec. 1993.
[23] A. Said and W. A. Pearlman, "A new, fast, and efficient image codec based on set partitioning in hierarchical trees," IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 3, pp. 243–250, Jun. 1996.
[24] D. S. Taubman and M. W. Marcellin, JPEG2000: Image Compression Fundamentals, Standards, and Practice. Norwell, MA: Kluwer, 2001.
[25] R. W. Buccigrossi and E. P. Simoncelli, "Image compression via joint statistical characterization in the wavelet domain," IEEE Trans. Image Process., vol. 8, no. 12, pp. 1688–1701, Dec. 1999.
[26] M. K. Mihçak, I. Kozintsev, K. Ramchandran, and P. Moulin, "Low-complexity image denoising based on statistical modeling of wavelet coefficients," IEEE Signal Process. Lett., vol. 6, no. 12, pp. 300–303, Dec. 1999.
[27] J. K. Romberg, H. Choi, and R. Baraniuk, "Bayesian tree-structured image modeling using wavelet-domain hidden Markov models," IEEE Trans. Image Process., vol. 10, no. 7, pp. 1056–1068, Jul. 2001.
[28] M. J. Wainwright, E. P. Simoncelli, and A. S. Willsky, "Random cascades on wavelet trees and their use in analyzing and modeling natural images," Appl. Comput. Harmon. Anal., vol. 11, pp. 89–123, 2001.
[29] E. Y. Lam and J. W. Goodman, "A mathematical analysis of the DCT coefficient distributions for images," IEEE Trans. Image Process., vol. 9, no. 10, pp. 1661–1666, Oct. 2000.
[30] H. Choi and R. G. Baraniuk, "Multiscale image segmentation using wavelet-domain hidden Markov models," IEEE Trans. Image Process., vol. 10, no. 9, pp. 1309–1321, Sep. 2001.
[31] J. Portilla and E. P. Simoncelli, "A parametric texture model based on joint statistics of complex wavelet coefficients," Int. J. Comput. Vis., vol. 40, no. 1, pp. 49–71, 2000.
[32] H. R. Sheikh, A. C. Bovik, and L. Cormack, "No-reference quality assessment using natural scene statistics: JPEG2000," IEEE Trans. Image Process., vol. 14, no. 11, pp. 1918–1927, Nov. 2005.
[33] E. P. Simoncelli, "Modeling the joint statistics of images in the wavelet domain," Proc. SPIE, vol. 3813, pp. 188–195, Jul. 1999.
[34] B. A. Wandell, Foundations of Vision. Sinauer, 1995.
[35] N. Damera-Venkata, T. D. Kite, W. S. Geisler, B. L. Evans, and A. C. Bovik, "Image quality assessment based on a degradation model," IEEE Trans. Image Process., vol. 9, no. 4, pp. 636–650, Apr. 2000.
[36] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[37] E. P. Simoncelli and W. T. Freeman, "The steerable pyramid: A flexible architecture for multi-scale derivative computation," in Proc. IEEE Int. Conf. Image Processing, Oct. 1995, pp. 444–447.
[38] V. Strela, J. Portilla, and E. Simoncelli, "Image denoising using a local Gaussian scale mixture model in the wavelet domain," Proc. SPIE, vol. 4119, pp. 363–371, 2000.
[39] A. M. van Dijk, J. B. Martens, and A. B. Watson, "Quality assessment of coded images using numerical category scaling," Proc. SPIE, vol. 2451, pp. 90–101, Mar. 1995.
[40] JNDmetrix Technology (2003). [Online]. Available: http://www.sarnoff.com/productsservices/videovision/jndmetrix/downloads.asp
[41] A. B. Watson and L. Kreslake, "Measurement of visual impairment scales for digital video," Proc. SPIE, Human Vis., Vis. Process., Digit. Display, vol. 4299, pp. 79–89, 2001.

Alan Conrad Bovik (S'80–M'81–SM'89–F'96) received the B.S., M.S., and Ph.D. degrees in electrical and computer engineering from the University of Illinois, Urbana-Champaign, in 1980, 1982, and 1984, respectively.
He is currently the Curry/Cullen Trust Endowed Chair in the Department of Electrical and Computer Engineering, The University of Texas, Austin, where he is the Director of the Laboratory for Image and Video Engineering (LIVE) in the Center for Perceptual Systems. During the Spring of 1992, he held a visiting position in the Division of Applied Sciences, Harvard University, Cambridge, MA. He is the editor/author of the Handbook of Image and Video Processing (New York: Academic, 2000). His research interests include digital video, image processing, and computational aspects of visual perception, and he has published over 350 technical articles in these areas and holds two U.S. patents.
Dr. Bovik was named Distinguished Lecturer of the IEEE Signal Processing Society in 2000, received the IEEE Signal Processing Society Meritorious Service Award in 1998, the IEEE Third Millennium Medal in 2000, and the University of Texas Engineering Foundation Halliburton Award in 1991, and is a two-time Honorable Mention winner of the International Pattern Recognition Society Award for Outstanding Contribution (1988 and 1993). He was named a Dean's Fellow in the College of Engineering in 2001. He is a Fellow of the IEEE and has been involved in numerous professional society activities, including: Board of Governors, IEEE Signal Processing Society, 1996–1998; Editor-in-Chief, IEEE TRANSACTIONS ON IMAGE PROCESSING, 1996–2002; Editorial Board, THE PROCEEDINGS OF THE IEEE, 1998–present; and Founding General Chairman, First IEEE International Conference on Image Processing, held in Austin in November 1994. He is a Registered Professional Engineer in the State of Texas and is a frequent consultant to legal, industrial, and academic institutions.

Hamid Rahim Sheikh (S'93–M'04) received the B.S. degree in electrical engineering from the University of Engineering and Technology, Lahore, Pakistan, in 1998, and the M.S. and Ph.D. degrees from The University of Texas, Austin, in 2001 and 2004, respectively.
His research interests include full-reference and no-reference quality assessment, the application of natural scene statistics models and human visual system models for solving image and video processing problems, and image and video codecs and their embedded implementation.

Gustavo de Veciana (S'88–M'94–SM'01) received the B.S., M.S., and Ph.D. degrees in electrical engineering from the University of California, Berkeley, in 1987, 1990, and 1993, respectively.
He is currently a Professor with the Department of Electrical and Computer Engineering, University of Texas, Austin. His research focuses on the design, analysis, and control of telecommunication networks. His current interests include measurement, modeling, and performance evaluation; wireless and sensor networks; and architectures and algorithms to design reliable computing and network systems.
Dr. de Veciana has been an editor for the IEEE/ACM TRANSACTIONS ON NETWORKING. He is the recipient of the General Motors Foundation Centennial Fellowship in Electrical Engineering and the 1996 National Science Foundation CAREER Award, co-recipient of the IEEE William McCalla Best ICCAD Paper Award for 2000, and co-recipient of the Best Paper in ACM Transactions on Design Automation of Electronic Systems, January 2002–2004.
