Objective Assessment of Multiresolution Image Fusion Algorithms For Context Enhancement in Night Vision: A Comparative Study
Abstract—Comparison of image processing techniques is critically important in deciding which algorithm, method, or metric to use for
enhanced image assessment. Image fusion is a popular choice for various image enhancement applications such as overlay of two
image products, refinement of image resolutions for alignment, and image combination for feature extraction and target recognition.
Since image fusion is used in many geospatial and night vision applications, it is important to understand these techniques and provide
a comparative study of the methods. In this paper, we conduct a comparative study on 12 selected image fusion metrics over six
multiresolution image fusion algorithms for two different fusion schemes and input images with distortion. The analysis can be applied
to different image combination algorithms, image processing methods, and different choices of metrics that are of use to an
image processing expert. The paper relates the results to an image quality measurement based on power spectrum and correlation
analysis and serves as a summary of many contemporary techniques for objective assessment of image fusion algorithms.
Index Terms—Night vision, context enhancement, pixel-level image fusion, multiresolution analysis, objective fusion assessment,
performance metric, image quality.
1 INTRODUCTION
to be verified and collectively evaluated [2], [3]. Although an adaptive fusion strategy is preferred, how the fusion algorithm adapts to different object-to-background situations is still not well understood.

In order to objectively assess the performance of an MIF algorithm, a number of evaluation metrics, either objective or subjective, have been proposed [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17]. The problem is that certain fusion algorithms may work for one application but do not deliver the same performance for another. Each application varies based on the sensors used, the targets of interest, and the environmental conditions. Studies on image fusion lack information that explicitly defines the applicability and feasibility of a specific fusion algorithm for a given application. The same problem exists in research on information fusion performance evaluation, where the difficulty is how to define and validate objective evaluation metrics. Usually, a subjective evaluation is carried out to validate an objective assessment [18]. However, identifying a reliable subjective score needs extensive experiments, which is expensive and cannot cover all possible conditions of interest. Typically, a robust performance model is required to account for the critical image fusion parameters and better assess the trend of image fusion performance quality.

The objective of this Multi-Image Fusion for Context Enhancement (MIF-CE) work is to carry out a comparative study of the objective image fusion assessment metrics and investigate their effectiveness for context enhancement in a night vision application. The MIF-CE study contributes to:

- understanding the relationship between the image fusion metrics,
- demonstrating the effectiveness of these metrics by referencing the image quality measurement, and
- learning the difference between the fusion of heterogeneous and homogeneous images from fusion metrics.

In this study, 12 fusion metrics have been implemented with Matlab^1 and applied to six multiresolution fusion algorithms. The 12 metrics are categorized into four groups:

1. information theory based metrics,
2. image feature based metrics,
3. image structural similarity based metrics, and
4. human perception inspired fusion metrics.

A direct image fusion and a modified image fusion method are considered in the MIF-CE study. Detailed information on the image fusion methods is available in the supplement, which can be found in the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2011.109. The selected image fusion metrics are presented in Section 2, and Section 3 explains the methodology for the MIF-CE comparative study. Experimental results are presented in Section 4. Discussions and conclusions can be found in Sections 5 and 6, respectively.

1. The Matlab implementation is available upon request.

2 ALGORITHMS FOR OBJECTIVE IMAGE FUSION PERFORMANCE ASSESSMENT

Two types of fusion schemes were considered in the MIF-CE study. The first one is direct (heterogeneous) image fusion of IR and visible images with a multiresolution approach at the pixel level. The other method is a modified (homogeneous) image fusion as described in [19], where the visible image enhanced from the IR image is fused with the original visible image. (Please refer to the supplement, which can be found in the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2011.109.) Utilizing the fusion of heterogeneous and homogeneous images will help to understand how the fusion metrics perform over various applications. For the rest of the paper, we will use "VI-IR direct fusion" and "VI-EVI modified fusion" to refer to the two image fusion schemes.

The assessment of a fused image can be carried out in two different ways. The first method is to compare the fusion result with a known reference image (or ground truth). However, a reference image is not always available in a practical application. The second implementation, a blind or nonreferenced assessment, is therefore generally preferred, and this paper focuses on the blind assessment. Different approaches have been proposed for blind assessment so far [4], [6], [7], [10], [14], [15], [20], [21], [22], [23], [24]. In this study, the 12 most representative metrics are used for the comparative study, and each metric is briefly described below.

2.1 Information Theory-Based Metrics

2.1.1 Normalized Mutual Information (Q_MI)

Mutual information (MI) is a quantitative measure of the mutual dependence of two variables. The definition of mutual information for two discrete random variables U and V is

MI(U, V) = \sum_{v \in V} \sum_{u \in U} p(u, v) \log_2 \frac{p(u, v)}{p(u) p(v)},   (1)

where p(u, v) is the joint probability distribution function of U and V, and p(u) and p(v) are the marginal probability distribution functions of U and V, respectively. MI quantifies the distance between the joint distribution of U and V, i.e., p(u, v), and the joint distribution when U and V are independent, i.e., p(u)p(v). Mutual information can be equivalently expressed with the joint entropy H(U, V) and the marginal entropies H(U) and H(V) of the two variables U and V as

MI(U, V) = H(U) + H(V) - H(U, V),   (2)

where

H(U) = -\sum_{u} p(u) \log_2 p(u),
H(V) = -\sum_{v} p(v) \log_2 p(v),
H(U, V) = -\sum_{u, v} p(u, v) \log_2 p(u, v).

Qu et al. used the summation of the MI between the fused image F(i, j) and the two input images, A(i, j) and B(i, j), to assess the fusion performance [4].
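For illustration, a minimal Python/NumPy sketch of (1) and (2) is given below; it is not the authors' Matlab implementation. The 256-bin joint histogram is an assumption of this sketch, and q_mi computes the simple summation form MI(A, F) + MI(B, F) used by Qu et al. [4]; the normalized variants further scale these terms by the corresponding marginal entropies.

    import numpy as np

    def mutual_information(u, v, bins=256):
        # MI(U, V) per (1): estimate p(u, v) from a joint histogram.
        joint, _, _ = np.histogram2d(u.ravel(), v.ravel(), bins=bins)
        p_uv = joint / joint.sum()                 # joint pdf p(u, v)
        p_u = p_uv.sum(axis=1, keepdims=True)      # marginal p(u)
        p_v = p_uv.sum(axis=0, keepdims=True)      # marginal p(v)
        nz = p_uv > 0                              # skip empty cells: 0*log(0) = 0
        return float((p_uv[nz] * np.log2(p_uv[nz] / (p_u @ p_v)[nz])).sum())

    def q_mi(a, b, f, bins=256):
        # Summation form of Qu et al. [4]: MI between each input and the fused image.
        return mutual_information(a, f, bins) + mutual_information(b, f, bins)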
Q_g^{AF}(i, j) = \frac{\Gamma_g}{1 + e^{\kappa_g (G^{AF}(i, j) - \sigma_g)}},   (18)

The final assessment is obtained from the weighted average of the edge information preservation values:

Q_G = \frac{\sum_{n=1}^{N} \sum_{m=1}^{M} [Q^{AF}(i, j) w^A(i, j) + Q^{BF}(i, j) w^B(i, j)]}{\sum_{n=1}^{N} \sum_{m=1}^{M} (w^A(i, j) + w^B(i, j))},   (21)

where the weighting coefficients are defined as w^A(i, j) = [g_A(i, j)]^L and w^B(i, j) = [g_B(i, j)]^L, respectively. Here, L is a constant value.

Q_M = \prod_{s=1}^{N} \left( Q_s^{AB/F} \right)^{\alpha_s},   (25)

RF = \sqrt{\frac{1}{MN} \sum_{i=1}^{N} \sum_{j=2}^{M} [I(i, j) - I(i, j-1)]^2},   (27)

CF = \sqrt{\frac{1}{MN} \sum_{j=1}^{N} \sum_{i=2}^{M} [I(i, j) - I(i-1, j)]^2},   (28)

MDF = \sqrt{w_d \frac{1}{MN} \sum_{i=2}^{M} \sum_{j=2}^{N} [I(i, j) - I(i-1, j-1)]^2}.   (29)
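The spatial frequencies of (27)-(29) translate directly into array differences; a rough NumPy sketch follows. Note that np.mean divides by the number of difference terms rather than exactly MN (a negligible boundary discrepancy), and the diagonal weight value w_d = 1/sqrt(2) is an assumption of this sketch, not a value fixed by the text.

    import numpy as np

    def row_frequency(img):
        # RF per (27): RMS of horizontal first differences.
        i = img.astype(np.float64)
        return np.sqrt(np.mean((i[:, 1:] - i[:, :-1]) ** 2))

    def column_frequency(img):
        # CF per (28): RMS of vertical first differences.
        i = img.astype(np.float64)
        return np.sqrt(np.mean((i[1:, :] - i[:-1, :]) ** 2))

    def main_diagonal_frequency(img, w_d=1.0 / np.sqrt(2)):
        # MDF per (29): RMS of main-diagonal first differences, weighted by w_d.
        i = img.astype(np.float64)
        return np.sqrt(w_d * np.mean((i[1:, 1:] - i[:-1, :-1]) ** 2))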
Q_P = (P_p)^{\alpha} (P_M)^{\beta} (P_m)^{\gamma},   (33)

where p, M, and m refer to phase congruency (p) and the maximum and minimum moments, respectively, and

P_p = \max(C^p_{AF}, C^p_{BF}, C^p_{SF}),
P_M = \max(C^M_{AF}, C^M_{BF}, C^M_{SF}),
P_m = \max(C^m_{AF}, C^m_{BF}, C^m_{SF}).

Herein, C^k_{xy}, k \in \{p, M, m\}, stands for the correlation coefficient between two sets x and y:

C^k_{xy} = \frac{\sigma^k_{xy} + C}{\sigma^k_x \sigma^k_y + C},   (34)

\sigma_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}).   (35)

The suffixes A, B, F, and S correspond to the two inputs, the fused image, and the maximum-select map. The exponential parameters \alpha, \beta, and \gamma can be adjusted based on the importance of the three components [7].

2.3 Image Structural Similarity-Based Metrics

The image similarity measurement is based on the evidence that the human visual system is highly adapted to structural information, so a measurement of the loss of structural information can provide a good approximation of the perceived image distortion. Wang proposed a structural similarity index measure (SSIM) for images A and B defined as [26]

SSIM(A, B) = [l(A, B)]^{\alpha} [c(A, B)]^{\beta} [s(A, B)]^{\gamma} = \left( \frac{2\mu_A \mu_B + C_1}{\mu_A^2 + \mu_B^2 + C_1} \right)^{\alpha} \left( \frac{2\sigma_A \sigma_B + C_2}{\sigma_A^2 + \sigma_B^2 + C_2} \right)^{\beta} \left( \frac{\sigma_{AB} + C_3}{\sigma_A \sigma_B + C_3} \right)^{\gamma},   (36)

where \mu_A and \mu_B are the average values of images A(i, j) and B(i, j), and \sigma_A, \sigma_B, and \sigma_{AB} are the variances and the covariance, respectively [26]. l(A, B), c(A, B), and s(A, B) in (36) are the luminance, contrast, and correlation components, respectively. The parameters \alpha, \beta, and \gamma are used to adjust the relative importance of the three components. The constant values C_1, C_2, and C_3 are defined to avoid instability when the denominators are very close to zero. By setting \alpha = \beta = \gamma = 1 and C_3 = C_2 / 2, (36) becomes

SSIM(A, B) = \frac{(2\mu_A \mu_B + C_1)(2\sigma_{AB} + C_2)}{(\mu_A^2 + \mu_B^2 + C_1)(\sigma_A^2 + \sigma_B^2 + C_2)}.   (37)

A previous version of this index is known as the universal image quality index (UIQI) and is written as [27]

Q(A, B) = \frac{\sigma_{AB}}{\sigma_A \sigma_B} \cdot \frac{2\mu_A \mu_B}{\mu_A^2 + \mu_B^2} \cdot \frac{2\sigma_A \sigma_B}{\sigma_A^2 + \sigma_B^2} = \frac{4\sigma_{AB} \mu_A \mu_B}{(\sigma_A^2 + \sigma_B^2)(\mu_A^2 + \mu_B^2)}.   (38)

The following image structural similarity fusion metrics are based on these two definitions. The calculation of the quality indices is based on a sliding window approach, which moves from top-left to bottom-right. The SSIM and Q values can be calculated locally, summed, and averaged to get the overall index. See [28] for the detailed implementation of the SSIM algorithm.

2.3.1 Piella's Metric (Q_S)

Piella and Heijmans defined three fusion quality indexes based on Wang's UIQI method [5]. Assume the local Q(A, B|w) value is calculated in a sliding window w. There are

Q_S = \frac{1}{|W|} \sum_{w \in W} [\lambda(w) Q_0(A, F|w) + (1 - \lambda(w)) Q_0(B, F|w)],   (39)

Q_W = \sum_{w \in W} c(w) [\lambda(w) Q_0(A, F|w) + (1 - \lambda(w)) Q_0(B, F|w)],   (40)

Q_E = Q_W(A, B, F) \cdot Q_W(A', B', F')^{\alpha},   (41)

where the weight \lambda(w) is defined as

\lambda(w) = \frac{s(A|w)}{s(A|w) + s(B|w)}.   (42)

Herein, s(A|w) is a local measure of image salience. In Piella's implementation, s(A|w) and s(B|w) are the variances of images A and B within the window w, respectively. The coefficient c(w) in (40) is [5]

c(w) = \frac{\max[s(A|w), s(B|w)]}{\sum_{w' \in W} \max[s(A|w'), s(B|w')]}.   (43)

In (41), Q_W(A', B', F') is the Q_W calculated with the edge images, i.e., A', B', and F', and \alpha is a manually adjustable parameter that weights the edge-dependent information.

2.3.2 Cvejic's Metric (Q_C)

Cvejic et al. defined a performance measure as [21]

Q_C = \sum_{w \in W} [\text{sim}(A, B, F|w) Q(A, F|w) + (1 - \text{sim}(A, B, F|w)) Q(B, F|w)],   (44)

where the function sim(A, B, F|w) is [21]

\text{sim}(A, B, F|w) = \begin{cases} 0, & \text{if } \frac{\sigma_{AF}}{\sigma_{AF} + \sigma_{BF}} < 0, \\ \frac{\sigma_{AF}}{\sigma_{AF} + \sigma_{BF}}, & \text{if } 0 \le \frac{\sigma_{AF}}{\sigma_{AF} + \sigma_{BF}} \le 1, \\ 1, & \text{if } \frac{\sigma_{AF}}{\sigma_{AF} + \sigma_{BF}} > 1. \end{cases}   (45)

The weighting factor depends on the similarity in the spatial domain between the input images and the fused image. The higher the similarity between an input and the fused image, the larger the corresponding weighting factor.
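To make the sliding-window computation concrete, the sketch below evaluates a simplified Q_S of (39): Q_0 is the UIQI of (38) computed per window, and lambda(w) is the variance-based salience ratio of (42). The 8x8 non-overlapping windows are an arbitrary choice of this sketch; Piella's actual implementation details (window size, overlap) may differ.

    import numpy as np

    def uiqi(x, y, eps=1e-12):
        # Q0 of (38) on one window: 4*cov*mu_x*mu_y / ((var_x+var_y)(mu_x^2+mu_y^2)).
        x = x.astype(np.float64)
        y = y.astype(np.float64)
        mx, my = x.mean(), y.mean()
        vx, vy = x.var(), y.var()
        cxy = ((x - mx) * (y - my)).mean()
        return 4.0 * cxy * mx * my / ((vx + vy) * (mx ** 2 + my ** 2) + eps)

    def q_s(a, b, f, win=8):
        # QS per (39): average of lambda(w)*Q0(A,F|w) + (1-lambda(w))*Q0(B,F|w).
        vals = []
        for r in range(0, a.shape[0] - win + 1, win):
            for c in range(0, a.shape[1] - win + 1, win):
                wa, wb, wf = (im[r:r + win, c:c + win] for im in (a, b, f))
                sa, sb = wa.var(), wb.var()    # local salience s(A|w), s(B|w)
                lam = sa / (sa + sb) if sa + sb > 0 else 0.5
                vals.append(lam * uiqi(wa, wf) + (1.0 - lam) * uiqi(wb, wf))
        return float(np.mean(vals))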
2.3.3 Yang's Metric (Q_Y)

Yang et al. proposed another way to use SSIM for fusion assessment [11]:
Q_Y = \begin{cases} \lambda(w)\,\text{SSIM}(A, F|w) + (1 - \lambda(w))\,\text{SSIM}(B, F|w), & \text{if } \text{SSIM}(A, B|w) \ge 0.75, \\ \max\{\text{SSIM}(A, F|w), \text{SSIM}(B, F|w)\}, & \text{if } \text{SSIM}(A, B|w) < 0.75. \end{cases}   (46)

The local weight \lambda(w) is as defined in (42).

2.4 Human Perception Inspired Fusion Metrics

2.4.1 Chen-Varshney Metric (Q_CV)

The Chen-Varshney metric consists of five steps [24].

2.4.2 Chen-Blum Metric (Q_CB)

- Contrast preservation calculation: The masked contrast map for input image I_A(i, j) is calculated as

C'_A = \frac{t (C_A)^p}{h (C_A)^q + Z},   (51)

where t, h, p, q, and Z are real scalar parameters that determine the shape of the nonlinearity of the masking function [23].

- Saliency map generation: The saliency map for I_A(i, j) is defined as in [23].
Fig. 4. Fusion metric values for VI-IR direct fusion. Numbers 1 to 6 refer to the fusion algorithms LAP, GRAD, RoLP, DB4, SIDW, and STEER, respectively.

Fig. 2. The correlation matrix of fusion metrics for the VI-EVI modified fusion scheme.

Other metrics do not have the same correlation when applied to the results obtained by the two different fusion schemes. However, the fusion metrics with a higher correlation generally come from the same category. The correlation analysis reveals the similarities between the same types of fusion metrics.

A dendrogram plot can be created from the similarity matrix with a tool called "DendroUPGMA" [37], [38]. The dendrogram tool transforms similarity coefficients into distances and clusters the coefficients using the unweighted pair group method with arithmetic mean (UPGMA) algorithm. The dendrogram plots are given in Fig. 3. The local topological relationships are identified in order of similarity (Kendall correlation), and the phylogenetic tree is built in a stepwise manner. The branch length represents the correlation between the fusion metrics, and the difference between the two fusion schemes can be observed. The visible and infrared images employ different intensity tables, which makes the joint gray-level histogram in VI-IR fusion quite different from that of the VI-EVI fusion. The information theory-based metrics use such information (e.g., the joint histogram) to calculate the metric values. This partially explains the different tree structures. However, this may not be enough to justify the shifting of Q_TE and should be further investigated. Besides, image contents may have an impact on the metric value as well. The dendrogram is meaningful as the assessment rates the fusion algorithms based on a relative value. The fusion metrics are clustered based on similarity rather than their categories, since the image structural similarity-based metrics may also depend on image features.
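The clustering step is easy to reproduce offline. Assuming a symmetric metric-by-metric correlation matrix has already been computed, the SciPy sketch below converts correlations into distances and applies UPGMA (SciPy's "average" linkage), which mirrors what the DendroUPGMA tool does; the (1 - correlation) distance is an assumption of this sketch.

    import numpy as np
    from scipy.cluster.hierarchy import dendrogram, linkage
    from scipy.spatial.distance import squareform

    def metric_dendrogram(corr, labels):
        # UPGMA tree over fusion metrics from their correlation matrix.
        dist = 1.0 - corr                          # similarity -> distance
        np.fill_diagonal(dist, 0.0)
        condensed = squareform(dist, checks=False)
        z = linkage(condensed, method="average")   # UPGMA
        return dendrogram(z, labels=labels, no_plot=True)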
TABLE 1
Algorithm Rank for Fusion Schemes

TABLE 2
The Variance of Fusion Metrics across Images

4.2.2 The Consistency of Assessment Metrics

An intuitive illustration of how the fusion metric values change with the fusion algorithms can be found in Figs. 4 and 5. All the metrics assign a larger value to the better fusion result. To better understand the consistency of one metric with the others, the ranks of the fusion results/algorithms, as integer numbers from 1 to 6, are given in Table 1. The results from the Borda count method are listed in the last column of each table. A larger Borda count number indicates a better result.

For VI-IR direct fusion, SIDW is ranked the best algorithm, and next are the LAP and GRAD algorithms. DB4 and STEER are given the number 2.5, which means they are equal and fall between ranks 2 and 3. The last algorithm is RoLP. Compared with the Borda count result, the metrics Q_C, Q_P, and Q_Y show a reasonable consistency. In the results of VI-EVI modified fusion, the Borda count ranks STEER second and LAP third. The ranks for DB4 and GRAD are three and two, respectively. RoLP is again ranked last. Among all the metrics, Q_Y and Q_M show a perfect consistency with the Borda count result, while Q_G and Q_P give a reasonably consistent result.

To understand the performance of a fusion metric across different inputs, the variance is calculated and listed in Table 2. As far as the fusion algorithm is concerned, an ideal fusion metric should not change with the contents of the input images, because the fusion metric evaluates the fusion algorithms rather than the image contents. A lower variance indicates a good stability of a fusion metric for a specific fusion algorithm. For example, the metric Q_NCIE is most stable for the fusion algorithm STEER in the VI-IR direct fusion, while, in the VI-EVI modified fusion, Q_NCIE is again the most stable for the algorithm GRAD.
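The Borda count used in Table 1 can be reproduced mechanically: each metric ranks the six algorithms, each rank is converted into points, and the points are summed per algorithm. The sketch below assumes a metrics-by-algorithms array in which a larger value means a better fusion result; ties receive averaged ranks, consistent with the fractional rank 2.5 above.

    import numpy as np
    from scipy.stats import rankdata

    def borda_count(scores):
        # scores: (n_metrics, n_algorithms) array; larger value = better result.
        n_alg = scores.shape[1]
        points = np.zeros(n_alg)
        for row in scores:
            ranks = rankdata(-row)     # rank 1 = best; ties -> averaged ranks
            points += n_alg - ranks    # best of 6 earns 5 points, worst earns 0
        return points                  # larger Borda count = better algorithm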
4.2.3 The Impact of Image Distortion

Fig. 6. The distorted UN camp images assessed by IQM.

The input UN camp images are distorted by additive white noise and a blurring operation, respectively, in the experiment. The visible and IR images are evaluated with the IQM [35] and plotted in Fig. 6. The image quality degrades with the variance of the Gaussian white noise and with the standard deviation of the Gaussian filter. However, the IQM does not discriminate between severely degraded images, which does not necessarily mean those images are of the same quality. Another observation is that the IR image is of lower quality than the corresponding visible image in terms of IQM, and the IR image is more sensitive to the blurring operation.

We first look at how the IQM changes with the image quality. Fig. 7 indicates that the quality of the fused images degrades with the decrease in input image quality. When the Gaussian additive noise is severe, all the fused images are of almost the same quality, regardless of the fusion scheme. For blurred input images, the various fusion algorithms generate fused images of different qualities in terms of IQM. The VI-EVI modified fusion has a relatively higher quality, which means the degradation of the infrared image has a larger impact on the fusion result.

Fig. 8. The impact of Gaussian additive noise on the fusion metrics for the VI-IR direct fusion scheme.

Fig. 9. The impact of Gaussian additive noise on the fusion metrics for the VI-EVI modified fusion scheme.

The impact of Gaussian additive white noise on the fusion metrics is illustrated in Figs. 8 and 9, respectively. A theoretical analysis of correlation-based quality measures for weighted averaging image fusion was reported in [39]. The computation of a new diffuse prior monotonic likelihood ratio was further proposed in [40]. For MIF, such a theoretical analysis has not been reported. For the VI-IR direct fusion scheme, the metrics Q_G, Q_M, Q_S, Q_C, Q_Y, and Q_CB show a general decreasing trend with the degradation of image quality. An inflexion appears around the variance value of 0.01. In contrast, the values of the metrics Q_MI, Q_TE, Q_NCIE, and Q_P increase when the variance goes beyond 0.01 and the noise becomes significant, because these metrics count the additive noise as part of the input "features" or "information." Q_SF demonstrates a relative stability, since the Q_SF metric considers four directional gradients, which are not greatly affected by additive noise. The last metric, Q_CV, decreases at the beginning and increases around 0.04. In the Q_CV metric, a contrast sensitivity filtering (CSF) is applied to the input images. The CSF carries out a band-pass filtering operation, which may suppress the noise to some extent (depending on the specific CSF operation).

One difference between the direct and the modified image fusion schemes is the input image. In VI-IR direct fusion, an infrared image is input with a visible image, while an enhanced visible image is used instead in the VI-EVI modified fusion. Thus, the changes of the results in Fig. 9 are subject to one of the input images. Most of the metric values decrease except Q_TE and Q_CV, while Q_SF does not show any significant change. Among all the metrics, Q_MI, Q_NCIE, Q_TE, Q_P, and Q_CV are subject to such changes. As the IR and visible images have different intensity definitions, the three information theory-based approaches, which calculate the fusion metric with pixel values, reflect the difference. The phase congruency-based metric Q_P incorporates a feature extraction function, which is sensitive to noise. While noise may have an impact on the fusion metric trend, it is not clear how phase congruency changes with noise.

The impact of the blurring operation is illustrated in Figs. 10 and 11, respectively. As infrared imaging measures the emitted energy of an object, the different regions in an IR image indicate the variance in temperature. IR images do not show as sharp an edge or boundary as a visible image. The multiresolution analysis represents image features, like edges and boundaries, with larger coefficients. The blurring operation does not greatly change the temperature regions. Thus, in the VI-IR direct fusion, the fusion metrics give a relatively stable value for each fusion algorithm. The initial blurring may also serve as a low-pass filtering operation on the IR image, and this may lead to a "better" result in terms of some metrics. For the VI-EVI modified fusion, all the metrics except Q_SF and Q_CV decrease when the standard deviation of the Gaussian filter increases.

A fusion metric can be affected by the quality change (i.e., pixel value, numerical rounding, etc.) at a certain step of its calculation. Thus, different metrics may exhibit different sensitivities to such changes. It should also be noted that the fusion algorithms are subject to the quality of the input images as well.
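The distortion experiment reduces to a simple protocol: degrade one input, re-run the fusion, and re-evaluate the metric. The sketch below illustrates that loop under stated assumptions; fuse and metric are placeholders for any of the six algorithms and 12 metrics, images are assumed scaled to [0, 1], and the variance sweep only loosely matches the ranges discussed above.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def add_gaussian_noise(img, var, rng=None):
        # Additive white Gaussian noise with the given variance.
        rng = np.random.default_rng(0) if rng is None else rng
        return np.clip(img + rng.normal(0.0, np.sqrt(var), img.shape), 0.0, 1.0)

    def distortion_sweep(vi, ir, fuse, metric, variances=(0.001, 0.01, 0.04)):
        # Metric value as the IR input degrades; fuse(a, b) -> f, metric(a, b, f) -> float.
        results = []
        for var in variances:
            ir_d = add_gaussian_noise(ir, var)   # degrade one input only
            results.append(metric(vi, ir_d, fuse(vi, ir_d)))
        return results

    # The blurring counterpart replaces add_gaussian_noise with
    # gaussian_filter(ir, sigma) and sweeps the filter's standard deviation.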
Fig. 10. The impact of the blurring operation on the fusion metrics for the VI-IR direct fusion scheme.

Some fusion algorithms are sensitive to the quality change and some are not, which is beyond the discussion of this paper. Readers can refer to the experimental results for further particulars.

To figure out how the fusion metrics are related to the IQM, the correlation is calculated and given in Fig. 12. No obvious correlation is observed for most metrics, although Q_CV obtained a larger correlation value. The lack of IQM-to-MIF metric correlation is because most fusion metrics measure how the input images are fused together rather than the quality of the fused image. Note that when the input images are of significantly different quality, we found that a fusion metric may lead to a confused judgment.

5 DISCUSSION

What does a fusion metric tell? If there are two pairs of multisensor images fused with the same algorithm, the performance of the algorithm should be the same, but the fusion metric values are not equal in most cases. Fusion metrics are calculated differently, measure various contextual details, and provide a relative value for comparison. From the experiments, we find that all fusion metrics considered in this study vary with the image contents. An MIF metric value is only meaningful in reference to the MIF goals when evaluating a specific image pair. For example, mutual information gives a coarse estimate of the similarity between images and has been used for both image registration and image fusion. An MIF metric value is only a relative ranking of various fusion algorithms for a specific application. In general, we desire that the image fusion metric remain descriptive and consistent for algorithm selection over various sensors and environmental conditions.

The role of image fusion metrics is important for applications, user acceptance, and image fusion algorithm improvements. For a specific application and a specific fusion algorithm, should the fusion metric keep a constant value or vary with the quality of the input images? If the fusion metric only considers how much information is transferred from the inputs to the fused result, the quality of the input images will not have an impact on the measure, as the metric should only reflect the capability of the fusion algorithms. However, a change of image contents (e.g., degradation) may change the amount of information transfer and thus change the fusion metric value. For example, if the two input images are blank, it does not mean a failure of the fusion algorithm. However, the night vision application requires that a fused image be of a "good" quality as related to some standard. According to the "National Imagery Interpretability Rating Scale" [41], the different rating scales define the capability to identify varied objects from the image. Therefore, the quality of the fused image needs to be considered.
Fig. 11. The impact of the blurring operation on the fusion metrics for the VI-EVI modified fusion scheme.

Fig. 12. The correlation between fusion metrics and image quality measurement.

There are a host of metrics and an equivalent variation in how to measure the parameters that compose the metrics. An obvious relation between the image quality measurement and the image fusion metrics considered in this study was not observed. However, each image would have a quality rating, and fusing the image qualities does not compute an image fusion quality metric. In this study, we assume that the image fusion results in a composite image from which the image fusion quality metric is determined. As stated, the "enhancement" comes from the fused images (mostly of different modalities), and the motivation is to improve night vision imaging for improved image analysis.

The value of a metric varies with its definition and implementation. It could be in the range of (-\infty, +\infty), [-1, 1], [0, 1], or [0, +\infty), but none of these gives an absolute measurement.
In other words, how the fusion performance is distributed in the given range is not clear. Given two metric values 0.99 and 0.96, for instance, we do not know how significant the difference (0.03) is between these two results for a specific application. Thus, the normalization of the metric value to the range [0, 1] does not make any sense, because the metric value is a relative result.

The fusion metrics can be applied to different image modalities. However, their use also depends on what is expected from the fused image. A single fusion metric is not sufficient to satisfy all the requirements of fusion applications like multifocus imaging, surveillance, and medical imaging. In multifocus imaging, the input images are of the same modality, but for the night vision application, heterogeneous images are fused to highlight the "hot" human beings against the "cool" background. This MIF-CE study investigates the fusion of heterogeneous images (VI-IR direct fusion) and homogeneous images (VI-EVI modified fusion). According to Figs. 1 and 2 and the results in Table 1, most fusion metrics rate the fusion algorithms differently in the two fusion schemes.

The choice of a fusion algorithm and fusion metric is application dependent. The application drives the requirements from which metric selection follows. As far as the night vision context enhancement application is concerned, the VI-EVI modified fusion scheme creates a fused image more suitable for human perception. In addition to the fusion metrics presented in this paper, a successful fusion of the IR and visible images can also be learned from the segmentation of the input and fused images [42]. To find a representative measure from the multiple fusion metrics, a hierarchical cluster analysis can be applied [43].

6 CONCLUSION

In this paper, we described 12 metrics used to assess the fusion performance of multimodal images. These fusion metrics are categorized into four groups:

1. information theory based metrics,
2. image feature based metrics,
3. image structural similarity based metrics, and
4. human perception inspired fusion metrics.

With the infrared and visible image pairs from the night vision application of context enhancement, we investigated these fusion metrics over six multiresolution pixel-level image fusion algorithms for two fusion schemes: the VI-IR direct image fusion, which fuses two heterogeneous sensor images, and the VI-EVI modified image fusion, which fuses two homogeneous sensor images. The impact of image quality on the fusion metrics was studied by applying Gaussian additive white noise and a blurring operation to the input images. Various comparative approaches were conducted, such as correlation, Borda count, and IQM metric-to-image quality relations. In addition, an image quality measurement based on the image power spectrum was computed as a reference and compared with the fusion metrics. The fusion metrics demonstrate their diversity due to the different mechanisms in their implementations.

From the experiments, we understand that the fusion metrics considered in this study only provide a relative assessment (value) of how the input images are fused together, rather than of the quality of the fused image. If a fusion metric only counts how the information is transferred to a fused image from the inputs, the image quality should not affect the metric value. Meanwhile, the metric value varies with image contents and is subject to distortions like additive noise and blurring. With this knowledge, a fusion metric can be selected based on application requirements, which is paramount for multisensor image fusion, or a representative measurement can be derived from multiple metrics with a hierarchical cluster analysis. We demonstrate a possible method to select image fusion metrics derived from two phenograms in the supplement, which can be found in the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPAMI.2011.109.

This study considers a night vision application. The metrics Q_G (gradient-based fusion metric), Q_C (Cvejic's metric), and Q_Y (Yang's metric)^2 are suggested for the VI-IR direct image fusion, while the metrics Q_M (multiscale metric) and Q_Y are recommended for the VI-EVI modified image fusion. When other applications are considered, the use of a certain fusion metric will depend on the operational requirements. A more reliable and universal fusion metric is expected from future research.

2. Q_C and Q_Y are similar.

ACKNOWLEDGMENTS

The images used in the experiments were obtained from http://www.imagefusion.org. The authors wish to pay tribute to the contributors of these images for their valuable support to image fusion research.

REFERENCES

[1] X.P.V. Maldague, Theory and Practice of Infrared Technology for Nondestructive Testing, K. Chang, ed. John Wiley and Sons, Inc., 2001.
[2] R.S. Blum and Z. Liu, eds., Multi-Sensor Image Fusion and Its Applications. Taylor and Francis, 2005.
[3] G. Piella, "A General Framework for Multiresolution Image Fusion: From Pixels to Regions," Information Fusion, vol. 4, no. 4, pp. 259-280, Dec. 2003.
[4] G. Qu, D. Zhang, and P. Yan, "Information Measure for Performance of Image Fusion," Electronics Letters, vol. 38, no. 7, pp. 313-315, 2002.
[5] G. Piella and H. Heijmans, "A New Quality Metric for Image Fusion," Proc. Int'l Conf. Image Processing, 2003.
[6] C.S. Xydeas and V. Petrovic, "Objective Image Fusion Performance Measure," Electronics Letters, vol. 36, no. 4, pp. 308-309, 2000.
[7] J. Zhao, R. Laganiere, and Z. Liu, "Performance Assessment of Combinative Pixel-Level Image Fusion Based on an Absolute Feature Measurement," Int'l J. Innovative Computing, Information and Control, vol. 3, no. 6(A), pp. 1433-1447, Dec. 2007.
[8] Y. Zheng, E.A. Essock, B.C. Hansen, and A.M. Haun, "A New Metric Based on Extended Spatial Frequency and Its Application to DWT Based Fusion Algorithms," Information Fusion, vol. 8, no. 2, pp. 177-192, Apr. 2007.
[9] Y. Zheng, Z. Qin, L. Shao, and X. Hou, "A Novel Objective Image Quality Metric for Image Fusion Based on Renyi Entropy," Information Technology J., vol. 7, no. 6, pp. 930-935, 2008.
[10] N. Cvejic, C.N. Canagarajah, and D.R. Bull, "Image Fusion Metric Based on Mutual Information and Tsallis Entropy," Electronics Letters, vol. 42, no. 11, pp. 626-627, May 2006.
[11] C. Yang, J. Zhang, X. Wang, and X. Liu, "A Novel Similarity Based Quality Metric for Image Fusion," Information Fusion, vol. 9, pp. 156-160, 2008.
[12] Q. Wang, Y. Shen, and J. Jin, "Performance Evaluation of Image Fusion Techniques," Image Fusion: Algorithms and Applications, ch. 19, T. Stathaki, ed., pp. 469-492, Elsevier, 2008.
[13] M. Hossny, S. Nahavandi, and D. Creighton, "A Quadtree Driven Image Fusion Quality Assessment," Proc. Fifth IEEE Int'l Conf. Industrial Informatics, vol. 1, pp. 419-424, July 2007.
[14] M. Hossny, S. Nahavandi, and D. Creighton, "Comments on 'Information Measure for Performance of Image Fusion'," Electronics Letters, vol. 44, no. 18, pp. 1066-1067, Aug. 2008.
[15] Z. Liu, D.S. Forsyth, and R. Laganiere, "A Feature-Based Metric for the Quantitative Evaluation of Pixel-Level Image Fusion," Computer Vision and Image Understanding, vol. 109, no. 1, pp. 56-68, Jan. 2008.
[16] E. Blasch, X. Li, G. Chen, and W. Li, "Image Quality Assessment for Performance Evaluation of Image Fusion," Proc. 11th Int'l Conf. Information Fusion, June/July 2008.
[17] M. Hossny, S. Nahavandi, D. Creighton, and A. Bhatti, "Image Fusion Performance Metric Based on Mutual Information and Entropy Driven Quadtree Decomposition," Electronics Letters, vol. 46, no. 18, pp. 1266-1268, Sept. 2010.
[18] V. Petrovic, "Subjective Tests for Image Fusion Evaluation and Objective Metric Validation," Information Fusion, vol. 8, no. 2, pp. 208-216, 2007.
[19] Z. Liu and R. Laganiere, "Context Enhancement through Infrared Vision: A Modified Fusion Scheme," Signal, Image and Video Processing, vol. 1, no. 4, pp. 293-301, Oct. 2007.
[20] R. Nava, G. Cristóbal, and B. Escalante-Ramírez, "Mutual Information Improves Image Fusion Quality Assessments," SPIE Newsroom, http://spie.org/documents/Newsroom/Imported/0824/0824-2007-08-30.pdf, Sept. 2007.
[21] N. Cvejic, A. Loza, D. Bull, and N. Canagarajah, "A Similarity Metric for Assessment of Image Fusion Algorithms," Int'l J. Signal Processing, vol. 2, no. 3, pp. 178-182, 2005.
[22] G. Piella, "New Quality Measures for Image Fusion," Proc. Int'l Conf. Information Fusion, 2004.
[23] Y. Chen and R.S. Blum, "A New Automated Quality Assessment Algorithm for Image Fusion," Image and Vision Computing, vol. 27, pp. 1421-1432, 2009.
[24] H. Chen and P.K. Varshney, "A Human Perception Inspired Quality Metric for Image Fusion Based on Regional Information," Information Fusion, vol. 8, pp. 193-207, 2007.
[25] P. Wang and B. Liu, "A Novel Image Fusion Metric Based on Multi-Scale Analysis," Proc. IEEE Int'l Conf. Signal Processing, pp. 965-968, 2008.
[26] Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli, "Image Quality Assessment: From Error Measurement to Structural Similarity," IEEE Trans. Image Processing, vol. 13, no. 1, pp. 1-14, 2004.
[27] Z. Wang and A.C. Bovik, "A Universal Image Quality Index," IEEE Signal Processing Letters, vol. 9, no. 3, pp. 81-84, Mar. 2002.
[28] "Dr. Zhou Wang's Website," http://www.ece.uwaterloo.ca/~z70wang/research/ssim/, Aug. 2009.
[29] M. Hossny and S. Nahavandi, "Image Fusion Algorithms and Metrics Duality Index," Proc. 16th IEEE Int'l Conf. Image Processing, pp. 2169-2172, 2009.
[30] G.A. Miller, "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information," The Psychological Rev., vol. 63, pp. 81-97, 1956.
[31] L.M. Kaplan and R.S. Blum, "Evaluation of Image Quality Features via Monotonic Analysis," technical report, US Army Research Laboratory, Adelphi, MD 20783, 2008.
[32] L.M. Kaplan, S.D. Burks, R.S. Blum, R.K. Moore, and Q. Nguyen, "Analysis of Image Quality for Image Fusion via Monotonic Correlation," IEEE J. Selected Topics in Signal Processing, vol. 3, no. 2, pp. 222-235, Apr. 2009.
[33] M.G. Kendall, "A New Measure of Rank Correlation," Biometrika, vol. 30, nos. 1/2, pp. 81-93, http://biomet.oxfordjournals.org/content/30/1-2/81.short, 1938.
[34] T.K. Ho, J.J. Hull, and S.N. Srihari, "Decision Combination in Multiple Classifier Systems," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66-75, Jan. 1994.
[35] N.B. Nill and B. Bouzas, "Objective Image Quality Measure Derived from Digital Image Power Spectra," Optical Eng., vol. 31, no. 4, pp. 813-825, 1992.
[36] MITRE, "Image Quality Evaluation," http://www.mitre.org/tech/mtf/, 2011.
[37] S. Garcia-Vallve, J. Palau, and A. Romeu, "Horizontal Gene Transfer in Glycosyl Hydrolases Inferred from Codon Usage in Escherichia Coli and Bacillus Subtilis," Molecular Biology and Evolution, vol. 16, no. 9, pp. 1125-1134, Sept. 1999.
[38] S. Garcia-Vallve and P. Puigbo, "DendroUPGMA: A Dendrogram Construction Utility," http://genomes.urv.cat/UPGMA/, June 2010.
[39] C. Wei and R.S. Blum, "Theoretical Analysis of Correlation-Based Quality Measures for Weighted Averaging Image Fusion," Information Fusion, vol. 11, pp. 301-310, June 2009.
[40] C. Wei, L. Kaplan, S. Burks, and R. Blum, "Diffuse Prior Monotonic Likelihood Ratio Test for Evaluation of Fused Image Quality Measures," IEEE Trans. Image Processing, vol. 20, no. 2, pp. 327-344, Feb. 2011.
[41] J.M. Irvine, "National Imagery Interpretability Rating Scale (NIIRS)," Encyclopedia of Optical Eng., pp. 1442-1456, 2003.
[42] A. Toet, M.A. Hogervorst, S.G. Nikolov, J.J. Lewis, T.D. Dixon, D.R. Bull, and C.N. Canagarajah, "Towards Cognitive Image Fusion," Information Fusion, vol. 11, no. 2, pp. 95-113, June 2009.
[43] S. Li, Z. Li, and J. Gong, "Multivariate Statistical Analysis of Measures for Assessing the Quality of Image Fusion," Int'l J. Image and Data Fusion, vol. 1, no. 1, pp. 47-66, Mar. 2010.

Zheng Liu received the doctorate in engineering from Kyoto University, Japan, in 2000. From 2000 to 2001, he was a research fellow with the control and instrumentation division of Nanyang Technological University, Singapore. He then joined the Institute for Aerospace Research (IAR), National Research Council Canada, Ottawa, as a governmental laboratory visiting fellow in 2001. After being with IAR for five years, he transferred to the NRC Institute for Research in Construction, where he currently holds a research officer position. He holds an adjunct professorship at the University of Ottawa. His research interests include image/data fusion, computer vision, pattern recognition, sensors/sensor networks, structural health monitoring, and nondestructive inspection and evaluation. He cochairs the IEEE IMS TC-36. He is a senior member of the IEEE and a member of SPIE.
Erik Blasch received the BS degree in mechanical engineering from the Massachusetts Institute of Technology in 1992 and master's degrees in mechanical engineering (1994), health science (1995), and industrial engineering (human factors) (1995) from the Georgia Institute of Technology, and attended the University of Wisconsin for the MD/PhD degree in mechanical engineering/neurosciences until being called to active duty in 1996 for the United States Air Force. He also received the MBA (1998), MSEE (1998), MS Econ (1999), and MS/PhD psychology (ABD) degrees and the PhD degree in electrical engineering from Wright State University, and is a graduate of the Air War College. He is currently a United States Air Force Research Laboratory (AFRL) exchange scientist to Defence R&D Canada (DRDC) at Valcartier, Quebec, in the Future Command and Control (C2) Concepts and Structures Group of the C2 Decision Support Systems Section. Prior to this sabbatical, he was the Information Fusion Evaluation Tech lead for the AFRL Sensors Directorate's COMprehensive Performance Assessment of Sensor Exploitation (COMPASE) Center and an adjunct electrical engineering and biomedical engineering professor at Wright State University (WSU) and the Air Force Institute of Technology (AFIT) in Dayton, Ohio. He is also a reserve major with the US Air Force Office of Scientific Research (AFRL/AFOSR) in Washington, DC. He is currently a member of the IEEE AESS BoG, an associate editor for the IEEE Transactions on Systems, Man, and Cybernetics, Part A, and a member of the IEEE AESS Track Standards committee. He received the 2009 IEEE Russ Bioengineering award and supported the IEEE 2005 and 2008 Sections Congress meetings. He was a founding member of the International Society of Information Fusion (ISIF) in 1998 and the 2007 ISIF president. He began his career in the IEEE Robotics and Automation Society, compiling more than 30 top 10 finishes as part of robotic teams in international competitions, including winning the 1991 American Tour del Sol solar car competition, the 1994 AIAA mobile robotics contest, and the 1993 Aerial Unmanned Vehicle competition, where his team was first in the world to automatically control a helicopter. He has focused on automatic target recognition, target tracking, and information fusion research, compiling 300+ scientific papers and book chapters. He is a fellow of SPIE and a senior member of the IEEE.

Zhiyun Xue received the bachelor's and master's degrees in electrical engineering from Tsinghua University, China, in 1996 and 1998, respectively, and the PhD degree in electrical engineering from Lehigh University in 2006. She joined the Lister Hill National Center for Biomedical Communications at the National Library of Medicine (NLM) in 2006. Her research interests are in the areas of medical image analysis, computer vision, and pattern recognition.

Jiying Zhao received the PhD degree in electrical engineering from North China Electric Power University, and the PhD degree in computer engineering from Keio University. He is a professor with the School of Information Technology and Engineering, University of Ottawa, Canada. His research interests include image and video processing and multimedia communications. He is a member of the IEEE and the Institute of Electronics, Information and Communication Engineers (IEICE), and is also a member of the Professional Engineers Ontario (PEO).

Robert Laganière received the PhD degree from INRS-Telecommunications in Montreal in 1996. He is a full professor and a faculty member of the VIVA research lab at the School of Information Technology and Engineering at the University of Ottawa, Canada. His research interests are in computer vision and image processing with applications to visual surveillance, driver assistance, image-based modeling, and content-based video interpretation. He is the author of OpenCV Computer Vision Application Programming (Packt Publishing) and coauthor of Object-Oriented Software Development (McGraw Hill). He is a member of the IEEE.

Wei Wu received the BS degree from Tianjin University, China, in 1998, and the MS and PhD degrees in communication and information system from Sichuan University, China, in 2003 and 2008, respectively. He is now a faculty member at Sichuan University, China. His current research interests are image processing and video communication, wireless communication, and super-resolution.