Image Composition Assessment With Saliency-Augmented Multi-Pattern Pooling
Abstract
Image composition assessment, which aims to assess the overall composition quality of a given image, is a crucial ingredient of aesthetic assessment. However, to the best of our knowledge, there is neither a dataset nor a method specifically designed for this task. In this paper, we contribute the first composition assessment dataset, CADB, with composition scores for each image provided by multiple professional raters. Besides, we propose a composition assessment network, SAMP-Net, with a novel Saliency-Augmented Multi-pattern Pooling (SAMP) module, which analyzes visual layout from the perspectives of multiple composition patterns. We also leverage composition-relevant attributes to further boost the performance, and extend the Earth Mover's Distance (EMD) loss to a weighted EMD loss to eliminate content bias. Experimental results show that our SAMP-Net performs more favorably than previous aesthetic assessment approaches.
1 Introduction
Image aesthetic assessment aims to judge aesthetic quality automatically in a qualitative or quantitative way, and can be widely used in many downstream applications such as assisted photo editing, intelligent photo album management, image cropping, and smartphone photography [5, 7, 11, 39, 40, 41, 43, 51]. Among the factors related to image aesthetics, image composition, which mainly concerns the arrangement of the visual elements inside the frame [38], is critical in estimating image aesthetics [28, 36, 44], because composition directs the viewer's attention and has a significant impact on aesthetic perception [12, 34, 38].

Despite the importance of image composition, there is no dataset readily available for image composition assessment. Some existing aesthetic datasets contain annotations related to image composition [3, 19, 22, 35]. However, they only provide composition-relevant attributes without an overall composition score, with the exception of the PCCD dataset [3], which presents a single reviewer's composition rating for each image; since this reviewer is an anonymous website visitor who may be unprofessional, the ratings might be biased and inaccurate, falling far short of the requirement for scientific evaluation. To this end, we contribute
a new image Composition Assessment DataBase (CADB) on the basis of the Aesthetics and Attributes DataBase (AADB) [22]. Our CADB dataset contains 9,497 images, each rated for overall composition quality by 5 individual raters who specialize in fine art. The details of our CADB dataset are introduced in Section 3.
Figure 1: Evaluating composition quality from the perspectives of different composition patterns. The first (resp., second) row shows a good example and a bad example considering symmetrical (resp., radial) balance.

To the best of our knowledge, there is no method specifically designed for image composition assessment. However, some previous aesthetic assessment methods also take composition into consideration. We divide the existing composition-relevant approaches into two groups. 1) The composition-preserving methods [4, 32] maintain image composition during both training and testing. However, these approaches fail to extract composition-relevant features for the composition assessment task. 2) The composition-aware approaches [28, 31, 52] extract composition-relevant features by modeling the mutual dependencies between all pairs of objects or regions in the image. However, redundant and noisy information is likely to be introduced during this procedure, which may adversely affect the performance of composition assessment. Moreover, some previous methods [1, 10, 29, 49, 54, 55] are designed to model well-established photographic rules (e.g., rule of thirds and golden ratio [20]), which humans use in evaluating image composition quality. However, these rule-based methods have two major limitations: 1) hand-crafted feature extraction is tedious and laborious compared with deep learning features [27]; 2) each rule is valid only for specific scenes, and these methods do not consider which rules are applicable to a given scene [47].
Interestingly, composition pattern, as an important aspect of composition assessment, is not explicitly considered by the above methods. As shown in Figure 1, each composition pattern divides the holistic image into multiple non-overlapping partitions, which can model human perception of composition quality. In particular, by analyzing the visual layout (e.g., positions and sizes of visual elements) according to a composition pattern, i.e., comparing the visual elements in various partitions, we can quantify the aesthetics of visual layout in terms of visual balance (e.g., symmetrical balance and radial balance) [18, 23, 30], composition rules (e.g., rule of thirds, diagonals and triangles) [24, 50], and so on. Different composition patterns offer different perspectives to evaluate composition quality. For example, the composition pattern in the top (resp., bottom) row in Figure 1 can help judge the composition quality in terms of symmetrical (resp., radial) balance.
Figure 2: The overall pipeline of our SAMP-Net for composition assessment. We use ResNet18 [14] as backbone. The detailed structures of our Saliency-Augmented Multi-pattern Pooling (SAMP) module and Attentional Attribute Feature Fusion (AAFF) module are illustrated in Figure 3 and Figure 4 respectively.

To dissect visual layout based on different composition patterns, we propose a novel multi-pattern pooling module at the end of the backbone to integrate the information extracted from multiple patterns, in which each pattern provides a perspective to evaluate the composition quality. Considering that the sizes and locations of salient objects are representative of visual layout and fundamental to image composition [30], we further integrate visual saliency [17] into our multi-pattern pooling module to encode the spatial and geometric information of salient objects, leading to our Saliency-Augmented Multi-pattern Pooling (SAMP) module. Additionally, since some composition patterns may play more important roles than others, we design weighted multi-pattern aggregation to fuse multi-pattern features, which can adaptively assign different weights to different patterns.
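To make the mechanism concrete, below is a minimal PyTorch sketch of saliency-augmented multi-pattern pooling. The pattern definitions, the per-partition saliency statistic, and the aggregation layer are illustrative assumptions rather than the exact design (which is given in Figure 3): each pattern is encoded as an integer mask over the feature grid, features and saliency are average-pooled within each partition, and the pattern features are fused with learned softmax weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SAMP(nn.Module):
    def __init__(self, patterns, in_dim=512, out_dim=256, max_parts=9):
        super().__init__()
        # patterns: list of (H, W) LongTensors; each value indexes a partition
        self.patterns = patterns
        self.max_parts = max_parts
        # project concatenated per-partition features to a pattern feature
        self.proj = nn.Linear(max_parts * (in_dim + 1), out_dim)
        # learnable weights for weighted multi-pattern aggregation
        self.pattern_logits = nn.Parameter(torch.zeros(len(patterns)))

    def forward(self, feat, sal):
        # feat: (B, C, H, W) backbone feature map; sal: (B, 1, Hs, Ws) saliency map
        B, C, H, W = feat.shape
        sal = F.adaptive_avg_pool2d(sal, (H, W)).squeeze(1)      # align to (B, H, W)
        pattern_feats = []
        for pat in self.patterns:
            parts = []
            for p in range(self.max_parts):
                mask = (pat == p).to(feat.dtype)                 # (H, W) partition mask
                area = mask.sum().clamp(min=1.0)                 # empty parts -> zeros
                pooled = (feat * mask).sum(dim=(2, 3)) / area    # mean feature per part
                sal_mean = (sal * mask).sum(dim=(1, 2)) / area   # mean saliency per part
                parts.append(torch.cat([pooled, sal_mean[:, None]], dim=1))
            pattern_feats.append(self.proj(torch.cat(parts, dim=1)))
        stacked = torch.stack(pattern_feats, dim=1)              # (B, P, out_dim)
        w = torch.softmax(self.pattern_logits, dim=0)            # adaptive pattern weights
        return (stacked * w[None, :, None]).sum(dim=1)          # aggregated pattern feature

# illustrative patterns over the 7x7 grid (assumptions): left/right halves
# (symmetry), a 3x3 rule-of-thirds grid, and a diagonal split
ys, xs = torch.meshgrid(torch.arange(7), torch.arange(7), indexing="ij")
patterns = [(xs > 3).long(), (ys * 3 // 7) * 3 + (xs * 3 // 7), (xs >= ys).long()]
samp = SAMP(patterns)
```

Patterns with fewer partitions than max_parts simply contribute zero vectors for their empty slots, so patterns of different granularities can share one projection layer.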
Moreover, because our dataset is built upon the AADB dataset [22], which has composition-relevant attributes, we further leverage these attributes to boost the performance of composition assessment. Specifically, we propose an Attentional Attribute Feature Fusion (AAFF) module to fuse the composition feature and the attribute feature. Finally, having noticed the content bias in our dataset, i.e., that the composition score distribution is severely influenced by object category, we extend the Earth Mover's Distance (EMD) loss of [15] to a weighted EMD loss to eliminate this bias.
The main contributions of this paper can be summarized as follows: 1) We contribute the first image composition assessment dataset, CADB, in which each image has composition scores annotated by five professional raters. 2) We propose a novel composition assessment method with a Saliency-Augmented Multi-pattern Pooling (SAMP) module. 3) We investigate the effectiveness of auxiliary attributes and the weighted EMD loss for composition assessment. 4) Our model outperforms previous aesthetic assessment methods on our dataset.
2 Related Work
2.1 Aesthetic Assessment Dataset
Many large-scale aesthetic assessment datasets have been collected in recent years, like the Aesthetic Visual Analysis database (AVA) [35], AADB [22], the Photo Critique Captioning Dataset (PCCD) [3], AVA-Comments [60], AVA-Reviews [53], FLICKR-AES [42], and DPC-Captions [19]. However, these datasets either only have composition-relevant attributes without an overall composition score, or only have one inaccurate composition score per image, which falls far short of the requirement for composition assessment research. Unlike the existing aesthetic datasets, our CADB dataset contains composition ratings assigned to each image by multiple professional raters. Besides, we guarantee the reliability of our dataset through sanity checks and consistency analysis (see Section 3).
2.2 Aesthetic Assessment Method

Many deep learning based methods have been proposed for the aesthetic assessment task; they can be divided into two groups. The composition-preserving approaches [4, 32], which do not explicitly learn composition representations, produce inferior results on the composition evaluation task. The composition-aware approaches [28, 31, 52] consider the relationships between all pairs of objects or regions in the image to model image composition, which is likely to introduce redundant and noisy information. Moreover, the above methods do not explicitly consider composition patterns. In contrast, we design a novel Saliency-Augmented Multi-pattern Pooling (SAMP) module, which provides an insightful and effective perspective for evaluating composition quality.
We split our CADB dataset into training and test images, in which the test set is made less biased for better evaluation (see Supplementary).
4 Methodology
To accomplish the composition assessment task, we propose a novel network SAMP-Net,
which is named after Saliency-Augmented Multi-pattern Pooling (SAMP) module. The
overall pipeline of our method is illustrated in Figure 2, where we first extract the global
feature map from input image by backbone (e.g., ResNet18 [14]) and then yield aggregated
pattern feature through our SAMP module, which is followed by Attentional Attribute Fea-
ture Fusion (AAFF) module to fuse the composition feature and attribute feature. After that,
we predict composition score distribution based on the fused feature and predict the attribute
score based on the attribute feature, which are supervised by weighted EMD loss and Mean
Squared Error (MSE) loss respectively.
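This pipeline can be summarized in a short sketch. The number of score bins, the attribute branch, and the head widths are assumptions for illustration; for simplicity the two features are fused here by plain concatenation, with the attentional fusion (AAFF) sketched separately alongside the ablation discussion in Section 5.

```python
import torch
import torch.nn as nn
import torchvision

class SAMPNet(nn.Module):
    def __init__(self, samp, num_bins=5, num_attrs=5, feat_dim=256):
        super().__init__()
        resnet = torchvision.models.resnet18(pretrained=True)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])  # (B, 512, 7, 7)
        self.samp = samp                                  # SAMP module sketched earlier
        self.attr_branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                         nn.Linear(512, feat_dim))    # attribute feature
        self.attr_head = nn.Linear(feat_dim, num_attrs)   # attribute score prediction
        self.score_head = nn.Linear(2 * feat_dim, num_bins)

    def forward(self, img, sal):
        fmap = self.backbone(img)                         # global feature map
        comp = self.samp(fmap, sal)                       # aggregated pattern feature
        attr = self.attr_branch(fmap)                     # attribute feature
        fused = torch.cat([comp, attr], dim=1)            # plain concatenation here
        dist = torch.softmax(self.score_head(fused), dim=1)  # score distribution
        return dist, self.attr_head(attr)                 # -> weighted EMD / MSE losses
```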
Table 1: Ablation studies of different components in our model. † means Spatial Pyramid Pooling (SPP) [13]. ‡ means Multi-scale Pyramid Pooling (MPP) [56]. WE means weighted EMD loss. MP means multi-pattern pooling. PW means pattern weights. SA means saliency-augmented. AF indicates attribute feature and AA indicates attentional attribute feature fusion.
We design a weighted EMD loss (see Supplementary), which assigns smaller weights to biased samples when calculating the EMD loss. Finally, our SAMP-Net can be trained in an end-to-end manner with the attribute prediction loss L_atts and the weighted EMD loss L_wEMD:

L = L_wEMD + λ L_atts,

where λ is a trade-off hyper-parameter.
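Below is a minimal sketch of this objective, assuming the r = 2 EMD of [15] computed on cumulative score distributions and a precomputed per-sample weight (the weighting scheme itself is defined in the Supplementary, so the weight is simply an input here); the value of λ is an arbitrary placeholder.

```python
import torch
import torch.nn.functional as F

def weighted_emd_loss(pred, target, weight, r=2):
    # pred, target: (B, K) score distributions; weight: (B,) per-sample weights
    cdf_diff = torch.cumsum(pred, dim=1) - torch.cumsum(target, dim=1)
    emd = cdf_diff.abs().pow(r).mean(dim=1).pow(1.0 / r)   # per-sample EMD
    return (weight * emd).mean()                           # biased samples count less

def total_loss(pred_dist, gt_dist, pred_attr, gt_attr, weight, lam=0.1):
    # weighted EMD on the score distribution plus MSE on the attribute scores
    return (weighted_emd_loss(pred_dist, gt_dist, weight)
            + lam * F.mse_loss(pred_attr, gt_attr))
```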
5 Experiments
5.1 Implementation Details and Evaluation Metric
We use ResNet18 [14] pretrained on ImageNet [8] as the backbone of our SAMP-Net. Unless otherwise specified, all input images are resized to 224 × 224 for both training and testing, following [21, 26, 45], leading to a global feature map of H × W = 7 × 7, and the saliency map is downsampled to H_sal × W_sal = 56 × 56 before being passed to the SAMP module. More details can be found in the Supplementary. All experiments are conducted on our CADB dataset.
To evaluate the composition score distribution and the composition mean score predicted by different models, it is natural to adopt EMD and MSE as the evaluation metrics. EMD measures the closeness between the predicted and ground-truth composition score distributions as in [15]. MSE is computed between the predicted and ground-truth composition mean scores. Moreover, following existing aesthetic assessment approaches [4, 22, 48], we also report the ranking correlation measured by Spearman's Rank Correlation Coefficient (SRCC) and the linear association measured by the Linear Correlation Coefficient (LCC) between the predicted and ground-truth composition mean scores.
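For concreteness, the four metrics can be computed as in the following sketch, assuming K discrete score bins with values bin_values (e.g., 1 to 5) and mean scores taken as the expectations of the respective distributions.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate(pred_dists, gt_dists, bin_values, r=2):
    # pred_dists, gt_dists: (N, K) arrays; bin_values: (K,) score of each bin
    cdf_diff = np.cumsum(pred_dists, axis=1) - np.cumsum(gt_dists, axis=1)
    emd = np.mean(np.mean(np.abs(cdf_diff) ** r, axis=1) ** (1.0 / r))  # as in [15]
    pred_mean = pred_dists @ bin_values                 # predicted mean scores
    gt_mean = gt_dists @ bin_values                     # ground-truth mean scores
    mse = np.mean((pred_mean - gt_mean) ** 2)
    srcc = spearmanr(pred_mean, gt_mean).correlation    # ranking correlation
    lcc = pearsonr(pred_mean, gt_mean)[0]               # linear correlation
    return emd, mse, srcc, lcc
```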
Table 2: Comparison of different methods on the composition assessment task. All models
are trained and evaluated on the proposed CADB dataset.
5.2 Ablation Study

Weighted EMD Loss: We start from the basic ResNet18 [14] and report the results using the EMD loss and the weighted EMD loss in Table 1. Training with the weighted EMD loss (row 2) performs better than training with the standard EMD loss (row 1), with a clear gap in test EMD between these two models, which is attributed to the advantage of the weighted EMD loss in eliminating content bias.
Saliency-Augmented Multi-pattern Pooling (SAMP): Based on ResNet18 with the weighted EMD loss (row 2), we add our SAMP module and also explore its ablated versions. We first investigate vanilla multi-pattern pooling without saliency or pattern weights (row 3), in which the saliency vector is excluded from the partition feature and the pattern features of multiple patterns are simply averaged. Then, we learn pattern weights to aggregate multiple pattern features (row 4). Comparing row 3 and row 4 shows that it is beneficial to adaptively assign different weights to different pattern features. We further incorporate the saliency map into the SAMP module (row 5). The comparison between row 4 and row 5 shows that it is useful to emphasize the layout information of salient objects. Considering the architectural similarity between Spatial Pyramid Pooling (SPP) [13] and our multi-pattern pooling, we replace our multi-pattern pooling with SPP using scales {1 × 1, 2 × 2, 3 × 3}, following [4] (row 6). In addition, we also show the results of using Multi-scale Pyramid Pooling (MPP) [56] in row 7, in which we build an image pyramid containing three scaled images. The comparisons (row 5 vs. row 6, row 5 vs. row 7) show that the model using multi-pattern pooling outperforms both SPP and MPP, because our multi-pattern pooling is specifically designed and well-tailored for the composition assessment task.
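For reference, the SPP baseline in row 6 pools the feature map over regular grids; a minimal sketch (assuming average pooling) highlights how these fixed pyramidal grids differ from composition-specific patterns:

```python
import torch
import torch.nn.functional as F

def spp(feat, scales=(1, 2, 3)):
    # feat: (B, C, H, W) -> (B, C * (1 + 4 + 9)) by pooling over regular grids
    pooled = [F.adaptive_avg_pool2d(feat, (s, s)).flatten(1) for s in scales]
    return torch.cat(pooled, dim=1)
```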
Attentional Attribute Feature Fusion (AAFF): Building on row 2 (resp., row 5) in Table 1, we additionally learn the attribute feature and directly concatenate it with the composition feature, leading to row 8 (resp., row 9). The results demonstrate that composition-relevant attributes can help boost the performance of composition evaluation, suggesting that composition-relevant attribute prediction and composition evaluation are two related and reciprocal tasks. Finally, we complete our attentional attribute feature fusion module by learning weights for weighted concatenation (row 10). From row 9 and row 10, we observe that the model using weighted concatenation is better than that using plain concatenation, which validates the superiority of the attentional fusion mechanism.
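A minimal sketch of the weighted concatenation in row 10, assuming the two weights are predicted from the concatenated features themselves; the exact gating design is given in Figure 4 and is not reproduced here.

```python
import torch
import torch.nn as nn

class AAFF(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # predict one weight per feature from their concatenation
        self.gate = nn.Sequential(nn.Linear(2 * dim, 2), nn.Softmax(dim=1))

    def forward(self, comp_feat, attr_feat):
        # comp_feat, attr_feat: (B, dim) composition / attribute features
        w = self.gate(torch.cat([comp_feat, attr_feat], dim=1))   # (B, 2)
        # weighted concatenation in place of plain concatenation
        return torch.cat([w[:, :1] * comp_feat, w[:, 1:] * attr_feat], dim=1)
```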
Figure 5: Analysis of the correlation between an image and its dominant pattern, i.e., the pattern with the largest weight. We show the estimated pattern weights, with the largest weight colored green. We also show the ground-truth/predicted composition mean score in blue/red.
5.3 Comparison with Existing Methods

We compare our SAMP-Net with previous aesthetic assessment methods [4, 28, 31, 32, 52] that explicitly take composition into consideration. Since most of these methods do not yield a score distribution, we make a slight modification to their prediction layers to make them compatible with the EMD loss [15]. For a fair comparison, all methods are trained and tested on our CADB dataset with ResNet18 pretrained on ImageNet [8] as the backbone.
In Table 2, we compare our method with different composition-relevant aesthetic assessment methods. The baseline model (ResNet18) only consists of the pretrained ResNet18 and a prediction head, which is the same as row 1 in Table 1. Among these baselines, A-Lamp is the most competitive one, probably because A-Lamp introduces additional saliency information to learn the pairwise spatial relationships between objects. Our SAMP-Net clearly outperforms all the composition-relevant baselines, which demonstrates that our method is more adept at image composition assessment.
Figure 6: Some failure cases in the test set, which have the highest absolute errors between the predicted composition mean scores (outside brackets) and the ground-truth composition mean scores (in brackets).
symmetrical axis under pattern 2, so the low score implies that maintaining horizontal symmetry may enhance the composition quality. In the left figure of the third row, the low score under pattern 5 suggests moving the dog to the center. In summary, our SAMP module can facilitate composition assessment by integrating the information from multiple patterns, and it provides constructive suggestions for improving the composition quality.
5.6 Limitations
While our method can generally achieve accurate and reliable composition assessment, it
still has some failure cases. We show several failure cases in Figure 6, which have the
highest absolute errors between the predicted and ground-truth composition mean scores.
We can observe that our model tends to predict relatively low scores for these images with
high composition mean scores, which is probably due to the distracting backgrounds and
complicated composition patterns. In addition, there is a clear gap between our method and
human raters on ranking the composition quality of different images (see Supplementary),
which needs to be enhanced in the future work.
6 Conclusion
In this paper, we have contributed the first composition assessment dataset, CADB, with five composition scores for each image. We have also proposed a novel method, SAMP-Net, with saliency-augmented multi-pattern pooling. Equipped with the SAMP module, the AAFF module, and the weighted EMD loss, our method achieves the best performance for composition assessment.
Acknowledgement
This work was sponsored by the National Natural Science Foundation of China (Grant No. 61902247) and the Shanghai Sailing Program (19YF1424400).
References
[1] S. Bhattacharya, R. Sukthankar, and M. Shah. A framework for photo-quality assess-
ment and enhancement based on visual aesthetics. In ACM-Multimedia, 2010.
[2] A. Brachmann and C. Redies. Computational and experimental approaches to visual aesthetics. Frontiers in Computational Neuroscience, 11(1):102–119, 2017.
[3] K. Chang, K.H. Lu, and C.S. Chen. Aesthetic critiques generation for photos. In ICCV, 2017.
[4] Q. Chen, W. Zhang, N. Zhou, P. Lei, Y. Xu, Y. Zheng, and J. Fan. Adaptive fractional
dilated convolution network for image aesthetics assessment. In CVPR, 2020.
[5] Y. Chen, J. Klopp, M. Sun, S. Chien, and K. Ma. Learning to compose with professional
photographs on the web. In ACM-Multimedia, 2017.
[6] M. Cornia, L. Baraldi, G. Serra, and R. Cucchiara. Predicting human eye fixations via
an LSTM-based saliency attentive model. IEEE Transactions on Image Processing, 27
(10):5142–5154, 2018.
[7] R. Datta, D. Joshi, J. Li, and J. Wang. Studying aesthetics in photographic images using
a computational approach. In ECCV, 2006.
[8] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[9] Y. Deng, C.C. Loy, and X. Tang. Image aesthetic assessment: An experimental survey.
IEEE Signal Processing Magazine, 34(4):80–106, 2017.
[10] S. Dhar, V. Ordonez, and T. Berg. High level describable attributes for predicting
aesthetics and interestingness. In CVPR, 2011.
[11] Yuming Fang, Hanwei Zhu, Yan Zeng, Kede Ma, and Zhou Wang. Perceptual quality
assessment of smartphone photography. In CVPR, 2020.
[12] M. Freeman. The photographer’s eye: Composition and design for better digital pho-
tos. CRC Press, 2007.
[13] K. He, X. Zhang, S. Ren, and J. Sun. Spatial pyramid pooling in deep convolutional
networks for visual recognition. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 37(9):1904–1916, 2015.
[14] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In
CVPR, 2016.
[15] L. Hou, C.P. Yu, and D. Samaras. Squared earth mover’s distance-based loss for train-
ing deep neural networks. ArXiv, abs/1611.05916, 2016.
[16] Q. Hou, M.M. Cheng, X. Hu, A. Borji, Z. Tu, and P. Torr. Deeply supervised salient object detection with short connections. In CVPR, 2017.
[17] X. Hou and L. Zhang. Saliency detection: A spectral residual approach. In CVPR,
2007.
[18] A. Jahanian, S. Vishwanathan, and J. Allebach. Learning visual balance from large-
scale datasets of aesthetically highly rated images. In Human Vision and Electronic
Imaging XX, 2015.
[19] X. Jin, L. Wu, G. Zhao, X. Li, X. Zhang, S. Ge, D. Zou, B. Zhou, and X. Zhou.
Aesthetic attributes assessment of images. In ACM-Multimedia, 2019.
[20] Dhiraj Joshi, Ritendra Datta, Elena Fedorovskaya, Quang-Tuan Luong, James Z Wang,
Jia Li, and Jiebo Luo. Aesthetics and emotions in images. IEEE Signal Processing
Magazine, 28(5):94–115, 2011.
[21] Keunsoo Ko, Jun-Tae Lee, and Chang-Su Kim. PAC-Net: Pairwise aesthetic compari-
son network for image aesthetic assessment. In ICIP, 2018.
[22] S. Kong, X. Shen, Z. Lin, R. Mech, and C. Fowlkes. Photo aesthetics ranking network
with attributes and content adaptation. In ECCV, 2016.
[23] J.T. Lee, H. Kim, C. Lee, and C. Kim. Semantic line detection and its applications. In
ICCV, 2017.
[24] J.T. Lee, H. Kim, C. Lee, and C. Kim. Photographic composition classification and
dominant geometric element detection for outdoor scenes. Journal of Visual Commu-
nication and Image Representation, 55(1):91–105, 2018.
[25] C. Li, A. Gallagher, A. Loui, and T. Chen. Aesthetic quality assessment of consumer
photos with faces. In ICIP, 2010.
[26] Leida Li, Hancheng Zhu, Sicheng Zhao, Guiguang Ding, and Weisi Lin. Personality-
assisted multi-task learning for generic and personalized image aesthetics assessment.
IEEE Transactions on Image Processing, 29(1):3898–3910, 2020.
[27] Xuewei Li, Xueming Li, Gang Zhang, and Xianlin Zhang. A novel feature fusion
method for computing image aesthetic quality. IEEE access, 8:63043–63054, 2020.
[28] D. Liu, R. Puri, N. Kamath, and S. Bhattacharya. Composition-aware image aesthetics
assessment. In WACV, 2020.
[29] Ligang Liu, Renjie Chen, Lior Wolf, and Daniel Cohen-Or. Optimizing photo compo-
sition. In Computer Graphics Forum, 2010.
[30] S. Lok, S. Feiner, and G. Ngai. Evaluation of visual balance for automated layout. In
Proceedings of the 9th international conference on Intelligent user interfaces, 2004.
[31] S. Ma, J. Liu, and C. Chen. A-Lamp: Adaptive layout-aware multi-patch deep convo-
lutional neural network for photo aesthetic assessment. In CVPR, 2017.
[32] L. Mai, H. Jin, and F. Liu. Composition-preserving deep photo aesthetics assessment.
In CVPR, 2016.
[33] L. Marchesotti, F. Perronnin, D. Larlus, and G. Csurka. Assessing the aesthetic quality
of photographs using generic image descriptors. In ICCV, 2011.
[34] B. Martinez and J. Block. Visual forces: an introduction to design. Pearson College
Division, 1995.
[35] N. Murray, L. Marchesotti, and F. Perronnin. AVA: A large-scale database for aesthetic
visual analysis. In CVPR, 2012.
[37] F. Perronnin and C. Dance. Fisher kernels on visual vocabularies for image categoriza-
tion. In CVPR, 2007.
[39] Yogesh Singh Rawat and Mohan S Kankanhalli. Context-aware photography learning
for smart mobile devices. ACM Transactions on Multimedia Computing, Communica-
tions, and Applications, 12(1):1–24, 2015.
[40] Yogesh Singh Rawat and Mohan S Kankanhalli. Clicksmart: A context-aware view-
point recommendation system for mobile photography. IEEE Transactions on Circuits
and Systems for Video Technology, 27(1):149–158, 2016.
[41] Yogesh Singh Rawat, Mingli Song, and Mohan S Kankanhalli. A spring-electric graph
model for socialized group photography. IEEE Transactions on Multimedia, 20(3):
754–766, 2017.
[42] J. Ren, X. Shen, Z. Lin, R. Mech, and D. Foran. Personalized image aesthetics. In
ICCV, 2017.
[44] A. Savakis, S. Etz, and A. Loui. Evaluation of image appeal in consumer photography.
In Human vision and electronic imaging V, 2000.
[45] Katharina Schwarz, Patrick Wieschollek, and Hendrik PA Lensch. Will people like
your image? learning the aesthetic space. In WACV, 2018.
[46] H. Su, T. Chen, C. Kao, W. Hsu, and S. Chien. Scenic photo quality assessment with
bag of aesthetics-preserving features. In ACM-Multimedia, 2011.
[47] Yu-Chuan Su, Raviteja Vemulapalli, Ben Weiss, Chun-Te Chu, Philip Andrew Mans-
field, Lior Shapira, and Colvin Pitts. Camera view adjustment prediction for improving
image composition. arXiv preprint arXiv:2104.07608, 2021.
[48] H. Talebi and P. Milanfar. NIMA: Neural image assessment. IEEE Transactions on
Image Processing, 27(8):3998–4011, 2018.
[49] X. Tang, W. Luo, and X. Wang. Content-based photo quality assessment. IEEE Transactions on Multimedia, 15(8):1930–1943, 2013.
[50] K. Thömmes and R. Hübner. Instagram likes for architectural photos can be predicted by quantitative balance measures and curvature. Frontiers in Psychology, 9(1):1050–1067, 2018.
[51] Yi Tu, Li Niu, Weijie Zhao, Dawei Cheng, and Liqing Zhang. Image cropping with
composition and saliency aware aesthetic score map. In AAAI, 2020.
[52] W. Wang and R. Deng. Modeling human perception for image aesthetic assessment. In
ICIP, 2019.
[53] W. Wang, S. Yang, W. Zhang, and J. Zhang. Neural aesthetic image reviewer. IET
Computer Vision, 13(8):749–758, 2019.
[54] Min-Tzu Wu, Tse-Yu Pan, Wan-Lun Tsai, Hsu-Chan Kuo, and Min-Chun Hu. High-
level semantic photographic composition analysis and understanding with deep neural
networks. In ICMEW, 2017.
[55] Yaowen Wu, Christian Bauckhage, and Christian Thurau. The good, the bad, and the
ugly: Predicting aesthetic image labels. In ICPR, 2010.
[56] Donggeun Yoo, Sunggyun Park, Joon-Young Lee, and In So Kweon. Multi-scale pyra-
mid pooling for deep convolutional representation. In CVPRW, 2015.
[57] N. Yu, X. Shen, L. Lin, R. Mech, and C. Barnes. Learning to detect multiple photo-
graphic defects. In WACV, 2018.
[58] L. Zhang, Y. Gao, R. Zimmermann, Q. Tian, and X. Li. Fusion of multichannel local
and global structural cues for photo aesthetics evaluation. IEEE Transactions on Image
Processing, 23(3):1419–1429, 2014.
[59] T. Zhao and X. Wu. Pyramid feature attention network for saliency detection. In CVPR,
2019.
[60] Y. Zhou, X. Lu, J. Zhang, and J.Z. Wang. Joint image and text representation for
aesthetics analysis. In ACM-Multimedia, 2016.
[61] Z. Zhou, S. He, J. Li, and J.Z. Wang. Modeling perspective effects in photographic
composition. In ACM-Multimedia, 2015.