2020 IEEE 3rd International Conference on Computer and Communication Engineering Technology

Painting Style Classification Using Deep Neural Networks

Valentin Yu. Kovalev Alexei G. Shishkin


Dept. of Computational Mathematics & Cybernetics Dept. of Computational Mathematics & Cybernetics
Moscow State University Moscow State University
Moscow, Russian Federation Moscow, Russian Federation
e-mail: [email protected] e-mail: [email protected]

Abstract-In this paper we describe the problem of painting style classification into five classes: impressionism, realism, expressionism, post-impressionism and romanticism. While most previous approaches relied on image processing and manual feature extraction from painting images, our model, based on the ResNet architecture and pre-trained on the ImageNet dataset, operates on the raw pixel level. The training was performed on a large dataset (about 43k images for the five-class style classification problem). To increase the quality of the final model, a large number of augmentations were used (random affine transform, crop, flip, color jitter (i.e. contrast, hue, saturation) and normalization), together with a scheduler for the optimizer. Finally, the model weights were pruned, which allowed increasing accuracy up to 51.5% and decreasing computation time as well.

Keywords-neural networks, deep learning, painting style classification

I. INTRODUCTION

There are many painting styles, from the Renaissance to street art. Often they do not have clearly defined features, which in some cases makes them difficult to identify even for experts. Therefore, the problem of automatically determining the style of paintings is very important. It can be considered part of a more general problem, the determination of the authorship of paintings from their digital images, in which one can distinguish a number of smaller subproblems such as classification of painting styles (for example, impressionism, baroque, etc.) and of genres (for example, portrait, landscape, etc.) [1]. These problems are very difficult due to the large number of both inter- and intra-class variations: there are different personal styles within the same art style, and the same artist can paint in several different painting styles and genres.

Most previous approaches to the automatic classification of painting styles were based on classical computer vision methods. Usually these are algorithms based on feature extraction from the image, such as SIFT (scale-invariant feature transform) [1] and HOG (histogram of oriented gradients) [2]. In [2] the SIFT-derived features were used together with the bag-of-words method to identify one of eight specified authors. In [3] a comparative study of the style classification problem is presented, which compares low-level descriptors such as SIFT and Color SIFT [4] with semantic-level features called Classemes [5], which encode the presence of an object in the image. It was found that semantic-level features significantly outperform low-level features for this problem. However, the experiments to classify paintings into 7 styles were based on a small data set with 70 digital images for each painting style. In [6] the authors concluded that low-level textural and color features are not effective due to the inconsistent color and texture patterns that describe styles in painting. More recently [7], metric approaches were used to find relationships between artists based on their paintings. The authors evaluated the quality of the algorithm using three metrics, and low-level features obtained by the HOG method were used to optimize the metric. The data set used in [7] was quite small and consisted of 1,710 images from 66 artists.

While earlier approaches [1-8] used traditional computer vision techniques, later works rely on deep neural networks. This is quite natural, since most computer vision problems can be solved with higher accuracy using deep neural networks. One advantage of solving the problem in this way is the possibility of using graphics processors to conduct training in a relatively short time. Also, to reduce the training time, one can use pre-trained neural networks that have already learned certain dependencies in digital images. Inspired by the results obtained with deep neural networks as feature extractors for various problems [9, 10], the authors of [11] used pre-trained neural networks to analyze a small number of images available for classifying the author of a painting and its style. Various experiments were performed in [12], training the network from scratch or fine-tuning an existing network to solve the problems of painting style and artist recognition. The use of a pre-trained network for painting style classification was studied as well [13]. In [14, 15] experiments related to the amount of data required for fine-tuning a network developed for style classification were performed. In [16, 17] intra-layer and inter-layer correlation features were studied as style descriptors, showing their superiority over the features extracted by convolutional neural networks which use only fully connected layers. It is worth noting that the above-mentioned algorithms were tested on relatively small data sets.

In this paper we use deep neural networks to develop and implement a model that recognizes the style of a painting from its digital image with limited computational resources. The algorithm has a

978-1-7281-8811-9/20/$31.00 ©2020 IEEE 334


Authorized licensed use limited to: UNIVERSIDADE DE SAO PAULO. Downloaded on October 18,2024 at 10:01:44 UTC from IEEE Xplore. Restrictions apply.
number of significant differences from existing approaches. To determine the style of a painting, deep learning methods with pruning and quantization were used, which reduced the size of the model and sped up computations. Unlike previous approaches, we used a large dataset consisting of 43,242 images, each painted in one of five painting styles. A large amount of input data was collected manually from open sources that are currently available for general use, while similar studies usually use proprietary databases with a relatively small number of images.

II. MODEL

Many neural network architectures have been proposed, each with its own advantages and disadvantages. The deep convolutional neural networks AlexNet and ResNet are among the most successful ones. In this paper we use a neural network model based on the ResNet architecture. It suits the problem because it has skip-connections (Fig. 1), which are extra connections between nodes in different layers of a neural network that skip one or more layers of nonlinear processing. They help to reduce overfitting and the vanishing gradient problem. There are several ResNet variants with different numbers of neurons and layers (for example, 18, 34, 50, 101 or 152 layers). A large number of layers requires a huge amount of data to avoid overfitting, which is why we chose the ResNet50 architecture with 50 layers. As input, the neural network accepts an RGB image of size 224x224. The output is a flattened vector of size 524 288. This vector is then fed to a max pooling layer and a fully connected layer with the softmax activation function, which has five outputs, one for each style.

Figure 1. ResNet Skip-connections.

We used the cross-entropy function as the loss function:

$$CE = -\sum_{i=1}^{C} t_i \log\left(f(s)_i\right),$$

where $C$ is the number of classes (in our problem $C = 5$), $t_i$ is the i-th component of the encoded label vector, indicating whether the object belongs to class i, and $f(s)_i$ is the predicted probability that the object belongs to class i. This loss function gave the best quality for this task. We also tried hinge loss and bi-tempered loss, but they gave worse results.

For better convergence of the neural network we used the ADAM optimizer with a learning rate of 0.005.

To increase the quality of the model we used a learning rate scheduler. We chose the cosine annealing scheduler, which we consider the best available scheduler for image classification tasks. The "cosine annealing" method starts with a large learning rate that is decreased relatively rapidly along a cosine curve to a minimum value at the last epoch. The model was trained for 100 epochs.

III. DATASET

As a dataset we used the painting images of the "Painter by numbers" competition on the kaggle.com platform [18]. It consists mostly of images from the open source wiki-art.com, as well as images from other open databases. Each image label includes the painting's author, style, genre, and year of creation. Unfortunately, for some images (23% of the total number) some label elements are missing. We did not use such images in our experiments.

Our model classifies all images into five classes. In the data set, the image distribution by class is as follows:
1. Impressionism - 10643 (24.6%);
2. Realism - 10523 (24.3%);
3. Romanticism - 9285 (21.5%);
4. Expressionism - 7013 (16.2%);
5. Post-impressionism - 5778 (13.4%).

We used a stratified split of the dataset into training and test sets in the proportion 80% to 20%.

To use pre-trained neural networks, it is also important that the size of the input data matches the size of the data on which the network was originally trained, so we resized the images to 224x224 pixels. In addition, we used various types of augmentation in order to increase the generalization ability and noise resistance of the neural network. We applied the following transformations in the specified order:
1. Random affine transformation with image rotation up to 10 degrees and random image magnification by 1.1-1.3 times;
2. Cutting out a random section of the image with the size of 224x224 pixels;
3. Random vertical reflection;
4. Slight change in brightness, contrast, saturation and hue;
5. Subsequent normalization.

The essence of normalization is to scale the pixel values to the same range as in the ImageNet dataset. Normalization on ImageNet helped to avoid overfitting, and therefore we used it here as well (since the features selected for normalized and non-normalized images may be different). Some examples of images before and after augmentation operations are shown in Figs. 2-3.

Figure 2. Images after augmentation.
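The stratified 80/20 split described in Section III can be illustrated with the reported class counts. The following is a minimal pure-Python sketch, not the authors' code: the function name `stratified_split`, the use of integer item indices in place of image files, and the fixed random seed are assumptions made for illustration.

```python
import random

# Class sizes as reported in Section III.
CLASS_COUNTS = {
    "impressionism": 10643,
    "realism": 10523,
    "romanticism": 9285,
    "expressionism": 7013,
    "post-impressionism": 5778,
}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split item indices so each class keeps the same train/test ratio.

    `labels` is a list where labels[i] is the class of item i.
    Returns (train_indices, test_indices).
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Stand-in labels: one entry per image, no real image data involved.
labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Because the split is done per class, the smallest class (post-impressionism) keeps the same share in the test set as in the full data set, which a plain random split would only guarantee approximately.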

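As an illustration of the learning-rate schedule described in Section II, the sketch below evaluates the standard cosine annealing formula, lr(t) = lr_min + (lr_max - lr_min) (1 + cos(pi t / T)) / 2, using the initial learning rate of 0.005 and the 100 epochs stated in the paper. The minimum learning rate of 0 and the function name `cosine_annealing_lr` are assumptions; the paper does not report the minimum value.

```python
import math


def cosine_annealing_lr(epoch, total_epochs=100, lr_max=0.005, lr_min=0.0):
    """Learning rate at a given epoch under cosine annealing (no restarts).

    lr_max=0.005 and total_epochs=100 follow the paper; lr_min=0.0 is assumed.
    """
    progress = epoch / total_epochs  # 0.0 at the start, 1.0 at the end
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few checkpoints of the 100-epoch run.
for epoch in (0, 25, 50, 75, 100):
    print(epoch, round(cosine_annealing_lr(epoch), 6))
```

The rate starts at its maximum, passes through half of it at the midpoint of training, and reaches the minimum at the last epoch.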
Figure 3. Images before and after augmentations.

IV. RESULTS

We trained a deep neural network with the described settings and architecture on the data set with the specified augmentations for 100 epochs. Accuracy and loss curves are shown below.

Figure 4. Train and validation accuracy.

Figure 5. Train and validation loss.

One can see from Figs. 4-5 that by the 80th epoch both the accuracy and the loss reached a plateau, which indicates that the model has finished training. Also, the quality on the training and validation sets is similar, which allows us to conclude that the model has not been overfitted.

To reduce the computation time, weight pruning of the neural network was used. This is a fairly simple algorithm that removes the connection between neurons if it has a small weight. It reduced the computation time by 13% while reducing the quality by less than 2%.

In Fig. 6 the confusion matrix for the model is shown. Different painting styles share similarities of color, composition and texture, as well as the subject of the painting. Thus, misclassification between closely related styles occurs quite often. As one can see, the best model results are for impressionism. Misclassification errors tend to occur significantly more frequently among closely related groups of styles, reflecting subtleties within these styles. Examples are post-impressionism and impressionism, and romanticism and realism. The total accuracy of the model with all improvements and weight pruning is 51.51%.

Figure 6. Confusion matrix of model.

One can see from Fig. 7 that the accuracy for very different classes (cubism, pop art, realism, pointillism and minimalism) is much higher than that for similar classes. This suggests that the classes discussed earlier are very similar visually, and a person who does not have additional knowledge about the artwork will hardly be able to determine its style correctly.

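The pruning step from Section IV, which removes connections with small weights, can be sketched as magnitude pruning. This is a hedged illustration rather than the authors' implementation: the paper does not state the threshold or the fraction of weights removed, so the `fraction` parameter, the function name `magnitude_prune` and the toy weight list are all assumptions.

```python
def magnitude_prune(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitudes.

    `weights` is a flat list of floats; a pruned connection is represented
    by a weight of exactly 0.0. Returns a new list, the input is unchanged.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_prune = set(order[:n_prune])
    return [0.0 if i in to_prune else w for i, w in enumerate(weights)]


# Toy example (hypothetical values, not taken from the trained model).
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.15, -0.9]
pruned = magnitude_prune(layer, fraction=0.5)
sparsity = pruned.count(0.0) / len(pruned)
print(pruned, sparsity)
```

Zeroed weights translate into shorter computation time only when the runtime or storage format takes advantage of the resulting sparsity.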
Figure 7. Confusion matrix of model for radically different classes.

V. CONCLUSION

In this paper we have developed a model and architecture of a deep neural network for classifying paintings into five styles: impressionism, romanticism, realism, expressionism and post-impressionism. The model was trained on 40k images while preserving the class distribution in the training and validation sets. Various methods were used to improve the quality of classification: augmentations, schedulers, and selection of the optimizer parameters and the loss function. The existing datasets of painting images were also analyzed and preprocessed. The final accuracy of the model for classifying a painting into five classes is 51.5%. In addition, the weights of the neural network were pruned in order to reduce overfitting and computation time. The result for radically different classes is much better: 91.25%. The results of this work can be used for further research in the area of painting analysis from digital images. This includes classification of paintings by genre and painting authentication: using the outputs of classifiers for style, genre, era and author as features for an authentication classifier, in addition to the digital images themselves (e.g., UV/IR images or a depth map), will allow the system to determine the authenticity of a painting. For styles whose images contain faces, one can analyze only the areas with faces, rather than the entire picture. To do this, one can use, for example, the algorithm implemented in [20].

REFERENCES

[1] S. Chen, B. Mulgrew, and P. M. Grant, “A clustering technique for digital communications channel equalization using radial basis function networks,” IEEE Trans. on Neural Networks, vol. 4, pp. 570-578, July 1993.
[2] R. M. Anwer, F. S. Khan, J. van de Weijer, J. Laaksonen, Combining holistic and part-based deep representations for computational painting categorization, in: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, ACM, 2016, pp. 339–342.
[3] D. G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 2004.
[4] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In International Conference on Computer Vision & Pattern Recognition, volume 2, pages 886–893, June 2005.
[5] R. S. Arora and A. M. Elgammal. Towards automated classification of fine-art painting style: A comparative study. In ICPR, 2012.
[6] A. E. Abdel-Hakim and A. A. Farag. Csift: A sift descriptor with color invariant characteristics. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR, 2006.
[7] L. Torresani, M. Szummer, and A. Fitzgibbon. Efficient object category recognition using classemes. In ECCV, 2010. M. Young, The Technical Writer’s Handbook. Mill Valley, CA: University Science, 1989.
[8] G. Carneiro, N. P. da Silva, A. Del Bue, J. P. Costeira, Artistic image classification: An analysis on the printart database, in: European Conference on Computer Vision, Springer, 2012, pp. 143–157.
[9] F. S. Khan, S. Beigpour, J. Van de Weijer, M. Felsberg, Painting-91: a large scale database for computational painting categorization, Machine vision and applications 25 (6) (2014) 1385–1397.
[10] A. Sharif Razavian, H. Azizpour, J. Sullivan, S. Carlsson, Cnn features off-the-shelf: an astounding baseline for recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014, pp. 806–813.
[11] S. Bianco, D. Mazzini, D. Pau, R. Schettini, Local detectors and compact descriptors for visual search: a quantitative comparison, Digital Signal Processing 44 (2015) 1–13.
[12] K.-C. Peng, T. Chen, A framework of extracting multi-scale features using multiple convolutional neural networks, in: 2015 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2015, pp. 1–6.
[13] W. R. Tan, C. S. Chan, H. E. Aguirre, K. Tanaka, Ceci n’est pas une pipe: A deep convolutional network for fine-art paintings classification, in: Image Processing (ICIP), 2016 IEEE International Conference on, IEEE, 2016, pp. 3703–3707.
[14] A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in neural information processing systems, 2012, pp. 1097–1105.
[15] C. Hentschel, T. P. Wiradarma, H. Sack, Fine tuning cnns with scarce training data - adapting imagenet to art epoch classification, in: Image Processing (ICIP), 2016 IEEE International Conference on, IEEE, 2016, pp. 3693–3697.
[16] S. Banerji, A. Sinha, Painting classification using a pre-trained convolutional neural network, in: International Conference on Computer Vision, Graphics, and Image processing, Springer, 2016, pp. 168–179.
[17] W.-T. Chu, Y.-L. Wu, Deep correlation features for image style classification, in: Proceedings of the 2016 ACM on Multimedia Conference, ACM, 2016, pp. 402–406.
[18] W.-T. Chu, Y.-L. Wu, Image style classification based on learnt deep correlation features, IEEE Transactions on Multimedia.
[19] Painter by Numbers competition: https://www.kaggle.com/c/painter-by-numbers
[20] Wisal Hashim Abdulsalam, Rafah Shihab Alhamdani, and Mohammed Najm Abdullah, "Facial Emotion Recognition from Videos Using Deep Convolutional Neural Networks," International Journal of Machine Learning and Computing vol. 9, no. 1, pp. 14-19, 2019.
