Painting Style Classification Using Deep Neural Networks
Abstract—In this paper we address the problem of classifying painting
style into five classes: impressionism, realism, expressionism,
post-impressionism and romanticism. While most previous approaches
relied on image processing and manual feature extraction from painting
images, our model, based on the ResNet architecture and pre-trained on
the ImageNet dataset, operates at the raw pixel level. Training was
performed on a large dataset (about 43k images for the five-class style
classification problem). To improve the quality of the final model, a
number of augmentations were used (random affine transform, crop, flip,
color jitter (i.e. contrast, hue, saturation) and normalization),
together with a scheduler for the optimizer. Finally, the model weights
were pruned, which allowed accuracy to be increased up to 51.5% and
computation time to be decreased as well.

Keywords-neural networks, deep learning, painting style classification

I. INTRODUCTION

Today there are many painting styles, ranging from the Renaissance to
street art. Often they do not have clearly defined features, which in
some cases makes them difficult to identify even for experts. Therefore,
the problem of automatically determining the style of paintings is very
important. It can be considered part of a more general problem, the
determination of the authorship of paintings from their digital images,
in which one can distinguish a number of smaller subproblems such as the
classification of painting styles (for example, impressionism, baroque,
etc.) and of genres (for example, portrait, landscape, etc.) [1]. These
problems are very difficult due to the large number of both inter- and
intra-class variations: there are different personal styles within the
same art style, and the same artist can paint in one or more different
painting styles and genres.

Most previous approaches to the automatic classification of painting
styles were based on classical computer vision methods. Usually these
are algorithms based on feature extraction from the image, such as SIFT
(scale-invariant feature transform) [1] and HOG (histogram of oriented
gradients) [2]. In [2] SIFT-derived features were used together with the
bag-of-words method to identify one of eight specified authors. In [3] a
comparative study of style classification is presented, which compares
low-level features such as SIFT and Color SIFT [4] with semantic-level
attributes called Classemes [5], which encode the presence of an object
in the image. It was found that semantic-level features significantly
outperform low-level features for this problem. However, the experiments
on classifying paintings into 7 styles were based on a small data set
with 70 digital images for each painting style. In [6] the authors
concluded that low-level textural and color features are not effective
due to the inconsistent color and texture patterns that describe styles
in painting. More recently, metric approaches were used in [7] to find
relationships between artists based on their paintings. The authors
evaluated the quality of the algorithm using three metrics, and
low-level features obtained by the HOG method were used to optimize the
metric. The data set used in [7] was quite small and consisted of 1,710
images from 66 artists.

While earlier approaches [1-8] used traditional computer vision
techniques, later works rely on deep neural networks. This is quite
natural, since most computer vision problems can be solved with higher
accuracy using deep neural networks. One advantage of solving the
problem in this way is the possibility of using graphics processors to
conduct training in a relatively short time. Also, to reduce the
training time, one can use pre-trained neural networks that have already
learned certain dependencies in digital images. Inspired by the results
obtained with deep neural networks as feature extractors for various
problems [9, 10], the authors of [11] used pre-trained neural networks
to analyze the small number of images available for classifying the
author of a painting and its style. Various experiments were performed
in [12], training the network from scratch or improving an existing
network to solve the problems of painting style and artist recognition.
The use of a pre-trained network for painting style classification was
studied as well [13]. In [14, 15] experiments were performed on the
amount of data required for fine-tuning a network developed for style
classification. In [16, 17] intra-layer and inter-layer correlation
features were studied as style descriptors, showing their superiority
over features extracted by convolutional neural networks that use only
fully connected layers. It is worth noting that the above-mentioned
algorithms were tested on relatively small data sets.

In this paper we use deep neural networks to develop and implement a
model that recognizes the style of a painting from its digital image
with limited computational resources. The algorithm has a
Authorized licensed use limited to: UNIVERSIDADE DE SAO PAULO. Downloaded on October 18,2024 at 10:01:44 UTC from IEEE Xplore. Restrictions apply.
computation time by 13% while reducing the quality by less than
2%.
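
The trade-off quoted above comes from pruning the network weights. The text does not state which pruning criterion was used, so the following is only a minimal sketch that assumes magnitude-based pruning: the weights with the smallest absolute values are set to zero. The helper names `magnitude_prune` and `sparsity` are hypothetical and are not taken from the paper, and the flat list of floats stands in for a real layer's weight tensor.

```python
def magnitude_prune(weights, fraction):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    ASSUMPTION: magnitude-based pruning; the paper does not name its criterion.
    `fraction` is the share of weights to remove, between 0 and 1.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    if n_prune == 0:
        return list(weights)
    # Indices ordered by ascending absolute value; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity(weights):
    """Share of weights that are exactly zero (0.0 for an empty list)."""
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0.0) / len(weights)


if __name__ == "__main__":
    layer = [0.5, -0.1, 0.03, -0.8]  # toy weights, not values from the paper
    pruned_layer = magnitude_prune(layer, 0.5)
    print(pruned_layer, sparsity(pruned_layer))
```

In a real network, zeroed weights reduce computation only if the runtime or a later compression step exploits the sparsity, and accuracy is normally re-measured on the validation set after pruning.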
Figure 7. Confusion matrix of the model for radically different classes.

V. CONCLUSION

In this paper we have developed a model and architecture of a deep
neural network for classifying paintings into five styles:
impressionism, romanticism, realism, expressionism and
post-impressionism. The model was trained on 40k images while preserving
the class distribution in the training and validation sets. Various
methods were used to improve the quality of classification:
augmentations, schedulers, and selection of the optimizer parameters and
the loss function. The existing datasets of painting images were also
analyzed and preprocessed. The final accuracy of the model for
classifying a picture into five classes is 51.5%. In addition, the
weights of the neural network were pruned in order to reduce overfitting
and computation time. The result for radically different classes is much
better: 91.25%. The results of this work can be used for further
research in the area of painting analysis from digital images. This
includes classification of paintings by genre and painting
authentication: using the results of classifiers that determine the
style, genre, era, and author of the painting as features for the
classifier, and not just digital images (e.g., UV/IR images or a depth
map), will allow the system to determine the authenticity of the
picture. For styles whose images contain faces, one can analyze only the
areas with faces rather than the entire picture. To do this, one can
use, for example, the algorithm implemented in [20].

REFERENCES
[1] S. Chen, B. Mulgrew, and P. M. Grant, “A clustering technique for
digital communications channel equalization using radial basis
function networks,” IEEE Trans. on Neural Networks, vol. 4, pp. 570-578,
July 1993.
[2] R. M. Anwer, F. S. Khan, J. van de Weijer, J. Laaksonen, Combining
holistic and part-based deep representations for computational
painting categorization, in: Proceedings of the 2016 ACM on
International Conference on Multimedia Retrieval, ACM, 2016, pp.
339–342.
[3] D. G. Lowe. Distinctive image features from scale-invariant
keypoints. Int. J. Comput. Vision, 2004.
[4] N. Dalal and B. Triggs. Histograms of oriented gradients for human
detection. In International Conference on Computer Vision & Pattern
Recognition, volume 2, pages 886–893, June 2005.
[5] R. S. Arora and A. M. Elgammal. Towards automated classification
of fine-art painting style: A comparative study. In ICPR, 2012.
[6] A. E. Abdel-Hakim and A. A. Farag. Csift: A sift descriptor with
color invariant characteristics. In IEEE Conference on Computer
Vision and Pattern Recognition, CVPR, 2006.
[7] L. Torresani, M. Szummer, and A. Fitzgibbon. Efficient object
category recognition using classemes. In ECCV, 2010.
[8] G. Carneiro, N. P. da Silva, A. Del Bue, J. P. Costeira, Artistic image
classification: An analysis on the printart database, in: European
Conference on Computer Vision, Springer, 2012, pp. 143–157.
[9] F. S. Khan, S. Beigpour, J. Van de Weijer, M. Felsberg, Painting-91:
a large scale database for computational painting categorization,
Machine vision and applications 25 (6) (2014) 1385–1397.
[10] A. Sharif Razavian, H. Azizpour, J. Sullivan, S. Carlsson, Cnn
features off-the-shelf: an astounding baseline for recognition, in:
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition Workshops, 2014, pp. 806–813.
[11] S. Bianco, D. Mazzini, D. Pau, R. Schettini, Local detectors and
compact descriptors for visual search: a quantitative comparison,
Digital Signal Processing 44 (2015) 1–13.
[12] K.-C. Peng, T. Chen, A framework of extracting multi-scale features
using multiple convolutional neural networks, in: 2015 IEEE
International Conference on Multimedia and Expo (ICME), IEEE,
2015, pp. 1–6.
[13] W. R. Tan, C. S. Chan, H. E. Aguirre, K. Tanaka, Ceci n’est pas une
pipe: A deep convolutional network for fine-art paintings
classification, in: Image Processing (ICIP), 2016 IEEE International
Conference on, IEEE, 2016, pp. 3703–3707.
[14] A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification
with deep convolutional neural networks, in: Advances in neural
information processing systems, 2012, pp. 1097–1105.
[15] C. Hentschel, T. P. Wiradarma, H. Sack, Fine tuning cnns with scarce
training data - adapting imagenet to art epoch classification, in: Image
Processing (ICIP), 2016 IEEE International Conference on, IEEE,
2016, pp. 3693–3697.
[16] S. Banerji, A. Sinha, Painting classification using a pre-trained
convolutional neural network, in: International Conference on
Computer Vision, Graphics, and Image processing, Springer, 2016,
pp. 168–179.
[17] W.-T. Chu, Y.-L. Wu, Deep correlation features for image style
classification, in: Proceedings of the 2016 ACM on Multimedia
Conference, ACM, 2016, pp. 402–406.
[18] W.-T. Chu, Y.-L. Wu, Image style classification based on learnt deep
correlation features, IEEE Transactions on Multimedia.
[19] Painter by number competition:
https://www.kaggle.com/c/painter-by-numbers
[20] Wisal Hashim Abdulsalam, Rafah Shihab Alhamdani, and
Mohammed Najm Abdullah, "Facial Emotion Recognition from
Videos Using Deep Convolutional Neural Networks," International
Journal of Machine Learning and Computing vol. 9, no. 1, pp. 14-19,
2019.