
Deep Learning Models for Automated Classification of Dog Emotional States

from Facial Expressions


Tali Boneh-Shitrit¹  Shir Amir²  Annika Bremhorst³,⁴,⁵  Daniel S. Mills⁴  Stefanie Riemer³  Dror Fried¹  Anna Zamansky⁶

¹ Open University, Israel   ² Weizmann Institute of Science, Israel   ³ University of Bern, Switzerland   ⁴ University of Lincoln, UK   ⁵ dogs and science - Institute for Canine Science and Applied Cynology, Germany   ⁶ University of Haifa, Israel

arXiv:2206.05619v1 [cs.CV] 11 Jun 2022

Abstract

Similarly to humans, facial expressions in animals are closely linked with emotional states. However, in contrast to the human domain, automated recognition of emotional states from facial expressions in animals is underexplored, mainly due to difficulties in data collection and establishment of ground truth concerning emotional states of non-verbal users. We apply recent deep learning techniques to classify (positive) anticipation and (negative) frustration of dogs on a dataset collected in a controlled experimental setting. We explore the suitability of different backbones (e.g. ResNet, ViT) under different supervisions for this task, and find that features of a self-supervised pretrained ViT (DINO-ViT) are superior to the other alternatives. To the best of our knowledge, this work is the first to address the task of automatic classification of canine emotions on data acquired in a controlled experiment.

1. Introduction

It is widely accepted nowadays that animals are able to experience emotional states [16]. Facial expressions are produced by most mammalian species [20], and are also assumed to convey information about emotional states [18, 19]. Therefore, they are receiving increasing attention as indicators of emotional states in animals, as well as in research on animal emotions [18].

In human emotion research the gold standard for objective measurement of facial expressions is the Facial Action Coding System (FACS [22]). It is an anatomy-based manual annotation system which allows describing facial appearance changes based on movements of the underlying facial muscles. With the recent adaptation of FACS to different non-human species, including dogs (DogFACS [45]), FACS-related methods have been applied in several studies for the investigation of dog emotional states.

Several works have used DogFACS to address two canine emotional states of different valence: positive anticipation and frustration [6, 7, 12]. Caeiro et al. [12] investigated dog facial expressions associated with positive anticipation and frustration in naturalistic settings using online videos, showing that dogs displayed distinctive facial actions in these two states. In contrast, Bremhorst et al. [7] investigated dogs' facial expressions in the emotional states of frustration and positive anticipation in a controlled experimental setting, also standardizing the dog breed (Labrador Retriever) to reduce the potential effects of morphological variation and extremes on the dogs' facial expressions. The authors found that some action units were more common in the positive condition, and others in the negative one. In a follow-up study, Bremhorst et al. [6] used a similar set-up with new participants to induce positive anticipation and frustration in two reward contexts: food and toys. The previous results were replicated, and additional facial actions were found to be more common in the negative condition. However, none of the identified units could serve alone as a potential emotion indicator providing consistent correct classifications of the associated emotion.

A major downside of using FACS (and AnimalFACS) for facial expression analysis is that it requires extensive human annotation and certification. It is also time consuming, and may be prone to human error or bias [26]. Therefore, automated approaches are an attractive alternative, considered to have even greater objectivity and reliability than manual approaches [4, 15].

Indeed, in the human domain automated facial analysis and affective computing are vibrant fields of research. Numerous commercial software tools for automated facial analysis are available, such as FaceReader by Noldus [34], Affdex [43], EmoVu [3], and more. Some researchers consider automated tools to have greater objectivity and reliability than manual coding, eliminating subjectivity and bias [4, 15].

In contrast, the animal domain remains under-explored in the context of automated emotion recognition. Some reasons for this are outlined in Hummel et al. [29]: less data is available, and animal faces also vary more in terms of color, shape and texture. Furthermore, due to the lack of a verbal basis for establishing ground truth, data acquisition and annotation are much more complicated. To the best of our knowledge, our work is the first to investigate automatic deep-learning methods for canine emotion recognition on a rigorously created and controlled dataset.
Automated facial analysis has so far been addressed in a few species. Pain recognition from facial expressions has been investigated for rodents [2, 42, 44], sheep [37], equines [9, 29, 33], and cats [23]. Action unit recognition was automated for several types of non-human primates [5, 39]. In the context of dogs, Ferres et al. studied automated pose estimation using DeepLabCut [38] for classification of emotional states such as anger, fear, happiness and relaxation [24]. Franzoni et al. [25] used a pre-trained CNN (i.e. AlexNet [32]) to classify the dog emotional states of joy and anger. However, in both of these works the datasets included images collected from the internet and annotated by the authors, with a significant possibility of human bias and inaccurate annotations. In contrast, our work is based on a dataset obtained in a controlled setting, in which positive anticipation and frustration are induced using a carefully designed experimental protocol specified in [8].

Liu et al. addressed automated classification of dog breed using part localization [35], highlighting the great challenge in computer vision of dealing with wide variation in shape and appearance across different dog breeds. The dataset used in this paper relaxes this problem by standardizing the breed of the participants to Labrador Retriever.

In this work we investigate the adequacy of different pre-trained models for the delicate task of emotion recognition from facial images. We explore the use of two commonly used architectures: ResNet [28], a CNN with residual connections, and Vision Transformer (ViT) [21], a model based on the attention mechanism. We further consider the effect of different pre-training techniques on ImageNet [17], comparing supervised training for image classification with self-supervised training using DINO [13]. Amir et al. [1] show that DINO-ViT features encode powerful semantic information on object parts, and animal parts in particular, which can be beneficial for addressing the subtle facial movement analysis needed for canine emotion classification. To the best of our knowledge, we are the first to explore different backbones for this task.

Our main contributions are the following:

• We automatically classify the dog emotional states of positive anticipation and frustration based on facial images obtained in a controlled experimental setting, without using DogFACS annotations.

• We explore the suitability of different pre-trained backbones for this task.

• We conclude that DINO-ViT features are most suitable for this classification task, and strengthen this claim using qualitative interpretability methods.

2. Dataset

We use the dataset collected by Bremhorst et al. [8], containing short three-second videos of dogs experiencing the emotional states of frustration and positive anticipation in a controlled experimental setting. They also standardize the dog breed (Labrador Retriever) to reduce the potential effects of morphological variation and extremes on the dogs' facial expressions. They used a non-social context in order to eliminate the risk of interference from previously learned attention-getting responses: a high-value food reward was used as the triggering stimulus in two conditions, a positive condition predicted to induce positive anticipation, and a negative condition predicted to induce frustration in dogs.
The dataset contains 248 videos of 29 subjects, most of which are neutered. Figure 1 demonstrates the variety of subjects over age and gender. Each dog was filmed roughly nine times, a third of which in the positive anticipation state and two thirds in the frustration state. The same emotional state is assumed to hold throughout each video due to its short duration.

Figure 1. Number of dogs by age and by sex. The dataset contains slightly more female than male dogs, and slightly more younger dogs than older ones.

The availability of such visual-temporal data enables addressing it in two manners: (i) single frames and (ii) sequences of frames. The first implies more information loss, but is simpler and more controllable, while the latter includes the temporal dimension, which has been shown to be important for such tasks, e.g., in the context of detection of pain in horses [9, 40]. The prevalent approach in the context of automated recognition of affective states and pain in animals is, however, the single-frame basis (e.g., [2, 29, 33, 36]). Due to the exploratory nature of this study, we opt for this option, assuming that (at least the majority of) the frames capture information on the emotional state of the dogs. Therefore, our final dataset contains single frames extracted from the videos and labelled accordingly: 12569 negative frames and 6823 positive frames. Figure 2 presents example faces, indicating the diversity of poses, facial expressions, and subject appearance.

Figure 2. Example frames from the dataset. Crops of dog faces extracted from the dataset. The dataset contains images of diverse pose, facial expression, and canine color.
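As a concrete illustration, the frame-extraction step could be written roughly as follows. This is a minimal sketch under an assumed directory layout in which each clip sits under a folder named after its condition; the paths, file naming, and JPEG output format are illustrative assumptions, not the authors' pipeline.

```python
# Sketch: extract labelled single frames from the short condition-labelled videos.
# Assumes a hypothetical layout videos/<condition>/<clip>.mp4, where <condition>
# is "positive" or "negative"; every frame inherits the label of its clip.
import cv2
from pathlib import Path

def extract_frames(video_root="videos", out_root="frames"):
    for video_path in Path(video_root).glob("*/*.mp4"):
        label = video_path.parent.name                  # "positive" or "negative"
        out_dir = Path(out_root) / label
        out_dir.mkdir(parents=True, exist_ok=True)
        cap = cv2.VideoCapture(str(video_path))
        idx = 0
        while True:
            ok, frame = cap.read()                      # one BGR frame per call
            if not ok:                                  # end of the clip
                break
            out_file = out_dir / f"{video_path.stem}_{idx:04d}.jpg"
            cv2.imwrite(str(out_file), frame)
            idx += 1
        cap.release()

extract_frames()
```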

2.1. Pre-Processing

The original video frames contain background clutter including the surrounding room, humans, the dog's body, etc. We aim to focus on the facial expressions of the dogs and avoid learning other emotional state predictors (e.g. dog body postures). Hence, we trained a Mask R-CNN [27] to identify canine faces, and used it to crop the facial bounding box from each image. We trained the Mask R-CNN on roughly 200 annotated images from this dataset, making it most suited to this specific experimental setup. Figure 2 shows facial crops acquired using this pre-processing stage.
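For illustration, the cropping step might look like the sketch below. The authors' fine-tuned detector is not available here, so the sketch assumes a torchvision Mask R-CNN already fine-tuned on the ~200 annotated face images and loaded from a hypothetical checkpoint (dog_face_maskrcnn.pt); the score threshold is likewise an illustrative choice. It simply keeps the highest-scoring detection and crops it.

```python
# Sketch: crop the facial bounding box using a fine-tuned Mask R-CNN detector.
# Checkpoint name, class count, and threshold are assumptions for illustration.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Two classes: background + dog face (the fine-tuning setup is assumed).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
model.load_state_dict(torch.load("dog_face_maskrcnn.pt", map_location="cpu"))
model.eval()

def crop_face(image_path, score_threshold=0.7):
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        pred = model([to_tensor(image)])[0]        # boxes, labels, scores, masks
    if len(pred["scores"]) == 0 or pred["scores"][0] < score_threshold:
        return None                                # no confident face detection
    x0, y0, x1, y1 = pred["boxes"][0].round().int().tolist()
    return image.crop((x0, y0, x1, y1))            # PIL crop of the face box
```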
backbones and 5 · 10−6 for ViT backbone. We apply several
3. Experiments and Results

3.1. Framework

We pose the task as a binary classification task distinguishing the two emotional states. We employ the common "transfer learning" setup, training a linear probe on top of a fixed pre-trained backbone using human annotations. We explore the suitability of different backbones for this task by repeating the experiment with four pre-trained backbones: ResNet and ViT, trained either in a supervised manner for image classification [21] or in a self-supervised manner using DINO [13].
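A minimal sketch of this linear-probe setup, for the DINO-ViT case, is shown below. The paper loads pre-trained ViT weights from the timm library (see Sec. 3.2); here torch.hub is used only to keep the sketch self-contained, and the 384-dimensional feature size and two-way output are the only task-specific choices.

```python
# Sketch: linear probe on a frozen, self-supervised DINO ViT-S/8 backbone.
import torch
import torch.nn as nn

# Frozen backbone producing 384-dim CLS-token features.
backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits8")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False                 # only the linear head is trained

head = nn.Linear(384, 2)                    # positive anticipation vs. frustration
criterion = nn.CrossEntropyLoss()

def forward(images):
    with torch.no_grad():
        feats = backbone(images)            # (B, 384) frozen features
    return head(feats)                      # (B, 2) class logits
```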
3.2. Implementation Details

The dataset was divided into a training set containing 14830 frames from 22 dog subjects, and a test set containing 4562 frames from 7 dog subjects. Separating the subjects used for training and testing is a common practice in the context of animal face analysis, as it enforces generalization to unseen subjects and ensures that no specific features of an individual are used for classification [2, 9].

We use the ResNet50 architecture for both the supervised and the DINO-trained CNN backbones; the ViT backbones are ViT-S/16 trained in a supervised manner and ViT-S/8 trained with DINO. We use pre-trained ViT weights from the timm library [47]. We train all the models for 30 epochs using the Adam optimizer [31] with betas = (0, 0.999) and learning rates of 10^-4 for the ResNet backbones and 5·10^-6 for the ViT backbones. We apply several augmentations during training to improve the robustness of the models: horizontal flips, color jitter, and random crops of 80-100% of the original facial crops. All inputs were resized to 224 × 224.
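The stated augmentation and optimization settings could be written out as in the sketch below (random crops covering 80-100% of the facial crop, horizontal flips, color jitter, 224 × 224 inputs, Adam with betas (0, 0.999), and the ViT learning rate of 5e-6). The color-jitter strength and the ImageNet normalization statistics are our own illustrative assumptions.

```python
# Sketch: training transforms and optimizer settings described above.
import torch
import torch.nn as nn
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),   # 80-100% random crop
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.2, 0.2, 0.2),                  # jitter strength assumed
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],        # ImageNet stats (assumed)
                         std=[0.229, 0.224, 0.225]),
])

head = nn.Linear(384, 2)                                    # linear probe (see Sec. 3.1 sketch)
# lr 5e-6 for the ViT-based probe per the paper; 1e-4 would be used for ResNet.
optimizer = torch.optim.Adam(head.parameters(), lr=5e-6, betas=(0.0, 0.999))
# Training then runs for 30 epochs with a standard cross-entropy objective.
```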
        Backbone        Train Accuracy   Val. Accuracy
Sup.    ResNet50 [28]   0.800            0.809
        ViT [21]        0.869            0.780
DINO    ResNet50 [13]   0.870            0.813
        ViT [13]        0.878            0.853

Table 1. Classification Results. The best results are achieved using a pre-trained DINO-ViT as a backbone.

Figure 3. Loss and Accuracy Curves for each model. We show the loss and accuracy on the validation set for each trained model. The DINO-ViT based model performs better than models based on other backbones.

3.3. Results

The accuracy measures of our trained models and the loss curves appear in Tab. 1 and Fig. 3 respectively. The model trained with a DINO-ViT backbone produces the highest accuracy on the validation set. We hypothesize that this is due to DINO-ViT features being sensitive to object parts, as shown in [1], and due to the nature of the task at hand: emotion classification requires understanding at the object-part level (e.g. states of eyes, ears, etc.). Intriguingly, the backbones pre-trained with DINO produce better results than the supervised backbones.

Figure 4. EigenCAM [46] activation maps on several images for our four different models. The images in the top and bottom rows have positive and negative ground-truth emotions respectively. The DINO-ViT backbone addresses areas similar to those proposed by human-annotated action units.

3.4. Explainability

We further investigate our trained models by visualizing their activation maps on several images. We apply EigenCAM [46] to visualize the principal components of the final activations of each model. It has been shown that EigenCAM provides more easily interpretable results with less computation compared to other CAM methods such as the popular Grad-CAM [41]. We chose this method since, unlike other visualization methods such as Grad-CAM [41] and Grad-CAM++ [14], EigenCAM is a class-independent tool. This property enables EigenCAM to visualize learned patterns even when the model prediction is wrong, as opposed to older CAM methods that produce irrelevant maps when their prediction isn't correct, and it therefore allows interpreting the reasons for prediction failure. It is also more consistent compared to other state-of-the-art visualization methods. In addition, EigenCAM is not model-specific: it can be used for both ViTs and CNNs without changing layers.
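To make the mechanism concrete, a minimal sketch of the EigenCAM idea is given below: take the activations of a chosen layer, project them onto their first principal component, and reshape the projection into a saliency map. This is our own simplified rendering of the method under stated assumptions, not the authors' exact visualization code.

```python
# Sketch of EigenCAM: a class-independent, gradient-free saliency map obtained by
# projecting a layer's activations onto their first principal component.
import numpy as np

def eigen_cam(activations):
    """activations: array of shape (C, H, W) from a chosen layer
    (for a ViT, patch-token features can be reshaped into such a grid)."""
    C, H, W = activations.shape
    feats = activations.reshape(C, H * W).T          # (H*W, C): one feature vector per location
    feats = feats - feats.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(feats, full_matrices=False)
    cam = feats @ vt[0]                              # projection onto the 1st principal component
    cam = np.maximum(cam.reshape(H, W), 0)           # keep positively aligned evidence
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize to [0, 1]
```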
Several qualitative examples are presented in Fig. 4. We observe several characteristics common to all activation maps of each pretrained model: (i) the ViT models seem to exhibit better localization than the ResNet models; the highly activated regions (marked in red) are smaller and lie on more salient regions (e.g. eyes, ears and nose rather than skin); (ii) the DINO-ViT model seems to activate on multiple salient regions rather than one (e.g. activating on ears, eyes and nose rather than just ears in the top-right example). We attribute the success of the ViT-based models to the ability of ViTs to provide a more localized signal than the ResNet models. This stems from their architecture: the resolution of ViT features remains constant throughout the layers, while the resolution of CNN features diminishes as the layers become deeper.

4. Discussion

Despite the huge (and increasing) body of work on emotions in animals, there is still no common agreement between researchers even on the most basic questions. For instance, already Darwin suggested that some emotional cues (such as facial expressions) may have visual similarity across different species, and even bear the same meaning [10]; however, recent research applying objective tools (such as AnimalFACS) for measuring facial expressions has begun to question this assumption [12].

On a practical level, objective assessment of animal emotion should be a cornerstone of animal welfare practices, focusing not only on the reduction of negative emotions, but also, more recently, attempting to promote positive states. The subtle and complex nature of animal emotions, and the fact that at least some of their characteristics are species-specific (and thus different from humans'), pose significant challenges.

Deep learning has the potential to be a game changer both in providing answers to foundational scientific questions on the nature of animal emotions and in pushing forward practical tools in the context of animal welfare and health. We have explored here in detail how it can be used to classify two specific emotional states in dogs, of positive and negative valence. More specifically, having examined the suitability of different pre-trained backbones for this task, we conclude that DINO-ViT features have superior performance in this context, possibly due to the DINO-ViT features being sensitive to object parts [1].

However, for deep learning models to contribute both to scientific discoveries in animal emotion and to applied tools for animal welfare and healthcare, these models' judgments should be interpretable. We show that existing explainability methods hold much promise for this task. One interesting possibility is to map features learnt by the deep learning methods to concepts grounded in behavioral meaning, such as DogFACS [11] action units, in a way similar to what was done for human FACS in [30]. In any case, the results presented here will serve as a first baseline for future research into canine affective computing using deep learning techniques.
Acknowledgements

The authors would like to thank Prof. Hanno Würbel for his guidance in collecting and analyzing the data used in this study. The research was partially supported by the grant from the Ministry of Science and Technology of Israel and RFBR according to the research project no. 19-57-06007, and by the Israel Ministry of Agriculture and Rural Development.

References

[1] Shir Amir, Yossi Gandelsman, Shai Bagon, and Tali Dekel. Deep ViT features as dense visual descriptors. arXiv preprint arXiv:2112.05814, 2021.
[2] Niek Andresen, Manuel Wöllhaf, Katharina Hohlbaum, Lars Lewejohann, Olaf Hellwich, Christa Thöne-Reineke, and Vitaly Belik. Towards a fully automated surveillance of well-being status in laboratory mice using deep learning: Starting with facial expression analysis. PLoS One, 15(4):e0228059, 2020.
[3] Jacob Arnold and Matthew Emerick. Emotional evaluation through facial recognition. sites.psu.edu.
[4] M. S. Bartlett, J. C. Hager, P. Ekman, and T. J. Sejnowski. Measuring facial expressions by computer image analysis. Psychophysiology, 36(2):253–263, Mar. 1999.
[5] Gaddi Blumrosen, David Hawellek, and Bijan Pesaran. Towards automated recognition of facial expressions in animal models. In Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 2810–2819, 2017.
[6] A. Bremhorst, D. S. Mills, H. Würbel, and S. Riemer. Evaluating the accuracy of facial expressions as emotion indicators across contexts in dogs. Animal Cognition, pages 1–16, 2021.
[7] A. Bremhorst, N. A. Sutter, H. Würbel, et al. Differences in facial expressions during positive anticipation and frustration in dogs awaiting a reward. Scientific Reports, 2019.
[8] Annika Bremhorst, Nicole A. Sutter, Hanno Würbel, Daniel S. Mills, and Stefanie Riemer. Differences in facial expressions during positive anticipation and frustration in dogs awaiting a reward. Scientific Reports, 9(1):1–13, 2019.
[9] Sofia Broomé, Karina Bech Gleerup, Pia Haubro Andersen, and Hedvig Kjellström. Dynamics are important for the recognition of equine pain in video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12667–12676, 2019.
[10] C. Darwin. The expression of emotions in man and animals. The University of Chicago Press, 1872/1965.
[11] Cátia Caeiro, Kun Guo, and Daniel Mills. Dogs and humans respond to emotionally competent stimuli by producing different facial actions. Scientific Reports, 7(1):1–11, 2017.
[12] Cátia C. Caeiro, Anne M. Burrows, and Bridget M. Waller. Development and application of CatFACS: Are human cat adopters influenced by cat facial expressions? Applied Animal Behaviour Science, 2017.
[13] Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, and Armand Joulin. Emerging properties in self-supervised vision transformers. ICCV, 2021.
[14] Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, and Vineeth N. Balasubramanian. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 839–847, 2018.
[15] Jeffrey F. Cohn and Paul Ekman. Measuring facial action. The New Handbook of Methods in Nonverbal Behavior Research, 525:9–64, 2005.
[16] Amber J. de Vere and Stan A. Kuczaj. Where are we in the study of animal emotions? Wiley Interdisciplinary Reviews: Cognitive Science, 7(5):354–362, 2016.
[17] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. CVPR, 2009.
[18] Kris A. Descovich, Jennifer Wathan, Matthew C. Leach, Hannah M. Buchanan-Smith, Paul Flecknell, D. Framingham, and Sarah-Jane Vick. Facial expression: An under-utilised tool for the assessment of welfare in mammals. ALTEX, 2017.
[19] Rui Diogo, Virginia Abdala, N. Lonergan, and B. A. Wood. From fish to modern humans: comparative anatomy, homologies and evolution of the head and neck musculature. Journal of Anatomy, 2008.
[20] Rui Diogo, Bernard A. Wood, Mohammed A. Aziz, and Anne Burrows. On the origin, homologies and evolution of primate facial muscles, with a particular focus on hominoids and a suggested unifying nomenclature for the facial muscles of the mammalia. Journal of Anatomy, 2009.
[21] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. ICLR, 2021.
[22] Paul Ekman and Wallace V. Friesen. Facial Action Coding System: Manual. Palo Alto, Calif.: Consulting Psychologists Press, 1978.
[23] Marcelo Feighelstein, Ilan Shimshoni, Lauren Finka, Stelio P. Luna, Daniel Mills, and Anna Zamansky. Automated recognition of pain in cats. Submitted, 2022.
[24] Kim Ferres, Timo Schloesser, and Peter A. Gloor. Predicting dog emotions based on posture analysis using DeepLabCut. Future Internet, 14(4):97, 2022.
[25] Valentina Franzoni, Alfredo Milani, Giulio Biondi, and Francesco Micheli. A preliminary work on dog emotion recognition. In IEEE/WIC/ACM International Conference on Web Intelligence - Companion Volume, pages 91–96, 2019.
[26] Jihun Hamm, Christian G. Kohler, Ruben C. Gur, and Ragini Verma. Automated facial action coding system for dynamic analysis of facial expressions in neuropsychiatric disorders. Journal of Neuroscience Methods, 200(2):237–256, 2011.
[27] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. ICCV, 2017.
[28] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CVPR, 2016.
[29] Hilde I. Hummel, Francisca Pessanha, Albert Ali Salah, Thijs J. P. A. M. van Loon, and Remco C. Veltkamp. Automatic pain detection on horse and donkey faces. In FG, 2020.
[30] Pooya Khorrami, Thomas Paine, and Thomas Huang. Do deep neural networks learn facial action units when doing expression recognition? In Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 19–27, 2015.
[31] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. ICLR, 2015.
[32] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. NeurIPS, 2012.
[33] Gabriel Carreira Lencioni, Rafael Vieira de Sousa, Edson José de Souza Sardinha, Rodrigo Romero Corrêa, and Adroaldo José Zanella. Pain assessment in horses using automatic facial expression recognition through deep learning-based modeling. PLoS One, 16(10):e0258672, 2021.
[34] Peter Lewinski, Tim M. den Uyl, and Crystal Butler. Automated facial coding: Validation of basic emotions and FACS AUs in FaceReader. J. Neurosci. Psychol. Econ., 7(4):227–236, Dec. 2014.
[35] Jiongxin Liu, Angjoo Kanazawa, David Jacobs, and Peter Belhumeur. Dog breed classification using part localization. In European Conference on Computer Vision, pages 172–185. Springer, 2012.
[36] Yiting Lu, Marwa Mahmoud, and Peter Robinson. Estimating sheep pain level using facial action unit detection. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pages 394–399. IEEE, 2017.
[37] Marwa Mahmoud, Yiting Lu, Xijie Hou, Krista McLennan, and Peter Robinson. Estimation of pain in sheep using computer vision. In Handbook of Pain and Palliative Care, pages 145–157. Springer, 2018.
[38] Alexander Mathis, Pranav Mamidanna, Kevin M. Cury, Taiga Abe, Venkatesh N. Murthy, Mackenzie Weygandt Mathis, and Matthias Bethge. DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nature Neuroscience, 21(9):1281, 2018.
[39] Anna Morozov, Lisa Parr, Katalin M. Gothard, Rony Paz, and Raviv Pryluk. Automatic recognition of macaque facial expressions for detection of affective states. bioRxiv, 2021.
[40] Maheen Rashid, Sofia Broomé, Katrina Ask, Elin Hernlund, Pia Haubro Andersen, Hedvig Kjellström, and Yong Jae Lee. Equine pain behavior classification via self-supervised disentangled pose representation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1646–1656, 2022.
[41] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In ICCV, pages 618–626. IEEE Computer Society, 2017.
[42] Susana G. Sotocinal, Robert E. Sorge, Austin Zaloum, Alexander H. Tuttle, Loren J. Martin, Jeffrey S. Wieskopf, Josiane C. S. Mapplebeck, Peng Wei, Shu Zhan, Shuren Zhang, et al. The rat grimace scale: a partially automated method for quantifying pain in the laboratory rat via facial expressions. Molecular Pain, 7:1744–8069, 2011.
[43] Sabrina Stöckli, Michael Schulte-Mecklenbeck, Stefan Borer, and Andrea C. Samson. Facial expression analysis with AFFDEX and FACET: A validation study. Behav. Res. Methods, 50(4):1446–1460, Aug. 2018.
[44] Alexander H. Tuttle, Mark J. Molinaro, Jasmine F. Jethwa, Susana G. Sotocinal, Juan C. Prieto, Martin A. Styner, Jeffrey S. Mogil, and Mark J. Zylka. A deep neural network to assess spontaneous pain from mouse facial expressions. Molecular Pain, 14:1744806918763658, 2018.
[45] Bridget Waller, Catia Caeiro, Kate Peirce, Anne Burrows, Juliane Kaminski, et al. DogFACS: the dog facial action coding system. 2013.
[46] Haofan Wang, Mengnan Du, Fan Yang, and Zijian Zhang. Score-CAM: Improved visual explanations via score-weighted class activation mapping. CoRR, abs/1910.01279, 2019.
[47] Ross Wightman. PyTorch image models. https://github.com/rwightman/pytorch-image-models, 2019.