0% found this document useful (0 votes)

98 views

Python para

This document is a diploma thesis written by Johannes Benjamin Wagner that explores using face recognition algorithms to recognize individual pets for a mixed reality pet gaming system called Metazoa Ludens. The thesis provides background on motion tracking, object recognition, and face recognition algorithms. It describes the hamster species that will be recognized and discusses their natural identification features. It then details the implementation of a prototype recognition system based on histograms and evaluates its performance at recognizing hamsters compared to existing face recognition methods.

Uploaded by

Jose Maurtua

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

98 views

Python para

Uploaded by

Jose Maurtua

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 104

H OCHSCHULE F UR TWANGEN U NIVERSITY (HFU)

F ACULTY OF D IGITAL M EDIA

C OURSE P ROGRAMME C OMPUTER S CIENCE IN M EDIA (MI)

Diploma Thesis

From humans to pets – How to use face

recognition algorithms for pet recognition

written by Johannes Benjamin Wagner

Student ID: 215386
born on the 24th of March 1982 in Tübingen

Supervisors (HFU): Prof. Nikolaus Hottong, Prof. Daniel Fetzner

Supervisor (NUS): Assoc Prof. Dr. Adrian David Cheok

Reutlingen, February 28, 2007

To my late godmother Susanne,
who encouraged me to go out into the world.
Acknowledgments

I would like to thank my supervisors Prof. Nikolaus Hottong and Prof.

Daniel Fetzner for their helpful advice and their ability to ask the right
questions.
Furthermore I would like to thank all the members of the MixedRealityLab
Singapore for their help, especially Prof. Dr. Adrian Cheok, who gave me
the great opportunity to work in the lab, as well as Roger Thomas Tan
and Dr. Owen Noel Newton Fernando, who helped me raising the right
questions and finding the right answers for my thesis.
Last but not least, I would like to thank my family, especially my parents,
for the opportunity to study, and for all the support and understanding
they gave to me throughout all my years of study.
Abstract

The project Metazoa Ludens is a new computer mediated gaming system

which allows humans to play mixed reality computer games with their pets,
via custom built technologies and applications, and is currently being de-
veloped by the MixedRealityLab of the National University of Singapore.

The aim of this diploma thesis is, to develop a visual tracking system,
which improves the pet-computer interface of Metazoa Ludens, in order to
allow the interaction of one or more pets with a human through the game.

Starting from the discussion of possible tracking systems, the necessity

of a recognition module is explained, which is able to distinguish between
individual pets. Therefore multiple object recognition and face recognition
technologies are explored and theoretically discussed. The prototypical
implementation of a recognition system, which is based on histograms, is
being described in detail. Finally, the recognition performance of the pro-
totypical implementation is evaluated through experiment, and compared
to the recognition performance of existing face recognition algorithms.
Table of Contents

1 Introduction 1
1.1 Remote human-pet interaction . . . . . . . . . . . . . . . . . . . 1
1.2 Metazoa Ludens . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 Theory and Background 9

2.1 Motion tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1.1 Segmentation by thresholding . . . . . . . . . . . . . . . 9
2.1.2 Segmentation by motion . . . . . . . . . . . . . . . . . . 10
2.1.3 Region labelling . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1.4 Correspondence problem . . . . . . . . . . . . . . . . . . 12
2.2 Tracking for Metazoa Ludens . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Current tracking System . . . . . . . . . . . . . . . . . . 13
2.2.2 Improved tracking system . . . . . . . . . . . . . . . . . . 14
2.3 Object recognition . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.1 Hamster recognition . . . . . . . . . . . . . . . . . . . . . 16
2.4 Face recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.1 Face recognition processing flow . . . . . . . . . . . . . . 18
2.4.2 Face recognition algorithms . . . . . . . . . . . . . . . . 19
2.4.2.1 Principal components analysis . . . . . . . . . . 21
2.4.2.2 Linear Discriminant Analysis . . . . . . . . . . . 23
2.4.2.3 Bayesian (probabilistic) classifier . . . . . . . . 24

3 The Hamsters 27
3.1 Biological classification . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Natural Habitat . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

i
Table of Contents

3.4 Hamsters as pet . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.5 Hamster species . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.1 Roborovski hamster (Phodopus roborovskii) . . . . . . . 29
3.5.2 Russian Dwarf Hamster (Phodopus sungorus) . . . . . . 30
3.5.3 Golden hamsters (Mesocricetus auratus) . . . . . . . . . 31
3.6 Natural behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.7 Hamsters in Metazoa Ludens . . . . . . . . . . . . . . . . . . . 33
3.8 Hamster recognition . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.8.1 Natural identification . . . . . . . . . . . . . . . . . . . . 34
3.8.2 Features of the fur . . . . . . . . . . . . . . . . . . . . . . 35

4 Implementation 37
4.1 Development environment . . . . . . . . . . . . . . . . . . . . . 37
4.2 Software architecture . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Image retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.3.1 Image capturing . . . . . . . . . . . . . . . . . . . . . . . 40
4.3.2 Camera calibration . . . . . . . . . . . . . . . . . . . . . . 41
4.4 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.1 Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.4.2 Region labelling . . . . . . . . . . . . . . . . . . . . . . . . 44
4.4.3 Orientation . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.4.4 Normalisation . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.5.1 Binary region descriptors . . . . . . . . . . . . . . . . . . 47
4.5.2 Basic gray value region descriptors . . . . . . . . . . . . 48
4.5.3 Histogram descriptors . . . . . . . . . . . . . . . . . . . . 49
4.5.4 Face recognition methods . . . . . . . . . . . . . . . . . . 51

5 Testing 53
5.1 Experimental setup . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.1.1 Test set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.1.2 Test setup . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.2 Classification by histograms . . . . . . . . . . . . . . . . . . . . 54
5.2.1 Discriminant analysis of histogram classifiers . . . . . . 54

ii
Table of Contents

5.2.2 Comparative evaluation of distance metrics . . . . . . . 60

5.2.3 Classifier robustness against external variance . . . . . 64
5.2.4 Quality of classification . . . . . . . . . . . . . . . . . . . 67
5.3 Classification by face recognition algorithms . . . . . . . . . . 71
5.3.1 PCA results . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.3.2 LDA results . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.3.3 Bayesian results . . . . . . . . . . . . . . . . . . . . . . . 74

6 Conclusion 75
6.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.2 Further work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

I Appendix I
A Classification performance of different histogram classifiers . II
B Comparison of different distance metrics (HSV-H) . . . . . . . VI
C Classifier robustness against external variance . . . . . . . . . VIII
D Quality of classification . . . . . . . . . . . . . . . . . . . . . . . XI
E Face recognition algorithms test results . . . . . . . . . . . . . XIII

F Bibliography XVI

iii
Chapter 1

Introduction

Pets are an important part of life for many people, and the interaction be-
tween pet-owner and pet is proven to be beneficial for the well-beeing of
humans as well as for the animals. Medical studies showed positive effects
of human-pet interaction on the medical condition of humans1 as well as
positive effects on social behaviour and self-esteem.2 Unfortunately, the
time which can be spent for these positive relationships between animals
and humans is getting constantly shorter due to increased working hours
and the resulting lack of free time. Furthermore an increased number of
professionals is forced to travel across the globe for business trips,3 with-
out the chance to see family members as well as their beloved pets. The
technological development of modern communication technologies, first of
all the internet, improves communication all around the world. However,
these improvements mainly affect the inter-human communication and
do not provide any way of remote human-pet interaction.

1.1 Remote human-pet interaction

Remote human-pet interaction is still a young research area, though first
artistic projects in this area already emerged in the late 1990’s. The project
"petting zoo"4 connected visitors of a gallery to a pet rabbit by using a video
camera and a monitor. The visitors could see the rabbit on the monitor
1
see Serpell (1991)
2
see Bergesen (1989)
3
see Selmer (1995)
4
see Exploration (1997)

1
Chapter 1 Introduction

Figure 1.1: "Petting Zoo" by the Center of Metahuman Exploration5

and speak to him through a phone. Furthermore they could use the phone
buttons to control a robotic arm and pet the rabbit through the system (see
Figure 1.1). Although this system allowed only limited interaction modes
it can be seen as the first remote human-pet interaction system.
Rover@home,6 a project of the MIT Media Lab, consisted of a computer me-
diated human-dog interface, which was derived from the popular clicker-
training technique. The author emphasised the animal-computer interface
as the main problem in mediated human-pet interaction and regarded it
as a more general form of a human-computer interface. Subsequently
human-computer interfaces would be just a special case of these inter-
faces for the animal homo sapiens.7 This definition allowed it to use design
rules for human computer interfaces to design the dog-computer interface
of the Rover@home project. The implementation of the project included
speakers, to talk to the dog, a treat dispenser and a motor-driven toy
which could be remotely controlled, as well as a webcam to watch the
reactions of the dog.
The project "Hello Kitty"8 proposed a combination of a feeding device and
a remote controlled toy for the remote interaction of a cat and a cat-owner.
The human could activate the feeding device and the toy by clicking on
a hyperlink on a website, and watch the reactions of the cat through a
webcam system. An interesting outcome of this project is the use of a

5
from Exploration (1997)
6
see Resner (2001)
7
see p.13 Resner (2001)
8
see Mikesell (2003)

2
1.1 Remote human-pet interaction

Figure 1.2: Playing games against real crickets10

embedded network device, which is independent of a computer and can

be controlled by a standard webbrowser, to allow the human to control
the pet-computer interface.
Another remote human-animal interaction project is "Hello-Fish",9 which
was proposed in 2004. The authors used a video camera and a color-
based tracking software to measure the movements of multiple fish in a
tank. This data was used to display an animation and generate music in
the remote computer of the pet owner. In contrast to the three previous
projects Hello-Fish did not allow any direct interaction of pet-owner and
pet, it merely monitored the actions of the fish, without providing a chan-
nel for the pet-owner to influence the fish. However, it has to be noted that
this lack of interaction is not specific for the Hello-Fish project but for the
interaction between pet-owners and their fish, which is, in general, much
less intensive than the interaction between pet-owners and other pets, for
example cats or dogs.
A different perspective on mediated human-animal interaction was taken
in the project "Playing games against real crickets",11 whose authors devel-
9
see Jang and Lee (2004)
10
from van Eck (2006)
11
see van Eck and Lamers (2006)

3
Chapter 1 Introduction

Figure 1.3: Poultry internet: rooster wearing a touch dress and a human
interacting with the rooster12

oped a game setup to play the game Pac-Man, while real crickets controlled
the movements of the ghosts in the game. The crickets where trapped in a
maze (see Figure 1.2), and could be forced to move by activating vibration
motors in different parts of the maze, while their position was tracked with
a color tracking algorithm. The aim of the project was not to establish an
enjoyable game-play for both, the human and the crickets, but to find out,
whether the crickets could act as intelligent control mechanisms which
ensure a less deterministic opponent behaviour than pre-programmed ar-
tificial opponents.
"Poultry internet"13 is a multimodal remote human-pet interaction system
which is meant to improve poultry welfare by establishing a strong link
between a human and a chicken. In contrast to other human-pet inter-
action projects poultry internet has also a strong focus on novel haptic
human-computer interfaces, beside the focus on a the pet-computer inter-
face. The poultry-pet wears a custom-built touch dress (see Figure 1.3),
while its position is tracked by a video tracking system. The movements of
the pet trigger the movement of a toy chicken on a table near the human.
If the human touches the toy chicken this touch is transmitted through
12
from Lee et al. (2006)
13
see Lee et al. (2006)

4
1.2 Metazoa Ludens

the internet, and is replicated by vibrating motors in the touch dress of the
poultry-pet. Furthermore movements of the pet trigger vibrating motors
at the humans feet, so the human can feel if the pet moves.
In general, most remote human-animal interaction systems have a strong
focus on monitoring the behaviour of the pet, and propose the interaction
based on basic schemes like the clicker-technique or gratification through
feeding of treats. Therefore an inter-species gaming framework, Metazoa
Ludens,14 is proposed, which allows the remote interaction of a human
and his pet, which is interesting for both, in the form of a game.

1.2 Metazoa Ludens

Metazoa Ludens is a remote human-pet interaction project, which allows

pet-owners to play with their pets over the internet. The game is currently
developed by the MixedRealityLab,15 which is part of the National Univer-
sity of Singapore. Mixed reality is a technology that allows the digital world
to be extended into the user’s physical world. Mixed reality research aims
to develop technologies that allow to mix or overlap the computer gener-
ated virtual reality with the users’ perception of the real world. Mixed re-
ality includes the development of new human-computer interfaces as well
as animal-computer interfaces, with the goal to establish a mixed reality
experience for both human and animal users.
The objective of Metazoa Ludens is, to allow humans to interact with their
pets by playing mixed-reality computer games with them, even if they are
not at the same place. Through this interface humans can give their
pets the attention they need, enhance their bonding and provide the re-
quired exercises. While internet games are a popular way of interaction
between humans, there are no games which allow the remote interaction
of humans and animals. One of the biggest problems of computer medi-
ated human-pet interaction is the interface for the pet-player. The human
player can be informed how the game works and can play it with the stan-
14
see KC Tan et al. (2006a)
15
see MixedRealityLab (2006)

5
Chapter 1 Introduction

Figure 1.4: Metazoa Ludens System Overview16

dard computer interfaces, which include a screen, a keyboard and mouse.

In contrast, the pet cannot be informed how it should play the game, it
can just be taught using a learning method, for example operant condi-
tioning. Therefore the game actions of the pet can only include natural
behaviours, which are then translated into inputs for the game engine.
The proposed interface for Metazoa Ludens consists of a running area in-
side a tank which is 860mm by 860mm wide, and a bait mounted on a
movable arm, which can be moved by motors inside the tank. The game
is a chaser-evader game, where the human is chased by the hamster. To
achieve this gameplay a feedback loop is established as shown in Figure
1.4. The hamster chases after the bait, which is controlled by the human
player. The position of the hamster is tracked by a video camera and is
used to control the position of the hamster avatar in the virtual space.
The human player controls a human avatar in the virtual space which is
represented by the bait in the real world.
Empirical tests using a prototype of the Metazoa Ludens system have been
successful, and showed a significant improvement of the hamsters body
16
from KC Tan et al. (2006a), p. 4

6
1.3 Outline

score through the game play.17 However, the tests showed as well, that
it would be desirable to have the possibility to include more than one
hamster at the same time into the game. In this case the human-pet inter-
action in Metazoa Ludens could be enriched with the inter-pet interaction
between the hamsters. Furthermore the training inside the game could be
used to ensure the well-beeing of more hamsters at one time. The current
version of the pet-computer interface does not allow to play with more
than one pet simultaneously, because the motion tracking algorithm is
only able to track the movement of one pet at the same time. Therefore
the tracking algorithm of Metazoa Ludens has to be improved.
Another reason why the tracking system has to be changed is the fact, that
in the current version of Metazoa Ludens, a human is needed to place the
hamster from the cage to the tank. For the future it is planned to connect
the game tank to the cage by a tunnel, to give the hamsters the possibility
to play whenever they like, and end the game as well, if they are tired. In
order to keep track which of the hamsters are playing, a system has to be
developed which is able to distinguish between the individual animals.

1.3 Outline
The main part of this thesis will describe how the pet-computer interface
of Metazoa Ludens can be improved to allow the interaction of multiple
animals. Therefore the fields of motion tracking and object recognition are
explored. Human face recognition is taken as a starting point on the way
to build a pet recognition algorithm. After describing multiple face recog-
nition algorithms, Chapter 3 will discuss features of the hamsters which
can be used for recognition. Chapter 4 will describe the implementation of
a prototypical hamster recognition algorithm, and Chapter 5 will discuss
the classification performance of different hamster recognition algorithms,
in comparison with standard face recognition algorithms.

17
see p.p. 6 KC Tan et al. (2006b)

7
Chapter 2

Theory and Background

2.1 Motion tracking

Motion tracking is a part of the research area of computer vision which
develops algorithms to track moving objects in a scene. This task, which
can be performed by humans with ease, leads to difficult problems when
performed by a computer. The standard input for a tracking software is a
video stream, either a live stream from a camera, or a video file. The first
problem is to find the objects which should be tracked inside the image, or
spoken in more general words, to define for each pixel in the input image
if it belongs to an object or to the background. This process is typically
referred to as segmentation.1

2.1.1 Segmentation by thresholding

In the easiest case the segmentation can be based on the gray-values of the
image, assuming that the objects have significantly different gray-values
than the background. In this case the segmentation can be easily done
using a threshold operation,2 which labels every pixel as object pixel or
background pixel, depending on the fact if the pixel value is above or below
a certain threshold (see Figure 2.1). Unfortunately simple thresholding
fails whenever the gray-values of objects and background are not different
enough or when the lighting is not uniform, which leads to changing gray-
values. To solve this problem several adaptive thresholding operations
1
see p.544 Steger (2006)
2
see p.31 Davies (2005)

9
Chapter 2 Theory and Background

Figure 2.1: Segmentation by thresholding5

exist, which use histograms to find an optimal threshold value.3 Another

class of thresholding methods are local thresholding methods,4 which are
often used in optical character recognition. In this case local maxima of
the intensity distribution are used to find the borders between objects and
background.
Although segmentation by thresholding is an easy method to find objects
in an image, it only works if the illumination can be controlled and if
the basic assumption of significantly different gray-values of object and
background is valid.

2.1.2 Segmentation by motion

A different approach, which can be taken if moving objects should be

tracked, is to segment the image by motion. Therefore a difference im-
age between two consecutive frames of a video stream is used, as this
difference image should show changes between the frames and therefore
segment moving parts of an image from non-moving parts. In general, this
approach finds temporal changes in the intensity distribution of a scene.
Problems can arise if the objects consist of uniform gray-values as these
areas do not give any sign of motion in the difference image, as well as
3
see p.548 Steger (2006)
4
see p.119 Davies (2005)
5
from Davies (2005), p. 32

10
2.1 Motion tracking

edges which are parallel to the direction of movement. This problem is

called aperture problem6 as it stems from the size of the aperture which
has to be big enough to cover edges of the objects, which are not parallel
to the motion.
One concept to describe the motion of objects inside an image is called
optical flow. It describes the movement of objects as a global optical flow
field, which consists of one motion vector per pixel. A standard algorithm
which is able to compute the optical flow field from difference images is
the optical flow algorithm by Horn and Schunck.7 In this case a local
operator is used to compute the motion vectors which are later optimised
with the assumption, that the optical flow field varies smoothly over the
image. Using this assumption an optical flow field can be found whose
motion vectors are zero for static image parts and unequal zero for moving
objects. Therefore optical flow can be used to segment moving objects from
a non-moving background.
Segmentation by motion fails whenever an object stops moving, so it has to
be combined with other segmentation techniques to ensure segmentation
in all situations.

2.1.3 Region labelling

Region labelling denotes a process which is used to label connected compo-

nents, the so-called regions. While a segmentation algorithm returns only
a segmentation of background and all foreground pixels, region labelling
can be used to differentiate between multiple pre-segmented objects in the
foreground. Therefore the labelling algorithm has to determinate the con-
nectivity of every pixel of the input image, which can be done for example
by a depth-first search.8 The output of a region labelling algorithm is an
image where every pixel is marked with a label value which corresponds
to a region.

6
see p.507 Davies (2005)
7
see Horn and Schunck (1981)
8
see p.552 Steger (2006)

11
Chapter 2 Theory and Background

2.1.4 Correspondence problem

Segmentation and labelling algorithms provide knowledge how many ob-

jects are present in the image, and at which locations they are. Using
these methods it is possible to track the motions of a object in a scene as
long as the scene consists only of one object and the background. How-
ever, once there are multiple objects in a scene a correspondence problem
arises. In general, tracking aims to follow the movement of objects. There-
fore it has to ensure temporal correspondence, i. e. to make sure that the
labelling of an object region in frame In corresponds to the object region
with the same label in frame In−1 . A basic approach includes the assump-
tion, that the optimal correspondence minimises the distance between the
regions’ positions in two consecutive frames. This approach fails when-
ever an object occludes another object, or when the trajectories of objects
cross. Therefore tracking filters, for instance Kalman-filters, can be used
to estimate the movement of objects and optimise tracking performance.
Another idea to tackle the correspondence problem and the segmentation
problem altogether is the use of template matching methods or histogram
based trackers.9 Both methods search for a region in the image which is
specified by a template of salient points (see Figure 2.2) or by a histogram,
for example of hue values of the tracking region. The region which matches
the template best is considered as new position of the object.

2.2 Tracking for Metazoa Ludens

The tracking system is one part of the animal-computer interface of Meta-

zoa Ludens. It consists of a video camera which is mounted on top of
the running space, looking down, in order to cover the whole area inside
which the pets can move. The goal of the tracking process is, to find the
position of each individual pet in the area, which is then used as an input
for the game engine. The game engine uses the position data to display
the pet’s representation in the virtual reality as well as to calculate the
proximity between human and pet player. As the pets can only move on
9
see Deutsch et al. (2005)

12
2.2 Tracking for Metazoa Ludens

Figure 2.2: Template matching: The crosses denote the tracked image
points, the dashed rectangle denotes the template position in
the previous frame10

the flat surface, and the camera is mounted on top and is looking down,
the problem can be tackled as a 2D tracking problem. The background of
the tracking input is a black fabric, which allows the use of thresholding
techniques for segmentation. Furthermore, the setup ensures that occlu-
sion, which is a big problem in other applications like people-tracking, is
no issue, as the pets are not likely to be occluding each other.
The following section describes the current tracking system which was
used for a prototype of Metazoa Ludens, as well as the further development
towards a combined hamster-tracking/hamster-recognition system.

2.2.1 Current tracking System

The current tracking system of Metazoa Ludens is a basic visual tracking

system, using a near-infrared light source and a videocamera with a near-
infrared filter. The camera is mounted on top of the cage, looking down
to the floor, and produces a grayscale image. This grayscale image shows
a high contrast between the floor of the cage as background and the pet
as foreground, which allows an easy and reliable visual tracking of the
hamster’s position. Therefore the image is converted to a binary image
10
from Deutsch et al. (2005), p. 270

13
Chapter 2 Theory and Background

using a variable threshold, which results in black pixels for the foreground
and white pixels for the background.
The tracking algorithm determines the position of the hamster by the sim-
ple calculation of the mean position of all black pixels. Although this
method is simple, computionally cheap and therefore suitable for a real-
time tracking system, it has some major drawbacks. It just works if only
one object is present, because multiple objects influence the tracker alto-
gether. This leads to a tracking point which is somewhere between the
objects, depending on the size of the objects.
In order to track multiple hamsters, a tracking algorithm has to be devel-
oped, which segments the image not only in background and foreground
but into multiple foreground objects. It has to be able to distinguish be-
tween the objects, and find the objects again if they leave and enter the
field of view of the camera again. Although the camera’s view covers the
whole gaming area it does not cover the rest area for the hamsters, which is
connected to the gaming area with a tunnel. For this reason the algorithm
has to involve a recognition part, which is able to distinguish between all
hamsters which are part of the test.

2.2.2 Improved tracking system

The new tracking system should be able to detect the position of each
hamster inside the gaming area at any time. In general, the system has
to track and recognise a number of non-rigid moving objects. The first
step uses an adaptive color threshold to distinguish between foreground
and background. Therefore the background color and standard deviance
is learned by the capturing of some frames without any hamsters. These
values are used for the segmentation process, where all pixels inside the
background color range are considered as background and all other pixels
are considered as foreground.
In the next step a region labelling algorithm is used to label the different
objects. The algorithm returns a number of labeled regions, which de-
scribe the objects inside the image, e.g. the hamsters, and a background
object. The region labels are not corresponding to the individual hamsters,

14
2.2 Tracking for Metazoa Ludens

Figure 2.3: Input image for the tracking software in Metazoa Ludens

but to the position of the region inside the image. In order to track individ-
ual hamsters, a way has to be found to distinguish between the individual
hamsters. The easiest approach for the tracking problem would be a track-
ing filter, for example a kalman filter, which could help to keep the labeling
of the hamsters constant by approximating the movements. Unfortunately,
this approach fails whenever the hamsters are close together, and it fails
whenever a hamster leaves the gaming area and moves back in again. To
ensure tracking in these cases as well, a combination of a tracking filter
with a recognition algorithm could be used. The tracking filter is able to
track the movement of the objects in the default case, while the recognition
part is used to handle the special cases. Furthermore a pet recognition al-
gorithm could be useful in many more scenarios, for example in pet shops,
for pet lovers or in medical research.

15
Chapter 2 Theory and Background

2.3 Object recognition

General object recognition can be approached in various ways, which can

be divided into three main areas.11 Methods based on invariant properties
assume that objects have properties which can be measured in every view.
One example where this idea is often used is the field of machine vision
in quality control, where flat objects can be recognised by their size12 or
by the distance between salient points. The second approach is based on
parts and structural descriptions, which describe the structure of an object
as a combination of primitive parts. This approach is mostly used to dis-
tinguish between object classes, for instance to distinguish between cars,
motorcycles and houses, but is not well suited to distinguish between
different objects of the same class, as they are often built from the same
primitives.This problem is especially important for human face recognition,
as all human faces are built of the same primitives.13 The third category
of recognition methods are the alignment based methods which try to nor-
malise the object image into a standard view by affine transformations, in
order to compare these normalised images directly.

2.3.1 Hamster recognition

The input of the recognition algorithm is a single frame of a video stream

which shows the pet from a top view. In order to use this data to distin-
guish between multiple hamsters, a suitable descriptor has to be found.
First prototypical tests using saturation histograms as a descriptor have
been successful. Using the histograms, it was possible to decide which of
two hamsters is present in the image, even if they moved very fast, and
were therefore affected by motion blur. The algorithm is based on a cross-
correlation of the histogram in question and stored gallery histograms of
all known hamsters. Although this approach worked in the test case, it
has to be examined further, and tested for the case of more than two ham-
sters. One problem is the high intra-subject variation, mainly due to mo-
11
see Ullman (2000)
12
see p.556 Steger (2006)
13
see p.227 Howell (1999)

16
2.4 Face recognition

tion blur and change of appearance through movement. A further problem

is the low inter-subject variation for the descriptor, which may not allow
to distinguish between more then two objects. This stems mainly from
the fact, that the descriptor only measures the distribution of saturation
values - how many pixels of each color range - and is therefore unable to
distinguish between two subjects which have the same saturation distri-
bution. Therefore further recognition algorithms have to be tested which
model the specific features of the hamsters fur pattern better than a sim-
ple histogram descriptor.
In order to develop a hamster recognition algorithm, the field of human
face recognition is explored. Like human face recognition the hamster
recognition algorithm has to distinguish between objects of the same class,
which are built of the same primitives. Human faces as well as the ham-
sters are non-rigid objects, which leads to problems if images are com-
pared for recognition. Therefore it is assumed that solutions from the field
of human face recognition are a helpful tool in the search for a possible
hamster recognition algorithm. The following section will give an overview
over the field of human face recognition in general and will describe several
face recognition algorithms.

2.4 Face recognition

Face recognition is a special case of the broader field of object recogni-

tion, which describes methods and algorithms to classify individual hu-
man faces in order to distinguish between them.
Human face recognition is a research area inside the field of image analy-
sis which has gained a lot of attention during the last decades. This trend
stems from the wide area of possible applications of face recognition as
biometrical tool in security applications, for example video surveillance,
access control or user authentication.14 Powerful computers are needed
in order to build face recognition applications, which is another reason,
why face recognition could move from a theoretical framework to a usable
14
see p.1 Zhao et al. (2000)

17
Chapter 2 Theory and Background

tool in the last couple of years. Today, face recognition systems show
good performance in controlled environments, while unconstrained face
recognition is still a difficult task, because of the big variations through
illumination, pose, expressions and aging.15
In general, face recognition can be formulated as the following problem:
"Which persons are shown on a given still image or video?" There are three
different types of scenarios for face recognition:16
Face verification (”Am I who I say I am?”): A person claims to have a
certain identity, which has to be proven by comparing the face image of
the person to a stored face image of the claimed identity. If the result of the
comparison is over a certain threshold, the persons’ identity is considered
as verified.
Face identification (”Who am I?”): In order to identify a person, the face
image has to be compared with each image inside the face database. The
results are ranked, and the result with the highest similarity value is con-
sidered as match. This approach works as a closed-test, i. e. the individ-
ual has to be known to be inside the face database.
The watch list(”Are you looking for me?”): In this case, the system matches
the current face against each image in the database, and raises an alarm,
if the similarity value is over a certain threshold. This scenario is called
an open-universe test, and can be used for example in law enforcement to
spot criminals, which are on a list of wanted persons.
The hamster recognition system will be developed as an identification sys-
tem, in which each object in the camera’s field of view will be tested on the
hamster database, in order to recognise the hamster.

2.4.1 Face recognition processing flow

The general design of a face recognition system is shown in Figure 2.4. At

first, the face image has to be captured by a camera, either as a video
stream or a still image. A face detection algorithm tracks the position of all
faces inside the image and segments the image into face objects and back-
15
see Zhou et al. (2005)
16
see p.2-4 Lu (2003)

18
2.4 Face recognition

Figure 2.4: General structure of a face recognition system20

ground. For this purpose skin color filtering,17 kernel based approaches
like the support vector machine18 or boosted classifiers based on haar-
like features19 can be used. The next module in the processing flow is
a face alignment module which normalises the captured faces. One possi-
ble approach is the detection of salient features of the faces, for example
the position of the eyes. The eyes’ position can be used to determine the
rotation and scaling of the face in the captured image, in order to trans-
form the face image into a standard representation. This representation
is necessary to ensure the correct matching of features, especially if sta-
tistical pattern recognition methods are used to describe the face image.
The normalised face image, which is often called probe image, is fed into
an feature extraction algorithm to find a group of features which describe
the individual face image. In the last step these features are matched to
stored features from a database. The result of this feature matching pro-
cess is a similarity value which denotes the similarity of the probe image
and a stored gallery image. An application specific threshold can be used
to decide if a face is considered as recognised or not.

2.4.2 Face recognition algorithms

Statistical pattern recognition is an research area which covers the defini-

tion, learning, feature extraction and matching of pattern classifiers with
17
see Zarit et al. (1999)
18
see Osuna et al. (1997)
19
see Viola and Jones (2002)
20
from Jain and Li (2004), p. 3

19
Chapter 2 Theory and Background

Figure 2.5: Pattern classifier22

the goal to classify patterns into a set of classes.21 Figure 2.5 shows the
general processing flow of pattern recognition. The data from a sensor, for
example a camera, is called representation pattern. A feature selector is
used to produce a feature pattern, by selecting salient features out of the
input data. Based on the feature pattern a classifier can decide to which
class the input data belongs. In a face recognition context the classes are
the individuals which have to be distinguished by their faces. The feature
pattern is a set of measurements on features of the object representation
which is stored in a feature vector. Which features are measured is de-
pending on the application and the input data, for example formants -
peaks in the frequency domain - can be used for speech recognition.
A simple approach to generate a feature vector for image recognition is
to build a vector which includes the intensity value of each pixel in the
image. In this high-dimensional image space each image is described
by one point. The problem of this approach is the high-dimensionality
of the image space, which leads to the "curse of dimensionality".23 The
feature vector of an image with 64x64 pixels and 256 possible gray val-
ues already has the dimension 2564096 , which makes matching of these
feature vectors computationally expensive and increases the number of
samples which are needed to train a recognition system. Therefore it is
necessary to find possibilities to reduce the dimensionality of the feature
space. In the case of face recognition this reduction is theoretically fea-
sible, because only face images have to be compared and all faces share
geometric properties, like a rough symmetry, and consist of the same ba-

21
see p.2 Webb (2002)
22
from Webb (2002), p. 3
23
see p.141 Shaknarovich and Moghaddam (2004)

20
2.4 Face recognition

sic elements like two eyes, a nose, and a mouth.24 Thus the "intrinsic
dimensionality"25 of face images is much lower than the dimensionality of
general images of this size, and the feature vectors can be transformed
from the high-dimensional image space to a face space which has a much
lower dimensionality. Apart from face recognition this face space can be
also used for face detection. Images which lie inside the face space can be
considered as face images while outliers are detected as non-face images.
To reduce the high-dimensional image space to a subspace with a smaller
dimension, several statistical methods can be used.

2.4.2.1 Principal components analysis

The principal components analysis (PCA) is a statistical method which

can be used to extract the most significant components out of a statistical
dataset, the so-called principal components. The first principal component
is defined as the "linear combination of the original dimensions that has
the maximum variance".26 Starting from the first principal component
the principal components are ordered in terms of maximal variance, and
the principal component n has to be orthogonal to the component n −
1. In order to reduce the feature data with PCA, the feature vectors are
projected into a new k-dimensional face space Φk which is defined by the
principal components [φ1 ...φk ]. Figure 2.6 shows the principle in a two-
dimensional case. The factor k defines the dimensionality of the face space
and thereby the accuracy how well the original face can be reconstructed
from the feature vector in face space. In an ideal case, the dimension of
the face space matches the intrinsic dimensionality of the face images, so
the feature vectors in face space represent the faces well and additional
components of the images, like gaussian noise, are rejected.
PCA was first used for face recognition by Turk and Pentland, who pro-
posed the "eigenface method"27 in 1991. In the eigenface method, the
mean face is subtracted from a face before it is projected into the face
24
see p.142 Shaknarovich and Moghaddam (2004)
25
see p.3 Webb (2002)
26
see p.142 Shaknarovich and Moghaddam (2004)
27
see Turk and Pentland (1991)

21
Chapter 2 Theory and Background

Figure 2.6: Principal components analysis:28 - x1 and x2 denote the origi-

nal basis, φ1 and φ2 are the first principal components

space. The vectors of multiple images of each subject are averaged and
stored to represent the individuals in the database. The euclidean dis-
tance of a new image vector to the stored vectors is taken as a measure-
ment for the similarity of the new face and the faces in the database.
Despite its success for face recognition, the eigenface PCA approach has
some major drawbacks. It does not model different variance classes but
maximises the overall variance of the data. Therefore it cannot deal well
with variance which stems from image transformations like scaling, shift
or rotation,29 as well as variance through lighting or facial expression.
Furthermore it has to be noted, that PCA itself just orders data based on
variability, and helps to reduce the dimensionality of the data by retaining
the most variant dimensions on the data. However, a high variance does
not automatically mean that these features are well-suited for discrimina-
tion between classes and therefore perform well in a pattern classification
task.30

28
from Shaknarovich and Moghaddam (2004), p. 144
29
see p.242 Howell (1999)
30
see p.713 Davies (2005)

22
2.4 Face recognition

Figure 2.7: Comparison of PCA and FLD in a two-dimensional case32

2.4.2.2 Linear Discriminant Analysis

Linear discriminant analysis is a statistical method which can be used to

find a subspace of the original image space, which discriminates optimally
between classes. It is based on the Fisher linear discriminant (FLD), pro-
posed by R.A. Fisher in 1936 to solve the discrimination of populations in
taxonomic problems.31 He proposed a criterion to measure the linear sep-
arability of classes based on the ratio of the variance between two classes,
the between scatter matrix Sb , and the variance inside one class, the within
scatter Sw . The higher this ratio is, the better two classes are linear sep-
arable. As the goal for face recognition is good discrimination between
classes, e.g. individual humans, the fisher criteria has to be maximised.

31
see Fisher (1936)
32
from Belhumeur et al. (1997), p. 714

23
Chapter 2 Theory and Background

Therefore the subspace Φ has to be found which maximises the ratio:33

|ΦT Sb Φ|
(2.1)
|ΦT Sw Φ|

The "fisherface" algorithm34 was the first algorithm which used the linear
discriminant based on the fisher criteria in face recognition. Because of
the high computational complexity of Sw and Sb in the original image space
the fisherface algorithm uses PCA to reduce the dimensionality and calcu-
lates equation 2.1 from the vectors in the PCA subspace. In empirical
tests, which included extreme illumination changes, fisherfaces showed
significant better results than the eigenface method.35 This benefit can be
explained from the structural difference between LDA and PCA: PCA max-
imises the global variance of the data while LDA maximises the variance
in-between the classes. Therefore it can separate better between classes,
if they are linear separable. Figure 2.7 shows how classes, which lie near
a linear subspace, can be separated using the fisher linear discriminant
(FLD).

2.4.2.3 Bayesian (probabilistic) classifier

Bayesian classifiers use the knowledge of the probability of a pattern be-

longing to a certain class, the so-called probability density function,36 to
reduce errors in classification. They are based on the Bayes theorem37
(see Equation 2.2),38 which defines the correlation between the probability
of events:

P (B|A) P (A)
P (A|B) = (2.2)
P (B)
33
see p.147 Shaknarovich and Moghaddam (2004)
34
see Belhumeur et al. (1997)
35
see p.717 Belhumeur et al. (1997)
36
see p.6 Webb (2002)
37
see Bayes (1763)
38
see p.453 Webb (2002)

24
2.4 Face recognition

P(A) is the "a priori" probability of A, without considering any information

about B. P(A|B) is the conditional probability of A, the so-called "a poste-
riori" probability, which depends on the value of B. The bayes rule can be
used to construct minimum-error or minimum-risk classifiers. The min-
imum error classifiers minimise the likelihood of an classification error.
Minimum risk classifiers reduce the risk or the costs which are connected
to certain classification errors. Therefore cost functions can be used, to
model the different costs which can be triggered by a misclassification.
Moghaddam et al.39 proposed to use a bayesian classifier for face recog-
nition. They define two classes of variation of face images, intrapersonal
and extrapersonal variations. Intrapersonal variations are all variations be-
tween two images of the same subject, which can stem for instance from
differences in lighting conditions, pose or facial expression. The second
class of variations are the extrapersonal variations which stem from the
different appearance of different individuals.
Moghaddam et al. use difference images between the probe image and a
gallery image for classification. They assume that all difference images are
part of two distinct Gaussian distributions in the image space of possible
difference images - one distribution for the intrapersonal, one for the ex-
trapersonal variations. As the distribution functions are not known, the
algorithm has to estimate the distribution of the difference images in the
training stage, and build two subspaces, for intrapersonal and extraper-
sonal difference images, which is done by Principal Components Analysis
(PCA). P (ΩI |∆), the a posteriori probability that the difference image ∆ is
part of the subspace of intrapersonal variations, is taken as the similarity
measure.

39
see Moghaddam et al. (1996)

25
Chapter 3

The Hamsters

This chapter describes the animals which are used as pets and test sub-
jects for the Metazoa Ludens game. They are biologically classified, de-
scribed in their natural habitat and behaviour, in order to discuss to what
extend the Metazoa Ludens framework can be beneficial for them. Further-
more the different breeds are described, with the goal to find features of
the breeds as well as features of individual hamsters, which can be used
for the classification of the breed, and for the recognition of individual
hamsters, respectively.

3.1 Biological classification

Cricetines, or hamsters, are a small subfamily of terrestrial cricetid ro-
dents. The hamsters can be divided in 7 genera and 18 different hamster
species.1 For Metazoa Ludens, hamsters from two different species have
been used for testing. Three roborovski hamsters (Phodopus roborovskii),
and two golden hamsters (Mesocricetus auratus). This test setup, which
is described in Chapter 5, is being used to find out, if the software is able
to distinguish between individual animals or between animals of different
species of the criecetine subfamily. Furthermore the use of Russian Dwarf
Hamsters (Phodopus sungorus) had been investigated, but no tests have
been carried out, as it is not possible to cage Russian dwarf hamsters and
Roborovski hamsters together. All hamsters at the lab are females, in or-
der to avoid offspring as well as fighting, which can occur if multiple male
1
see Musser and Carleton (2005)

27
Chapter 3 The Hamsters

hamsters are caged together.

3.2 History

In April 1930, professor Aronin, who currently studied and collected na-
tive animals in the dusty field near Aleppo in Syrian Desert (situated in
the Middle East to the north of Israel)2 , found a mother with twelve young
and decided to take them back to the Hebrew University in Jerusalem.
This is the first time that hamsters were kept in captivity. As the scien-
tists always look for new and more useful laboratory animals, hamsters,
which eat well, grow fast, become tame, breed rapidly and remain healthy
when they are caged, caught their attention. Pairs of hamsters were sent
to England, then France and in 1938, they were shipped to the United
States of America, for the first time. They found their way to spread all
over the world. Hamsters have become numerous and popular in schools,
museums, pet stores, laboratories and homes just in the short space of
twenty years.3

3.3 Natural Habitat

In the wild, hamsters live underground and search for food, their natural
habitat is the desert. Naturally hamsters are animals of fields, meadows
and open places. Living in rock piles and fence rows, they burrow down in
tunnels which are two to ten feet long. Females will build a nest and raise
her young.
Hamsters usually emerge under cover of darkness to find food. They rob
grain of the farmer, sometimes the nests of ground birds and eat eggs or
the young, they even eat small lizards, insects or worms. Hamsters stuff
the food they found into a pair of cheek pouches. When their pouches are
full, they hurry back to their burrows where they store food. This habit has
given them their name from the German word “hamstern”, meaning hoard
2
see Pet (2005b)
3
see Zim and Wartik (1951)

28
3.4 Hamsters as pet

or store. People have found up to 45 kilograms of grain in the burrows

of some European hamsters. That’s why farmers dislike wild hamsters
heartily.4

3.4 Hamsters as pet

Among 18 species of hamsters, those 5 species which are listed below are
widely kept as pets nowadays:5

• Syrian hamster/Golden hamster - Mesocricetus auratus

• Dwarf Campbells Russian hamster - Phodopus campbelli

• Dwarf winter white Russian hamster - Phodopus sungorus

• Chinese hamster - Cricetulus griseus

• Roborovski hamster - Phodopus roborovskii

All the hamsters found in schools, laboratories and homes now are off-
spring of the ones collected in Syria in 1930. Hamsters are usually very
tame, eat well, grow fast and remain healthy when they are caged. A full-
grown hamster is only 10 cm to 12 cm long in case of a golden hamster,6
dwarf hamsters are even smaller - so they need less room than any other
pets. Hamsters are clean, easy to house, simple to feed and interesting to
watch as they grow up, play in their cage and raise their families. These
characteristics have made them popular as pets.

3.5 Hamster species

3.5.1 Roborovski hamster (Phodopus roborovskii)

Having a body length of only 4-5cm, the Roborovski is the smallest among
the hamster species. Its white eyebrows give it a very distinctinve feature.
4
see Zim and Wartik (1951)
5
see Pet (2005b)
6
see Zim and Wartik (1951)

29
Chapter 3 The Hamsters

Figure 3.1: Roborovski hamster (Phodopus roborovskii)

The natural color for Roborovski is sandy-gold with an ivory belly, black
eyes and gray ears (see Figure 3.1). The “White Face Roborovski” has a
distinguishing white face.7
In comparison to other hamsters, Roborovski hamsters are very active and
fast runners. Therefore they are not a suitable pet for children, and not as
easy to handle as other breeds. On the other hand this attribute qualifies
this species for Metazoa Ludens, as the game is able to satisfy their desire
to be active, and is therefore able to improve the quality of living for these
animals.

3.5.2 Russian Dwarf Hamster (Phodopus sungorus)

The hamster body length varies from 5.3 to 10.2 cm, with an additional
0.7 to 1.1 cm of tail. Its size will usually reach the larger end of that
range. The fur of the upper body is grayish.The grayish body color usually
extends to the upper part of each leg. A dark dorsal stripe runs along the
7
see Pet (2006b)

30
3.5 Hamster species

length of the body. The underside and the sides of the muzzle, upper lips,
lower cheeks, lower flanks, limbs, legs and tails are white. Its tail and feet
are usually covered by its fur.
The hamster is most active in the evening, with some activity continued
throughout the night. It can also be quite alert during the daytime. It
appears to be docile as it is very nearsighted. It can appear calmer if
accustomed to a familiar voice during handling time.8 Like all dwarf ham-
sters, the russian dwarf hamster is a very fast runner, which would make
it an ideal choice for the Metazoa Ludens gaming system. Unfortunately it
is not possible to cage russian dwarf hamsters and roborovski hamsters
together. Therefore the decision to test only roborovski hamsters has been
made.

3.5.3 Golden hamsters (Mesocricetus auratus)

Compared to the hamsters discussed above, which are dwarf hamsters,

the golden hamsters are significantly bigger and heavier. Their body length
varies from 8 to 12 cm, and the adult animal has a mass between 100 g to
125 g.9 Therefore the size of the hamster could be a criteria to distinguish
at least a golden hamster from one of the dwarf hamsters. The fur colour
of the golden hamster is golden-brown on the top, their belly is of gray or
white color (see Figure 3.2).
One problem which arose when handling the golden hamsters for Metazoa
Ludens was the fact, that golden hamsters are solitary and extremely terri-
torial.10 They will only accept another hamster inside their territory when
they are mating, and attack any other hamsters until the intruder or the
defender is dead.11 Therefore golden hamsters cannot be kept in the same
cage with other breeds. To avoid fighting, they should be kept alone once
they are mature (around 7 weeks and above).
The only way to keep more than one golden hamster in captivity is to keep
individuals of the same family, for example sisters or brothers. Therefore
8
see McGuire (1993)
9
see Champagne and Lundrigan (2006)
10
see p.304 Burnie (2005)
11
see: Dieterlen (1959), Festing (1986), Johnston (1985), Murphy (1985)

31
Chapter 3 The Hamsters

Figure 3.2: Golden hamster (Mesocricetus auratus)

the initial plan to keep dwarf hamsters and golden hamsters in one cage
was abandoned, in order to avoid a deadly carnage. In the wild golden
hamsters occupy a large area, and try to keep a wide distance between
their burrows, which is usually above 100 m.12 Having said that, wild
golden hamsters are used to run a lot, a possibility which they do not
have when they are kept in captivity, unless they have a running wheel or
a gaming system like Metazoa Ludens.

3.6 Natural behaviour

The most important natural behavior of hamsters is that they store food
in their cheek pouches and empty the food into a storage pile. Hamster
feed mainly on grain found in nearby field; now and then they eat small
lizards, insects or worms. Their natural enemies - snakes, hawks, owls,
weasels and foxes - are usually bigger than them.13 Thus, they choose to
run and hide rather than fight back. In a first prototype a bait with food
12
see Gattermann et al. (2001)
13
see Zim and Wartik (1951)

32
3.7 Hamsters in Metazoa Ludens

Figure 3.3: Prototypical food bait for Metazoa Ludens

(see Figure 3.3) was used to attract the hamsters, which did not work well,
because it is not a natural behaviour of hamsters to hunt for food.
The recent prototype of the system is equipped with a bait which consists
of a small pipe (see Figure 3.4), where the hamster can crawl into and
hide. This uses the natural behaviour of hamsters to run away and hide
if they are attacked. Although this method needs some training until the
hamsters recognise the bait as attractive to crawl into and start running
after it, the pipe bait is working much better than the food bait.

3.7 Hamsters in Metazoa Ludens

Like all pets, hamsters need exercise and entertainment to maintain their
physical and mental health. Normally, a running wheel is attached in the
cage for them to exercise. As an alternative the Metazoa Ludens system
is tested on its ability to influence not only the well-being of the human
player, but have beneficial effects for the hamsters. Furthermore the mo-
tivation of the hamsters to play Metazoa Ludens was also tested, using

33
Chapter 3 The Hamsters

Figure 3.4: Prototypical pipe bait for Metazoa Ludens

the method of Duncan.14 The results show a significant motivation of the

hamsters to play the game.
For the testing of beneficial effects on the physical well-being of the ham-
sters, the Body Condition Score (BCS) has been used. It is a simple method
to evaluate the physical condition of a small animal, for example a ham-
ster. The results of the first Metazoa Ludens study15 show that hamsters,
which were allowed to play the Metazoa Ludens game, were healthier than
the hamsters of the control group.

3.8 Hamster recognition

3.8.1 Natural identification

Despite having large protruding, round eyes, hamsters are short-sighted.

However, their sense of smell and hearing are very acute. They rely heavily
on these latter senses to recognize each other and their environment.16
14
see KC Tan et al. (2006a)
15
see KC Tan et al. (2006a)
16
see Pet (2005a)

34
3.8 Hamster recognition

The hamsters’ near-sightedness unables them to see objects that are close
in range.17 This would mean that they would often identify objects by
biting them rather than looking at them.18 Nevertheless, with the lateral
position of their eyes, hamsters have a wide angle of vision. They may still
be able to spot movements of objects from a greater perceived distance.
Hamsters are color-blind, being only able to see different shades of black
and white. They could also be nearly blind in bright daylight. It is also
believed that those with red eyes have poorer eye sight than those with
black eyes.19
Unlike their eyes, hamsters’ sense of hearing is very well-developed. They
can hear a wide variety of sounds, including those made in ultrasonic
frequencies. This helps hamsters to communicate with each other without
being heard by others. Hamsters can often be seen to freeze if they hear
unfamiliar sounds or noise, especially loud noises.20
Apart from their good sense of hearing, hamsters are also compensated
with a very good sense of smell. They distinguish one another by their
distinct scents. They can make use of distinct musk-like liquid produced
from scent glands to identify other hamsters and to mark their territory.
This may also enable them to distinguish the sex of another hamster
through smelling.21

3.8.2 Features of the fur

Hamsters can have more than one color on their fur. For the Russian
Dwarf Hamsters species, the normal type is dark gray in color with a
darker gray undercolor. It has a thick jet black dorsal stripe and an almost
white belly. The eyes are black and the ears are gray. The Sapphire type
has a soft purple-gray fur with gray undercolor, a thick gray dorsal stripe
and ivory belly. The eyes are black and the ears are light gray-brown.22
17
see Hamsterhideout (2006)
18
see for the Prevention of Cruelty to animals (2006)
19
see Hamsterhideout (2006)
20
see Hamsterhideout (2006)
21
see Hamsterhideout (2006)
22
see Pet (2006a)

35
Chapter 3 The Hamsters

Roborovski hamster have a top coat color which is dark chestnut or gold
with a slate gray undercolor. The belly and side arches are white. The
hamsters have white eyebrows just above their eyes and a white patch
around the nose. Unlike other dwarf hamsters, they do not have a dorsal
stripe.23
The golden hamster has golden-brown top fur, while the belly is gray
or white. It can be possible to distinguish individual golden hamsters
through dark patches on their forehead and a black stripe on their back.24
Apart from such non-specific statements, there are not many facts about
the probability that two hamsters can be distinguished based on the ap-
pearance of their fur. Therefore the question has to be answered later,
based on the outcome of the experimental analysis in Chapter 5.

23
see Chamberlain. (1992)
24
see Alderton and Tanner (1999)

36
Chapter 4

Implementation

In this chapter the implementation of the combined tracking and recogni-

tion module for Metazoa Ludens is described in detail. At first the develop-
ment environment is described, and the decisions to use the programming
language C++ and the Framework OpenCV are explained. The next part
describes the general architecture of the Metazoa Ludens Software, espe-
cially the Server/Client architecture, the tasks of the client and the server
software as well as the data which is transmitted over the network. The
processing workflow starting from camera input, to preprocessing steps,
up to orientation and normalisation are shown afterwards. Finally the
different classification approaches and corresponding descriptors are dis-
cussed in this chapter.

4.1 Development environment

The software for Metazoa Ludens is written in C++/C, and Microsoft Visual
Studio is being used for software development. The decision to use C++ for
the hamster tracking stems from a couple of reasons. At first, some parts
of the Metazoa Ludens system, for example motor controls, have already
been developed in C++, and the new code should integrate seamlessly into
the old code. Secondly, the whole system is planned to work in realtime
on standard PC hardware, and produce sound and 3D graphics output
via standard libraries - in this case DirectX libraries. The third reason
was the insight, that a framework has to be used for the programming
of the hamster tracking software, in order to avoid the time-consuming

37
Chapter 4 Implementation

reprogramming of low-level functions like grabbing frames from a camera.

This framework has to be fast, to enable the system to complete the com-
plex task of tracking and recognition in real time. Furthermore it has to
include as many of the needed algorithms as possible, and should be well
documented.

After some research on computer vision frameworks, OpenCV has been

chosen. OpenCV is a open source library for computer vision, which is
published by Intel. It includes over 300 functions which can be used for
real-time computer vision, from image functions and data structures up
to object detection and basic recognition functions. Apart from the fact it
is open source and free to use, OpenCV has also a big user community
which is organised in several newsgroups,1 and can be helpful if problems
arise. Therefore the decision to use OpenCV for the implementation of the
hamster tracking/recognition system was made. Another library which is
used for the tracking system is the library cvBlobsLib,2 which is used for
region labelling.

4.2 Software architecture

The hamster tracking/recognition software for Metazoa Ludens is part of

the Metazoa Ludens server application which is developed in C++. The
architecture of the application is shown in Figure 4.1.

The server grabs a frame from one of the cameras which are mounted
above the hamsters’ running area. After the initial preprocessing, the
tracking of the different objects inside the image frame is done. The infor-
mation about position, orientation and identity of the hamsters is sent to
the client via a TCP/IP connection.

1
see OpenCV
2
see: http://opencvlibrary.sourceforge.net/cvBlobsLib

38
4.2 Software architecture

Server Hardware Server TCP/IP connection Client Client Hardware

Cam 1
Camera Hamster ID/Position/Orientation
Tracking Hamster Player
Selector
Cam 2

Arm position
Direct 3D
Game Engine
Display
Accentuator Rendering
Surface data
Accentuators Collision Detect
Controller

Bait Arm Human position

Moving Arm Controller Human Player Keyboard

Figure 4.1: Software Architecture of Metazoa Ludens

On the client side this information is used to create an object for every
hamster player inside the DirectX game engine. The client takes the input
of the human player on the keyboard, to move the human player object in
the virtual gamefield. Furthermore the graphics rendering for the virtual
world, which includes the hamster avatar as well as the human avatar, is
done on the client machine, and the output is displayed on the standard
display. The position of the human player and information about the
virtual surface are sent back via a TCP/IP connection to the server, by
the client software.
The server controls the moving arm with the bait according to the position
of the human player inside the game. The surface of the hamster tank
is controlled by sixteen accentuators, according to the virtual surface in-
formation which has been sent from the client. Thereby the real surface
for the hamster can be warped in realtime, depending on the virtual game
model. For improved flexibility the motor control has been implemented
as a bluetooth interface.

39
Chapter 4 Implementation

The structure of the tracking/recognition system is shown in Figure 4.2.

As stated above, the library OpenCV3 is used for image retrieval from the
camera, preprocessing and tracking. In the following the several modules
of the tracking/recognition system are described.

Filesystem

Input Image retrieval Preprocessing Recognition OUTPUT

Camera Capturing Classifier load

cvCapture loadGallery

Lens distortion Segmentation

cvUndistort thresholdByColor

Region labelling Classification Matching HamsterID

cvBlobsLib calcHisto matchHisto

Orientation/Pos. Position
findOrientation & Orientation

Normalisation Classification Matching HamsterID

normalize calcPCA matchPCA

Figure 4.2: Structure of the tracking/recognition system

4.3 Image retrieval

4.3.1 Image capturing

The video stream is captured by two Dragonfly cameras4 with 30 fps at 640
x 480 pixels. The OpenCV function cvCapture is used for image retrieval,
which allows capturing of frames from a camera as well as frames from a
video file. The two cameras are both mounted above the gaming field, and
show both the whole gaming area. Therefore the image of only one camera
3
see Intel (2006)
4
see http://www.ptgrey.com/products/dragonfly/index.asp

40
4.3 Image retrieval

Figure 4.3: Camera calibration: The left image shows the distorted input
image, the right image shows the undistorted and cropped out-
put image

is needed for the tracking process. However, the moving arm which holds
the bait obstructs some parts of the image. Because of that, two cameras
are used, and the camera selector module selects the unobstructed image
for further processing, based on the known position of the movable arm.
The gaming area is lit using two standard fluorescent tubes to get a nearly
uniform lighting. Due to the short distance of only 60 cm between the
cameras and the floor of the Metazoa Ludens structure, the cameras have
been equipped with a ultra-wideangle lens (2.5mm), in order to capture
the whole gaming area which is 86 x 86 cm wide. Unfortunately this wide-
angle lens produces a distorted image due to lens distortion which has to
be undistorted by the software. Without undistortion the position of the
hamsters would be tracked wrong, and the image would not be a suitable
input image for the recognition module.

4.3.2 Camera calibration

The goal of camera calibration is to bring two coordinate systems into coin-
cidence: the world coordinate system of the objects which are depicted and

41
Chapter 4 Implementation

the camera’s coordinate system.5 Therefore the parameters of the matrix

have to be found, which transforms images from the camera coordinate
system into the world coordinate system. In general, these parameters are
grouped into the intrinsic camera parameters perspective and scaling, and
the extrinsic camera parameters translation and rotation. As the camera
is fixed above the gaming field, pointing downwards and with zero rota-
tion, the extrinsic camera parameters do not have to be measured, but
can be assumed as 1. The intrinsic parameters, most important the ra-
dial perspective distortion, can be measured using the camera calibration
functions in OpenCV. For this purpose a chessboard pattern is captured
by the camera. The camera calibration algorithm finds the corner posi-
tions in the image and calculates the intrinsic parameters which minimise
the backprojection error. The backprojection error is the error between the
known positions of the chessboard corners, and the positions measured
from the camera image. Figure 4.3 shows the effect of a correct camera
calibration. The input image is undistorted using the intrinsic camera pa-
rameters and cropped to remove all parts of the image which do not show
the gaming field. The calculation of the intrinsic camera parameters has
to be done only once when the system is set up.

4.4 Preprocessing

4.4.1 Segmentation

The segmentation of the input image into foreground and background is

done through colour thresholding. The function getBackgroundColor6
queries frames from the camera and gets the background colour without
any object present. Noise suppression by image accumulation7 is used to
suppress effects from image noise. Therefore a temporal mean image of
10 frames is written into a temporary buffer, and the mean colour value,
as well as the standard deviation of the mean colour, are calculated. The
5
see p.604 Davies (2005)
6
The source code of the implementation can be found on the CD-ROM which is supplied
with this thesis
7
see p.37 Davies (2005)

42
4.4 Preprocessing

Figure 4.4: Segmentation: The left image shows the input image from the
camera, the right image shows the resulting mask (Foreground
is black, Background is white)

standard deviation is used to control the bandwidth of pixel values which

are filtered as background colour. If the background colour is less uni-
form, the standard deviation will rise accordingly and the thresholding
operation will filter out more pixels. The function thresholdByColor is
then used to compute a mask image from the input frame. In this mask
image background pixels, i.e. pixels which are inside the bandwidth of
the background colour, are marked with 1 (white) while foreground pixels,
which are not inside the background colour bandwidth, are marked with
0 (black). Figure 4.4 shows the effect of thresholdByColor.

This segmentation approach works as long as the lighting is not changing

while the game is running and the RGB values of the objects are signif-
icantly different from the background. In the case of Metazoa Ludens
these assumptions are true, as the background is a black fabric, which
does not reflect the incoming light well, while the foreground objects, i.e.
the hamsters, reflect the light better and have significantly higher RGB
values. Lighting changes due to different room lighting and ageing effects
of the fluorescent tubes do not affect the segmentation, as the calibration
of the system is automatically done every time the software is started.

43
Chapter 4 Implementation

Figure 4.5: Region labelling: The left image shows the input image from
the camera, the right image shows the connected components
in different colours

4.4.2 Region labelling

Region labelling is done with use of the library cvBlobsLib. The library im-
plements region labelling for connected components in binary images, as
well as filtering of the connected components, for example by size. The con-
nected components, called blobs, are saved in a run-length representation,
which allows fast computation of features like the size of the blob, the coor-
dinates of the bounding box and the ellipse, which approximates the blob.
Furthermore the calculation of moments is implemented in cvBlobsLib.
As input for cvBlobsLib the binary mask output of the segmentation pro-
cess is used. The regions are labelled and then filtered by size to suppress
image noise which is represented by small, non-connected regions. The
output of the region labelling process is an indexed image, i.e. an array
of numbers, where every pixel of the input image is represented by an in-
dex number, depending on the connected component to which the pixel
belongs (see Figure 4.5). As the filtered image does only consist of fore-
ground (hamster) and background pixels, there is one large background
blob (denoted as 0 in Figure 4.5), as well as one blob per hamster. In the
following, all image operations are done for every foreground blob individ-
ually, while all background pixels are omitted.

44
4.4 Preprocessing

Figure 4.6: Geometric parameters of the fitting ellipse10

4.4.3 Orientation

Moments are features of a region which can be normalised to give a size,

translation and rotation invariant descriptor of a region.8 Moments which
are normalised by size and by position of the region are called central
moments, denoted by µp,q , and can be used to find the major and minor
axis of the enclosing ellipse. The geometric parameters of the fitting ellipse
are shown in Figure 4.6. The labels r1 and r2 specify the direction of the
major and minor axis, while the rotation angle of the ellipse is labelled
with θ. The rotation of the fitting ellipse and therefore the rotation angle θ
can be calculated with the following formula:9

1 2µ1,1
θ = − arctan (4.1)
2 µ0,2 − µ2,0
The orientation angle θ is used to normalise the rotation of the input region,
in this case the image of the hamster in a top-view, to a standard rotation.
Problems with the calculation of the rotation angle θ can arise for any
objects for which the major and the minor axis of the ellipse have the
same length, for example circles, as well as squares. This shortcoming is
no problem for blob objects which describe hamsters, because hamsters
8
see p.557 Steger (2006)
9
see p.557 Steger (2006)
10
from Steger (2006), p. 558

45
Chapter 4 Implementation

are in general longish. Another principal shortcoming of the approach is

that θ can only be determined up to modulo π(180°). Steger11 proposes
to use the point which has the largest distance to the center of gravity to
determine if the orientation angle θ or the orientation angle θ + π refers to
the correct orientation. This feature is not yet implemented in the hamster
recognition algorithm, and it is not proven to solve the problem in this
special case.

4.4.4 Normalisation

The function iplRotate is used to normalise the rotation and the position
of the input region into a standard position and rotation. This normalisa-
tion step is important for the application of the face recognition algorithms,
which are alignment based, and therefore need the input images in a nor-
malised position and rotation. Basic descriptors, which are only based on
the number of pixels inside the region as well as histogram descriptors do
not need to have a normalised input. iplRotate calculates the displace-
ment matrix D, based on the rotation angle θ and the position of the centre
of gravity of the region CoGx,y , as well as the normalised target position
Nx,y .
 
cos θ − sin θ xN − xCoG
D =  sin θ cos θ yN − yCoG 
 

0 0 1

In a second step the pixels inside the region are extracted from the in-
put frame with the function extractForeground, and transformed to the
normalised position and rotation by transformation with the displacement
matrix D. Figure 4.7 shows the normalisation of two input regions into two
separate images with a standard position and rotation. These separated
and normalised images are not moving or rotating wherever the hamsters
move inside the gaming area, although they are still subject to non-rigid
transformations, which stem from the hamsters’ way to move.
11
see p.558 Steger (2006)

46
4.5 Classification

Figure 4.7: Normalisation of rotation and position

4.5 Classification

The goal of the hamster recognition algorithm is to find a general way to

distinguish between multiple animals of the same species, by the differ-
ences in the appearance of the hamsters from a top-view. As it could not
be proven in Chapter 3, that these distinctive features do exist, or to what
extend the appearance of two hamsters is different, the question has to
be answered by experiment. Thus a descriptor has to be designed which
is capable to describe the features of the individual animals, to allow dis-
crimination and thus recognition of individual animals. In the following
different classes of descriptors are presented, along with their possible use
for the project.

4.5.1 Binary region descriptors

Binary region descriptors describe the shape of a binary object and ignore
the actual intensity values of the objects’ image. They can therefore be
directly applied on the output image of the region labelling step. One of
the simplest region descriptors is the area of the object, measured by the

47
Chapter 4 Implementation

number of pixels in the region. Further descriptors include the moments,

already mentioned in Section 4.4.3, which allow the calculation of the
minor and major axis of the enclosing ellipse, and therefore give an esti-
mation to the length and breadth of the region. While these descriptors
are extremely useful for numerous machine vision tasks, for example for
quality control applications in manufacturing, they are not sufficient for
this task. As stated in Chapter 3, the size of the hamsters does not vary
significantly between individual animals of one breed. Furthermore the
area, as well as length and breadth, are highly variant through non-rigid
transformations of the moving hamsters. Therefore binary region descrip-
tors like area, length and breadth are no suitable descriptors, which can
be used to distinguish between individual hamsters. In general, binary re-
gion descriptors are more useful to distinguish between classes of objects,
for example between squares and circles, than between individual objects
of one class. As the size of different breeds is different (see Chapter 3), a
binary based descriptor could be useful to determine the breed the indi-
vidual hamster belongs to. This could be the first step of the recognition
process, which would have to be supported by further measurements.

4.5.2 Basic gray value region descriptors

Gray value region descriptors describe the region, based on the intensity
value of the pixels inside the region. Therefore not only the shape of the
region is described but also the different brightness values which consti-
tute the region. The simplest features of a region are the minimum, the
maximum and the mean gray value of the region. The mean gray value is
a measure of the brightness of the region, and can therefore be used to
distinguish between objects which have significantly different gray values.
This distinction approach is quite similar to the thresholding approach,
which has been used for segmentation and included the mean RGB value
as well as the standard deviation (see Section 4.4.1). A basic gray value de-
scriptor does not work as a descriptor for individual hamsters out of two
reasons: At first, the mean gray value of the hamsters’ fur is not differ-
ent enough to allow distinction, and secondly, the mean brightness is too

48
4.5 Classification

Figure 4.8: Histogram descriptor: The chart shows the histogram distribu-
tion of the lightness values of 250 hamster images. The 32
bins of the histogram are shown in different angles, the dis-
tance from the middle shows the amount of values in this bin.
Every line refers to one test image.

variant due to lighting changes through motion. A recognition approach

based on these basic gray value region descriptors would be based on the
idea of invariant properties (see Section 2.3).

4.5.3 Histogram descriptors

Histogram descriptors are based on the distribution of values which are

part of the region. To build a histogram, the range of values is segmented
into a finite number of sub-ranges. Every input value is sampled, and fit
into one of the sub-ranges. The bin which corresponds to the sub-range is
then incremented by one. Therefore a histogram shows the distribution of
values in a range. The resolution of the histogram depends on the number
of bins which are used. Histogram descriptors do not model the spatial
layout of the pixels in the region, but use the frequency of values to de-
scribe the region. Histogram descriptors can be derived from gray values

49
Chapter 4 Implementation

as well as from colour channels (R,G,B) of a colour image. The RGB colour
model describes the colour of a pixel by the intensity of red, green and
blue light. Unfortunately this representation mixes the brightness and the
hue of a colour into the RGB values. Therefore it can be helpful to convert
the image into another colour space, for example HSL, which describes
a pixel by its hue, saturation and luminance value, if colour information
should be used as a descriptor.12 Furthermore, the RGB values are more
sensitive to image noise than transformed representations. In the HSL
model hue corresponds to the dominant wavelength of the colour, satura-
tion is the relative purity of the colour, in other words the amount of white
which is added to the colour, while luminance refers to the amount of light.
Another possible colour model is HSV, hue, saturation and value, which
is similar to HSL apart from one difference. While HSL uses a gray point
in between black and white as reference, HSV uses the white point (illu-
minant) as reference. Thus the lightness value in HSL spans from black
through hue to white, while the V part of the HSV model is defined as
the range in between black and the hue component. First tests using a
histogram representation with 32 bins of the luminance part of the HSL
representation of the images showed significant differences between the
two hamsters (see Figure 4.8).
This effect is mainly due to the fact, that one of the hamsters has more
bright fur than the other one. The different shapes of the histogram dis-
tribution indicate the differences between the hamsters, which are used
for distinction with the histogram descriptor. Figure 4.9 shows two ham-
ster images and the corresponding histograms. The histogram descriptor
is rotation invariant by definition, which means that it does not need the
normalisation steps described in Section 4.4.4. Furthermore it is very
robust against the non-rigid transformations through the movement of
the hamsters, as the effect of these non-rigid transformations is mainly a
change of the spatial layout, whereas the absolute number of pixels of one
brightness value is relatively stable. As the histogram does not model the
spatial layout, but only the frequency of specific values, the recognition

12
see p.506 Iglesias et al. (2006)

50
4.5 Classification

Figure 4.9: Histogram descriptor: The image shows the colour images of
two hamsters as well as the corresponding histograms of the
luminance value in HSL colour space

performance is not affected.

For the training of the histogram classifier only one hamster is put into
the gaming field. The histogram is computed for every frame and accu-
mulated to a mean histogram, which describes this specific hamster. This
training step is done for every hamster, to get a gallery histogram for every
hamster. This gallery histogram is stored as a file to allow the comparison
of the gallery histograms and the current histogram for recognition. In the
recognition process the histogram of the region is computed, the probe
histogram, and correlated to each gallery histogram in the database. The
descriptor with the highest correlation value is considered as match. Fur-
ther testing of this descriptor as well as testing of descriptors derived from
face recognition methods will be described in Chapter 5.

4.5.4 Face recognition methods

The alignment based face recognition methods, which have been described
in Section 2.4.2, are standard algorithms in face recognition and have
been already implemented in several face recognition libraries. The open

51
Chapter 4 Implementation

source CSU Face Identification Evaluation System13 was created by the

Computer Science Faculty of the Colorado State University to allow test-
ing and evaluation of new face recognition algorithms in comparison to the
standard solutions. It includes an ANSI C implementation of a PCA (eigen-
faces) algorithm, a combined PCA+LDA (fisherfaces) algorithm as well as
an algorithm based on a bayesian classifier. Furthermore a preprocessing
tool and statistical tools for performance evaluation of the different algo-
rithms are included in the system. The input of the CSU system are 8-bit
gray-level image files, in the file format PGM, which have to be normalised
in position and rotation. Furthermore the system needs to have image
lists as ASCII text files, which include the file names of all image files, sep-
arated by the different human subjects. In order to use the CSU system
for hamster recognition an output module has been implemented which
writes the normalised hamster images to the harddisk, and creates the
necessary image lists. Apart from normalising the position and rotation of
the images, the background separation has to be done as well. Therefore
the space around the hamster object is filled with a uniform gray value,
which is then considered as background by the CSU system. It is planned
to integrate one of the face recognition algorithms into the hamster recog-
nition system if the tests, which are described in Chapter 5, show a good
performance of one of the face recognition algorithms, for the recognition
of individual hamsters.

13
see Bolme et al. (2003)

52
Chapter 5

Testing

This chapter describes the composition of the test set, as well as the ex-
perimental setup which has been used to conduct the software tests. Fur-
thermore the choice of one specific histogram classifier is explained on
the basis of a comparative analysis of the discrimination performance of
seven different histogram classfiers. A comparative test of three differ-
ent distance metrics and their performance is being performed, and ROC
curves are being used to evaluate the quality of classification of the his-
togram classifiers. The final part of this chapter describes the results of
tests with several face recognition algorithms, and compares them to the
histogram classifiers.

SubjectID FamilyID Breed

1 1 Roborovski 2
4
2 2 Roborovski 1
3 5
3 2 Roborovski
4 3 Golden hamster
5 3 Golden hamster

Figure 5.1: Test set

Figure 5.2: Distribution of
subsets1

1
Yellow circles denote the family groups, the green circles denote the different breeds

53
Chapter 5 Testing

5.1 Experimental setup

5.1.1 Test set

The test set consists of five individual hamsters, which are of two different
breeds and of three different families. In the beginning a test set with six
hamsters was planned, but the abrupt exitus of one roborovski hamster
(sister of subject 1), prohibited the realisation of this plan. The biological
classification of the hamsters is done in Chapter 3 (see Section 3.1). The
test subjects are numbered from 1 to 5 and held in different cages to allow
multiple test series. The Figures 5.1 and 5.2 show the allocation of the
hamsters to the subsets of family and breed.

5.1.2 Test setup

Instead of doing a real-time test, the hamsters are recorded while they
are running around in the playing field, and the test videos are analysed
later. This allows to test and compare different classifiers and algorithms,
and restrict the impact of possible side effects, e.g. changes in the lighting
situation. The tests are conducted with one hamster present at the playing
field at one time, in order to have a stable ground truth. Each hamster
is recorded for one minute for training and for one minute for testing, at
30 fps per second and a video size of 640x480 pixels. The videos are
edited in order to show two seconds of the background in the beginning,
without a hamster present, and 58 seconds of hamster and background.
The background footage is used to train the segmentation module (see
Section 4.4.1). In order to avoid side effects of compression algorithms, all
videos are recorded and edited as uncompressed footage.

5.2 Classification by histograms

5.2.1 Discriminant analysis of histogram classifiers

The first classifiers which are tested are the histogram classifiers, which
have been specified in Section 4.5.3. In the following, the performance of

54
5.2 Classification by histograms

multiple histogram based classifiers is analysed. These classifiers are all

based on a histogram with 32 bins, which represents the frequency of cer-
tain pixel values in the image of a hamster. The classifier has to represent
the image of the hamster, and should be robust against external variations
like changes in the lighting situation, as far as possible. Furthermore it
has to be specific enough to allow discrimination of the individual animals
even if the move, which changes their appearance through non-rigid de-
formations.
As stated in Section 4.5.3, it is not reasonable to use the RGB-values
from the camera directly, but to convert this information to another colour
space. Hence three different classes of possible histogram classifiers are
tested, based on gray values, HSL colour space and HSV colour space. The
gray values are calculated as the mean value of the RGB values, which is a
lossy transformation from three RGB channels to one gray channel. In the
HSL class, the histograms of the three components are tested as separate
classifiers as well as the three classifiers which stem from the components
of the HSV colour space.
The image data of the first test series is used to build one comma-separated
raw data file for each classifier, which includes all histograms of the five
hamsters. Each line of this file represents one frame, and includes the 32
values of the histogram as well as the hamsterID, which is derived from
the ground truth. This data is imported into the statistical software pack-
age SPSS2 to conduct a discriminant analysis. The goal of the analysis is,
to find out, to what extend the classifier in question is capable to discrimi-
nate between the five individual hamsters. For that purpose, the so-called
canonical discriminant functions have to be calculated, which maximise
the discriminance criteria Γ:3

SSb
Γ= (5.1)
SSw

SSb is the scatter matrix in-between the classification groups, in this case
2
see SPSS Inc. (2007)
3
see p.165 Backhaus et al. (2003)

55
Chapter 5 Testing

Figure 5.3: Canonical Discriminant Functions

5.2 Classification by histograms

Figure 5.4: Canonical Discriminant Functions (GRAY)

the individual hamsters, while SSw is the scatter matrix inside the groups.
The number of discriminance functions depends on the number of classes,
for n classes n − 1 discriminance functions can be calculated. In general,
the first two discriminance functions are most important for the perfor-
mance of the class seperation.4 The coefficients of the first and second
discriminance function can be drawn into a diagram, and give a good
overview of the distribution of the test data in the discrimination space.
These diagrams are shown in Figure 5.3 and Figure 5.4, and allow a visual
comparison of the seven different histogram classifiers, which is done in
the following.
The first insight from the analysis of the diagrams is the fact, that all classi-
fiers are not able to discriminate well between subjects 4 and 5. The group
centroids (marked with a blue box) of these two subjects are close to each
other, and the area of distribution nearly lies on top of each other. There-

4
see p.179 Backhaus et al. (2003)

57
Chapter 5 Testing

fore a discrimination of these two subjects does not seem to be possible,

although it should be easily possible to discriminate these two subjects
from other subjects. In fact, subject 4 and 5 both belong to the same fam-
ily and breed (see Figure 5.2), which explains why it is difficult to discrim-
inate between them, but easy to distinguish them from the other subjects.
Even for a human it is very difficult, if not even impossible, to differentiate
between them under the given circumstances, because their appearance
is just too similar (see Figure 5.5). The second question is related to the
subjects 1, 2 and 3 which are all of one breed, but of two different families.
From a biological point of view it should be easier to distinguish subject
1 from subjects 2 and 3 than to discriminate subjects 2 and 3, as 2 and
3 are siblings, and the difference between siblings should be smaller than
the differences between subjects of different families. However, for the
HSV-H and HLS-L classifiers, the distance between the group centres of
subject 1 and 3 is smaller than the distance between the centres of 2 and
3. Therefore the subjects 2 and 3, which are both of one family, should be
easier to distinguish then subject 1 and 3, which are of two families.

Figure 5.5: Subjects 4 (left) and 5 (right) as depicted on test video

As stated in Section 4.5.3, the colour models HLS and HSV are similar,
and therefore the diagrams of the hue based discriminant functions HSV-
H and HLS-H appear almost identical. The differences can be explained by
rounding errors at the transformation of the colour space. The same point
applies to the comparison of HSV-V and HLS-L. In contrast, the diagrams
for the saturation based HSV-S and HLS-S classifiers show a significant
difference, which is due to the different definition of saturation in the two

58
5.2 Classification by histograms

colour spaces (see Section 4.5.3). Altogether this leads to a comparison of

five different classifier classes:

• Gray value histogram

• Hue (HSV-H and HLS-H)

• Value/lightness (HSV-V and HLS-L)

• HSV-S saturation

• HLS-S saturation

Apart from the analysis of the graphical representation, a classification

test based on the discriminant functions can serve as a metric compar-
ison value. In this case the discriminant functions are used to classify
each case in the data set, i.e. every frame of the input video, and com-
pare this classification result with the ground truth. From this data the
percentage of cases can be calculated, for which the discriminant function
predicted the correct value of the group variable, in this case the ham-
sterID. A comparison of these performance values is given in Table 5.1.

Classifier Correct classification Error rate

GRAY 74.40% 25.60%
HSV-H 86.90% 13.10%
HLS-H 86.70% 13.30%
HSV-V 73.60% 26.40%
HLS-L 74.90% 25.10%
HSV-S 86.10% 13.90%
HLS-S 83.60% 16.40%

Table 5.1: Classification performance of seven histogram classifiers5

It has to be noted that these classification results are based on two as-
sumptions: The first assumption is, that the test set only includes mem-
bers of the group of known hamsters, whereby it is not necessary to dis-
tinguish the known hamster objects from other objects. The second point
5
The complete classification results are listed in Appendix A

59
Chapter 5 Testing

is, that the function does not use any information about the quality of the
classification, but will always return the hamsterID of the closest match,
even though the gallery histograms may be too close together for a reliable
classification. This problem will be addressed later (see Section 5.2.4).
The histogram classifier, which is based on gray values, needs less pro-
cessing time than the other classifiers, due to the easy transformation
from RGB to gray values. It has a poor performance compared to the hue
based histograms, but is good enough for the differentiation between the
two breeds, and shows nearly no misclassification for this task. However,
it is not able to distinguish between the individual hamsters of one breed,
and therefore has a mean classification rate of 74.40 %. As expected from
the diagrams, the "correct classification" values of the HSV-H and the HLS-
H classifier are very similar, as well as the results for HSV-V and HLS-L
classifiers. The performance difference of 2.5 % in this test between HSV-
S and HLS-S stems mainly from the fact, that the HLS-S classifier has a
poorer performance for the classification of subjects 2 and 3. In general, it
can be noted, that all classifiers have big problems to distinguish between
subject 4 and subject 5. Although nearly no misclassification between the
different breeds (subject 4 and 5 vs. 1,2,3) happen, the bad performance
inside the golden hamster breed has a substantial negative effect on the
overall classification performance of all classifiers. Due to the outstanding
classification performance, compared to the other classifiers, a hue based
classifier is chosen for further testing. These classifiers show a slightly bet-
ter performance than the saturation based classifiers, but a much better
performance than the value, lightness or gray-value based classifiers.

5.2.2 Comparative evaluation of distance metrics

Apart from the correct colour model, which has to be used to build a
well-working histogram classifier, the distance metric is important for the
classification performance as well. It defines how the correlation between
two histograms is calculated, which is important for the matching of the
probe histogram in question, with the gallery histograms. Especially if the
class distributions are close together, like subjects 4 and 5, small differ-

60
5.2 Classification by histograms

ences in the calculation of the similarity value can affect the classification
performance notably. Three different distance metrics are tested with a
hue based histogram classifier:

• Correlation

• Chi-Square

• Intersect

The calculation of the distance between two histograms is defined inside

the OpenCV function cvCompareHist, which returns a distance metric
value. In the case of Correlation and Intersect a high value describes a
high similarity of the test histograms, while in case of Chi-Sqr lower values
denote higher similarity. The similarity of two histograms is defined in
case of the correlation method by the following equation:6

H 0 (I) ∗ H20 (I)

P
d(H1 , H2 ) = p P I 0 1 (5.2)
( I [H1 (I)2 ] ∗ I [H20 (I)2 ]
P

where N is the number of histogram bins and Hk0 (I) can be computed as
follows:
X
Hk0 (I) = Hk (I) − 1/N ∗ Hk (J) (5.3)
J

The Chi-Square method defines the similarity of the two histograms by the
ratio of the difference to the sum of the values of the two histogram bins.
This ratio is summed up to find the similarity value for the histograms,
which becomes 0 if the histograms are identical.

X H1 (I) − H2 (I)
d(H1 , H2 ) = (5.4)
I H1 (I) + H2 (I)

The last method is called Intersect and needs least processing time of the
three formulas. It intersects the two histograms by finding the minimum
of every bin in the histograms, and summing up these minima:
X
d(H1 , H2 ) = min(H1 (I), H2 (I)) (5.5)
I
6
see Intel (2007)

61
Chapter 5 Testing

Based on the HSV-H classifier, a test including all three distance metrics,
has been conducted. The comparison of the ground truth and the clas-
sifiers’ guess provides the correct classification rate and the error rate,
respectively. The results are listed in Table 5.2:

Distance Metric Correct Error Rate

Correl 80.26 % 19.74 %
Chi-Square 83.99 % 16.01 %
Intersect 63.11 % 36.89 %

Table 5.2: Classification and error rates for different distance metrics7

Figure 5.6: HSV-H Intersect result

It is obvious from the data that the Intersect method has the worst perfor-
mance overall, and therefore should not be chosen for the pet recognition
application. In contrast to the other distance metrics, it does not only have
problems to differentiate between subjects 4 and 5, but also between sub-
jects 1 and 3. Figure 5.6 shows the perfect performance of the classifier
for subject 1 - it is classified correctly in all cases. Unfortunately subject
3 is never classified correctly, most of the time it is classified as subject 1.
7
The complete results are listed in Appendix B

62
5.2 Classification by histograms

The discrimination between the different breeds is the only task, which is
done well by this combination of HSV-H classifier and intersect distance
metric, but this task can be easily done as well by a more simple classifier,
like the classifier based on gray values.

Figure 5.7: HSV-H Correl Test 1 Figure 5.8: HSV-H Chi-Sqr Test 1

The Chi-Square and Correl distance metric generate much better classifi-
cation results of around 80% correct classification rate. The differences
between the results are not too big, Chi-Square seems to be more exact
for all subjects and has therefore a better overall performance than Correl.
Both methods deliver very good results for the discrimination inside the
roborovski breed (subjects 1,2,3), even between the members of one family.
There are nearly no cross-breed errors, which leads to a cross-breed error
rate of approximately 1% for the Chi-Square method. The result is even
better if the data of the subjects 4 and 5 is omitted, as these cannot be
distinguished by any of the algorithms, and they are responsible for a big
part of the classification errors. In this case the correct recognition rate,
to identify the correct one out of three roborovski hamsters, rises above
90%. As a result of this comparison, the intersect method is ignored in the
following, while the performance of the other two methods, under varying
external conditions, is investigated in the next section.

63
Chapter 5 Testing

5.2.3 Classifier robustness against external variance

In an ideal case, a classifier has to represent each subject no matter how

fast it moves, where it is positioned, how the lighting is levelled or how
far it is away from the camera. In reality, all these variables influence the
appearance of the subject for the camera and therefore may influence the
relation between classifier and subject, which may lead to a worse clas-
sification performance. The variable movement, and the consequential
non-rigid deformation, as well as the position have already been examined
in the preceding section, with the use of the first test series. The size
invariance of the classifier is ensured through the use of normalised his-
tograms. Depending on the colour model, which is used for the classifier,
the classification is dependent on a constant lighting situation as well as
a constant hue value, which itself is depending on a correct setting of the
cameras’ white balance. While the white balance can be easily readjusted
by placing a white card in the scene and setting the auto white, the lighting
setup is more complicated to keep constant.

Figure 5.9: HSV-H Correl Test 2 Figure 5.10: HSV-H Chi-Sqr Test 2

Even though the Metazoa Ludens System has fixed fluorescent tubes for
lighting, the room light is likely to be changing, as well as a changing

64
5.2 Classification by histograms

amount of sunlight, which can mix up with the light from the fluorescent
lights. Therefore the second test series includes a change in lighting of
approximately one f-stop, which is done by alteration of the mechanical
aperture of the camera. The white balance has been readjusted after the
lighting change. In the following, two classifiers and two distance metrics
are compared by their performance in the second test series. The classi-
fiers for this test had been trained on the first test series, so the effect of
the lighting change on the performance can be measured.
An exhaustive comparison of the classification performance for the second
testset, which included all tested classifiers, combined with all distance
metrics, has been conducted. For the sake of space, the individual re-
sults are not presented here, but discussed on the example of the HSV-H
classifier, which had the best recognition rates overall, and the gray value
classifier. It is assumed that a gray based histogram classifier is more
affected by the lighting change than a hue based classifier, while a hue
based classifier should be affected by hue changes, which can occur if the
white balance is shifted. The classification results of the second test series
are shown in Table 5.3:

Classifier Correct 1 ErrorRate 1 Correct 2 ErrorRate 2 Difference

HSV-H Correl 80.26% 19.74% 45.00% 55.00% -35.26%
HSV-H ChiSqr 83.99% 16.01% 60.50% 39.50% -23.49%
Gray ChiSqr 70.70% 29.30% 50.20% 49.80% -20.50%

Table 5.3: Classification rates comparison of TestSet 1 + 28

It is evident from Table 5.3 as well as from the diagrams (see Figures 5.9
and 5.10) that the classification rate drops significantly for the second test
series, and the error rate rises, respectively. Especially the hue based clas-
sifiers, which should not be too much affected, as only the amount of light
but not the colour temperature has been changed, show a performance
drop of up to 35%. This performance drop is even higher than the one
for the gray based histogram classifier, which leads to the conclusion that
the hue based classifier, especially in connection with the correl distance
8
The complete results are listed in Appendix C

65
Chapter 5 Testing

Figure 5.11: Gray ChiSqr Test 1 Figure 5.12: Gray Chi-Sqr Test 2

metric, is very sensitive to any lighting changes. Possible explanations for

this fact can be problems with the automatic white balance setup, which
does not lead to the same hue results for Test 1 and Test 2, as well as the
non-linear frequency response of the colour channels of the camera sen-
sor. The only task which can be solved by the gray value classifier for test
2 is the classification of the breed (see Figure 5.12), which is done with
an error rate of approximately 3% (see Table C.4), while the same qualifier
returns satisfying results in the discrimination of all subjects, inside the
first test set (see Figure 5.11). The combination of HSV-H classifier and
Chi-Square distance metric is far away from an ideal classification rate of
the individual hamsters for Test 2, but at least the discrimination of the
three different families is possible (see Figure 5.10). The additional varia-
tion of the lighting between the two tests adds a degree of freedom to the
appearance of every subject. This degree of freedom increases the variance
of the histogram classifier of each individual subject, the scatter inside the
group, while the scatter in-between the groups stays constant. Therefore
the discriminance criterion (see Equation 5.1) is reduced, which leads to
worse classification rates and higher error rates. None of the methods

66
5.2 Classification by histograms

is able to discriminate well between the individual hamsters for test set
2, which limits the technology to a laboratory setting, where the lighting
and camera parameters can be kept constant. For further development
more test series are needed, to explore the relation of changing light and
classification performance.

5.2.4 Quality of classification

As described in Section 5.1.1, the test set consists of five hamsters, and
the classifier has to detect which of the five hamsters is present in the test
image. The assumption, that only objects which are part of the group of
known hamsters are depicted on the test videos, has been made. There-
fore the algorithm works as a closed-test, with a test setup similar to a
face identification system (see Section 2.4). As a basic algorithm the sys-
tem considers the hamster as a match whose gallery histogram has the
highest similarity value to the current histogram. Problems arise if the
distance values of two gallery histograms lie near together, or in more gen-
eral words, if two class distributions have similar locations in the feature
space. A good example are subjects 4 and 5 whose appearance is very sim-
ilar, and whose classifiers are close in feature space (see Figure 5.3). In
this case the test confidence is low, which means that the result of the clas-
sification process may be the right class, but a wrong classification has a
high likelihood as well. Therefore the outcome of the classifier is overlaid
by an error which has the form of an unknown likelihood distribution. In
order to minimise the impact of this error, a classification confidence value
has to be calculated, which is used to reject the classification if the con-
fidence is too low. One quality value for histogram classification, which
is based on the calculation of the similarity value (see Equations 5.2, 5.4
and 5.5), can be computed as the difference of the maximum similarity
value and the second largest similarity value:9

conf = dmax − dmax−1 (5.6)

9
In the case of the Chi-Square metric a small d value denotes a match, and the minimum
and second smallest value are taken into the calculation, instead of the maximum

67
Chapter 5 Testing

A threshold of this confidence value can be used to accept or reject the

result of an classification, or the quality value is used for a time based
analysis, in which an object is considered as match if it has accumulated
enough quality points in a certain timeframe. Although these concepts do
not improve the exactness of the initial classification, they help to evalu-
ate the likelihood of a false classification and therefore the quality of the
guess. A high threshold value can be used to eliminate wrong classifica-
tions, but on the other hand a lot of objects cannot be classified and get
rejected, if the threshold value is too high. ROC analysis (Receiver Oper-
ator Characteristics) is a diagnostic tool which is widely used to evaluate
and compare laboratory tests,10 for example in medical applications. Thus
two error classes are defined, the false-positives and the false-negatives,
as well as two positives classes, true-positives and true-negatives. In the
case of false-negatives the test does not detect a positive state, although
it is given, while a false-positive error is a detection, although the tested
condition is not true. The ROC curve can be used to find a cut-off value
for the test, which minimises one of the above error classes.11 The ROC
curve shows the ratio of the sensitivity, which is the ratio of true-positives
to all positives, and the specificity which is the ratio of true-negatives to
all negative cases. Every point of the curve is specified by these two val-
ues, depending on a specific cut-off value. At the bottom-left corner, the
cut-off value is high, so no wrong classifications are done, but the number
of rejected measurements is very high as well. On the top right corner,
the maximum number of possible positive classifications is achieved, but
a lot of wrong classifications (false positives) depreciate the quality of the
classification. The area below the curve can be used as an overall quality
criterion of a test, which can be between 0 and 1, but should be above 0.5
to be useful. The basic classification without a quality value or minimum
threshold corresponds to the top right corner of the ROC curve. While
rising the threshold for classification, the number of rejected objects rises,
as well as the number of false negatives, i.e. objects which could have
been classified correct. Figures 5.13 and 5.14 show the ROC curves of the
10
see Stephan et al. (2003)
11
see Hanley (1982)

68
5.2 Classification by histograms

Figure 5.13: ROC curve HSV-H Figure 5.14: ROC curve Gray

HSV-H classifier, and the classifier based on gray values, respectively, in

combination with the Chi-square distance metric.
The area under the curves can be taken as a metric performance value,
which can be used to compare different technologies, with the same test
data. It provides a better insight into the strengths and weaknesses of a
test, than the simple ratio of correct classifications to all tested subjects.
In the ROC curves above it is evident, that there is a significant difference
between the performance of both classifiers for test series 1 and test series
2. Both classifiers show a lower value for the area under the curve for test
series 2 - a finding which matches the results of the performance tests in
Section 5.2.3. In the case of the HSV-H classifier, an optimal cut-off value
would be around Sensitivity 0.6 and 1-Specificity 0.15 (see circled area in
Figure 5.13). Figure 5.15 shows the effect of thresholding the classifier
result with this cut-off value, and rejecting all classifications which are
below this threshold. The rate of false-positives drops down to 0.6 % of all
classifications, but 47.4 % of all classifications are rejected.
This high reject rate is mainly due to subjects 4 and 5, which cannot

69
Chapter 5 Testing

Figure 5.15: HSV-H (Chi-Square) Figure 5.16: HSV-H (Chi-Square)

with Cut-off (Test 1) with Cut-off (Test 2)12

be distinguished well by the classifier (see Figure 5.8), and therefore get
rejected in 96.9 % of all cases. The mean reject-rate for the subjects 1-3
is at 14.2%, a value which would be adequate for a hamster identification
system. In this configuration the software could be used to detect if a
roborovski hamster (subjects 1-3) is inside the playing field, and identify
the specific individual hamster, if it is a roborovski. However, a reject
result would not identify the hamster as a member of the golden hamster
breed (subjects 4 and 5), as the reject signal is raised as well in the case of
false-negatives. The results for the second test set with the same classifier
are depicted in Figure 5.16, and indicate, that the classifier performance
is not sufficient. Although the cut-off value is high, and the reject rate
is at 47.8 %, a high amount of misclassifications is returned. The only
subject which can be detected by this setup is subject 3, but even this
classifier result is not very specific, as subject 2 is misclassified as subject
3 in a large number of cases.

12
Rejected cases are denoted by -1. The complete results are listed in Appendix D

70
5.3 Classification by face recognition algorithms

5.3 Classification by face recognition algorithms

The face recognition algorithms are tested with videos from the same test
set as the histogram classifiers, in order to ensure comparability of the re-
sults. The algorithms are trained on a subset of all video frames, and then
tested with frames from other parts of the same videos. As described in
Section 4.5.4 the images are converted to gray value images and exported
from the Metazoa Ludens software, including their corresponding image
lists. The CSU Face Identification Evaluation System13 is then used for the
training and testing of the classifiers. In general, the classification results
which are achieved by the face recognition algorithms are significantly
lower than the results of the histogram methods. One possible explana-
tion stems from the high variability of the hamsters appearance, through
the distortion of their fur, and their non-rigid movement. This variability
is very high compared to the differences between the individual hamsters,
which complicates the recognition of individual animals. The face recogni-
tion algorithms are optimised for the recognition of the human face, which
is quite variant, but less variant than the constant mutation of the ham-
sters’ appearance. Furthermore the differences between different human
faces are bigger than between the fur of individual hamsters. Therefore
the face recognition algorithms, which compare the spatial layout of the
input images, have big problems to distinguish between objects like the
hamsters, whose appearance is constantly changing. In the following, the
results of the several face recognition algorithms are shown and briefly
discussed.

5.3.1 PCA results

The principal components analysis (PCA) is used to reduce the complex-

ity of the gray value images, in order to use them as classifiers for the
hamsters. The CSU FaceID system has two different distance metrics for
the PCA algorithm included. The results of the PCA-Euclidean algorithm
are depicted in Figure 5.17, and show a wide-spreaded response of the
13
see Bolme et al. (2003)

71
Chapter 5 Testing

Figure 5.17: PCA-Euclidean Figure 5.18: PCA-MahCosine14

algorithm, with a mean correct-classification rate of 41.14%.

Although the correct-classification result is the most frequent response

for subjects 3-5, a lot of classification errors depreciate the overall result.
Furthermore the errors cannot be explained by the differences between
the hamsters, especially in the case of subject 1, which is misclassified
in 76.5% of all cases, with most of these cases being cross-breed misclas-
sifications. The performance of the MahCosine distance metric in com-
bination with the PCA algorithm is slightly better (see Figure 5.18) than
PCA-Euclidean, and has a correct-classification rate of 42.72%. In con-
trast to the euclidean algorithm, the correct guess is the most frequent
case for all subjects, but the amount of misclassifications is high as well.
The PCA-MahCosine method produces more systematic errors, which can
be explained by the appearance of the hamsters, and less cross-breed mis-
classifications than the PCA-Euclidean method.

14
The complete results are listed in Appendix E

72
5.3 Classification by face recognition algorithms

Figure 5.19: LDA-soft16

5.3.2 LDA results

Linear discriminant analysis (LDA) is used in an implementation of the

fisherface algorithm,15 and can be evaluated through the CSU FaceId soft-
ware. It is a combination of PCA and LDA techniques, and has advantages
for the classification, if the classes in question are linear separable (see
Section 2.4.2.2). Figure 5.19 shows the performance of the LDA-soft al-
gorithm with the test set 1. The overall rate of correct classifications is
44.42% and thus slightly higher than the preceding algorithms. A special
case is the response of the classifier for subject 1, which includes all five
possible responses with a similar likelihood between 15% and 25%. Apart
from that finding, the algorithm is able to distinguish between subjects 2
and 3, and between the breeds, but is not able to distinguish subjects 4
and 5, like the other algorithms.

15
see Belhumeur et al. (1997)
16
The complete results are listed in Appendix E

73
Chapter 5 Testing

Figure 5.20: Bayesian-MAP Figure 5.21: Bayesian-ML17

5.3.3 Bayesian results

The probabilistic bayesian classifier, based on bayes’ theorem (see Sec-

tion 2.4.2.3), is implemented inside the CSU FaceRec software with two
different evaluation algorithms - Bayesian-MAP and Bayesian-ML, whose
results are shown above (see Figures 5.20 and 5.21). The performance
of the Bayesian-ML algorithm is slightly better, with a recognition rate of
40.21%, compared to 36.56% for the Bayesian-MAP classifier. Despite
good recognition results for subjects 1 and 3, especially for Bayesian-ML,
the high amount of false classifications prohibits the practical use of these
classifiers for a pet recognition application.

17
The complete results are listed in Appendix E

74
Chapter 6

Conclusion

6.1 Conclusion

The goal of this thesis, to build a combined tracking and recognition soft-
ware, which can be used for remote human-pet interaction, has been
reached. It could be shown that a histogram classifier is able to distin-
guish between multiple hamsters, at least in the case of the roborovski
breed. Furthermore it could be demonstrated by experiment, that the
standard face recognition technologies do not have a better recognition
performance for this task, than the histogram classifier. This fact can
be explained by the highly-variant appearance of the hamsters, which is
constantly changing while they move. Apart from the recognition perfor-
mance, the face recognition algorithms need more computational power
than the histogram methods, for the training of the classifier as well as for
the classification. Therefore the use of a histogram classifier is proposed
for Metazoa Ludens. The poor classification results of the histogram algo-
rithm for one breed can be explained by the high similarity of the subjects,
in connection with the high variability of the appearance of the individual
subject. Furthermore, segmentation errors, which mainly stem from the
small difference between the colour of the background, and the colour of
parts of the golden hamsters’ fur, are a possible reason for the bad recogni-
tion performance for these subjects. Therefore a change of the segmenta-
tion algorithm, e.g. to a motion based segmentation approach, could help
to improve the recognition performance in these cases as well.

75
Chapter 6 Conclusion

6.2 Further work

For the future, further development and tests are needed, especially in
the question of classifier robustness against external influences. The hue
based histogram algorithms, which had the best results in the test, seem
to be very sensitive to illumination changes. For a future application of the
recognition system, either a calibration method has to be developed, which
is able to calibrate the camera parameters to a standard value each time
the software is started, or a classifier has to be designed, which is able to
recognise the individual animals, and which is more robust against exter-
nal variations of the test setup. Furthermore the use of different segmenta-
tion algorithms should be explored, to solve problems with the recognition
for some breeds, as well as to allow the recognition software to be used
in a less restricted setup, without the need of a uniform background, for
example in pet shops or research labs.

76
Appendix

I
Appendix A

Classification performance
of different histogram classifiers

Classification Results (GRAY)

HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,505 131 104 0 0 1,740
2 184 1,219 337 0 0 1,740
3 62 208 1,470 0 0 1,740
4 0 0 4 1,059 677 1,740
5 0 0 7 522 1,211 1,740
% 1 86.5 7.5 6.0 0.0 0.0 100.0
2 10.6 70.1 19.4 0.0 0.0 100.0
3 3.6 12.0 84.5 0.0 0.0 100.0
4 0.0 0.0 0.2 60.9 38.9 100.0
5 0.0 0.0 0.4 30.0 69.6 100.0
74,3% of original grouped cases correctly classified.

Table A.1: Classification Results (GRAY)

II
Classification Results (HSV-H)
HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,688 28 23 0 1 1,740
2 0 1,668 72 0 0 1,740
3 22 13 1,705 0 0 1,740
4 14 0 8 1,264 454 1,740
5 21 0 26 457 1,236 1,740
% 1 97.0 1.6 1.3 0.0 0.1 100.0
2 0.0 95.9 4.1 0.0 0.0 100.0
3 1.3 0.7 98.0 0.0 0.0 100.0
4 0.8 0.0 0.5 72.6 26.1 100.0
5 1.2 0.0 1.5 26.3 71.0 100.0
86,9% of original grouped cases correctly classified.

Table A.2: Classification Results (HSV-H)

Classification Results (HSV-S)

HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,663 19 57 0 1 1,740
2 1 1,586 153 0 0 1,740
3 11 38 1,691 0 0 1,740
4 1 0 8 1,247 484 1,740
5 4 0 14 417 1,305 1,740
% 1 95.6 1.1 3.3 0.0 0.1 100.0
2 0.1 91.1 8.8 0.0 0.0 100.0
3 0.6 2.2 97.2 0.0 0.0 100.0
4 0.1 0.0 0.5 71.7 27.8 100.0
5 0.2 0.0 0.8 24.0 75.0 100.0
86,1% of original grouped cases correctly classified.

Table A.3: Classification Results (HSV-S)

III
Appendix A Classification performance of different histogram classifiers

Classification Results (HSV-V)

HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,482 132 126 0 0 1,740
2 149 1,195 396 0 0 1,740
3 41 236 1,463 0 0 1,740
4 0 0 7 1,062 671 1,740
5 0 0 7 530 1,203 1,740
% 1 85.2 7.6 7.2 0.0 0.0 100.0
2 8.6 68.7 22.8 0.0 0.0 100.0
3 2.4 13.6 84.1 0.0 0.0 100.0
4 0.0 0.0 0.4 61.0 38.6 100.0
5 0.0 0.0 0.4 30.5 69.1 100.0
73,6% of original grouped cases correctly classified.

Table A.4: Classification Results (HSV-V)

Classification Results (HLS-H)

HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,687 29 23 0 1 1,740
2 0 1,667 73 0 0 1,740
3 19 13 1,708 0 0 1,740
4 14 0 8 1,253 465 1,740
5 22 0 26 468 1,224 1,740
% 1 97.0 1.7 1.3 0.0 0.1 100.0
2 0.0 95.8 4.2 0.0 0.0 100.0
3 1.1 0.7 98.2 0.0 0.0 100.0
4 0.8 0.0 0.5 72.0 26.7 100.0
5 1.3 0.0 1.5 26.9 70.3 100.0
86,7% of original grouped cases correctly classified.

Table A.5: Classification Results (HLS-H)

IV
Classification Results (HLS-L)
HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,546 114 80 0 0 1,740
2 142 1,271 327 0 0 1,740
3 61 225 1,454 0 0 1,740
4 0 0 5 1,054 681 1,740
5 0 0 7 543 1,190 1,740
% 1 88.9 6.6 4.6 0.0 0.0 100.0
2 8.2 73.0 18.8 0.0 0.0 100.0
3 3.5 12.9 83.6 0.0 0.0 100.0
4 0.0 0.0 0.3 60.6 39.1 100.0
5 0.0 0.0 0.4 31.2 68.4 100.0
74,9% of original grouped cases correctly classified.

Table A.6: Classification Results (HLS-L)

Classification Results (HLS-S)

HamsterID Predicted Group Membership Total
1 2 3 4 5
Original Count 1 1,618 50 44 1 27 1,740
2 49 1,478 213 0 0 1,740
3 26 79 1,635 0 0 1,740
4 14 0 8 1,241 477 1,740
5 28 0 18 392 1,302 1,740
% 1 93.0 2.9 2.5 0.1 1.6 100.0
2 2.8 84.9 12.2 0.0 0.0 100.0
3 1.5 4.5 94.0 0.0 0.0 100.0
4 0.8 0.0 0.5 71.3 27.4 100.0
5 1.6 0.0 1.0 22.5 74.8 100.0
83,6% of original grouped cases correctly classified.

Table A.7: Classification Results (HLS-S)

V
Appendix B

Comparison of different
distance metrics (HSV-H)

Correl
Result Total
1 2 3 4 5
Original 1 Count 1,444 18 238 0 40 1,740
% 83.0% 1.0% 13.7% 0.0% 2.3% 100.0%
2 Count 0 1,686 54 0 0 1,740
% 0.0% 96.9% 3.1% 0.0% 0.0% 100.0%
3 Count 251 120 1,369 0 0 1,740
% 14.4% 6.9% 78.7% 0.0% 0.0% 100.0%
4 Count 30 0 0 1,278 432 1,740
% 1.7% 0.0% 0.0% 73.4% 24.8% 100.0%
5 Count 40 0 0 494 1,206 1,740
% 2.3% 0.0% 0.0% 28.4% 69.3% 100.0%
Total Count 1,765 1,824 1,661 1,772 1,678 8,700
% 20.3% 21.0% 19.1% 20.4% 19.3% 100.0%
Correct 80.2644%
Error rate 19.74%

Table B.1: Distance metric test results (HSV-H) CORREL

VI
Chi-Square
Result Total
1 2 3 4 5
Original 1 Count 1,675 8 54 0 3 1,740
% 96.3% 0.5% 3.1% 0.0% 0.2% 100.0%
2 Count 1 1,567 172 0 0 1,740
% 0.1% 90.1% 9.9% 0.0% 0.0% 100.0%
3 Count 52 39 1,649 0 0 1,740
% 3.0% 2.2% 94.8% 0.0% 0.0% 100.0%
4 Count 22 0 8 1,276 434 1,740
% 1.3% 0.0% 0.5% 73.3% 24.9% 100.0%
5 Count 73 0 27 500 1,140 1,740
% 4.2% 0.0% 1.6% 28.7% 65.5% 100.0%
Total Count 1,823 1,614 1,910 1,776 1,577 8,700
% 21.0% 18.6% 22.0% 20.4% 18.1% 100.0%
Correct 83.9885%
Error rate 16.01%

Table B.2: Distance metric test results (HSV-H) CHI-SQR

Intersect
Result Total
1 2 3 4 5
Original 1 Count 1,739 1 0 0 0 1,740
% 99.9% 0.1% 0.0% 0.0% 0.0% 100.0%
2 Count 109 1,631 0 0 0 1,740
% 6.3% 93.7% 0.0% 0.0% 0.0% 100.0%
3 Count 1,686 53 1 0 0 1,740
% 96.9% 3.0% 0.1% 0.0% 0.0% 100.0%
4 Count 51 0 0 1,369 320 1,740
% 2.9% 0.0% 0.0% 78.7% 18.4% 100.0%
5 Count 133 0 0 856 751 1,740
% 7.6% 0.0% 0.0% 49.2% 43.2% 100.0%
Total Count 3,718 1,685 1 2,225 1,071 8,700
% 42.7% 19.4% 0.0% 25.6% 12.3% 100.0%
Correct 63.1149%
Error rate 36.89%

Table B.3: Distance metric test results (HSV-H) INTERSECT

VII
Appendix C

Classifier robustness against

external variance

Test 2 Results: HSV-H (Correl)

Result Total
1 2 3 4 5
Original 1 Count 1,025 0 3 37 675 1,740
% 58.9% 0.0% 0.2% 2.1% 38.8% 100.0%
2 Count 494 184 1,060 1 1 1,740
% 28.4% 10.6% 60.9% 0.1% 0.1% 100.0%
3 Count 961 1 752 0 26 1,740
% 55.2% 0.1% 43.2% 0.0% 1.5% 100.0%
4 Count 0 0 0 1,613 127 1,740
% 0.0% 0.0% 0.0% 92.7% 7.3% 100.0%
5 Count 0 0 0 1,400 340 1,740
% 0.0% 0.0% 0.0% 80.5% 19.5% 100.0%
Total Count 2,480 185 1,815 3,051 1,169 8,700
% 28.5% 2.1% 20.9% 35.1% 13.4% 100.0%
Correct 45.0%
Error Rate 55.01%

Table C.1: Test 2 Results: HSV-H (Correl)

VIII
Test 2 Results HSV-H (ChiSqr)
Result Total
1 2 3 4 5
Original 1 Count 1,362 0 29 1 348 1,740
% 78.3% 0.0% 1.7% 0.1% 20.0% 100.0%
2 Count 481 318 941 0 0 1,740
% 27.6% 18.3% 54.1% 0.0% 0.0% 100.0%
3 Count 71 2 1,665 0 2 1,740
% 4.1% 0.1% 95.7% 0.0% 0.1% 100.0%
4 Count 0 0 5 1,652 83 1,740
% 0.0% 0.0% 0.3% 94.9% 4.8% 100.0%
5 Count 0 0 3 1,473 264 1,740
% 0.0% 0.0% 0.2% 84.7% 15.2% 100.0%
Total Count 1,914 320 2,643 3,126 697 8,700
% 22.0% 3.7% 30.4% 35.9% 8.0% 100.0%
Correct 60.5%
Error Rate 39.53%

Table C.2: Test 2 Results HSV-H (ChiSqr)

Test 1 Results: Gray (ChiSqr)

Result Total
1 2 3 4 5
Original 1 Count 1,568 145 27 0 0 1,740
% 90.1% 8.3% 1.6% 0.0% 0.0% 100.0%
2 Count 257 1,150 333 0 0 1,740
% 14.8% 66.1% 19.1% 0.0% 0.0% 100.0%
3 Count 46 350 1,344 0 0 1,740
% 2.6% 20.1% 77.2% 0.0% 0.0% 100.0%
4 Count 0 0 4 999 737 1,740
% 0.0% 0.0% 0.2% 57.4% 42.4% 100.0%
5 Count 0 0 4 645 1,091 1,740
% 0.0% 0.0% 0.2% 37.1% 62.7% 100.0%
Total Count 1,871 1,645 1,712 1,644 1,828 8,700
% 21.5% 18.9% 19.7% 18.9% 21.0% 100.0%
Correct 70.7%
Error Rate 29.29%

Table C.3: Test 1 Results: Gray (ChiSqr)

IX
Appendix C Classifier robustness against external variance

Test 2 Results: Gray (ChiSqr)

Result Total
1 2 3 4 5
Original 1 Count 480 1,110 150 0 0 1,740
% 27.6% 63.8% 8.6% 0.0% 0.0% 100.0%
2 Count 13 1,565 160 0 2 1,740
% 0.7% 89.9% 9.2% 0.0% 0.1% 100.0%
3 Count 0 1,021 719 0 0 1,740
% 0.0% 58.7% 41.3% 0.0% 0.0% 100.0%
4 Count 49 0 0 1,498 193 1,740
% 2.8% 0.0% 0.0% 86.1% 11.1% 100.0%
5 Count 0 0 1 1,636 103 1,740
% 0.0% 0.0% 0.1% 94.0% 5.9% 100.0%
Total Count 542 3,696 1,030 3,134 298 8,700
% 6.2% 42.5% 11.8% 36.0% 3.4% 100.0%
Correct 50.2%
Error Rate 49.83%

Table C.4: Test 2 Results: Gray (ChiSqr)

X
Appendix D

Quality of classification

HSV-H (Chi-Square) with cut-off (Test 1)

Result Total
-1.0 1.0 2.0 3.0 4.0 5.0
Original 1 Count 176 1,536 1 27 0 0 1,740
% 10.1% 88.3% 0.1% 1.6% 0.0% 0.0% 100.0%
2 Count 287 0 1,383 70 0 0 1,740
% 16.5% 0.0% 79.5% 4.0% 0.0% 0.0% 100.0%
3 Count 278 16 3 1,443 0 0 1,740
% 16.0% 0.9% 0.2% 82.9% 0.0% 0.0% 100.0%
4 Count 1,690 10 0 8 32 0 1,740
% 97.1% 0.6% 0.0% 0.5% 1.8% 0.0% 100.0%
5 Count 1,683 30 0 26 0 1 1,740
% 96.7% 1.7% 0.0% 1.5% 0.0% 0.1% 100.0%
Total Count 4,114 1,592 1,387 1,574 32 1 8,700
% 47.3% 18.3% 15.9% 18.1% 0.4% 0.0% 100.0%

Table D.1: HSV-H (Chi-Square) with cut-off (Test 1)1

1
-1 denotes cases which have been rejected due to the cut-off criteria

XI
Appendix D Quality of classification

HSV-H (Chi-Square) with cut-off (Test 2)

Result Total
-1.0 1.0 2.0 3.0 4.0 5.0
Original 1 Count 781 935 0 21 0 3 1,740
% 44.9% 53.7% 0.0% 1.2% 0.0% 0.2% 100.0%
2 Count 593 292 127 728 0 0 1,740
% 34.1% 16.8% 7.3% 41.8% 0.0% 0.0% 100.0%
3 Count 142 18 1 1,579 0 0 1,740
% 8.2% 1.0% 0.1% 90.7% 0.0% 0.0% 100.0%
4 Count 1,236 0 0 1 503 0 1,740
% 71.0% 0.0% 0.0% 0.1% 28.9% 0.0% 100.0%
5 Count 1,403 0 0 3 324 10 1,740
% 80.6% 0.0% 0.0% 0.2% 18.6% 0.6% 100.0%
Total Count 4,155 1,245 128 2,332 827 13 8,700
% 47.8% 14.3% 1.5% 26.8% 9.5% 0.1% 100.0%

Table D.2: HSV-H (Chi-Square) with cut-off (Test 2)

XII
Appendix E

Face recognition algorithms

test results

PCA-Euclidean
Result Total
1 2 3 4 5
Original 1 Count 411 203 243 339 544 1,740
% 23.6% 11.7% 14.0% 19.5% 31.3% 100.0%
2 Count 768 634 128 116 94 1,740
% 44.1% 36.4% 7.4% 6.7% 5.4% 100.0%
3 Count 156 207 900 258 219 1,740
% 9.0% 11.9% 51.7% 14.8% 12.6% 100.0%
4 Count 183 21 32 808 696 1,740
% 10.5% 1.2% 1.8% 46.4% 40.0% 100.0%
5 Count 189 17 24 682 828 1,740
% 10.9% 1.0% 1.4% 39.2% 47.6% 100.0%
Total Count 1,707 1,082 1,327 2,203 2,381 8,700
% 19.6% 12.4% 15.3% 25.3% 27.4% 100.0%

Table E.1: PCA-Euclidean results

XIII
Appendix E Face recognition algorithms test results

PCA-MahCosine
Result Total
1 2 3 4 5
Original 1 Count 608 444 422 99 167 1,740
% 34.9% 25.5% 24.3% 5.7% 9.6% 100.0%
2 Count 586 968 144 23 19 1,740
% 33.7% 55.6% 8.3% 1.3% 1.1% 100.0%
3 Count 184 455 901 124 76 1,740
% 10.6% 26.1% 51.8% 7.1% 4.4% 100.0%
4 Count 294 171 52 700 523 1,740
% 16.9% 9.8% 3.0% 40.2% 30.1% 100.0%
5 Count 474 178 53 494 541 1,740
% 27.2% 10.2% 3.0% 28.4% 31.1% 100.0%
Total Count 2,146 2,216 1,572 1,440 1,326 8,700
% 24.7% 25.5% 18.1% 16.6% 15.2% 100.0%

Table E.2: PCA-MahCosine results

LDA-Soft
Result Total
1 2 3 4 5
Original 1 Count 424 242 211 400 463 1,740
% 24.4% 13.9% 12.1% 23.0% 26.6% 100.0%
2 Count 1,128 473 41 75 23 1,740
% 64.8% 27.2% 2.4% 4.3% 1.3% 100.0%
3 Count 67 143 1,317 95 118 1,740
% 3.9% 8.2% 75.7% 5.5% 6.8% 100.0%
4 Count 136 18 28 853 705 1,740
% 7.8% 1.0% 1.6% 49.0% 40.5% 100.0%
5 Count 68 34 31 810 797 1,740
% 3.9% 2.0% 1.8% 46.6% 45.8% 100.0%
Total Count 1,823 910 1,628 2,233 2,106 8,700
% 21.0% 10.5% 18.7% 25.7% 24.2% 100.0%

Table E.3: LDA-Soft results

XIV
Bayesian-MAP
Result Total
1 2 3 4 5
Original 1 Count 36 11 7 0 4 58
% 62.1% 19.0% 12.1% 0.0% 6.9% 100.0%
2 Count 26 28 4 0 0 58
% 44.8% 48.3% 6.9% 0.0% 0.0% 100.0%
3 Count 18 11 26 0 3 58
% 31.0% 19.0% 44.8% 0.0% 5.2% 100.0%
4 Count 28 13 5 4 8 58
% 48.3% 22.4% 8.6% 6.9% 13.8% 100.0%
5 Count 19 15 10 2 12 58
% 32.8% 25.9% 17.2% 3.4% 20.7% 100.0%
Total Count 127 78 52 6 27 290
% 43.8% 26.9% 17.9% 2.1% 9.3% 100.0%

Table E.4: Bayesian-MAP results

Bayesian-ML results
Result Total
1 2 3 4 5
Original 1 Count 37 11 5 0 5 58
% 63.8% 19.0% 8.6% 0.0% 8.6% 100.0%
2 Count 25 31 2 0 0 58
% 43.1% 53.4% 3.4% 0.0% 0.0% 100.0%
3 Count 15 9 31 0 3 58
% 25.9% 15.5% 53.4% 0.0% 5.2% 100.0%
4 Count 26 13 5 5 9 58
% 44.8% 22.4% 8.6% 8.6% 15.5% 100.0%
5 Count 19 14 9 3 13 58
% 32.8% 24.1% 15.5% 5.2% 22.4% 100.0%
Total Count 122 78 52 8 30 290
% 42.1% 26.9% 17.9% 2.8% 10.3% 100.0%

Table E.5: Bayesian-ML results

XV
Appendix F

Bibliography

[Alderton and Tanner 1999] A LDER TON, David ; T ANNER, Bruce: Rodents
of the World (Of the World). Blandford, 1999. – ISBN 0713727896

[for the Prevention of Cruelty to animals 2006] ANIMALS , Scot-

tish S. t.: Pet Facts - More about Hamsters. Version: 2006.
http://www.scottishspca-usa.org/PetFacts-3.aspx, Retrieved
on: 09. Dec. 2006

[Backhaus et al. 2003] B ACKHAUS, K. ; E RICHSON, B. ; P LINKE, W. ;

W EIBER, R.: Multivariate Analysemethoden. Eine anwendungsorientierte
Einführung, 10. Auflage. Springer Verlag, Berlin, 2003. – ISBN 3–540–
00491–2

[Bayes 1763] B AYES, T.: An essay towards solving a problem in the doc-
trine of chances. In: Philosophical Transactions of the Royal Soc. of Lon-
don 53 (1763), pp. 370–418. – reprinted in Biometrika 45(3/4) 293-315
Dec 1958

[Belhumeur et al. 1997] B ELHUMEUR, P. N. ; H ESPANHA, J. P. ; K RIEG -

MAN , D. J.: Eigenfaces vs. Fisherfaces: recognition using class specific
linear projection. In: Pattern Analysis and Machine Intelligence, IEEE
Transactions on 19 (1997), Nr. 7, pp. 711–720. ISBN 0162–8828

[Bergesen 1989] B ERGESEN, F.: The effects of pet facilitated therapy on

the self-esteem and socialization of primary school children. In: 5th Inter-
national Conference on the Relationship between Humans and Animals.
Monaco, IOS Press, 1989, pp. 1220–1234

XVI
Appendix F Bibliography

[Bolme et al. 2003] B OLME, David S. ; B EVERIDGE, J. R. ; T EIXEIRA, Marcio

; D RAPER, Bruce A.: The CSU Face Identification Evaluation System: Its
Purpose, Features, and Structure. 2003. – 304–313 p. http://www.
springerlink.com/content/dp9q7l1dyrjn2a91

[Burnie 2005] B URNIE, David: Animal: The Definitive Visual Guide to the
World’s Wildlife. DK ADULT, 2005. – ISBN 0756616344

[Chamberlain. 1992] C HAMBERLAIN ., Melissa: A Very Differ-

ent Hamster – The Roborovski. Version: 1992. http://www.
britishhamsterassociation.org.uk/get_article.php?fname=
journal/robos.html, Retrieved on: 08. Dec. 2006

[Champagne and Lundrigan 2006] C HAMPAGNE, A. ; L UNDRIGAN,

B.: Mesocricetus auratus (Animal Diversity Web). Version: 2006.
http://animaldiversity.ummz.umich.edu/site/accounts/
information/Mesocricetus_auratus.html, Retrieved on:
07. Feb. 2007

[Davies 2005] D AVIES, E. R.: Machine Vision: Theory, Algorithms, Practical-

ities, Third Edition. Morgan Kaufmann, 2005

[Deutsch et al. 2005] D EUTSCH, B. ; G RASSL, C. ; B AJRAMOVIC, F. ; D EN -

ZLER , J.: A Comparative Evaluation of Template and Histogram Based
2D Tracking Algorithms. In: German Pattern Recognition Symposium,
2005, 269–276

[Dieterlen 1959] D IETERLEN, F.: Das Verhalten des syrischen Goldham-

sters (Mesocricetus auratus. In: Zeitschrift f u r Tierpsychologie (1959),
pp. 47–103

[van Eck 2006] E CK, Wim van: Animal controlled computer games: Playing
Pac-Man against real crickets. Version: 2006. http://pong.hku.nl/
%7Ewim/bugman.htm, Retrieved on: 25. Nov. 2006

[van Eck and Lamers 2006] E CK, Wim van ; L AMERS, Maarten H.: Animal
controlled computer games: Playing Pac-Man against real crickets. In:

XVII
Appendix F Bibliography

R. H ARPER, M. C. M. Rauterberg R. M. Rauterberg (Publisher): Entertain-

ment Computing: ICEC 2006, Springer, 2006, pp. 31–36

[Exploration 1997] E XPLORATION, Centre M.: Petting Zoo. Version: 1997.

http://www.metahuman.org/web/projects.html, Retrieved on:
21. Nov. 2006

[Festing 1986] F ESTING, M.F.: Hamsters. In: P OOLE, T.P. (Publisher): The
UFAW Handbook on the Care and Management of Laboratory Animals. 6.
Longman Scientific and Technical, London, 1986, pp. 242—256

[Fisher 1936] F ISHER, R. A.: The use of multiple measurements in taxo-

nomic problems. In: Annals of Eugenics 7 (1936), Nr. 2, pp. 179–188

[Gattermann et al. 2001] G ATTERMANN, R. ; F RITZSCHE, P. ; N EUMANN, K.

; A L -H USSEIN, I. ; K AYSER, A. ; A BIAD, M. ; YAKTI, R.: Notes on the cur-
rent distribution and the ecology of wild golden hamsters (Mesocricetus
auratus). In: Journal of Zoology 254 (2001), Nr. 03, pp. 359–365

[Hamsterhideout 2006] H AMSTERHIDEOUT: Eyes. Anatomy of a Hamster.

Version: 2006. http://www.hamsterhideout.com/anatomyhead.html,
Retrieved on: 09. Dec. 2006

[Hanley 1982] H ANLEY, JA: The meaning and use of the area under a
receiver operating characteristic (ROC) curve. In: Radiology 143 (1982),
Nr. 1, pp. 29–36

[Horn and Schunck 1981] H ORN, B.K.P. ; S CHUNCK, B.G.: Determining

Optical Flow. In: Artificial Intelligence 17 (1981), Nr. 1-3, pp. 185–203

[Howell 1999] H OWELL, A. J.: Introduction to Face Recognition. In: J AIN,

Lakhmi C. (Publisher) ; H ALICI, Ugur (Publisher) ; H AYASHI, Isao (Pub-
lisher) ; L EE, S.B. (Publisher) ; T SUTSUI, Shigeyoshi (Publisher): Intel-
ligent Biometric Techniques in Fingerprint and Face Recognition. CRC,
1999. – ISBN 0849320550, Chapter 7, pp. 219–283

[Iglesias et al. 2006] I GLESIAS, Tony ; S ALMON, Anita ; S CHOLTZ, Jo-

hann ; H EDEGORE, Robert ; B ORGENDALE, Julianna ; R UNNELS, Brent

XVIII
Appendix F Bibliography

; M C K IMPSON, Nathan: Camera Computer Interfaces. In: H ORNBERG,

Alexander (Publisher): Handbook of Machine Vision. Wiley-VCH, 2006. –
ISBN 3527405844, Chapter 7, pp. 427–509

[Intel 2006] I NTEL: OpenCV Website. Version: Nov 2006. http://www.

intel.com/technology/computing/opencv/index.htm, Retrieved on:
27. Nov. 2006

[Intel 2007] I NTEL: OpenCV documentation. Version: 2007.

http://opencvlibrary.sourceforge.net/CvReference, Retrieved
on: 12. Feb. 2007

[Jain and Li 2004] J AIN, A. K. ; L I, S. Z.: Handbook of Face Recognition.

Springer, 2004. – ISBN 038740595X

[Jang and Lee 2004] J ANG, Sunyean ; L EE, Manjai: Hello-Fish: In-
teracting with Pet Fishes Through Animated Digital Wallpaper on a
Screen. Version: 2004. http://www.springerlink.com/content/
8rl5508nlrwbqccf. In: Lecture Notes in Computer Science: Entertain-
ment Computing ICEC 2004. Springer, 2004, 559–564

[Johnston 1985] J OHNSTON, R.E.: Communication. In: S IEGEL, H.I. (Pub-

lisher): The Hamster. Plenum, 1985. – ISBN 030641791X, pp. 121—154

[KC Tan et al. 2006a] KC T AN, Roger T. ; C HEOK, Adrian D. ; KS T EH,

James: Metazoa Ludens: Mixed Reality environment for playing com-
puter games with pets. In: International Journal of Virtual Reality (2006)

[KC Tan et al. 2006b] KC T AN, Roger T. ; C HEOK, Adrian D. ; KS T EH,

James ; M USTAPHA, Md Faisal B. ; L EONG, Li S. ; T ODOROVIC, Valdimir:
Metazoa Ludens: Inter-Species Computer Interaction and Play. In: ac-
cepted at SIGCHI (2006)

[Lee et al. 2006] L EE, Shang P. ; C HEOK, Adrian D. ; J AMES, Teh Keng S.
; D EBRA, Goh Pae L. ; J IE, Chio W. ; C HUANG, Wang ; F ARBIZ, Farzam:
A mobile pet wearable computer and mixed reality system for human-
poultry interaction through the internet. In: Personal and Ubiquitous

XIX
Appendix F Bibliography

Computing V10 (2006), Nr. 5, 301–317. http://dx.doi.org/10.1007/

s00779-005-0051-6

[Lu 2003] L U, Xiaoguang: Image Analysis for Face Recognition. http:

//www.face-rec.org/interesting-papers/. Version: May 2003. –
Dept. of Computer Science & Engineering Michigan State University,
East Lansing, MI, 48824 Email: [email protected]

[McGuire 1993] M C G UIRE, Jean: Phodopus Sungorus (Russian Dwarf

Hamsters). Version: 1993. http://netvet.wustl.edu/species/
hamsters/phodopus.txt, Retrieved on: 08. Dec. 2006

[Mikesell 2003] M IKESELL, Dan: Networking Pets and People. In: Adjunct
proceedings of Ubicomp 2003, ACM Press, 2003, pp. 88–89

[MixedRealityLab 2006] M IXED R EALITY L AB ; U NIVERSITY OF S INGAPORE

(Publisher): Website. Version: 2006. http://www.mixedrealitylab.
org, Retrieved on: 12. Oct. 2006

[Moghaddam et al. 1996] M OGHADDAM, B. ; N ASTAR, C. ; P ENTLAND, A. P.:

A Bayesian Similarity Measure for Direct Image Matching. In: Interna-
tional Conference on Pattern Recognition, 1996, II: 350–358

[Murphy 1985] M URPHY, M.R.: History of the capture and domestica-

tion of the Syrian golden hamster (Mesocricetus auratus Waterhouse).
In: S IEGEL, H.I. (Publisher): The Hamster. Plenum, 1985. – ISBN
030641791X, pp. 3–20

[Musser and Carleton 2005] M USSER, GG ; C ARLETON, MD: Superfamily

Muroidea. In: Mammal species of the world: a taxonomic and geographic
reference, (2005), pp. 894–1531

[OpenCV ] O PEN CV: Open CV Yahoo Group. http://tech.groups.yahoo.

com/group/OpenCV/, Retrieved on: 27. Nov. 2006

[Osuna et al. 1997] O SUNA, E. ; F REUND, R. ; G IROSI, F.: Training support

vector machines:an application to face detection. citeseer.comp.nus.
edu.sg/osuna97training.html. Version: 1997

XX
Appendix F Bibliography

[Pet 2005a] P ET, Website: Behavior. Version: 2005. http://www.

petwebsite.com/hamsters/roborovski_hamster_behavior.htm, Re-
trieved on: 09. Dec. 2006

[Pet 2005b] P ET, Website: Hamster. Version: 2005. http://www.

petwebsite.com/hamsters.asp, Retrieved on: 06. Dec. 2006

[Pet 2006a] P ET, Website: Colors. Dwarf Winter White Russian Hamsters
(Phodopus sungorus). Version: 2006. http://www.petwebsite.com/
hamsters/dwarf_winter_white_russian_hamsters_colors.htm, Re-
trieved on: 08. Dec. 2006

[Pet 2006b] P ET, Website: Colors of the roborovski hamsters.

Version: 2006. http://www.petwebsite.com/hamsters/roborovski_
hamster_colors.htm, Retrieved on: 06. Dec. 2006

[Resner 2001] R ESNER, Benjamin I.: Rover@Home: Computer Mediated

Remote Interaction Between Humans and Dogs, Massachusetts Institute
of Technology, Masterthesis, 2001. http://characters.media.mit.
edu/publications.html

[Selmer 1995] S ELMER, J. (Publisher): Expatriate Management: New Ideas

for International Business. Greenwood Publishing Group, 1995

[Serpell 1991] S ERPELL, J.: Beneficial effects of pet ownership on some

aspects of human health and behaviour. In: Journal of the Royal Society
of Medicine 12 (1991), Dec, Nr. 84, pp. 717–720

[Shaknarovich and Moghaddam 2004] S HAKNAROVICH, Gregory ;

M OGHADDAM, Baback: Face Recognition in Subspaces. In: J AIN,
A. K. (Publisher) ; L I, S. Z. (Publisher): Handbook of Face Recognition.
Springer, June 2004. – ISBN 038740595X, Chapter 7, pp. 141–168

[SPSS Inc. 2007] SPSS I NC .: SPSS for Windows. Version: 2007. http:
//www.spss.com/spss/, Retrieved on: 15. Jan. 2007

[Steger 2006] S TEGER, Carsten: Machine Vision Algorithms. In: H ORN -

BERG , Alexander (Publisher): Handbook of Machine Vision. Wiley-VCH,
2006. – ISBN 3527405844, Chapter 8, pp. 511–692

XXI
Appendix F Bibliography

[Stephan et al. 2003] S TEPHAN, C. ; W ESSELING, S. ; S CHINK, T. ; J UNG,

K.: Comparison of Eight Computer Programs for Receiver-Operating
Characteristic Analysis. In: Clinical Chemistry 49 (2003), Nr. 3, pp. 433–
439

[Turk and Pentland 1991] T URK, M. ; P ENTLAND, A.: Eigenfaces for Recog-
nition. In: Journal of Cognitive Neuroscience 3 (1991), Nr. 1, pp. 71–86

[Ullman 2000] U LLMAN, Shimon: High-Level Vision: Object Recognition and

Visual Cognition. The MIT Press, 2000. – ISBN 0262710072

[Viola and Jones 2002] V IOLA, Paul ; J ONES, Michael: Robust Real-time
Object Detection. In: International Journal of Computer Vision - to appear
(2002). citeseer.comp.nus.edu.sg/viola01robust.html

[Webb 2002] W EBB, Andrew R.: Statistical Pattern Recognition, 2nd Edition.
John Wiley & Sons, 2002. – ISBN 0470845147

[Zarit et al. 1999] Z ARIT, B. D. ; S UPER, B. J. ; Q UEK, F. K. H.: Comparison

of five color models in skin pixel classification. In: Recognition, Analy-
sis, and Tracking of Faces and Gestures in Real-Time Systems, 1999.
Proceedings. International Workshop on, 1999, pp. 58–63