
Pattern Recognition and Classification
An Introduction

Geoff Dougherty
Applied Physics and Medical Imaging
California State University, Channel Islands
Camarillo, CA, USA

Please note that additional material for this book can be downloaded from
http://extras.springer.com
ISBN 978-1-4614-5322-2 ISBN 978-1-4614-5323-9 (eBook)
DOI 10.1007/978-1-4614-5323-9
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2012949108

© Springer Science+Business Media New York 2013


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or
information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts
in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being
entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication
of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from
Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center.
Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

The use of pattern recognition and classification is fundamental to many of the automated electronic systems in use today. Its applications range from military defense to medical diagnosis, from biometrics to machine learning, from bioinformatics to home entertainment, and more. However, despite the existence of a number of notable books in the field, the subject remains very challenging, especially for the beginner.
We have found that the current textbooks are not completely satisfactory for our students, who are primarily computer science students but also include students from mathematics and physics backgrounds and those from industry. Their mathematical and computer backgrounds are considerably varied, but they all want to understand and absorb the core concepts with a minimal time investment, to the point where they can use and adapt them to problems in their own fields. Texts with extensive mathematical or statistical prerequisites were daunting and unappealing to them. Our students complained of “not seeing the wood for the trees,” which is rather ironic for textbooks in pattern recognition. It is crucial for newcomers to the field to be introduced to the key concepts at a basic level in an ordered, logical fashion, so that they appreciate the “big picture”; they can then handle progressively more detail, building on prior knowledge, without being overwhelmed. Too often our students have dipped into various textbooks to sample different approaches, but have ended up confused by the different terminologies in use.
We have noticed that the majority of our students are very comfortable with and respond well to visual learning, building on their often limited entry knowledge but focusing on key concepts illustrated by practical examples and exercises. We believe that a more visual presentation and the inclusion of worked examples promote a greater understanding and insight and appeal to a wider audience.
This book began as notes and lecture slides for a senior undergraduate course and a graduate course in Pattern Recognition at California State University Channel Islands (CSUCI). Over time it grew and approached its current form, which has been class-tested over several years at CSUCI. It is suitable for a wide range of students at the advanced undergraduate or graduate level. It assumes only a modest background in statistics and mathematics, with the necessary additional material integrated into the text so that the book is essentially self-contained.
The book is suitable both for individual study and for classroom use for students in physics, computer science, computer engineering, electronic engineering, biomedical engineering, and applied mathematics taking senior undergraduate and graduate courses in pattern recognition and machine learning. It presents a comprehensive introduction to the core concepts that must be understood in order to make independent contributions to the field. It is designed to be accessible to newcomers from varied backgrounds, but it will also be useful to researchers and professionals in image and signal processing and analysis, and in computer vision. The goal is to present the fundamental concepts of supervised and unsupervised classification in an informal, rather than axiomatic, treatment so that the reader can quickly acquire the necessary background for applying the concepts to real problems. A final chapter indicates some useful and accessible projects which may be undertaken.
We use ImageJ (http://rsbweb.nih.gov/ij/) and the related distribution, Fiji (http://fiji.sc/wiki/index.php/Fiji), in the early stages of image exploration and analysis because of their intuitive interfaces and ease of use. We then tend to move on to MATLAB for its extensive capabilities in manipulating matrices and for its image processing and statistics toolboxes. We recommend using an attractive GUI called DipImage (from http://www.diplib.org/download) to avoid much of the command-line typing when manipulating images. There are also classification toolboxes available for MATLAB, such as the Classification Toolbox (http://www.wiley.com/WileyCDA/Section/id-105036.html, which requires a password obtainable from the associated computer manual) and PRTools (http://www.prtools.org/download.html). We use the Classification Toolbox in Chap. 8 and recommend it highly for its intuitive GUI. Some of our students have explored Weka, an open-source collection of machine learning algorithms for solving data mining problems, implemented in Java (http://www.cs.waikato.ac.nz/ml/weka/index_downloading.html).
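
As a minimal sketch of the MATLAB stage of this workflow (assuming only that the Image Processing Toolbox is installed; 'blobs.png' is a hypothetical example image, not a file bundled with the book), the following lines threshold an image and extract simple shape features of the sort used for classification in Chap. 2:

  % Minimal sketch, assuming the Image Processing Toolbox is available.
  % 'blobs.png' is a hypothetical example image.
  img = imread('blobs.png');
  if size(img, 3) == 3
      img = rgb2gray(img);              % convert RGB to grayscale
  end
  bw = im2bw(img, graythresh(img));     % global (Otsu) threshold
  bw = bwareaopen(bw, 50);              % discard objects smaller than 50 pixels
  stats = regionprops(bw, 'Area', 'Perimeter', 'Eccentricity');
  % A simple shape feature: circularity = 4*pi*Area/Perimeter^2
  circularity = 4*pi*[stats.Area] ./ ([stats.Perimeter].^2);

The same kind of exploration can, of course, be carried out interactively in ImageJ/Fiji before committing to a MATLAB script.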
There are a number of additional resources, which can be downloaded from the companion Web site for this book at http://extras.springer.com/, including several useful Excel files and data files. Lecturers who adopt the book can also obtain access to the end-of-chapter exercises.
In spite of our best efforts at proofreading, it is still possible that some typos may have survived. Please notify me if you find any.
I have very much enjoyed writing this book; I hope you enjoy reading it!

Camarillo, CA Geoff Dougherty


Acknowledgments

I would like to thank my colleague Matthew Wiers for many useful conversations and for helping with several of the Excel files bundled with the book. Thanks also to all my previous students for their feedback on the courses which eventually led to this book, especially Brandon Ausmus, Elisabeth Perkins, Michelle Moeller, Charles Walden, Shawn Richardson, and Ray Alfano.
I am grateful to Chris Coughlin at Springer for his support and encouragement throughout the process of writing the book, and to various anonymous reviewers who have critiqued the manuscript and trialed it with their classes. Special thanks go to my wife Hajijah and family (Daniel, Adeline, and Nadia) for their patience and support, and to my parents, Maud and Harry (who passed away in 2009), without whom this would never have happened.

Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Organization of the Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 The Classification Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Training and Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Supervised Learning and Algorithm Selection . . . . . . . . . . . . . . 17
2.5 Approaches to Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.6.1 Classification by Shape . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.6.2 Classification by Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6.3 More Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6.4 Classification of Letters . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Nonmetric Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Decision Tree Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.1 Information, Entropy, and Impurity . . . . . . . . . . . . . . . . 29
3.2.2 Information Gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.3 Decision Tree Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.4 Strengths and Weaknesses . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Rule-Based Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 Other Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41


4 Statistical Pattern Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.1 Measured Data and Measurement Errors . . . . . . . . . . . . . . . . . . 43
4.2 Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.2.1 Simple Probability Theory . . . . . . . . . . . . . . . . . . . . . . . 43
4.2.2 Conditional Probability and Bayes’ Rule . . . . . . . . . . . . . 46
4.2.3 Naïve Bayes Classifier . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3 Continuous Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3.1 The Multivariate Gaussian . . . . . . . . . . . . . . . . . . . . . . . 57
4.3.2 The Covariance Matrix . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.3 The Mahalanobis Distance . . . . . . . . . . . . . . . . . . . . . . . 69
4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.1 Parametric and Non-parametric Learning . . . . . . . . . . . . . . . . . . 75
5.2 Parametric Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2.1 Bayesian Decision Theory . . . . . . . . . . . . . . . . . . . . . . . 75
5.2.2 Discriminant Functions and Decision Boundaries . . . . . . 87
5.2.3 MAP (Maximum A Posteriori) Estimator . . . . . . . . . . . . 94
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6 Nonparametric Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1 Histogram Estimator and Parzen Windows . . . . . . . . . . . . . . . . . 99
6.2 k-Nearest Neighbor (k-NN) Classification . . . . . . . . . . . . . . . . . 100
6.3 Artificial Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4 Kernel Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7 Feature Extraction and Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.1 Reducing Dimensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.1.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.2 Feature Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.2.1 Inter/Intraclass Distance . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.2.2 Subset Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
7.3 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.3.1 Principal Component Analysis . . . . . . . . . . . . . . . . . . . . 127
7.3.2 Linear Discriminant Analysis . . . . . . . . . . . . . . . . . . . . . 135
7.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

8 Unsupervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.1 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.2 k-Means Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.2.1 Fuzzy c-Means Clustering . . . . . . . . . . . . . . . . . . . . . 148
8.3 (Agglomerative) Hierarchical Clustering . . . . . . . . . . . . . . . . . 150
8.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
9 Estimating and Comparing Classifiers . . . . . . . . . . . . . . . . . . . . . . 157
9.1 Comparing Classifiers and the No Free Lunch Theorem . . . . . . 157
9.1.1 Bias and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
9.2 Cross-Validation and Resampling Methods . . . . . . . . . . . . . . . 160
9.2.1 The Holdout Method . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.2.2 k-Fold Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . 162
9.2.3 Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
9.3 Measuring Classifier Performance . . . . . . . . . . . . . . . . . . . . . . 164
9.4 Comparing Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
9.4.1 ROC Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
9.4.2 McNemar’s Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
9.4.3 Other Statistical Tests . . . . . . . . . . . . . . . . . . . . . . . . 169
9.4.4 The Classification Toolbox . . . . . . . . . . . . . . . . . . . . . 171
9.5 Combining Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
10 Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
10.1 Retinal Tortuosity as an Indicator of Disease . . . . . . . . . . . . . . 177
10.2 Segmentation by Texture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
10.3 Biometric Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
10.3.1 Fingerprint Recognition . . . . . . . . . . . . . . . . . . . . . . . 184
10.3.2 Face Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
