Pattern Recognition: An Overview
Prof. Richard Zanibbi
(One) Definition
The identification of implicit objects, types, or relationships in raw
data by an animal or machine.
Common Problems
• What is it?
• Where is it?
• How is it constructed?
FIGURE 1.1. The objects to be classified are first sensed by a transducer (camera),
whose signals are preprocessed. Next the features are extracted and finally the
classification is emitted, here either “salmon” or “sea bass.” Although the information
flow is often chosen to be from the source to the classifier, some systems employ
information flow in which earlier levels of processing can be altered based on the
tentative or preliminary response in later levels (gray arrows). Yet others combine
two or more stages.
Designing a Classifier or Clustering Algorithm
Feature Selection
Choosing from the available features those to be used in our classification
model. Ideally, the chosen features are informative (discriminative) and
inexpensive to compute.
Feature Extraction
Computing features for inputs at run-time
Preprocessing
Used to reduce data complexity and/or variation, applied before
feature extraction to permit/simplify feature computations; sometimes
involves other PR algorithms (e.g. segmentation)
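As a concrete sketch of how these stages fit together for the fish example: everything below (the threshold, the weights, the function names) is an illustrative assumption, not the lecture's actual pipeline.

```python
import numpy as np

# Hypothetical preprocessing -> feature extraction -> classification
# pipeline for the fish example; all values here are invented.

def preprocess(image):
    """Reduce variation before feature extraction: normalize intensities
    to [0, 1] and threshold to segment fish from background."""
    image = (image - image.min()) / (image.max() - image.min() + 1e-9)
    mask = image > 0.5          # crude segmentation (itself a PR algorithm)
    return image, mask

def extract_features(image, mask):
    """Compute the two features used on the slides: lightness and width."""
    lightness = image[mask].mean()           # mean intensity of fish pixels
    cols = np.where(mask.any(axis=0))[0]     # columns containing fish pixels
    width = cols.max() - cols.min() + 1      # horizontal extent in pixels
    return np.array([lightness, width])

def classify(features, w=np.array([-1.0, 0.5]), b=-2.0):
    """Linear decision rule on the feature vector (weights are made up)."""
    return "sea bass" if features @ w + b > 0 else "salmon"

image = np.random.rand(64, 64)               # stand-in for a sensed image
print(classify(extract_features(*preprocess(image))))
```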
Types of Features
• Ordered (e.g. numeric values such as width, where comparisons are meaningful)
• Unordered (e.g. categorical values, where there is no natural ordering)
A Combination of Features: Lightness and Width

Feature Space
The feature space is now two-dimensional; each fish is described in the model
input by a feature vector (x1, x2) representing a point in this space.

[Figure: scatter of salmon and sea bass in the two-dimensional
(lightness, width) feature space.]

Decision Boundary
A linear discriminant (a line used to separate the classes) is shown as the
dark line; there are still some errors.

FIGURE 1.4. The two features of lightness and width for sea bass and salmon.
The dark line could serve as a decision boundary of our classifier. Overall
classification error on the data shown is lower than if we use only one feature
as in Fig. 1.3, but there will still be some errors. From: Richard O. Duda,
Peter E. Hart, and David G. Stork, Pattern Classification. Copyright © 2001
by John Wiley & Sons, Inc.

In general, determining appropriate features is a difficult problem, and
determining optimal features is often impractical or impossible (it requires
testing all feature combinations).
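To make the linear discriminant concrete, here is a minimal sketch (not from the lecture) that fits g(x) = w·x + b on two features with the perceptron rule; the data points and the test point are invented for illustration.

```python
import numpy as np

# Toy (lightness, width) data: invented for illustration only.
X = np.array([[2.0, 14.5], [3.0, 15.0], [4.0, 16.0],    # salmon
              [7.0, 19.5], [8.0, 20.5], [9.0, 21.0]])   # sea bass
y = np.array([-1, -1, -1, 1, 1, 1])                     # -1 salmon, +1 sea bass

# Perceptron: adjust (w, b) whenever a training point is misclassified.
w, b = np.zeros(2), 0.0
for _ in range(100):
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:       # point on wrong side of g(x) = 0
            w += yi * xi
            b += yi

# g(x) = w.x + b > 0 -> sea bass; the line g(x) = 0 is the decision boundary.
print(["salmon", "sea bass"][int(w @ np.array([5.0, 17.0]) + b > 0)])
```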
Classifier: A Formal Definition
[Figure 1.4 repeated: the (lightness, width) feature space with the dark line
as the decision boundary. From: Richard O. Duda, Peter E. Hart, and David G.
Stork, Pattern Classification. Copyright © 2001 by John Wiley & Sons, Inc.
Annotation from Kuncheva: visualizes changes (gradient) for the class score.]
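The definition itself did not survive extraction; the following is a standard reconstruction in Kuncheva's notation, offered as an assumption rather than the slide's verbatim content.

```latex
A classifier is any mapping
\[
  D : \mathbb{R}^n \to \Omega, \qquad \Omega = \{\omega_1, \ldots, \omega_c\},
\]
from the $n$-dimensional feature space to the set of $c$ class labels.
$D$ is commonly built from discriminant (score) functions
$g_i : \mathbb{R}^n \to \mathbb{R}$, one per class, with the decision rule
\[
  D(\mathbf{x}) = \omega_{i^*}, \qquad i^* = \arg\max_{1 \le i \le c} g_i(\mathbf{x}).
\]
```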
“Generative” Models vs. “Discriminative” Models
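The slide gives only the two headings; as an illustration of the distinction (assumed, not from the lecture), the sketch below classifies the same one-dimensional data both ways: the generative model fits a Gaussian p(x | class) per class and applies Bayes' rule, while the discriminative model fits p(class | x) directly with logistic regression.

```python
import numpy as np

rng = np.random.default_rng(2)
# Invented 1-D two-class data for illustration.
x0 = rng.normal(3.0, 1.0, 200)   # class 0 (e.g. salmon lightness)
x1 = rng.normal(7.0, 1.0, 200)   # class 1 (e.g. sea bass lightness)

# Generative: model p(x | class) for each class, classify with Bayes'
# rule assuming equal priors.
m0, s0 = x0.mean(), x0.std()
m1, s1 = x1.mean(), x1.std()
def log_gauss(x, m, s):
    return -0.5 * ((x - m) / s) ** 2 - np.log(s)
def generative(x):
    return int(log_gauss(x, m1, s1) > log_gauss(x, m0, s0))

# Discriminative: model p(class | x) directly with logistic regression
# trained by gradient descent (no class-conditional densities at all).
X = np.concatenate([x0, x1]); y = np.concatenate([np.zeros(200), np.ones(200)])
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1 / (1 + np.exp(-(w * X + b)))
    w -= 0.01 * np.mean((p - y) * X)
    b -= 0.01 * np.mean(p - y)
def discriminative(x):
    return int(w * x + b > 0)

print(generative(6.0), discriminative(6.0))   # both put x=6 in class 1
```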
FIGURE 1.5. Overly complex models for the fish will lead to decision boundaries
that are complicated. While such a decision may lead to perfect classification
of our training samples, it would lead to poor performance on future patterns.
The novel test point marked ? is evidently most likely a salmon, whereas the
complex decision boundary shown leads it to be classified as a sea bass. From:
Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification.
Copyright © 2001 by John Wiley & Sons, Inc.
Avoiding Over-Fitting
A Trade-off
We may need to accept more errors on our training set to produce fewer
errors on new data.
• We have to do this without “peeking at” (repeatedly evaluating) the test
set; otherwise we over-fit the test set instead
[Figure: the (lightness, width) scatter with a decision boundary of
intermediate complexity.]

FIGURE 1.6. The decision boundary shown might represent the optimal tradeoff
between performance on the training set and simplicity of classifier, thereby
giving the highest accuracy on new patterns. From: Richard O. Duda, Peter E.
Hart, and David G. Stork, Pattern Classification. Copyright © 2001 by John
Wiley & Sons, Inc.
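To make the trade-off concrete, here is a minimal sketch (assumed for these notes, not from the lecture): model complexity (polynomial degree) is chosen on a held-out validation set, and the test set is evaluated exactly once at the end. The data generation is invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented 1-D regression data: quadratic signal plus noise.
x = rng.uniform(-1, 1, 90)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.3, x.size)

# Split once: train / validation / test (test is used only at the end).
x_tr, x_va, x_te = x[:50], x[50:70], x[70:]
y_tr, y_va, y_te = y[:50], y[50:70], y[70:]

def mse(coeffs, xs, ys):
    return np.mean((np.polyval(coeffs, xs) - ys) ** 2)

# Higher-degree fits reduce training error but can hurt validation error.
best_deg, best_err = None, np.inf
for deg in range(1, 10):
    coeffs = np.polyfit(x_tr, y_tr, deg)
    err = mse(coeffs, x_va, y_va)   # "peek" at validation, never at test
    if err < best_err:
        best_deg, best_err = deg, err

coeffs = np.polyfit(x_tr, y_tr, best_deg)
print(f"chosen degree={best_deg}, test MSE={mse(coeffs, x_te, y_te):.3f}")
```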
Non-Hierarchical Clustering: all points are assigned to a cluster at each
iteration (see the k-means sketch below).
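k-means is the standard example of such a non-hierarchical (partitional) algorithm: on every iteration, all points are reassigned to their nearest cluster center, then the centers are recomputed. A minimal sketch; the data and the choice k = 2 are invented for illustration.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Non-hierarchical clustering: every point is (re)assigned to its
    nearest center on each iteration, then centers are recomputed."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign ALL points to the nearest center (squared Euclidean distance).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break                        # assignments have stabilized
        centers = new_centers
    return labels, centers

# Two invented blobs in the (lightness, width) feature space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([3, 15], 0.5, (20, 2)),
               rng.normal([8, 20], 0.5, (20, 2))])
labels, centers = kmeans(X, k=2)
print(centers.round(2))
```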