Fundamentals of PR
[Figure: Category “A” vs. Category “B”]
Classification vs Clustering
What is a Pattern?
• A pattern could be an object or event.
• Typically, represented by a vector x of numbers:
x = [x1, x2, ..., xn]^T
What is a Pattern? (cont’d)
• Loan/Credit card applications
– Income, # of dependents, mortgage amount → credit-worthiness classification
• Dating services
– Age, hobbies, income → “desirability” classification
• Web documents
– Key-word based descriptions (e.g., documents containing “football”, “NFL”) → document classification
What is a Class?
• A collection of “similar” objects.
Main Objectives
• Separate the data belonging to different classes.
• Given new data, assign them to the correct
category.
[Figure: gender classification example]
Main Approaches
x: input vector (pattern), ω: class label (class)
[Figure: two classes, ω1 and ω2]
• Generative
– Model the joint probability, p(x, ω).
– Make predictions by using Bayes’ rule to calculate p(ω|x) (see the sketch below).
– Pick the most likely class label ω.
• Discriminative
– No need to model p(x, ω).
– Estimate p(ω|x) by “learning” a direct mapping from x to ω (i.e., estimate the decision boundary).
– Pick the most likely class label ω.
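Below is a minimal numerical sketch of the generative route on a single feature. The Gaussian class-conditional densities, the priors, and all numbers are assumptions for illustration, not taken from the slides.

```python
# Sketch of the generative approach on a scalar feature x.
# The class-conditional densities p(x|w) and priors p(w) are assumed/illustrative.
from scipy.stats import norm

models = {
    "w1": (norm(loc=5.0, scale=1.0), 0.4),   # (p(x|w1), p(w1))
    "w2": (norm(loc=7.0, scale=1.5), 0.6),   # (p(x|w2), p(w2))
}

def posterior(x):
    """Bayes' rule: p(w|x) = p(x|w) p(w) / p(x)."""
    joint = {w: pdf.pdf(x) * prior for w, (pdf, prior) in models.items()}  # p(x, w)
    evidence = sum(joint.values())                                         # p(x)
    return {w: j / evidence for w, j in joint.items()}

x_new = 6.0
post = posterior(x_new)
print(post, "-> predict", max(post, key=post.get))   # pick the most likely class
```

A discriminative method would instead fit p(ω|x), or just the decision boundary, directly from labelled examples, without modelling p(x, ω).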
How do we model p(x, ω)?
• Typically, using a statistical model.
– probability density function (e.g., Gaussian)
[Figure: gender classification with density models for the “male” and “female” classes]
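A minimal sketch of this idea: fit a 1-D Gaussian p(x|ω) for each class by maximum likelihood. The use of height as the feature and all the numbers are assumptions for illustration.

```python
# Sketch: estimate a Gaussian p(x|w) per class from labelled samples.
# The height measurements below are invented for illustration.
import numpy as np

samples = {
    "male":   np.array([175.0, 180.0, 178.0, 183.0, 171.0]),
    "female": np.array([160.0, 165.0, 158.0, 170.0, 163.0]),
}

# Maximum-likelihood estimates of (mean, std) per class.
params = {w: (x.mean(), x.std(ddof=0)) for w, x in samples.items()}
print(params)
```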
Key Challenges
• Intra-class variability
• Inter-class similarity
Pattern Recognition
Applications
Handwriting Recognition
License Plate Recognition
Biometric Recognition
Fingerprint Classification
Face Detection
Autonomous Systems
Medical Applications
• Skin Cancer Detection
• Breast Cancer Detection
Land Cover Classification
(from aerial or satellite images)
Main Phases
[Figure: block diagram of the training phase and the test phase]
Complexity of PR – An Example
[Figure: camera viewing fish on a conveyor belt]
Problem: Sorting incoming fish on a conveyor belt.
Assumption: Two kinds of fish:
(1) sea bass
(2) salmon
Sensors
• Sensing:
– Use a sensor (camera or microphone) for data
capture.
– PR performance depends on the bandwidth, resolution, sensitivity, and distortion of the sensor.
Preprocessing
Training/Test data
• How do we know that we have collected an
adequately large and representative set of
examples for training/testing the system?
Training set?
Test set?
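There is no universal rule; one common convention, assumed here only for illustration, is to hold out part of the labelled data as a test set, e.g. a random 80/20 split of a synthetic dataset:

```python
# Sketch: random 80/20 split of a (synthetic) labelled dataset into
# a training set and a test set. The proportions are a common convention.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))          # 100 patterns, 2 features each
y = rng.integers(0, 2, size=100)       # binary class labels

idx = rng.permutation(len(X))
cut = int(0.8 * len(X))
train_idx, test_idx = idx[:cut], idx[cut:]
X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
print(len(X_train), "training patterns,", len(X_test), "test patterns")
```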
Feature Extraction
• How to choose a good set of features?
– Discriminative features
[Figures: decision thresholds l* and x* on single features; feature space with x1 = lightness and x2 = width]
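As a sketch of the single-feature rule suggested by the figure, the classifier below compares lightness x1 against a threshold x*. The threshold value and which class lies on which side of it are assumptions for illustration.

```python
# Sketch: one-feature classifier using a lightness threshold x*.
# The value of x_star and the class assigned to each side are assumed.
def classify_by_lightness(x1, x_star=5.0):
    return "sea bass" if x1 > x_star else "salmon"

print(classify_by_lightness(6.2))   # -> 'sea bass'
print(classify_by_lightness(3.8))   # -> 'salmon'
```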
How Many Features?
• Does adding more features always improve
performance?
– It might be difficult and computationally
expensive to extract certain features.
– Correlated features might not improve
performance (i.e. redundancy).
– “Curse” of dimensionality.
Curse of Dimensionality
• Adding too many features can, paradoxically, lead to a
worsening of performance.
– Divide each of the input features into a number of intervals, so that the value of a feature can be specified approximately by saying in which interval it lies.
– With M intervals per feature and d features, the number of cells grows as M^d, and each cell needs enough training examples; the amount of data required therefore grows exponentially with the dimensionality (see the sketch below).
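A small sketch of the resulting blow-up in the number of cells; M = 10 intervals per feature is an arbitrary choice.

```python
# Sketch: with M intervals per feature and d features, the number of cells
# (each of which needs training examples) grows as M**d. M is arbitrary here.
M = 10
for d in (1, 2, 3, 5, 10):
    print(f"d = {d:2d} features -> {M**d:,} cells")
```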
Missing Features
• Certain features might be missing (e.g., due
to occlusion).
• How should we train the classifier with
missing features?
• How should the classifier make the best
decision with missing features?
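One simple workaround, sketched below, is to impute a missing value with the mean of the observed values for that feature; this is only a heuristic, and more principled treatments marginalize over the missing feature.

```python
# Sketch: mean imputation of a missing feature value (NaN marks the gap).
# This is one simple heuristic, not the only or the best way to handle it.
import numpy as np

X = np.array([[5.1, 2.0],
              [6.3, np.nan],     # second feature missing (e.g., due to occlusion)
              [5.8, 2.4]])

col_means = np.nanmean(X, axis=0)               # per-feature mean of observed values
X_filled = np.where(np.isnan(X), col_means, X)  # fill gaps with those means
print(X_filled)
```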
Classification
• Partition the feature space into two regions by
finding the decision boundary that minimizes the
error.
– An overly complex decision boundary may overfit the training data.
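As one illustrative way to obtain such a boundary (the slides do not commit to a particular classifier), the sketch below assigns a pattern to the class with the nearest training mean, which gives a linear boundary between two synthetic classes.

```python
# Sketch: minimum-distance (nearest class mean) classifier on synthetic 2-D data.
# The data and class locations are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
class_a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))
class_b = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(50, 2))
means = {"A": class_a.mean(axis=0), "B": class_b.mean(axis=0)}

def classify(x):
    # The decision boundary is the set of points equidistant from the two means.
    return min(means, key=lambda w: np.linalg.norm(x - means[w]))

print(classify(np.array([2.5, 2.0])))   # most likely 'B'
```

A more flexible boundary can reduce the training error further, but, as noted above, too much flexibility leads to overfitting.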
Understanding model complexity:
function approximation
• Approximate a function from a set of noisy samples.
o The green curve is the true function.
o Ten noisy sample points are shown by the blue circles.
Understanding model complexity:
function approximation (cont’d)
Polynomial curve fitting: polynomials having various
orders, shown as red curves, fitted to the set of 10
sample points.
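A sketch of this experiment, assuming (as is common for this example) that the true function is sin(2πx) and the samples carry Gaussian noise:

```python
# Sketch: fit polynomials of increasing order to 10 noisy samples of a
# known function. sin(2*pi*x) and the noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

for order in (0, 1, 3, 9):
    coeffs = np.polyfit(x, t, order)                      # least-squares polynomial fit
    rms = np.sqrt(np.mean((np.polyval(coeffs, x) - t) ** 2))
    print(f"order {order}: training RMS error = {rms:.3f}")
```

The order-9 polynomial can drive the training error to essentially zero while oscillating between the samples, which is the overfitting behaviour the red curves illustrate.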
Understanding model complexity:
function approximation (cont’d)
• More data can improve model estimation
Improve Classification Performance
through Post-processing
Improve Classification Performance
through Ensembles of Classifiers
• Performance can be
improved using a
"pool" of classifiers.
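A minimal sketch of one combination scheme, majority voting; the three base classifiers are arbitrary threshold rules standing in for whatever pool is available.

```python
# Sketch: combine a pool of classifiers by majority vote.
# The base classifiers below are arbitrary threshold rules for illustration.
from collections import Counter

def majority_vote(pool, x):
    votes = [clf(x) for clf in pool]
    return Counter(votes).most_common(1)[0][0]

pool = [
    lambda x: "salmon" if x < 5.0 else "sea bass",
    lambda x: "salmon" if x < 6.0 else "sea bass",
    lambda x: "salmon" if x < 4.0 else "sea bass",
]

print(majority_vote(pool, 5.5))   # two of the three vote 'sea bass'
```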
Cost of misclassifications
• Consider the fish classification example.
• There are two possible classification errors:
(1) Deciding the fish was a sea bass when it was a
salmon.
(2) Deciding the fish was a salmon when it was a sea
bass.
• Are both errors equally important?
Cost of misclassifications (cont’d)
• Suppose that:
– Customers who buy salmon will object vigorously if
they see sea bass in their cans.
– Customers who buy sea bass will not be unhappy if
they occasionally see some expensive salmon in
their cans.
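Under such asymmetric costs, a sensible classifier minimizes expected cost rather than error rate. The sketch below uses an invented loss matrix and posterior to show that the cheapest decision need not be the most probable class.

```python
# Sketch: minimum-expected-cost decision with an asymmetric loss matrix.
# The loss values and the posterior probabilities are invented for illustration.
losses = {
    ("decide salmon",   "sea bass"): 10.0,  # sea bass in a salmon can: very costly
    ("decide salmon",   "salmon"):    0.0,
    ("decide sea bass", "salmon"):    1.0,  # salmon in a sea bass can: mildly costly
    ("decide sea bass", "sea bass"):  0.0,
}

def min_cost_decision(posterior):
    """posterior: dict mapping true class -> p(class | x)."""
    def expected_cost(action):
        return sum(losses[(action, w)] * p for w, p in posterior.items())
    return min(("decide salmon", "decide sea bass"), key=expected_cost)

# Although 'salmon' is more probable, deciding salmon has expected cost
# 10 * 0.3 = 3.0, versus 1 * 0.7 = 0.7 for deciding sea bass.
print(min_cost_decision({"salmon": 0.7, "sea bass": 0.3}))   # -> 'decide sea bass'
```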
Computational Complexity
• How does an algorithm scale with the
number of:
– features
– patterns
– categories
• Need to consider tradeoffs between
computational complexity and performance.
Would it be possible to build a
“general purpose” PR system?
• It would be very difficult to design a system that is
capable of performing a variety of classification
tasks.
– Different problems require different features.
– Different features might yield different solutions.
– Different tradeoffs exist for different problems.
Thank you