CS434a/541a: Pattern Recognition Prof. Olga Veksler

This lecture introduced the concepts of pattern recognition. It discussed what pattern recognition is, some applications like character recognition and medical diagnostics, and outlined the typical structure of a pattern recognition system. It used a toy example of classifying fish into salmon and sea bass to illustrate the design process, including collecting training data, extracting discriminative features, designing a classifier, and testing it on new data. Overfitting and the importance of generalization were also covered.


CS434a/541a: Pattern Recognition

Prof. Olga Veksler

Lecture 1

1
Outline of the lecture

Syllabus
Introduction to Pattern Recognition
Review of Probability/Statistics

2
Syllabus
Prerequisite
Analysis of algorithms (CS 340a/b)
First-year course in Calculus
Introductory Statistics (Stats 222a/b or equivalent) - will review
Linear Algebra (040a/b)
Grading
Midterm 30%
Assignments 30%
Final Project 40%
3
Syllabus
Assignments
bi-weekly
theoretical or programming in Matlab or C
no extensive programming
may include extra credit work
may discuss but work individually
due at the beginning of class
Midterm
open anything
roughly on November 8
4
Syllabus
Final project
Choose from the list of topics or design your own
May work in a group of 2, in which case the project is expected to be more extensive
5 to 8 page report
proposals due roughly November 1
due December 8

5
Intro to Pattern Recognition

Outline
What is pattern recognition?
Some applications
Our toy example
Structure of a pattern recognition system
Design stages of a pattern recognition system

6
What is Pattern Recognition ?

Informally
Recognize patterns in data
More formally
Assign an object or an event to one of several pre-specified categories (a category is usually called a class)

[Example images: a tea cup, a face, a phone]
7
Application: male or female?
[Diagram: objects (pictures) -> perfect PR system -> classes: male / female]

8
Application: photograph or not?
[Diagram: objects (pictures) -> perfect PR system -> classes: photo / not photo]

9
Application: Character Recognition

[Diagram: object (an image of handwritten text) -> perfect PR system -> "hello world"]

In this case, the classes are all possible characters: a, b, c, ..., z

10
Application: Medical diagnostics
[Diagram: objects (tumors) -> perfect PR system -> classes: cancer / not cancer]

11
Application: speech understanding

[Diagram: object (acoustic signal) -> perfect PR system -> phonemes: re-kig-'ni-sh&n]

In this case, the classes are all phonemes

12
Application: Loan applications
objects (people), described by features; classes: approve / deny

                 income   debt    married  age
  John Smith    200,000       0   yes       80
  Peter White    60,000   1,000   no        30
  Ann Clark     100,000  10,000   yes       40
  Susan Ho            0  20,000   no        25

13
Our Toy Application: fish sorting
[Diagram: a camera over the conveyor belt captures a fish image; the classifier determines the fish species; the sorting chamber routes salmon and sea bass into separate bins]

14
How to design a PR system?
Collect data (training data) and classify by hand
[Example training images, labeled by hand: salmon, sea bass, salmon, salmon, sea bass, sea bass]

Preprocess by segmenting fish from background

Extract possibly discriminating features


length, lightness, width, number of fins, etc.
Classifier design
Choose model
Train classifier on part of collected data (training data)
Test classifier on the rest of collected data (test data)
i.e. the data not used for training
Should classify new data (new fish images) well
15
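The train/test split in the last two steps can be illustrated with a minimal sketch; this is not part of the lecture, and the features, labels, and split ratio below are invented for illustration.

```python
import random

# Hand-labeled data: (features, label) pairs, e.g. ([length, lightness], "salmon").
# All numbers here are invented for illustration.
data = [([4.0, 1.2], "salmon"), ([11.0, 4.1], "sea bass"), ([5.5, 1.8], "salmon"),
        ([9.0, 3.6], "sea bass"), ([3.5, 1.0], "salmon"), ([12.0, 4.5], "sea bass")]

random.seed(0)
random.shuffle(data)

# Hold out part of the collected data for testing (here roughly one third).
n_test = len(data) // 3
test_data, train_data = data[:n_test], data[n_test:]

# ... train a classifier on train_data only, then estimate its error on test_data,
# i.e. on data that was not used for training.
```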
Classifier design
Notice salmon tends to be shorter than sea bass
Use fish length as the discriminating feature
Count number of bass and salmon of each length
length:  2  4   8  10  12  14
bass:    0  1   3   8  10   5
salmon:  2  5  10   5   1   0

[Histogram: counts of salmon and sea bass at each length]
16
Fish length as discriminating feature
Find the best length threshold L:
  fish length < L: classify as salmon
  fish length > L: classify as sea bass

For example, at L = 5, misclassified:


1 sea bass
16 salmon

length:  2  4   8  10  12  14
bass:    0  1   3   8  10   5
salmon:  2  5  10   5   1   0

error = 17/50 = 34%
17
Fish Length as discriminating feature
[Histogram of counts by length: fish to the left of the threshold are classified as salmon, fish to the right as sea bass]

After searching through all possible thresholds L, the best is L = 9, and still 20% of the fish are misclassified
18
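The threshold search on this slide is easy to reproduce. Below is a minimal sketch using the counts from the table on the previous slides; it recovers the 34% error at L = 5 and the best threshold L = 9 with 20% error.

```python
lengths = [2, 4, 8, 10, 12, 14]
bass    = [0, 1, 3,  8, 10,  5]     # counts of sea bass at each length
salmon  = [2, 5, 10, 5,  1,  0]     # counts of salmon at each length
total   = sum(bass) + sum(salmon)   # 50 fish in the training data

def error_at(L):
    """Error rate of the rule: length < L -> salmon, length >= L -> sea bass."""
    wrong_bass   = sum(b for l, b in zip(lengths, bass)   if l < L)   # bass called salmon
    wrong_salmon = sum(s for l, s in zip(lengths, salmon) if l >= L)  # salmon called bass
    return (wrong_bass + wrong_salmon) / total

print(error_at(5))                        # 0.34, i.e. 17/50 as on the previous slide
best_L = min(range(2, 16), key=error_at)
print(best_L, error_at(best_L))           # 9, 0.2  (20% error)
```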
Next Step

Lesson learned:
Length is a poor feature alone!
What to do?
Try another feature
Salmon tends to be lighter
Try average fish lightness

19
Fish lightness as discriminating feature
lightness:  1   2  3   4   5
bass:       0   1  2  10  12
salmon:     6  10  6   1   0

[Histogram: counts of salmon and sea bass at each lightness value]

Now the fish are well separated at a lightness threshold of 3.5, with a classification error of 8%
20
Can do even better by combining features
Use both length and lightness features
Feature vector [length, lightness]

[Plot in the (length, lightness) feature space: a decision boundary separates the decision regions for salmon and sea bass]

21
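As a rough sketch of what a classifier over the combined feature vector [length, lightness] might look like, here is a simple linear decision boundary. The weights, bias, and example feature values are made up for illustration; in practice they would be learned from the training data.

```python
import numpy as np

def classify(length, lightness, w=np.array([0.3, 1.0]), b=-4.0):
    """Linear decision boundary w . [length, lightness] + b = 0
    (coefficients are illustrative only, not learned)."""
    score = w @ np.array([length, lightness]) + b
    return "sea bass" if score > 0 else "salmon"

print(classify(4.0, 1.2))    # short fish with low lightness value  -> salmon
print(classify(11.0, 4.1))   # long fish with high lightness value  -> sea bass
```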
Better decision boundary
[Plot: a more complex decision boundary in the (length, lightness) space]

Ideal decision boundary, 0% classification error


22
Test Classifier on New Data
Classifier should perform well on new data
Test “ideal” classifier on new data: 25% error
[Plot: the same "ideal" boundary applied to new data in the (length, lightness) space]

23
What Went Wrong?
[Plot callout: the complicated decision boundary]
Poor generalization

Complicated boundaries do not generalize well to new data; they are too "tuned" to the particular training data rather than to some true model that separates salmon from sea bass well.
This is called overfitting the data

24
Generalization
[Plots: the simpler decision boundary on the training data (left) and on the testing data (right)]

A simpler decision boundary does not perform ideally on the training data but generalizes better to new data
Favor simpler classifiers
William of Occam (1284-1347): "entities are not to be multiplied without necessity"
25
Pattern Recognition System Structure
input
sensing (domain dependent)
  camera, microphones, medical imaging devices, etc.
segmentation
  Patterns should be well separated and should not overlap.
feature extraction
  Extract discriminating features. Good features make the work of the classifier easy.
classification
  Use features to assign the object to a category. A better classifier makes feature extraction easier. Our main topic in this course.
post-processing
  Exploit context (input-dependent information) to improve system performance, e.g. correcting "Tne cat" to "The cat".
decision
26
How to design a PR system?
[Design cycle flowchart:]
start
collect data
choose features   (informed by prior knowledge)
choose model
train classifier
evaluate classifier
end
27
Design Cycle cont.
Collect Data
  Can be quite costly
  How do we know when we have collected an adequately representative set of testing and training examples?
28
Design Cycle cont.
Choose features
  Should be discriminating, i.e. similar for objects in the same category, different for objects in different categories
  [Illustration: examples of good features vs. bad features]
  Prior knowledge plays a great role (domain dependent)
  Easy to extract
  Insensitive to noise and irrelevant transformations
29
Design Cycle cont.
Choose model
  What type of classifier to use?
  When should we try to reject one model and try another one?
  What is the best classifier for the problem?
30
Design Cycle cont.
Train classifier
  Process of using data to determine the parameters of the classifier
  Change parameters of the chosen model so that the model fits the collected data
  Many different procedures for training classifiers
  Main scope of the course
31
Design Cycle cont.
Evaluate Classifier
  Measure system performance
  Identify the need for improvements in system components
  How to adjust the complexity of the model to avoid overfitting? Any principled methods to do this?
  Trade-off between computational complexity and performance
32
Conclusion

useful
a lot of exciting and important applications
but hard
must solve many issues for a successful pattern recognition system

33
Review: mostly probability and some statistics

34
Content
Probability
Axioms and properties
Conditional probability and independence
Law of Total probability and Bayes theorem
Random Variables
Discrete
Continuous
Pairs of Random Variables
Random Vectors
Gaussian Random Variable
35
Basics
We are performing a random experiment (catching one fish from the sea)
S: all fish in the sea

event A

total number of events: 2^12 (for the 12 fish shown)

probability: a function P from all events in S to numbers, A -> P(A)
36
Axioms of Probability

1. P(A) ≥ 0
2. P(S) = 1
3. If A ∩ B = ∅ then P(A ∪ B) = P(A) + P(B)

37
Properties of Probability
P(∅) = 0

P(A) ≤ 1

P(Aᶜ) = 1 − P(A)

A ⊂ B  ⟹  P(A) ≤ P(B)

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

If Ai ∩ Aj = ∅ for all i ≠ j, then P(A1 ∪ A2 ∪ … ∪ AN) = P(A1) + P(A2) + … + P(AN)
38
Conditional Probability
If A and B are two events, and we know that event B has occurred, then (if P(B) > 0)

P(A|B) = P(A ∩ B) / P(B)

[Venn diagram: once B has occurred, A can only occur through A ∩ B]

multiplication rule: P(A ∩ B) = P(A|B) P(B)
39
Independence
A and B are independent events if
P(A ∩ B) = P(A) P(B)

By the law of conditional probability, if A and B are independent:

P(A|B) = P(A) P(B) / P(B) = P(A)

If two events are not independent, then they are said to be dependent
40
Law of Total Probability
B1, B2, …, Bn partition S
Consider an event A
[Venn diagram: S partitioned into B1, B2, B3, B4, each intersecting A]

A = (A ∩ B1) ∪ (A ∩ B2) ∪ (A ∩ B3) ∪ (A ∩ B4)
Thus P(A) = P(A ∩ B1) + P(A ∩ B2) + P(A ∩ B3) + P(A ∩ B4)
Or, using the multiplication rule:
P(A) = P(A | B1) P(B1) + … + P(A | B4) P(B4)

In general, P(A) = Σ_{k=1..n} P(A | Bk) P(Bk)
41
Bayes Theorem
Let B1, B2, …, Bn be a partition of the sample space S. Suppose event A occurs. What is the probability of event Bi?
Answer: Bayes Rule

P(Bi | A) = P(Bi ∩ A) / P(A) = P(A | Bi) P(Bi) / ( Σ_{k=1..n} P(A | Bk) P(Bk) )
42
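A small numeric sketch of the last two slides, with invented numbers: the classes Bk are salmon and sea bass, and A is an observed event such as "the fish looks light".

```python
# Priors P(B_k): fraction of each species in the sea (invented numbers).
P_B = {"salmon": 0.4, "sea bass": 0.6}
# Class-conditional probabilities P(A | B_k) that the fish looks light (also invented).
P_A_given_B = {"salmon": 0.8, "sea bass": 0.1}

# Law of total probability: P(A) = sum_k P(A | B_k) P(B_k)
P_A = sum(P_A_given_B[k] * P_B[k] for k in P_B)

# Bayes rule: P(B_k | A) = P(A | B_k) P(B_k) / P(A)
posterior = {k: P_A_given_B[k] * P_B[k] / P_A for k in P_B}
print(P_A)        # 0.38
print(posterior)  # salmon ~0.842, sea bass ~0.158
```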
Random Variables
In a random experiment, we usually assign some number to the outcome, for example the number of fish fins
A random variable X is a function from the sample space S to the real numbers
[Illustration: each fish (outcome) is mapped to its number of fins]

X is random due to the randomness of its argument:
P(X = a) = P(X(ω) = a) = P({ω ∈ Ω : X(ω) = a})
43
Two Types of Random Variables

Discrete random variable: has a countable number of values
  e.g. number of fish fins (0, 1, 2, …, 30)

Continuous random variable: takes values in a continuous range
  e.g. fish weight (any real number between 0 and 100)
44
Cumulative Distribution Function
Given a random variable X, the CDF is defined as
F(a) = P(X ≤ a)

[Plot: an example CDF]
45
Properties of the CDF F(a) = P(X ≤ a):

1. F(a) is non-decreasing
2. lim_{b→∞} F(b) = 1
3. lim_{b→−∞} F(b) = 0

Questions about X can be asked in terms of the CDF:
P(a < X ≤ b) = F(b) − F(a)

Example: P(20 < X ≤ 30) = F(30) − F(20)
46
Discrete RV: Probability Mass Function
Given a discrete random variable X, we define the probability mass function as
p(a) = P(X = a)
It satisfies all the axioms of probability

The CDF in the discrete case satisfies
F(a) = P(X ≤ a) = Σ_{x ≤ a} P(X = x) = Σ_{x ≤ a} p(x)

47
Continuous RV: Probability Density Function

Given a continuous RV X, we say f(x) is its probability density function if
F(a) = P(X ≤ a) = ∫_{−∞}^{a} f(x) dx

and, more generally, P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx

48
Properties of Probability Density Function

d/dx F(x) = f(x)

P(X = a) = ∫_{a}^{a} f(x) dx = 0

P(−∞ ≤ X ≤ ∞) = ∫_{−∞}^{∞} f(x) dx = 1

f(x) ≥ 0
49
Probability mass (pmf) vs. probability density (pdf)

[Plots: a pmf over the number of fins, with p(2) = 0.3 and p(3) = 0.4, and a pdf over fish weight, with f(30) = 0.6]

pmf: take sums
  P(fish has 2 or 3 fins) = p(2) + p(3) = 0.3 + 0.4

pdf: f is a true probability density, not a probability: integrate
  P(fish weighs 30 kg) ≠ 0.6; in fact P(fish weighs exactly 30 kg) = 0
  P(fish weighs between 29 and 31 kg) = ∫_{29}^{31} f(x) dx
50
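A minimal sketch of the two computations: summing a pmf in the discrete case and numerically integrating a density in the continuous case. The pmf values not shown on the slide, and the Gaussian density assumed for fish weight, are illustrative assumptions.

```python
import numpy as np

# Discrete: pmf over the number of fins. p(2) = 0.3 and p(3) = 0.4 are from the
# slide; the remaining values are assumed so that the pmf sums to 1.
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.1, 5: 0.1}
print(pmf[2] + pmf[3])                     # P(2 or 3 fins) = 0.7

# Continuous: an assumed Gaussian density for fish weight (mean 30 kg, std 2 kg).
def f(x, mu=30.0, sigma=2.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

xs = np.linspace(29, 31, 2001)
print(np.sum(f(xs)) * (xs[1] - xs[0]))     # P(29 <= weight <= 31), roughly 0.38
# P(weight is exactly 30 kg) is 0, even though f(30) > 0 for this density.
```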
Expected Value
Useful characterization of a r.v.
Also known as the mean, expectation, or first moment

discrete case:   µ = E(X) = Σ_{∀x} x p(x)
continuous case: µ = E(X) = ∫_{−∞}^{∞} x f(x) dx

Expectation can be thought of as the average or the center, or the expected average outcome over many experiments
51
Expected Value for Functions of X
Let g(x) be a function of the r.v. X. Then

discrete case:   E[g(X)] = Σ_{∀x} g(x) p(x)
continuous case: E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx

An important function of X: [X − E(X)]²
Variance: var(X) = E[(X − E(X))²] = σ²
Variance measures the spread around the mean
Standard deviation = [var(X)]^{1/2}, which has the same units as the r.v. X
52
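A small sketch computing the mean, variance, and standard deviation directly from these definitions, using the same illustrative pmf over fin counts as before (an assumption, not data from the lecture).

```python
import math

pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.1, 5: 0.1}    # assumed pmf, sums to 1

mean = sum(x * p for x, p in pmf.items())                 # E(X)
var  = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - E(X))^2]
std  = math.sqrt(var)                                     # same units as X
print(mean, var, std)   # 2.8, 1.16, ~1.077
```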
Properties of Expectation
If X is a constant r.v., X = c, then E(X) = c

If a and b are constants, E(aX + b) = a E(X) + b

More generally,
E( Σ_{i=1..n} (a_i X_i + c_i) ) = Σ_{i=1..n} (a_i E(X_i) + c_i)

If a and b are constants, then var(aX + b) = a² var(X)

53
Pairs of Random Variables
Say we have 2 random variables:
Fish weight X
Fish lightness Y
Can define joint CDF
F(a, b) = P(X ≤ a, Y ≤ b) = P({ω ∈ Ω : X(ω) ≤ a, Y(ω) ≤ b})
Similar to the single-variable case, can define
  discrete: joint probability mass function p(a, b) = P(X = a, Y = b)
  continuous: joint density function f(x, y), with
  P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫∫_{a ≤ x ≤ b, c ≤ y ≤ d} f(x, y) dx dy
54
Marginal Distributions
Given the joint mass function p_{X,Y}(a, b), the marginal, i.e. the probability mass function for r.v. X, can be obtained from p_{X,Y}(a, b):

p_X(a) = Σ_{∀y} p_{X,Y}(a, y)        p_Y(b) = Σ_{∀x} p_{X,Y}(x, b)

The marginal densities f_X(x) and f_Y(y) are obtained from the joint density f_{X,Y}(x, y) by integrating:

f_X(x) = ∫_{y=−∞}^{y=∞} f_{X,Y}(x, y) dy        f_Y(y) = ∫_{x=−∞}^{x=∞} f_{X,Y}(x, y) dx
55
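A minimal sketch of marginalization in the discrete case: the joint pmf below is an invented 3x3 table, and summing over one variable gives the marginal of the other.

```python
import numpy as np

# Joint pmf p_{X,Y}(x, y) as a table: rows index values of X, columns values of Y.
# The numbers are invented and sum to 1.
p_xy = np.array([[0.10, 0.05, 0.05],
                 [0.20, 0.25, 0.05],
                 [0.05, 0.05, 0.20]])

p_x = p_xy.sum(axis=1)   # marginal pmf of X: sum over all y
p_y = p_xy.sum(axis=0)   # marginal pmf of Y: sum over all x
print(p_x, p_y)          # [0.2 0.5 0.3] and [0.35 0.35 0.3]
```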
Independence of Random Variables

r.v. X and Y are independent if
P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y)

Theorem: r.v. X and Y are independent if and only if
p_{X,Y}(x, y) = p_X(x) p_Y(y)   (discrete)
f_{X,Y}(x, y) = f_X(x) f_Y(y)   (continuous)

56
More on Independent RV’s

If X and Y are independent, then

E(XY)=E(X)E(Y)
Var(X+Y)=Var(X)+Var(Y)
G(X) and H(Y) are independent

57
Covariance
Given r.v. X and Y, covariance is defined as:
cov ( X ,Y ) = E[( X − E( X ))(Y − E(Y ))] = E( XY ) − E( X )E(Y )
Covariance is useful for checking whether features X and Y give similar information
Covariance (from co-vary) indicates the tendency of X and Y to vary together
If X and Y tend to increase together, Cov(X,Y) > 0
If X tends to decrease when Y increases, Cov(X,Y) < 0
If a decrease (or increase) in X does not predict the behavior of Y, Cov(X,Y) is close to 0
58
Covariance and Correlation
If cov(X,Y) = 0, then X and Y are said to be uncorrelated (think "unrelated"). However, X and Y are not necessarily independent.

If X and Y are independent, then cov(X,Y) = 0

Can normalize the covariance to get the correlation:
−1 ≤ cor(X,Y) = cov(X,Y) / √(var(X) var(Y)) ≤ 1

59
Random Vectors
Generalize from pairs of r.v.'s to a vector of r.v.'s
X = [X1 X2 … Xn] (think multiple features)
Joint CDF, PDF, PMF are defined similarly to
the case of pair of r.v.’s
Example:
F (x1, x2,...,xn ) = P( X1 ≤ x1, X2 ≤ x2,...,Xn ≤ xn )

All the properties of expectation, variance, and covariance transfer with suitable modifications

60
Covariance Matrix
A summary of the characteristics of a random vector:

cov(X) = cov[X1 X2 … Xn] = Σ = E[(X − µ)(X − µ)ᵀ] =

    E[(X1 − µ1)(X1 − µ1)]  …  E[(Xn − µn)(X1 − µ1)]
    E[(X2 − µ2)(X1 − µ1)]  …  E[(Xn − µn)(X2 − µ2)]
    …
    E[(Xn − µn)(X1 − µ1)]  …  E[(Xn − µn)(Xn − µn)]

For example, for n = 3:

          σ1²  c12  c13
    Σ  =  c21  σ2²  c23
          c31  c32  σ3²

with the variances σi² on the diagonal and the covariances cij off the diagonal
61
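A small sketch estimating a covariance matrix from sample feature vectors (randomly generated, made-up data); the manual computation E[(X − µ)(X − µ)ᵀ] is compared with numpy's built-in estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
# 100 made-up samples of a 3-dimensional feature vector, e.g. [length, lightness, weight].
X = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.5, 0.0],
                                          [0.0, 1.0, 0.0],
                                          [1.0, 0.0, 3.0]])

mu = X.mean(axis=0)                       # mean vector
Sigma = (X - mu).T @ (X - mu) / len(X)    # E[(X - mu)(X - mu)^T], estimated from data
print(Sigma)
print(np.cov(X, rowvar=False, bias=True)) # the same matrix via numpy
```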
Normal or Gaussian Random Variable
Has density
f(x) = (1 / (σ √(2π))) · exp( −(1/2) ((x − µ)/σ)² )

Mean µ and variance σ²

62
Multivariate Gaussian
Has density
f(x) = (1 / ((2π)^{n/2} |Σ|^{1/2})) · exp( −(1/2) (x − µ)ᵀ Σ⁻¹ (x − µ) )

mean vector µ = [µ1, …, µn]
covariance matrix Σ

63
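A minimal sketch evaluating the multivariate Gaussian density above with numpy; the mean vector, covariance matrix, and query point are assumed values for illustration.

```python
import numpy as np

def gaussian_density(x, mu, Sigma):
    """Density of N(mu, Sigma) at point x, following the formula on the slide."""
    n = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(Sigma))
    return np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff)) / norm

mu    = np.array([10.0, 3.0])                 # e.g. mean of [length, lightness]
Sigma = np.array([[4.0, 1.0],
                  [1.0, 1.0]])                # assumed covariance matrix
print(gaussian_density(np.array([11.0, 3.5]), mu, Sigma))
```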
Why Gaussian?

Frequently observed (central limit theorem)
The parameters µ and Σ are sufficient to characterize the distribution
Nice to work with
  Marginal and conditional distributions are also Gaussian
  If the Xi's are uncorrelated, then they are also independent

64
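A tiny sketch of the central limit theorem mentioned above: sums of many independent uniform random variables have an approximately Gaussian distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
# Each sample is the sum of 50 independent Uniform(0, 1) draws.
sums = rng.uniform(size=(10000, 50)).sum(axis=1)

# Mean ~ 50 * 0.5 = 25 and variance ~ 50 * (1/12) ~ 4.17; a histogram of `sums`
# looks close to a Gaussian with these parameters.
print(sums.mean(), sums.var())
```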
Summary

Intro to Pattern Recognition


Review of Probability and Statistics
Next time will review linear algebra

65
