
Lecture 5

Bayesian Decision Theory


A fundamental statistical approach to quantifying the tradeoffs
between various decisions using the probabilities and costs that
accompany such decisions. Reasoning is based on Bayes Rule.
Discriminant Functions for the Gaussian Density / Quadratic
Classifiers
Apply the results of Bayesian Decision Theory to derive the
discriminant functions for the case of Gaussian class-conditional
probabilities.
Bayesian Decision Theory
The Likelihood Ratio Test
Likelihood Ratio Test
Want to classify an object based on the evidence provided by a
measurement (a feature vector) x.
A reasonable decision rule would be: choose the class that is most
probable given x. Mathematically, choose class ω_i such that

P(ω_i | x) ≥ P(ω_j | x)   for all j = 1, ..., C
Consider the decision rule for a 2-class problem:

Class(x) = ω_1  if P(ω_1 | x) > P(ω_2 | x)
           ω_2  if P(ω_1 | x) < P(ω_2 | x)
Likelihood Ratio Test
Choose class ω_1 if P(ω_1 | x) > P(ω_2 | x). Equivalently:

P(x | ω_1) P(ω_1) / P(x) > P(x | ω_2) P(ω_2) / P(x)        (Bayes Rule)
P(x | ω_1) P(ω_1) > P(x | ω_2) P(ω_2)                      (eliminate P(x) > 0)
P(x | ω_1) / P(x | ω_2) > P(ω_2) / P(ω_1)                  (as P(ω_i) > 0)

Let

Λ(x) = P(x | ω_1) / P(x | ω_2)        (the likelihood ratio)

then

Likelihood Ratio Test:
Class(x) = ω_1  if Λ(x) > P(ω_2) / P(ω_1)
           ω_2  if Λ(x) < P(ω_2) / P(ω_1)
An example
Derive a decision rule for the 2-class problem based on the Likelihood
Ratio Test assuming equal priors and class-conditional densities:

P(x | ω_1) = (1/√(2π)) exp(-(x - 4)²/2),    P(x | ω_2) = (1/√(2π)) exp(-(x - 10)²/2)

Solution: Substitute the likelihoods and priors into the expressions in the LRT:

Λ(x) = [(√(2π))⁻¹ exp(-.5(x - 4)²)] / [(√(2π))⁻¹ exp(-.5(x - 10)²)],    P(ω_2)/P(ω_1) = .5/.5 = 1

Choose class ω_1 if:

Λ(x) > 1
exp(-.5(x - 4)²) > exp(-.5(x - 10)²)
(x - 4)² < (x - 10)²        (taking logs and changing signs)
x < 7

The LRT decision rule is:

Class(x) = ω_1  if x < 7
           ω_2  if x > 7
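As a quick check, here is a minimal Python sketch of this two-class LRT (function names are illustrative, not from the lecture); it evaluates the likelihood ratio of the two unit-variance Gaussians above and compares it with the prior ratio, reproducing the x < 7 boundary for equal priors.

import math

def gaussian(x, mu, var=1.0):
    # Univariate normal density N(mu, var) evaluated at x
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def lrt_classify(x, p1=0.5, p2=0.5):
    # Choose class 1 if Lambda(x) = P(x|w1)/P(x|w2) exceeds P(w2)/P(w1)
    lam = gaussian(x, 4.0) / gaussian(x, 10.0)
    return 1 if lam > p2 / p1 else 2

print(lrt_classify(6.9), lrt_classify(7.1))   # prints: 1 2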
This LRT result makes sense intuitively: the two likelihoods are identical except for their mean values, so the decision boundary x = 7 falls halfway between the two means.
[Figure from the slides: the class-conditional densities P(x|ω_1) and P(x|ω_2), centered at x = 4 and x = 10, with decision regions R_1 (say ω_1) and R_2 (say ω_2).]
How would the LRT decision rule change if the priors were such that P(ω_1) = 2P(ω_2)?
Bayesian Decision Theory
The Likelihood Ratio Test
The Probability of Error
Probability of Error
Performance of a decision rule is measured by its probability of error:
P(error) = Σ_{i=1}^{C} P(error | ω_i) P(ω_i)

The class-conditional probability of error is:

P(error | ω_i) = Σ_{j≠i} P(choose ω_j | ω_i) = Σ_{j≠i} ∫_{R_j} P(x | ω_i) dx

where R_j = {x : Class(x) = ω_j}.
For the 2-class problem:

P(error) = P(ω_1) ∫_{R_2} P(x | ω_1) dx + P(ω_2) ∫_{R_1} P(x | ω_2) dx = P(ω_1) ε_1 + P(ω_2) ε_2

where ε_1 is the integral of the likelihood P(x | ω_1) over the region R_2 where ω_2 is chosen, and similarly for ε_2.

Back to the Example
For the decision rule of the previous example, the integrals ε_1 and ε_2 are depicted below.
Since we assumed equal priors,

P(error) = .5(ε_1 + ε_2)
[Figure from the slides: the densities P(x|ω_1) and P(x|ω_2) with the threshold at x = 7; ε_1 is the tail of P(x|ω_1) falling in R_2 (x > 7) and ε_2 is the tail of P(x|ω_2) falling in R_1 (x < 7).]
Write out the expression for P(error) for this example.
ε_1 = (2π)^(-1/2) ∫_{7}^{∞} exp(-.5(x - 4)²) dx,    ε_2 = (2π)^(-1/2) ∫_{-∞}^{7} exp(-.5(x - 10)²) dx
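These tail integrals are standard normal tail probabilities, so they can be evaluated with the error function; a small sketch using only the Python standard library (values are approximate):

import math

def normal_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# eps1 = P(x > 7 | w1) with x|w1 ~ N(4, 1);  eps2 = P(x < 7 | w2) with x|w2 ~ N(10, 1)
eps1 = 1.0 - normal_cdf((7 - 4) / 1.0)
eps2 = normal_cdf((7 - 10) / 1.0)
p_error = 0.5 * (eps1 + eps2)
print(eps1, eps2, p_error)   # both tails equal Phi(-3), about 0.00135, so P(error) is about 0.00135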
Probability of Error
Thinking about the 2-class problem, not all decision rules are equally good with respect to minimizing P(error). For our example consider this (silly) rule:

Class(x) = ω_1  if x < 100
           ω_2  if x > 100

For this (silly) rule ε_1 ≈ 0 but ε_2 ≈ 1, so P(error) ≈ .5, which is much more than the error for the rule defined by the likelihood ratio test. In fact:

Bayes Error Rate: For any given problem, the minimum probability of error is achieved by the Likelihood Ratio Test decision rule. This probability of error is called the Bayes Error Rate and is the BEST any classifier can do.
Bayesian Decision Theory
The Likelihood Ratio Test
The Probability of Error
Bayes Risk
Bayes Risk
So far we have assumed that the penalty for misclassifying a class ω_1 example as class ω_2 is the same as that for misclassifying a class ω_2 example as class ω_1. But consider misclassifying
  a faulty airplane as a safe airplane (puts people's lives in danger)
  a safe airplane as a faulty airplane (costs the airline company money)
We can formalize this concept in terms of a cost function C_ij.
Let C_ij denote the cost of choosing class ω_i when ω_j is the true class.
The Bayes Risk is the expected value of the cost:

E[C] = Σ_{i=1}^{2} Σ_{j=1}^{2} C_ij P(decide ω_i, ω_j true class) = Σ_{i=1}^{2} Σ_{j=1}^{2} C_ij P(x ∈ R_i | ω_j) P(ω_j)
Bayes Risk
What is the decision rule that minimizes the Bayes Risk?
First note: P(x ∈ R_i | ω_j) = ∫_{R_i} P(x | ω_j) dx
Then the Bayes Risk is:

E[C] = ∫_{R_1} [C_11 P(ω_1) P(x | ω_1) + C_12 P(ω_2) P(x | ω_2)] dx
     + ∫_{R_2} [C_21 P(ω_1) P(x | ω_1) + C_22 P(ω_2) P(x | ω_2)] dx

Now remember that ∫_{R_1} P(x | ω_j) dx + ∫_{R_2} P(x | ω_j) dx = ∫_{R_1 ∪ R_2} P(x | ω_j) dx = 1.
Adding and subtracting the same quantity A = C_21 P(ω_1) ∫_{R_1} P(x|ω_1) dx + C_22 P(ω_2) ∫_{R_1} P(x|ω_2) dx:

E[C] = C_11 P(ω_1) ∫_{R_1} P(x|ω_1) dx + C_12 P(ω_2) ∫_{R_1} P(x|ω_2) dx
     + C_21 P(ω_1) ∫_{R_2} P(x|ω_1) dx + C_22 P(ω_2) ∫_{R_2} P(x|ω_2) dx
     + C_21 P(ω_1) ∫_{R_1} P(x|ω_1) dx + C_22 P(ω_2) ∫_{R_1} P(x|ω_2) dx      (+A)
     - C_21 P(ω_1) ∫_{R_1} P(x|ω_1) dx - C_22 P(ω_2) ∫_{R_1} P(x|ω_2) dx      (-A)

   = C_21 P(ω_1) ∫_{R_1 ∪ R_2} P(x|ω_1) dx + C_22 P(ω_2) ∫_{R_1 ∪ R_2} P(x|ω_2) dx
     + (C_12 - C_22) P(ω_2) ∫_{R_1} P(x|ω_2) dx - (C_21 - C_11) P(ω_1) ∫_{R_1} P(x|ω_1) dx

   = C_21 P(ω_1) + C_22 P(ω_2)
     + (C_12 - C_22) P(ω_2) ∫_{R_1} P(x|ω_2) dx - (C_21 - C_11) P(ω_1) ∫_{R_1} P(x|ω_1) dx
We want to find the region R_1 that minimizes the Bayes Risk. From the expression above, the first two terms of E[C] are constant with respect to R_1. Thus the optimal region is:

R_1* = arg min_{R_1} ∫_{R_1} [(C_12 - C_22) P(ω_2) P(x|ω_2) - (C_21 - C_11) P(ω_1) P(x|ω_1)] dx
     = arg min_{R_1} ∫_{R_1} g(x) dx
Note we are assuming C_21 > C_11 and C_12 > C_22, that is, the cost of a misclassification is higher than the cost of a correct classification. Thus:

(C_12 - C_22) > 0  AND  (C_21 - C_11) > 0
Bayes Risk (2)
Temporarily forget about the specific expression of g(x) and consider the type of decision region R_1* we are looking for: select the intervals that minimize the integral ∫_{R_1} g(x) dx, that is, the intervals where g(x) < 0.
Thus we will choose R_1* such that

(C_21 - C_11) P(ω_1) P(x|ω_1) > (C_12 - C_22) P(ω_2) P(x|ω_2)

Rearranging the terms yields:

P(x|ω_1) / P(x|ω_2) > [(C_12 - C_22) P(ω_2)] / [(C_21 - C_11) P(ω_1)]

Therefore we obtain the decision rule

A Likelihood Ratio Test:
Class(x) = ω_1  if P(x|ω_1)/P(x|ω_2) > [(C_12 - C_22) P(ω_2)] / [(C_21 - C_11) P(ω_1)]
           ω_2  if P(x|ω_1)/P(x|ω_2) < [(C_12 - C_22) P(ω_2)] / [(C_21 - C_11) P(ω_1)]
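This rule is a one-line comparison against a cost-weighted threshold; the following sketch (illustrative names, not from the lecture) implements it for arbitrary likelihood functions, priors and costs:

def bayes_risk_classify(x, lik1, lik2, p1, p2, c11=0.0, c12=1.0, c21=1.0, c22=0.0):
    # Two-class LRT minimizing the Bayes Risk.
    # lik1, lik2: callables returning P(x|w1) and P(x|w2); p1, p2: priors;
    # cij: cost of deciding class i when j is the true class.
    # Assumes c21 > c11 and c12 > c22 (misclassification costs more than a correct decision).
    threshold = (c12 - c22) * p2 / ((c21 - c11) * p1)
    return 1 if lik1(x) / lik2(x) > threshold else 2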
Bayes Risk: An Example
Consider the following 2-class classification problem. The likelihood functions for each class are:

P(x|ω_1) = (2π·3)^(-1/2) exp(-.5 x²/3),    P(x|ω_2) = (2π)^(-1/2) exp(-.5 (x - 2)²)
The priors are: P(ω_1) = P(ω_2) = .5
Define the (mis)classification costs as: C_11 = C_22 = 0, C_12 = 1, C_21 = √3
Problem: Determine the decision rule that minimizes the Bayes Risk.
Bayes Risk: An Example (2)
Solution:

Λ(x) = [(2π·3)^(-1/2) exp(-.5 x²/3)] / [(2π)^(-1/2) exp(-.5 (x - 2)²)]
     = 3^(-1/2) exp(-.5 x²/3) / exp(-.5 (x - 2)²)

Choose class ω_1 if Λ(x) > [.5 (1 - 0)] / [.5 (√3 - 0)] = 1/√3:

3^(-1/2) exp(-.5 x²/3) / exp(-.5 (x - 2)²) > 1/√3
exp(-.5 x²/3) > exp(-.5 (x - 2)²)
-.5 x²/3 > -.5 (x - 2)²            (taking logs)
2x² - 12x + 12 > 0
x² - 6x + 6 > 0
x < 1.27 or x > 4.73
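A quick numeric check of the thresholds above, assuming the densities, priors and costs as given (a sketch, not part of the lecture):

import math

def lik1(x):                        # P(x|w1): normal with mean 0, variance 3
    return math.exp(-0.5 * x * x / 3.0) / math.sqrt(2 * math.pi * 3.0)

def lik2(x):                        # P(x|w2): normal with mean 2, variance 1
    return math.exp(-0.5 * (x - 2.0) ** 2) / math.sqrt(2 * math.pi)

threshold = (1 - 0) * 0.5 / ((math.sqrt(3) - 0) * 0.5)      # (C12-C22)P(w2)/((C21-C11)P(w1)) = 1/sqrt(3)
roots = (3 - math.sqrt(3), 3 + math.sqrt(3))                # solutions of x^2 - 6x + 6 = 0
print(roots)                                                # approximately (1.27, 4.73)
# The likelihood ratio exceeds the threshold exactly outside the interval (1.27, 4.73):
for x in (0.0, 3.0, 6.0):
    print(x, lik1(x) / lik2(x) > threshold)                 # True, False, True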
Bayesian Decision Theory
The Likelihood Ratio Test
The Probability of Error
Bayes Risk
Bayes, MAP and ML Criteria
Variations of the LRT
The LRT decision rule minimizing the Bayes Risk is also known as the Bayes Criterion:

Bayes Criterion:  Class(x) = ω_1  if Λ(x) > [(C_12 - C_22) P(ω_2)] / [(C_21 - C_11) P(ω_1)]
                             ω_2  if Λ(x) < [(C_12 - C_22) P(ω_2)] / [(C_21 - C_11) P(ω_1)]

To minimize the probability of error, use the Bayes Criterion with zero-one costs (C_ij = 0 if i = j and 1 otherwise); this version of the LRT decision rule is referred to as the Maximum A Posteriori (MAP) Criterion.

MAP Criterion:  Class(x) = ω_1  if P(ω_1|x) > P(ω_2|x)
                           ω_2  if P(ω_1|x) < P(ω_2|x)

Finally, for the case of equal priors P(ω_i) and a zero-one cost function, the LRT decision rule is called the Maximum Likelihood (ML) Criterion, since it maximizes the likelihood P(x|ω_i).

ML Criterion:  Class(x) = ω_1  if P(x|ω_1) > P(x|ω_2)
                          ω_2  if P(x|ω_1) < P(x|ω_2)
Variations of the LRT (2)
Two more decision rules are commonly cited in the related literature.
The Neyman-Pearson Criterion, which also leads to an LRT decision rule: it fixes one class error probability, say ε_1 ≤ α, and seeks to minimize the other.
The Minimax Criterion, derived from the Bayes Criterion, which seeks to minimize the maximum Bayes Risk.
Bayesian Decision Theory
The Likelihood Ratio Test
The Probability of Error
Bayes Risk
Bayes, MAP and ML Criteria
Multi-class functions
Decision rules for multi-class problems
The decision rule minimizing P(error) generalizes to multi-class problems.
The derivation is easier if we express P(error) in terms of making a correct assignment:

P(error) = 1 - P(correct)

The probability of making a correct assignment is

P(correct) = Σ_{i=1}^{C} P(ω_i) ∫_{R_i} P(x|ω_i) dx
           = Σ_{i=1}^{C} ∫_{R_i} P(x|ω_i) P(ω_i) dx
           = Σ_{i=1}^{C} ∫_{R_i} P(ω_i|x) P(x) dx            (call the i-th integral T_i)

The problem of minimizing P(error) is equivalent to that of maximizing P(correct).
To maximize P(correct) we have to maximize each of the integrals T_i.
In turn, each integral T_i will be maximized by choosing the class ω_i that yields the maximum P(ω_i|x), so we define R_i to be the region where P(ω_i|x) is maximum.
Therefore, the decision rule that minimizes P(error) is the MAP Criterion.
Minimum Bayes Risk
Denote by α_i the decision to choose class ω_i, and define the overall decision rule as a function
α : x → {α_1, α_2, ..., α_C}  such that  α(x) = α_i if x is assigned to class ω_i.
The (conditional) risk R(α_i|x) of assigning x to class ω_i is

R(α_i|x) = Σ_{j=1}^{C} C_ij P(ω_j|x)

The Bayes Risk associated with the decision rule α(x) is

R(α(x)) = ∫ R(α(x)|x) P(x) dx

To minimize this expression we have to minimize the conditional risk R(α(x)|x) at each point x in the feature space, which is equivalent to choosing the ω_i for which R(α_i|x) is minimum.
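As a sketch (illustrative names, not from the lecture), the multi-class minimum-risk rule picks the decision with the smallest expected cost given the posteriors; with zero-one costs it reduces to the MAP rule:

import numpy as np

def min_risk_decision(posteriors, costs):
    # posteriors: length-C array with P(w_j|x); costs: CxC array with C_ij =
    # cost of deciding class i when j is the true class. Returns the 0-based
    # index of the class minimizing R(a_i|x) = sum_j C_ij P(w_j|x).
    risks = np.asarray(costs, float) @ np.asarray(posteriors, float)
    return int(np.argmin(risks))

posteriors = [0.2, 0.5, 0.3]
zero_one = 1.0 - np.eye(3)                      # zero-one costs
print(min_risk_decision(posteriors, zero_one))  # 1, i.e. the class with the largest posterior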
Bayesian Decision Theory
The Likelihood Ratio Test
The Probability of Error
Bayes Risk
Bayes, MAP and ML Criteria
Multi-class functions
Discriminant Functions
Discriminant Functions
All decision rules presented in this lecture have the same structure:
at each point x in feature space, choose the class ω_i which maximizes (or minimizes) some measure g_i(x).
Formally, there is a set of discriminant functions {g_i(x)}, i = 1, ..., C, and the following decision rule:

assign x to class ω_i  if  g_i(x) > g_j(x) for all j ≠ i

We can visualize the decision rule as a network or machine that computes C discriminant functions and selects the class corresponding to the largest one.
The three basic decision rules Bayes, MAP and ML in terms of discriminant functions:

Criterion   Discriminant Function
Bayes       g_i(x) = -R(α_i|x)
MAP         g_i(x) = P(ω_i|x)
ML          g_i(x) = P(x|ω_i)
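The "select max" machine is just an argmax over the discriminant values; a minimal sketch (names are illustrative):

def classify(x, discriminants):
    # discriminants: list of callables g_i(x), one per class.
    # Returns the 0-based index of the class with the largest discriminant.
    scores = [g(x) for g in discriminants]
    return max(range(len(scores)), key=lambda i: scores[i])

print(classify(6.0, [lambda x: -(x - 4) ** 2, lambda x: -(x - 10) ** 2]))   # 0 (closer to 4)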
Discriminant Functions for the class of Gaussian Distributions
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Bayes classifiers for Normally distributed classes
The (MAP) decision rule minimizing the probability of error can be formulated as a family of discriminant functions:

Choose class ω_i if g_i(x) > g_j(x) for all j ≠ i, with g_i(x) = P(ω_i|x)

For classes that are normally distributed, this family can be reduced to very simple expressions.

General expression for Gaussian densities
The multivariate Normal density is defined as

f_X(x) = 1 / ((2π)^{n/2} |Σ|^{1/2}) · exp(-1/2 (x - μ)^T Σ^{-1} (x - μ))

Using Bayes rule, the MAP discriminant function becomes

g_i(x) = P(x|ω_i) P(ω_i) / P(x)
       = 1 / ((2π)^{n/2} |Σ_i|^{1/2}) · exp(-1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i)) · P(ω_i) / P(x)

Eliminating constant terms:

g_i(x) = |Σ_i|^{-1/2} exp(-1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i)) P(ω_i)

Taking the log, since it is a monotonically increasing function:

g_i(x) = -1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i) - 1/2 log(|Σ_i|) + log(P(ω_i))

This is the Quadratic Discriminant Function.
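A direct NumPy sketch of this quadratic discriminant (assuming a positive-definite covariance; names are illustrative, not from the lecture):

import numpy as np

def quadratic_discriminant(x, mu, sigma, prior):
    # g_i(x) = -1/2 (x-mu)^T Sigma^{-1} (x-mu) - 1/2 log|Sigma| + log P(w_i)
    x, mu = np.asarray(x, float), np.asarray(mu, float)
    diff = x - mu
    sign, logdet = np.linalg.slogdet(sigma)        # log|Sigma|, assuming Sigma is positive definite
    maha = diff @ np.linalg.solve(sigma, diff)     # (x-mu)^T Sigma^{-1} (x-mu)
    return -0.5 * maha - 0.5 * logdet + np.log(prior)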
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 1: Σ_i = σ²I
The features are statistically independent, with the same variance for all classes.
The quadratic function becomes

g_i(x) = -1/2 (x - μ_i)^T (σ²I)^{-1} (x - μ_i) - 1/2 log(|σ²I|) + log(P(ω_i))
       = -1/(2σ²) (x - μ_i)^T (x - μ_i) - (N/2) log(σ²) + log(P(ω_i))
       = -1/(2σ²) (x - μ_i)^T (x - μ_i) + log(P(ω_i))            (dropping the second term, constant across classes)
       = -1/(2σ²) (x^T x - 2 μ_i^T x + μ_i^T μ_i) + log(P(ω_i))

Eliminate the term x^T x, as it is constant for all classes. Then

g_i(x) = 1/(2σ²) (2 μ_i^T x - μ_i^T μ_i) + log(P(ω_i)) = w_i^T x + w_i0

where

w_i = μ_i / σ²   and   w_i0 = -1/(2σ²) μ_i^T μ_i + log(P(ω_i))

As the discriminant is linear, the decision boundaries g_i(x) = g_j(x) will be hyper-planes.
If we also assume equal priors, this reduces to the minimum distance (nearest mean) classifier:

Minimum distance classifier: g_i(x) = -1/(2σ²) (x - μ_i)^T (x - μ_i)

Properties of the class-conditional probabilities:
the loci of constant probability for each class are hyper-spheres.
Case 1: Σ_i = σ²I
The decision boundaries are the hyperplanes g_i(x) = g_j(x), and (after some algebra) can be written as

w^T (x - x_0) = 0

where

w = μ_i - μ_j
x_0 = 1/2 (μ_i + μ_j) - σ² / ‖μ_i - μ_j‖² · ln[P(ω_i)/P(ω_j)] · (μ_i - μ_j)

The hyperplane separating R_i and R_j passes through the point x_0 and is orthogonal to the vector w.
Case 1: Σ_i = σ²I, Example
Compute the decision boundaries for a 3-class, 2D problem with equal priors and the following class-conditional parameters:

μ_1 = (3, 2)^T,   μ_2 = (7, 4)^T,   μ_3 = (2, 5)^T

Σ_1 = Σ_2 = Σ_3 = [ 2  0
                    0  2 ]
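Under the Case 1 assumptions of this example (equal priors, Σ_i = 2I), the classifier reduces to the nearest-mean rule, and each pairwise boundary is the perpendicular bisector of the two class means; a small sketch (illustrative names):

import numpy as np

means = np.array([[3.0, 2.0], [7.0, 4.0], [2.0, 5.0]])   # mu_1, mu_2, mu_3 from the example

def nearest_mean(x):
    # Minimum (Euclidean) distance classifier: valid here since Sigma_i = 2*I and priors are equal
    d2 = ((means - np.asarray(x, float)) ** 2).sum(axis=1)
    return int(np.argmin(d2)) + 1                          # classes numbered 1..3

# Pairwise boundary between classes i and j: w^T (x - x0) = 0 with w = mu_i - mu_j, x0 = (mu_i + mu_j)/2
w_12 = means[0] - means[1]
x0_12 = 0.5 * (means[0] + means[1])
print(nearest_mean([4.0, 2.5]), w_12, x0_12)               # class 1; boundary data for classes 1 and 2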
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 2: Σ_i = Σ (Σ diagonal)
The classes have the same covariance matrix, but the features are allowed to have different variances.
The quadratic function becomes

g_i(x) = -1/2 (x - μ_i)^T Σ^{-1} (x - μ_i) - 1/2 log(|Σ|) + log(P(ω_i))
       = -1/2 (x - μ_i)^T diag(σ_1², ..., σ_N²)^{-1} (x - μ_i) - 1/2 log(|diag(σ_1², ..., σ_N²)|) + log(P(ω_i))
       = -1/2 Σ_{k=1}^{N} (x_k - μ_ik)² / σ_k² - 1/2 log(Π_{k=1}^{N} σ_k²) + log(P(ω_i))
       = -1/2 Σ_{k=1}^{N} (x_k² - 2 x_k μ_ik + μ_ik²) / σ_k² - 1/2 log(Π_{k=1}^{N} σ_k²) + log(P(ω_i))

Eliminate the terms in x_k², as they are constant for all classes:

g_i(x) = -1/2 Σ_{k=1}^{N} (-2 x_k μ_ik + μ_ik²) / σ_k² - 1/2 log(Π_{k=1}^{N} σ_k²) + log(P(ω_i))
Properties
This discriminant is linear, so the decision boundaries g_i(x) = g_j(x) are hyper-planes.
The loci of constant probability are hyper-ellipses aligned with the feature axes.
The only difference from the previous classifier is that the distance along each axis is normalized by the variance of that axis.
Case 2: Σ_i = Σ (Σ diagonal), Example
Compute the decision boundaries for a 3-class, 2D problem with equal priors and the following class-conditional parameters:

μ_1 = (3, 2)^T,   μ_2 = (5, 4)^T,   μ_3 = (2, 5)^T

Σ_1 = Σ_2 = Σ_3 = [ 1  0
                    0  2 ]
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
All classes have the same covariance matrix, but it is not necessarily diagonal.
The quadratic discriminant function becomes

g_i(x) = -1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i) - 1/2 log(|Σ_i|) + log(P(ω_i))
       = -1/2 (x - μ_i)^T Σ^{-1} (x - μ_i) - 1/2 log(|Σ|) + log(P(ω_i))

Eliminate the term log(|Σ|), which is constant for all classes:

g_i(x) = -1/2 (x - μ_i)^T Σ^{-1} (x - μ_i) + log(P(ω_i))

The quadratic term is called the Mahalanobis distance, a very important distance in Statistical PR.

Mahalanobis distance:  ‖x - y‖²_{Σ^{-1}} = (x - y)^T Σ^{-1} (x - y)

The Mahalanobis distance is a vector distance that uses a Σ^{-1} norm.
Σ^{-1} can be thought of as a stretching factor on the space.
For Σ = I the Mahalanobis distance becomes the Euclidean distance.
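A short NumPy sketch of the Mahalanobis distance (assuming Σ is positive definite; names are illustrative):

import numpy as np

def mahalanobis_sq(x, y, sigma):
    # Squared Mahalanobis distance (x-y)^T Sigma^{-1} (x-y)
    diff = np.asarray(x, float) - np.asarray(y, float)
    return float(diff @ np.linalg.solve(sigma, diff))

# With Sigma = I it reduces to the squared Euclidean distance:
print(mahalanobis_sq([1, 2], [0, 0], np.eye(2)))                              # 5.0
print(mahalanobis_sq([1, 2], [0, 0], np.array([[1.0, 0.7], [0.7, 2.0]])))     # stretched by Sigma^{-1}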
Case 3: Σ_i = Σ (Σ non-diagonal)
Expansion of the quadratic term in the discriminant yields

g_i(x) = -1/2 (x - μ_i)^T Σ^{-1} (x - μ_i) + log(P(ω_i))
       = -1/2 (x^T Σ^{-1} x - 2 μ_i^T Σ^{-1} x + μ_i^T Σ^{-1} μ_i) + log(P(ω_i))

Removing the term x^T Σ^{-1} x, which is constant for all classes:

g_i(x) = -1/2 (-2 μ_i^T Σ^{-1} x + μ_i^T Σ^{-1} μ_i) + log(P(ω_i))

Reorganizing terms we get:

g_i(x) = w_i^T x + w_i0   with   w_i = Σ^{-1} μ_i   and   w_i0 = -1/2 μ_i^T Σ^{-1} μ_i + log(P(ω_i))

Properties
The discriminant is linear, so the decision boundaries are hyper-planes.
The constant probability loci are hyper-ellipses aligned with the eigenvectors of Σ.
If we can assume equal priors, the classifier becomes a minimum (Mahalanobis) distance classifier:

Equal priors: g_i(x) = -1/2 (x - μ_i)^T Σ^{-1} (x - μ_i)
Case 3: Example
Compute the decision boundaries for a 3-class, 2D problem with equal priors and the following class-conditional parameters:

μ_1 = (3, 2)^T,   μ_2 = (5, 4)^T,   μ_3 = (2, 5)^T

Σ_1 = Σ_2 = Σ_3 = [ 1   .7
                    .7  2 ]
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
Case 4: Σ_i = σ_i²I
Case 4: Σ_i = σ_i²I
Each class has a different covariance matrix, which is proportional to the identity matrix.
The quadratic discriminant becomes

g_i(x) = -1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i) - 1/2 log(|Σ_i|) + log(P(ω_i))
       = -1/(2σ_i²) (x - μ_i)^T (x - μ_i) - (N/2) log(σ_i²) + log(P(ω_i))

The expression cannot be reduced any further, so
  the decision boundaries are quadratic (hyper-spheres in this case), and
  the loci of constant probability are hyper-spheres centered at the class means.
Case 4: Σ_i = σ_i²I, Example
Compute the decision boundaries for a 3-class, 2D problem with the following class-conditional parameters:

μ_1 = (3, 2)^T,   μ_2 = (5, 4)^T,   μ_3 = (2, 5)^T

Σ_1 = [ .5  0        Σ_2 = [ 1  0        Σ_3 = [ 2  0
        0  .5 ]              0  1 ]              0  2 ]
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
Case 4: Σ_i = σ_i²I
Case 5: Σ_i ≠ Σ_j, General Case
Case 5: Σ_i ≠ Σ_j, General Case
We have already derived the expression for the general case; it is:

g_i(x) = -1/2 (x - μ_i)^T Σ_i^{-1} (x - μ_i) - 1/2 log(|Σ_i|) + log(P(ω_i))

Reorganizing terms in a quadratic form yields

g_i(x) = x^T W_i x + w_i^T x + w_i0

where

W_i = -1/2 Σ_i^{-1},
w_i = Σ_i^{-1} μ_i,
w_i0 = -1/2 μ_i^T Σ_i^{-1} μ_i - 1/2 log(|Σ_i|) + log(P(ω_i))

Properties
The loci of constant probability for each class are hyper-ellipses, oriented with the eigenvectors of Σ_i for that class.
The decision boundaries are quadratic: hyper-ellipses or hyper-paraboloids.
The quadratic expression in the discriminant is proportional to the Mahalanobis distance using the class-conditional covariance Σ_i.
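A sketch that builds the coefficients W_i, w_i and w_i0 of this quadratic form from the class parameters (assuming an invertible Σ_i; names are illustrative):

import numpy as np

def quadratic_form_coeffs(mu, sigma, prior):
    # Return (W, w, w0) such that g(x) = x^T W x + w^T x + w0
    mu = np.asarray(mu, float)
    sigma_inv = np.linalg.inv(sigma)
    W = -0.5 * sigma_inv
    w = sigma_inv @ mu
    sign, logdet = np.linalg.slogdet(sigma)
    w0 = -0.5 * mu @ sigma_inv @ mu - 0.5 * logdet + np.log(prior)
    return W, w, w0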
Case 5: Σ_i ≠ Σ_j, Example
Compute the decision boundaries for a 3-class, 2D problem with the following class-conditional parameters:

μ_1 = (3, 2)^T,   μ_2 = (5, 4)^T,   μ_3 = (2, 5)^T

Σ_1 = [ 1  1        Σ_2 = [ 1  0        Σ_3 = [ 2  0
        1  2 ]              0  1 ]              0  2 ]
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
Case 4: Σ_i = σ_i²I
Case 5: Σ_i ≠ Σ_j, General Case
Numerical Example
Numerical Example
Derive the discriminant function for the 2-class, 3D classification problem defined by the following Gaussian likelihoods:

μ_1 = (0, 0, 0)^T;   μ_2 = (1, 1, 1)^T;   Σ_1 = Σ_2 = (1/4) I (3x3);   P(ω_2) = 2 P(ω_1)

Solution: Since P(ω_2) = 2 P(ω_1), the priors are P(ω_1) = 1/3 and P(ω_2) = 2/3, and since Σ_1 = Σ_2 = σ²I with σ² = 1/4 we can use the Case 1 discriminant:

g_1(x) = -1/(2σ²) (x - μ_1)^T (x - μ_1) + log(P(ω_1))
       = -1/(2·(1/4)) (x_1² + x_2² + x_3²) + log(1/3)
       = -2 (x_1² + x_2² + x_3²) + log(1/3)

g_2(x) = -1/(2·(1/4)) ((x_1 - 1)² + (x_2 - 1)² + (x_3 - 1)²) + log(2/3)
       = -2 ((x_1 - 1)² + (x_2 - 1)² + (x_3 - 1)²) + log(2/3)

Classify x as ω_1 if g_1(x) > g_2(x):

-2 (x_1² + x_2² + x_3²) + log(1/3) > -2 ((x_1 - 1)² + (x_2 - 1)² + (x_3 - 1)²) + log(2/3)

x_1 + x_2 + x_3 < (6 - log 2)/4 ≈ 1.32

Therefore the decision rule is:

Class(x) = ω_1  if x_1 + x_2 + x_3 < 1.32
           ω_2  if x_1 + x_2 + x_3 > 1.32

Classify the test example x_u = (.1, .7, .8)^T:

.1 + .7 + .8 = 1.6 > 1.32  ⇒  x_u ∈ ω_2
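The result can also be checked numerically by evaluating the discriminants directly; a sketch under the stated assumptions (illustrative names):

import numpy as np

mu1, mu2 = np.zeros(3), np.ones(3)
sigma = 0.25 * np.eye(3)                      # Sigma_1 = Sigma_2 = (1/4) I
p1, p2 = 1.0 / 3.0, 2.0 / 3.0                 # from P(w2) = 2 P(w1)

def g(x, mu, prior):
    # Equal covariances, so the -1/2 log|Sigma| term is common to both classes and omitted
    diff = np.asarray(x, float) - mu
    return -0.5 * diff @ np.linalg.solve(sigma, diff) + np.log(prior)

x_u = np.array([0.1, 0.7, 0.8])
print(g(x_u, mu1, p1), g(x_u, mu2, p2))       # g2 > g1, so x_u is assigned to w2
print(x_u.sum(), (6 - np.log(2)) / 4)         # the feature sum 1.6 is above the threshold, agreeing with the rule above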
Quadratic Classifiers
Bayes classifiers for Normally distributed classes
Case 1: Σ_i = σ²I
Case 2: Σ_i = Σ (Σ diagonal)
Case 3: Σ_i = Σ (Σ non-diagonal)
Case 4: Σ_i = σ_i²I
Case 5: Σ_i ≠ Σ_j, General Case
Numerical Example
Conclusions
Conclusions
We can draw the following conclusions:
The Bayes classifier for normally distributed classes (general case) is a quadratic classifier.
The Bayes classifier for normally distributed classes with equal covariance matrices is a linear classifier.
The minimum Mahalanobis distance classifier is Bayes-optimal for normally distributed classes with equal covariance matrices and equal priors.
The minimum Euclidean distance classifier is Bayes-optimal for normally distributed classes with equal covariance matrices proportional to the identity matrix and equal priors.
Both the Euclidean and Mahalanobis minimum-distance classifiers are linear classifiers.
Some of the most popular classifiers can be derived from decision-theoretic principles and some simplifying assumptions.
Using a specific (Euclidean or Mahalanobis) minimum distance classifier implicitly corresponds to certain statistical assumptions.
We can rarely verify whether these assumptions hold for real problems; in most cases we are limited to asking whether the classifier solves our problem or not.
