
Homework 1

CSE6416, Spring 2020

Due: April 15, 2020

1. Solve the following problems from Chapter 2 of the textbook:

7, 12, 13, 31, 38


2. Suppose that we have three colored boxes r (red), b (blue), and g (green). Box r contains 3
apples, 4 oranges, and 3 limes; box b contains 1 apple, 1 orange, and 0 limes; and box g
contains 3 apples, 3 oranges, and 4 limes. A box is chosen at random with P(r) = 0.2, P(b) =
0.2, and P(g) = 0.6, and a piece of fruit is removed from the box (with equal probability of
selecting any of the items in the box). What is the probability of selecting an apple? If you
observe that the selected fruit is in fact an orange, what is the probability that it came from
box g?
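
For reference, both quantities follow directly from the law of total probability and Bayes' rule:

$$P(\text{apple}) = \sum_{k \in \{r,\,b,\,g\}} P(\text{apple} \mid k)\,P(k), \qquad
P(g \mid \text{orange}) = \frac{P(\text{orange} \mid g)\,P(g)}{\sum_{k} P(\text{orange} \mid k)\,P(k)}.$$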
3. Consider a two-category classification problem with two-dimensional feature vector X = (x₁,
x₂). The two categories are ω₁ and ω₂, with

$$p(X \mid \omega_1) \sim N(\mu_1, \Sigma_1), \qquad p(X \mid \omega_2) \sim N(\mu_2, \Sigma_2), \qquad P(\omega_1) = P(\omega_2) = \frac{1}{2},$$

and

$$\mu_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \mu_2 = \begin{pmatrix} 2 \\ 0 \end{pmatrix}, \qquad \Sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \Sigma_2 = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}.$$

(a) Calculate the Bayes decision boundary.


(b) Randomly draw 50 patterns from each of the two class-conditional densities and plot them
in the two-dimensional feature space. Also draw the decision boundary from (a) on this
plot; a MATLAB sketch for this part appears after part (d).
(c) Calculate the Bhattacharyya error bound.
(d) Generate 1000 additional test patterns from each class and determine the empirical error
rate based on the decision boundary in (3a). Compare the empirical error with the bound
in part (3c).
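
A minimal MATLAB sketch for part (b), assuming the parameters above; it uses the mvnrnd and plot commands suggested in the note at the end, plus mvnpdf (an assumption beyond the note) to trace the boundary numerically:

    % Draw 50 patterns per class and plot them with the Bayes boundary.
    % Requires the Statistics and Machine Learning Toolbox (mvnrnd, mvnpdf).
    mu1 = [0 1];  Sigma1 = [1 0; 0 1];
    mu2 = [2 0];  Sigma2 = [1 1; 1 2];

    X1 = mvnrnd(mu1, Sigma1, 50);   % 50 samples from class omega_1
    X2 = mvnrnd(mu2, Sigma2, 50);   % 50 samples from class omega_2

    plot(X1(:,1), X1(:,2), 'bo'); hold on;
    plot(X2(:,1), X2(:,2), 'r+');

    % With equal priors, the Bayes boundary is where the two class-conditional
    % densities are equal; draw it as the zero level set of their difference.
    [xg, yg] = meshgrid(-4:0.05:6, -5:0.05:5);
    G = [xg(:) yg(:)];
    d = mvnpdf(G, mu1, Sigma1) - mvnpdf(G, mu2, Sigma2);
    contour(xg, yg, reshape(d, size(xg)), [0 0], 'k');
    xlabel('x_1'); ylabel('x_2');

Because Σ₁ ≠ Σ₂, the boundary is quadratic, which is why it is traced numerically rather than drawn as a straight line.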

4. Consider the following class-conditional density function for feature vector X = (x₁, x₂)ᵀ:

$$p(X \mid \omega) \sim N(\mu, \Sigma),$$

where

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}; \qquad \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}; \qquad \sigma_{12} = \sigma_{21}.$$

(a) Write down the expression for the Euclidean distance between point X and mean vector μ.
(b) Write down the expression for the Mahalanobis distance between point X and mean vector
μ. Simplify this expression by expanding the quadratic term.
(c) How does the expression for the Mahalanobis distance simplify when Σ is a diagonal matrix?
(d) Compare the expressions in (4a) and (4b). How does the Mahalanobis distance differ from
the Euclidean distance? When are the two distances equal? When is it more appropriate to
use the Mahalanobis distance? Support your answers with illustrations; a small numerical
sketch follows this list.
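
A small numerical illustration of the contrast in part (d), with invented example values (the covariance matrix and the two points below are arbitrary, not part of the problem): two points at equal Euclidean distance from μ can sit at very different Mahalanobis distances when the features are correlated.

    % Invented example: strongly correlated features.
    mu    = [0; 0];
    Sigma = [1 0.9; 0.9 1];
    xa    = [1;  1];                 % lies along the correlation direction
    xb    = [1; -1];                 % lies against it

    eucl  = @(x) sqrt((x - mu)' * (x - mu));
    mahal = @(x) sqrt((x - mu)' / Sigma * (x - mu));  % (x-mu)' * inv(Sigma) * (x-mu)

    fprintf('Euclidean:   %.3f  %.3f\n', eucl(xa),  eucl(xb));   % equal: 1.414, 1.414
    fprintf('Mahalanobis: %.3f  %.3f\n', mahal(xa), mahal(xb));  % 1.026 vs 4.472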

5. The class-conditional density functions of a binary random variable X for four pattern classes
are shown below:

x    P(x | ω₁)   P(x | ω₂)   P(x | ω₃)   P(x | ω₄)

1    1/3         2/3         3/4         2/5

2    2/3         1/3         1/4         3/5

The loss function L(αᵢ, ωⱼ) is as follows, where action αᵢ means "decide pattern class ωᵢ":

1 2 3 4

1 0 2 3 4

2 1 0 1 8

3 3 2 0 2

4 5 3 1 0

(a) A general decision rule d(x) tells us which action to take for every possible observation x.
Construct a table defining all possible decision rules for the above problem. As an example,
one of the possible decision rules is:

x = 1, take action 1
x = 2, take action 2

(b) Compute the risk function Rd(ω) for all the decision rules, where

$$R_d(\omega) = \sum_{x} L(\omega, d(x))\,p(x \mid \omega).$$

Does a uniformly best rule exist?


(c) Consider the following prior probabilities: P(ω₁) = 1/4, P(ω₂) = 1/4, P(ω₃) = 3/8, P(ω₄) = 1/8.

Compute the average risk for every decision rule as follows:

$$R_d = \sum_{\omega} P(\omega)\,R_d(\omega).$$

Find the optimal Bayes decision rule; a MATLAB sketch for tabulating these risks follows.
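
A minimal MATLAB sketch of one way to tabulate parts (5b) and (5c): with two observations and four actions there are 4² = 16 decision rules, so they can simply be enumerated. The loop structure and output format here are illustrative, not prescribed by the assignment.

    % Enumerate all 16 decision rules; tabulate conditional and average risk.
    P = [1/3 2/3 3/4 2/5;           % P(x=1 | omega_j), j = 1..4
         2/3 1/3 1/4 3/5];          % P(x=2 | omega_j)
    L = [0 2 3 4;                   % L(alpha_i, omega_j), rows are actions
         1 0 1 8;
         3 2 0 2;
         5 3 1 0];
    prior = [1/4 1/4 3/8 1/8];      % P(omega_j)

    for a1 = 1:4                    % action taken when x = 1
      for a2 = 1:4                  % action taken when x = 2
        % Conditional risk Rd(omega_j) = sum_x L(d(x), omega_j) P(x | omega_j)
        R = L(a1,:) .* P(1,:) + L(a2,:) .* P(2,:);
        avgR = prior * R';          % average (Bayes) risk of this rule
        fprintf('x=1 -> a%d, x=2 -> a%d:  Rd = [%s],  avg = %.4f\n', ...
                a1, a2, strtrim(sprintf('%.3f ', R)), avgR);
      end
    end

The rule with the smallest average risk printed by this loop is the Bayes rule for the given priors.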

* Note: You may use MATLAB to generate multivariate Gaussian patterns. Use the
"mvnrnd" (multivariate normal random number generator) and "plot" commands in MATLAB
to generate and plot the data. Type "help mvnrnd" and "help plot" to learn more about these
commands.
