MVDA Problem Set 1


MULTIVARIATE DATA ANALYSIS

Lesson 1 – Multivariate distributions

1) The joint probability mass function of a certain random vector (x, y)’ is given by

P(x = x, y = y) = c(x + y),   x = 1, 2, 3, 4; y = 1, 2, 3.
Determine:

a) The value of c.
b) The marginal probability mass functions of x and y.
c) The conditional probability mass functions of x given y = y and of y given x = x.
d) Whether x and y are independent.
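
The following is a minimal sketch of how the quantities in problem 1 could be checked numerically, using exact rational arithmetic. The variable names (`joint`, `p_x`, `p_y`, `independent`) are illustrative and do not come from the problem statement.

```python
from fractions import Fraction

# Support of the random vector as stated in the problem.
xs, ys = range(1, 5), range(1, 4)

# c is the reciprocal of the sum of (x + y) over the support,
# so that the probabilities add up to one.
c = Fraction(1, sum(x + y for x in xs for y in ys))

# Joint pmf and the two marginals.
joint = {(x, y): c * (x + y) for x in xs for y in ys}
p_x = {x: sum(joint[x, y] for y in ys) for x in xs}
p_y = {y: sum(joint[x, y] for x in xs) for y in ys}

# Conditional pmf of x given y = 1, as one example of part (c).
p_x_given_y1 = {x: joint[x, 1] / p_y[1] for x in xs}

# Independence holds only if the joint factorises at every point.
independent = all(joint[x, y] == p_x[x] * p_y[y] for x in xs for y in ys)

print("c =", c)
print("independent:", independent)
```

Working with `Fraction` avoids rounding, so the factorisation test in the last step is an exact comparison rather than a tolerance check.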

2) Let x and y be continuous random variates with joint density


f(x, y) = c(x² + y²),   0 ≤ x ≤ 1; 0 ≤ y ≤ 1

a) Obtain the value of c.
b) Determine the marginal and conditional density functions.
c) Obtain P(x < 1/2, y > 1/2).
d) Are x and y independent?

3) Let (x, y)’ be a random vector with joint density function

f(x, y) = 1/2,   0 ≤ x ≤ y ≤ 2

Obtain: a) Marginal density functions, b) Conditional density functions.

4) The probability distribution of the number of occupied spaces in two adjacent parking
areas is given by the following table (joint probability mass function):

x \ y   0      1      2      3
0       0.15   0.1    0      0
1       0.1    0.2    0.1    0
2       0      0.1    0.15   0.05
3       0      0      0.05   0

a) Determine the univariate distributions of the number of occupied spaces at each
parking lot.
b) Compute the expectation and variance of the number of occupied spaces
considering both parking lots.
c) Compute the linear correlation between the number of occupied spaces in each
parking area.
d) Determine the mean vector and covariance matrix of the joint distribution.

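
As a hedged illustration of the computations asked for in problem 4, the sketch below reads the table with rows as x and columns as y (as the header suggests) and derives the marginals, moments and correlation with exact fractions. The helper `expect` and all variable names are assumptions made for this example.

```python
from fractions import Fraction
from math import sqrt

# Joint pmf from the table: rows are x = 0..3, columns are y = 0..3.
table = [
    ["0.15", "0.1", "0",    "0"],
    ["0.1",  "0.2", "0.1",  "0"],
    ["0",    "0.1", "0.15", "0.05"],
    ["0",    "0",   "0.05", "0"],
]
joint = {(x, y): Fraction(p)
         for x, row in enumerate(table)
         for y, p in enumerate(row)}

# Marginal (univariate) distributions.
p_x = [sum(joint[x, y] for y in range(4)) for x in range(4)]
p_y = [sum(joint[x, y] for x in range(4)) for y in range(4)]

def expect(g):
    """Expectation of g(x, y) under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x ** 2
var_y = expect(lambda x, y: y * y) - mean_y ** 2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y

# Linear correlation, and moments of the total x + y.
corr = float(cov_xy) / sqrt(var_x * var_y)
mean_total = mean_x + mean_y
var_total = var_x + var_y + 2 * cov_xy

mean_vector = [mean_x, mean_y]
cov_matrix = [[var_x, cov_xy], [cov_xy, var_y]]
print(mean_vector, cov_matrix, round(corr, 4))
```

The variance of the total uses Var(x + y) = Var(x) + Var(y) + 2 Cov(x, y), which is needed here because the two counts are not assumed independent.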
5) An electricity line develops a fault when tension exceeds line capacity. Considering
that tension and capacity follow normal distributions N(100, 400) and N(140, 100),
respectively, find the probability that the line fails assuming that tension and capacity
vary independently.
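
A possible numerical check for problem 5 is sketched below. It assumes that the second parameter in N(μ, ·) is the variance, which the problem statement does not say explicitly; if it denotes the standard deviation, the two `var_*` values must be squared. The variable names are illustrative.

```python
from math import erfc, sqrt

# ASSUMPTION: N(mean, variance) parametrisation.
mu_t, var_t = 100.0, 400.0   # tension
mu_c, var_c = 140.0, 100.0   # capacity

# The line fails when tension - capacity > 0. For independent normals the
# difference is normal with mean mu_t - mu_c and variance var_t + var_c.
mu_d = mu_t - mu_c
sd_d = sqrt(var_t + var_c)

# P(D > 0) = 1 - Phi(z) with z = (0 - mu_d) / sd_d, and
# 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
z = (0.0 - mu_d) / sd_d
p_fail = 0.5 * erfc(z / sqrt(2.0))

print(f"z = {z:.4f}, P(failure) = {p_fail:.4f}")
```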

6) Let y1, y2, y3 be measurements of the tension of three independent components forming
a circuit, with variances 1, 2 and 3, respectively. Two indices were proposed to measure
the overall tension of the circuit:

z1 = 3y1 + 2y2 + 5y3
z2 = (1/3)y1 + (1/3)y2 + (1/3)y3

Find the correlation between z1 and z2.
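
For independent components, the variance and covariance of linear combinations reduce to weighted sums of the individual variances. The sketch below applies that rule to the two indices of problem 6; the lists `a` and `b` (the coefficient vectors) and the other names are assumptions introduced for the example.

```python
from fractions import Fraction
from math import sqrt

variances = [1, 2, 3]                 # Var(y1), Var(y2), Var(y3)
a = [3, 2, 5]                         # coefficients of z1
b = [Fraction(1, 3)] * 3              # coefficients of z2

# With independent y's all cross-covariances vanish, so
# Var(sum a_i y_i) = sum a_i^2 Var(y_i) and
# Cov(sum a_i y_i, sum b_i y_i) = sum a_i b_i Var(y_i).
var_z1 = sum(ai * ai * v for ai, v in zip(a, variances))
var_z2 = sum(bi * bi * v for bi, v in zip(b, variances))
cov_z = sum(ai * bi * v for ai, bi, v in zip(a, b, variances))

rho = float(cov_z) / sqrt(var_z1 * var_z2)
print(var_z1, var_z2, cov_z, round(rho, 4))
```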

7) Let x be a continuous random variable with density function 𝑓(𝑥) = 6𝑥(1 − 𝑥) for 0 <
𝑥 < 1. Obtain the density function and the variance of 𝑦 = 3𝑥 + 2.

8) Let (x, y)’ be a random vector with density function

𝑓(𝑥, 𝑦) = 2; 0 ≤ 𝑥 ≤ 1, 0 ≤ 𝑦 ≤ 1, 𝑥 + 𝑦 ≤ 1.

a) Obtain the marginal density functions of x and y.
b) Obtain the conditional density function 𝑓(𝑦|𝑥) and compute the probability
𝑃(1/2 ≤ y ≤ 7/8 | x = 1/3).

9) Given x ~ 𝑁(0,1) and y ~ 𝑁(1,1), obtain the distribution of the random variate
z = (x + y)/3 + 2

assuming that x and y are independent random variates.

10) Let (x, y)’ be a random vector with density function 𝑓(𝑥, 𝑦) = 24𝑦(1 − 𝑥 − 𝑦) in the
region delimited by the straight lines {𝑥 = 0; 𝑦 = 0; 𝑥 + 𝑦 = 1}.

a) Obtain the marginal density functions of x and y.
b) Obtain the conditional density functions 𝑓(𝑥|𝑦) and 𝑓(𝑦|𝑥).

11) Two points a and b are randomly chosen on the real line such that −2 ≤ 𝑏 ≤ 0 and
0 ≤ 𝑎 ≤ 3. Compute the probability that they are more than 3 units apart.
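
A rough Monte Carlo check for problem 11 could look as follows. It assumes that "randomly chosen" means a and b are independent and uniformly distributed on their intervals, which the statement does not spell out; the function name `estimate_far_apart` and its parameters are hypothetical.

```python
import random

def estimate_far_apart(n_samples=200_000, seed=0):
    """Estimate P(|a - b| > 3) by simulation.

    ASSUMPTION: a ~ Uniform[0, 3] and b ~ Uniform[-2, 0], independent.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        a = rng.uniform(0.0, 3.0)
        b = rng.uniform(-2.0, 0.0)
        if abs(a - b) > 3.0:
            hits += 1
    return hits / n_samples

estimate = estimate_far_apart()
print(f"estimated probability: {estimate:.3f}")
```

Under the same uniformity assumption, the area argument on the rectangle [0, 3] × [−2, 0] gives 1/3, so the estimate should land close to that value.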

12) The multinomial distribution is a generalisation of the binomial distribution which
models the number of outcomes falling into each of k mutually exclusive categories
out of n identical and independent trials with probabilities 𝑝1, …, 𝑝𝑘. The probability mass
function is given by

𝑃(x1 = 𝑥1, …, x𝑘 = 𝑥𝑘) = [𝑛! / (𝑥1! ⋯ 𝑥𝑘!)] 𝑝1^𝑥1 ⋯ 𝑝𝑘^𝑥𝑘,

where 𝑥𝑖 ∈ {0, …, 𝑛} and ∑_{𝑖=1}^{𝑘} 𝑥𝑖 = 𝑛, with 𝑛 > 0 and ∑_{𝑖=1}^{𝑘} 𝑝𝑖 = 1. Given a collection
of independent random variates 𝑦1, …, 𝑦𝑘 such that 𝑦𝑖 ~ 𝑃𝑜𝑖𝑠𝑠𝑜𝑛(𝜆𝑖), 𝑖 = 1, 2, …, 𝑘, show
that for any 𝑛 > 0 the distribution of [𝑦1, …, 𝑦𝑘] conditional on ∑_{𝑖=1}^{𝑘} 𝑦𝑖 = 𝑛 is
multinomial with parameters 𝑛 and 𝑝𝑖 = 𝜆𝑖 / ∑_{𝑗=1}^{𝑘} 𝜆𝑗.
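
The claim in problem 12 can be sanity-checked numerically before attempting the proof. The sketch below is not a proof: it picks arbitrary illustrative values (k = 3, rates 1, 2, 3 and n = 4, all assumptions of this example) and compares the conditional Poisson probabilities with the multinomial pmf. It relies on the standard fact that a sum of independent Poisson variates is Poisson with the summed rate.

```python
from itertools import product
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def multinomial_pmf(counts, probs):
    """Multinomial pmf with n = sum(counts) and category probabilities probs."""
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)          # exact: each partial quotient is an integer
    p = float(coef)
    for x, q in zip(counts, probs):
        p *= q ** x
    return p

# Illustrative values only (not from the problem statement).
lams = [1.0, 2.0, 3.0]
n = 4
total = sum(lams)
probs = [lam / total for lam in lams]

max_gap = 0.0
for counts in product(range(n + 1), repeat=len(lams)):
    if sum(counts) != n:
        continue
    joint = 1.0
    for x, lam in zip(counts, lams):
        joint *= poisson_pmf(x, lam)
    # Condition on the sum, which is Poisson(total).
    conditional = joint / poisson_pmf(n, total)
    max_gap = max(max_gap, abs(conditional - multinomial_pmf(counts, probs)))

print("largest discrepancy:", max_gap)
```

A discrepancy at the level of floating-point rounding is consistent with the statement to be shown; the exponential factors cancel in the ratio, which is the key step of the analytical argument.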

13) Two points x and y are randomly chosen from the segment [0,1]. Obtain the
probability of the distance between x and y being greater than 1/4.
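
One way to check problem 13 numerically, assuming x and y are independent and uniform on [0, 1] (an assumption about what "randomly chosen" means), is to integrate over x the length of the set of y values farther than 1/4 from x. The function name `prob_distance_greater` and the midpoint rule used here are choices made for this sketch.

```python
def prob_distance_greater(d, n_steps=1000):
    """P(|x - y| > d) for independent Uniform[0, 1] points (assumed).

    For a fixed x, the admissible y values form the intervals [0, x - d)
    and (x + d, 1], whose total length is integrated with the midpoint rule.
    """
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        length = max(0.0, x - d) + max(0.0, 1.0 - x - d)
        total += length * h
    return total

prob = prob_distance_greater(0.25)
print(f"P(distance > 1/4) = {prob:.4f}")
```

The integrand is piecewise linear with kinks at x = d and x = 1 − d; with 1000 steps and d = 1/4 these fall on cell boundaries, so the midpoint rule is exact up to rounding and agrees with the closed form (1 − d)² = 9/16.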
