MDA3S

Multivariate Data Analysis

LECTURE 3

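Notation
• Random vector: $\mathbf{x} = \begin{pmatrix} \mathrm{x}_1 \\ \vdots \\ \mathrm{x}_p \end{pmatrix}$, or equivalently $\mathbf{x} = (\mathrm{x}_1, \dots, \mathrm{x}_p)'$

• Matrix: $\mathbf{A}$, $\mathrm{X}$

• Random variables: $\mathrm{x}_1$, $\mathrm{y}$ (upright)
• Realisations: $x_1$, $y$ (italic)
• Bivariate case: $(\mathrm{x}, \mathrm{y})'$ is a random vector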
Random vector
• Definition 1.1:

A $p$-dimensional random vector $\mathbf{x} = (\mathrm{x}_1, \dots, \mathrm{x}_p)'$ is a function from a sample space $\Omega$ to the $p$-dimensional real space $\mathbb{R}^p$.

• $\mathbf{x}$ is a vector of random variables

Joint case: Discrete Random Vectors
• Definition 2.1:

Given a discrete bidimensional random vector $(\mathrm{x}, \mathrm{y})'$, its joint probability mass function is the function $f(x, y)$ from $\mathbb{R}^2$ to $\mathbb{R}$ defined as
$$f(x, y) = P(\mathrm{x} = x;\ \mathrm{y} = y)$$

• So, given a subset $A$ of $\mathbb{R}^2$, $P((\mathrm{x}, \mathrm{y})' \in A) = \sum_{(x, y) \in A} f(x, y)$


• Example

Properties
1. $P(\mathrm{x} = x;\ \mathrm{y} = y) \ge 0$.

2. $\sum_x \sum_y P(\mathrm{x} = x;\ \mathrm{y} = y) = 1$.
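The two properties, and the probability of an event as a sum of the joint probability mass function, can be checked on a small table. The sketch below uses a made-up joint pmf (the values, the supports `x_vals` and `y_vals`, and the event A are assumptions chosen for illustration, not taken from the lecture).

```python
import numpy as np

# Hypothetical joint pmf: rows index x in {0, 1}, columns index y in {0, 1, 2}.
x_vals = np.array([0, 1])
y_vals = np.array([0, 1, 2])
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])

# Property 1: every probability is non-negative.
all_nonnegative = bool((joint >= 0).all())

# Property 2: the probabilities sum to one.
total = joint.sum()

# P((x, y)' in A) for the illustrative event A = {(x, y): x + y >= 2}:
# add up f(x, y) over the support points that lie in A.
X, Y = np.meshgrid(x_vals, y_vals, indexing="ij")
in_A = (X + Y) >= 2
prob_A = joint[in_A].sum()

print(all_nonnegative, total, prob_A)
```

Storing the pmf as a 2-D array keeps the sum over an event a simple boolean mask.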

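Marginals: Discrete Random Vectors
• Theorem 2.1:

Let $(\mathrm{x}, \mathrm{y})'$ be a discrete bidimensional random vector with joint probability mass function $f(x, y)$. Then the marginal probability mass functions of $\mathrm{x}$ and $\mathrm{y}$, $f_{\mathrm{x}}(x) = P(\mathrm{x} = x)$ and $f_{\mathrm{y}}(y) = P(\mathrm{y} = y)$ respectively, are

$$f_{\mathrm{x}}(x) = \sum_y f(x, y) \quad \text{and} \quad f_{\mathrm{y}}(y) = \sum_x f(x, y)$$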
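As a minimal sketch of Theorem 2.1, the marginals of a tabulated joint pmf are its row and column sums. The table below is a hypothetical example, not one from the lecture.

```python
import numpy as np

# Hypothetical joint pmf: rows index the values of x, columns the values of y.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])

# f_x(x) = sum over y of f(x, y): sum along the columns (axis=1).
f_x = joint.sum(axis=1)

# f_y(y) = sum over x of f(x, y): sum along the rows (axis=0).
f_y = joint.sum(axis=0)

print(f_x, f_y)
```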
Joint case: Continuous Random Vectors
• Definition 2.2:

A function $f(x, y)$ from $\mathbb{R}^2$ to $\mathbb{R}$ is called a density function of a continuous bidimensional random vector $(\mathrm{x}, \mathrm{y})'$ if, for any subset $A \subset \mathbb{R}^2$,
$$P((\mathrm{x}, \mathrm{y})' \in A) = \iint_A f(x, y)\, dx\, dy$$

• Example

Properties
1. $f(x, y) \ge 0$.

2. $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\, dx\, dy = 1$

3. From the Fundamental Theorem of Calculus, $f(x, y) = \dfrac{\partial^2 F(x, y)}{\partial x\, \partial y}$
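Properties 2 and 3 can be checked numerically for a concrete density. The sketch below assumes the illustrative density f(x, y) = x + y on the unit square (zero elsewhere), whose distribution function on that square is F(x, y) = xy(x + y)/2. Neither function comes from the lecture.

```python
import numpy as np

def f(x, y):
    """Illustrative density (an assumption): x + y on [0, 1]^2, zero elsewhere."""
    inside = (0 <= x) & (x <= 1) & (0 <= y) & (y <= 1)
    return np.where(inside, x + y, 0.0)

def F(x, y):
    """Distribution function of f, valid for 0 <= x, y <= 1."""
    return x * y * (x + y) / 2

# Property 2: midpoint-rule approximation of the double integral over [0, 1]^2.
n = 200
h = 1.0 / n
mid = (np.arange(n) + 0.5) * h
X, Y = np.meshgrid(mid, mid, indexing="ij")
total = f(X, Y).sum() * h * h

# Property 3: central finite difference for the mixed partial of F at (0.3, 0.6),
# which should match f(0.3, 0.6) = 0.9.
x0, y0, d = 0.3, 0.6, 1e-3
mixed = (F(x0 + d, y0 + d) - F(x0 + d, y0 - d)
         - F(x0 - d, y0 + d) + F(x0 - d, y0 - d)) / (4 * d * d)

print(total, mixed)
```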

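Continuous Random Vectors
• Bidimensional distribution function:
$$F(x, y) = P(\mathrm{x} \le x;\ \mathrm{y} \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du$$

• Marginal density functions for $\mathrm{x}$ and $\mathrm{y}$ are
$$f_{\mathrm{x}}(x) = \int_{-\infty}^{+\infty} f(x, y)\, dy \quad \text{and} \quad f_{\mathrm{y}}(y) = \int_{-\infty}^{+\infty} f(x, y)\, dx$$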
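A marginal density is the joint density integrated over the other variable. As a rough sketch, the code below integrates an assumed density f(x, y) = x + y on the unit square over y, which should give f_x(x) = x + 1/2. The density and the helper name `marginal_x` are illustrative assumptions.

```python
import numpy as np

def joint_density(x, y):
    # Illustrative density (an assumption): f(x, y) = x + y on [0, 1]^2.
    return x + y

def marginal_x(x, n=100):
    """Integrate the joint density over y in [0, 1] with the midpoint rule."""
    h = 1.0 / n
    y_mid = (np.arange(n) + 0.5) * h
    return joint_density(x, y_mid).sum() * h

fx_at_03 = marginal_x(0.3)  # closed form for this density: x + 1/2
print(fx_at_03)
```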

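Conditional distributions: discrete case
• Definition 3.1:
Let $(\mathrm{x}, \mathrm{y})'$ be a discrete bidimensional random vector with joint probability mass function $f(x, y)$ and marginal probability mass functions $f_{\mathrm{x}}(x)$ and $f_{\mathrm{y}}(y)$. For any $x$ such that $P(\mathrm{x} = x) = f_{\mathrm{x}}(x) > 0$, the conditional probability mass function of $\mathrm{y}$ given $\mathrm{x} = x$ is the function of $y$ defined as
$$f(y \mid x) = P(\mathrm{y} = y \mid \mathrm{x} = x) = \frac{f(x, y)}{f_{\mathrm{x}}(x)}$$

• Similarly, for any $y$ such that $f_{\mathrm{y}}(y) > 0$,
$$f(x \mid y) = P(\mathrm{x} = x \mid \mathrm{y} = y) = \frac{f(x, y)}{f_{\mathrm{y}}(y)}$$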
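For a tabulated joint pmf, conditioning amounts to dividing each row (or column) by the corresponding marginal. The table below is a hypothetical example in which every marginal probability is positive, so the division is always defined.

```python
import numpy as np

# Hypothetical joint pmf: rows index the values of x, columns the values of y.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])

f_x = joint.sum(axis=1)  # marginal pmf of x
f_y = joint.sum(axis=0)  # marginal pmf of y

# Row i holds f(y | x = x_i) = f(x_i, y) / f_x(x_i).
cond_y_given_x = joint / f_x[:, None]

# Column j holds f(x | y = y_j) = f(x, y_j) / f_y(y_j).
cond_x_given_y = joint / f_y[None, :]

print(cond_y_given_x[0])
```

Each conditional pmf is itself a pmf: the rows of `cond_y_given_x` and the columns of `cond_x_given_y` sum to one.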
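Independence of random variables
• Definition 3.2:
Let $(\mathrm{x}, \mathrm{y})'$ be a bidimensional random vector with joint probability mass or density function $f(x, y)$ and marginals $f_{\mathrm{x}}(x)$ and $f_{\mathrm{y}}(y)$. Then $\mathrm{x}$ and $\mathrm{y}$ are independent random variables if, for all $x \in \mathbb{R}$ and $y \in \mathbb{R}$,
$$f(x, y) = f_{\mathrm{x}}(x)\, f_{\mathrm{y}}(y)$$

• If $\mathrm{x}$ and $\mathrm{y}$ are independent, then $f(y \mid x) = f_{\mathrm{y}}(y)$ whenever $f_{\mathrm{x}}(x) > 0$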
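For a discrete table, the factorisation in Definition 3.2 says the joint pmf equals the outer product of its marginals. The sketch below tests this on two hypothetical tables. The helper name `is_independent` and the tolerance are assumptions.

```python
import numpy as np

def is_independent(joint, tol=1e-12):
    """Check f(x, y) = f_x(x) * f_y(y) for every cell of a joint pmf table."""
    f_x = joint.sum(axis=1)
    f_y = joint.sum(axis=0)
    return bool(np.allclose(joint, np.outer(f_x, f_y), atol=tol))

# Hypothetical table in which x and y are dependent.
dependent = np.array([[0.10, 0.20, 0.10],
                      [0.25, 0.15, 0.20]])

# Table built as a product of marginals, hence independent by construction.
independent = np.outer([0.4, 0.6], [0.35, 0.35, 0.30])

print(is_independent(dependent), is_independent(independent))
```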

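Expectation
• For a function $g(\mathrm{x}, \mathrm{y})$ of a continuous random vector with density $f(x, y)$:
$$E[g(\mathrm{x}, \mathrm{y})] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f(x, y)\, dx\, dy$$

Properties:
1. $E[k] = k$ for any constant $k$.
2. $E[a\, g(\mathrm{x}, \mathrm{y}) + b\, h(\mathrm{x}, \mathrm{y})] = a\, E[g(\mathrm{x}, \mathrm{y})] + b\, E[h(\mathrm{x}, \mathrm{y})]$.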
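Both properties can be verified on a discrete example, where the expectation is a weighted sum over the support. The joint pmf, the functions g and h, and the constants a, b, k below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical joint pmf: rows index x in {0, 1}, columns index y in {0, 1, 2}.
x_vals = np.array([0, 1])
y_vals = np.array([0, 1, 2])
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])
X, Y = np.meshgrid(x_vals, y_vals, indexing="ij")

def expect(values):
    """E[g(x, y)]: sum of g(x, y) * f(x, y) over the support."""
    return float((values * joint).sum())

g = X * Y   # g(x, y) = x * y
h = X + Y   # h(x, y) = x + y
a, b, k = 2.0, -3.0, 5.0

# Property 1: the expectation of a constant is the constant.
E_const = expect(np.full(joint.shape, k))

# Property 2: linearity of expectation.
lhs = expect(a * g + b * h)
rhs = a * expect(g) + b * expect(h)

print(E_const, lhs, rhs)
```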

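Mean vector
$$\boldsymbol{\mu} = E\begin{pmatrix} \mathrm{x} \\ \mathrm{y} \end{pmatrix} = \begin{pmatrix} E[\mathrm{x}] \\ E[\mathrm{y}] \end{pmatrix}, \quad E[\mathrm{x}] = \int_{-\infty}^{+\infty} x\, f_{\mathrm{x}}(x)\, dx, \quad E[\mathrm{y}] = \int_{-\infty}^{+\infty} y\, f_{\mathrm{y}}(y)\, dy$$

• Sums rather than integrals for the discrete case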

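Covariance matrix

• Covariance matrix defined as
$$\boldsymbol{\Sigma} = \begin{pmatrix} \mathrm{Var}(\mathrm{x}) & \mathrm{Cov}(\mathrm{x}, \mathrm{y}) \\ \mathrm{Cov}(\mathrm{x}, \mathrm{y}) & \mathrm{Var}(\mathrm{y}) \end{pmatrix}, \quad \text{where } \mathrm{Cov}(\mathrm{x}, \mathrm{y}) = E[(\mathrm{x} - E[\mathrm{x}])(\mathrm{y} - E[\mathrm{y}])]$$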
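As a minimal sketch, the mean vector and covariance matrix of a discrete bivariate vector can be computed as probability-weighted sums over the support. The joint pmf below is a hypothetical example, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical joint pmf: rows index x in {0, 1}, columns index y in {0, 1, 2}.
x_vals = np.array([0, 1])
y_vals = np.array([0, 1, 2])
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.15, 0.20]])

# Flatten the support into a list of points (x, y) with matching weights f(x, y).
X, Y = np.meshgrid(x_vals, y_vals, indexing="ij")
points = np.stack([X.ravel(), Y.ravel()], axis=1)  # shape (6, 2)
weights = joint.ravel()

# Mean vector: weighted average of the support points.
mu = weights @ points

# Covariance matrix: weighted average of (point - mu)(point - mu)'.
centered = points - mu
Sigma = (centered * weights[:, None]).T @ centered

print(mu)
print(Sigma)
```

The diagonal of `Sigma` holds the variances of x and y, and the off-diagonal entry is their covariance.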

Covariance matrix: Properties

Correlation coefficient:
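• Pearson’s linear correlation coefficient:
$$\rho_{\mathrm{x}\mathrm{y}} = \frac{\mathrm{Cov}(\mathrm{x}, \mathrm{y})}{\sqrt{\mathrm{Var}(\mathrm{x})\, \mathrm{Var}(\mathrm{y})}}$$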

• Properties:

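$p$-dimensional random vectors
For a $p$-dimensional random vector $\mathbf{x} = (\mathrm{x}_1, \dots, \mathrm{x}_p)'$
• Mean
$$\boldsymbol{\mu} = E[\mathbf{x}] = (E[\mathrm{x}_1], \dots, E[\mathrm{x}_p])'$$

• Covariance matrix
$$\boldsymbol{\Sigma} = E[(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})']$$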

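$p$-dimensional random vectors
• Correlation matrix: $\mathbf{R} = (\rho_{ij})$, the $p \times p$ matrix of pairwise correlation coefficients, with ones on the diagonal

• Obtain from the covariance matrix $\boldsymbol{\Sigma} = (\sigma_{ij})$ as
$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\, \sigma_{jj}}}$$

• Matrix notation:
$$\mathbf{R} = \mathbf{D}^{-1/2}\, \boldsymbol{\Sigma}\, \mathbf{D}^{-1/2}$$
where $\mathbf{D} = \mathrm{diag}(\sigma_{11}, \dots, \sigma_{pp})$ is the diagonal matrix of variances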
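Converting a covariance matrix to a correlation matrix amounts to rescaling by the standard deviations on both sides. The sketch below does this for a made-up 3×3 covariance matrix. The matrix values and the names `D_inv_sqrt` and `R` are assumptions for illustration.

```python
import numpy as np

# Hypothetical positive-definite covariance matrix for p = 3.
Sigma = np.array([[4.0,  2.0,  0.0],
                  [2.0,  9.0, -1.5],
                  [0.0, -1.5,  1.0]])

# Diagonal matrix with entries 1 / sqrt(sigma_ii), the inverse standard deviations.
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))

# Correlation matrix: rescale rows and columns of Sigma.
R = D_inv_sqrt @ Sigma @ D_inv_sqrt

print(R)
```

Entry (i, j) of `R` equals the covariance of components i and j divided by the product of their standard deviations, so the diagonal is all ones.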

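$p$-dimensional random vectors
• If the components $\mathrm{x}_i$ of $\mathbf{x}$ are independent random variables, then the covariance matrix is diagonal, $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_{11}, \dots, \sigma_{pp})$, and the correlation matrix equals $\mathbf{I}_p$

• $\mathbf{I}_p$ is the $p$-dimensional identity matrix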

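$p$-dimensional random vectors
• Theorem 5.1:
Let $\mathbf{x}$ be a $p$-dimensional random vector with mean $\boldsymbol{\mu}_{\mathbf{x}}$ and covariance matrix $\boldsymbol{\Sigma}_{\mathbf{x}}$.
For the random vector $\mathbf{y} = \mathbf{A}\mathbf{x}$ (dimension $m \le p$) with matrix $\mathbf{A}_{(m \times p)}$,

Mean: $\boldsymbol{\mu}_{\mathbf{y}} = \mathbf{A}\boldsymbol{\mu}_{\mathbf{x}}$
Covariance matrix: $\boldsymbol{\Sigma}_{\mathbf{y}} = \mathbf{A}\, \boldsymbol{\Sigma}_{\mathbf{x}}\, \mathbf{A}'$.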
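The same identities hold exactly for sample means and sample covariance matrices, which gives a simple numerical check of Theorem 5.1. The sketch below uses simulated data. The sample size, the mixing matrix and the matrix `A` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3

# Simulated observations of x (one per row), with correlated components.
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 2.0]])
samples = rng.normal(size=(n, p)) @ mixing

# Hypothetical transformation matrix A with m = 2 <= p = 3.
A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 1.0]])

# Sample mean and covariance of x.
mu_x = samples.mean(axis=0)
Sigma_x = np.cov(samples, rowvar=False)

# Transform every observation: row i of y is A @ x_i.
y = samples @ A.T
mu_y = y.mean(axis=0)
Sigma_y = np.cov(y, rowvar=False)

print(np.allclose(mu_y, A @ mu_x), np.allclose(Sigma_y, A @ Sigma_x @ A.T))
```

Because the transformation is linear, the agreement is exact up to floating-point rounding and does not depend on the sample size.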
