
Introduction to Normal Distribution

Nathaniel E. Helwig

Assistant Professor of Psychology and Statistics


University of Minnesota (Twin Cities)

Updated 17-Jan-2017

Copyright

© 2017 by Nathaniel E. Helwig


Outline of Notes

1) Univariate Normal:
   Distribution form
   Standard normal
   Probability calculations
   Affine transformations
   Parameter estimation

2) Bivariate Normal:
   Distribution form
   Probability calculations
   Affine transformations
   Conditional distributions

3) Multivariate Normal:
   Distribution form
   Probability calculations
   Affine transformations
   Conditional distributions
   Parameter estimation

4) Sampling Distributions:
   Univariate case
   Multivariate case
Univariate Normal



Univariate Normal Distribution Form

Normal Density Function (Univariate)

Given a variable x ∈ R, the normal probability density function (pdf) is

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)} = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}    (1)

where
µ ∈ R is the mean
σ > 0 is the standard deviation (σ^2 is the variance)
e ≈ 2.71828 is the base of the natural logarithm

Write X ∼ N(µ, σ^2) to denote that X follows a normal distribution.

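As a quick check of Equation (1), the density can be evaluated by hand in R and compared with the built-in dnorm function. This is a minimal added sketch; the values of mu, sigma, and the x grid are arbitrary.

# sketch: evaluate Equation (1) directly and compare with dnorm()
mu <- 1; sigma <- 2                       # arbitrary example parameters
x <- seq(-5, 7, length.out = 100)         # grid of x values
f.manual <- (1 / (sigma * sqrt(2 * pi))) * exp(-(x - mu)^2 / (2 * sigma^2))
f.dnorm <- dnorm(x, mean = mu, sd = sigma)
all.equal(f.manual, f.dnorm)              # should be TRUE (up to numerical error)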


Univariate Normal Standard Normal

Standard Normal Distribution


If X ∼ N(0, 1), then X follows a standard normal distribution:
f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}    (2)

[Figure: the standard normal pdf f(x), plotted for x from −4 to 4.]
Univariate Normal Probability Calculations

Probabilities and Distribution Functions

Probabilities relate to the area under the pdf:


P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a)    (3)

where

F(x) = \int_{-\infty}^{x} f(u)\,du    (4)

is the cumulative distribution function (cdf).

Note: F (x) = P(X ≤ x) =⇒ 0 ≤ F (x) ≤ 1

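Equations (3) and (4) are easy to check numerically: pnorm() plays the role of F, and integrate() approximates the area under the pdf. A small added sketch for the standard normal (the endpoints a = −1 and b = 2 are arbitrary):

# sketch: P(a <= X <= b) = F(b) - F(a) for X ~ N(0, 1)
a <- -1; b <- 2
pnorm(b) - pnorm(a)                           # F(b) - F(a)
integrate(dnorm, lower = a, upper = b)$value  # numerical area under the pdf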


Univariate Normal Probability Calculations

Normal Probabilities

Helpful figure of normal probabilities:


[Figure: areas under the normal pdf. About 34.1% of the area lies between µ and 1σ on each side of the mean, 13.6% between 1σ and 2σ, 2.1% between 2σ and 3σ, and roughly 0.1% beyond 3σ.
From http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg]


Univariate Normal Probability Calculations

Normal Distribution Functions (Univariate)

Helpful figures of normal pdfs and cdfs:


[Figures: normal pdfs φ_{µ,σ^2}(x) (left) and cdfs Φ_{µ,σ^2}(x) (right) for (µ = 0, σ^2 = 0.2), (µ = 0, σ^2 = 1.0), (µ = 0, σ^2 = 5.0), and (µ = −2, σ^2 = 0.5), plotted for x from −5 to 5.
From http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg and http://en.wikipedia.org/wiki/File:Normal_Distribution_CDF.svg]

Note that the cdf has an elongated “S” shape, referred to as an ogive.



Univariate Normal Affine Transformations

Affine Transformations of Normal (Univariate)

Suppose that X ∼ N(µ, σ^2) and a, b ∈ R with a ≠ 0.

If we define Y = aX + b, then Y ∼ N(aµ + b, a^2 σ^2).

Suppose that X ∼ N(1, 2). Determine the distributions of...


Y = X + 3   =⇒  Y ∼ N(1(1) + 3, 1^2(2)) ≡ N(4, 2)
Y = 2X + 3  =⇒  Y ∼ N(2(1) + 3, 2^2(2)) ≡ N(5, 8)
Y = 3X + 2  =⇒  Y ∼ N(3(1) + 2, 3^2(2)) ≡ N(5, 18)

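The answers above can be sanity-checked by simulation; a minimal added sketch for the second case, Y = 2X + 3:

# sketch: simulate X ~ N(1, 2) and check that Y = 2X + 3 is approximately N(5, 8)
set.seed(1)
x <- rnorm(1e5, mean = 1, sd = sqrt(2))
y <- 2 * x + 3
mean(y)   # should be close to 5
var(y)    # should be close to 8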


Univariate Normal Parameter Estimation

Likelihood Function

Suppose that x = (x_1, . . . , x_n) is an iid sample of data from a normal
distribution with mean µ and variance σ^2, i.e., x_i ∼ N(µ, σ^2).

The likelihood function for the parameters (given the data) has the form
L(µ, σ^2 | x) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x_i - \mu)^2}{2\sigma^2} \right\}

and the log-likelihood function is given by


n n
LL(µ, σ^2 | x) = \sum_{i=1}^{n} \log(f(x_i)) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2

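The log-likelihood translates directly into R; a small added sketch, using the fact that dnorm(..., log = TRUE) returns log f(x_i):

# sketch: univariate normal log-likelihood, two equivalent versions
norm_ll <- function(mu, sigma2, x) {
  sum(dnorm(x, mean = mu, sd = sqrt(sigma2), log = TRUE))
}
norm_ll2 <- function(mu, sigma2, x) {          # closed form from the slide
  n <- length(x)
  -n/2 * log(2 * pi) - n/2 * log(sigma2) - sum((x - mu)^2) / (2 * sigma2)
}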


Univariate Normal Parameter Estimation

Maximum Likelihood Estimate of the Mean

The MLE of the mean is the value of µ that minimizes


\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} x_i^2 - 2n\bar{x}\mu + n\mu^2

where \bar{x} = (1/n)\sum_{i=1}^{n} x_i is the sample mean.

Taking the derivative with respect to µ we find that

\frac{\partial \sum_{i=1}^{n}(x_i - \mu)^2}{\partial \mu} = -2n\bar{x} + 2n\mu \quad\longleftrightarrow\quad \bar{x} = \hat{\mu}

i.e., the sample mean x̄ is the MLE of the population mean µ.



Univariate Normal Parameter Estimation

Maximum Likelihood Estimate of the Variance


The MLE of the variance is the value of σ^2 that minimizes

\frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 = \frac{n}{2}\log(\sigma^2) + \frac{\sum_{i=1}^{n} x_i^2}{2\sigma^2} - \frac{n\bar{x}^2}{2\sigma^2}

where \bar{x} = (1/n)\sum_{i=1}^{n} x_i is the sample mean.

Taking the derivative with respect to σ^2 we find that

\frac{\partial\left[ \frac{n}{2}\log(\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \hat{\mu})^2 \right]}{\partial \sigma^2} = \frac{n}{2\sigma^2} - \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \hat{\mu})^2

which implies that the sample variance \hat{\sigma}^2 = (1/n)\sum_{i=1}^{n}(x_i - \bar{x})^2 is the
MLE of the population variance σ^2.
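The closed-form MLEs can be verified against a direct numerical maximization of the log-likelihood. An added sketch using simulated data and optim() (the optimization is over log(σ) to keep the standard deviation positive):

# sketch: closed-form MLEs vs. numerical maximum likelihood
set.seed(1)
x <- rnorm(200, mean = 3, sd = 2)
mu.hat <- mean(x)                  # MLE of mu
sig2.hat <- mean((x - mu.hat)^2)   # MLE of sigma^2 (divides by n, not n - 1)
negll <- function(par) {
  mu <- par[1]; sigma <- exp(par[2])   # par[2] is log(sigma)
  -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}
opt <- optim(c(0, 0), negll)
c(mu.hat, sig2.hat)
c(opt$par[1], exp(opt$par[2])^2)   # should be close to the closed-form MLEs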
Bivariate Normal



Bivariate Normal Distribution Form

Normal Density Function (Bivariate)

Given two variables x, y ∈ R, the bivariate normal pdf is


f(x, y) = \frac{\exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right\}}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}    (5)

where
µx ∈ R and µy ∈ R are the marginal means
σx ∈ R+ and σy ∈ R+ are the marginal standard deviations
0 ≤ |ρ| < 1 is the correlation coefficient

X and Y are marginally normal: X ∼ N(µ_x, σ_x^2) and Y ∼ N(µ_y, σ_y^2)

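Equation (5) can be transcribed directly into R. A minimal added sketch (the default parameter values match the example on the next slide):

# sketch: bivariate normal density from Equation (5)
dbvnorm <- function(x, y, mux = 0, muy = 0, sx = 1, sy = sqrt(2), rho = 0.6 / sqrt(2)) {
  z <- (x - mux)^2 / sx^2 + (y - muy)^2 / sy^2 -
       2 * rho * (x - mux) * (y - muy) / (sx * sy)
  exp(-z / (2 * (1 - rho^2))) / (2 * pi * sx * sy * sqrt(1 - rho^2))
}
dbvnorm(0, 0)   # density at the mean for the default parameters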


Bivariate Normal Distribution Form

Example: µ_x = µ_y = 0, σ_x^2 = 1, σ_y^2 = 2, ρ = 0.6/√2

[Figure: perspective and contour plots of the bivariate normal density f(x, y) for these parameters, with x and y from −4 to 4.
From http://en.wikipedia.org/wiki/File:MultivariateNormal.png]
Bivariate Normal Distribution Form

Example: Different Means

[Figure: contour plots of the bivariate normal density for three different mean vectors: (µ_x = 0, µ_y = 0), (µ_x = 1, µ_y = 2), and (µ_x = −1, µ_y = −1), with x and y from −4 to 4.]

Note: for all three plots σ_x^2 = 1, σ_y^2 = 2, and ρ = 0.6/√2.


Bivariate Normal Distribution Form

Example: Different Correlations

[Figure: contour plots of the bivariate normal density for three different correlations: ρ = −0.6/√2, ρ = 0, and ρ = 1.2/√2, with x and y from −4 to 4.]

Note: for all three plots µ_x = µ_y = 0, σ_x^2 = 1, and σ_y^2 = 2.


Bivariate Normal Distribution Form

Example: Different Variances

[Figure: contour plots of the bivariate normal density for three different values of σ_y (panels labeled σ_y = 1, σ_y = 2, and σ_y = 2), with x and y from −4 to 4.]

Note: for all three plots µ_x = µ_y = 0, σ_x^2 = 1, and ρ = 0.6/(σ_x σ_y).


Bivariate Normal Probability Calculations

Probabilities and Multiple Integration

Probabilities still relate to the area under the pdf:


P(a_x \le X \le b_x \text{ and } a_y \le Y \le b_y) = \int_{a_x}^{b_x} \int_{a_y}^{b_y} f(x, y)\,dy\,dx    (6)

where \iint f(x, y)\,dy\,dx denotes the multiple integral of the pdf f(x, y).

Defining z = (x, y ), we can still define the cdf:

F(z) = P(X \le x \text{ and } Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du    (7)

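Equation (6) can be approximated by nesting one-dimensional numerical integrals. An added sketch with ρ = 0, so the joint pdf factors and the exact answer is also available from pnorm():

# sketch: P(0 <= X <= 1 and 0 <= Y <= 2) for independent standard normals
f <- function(x, y) dnorm(x) * dnorm(y)        # joint pdf when rho = 0
inner <- function(x) sapply(x, function(xx) integrate(function(y) f(xx, y), 0, 2)$value)
integrate(inner, 0, 1)$value                   # nested numerical integration
(pnorm(1) - pnorm(0)) * (pnorm(2) - pnorm(0))  # exact answer in the independent case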


Bivariate Normal Probability Calculations

Normal Distribution Functions (Bivariate)


Helpful figures of bivariate normal pdf and cdf:

[Figure: contour plots of the bivariate normal pdf f(x, y) (left) and cdf F(x, y) (right), with x and y from −4 to 4.]

Note: µ_x = µ_y = 0, σ_x^2 = 1, σ_y^2 = 2, and ρ = 0.6/√2

Note that the cdf still has an ogive shape (now in two dimensions).
Bivariate Normal Affine Transformations

Affine Transformations of Normal (Bivariate)

Given z = (x, y)', suppose that z ∼ N(µ, Σ) where

µ = (µ_x, µ_y)' is the 2 × 1 mean vector
Σ = \begin{pmatrix} \sigma_x^2 & \rho\sigma_x\sigma_y \\ \rho\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} is the 2 × 2 covariance matrix

Let A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} and b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} with A ≠ 0_{2×2}.

If we define w = Az + b, then w ∼ N(Aµ + b, AΣA').



Bivariate Normal Conditional Distributions

Conditional Normal (Bivariate)

The conditional distribution of a variable Y given X = x is

f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)}    (8)

where
fXY (x, y ) is the joint pdf of X and Y
fX (x) is the marginal pdf of X

In the bivariate normal case, we have that

Y | X ∼ N(µ_*, σ_*^2)    (9)

where µ_* = µ_y + ρ(σ_y/σ_x)(x − µ_x) and σ_*^2 = σ_y^2(1 − ρ^2)

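Equation (9) amounts to two one-line functions in R. A minimal added sketch; the example call uses the parameter values that appear in Example #1, Part (b) later in the notes:

# sketch: conditional mean and variance of Y given X = x, from Equation (9)
cond_mean <- function(x, mux, muy, sx, sy, rho) muy + rho * (sy / sx) * (x - mux)
cond_var  <- function(sy, rho) sy^2 * (1 - rho^2)
cond_mean(80, mux = 70, muy = 60, sx = 10, sy = 15, rho = 0.6)   # 69
cond_var(sy = 15, rho = 0.6)                                     # 144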


Bivariate Normal Conditional Distributions

Derivation of Conditional Normal


To prove Equation (9), simply write out the definition and simplify:
f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)}

= \frac{\exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] \right\} \big/ \left( 2\pi\sigma_x\sigma_y\sqrt{1-\rho^2} \right)}{\exp\left\{ -\frac{(x-\mu_x)^2}{2\sigma_x^2} \right\} \big/ \left( \sigma_x\sqrt{2\pi} \right)}

= \frac{\exp\left\{ -\frac{1}{2(1-\rho^2)}\left[ \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} \right] + \frac{(x-\mu_x)^2}{2\sigma_x^2} \right\}}{\sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2}}

= \frac{\exp\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)}\left[ \rho^2\frac{\sigma_y^2}{\sigma_x^2}(x-\mu_x)^2 + (y-\mu_y)^2 - 2\rho\frac{\sigma_y}{\sigma_x}(x-\mu_x)(y-\mu_y) \right] \right\}}{\sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2}}

= \frac{\exp\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)}\left[ y - \mu_y - \rho\frac{\sigma_y}{\sigma_x}(x-\mu_x) \right]^2 \right\}}{\sqrt{2\pi}\,\sigma_y\sqrt{1-\rho^2}}

which completes the proof.



Bivariate Normal Conditional Distributions

Statistical Independence for Bivariate Normal


Two variables X and Y are statistically independent if

fXY (x, y ) = fX (x)fY (y ) (10)

where f_{XY}(x, y) is the joint pdf, and f_X(x) and f_Y(y) are the marginal pdfs.

Note that if X and Y are independent, then

f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{f_X(x) f_Y(y)}{f_X(x)} = f_Y(y)    (11)

so conditioning on X = x does not change the distribution of Y .

If X and Y are bivariate normal, what is the necessary and sufficient
condition for X and Y to be independent? Hint: see Equation (9).
Bivariate Normal Conditional Distributions

Example #1
A statistics class takes two exams X (Exam 1) and Y (Exam 2) where
the scores follow a bivariate normal distribution with parameters:
µx = 70 and µy = 60 are the marginal means
σx = 10 and σy = 15 are the marginal standard deviations
ρ = 0.6 is the correlation coefficient

Suppose we select a student at random. What is the probability that. . .


(a) the student scores over 75 on Exam 2?
(b) the student scores over 75 on Exam 2, given that the student
scored X = 80 on Exam 1?
(c) the sum of his/her Exam 1 and Exam 2 scores is over 150?
(d) the student did better on Exam 1 than Exam 2?
(e) P(5X − 4Y > 150)?
Bivariate Normal Conditional Distributions

Example #1: Part (a)


Answer for 1(a):
Note that Y ∼ N(60, 15^2), so the probability that the student scores
over 75 on Exam 2 is

P(Y > 75) = P(Z > (75 − 60)/15)
          = P(Z > 1)
          = 1 − P(Z < 1)
          = 1 − Φ(1)
          = 1 − 0.8413447
          = 0.1586553

where Φ(x) = \int_{-\infty}^{x} f(z)\,dz with f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} denoting the standard
normal pdf (see R code for use of pnorm to calculate this quantity).
Bivariate Normal Conditional Distributions

Example #1: Part (b)


Answer for 1(b):
Note that (Y | X = 80) ∼ N(µ_*, σ_*^2) where
µ_* = µ_Y + ρ(σ_Y/σ_X)(x − µ_X) = 60 + (0.6)(15/10)(80 − 70) = 69
σ_*^2 = σ_Y^2(1 − ρ^2) = 15^2(1 − 0.6^2) = 144

If a student scored X = 80 on Exam 1, the probability that the student
scores over 75 on Exam 2 is

P(Y > 75 | X = 80) = P(Z > (75 − 69)/12)
                   = P(Z > 0.5)
                   = 1 − Φ(0.5)
                   = 1 − 0.6914625
                   = 0.3085375



Bivariate Normal Conditional Distributions

Example #1: Part (c)


Answer for 1(c):
Note that (X + Y) ∼ N(µ_*, σ_*^2) where
µ_* = µ_X + µ_Y = 70 + 60 = 130
σ_*^2 = σ_X^2 + σ_Y^2 + 2ρσ_Xσ_Y = 10^2 + 15^2 + 2(0.6)(10)(15) = 505

The probability that the sum of Exam 1 and Exam 2 is above 150 is

P(X + Y > 150) = P(Z > (150 − 130)/√505)
               = P(Z > 0.8899883)
               = 1 − Φ(0.8899883)
               = 1 − 0.8132639
               = 0.1867361



Bivariate Normal Conditional Distributions

Example #1: Part (d)


Answer for 1(d):
Note that (X − Y) ∼ N(µ_*, σ_*^2) where
µ_* = µ_X − µ_Y = 70 − 60 = 10
σ_*^2 = σ_X^2 + σ_Y^2 − 2ρσ_Xσ_Y = 10^2 + 15^2 − 2(0.6)(10)(15) = 145

The probability that the student did better on Exam 1 than Exam 2 is

P(X > Y) = P(X − Y > 0)
         = P(Z > (0 − 10)/√145)
         = P(Z > −0.8304548)
         = 1 − Φ(−0.8304548)
         = 1 − 0.2031408
         = 0.7968592



Bivariate Normal Conditional Distributions

Example #1: Part (e)


Answer for 1(e):
Note that (5X − 4Y) ∼ N(µ_*, σ_*^2) where
µ_* = 5µ_X − 4µ_Y = 5(70) − 4(60) = 110
σ_*^2 = 5^2σ_X^2 + (−4)^2σ_Y^2 + 2(5)(−4)ρσ_Xσ_Y = 25(10^2) + 16(15^2) − 2(20)(0.6)(10)(15) = 2500

Thus, the needed probability can be obtained using

P(5X − 4Y > 150) = P(Z > (150 − 110)/√2500)
                 = P(Z > 0.8)
                 = 1 − Φ(0.8)
                 = 1 − 0.7881446
                 = 0.2118554



Bivariate Normal Conditional Distributions

Example #1: R Code

# Example 1a
> pnorm(1,lower=F)
[1] 0.1586553
> pnorm(75,mean=60,sd=15,lower=F)
[1] 0.1586553

# Example 1b
> pnorm(0.5,lower=F)
[1] 0.3085375
> pnorm(75,mean=69,sd=12,lower=F)
[1] 0.3085375

# Example 1c
> pnorm(20/sqrt(505),lower=F)
[1] 0.1867361
> pnorm(150,mean=130,sd=sqrt(505),lower=F)
[1] 0.1867361

# Example 1d
> pnorm(-10/sqrt(145),lower=F)
[1] 0.7968592
> pnorm(0,mean=10,sd=sqrt(145),lower=F)
[1] 0.7968592

# Example 1e
> pnorm(0.8,lower=F)
[1] 0.2118554
> pnorm(150,mean=110,sd=50,lower=F)
[1] 0.2118554

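The closed-form answers can also be approximated by Monte Carlo. An added sketch that assumes the MASS package is available for mvrnorm():

# sketch: Monte Carlo check of Example 1 (assumes the MASS package)
library(MASS)
set.seed(1)
Sigma <- matrix(c(100, 90, 90, 225), 2, 2)   # var(X) = 100, var(Y) = 225, cov = 0.6*10*15
xy <- mvrnorm(1e5, mu = c(70, 60), Sigma = Sigma)
mean(xy[, 2] > 75)                     # approximates 1(a)
mean(xy[, 1] + xy[, 2] > 150)          # approximates 1(c)
mean(xy[, 1] > xy[, 2])                # approximates 1(d)
mean(5 * xy[, 1] - 4 * xy[, 2] > 150)  # approximates 1(e)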


Multivariate Normal



Multivariate Normal Distribution Form

Normal Density Function (Multivariate)

Given x = (x_1, . . . , x_p)' with x_j ∈ R ∀j, the multivariate normal pdf is

f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right\}    (12)

where
µ = (µ_1, . . . , µ_p)' is the p × 1 mean vector
Σ = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix} is the p × p covariance matrix

Write x ∼ N(µ, Σ) or x ∼ N_p(µ, Σ) to denote x is multivariate normal.

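Equation (12) in R, as an added sketch: solve() supplies Σ^{-1} and det() the determinant. (The mvtnorm package, if installed, provides an equivalent dmvnorm() function.)

# sketch: multivariate normal density from Equation (12)
dmvn <- function(x, mu, Sigma) {
  p <- length(mu)
  d <- x - mu
  q <- as.numeric(t(d) %*% solve(Sigma) %*% d)   # (x - mu)' Sigma^{-1} (x - mu)
  exp(-q / 2) / ((2 * pi)^(p / 2) * sqrt(det(Sigma)))
}
dmvn(c(0, 0), mu = c(0, 0), Sigma = diag(2))     # equals dnorm(0)^2 for two independent N(0, 1)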


Multivariate Normal Distribution Form

Some Multivariate Normal Properties

The mean and covariance parameters have the following restrictions:


µ_j ∈ R for all j
σ_jj > 0 for all j
σ_ij = ρ_ij √(σ_ii σ_jj) where ρ_ij is the correlation between X_i and X_j
σ_ij^2 ≤ σ_ii σ_jj for any i, j ∈ {1, . . . , p} (Cauchy-Schwarz)

Σ is assumed to be positive definite so that Σ^{-1} exists.

Marginals are normal: X_j ∼ N(µ_j, σ_jj) for all j ∈ {1, . . . , p}.



Multivariate Normal Probability Calculations

Multivariate Normal Probabilities

Probabilities still relate to the area under the pdf:


P(a_j \le X_j \le b_j \ \forall j) = \int_{a_1}^{b_1} \cdots \int_{a_p}^{b_p} f(x)\,dx_p \cdots dx_1    (13)

where \int \cdots \int f(x)\,dx_p \cdots dx_1 denotes the multiple integral of f(x).

We can still define the cdf of x = (x_1, . . . , x_p)'

F(x) = P(X_j \le x_j \ \forall j) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_p} f(u)\,du_p \cdots du_1    (14)



Multivariate Normal Affine Transformations

Affine Transformations of Normal (Multivariate)

Suppose that x = (x_1, . . . , x_p)' and that x ∼ N(µ, Σ) where

µ = {µ_j}_{p×1} is the mean vector
Σ = {σ_ij}_{p×p} is the covariance matrix

Let A = {a_ij}_{n×p} and b = {b_i}_{n×1} with A ≠ 0_{n×p}.

If we define w = Ax + b, then w ∼ N(Aµ + b, AΣA').

Note: linear combinations of normal variables are normally distributed.

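The transformed mean Aµ + b and covariance AΣA' are one line each in R; an added sketch with arbitrary example values of µ, Σ, A, and b:

# sketch: mean and covariance of w = Ax + b
mu <- c(1, 2, 3)
Sigma <- diag(3)                    # arbitrary example: identity covariance
A <- matrix(c(1, 0, 1,
              0, 2, -1), nrow = 2, byrow = TRUE)
b <- c(5, -5)
mu.w <- as.numeric(A %*% mu + b)    # mean vector of w
Sigma.w <- A %*% Sigma %*% t(A)     # covariance matrix of w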


Multivariate Normal Conditional Distributions

Multivariate Conditional Distributions

Given variables x = (x_1, . . . , x_p)' and y = (y_1, . . . , y_q)', we have

f_{Y|X}(y \mid X = x) = \frac{f_{XY}(x, y)}{f_X(x)}    (15)

where
fY |X (y|X = x) is the conditional distribution of y given x
fXY (x, y) is the joint pdf of x and y
fX (x) is the marginal pdf of x



Multivariate Normal Conditional Distributions

Conditional Normal (Multivariate)


Suppose that z ∼ N(µ, Σ) where
z = (x', y')' = (x_1, . . . , x_p, y_1, . . . , y_q)'
µ = (µ_x', µ_y')' = (µ_{1x}, . . . , µ_{px}, µ_{1y}, . . . , µ_{qy})'
Note: µ_x is the mean vector of x, and µ_y is the mean vector of y
Σ = \begin{pmatrix} Σ_{xx} & Σ_{xy} \\ Σ_{xy}' & Σ_{yy} \end{pmatrix} where Σ_{xx} is p × p, Σ_{yy} is q × q, and Σ_{xy} is p × q
Note: Σ_{xx} is the covariance matrix of x, Σ_{yy} is the covariance matrix of y,
and Σ_{xy} is the covariance matrix of x and y

In the multivariate normal case, we have that

y | x ∼ N(µ_*, Σ_*)    (16)

where µ_* = µ_y + Σ_{xy}' Σ_{xx}^{-1} (x − µ_x) and Σ_* = Σ_{yy} − Σ_{xy}' Σ_{xx}^{-1} Σ_{xy}
Multivariate Normal Conditional Distributions

Statistical Independence for Multivariate Normal

Using Equation (16), we have that

y | x ∼ N(µ_*, Σ_*) ≡ N(µ_y, Σ_{yy})    (17)

if and only if Σ_{xy} = 0_{p×q} (a matrix of zeros).

Note that Σ_{xy} = 0_{p×q} implies that the p elements of x are uncorrelated
with the q elements of y.
For multivariate normal variables: uncorrelated implies independent
For non-normal variables: uncorrelated does not imply independent



Multivariate Normal Conditional Distributions

Example #2
Each Delicious Candy Company store makes candy bars in 3 sizes:
regular (X_1), fun size (X_2), and big size (X_3).

Assume the weights (in ounces) of the candy bars (X_1, X_2, X_3) follow a
multivariate normal distribution with parameters:

µ = \begin{pmatrix} 5 \\ 3 \\ 7 \end{pmatrix} and Σ = \begin{pmatrix} 4 & −1 & 0 \\ −1 & 4 & 2 \\ 0 & 2 & 9 \end{pmatrix}

Suppose we select a store at random. What is the probability that. . .


(a) the weight of a regular candy bar is greater than 8 oz?
(b) the weight of a regular candy bar is greater than 8 oz, given that
the fun size bar weighs 1 oz and the big size bar weighs 10 oz?
(c) P(4X1 − 3X2 + 5X3 < 63)?
Multivariate Normal Conditional Distributions

Example #2: Part (a)

Answer for 2(a):


Note that X_1 ∼ N(5, 4).

So, the probability that the regular bar is more than 8 oz is

P(X_1 > 8) = P(Z > (8 − 5)/2)
           = P(Z > 1.5)
           = 1 − Φ(1.5)
           = 1 − 0.9331928
           = 0.0668072



Multivariate Normal Conditional Distributions

Example #2: Part (b)


Answer for 2(b):
(X_1 | X_2 = 1, X_3 = 10) is normally distributed; see Equation (16).

The conditional mean of (X_1 | X_2 = 1, X_3 = 10) is given by

µ_* = µ_{X_1} + Σ_{12}' Σ_{22}^{-1} (x̃ − µ̃)
    = 5 + \begin{pmatrix} −1 & 0 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 9 \end{pmatrix}^{-1} \begin{pmatrix} 1 − 3 \\ 10 − 7 \end{pmatrix}
    = 5 + \begin{pmatrix} −1 & 0 \end{pmatrix} \frac{1}{32} \begin{pmatrix} 9 & −2 \\ −2 & 4 \end{pmatrix} \begin{pmatrix} −2 \\ 3 \end{pmatrix}
    = 5 + 24/32
    = 5.75



Multivariate Normal Conditional Distributions

Example #2: Part (b) continued

Answer for 2(b) continued:


The conditional variance of (X_1 | X_2 = 1, X_3 = 10) is given by

σ_*^2 = σ_{X_1}^2 − Σ_{12}' Σ_{22}^{-1} Σ_{12}
      = 4 − \begin{pmatrix} −1 & 0 \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 9 \end{pmatrix}^{-1} \begin{pmatrix} −1 \\ 0 \end{pmatrix}
      = 4 − \begin{pmatrix} −1 & 0 \end{pmatrix} \frac{1}{32} \begin{pmatrix} 9 & −2 \\ −2 & 4 \end{pmatrix} \begin{pmatrix} −1 \\ 0 \end{pmatrix}
      = 4 − 9/32
      = 3.71875



Multivariate Normal Conditional Distributions

Example #2: Part (b) continued

Answer for 2(b) continued:


So, if the fun size bar weighs 1 oz and the big size bar weighs 10 oz,
the probability that the regular bar is more than 8 oz is
 
P(X_1 > 8 | X_2 = 1, X_3 = 10) = P(Z > (8 − 5.75)/√3.71875)
                               = P(Z > 1.166767)
                               = 1 − Φ(1.166767)
                               = 1 − 0.8783477
                               = 0.1216523



Multivariate Normal Conditional Distributions

Example #2: Part (c)

Answer for 2(c):


(4X_1 − 3X_2 + 5X_3) is normally distributed.

The expectation of (4X_1 − 3X_2 + 5X_3) is given by

µ_* = 4µ_{X_1} − 3µ_{X_2} + 5µ_{X_3}
    = 4(5) − 3(3) + 5(7)
    = 46



Multivariate Normal Conditional Distributions

Example #2: Part (c) continued

Answer for 2(c) continued:


The variance of (4X_1 − 3X_2 + 5X_3) is given by

σ_*^2 = \begin{pmatrix} 4 & −3 & 5 \end{pmatrix} Σ \begin{pmatrix} 4 \\ −3 \\ 5 \end{pmatrix}
      = \begin{pmatrix} 4 & −3 & 5 \end{pmatrix} \begin{pmatrix} 4 & −1 & 0 \\ −1 & 4 & 2 \\ 0 & 2 & 9 \end{pmatrix} \begin{pmatrix} 4 \\ −3 \\ 5 \end{pmatrix}
      = \begin{pmatrix} 4 & −3 & 5 \end{pmatrix} \begin{pmatrix} 19 \\ −6 \\ 39 \end{pmatrix}
      = 289



Multivariate Normal Conditional Distributions

Example #2: Part (c) continued

Answer for 2(c) continued:


So, the needed probability can be obtained as
 
P(4X_1 − 3X_2 + 5X_3 < 63) = P(Z < (63 − 46)/√289)
                           = P(Z < 1)
                           = Φ(1)
                           = 0.8413447



Multivariate Normal Conditional Distributions

Example #2: R Code


# Example 2a
> pnorm(1.5,lower=F)
[1] 0.0668072
> pnorm(8,mean=5,sd=2,lower=F)
[1] 0.0668072

# Example 2b
> pnorm(2.25/sqrt(119/32),lower=F)
[1] 0.1216523
> pnorm(8,mean=5.75,sd=sqrt(119/32),lower=F)
[1] 0.1216523

# Example 2c
> pnorm(1)
[1] 0.8413447
> pnorm(63,mean=46,sd=17)
[1] 0.8413447
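The conditional mean and variance in 2(b) can also be computed with matrix operations rather than by hand; an added sketch applying Equation (16) to the Example #2 parameters:

# sketch: Equation (16) applied to Example 2(b)
Sigma <- matrix(c(4, -1, 0, -1, 4, 2, 0, 2, 9), 3, 3)
mu <- c(5, 3, 7)
S12 <- Sigma[2:3, 1]                # covariance of (X2, X3) with X1
S22 <- Sigma[2:3, 2:3]              # covariance matrix of (X2, X3)
mu.star <- as.numeric(mu[1] + t(S12) %*% solve(S22) %*% (c(1, 10) - mu[2:3]))   # 5.75
sig2.star <- as.numeric(Sigma[1, 1] - t(S12) %*% solve(S22) %*% S12)            # 3.71875
pnorm(8, mean = mu.star, sd = sqrt(sig2.star), lower.tail = FALSE)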
Multivariate Normal Parameter Estimation

Likelihood Function

Suppose that x_i = (x_{i1}, . . . , x_{ip}) is an iid sample from a normal distribution
with mean vector µ and covariance matrix Σ, i.e., x_i ∼ N(µ, Σ).

The likelihood function for the parameters (given the data) has the form
L(µ, Σ | X) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) \right\}

and the log-likelihood function is given by


LL(µ, Σ | X) = -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log(|\Sigma|) - \frac{1}{2}\sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu)



Multivariate Normal Parameter Estimation

Maximum Likelihood Estimate of Mean Vector

The MLE of the mean vector is the value of µ that minimizes


\sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) = \sum_{i=1}^{n} x_i' \Sigma^{-1} x_i - 2n\bar{x}' \Sigma^{-1} \mu + n\mu' \Sigma^{-1} \mu

where \bar{x} = (1/n)\sum_{i=1}^{n} x_i is the sample mean vector.

Taking the derivative with respect to µ we find that

\frac{\partial \sum_{i=1}^{n} (x_i - \mu)' \Sigma^{-1} (x_i - \mu)}{\partial \mu} = -2n\Sigma^{-1}\bar{x} + 2n\Sigma^{-1}\mu \quad\longleftrightarrow\quad \bar{x} = \hat{\mu}

The sample mean vector x̄ is the MLE of the population mean vector µ.



Multivariate Normal Parameter Estimation

Maximum Likelihood Estimate of Covariance Matrix


The MLE of the covariance matrix is the value of Σ that minimizes

-n\log(|\Sigma^{-1}|) + \sum_{i=1}^{n} \mathrm{tr}\{\Sigma^{-1}(x_i - \hat{\mu})(x_i - \hat{\mu})'\}

where \hat{\mu} = \bar{x} = (1/n)\sum_{i=1}^{n} x_i is the sample mean.

Taking the derivative with respect to Σ^{-1} we find that

\frac{\partial\left[ -n\log(|\Sigma^{-1}|) + \sum_{i=1}^{n} \mathrm{tr}\{\Sigma^{-1}(x_i - \hat{\mu})(x_i - \hat{\mu})'\} \right]}{\partial \Sigma^{-1}} = -n\Sigma + \sum_{i=1}^{n}(x_i - \hat{\mu})(x_i - \hat{\mu})'

i.e., the sample covariance matrix \hat{\Sigma} = (1/n)\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})' is
the MLE of the population covariance matrix Σ.
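In R the two MLEs are the column means and the "divide by n" covariance matrix. An added sketch with simulated data (note that cov() divides by n − 1, so it is rescaled for the comparison):

# sketch: MLEs of mu and Sigma from an n x p data matrix X
set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)          # simulated data, n = 100, p = 3
n <- nrow(X)
mu.hat <- colMeans(X)                        # MLE of mu
Xc <- sweep(X, 2, mu.hat)                    # center each column
Sigma.hat <- crossprod(Xc) / n               # MLE of Sigma (divides by n)
all.equal(Sigma.hat, cov(X) * (n - 1) / n)   # should be TRUE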
Sampling Distributions



Sampling Distributions Univariate Case

Univariate Sampling Distributions: x̄ and s^2

In the univariate normal case, we have that

\bar{x} = (1/n)\sum_{i=1}^{n} x_i ∼ N(µ, σ^2/n)

(n − 1)s^2 = \sum_{i=1}^{n} (x_i − \bar{x})^2 ∼ σ^2 χ^2_{n−1}

χ^2_k denotes a chi-square variable with k degrees of freedom.

σ^2 χ^2_k = \sum_{i=1}^{k} z_i^2 where z_i ∼ N(0, σ^2) iid

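Both facts are easy to see by simulation; a small added sketch (n, µ, and σ are arbitrary):

# sketch: sampling distributions of xbar and (n - 1)s^2
set.seed(1)
n <- 10; mu <- 5; sigma <- 2
xbar <- replicate(1e4, mean(rnorm(n, mu, sigma)))
s2 <- replicate(1e4, var(rnorm(n, mu, sigma)))
var(xbar)           # should be close to sigma^2 / n = 0.4
mean((n - 1) * s2)  # should be close to sigma^2 * (n - 1) = 36, the mean of sigma^2 * chi^2_{n-1}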


Sampling Distributions Multivariate Case

Multivariate Sampling Distributions: x̄ and S

In the multivariate normal case, we have that

\bar{x} = (1/n)\sum_{i=1}^{n} x_i ∼ N(µ, Σ/n)

(n − 1)S = \sum_{i=1}^{n} (x_i − \bar{x})(x_i − \bar{x})' ∼ W_{n−1}(Σ)

W_k(Σ) denotes a Wishart variable with k degrees of freedom.

W_k(Σ) = \sum_{i=1}^{k} z_i z_i' where z_i ∼ N(0_p, Σ) iid

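R's stats package includes rWishart() for simulating Wishart matrices; an added sketch checking that the mean of W_{n−1}(Σ) is (n − 1)Σ:

# sketch: the mean of a Wishart_{n-1}(Sigma) variable is (n - 1) * Sigma
Sigma <- matrix(c(2, 1, 1, 3), 2, 2)
n <- 20
W <- rWishart(5000, df = n - 1, Sigma = Sigma)   # 2 x 2 x 5000 array of draws
apply(W, c(1, 2), mean)                          # should be close to (n - 1) * Sigma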
