
11 Normal Distribution

The normal distribution, also called the Gaussian distribution, produces a bell-shaped curve when graphed. It is defined by two parameters: the mean (μ) and the variance (σ²). The standard normal distribution refers to a normal distribution with a mean of 0 and variance of 1. Many natural phenomena can be approximated by the normal distribution when the phenomenon is the result of multiple independent factors. It is commonly used for modeling random variables whose values are likely to cluster around a single mean value.

Uploaded by

Hemanth

The Normal Distribution

image: Etsy

Will Monroe
July 19, 2017
with materials by Mehran Sahami and Chris Piech
Announcements: Midterm

A week from yesterday:

Tuesday, July 25, 7:00-9:00pm


Building 320-105

One page (both sides) of notes

Material through today’s lecture

Review session:
Tomorrow, July 20, 2:30-3:20pm
in Gates B01
Review: A grid of random variables

                 | number of successes | time to get successes    |
One trial        | X ∼ Ber(p)          | X ∼ Geo(p)               | One success
                 | (Bin with n = 1)    | (NegBin with r = 1)      |
Several trials   | X ∼ Bin(n, p)       | X ∼ NegBin(r, p)         | Several successes
Interval of time | X ∼ Poi(λ)          | X ∼ Exp(λ) (continuous!) | One success after interval of time
Review: Continuous distributions
A continuous random variable has a
value that’s a real number (not
necessarily an integer).

Replace sums with integrals!

$$P(a < X \le b) = F_X(b) - F_X(a)$$

$$F_X(a) = \int_{x=-\infty}^{a} f_X(x)\,dx$$
Review: Probability density function
The probability density function (PDF)
of a continuous random variable
represents the relative likelihood of
various values.

Units of probability divided by units of X.


Integrate it to get probabilities!
$$P(a < X \le b) = \int_{x=a}^{b} f_X(x)\,dx$$
Continuous expectation and variance
Remember: replace sums with integrals!

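Discrete: $E[X] = \sum_{x=-\infty}^{\infty} x \cdot p_X(x)$
Continuous: $E[X] = \int_{x=-\infty}^{\infty} x \cdot f_X(x)\,dx$

Discrete: $E[X^2] = \sum_{x=-\infty}^{\infty} x^2 \cdot p_X(x)$
Continuous: $E[X^2] = \int_{x=-\infty}^{\infty} x^2 \cdot f_X(x)\,dx$

$$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$$
(still!)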
Review: Uniform random variable
A uniform random variable is
equally likely to be any value in
a single real number interval.

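$X \sim \mathrm{Uni}(\alpha, \beta)$

$$f_X(x) = \begin{cases} \dfrac{1}{\beta-\alpha} & \text{if } x \in [\alpha, \beta] \\ 0 & \text{otherwise} \end{cases}$$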
Uniform: Fact sheet
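$X \sim \mathrm{Uni}(\alpha, \beta)$, where α is the minimum value and β is the maximum value

PDF: $$f_X(x) = \begin{cases} \dfrac{1}{\beta-\alpha} & \text{if } x \in [\alpha, \beta] \\ 0 & \text{otherwise} \end{cases}$$

CDF: $$F_X(x) = \begin{cases} \dfrac{x-\alpha}{\beta-\alpha} & \text{if } x \in [\alpha, \beta] \\ 1 & \text{if } x > \beta \\ 0 & \text{otherwise} \end{cases}$$

expectation: $E[X] = \dfrac{\alpha+\beta}{2}$

variance: $\mathrm{Var}(X) = \dfrac{(\beta-\alpha)^2}{12}$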
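The expectation and variance on the fact sheet can be checked numerically by replacing the integrals with a fine midpoint sum. This is only a sketch: the endpoints `alpha = 2.0` and `beta = 5.0` and the helper `integrate` are illustrative choices, not part of the lecture.

```python
def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

alpha, beta = 2.0, 5.0  # hypothetical interval endpoints

def uni_pdf(x):
    """PDF of Uni(alpha, beta): constant on [alpha, beta], zero elsewhere."""
    return 1.0 / (beta - alpha) if alpha <= x <= beta else 0.0

# E[X] and E[X^2] as integrals against the PDF
mean = integrate(lambda x: x * uni_pdf(x), alpha, beta)
second_moment = integrate(lambda x: x * x * uni_pdf(x), alpha, beta)
variance = second_moment - mean ** 2

print(mean, (alpha + beta) / 2)            # numeric vs. closed form
print(variance, (beta - alpha) ** 2 / 12)  # numeric vs. closed form
```

Both printed pairs should agree to several decimal places, which is the "replace sums with integrals" rule in action.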
image: Haha169
Review: Exponential random variable
An exponential random variable
is the amount of time until the
first event when events occur
as in the Poisson distribution.

$X \sim \mathrm{Exp}(\lambda)$

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

image: Adrian Sampson


Exponential: Fact sheet
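$X \sim \mathrm{Exp}(\lambda)$, where X is the time until the first event and λ is the rate of events per unit time

PDF: $$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

CDF: $$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

expectation: $E[X] = \dfrac{1}{\lambda}$

variance: $\mathrm{Var}(X) = \dfrac{1}{\lambda^2}$

image: Adrian Sampson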
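A quick simulation can make the exponential fact sheet concrete. The sketch below draws samples with the standard library's `random.expovariate` and compares the sample mean, sample variance, and an empirical CDF value against the formulas. The rate `lam = 2.0`, the seed, and the sample count are arbitrary assumptions made for illustration.

```python
import math
import random

lam = 2.0  # hypothetical rate of events per unit time

def exp_cdf(x):
    """CDF of Exp(lam): 1 - e^(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.expovariate(lam) for _ in range(200_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
frac_below_1 = sum(s <= 1.0 for s in samples) / len(samples)

print(sample_mean, 1 / lam)        # compare with E[X] = 1/lambda
print(sample_var, 1 / lam ** 2)    # compare with Var(X) = 1/lambda^2
print(frac_below_1, exp_cdf(1.0))  # compare with F_X(1)
```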
Normal random variable
A normal (= Gaussian) random variable is
a good approximation to many other
distributions. It often results from sums or
averages of independent random variables.

$X \sim N(\mu, \sigma^2)$

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
Déjà vu?

[Plot: P(X = k) against k]

[Plot: f_X(x) against x, where X = sum of n independent Uni(0, 1) variables]


image: Thomasda
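The bell shape of a sum of independent uniforms can be reproduced with a short simulation. The sketch below is an illustration under assumed settings (`n = 12` terms, 50,000 trials, fixed seed); it compares the empirical CDF of the sum with a normal that has the same mean (n/2) and variance (n/12).

```python
import random
from statistics import NormalDist

random.seed(1)
n = 12           # hypothetical number of Uni(0, 1) terms in each sum
trials = 50_000  # hypothetical number of simulated sums

sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

# Each Uni(0, 1) has mean 1/2 and variance 1/12, so the sum has mean n/2 and variance n/12.
approx = NormalDist(mu=n / 2, sigma=(n / 12) ** 0.5)

empirical = {}
for x in (5.0, 6.0, 7.0):
    empirical[x] = sum(s <= x for s in sums) / trials
    print(x, round(empirical[x], 3), round(approx.cdf(x), 3))
```

Each printed row shows the simulated fraction next to the normal CDF value; the two columns should be close.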
“The normal distribution”

Also known as: Gaussian distribution

Shape: bell curve

Personality: easygoing
What is normally distributed?

Natural phenomena: heights, weights… (approximately)

Noise in measurements

Sums/averages of many random variables (caveats: independence, equal weighting, continuity...)

Averages of samples from a population (with sufficient sample sizes)
The Know-Nothing Distribution
“maximum entropy”

The normal is the most spread-out distribution with a fixed expectation and variance.

If you know E[X] and Var(X) but nothing else, a normal is probably a good starting point!
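One way to see the "most spread-out" claim is to compare differential entropies of a few distributions that share the same variance. The sketch below uses the standard closed-form entropies (in nats) of the normal, uniform, and Laplace distributions. The choice of these two comparison distributions and of `sigma = 1.0` is an assumption made for illustration.

```python
import math

sigma = 1.0  # hypothetical common standard deviation

# Normal with variance sigma^2: entropy = (1/2) ln(2 * pi * e * sigma^2)
normal_entropy = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

# Uniform with variance sigma^2 has width sigma * sqrt(12): entropy = ln(width)
uniform_entropy = math.log(sigma * math.sqrt(12))

# Laplace with variance sigma^2 has scale b = sigma / sqrt(2): entropy = 1 + ln(2b)
laplace_entropy = 1 + math.log(2 * sigma / math.sqrt(2))

print(normal_entropy, laplace_entropy, uniform_entropy)
```

The normal comes out largest of the three, consistent with the maximum-entropy property for a fixed expectation and variance.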
Normal: Fact sheet
$X \sim N(\mu, \sigma^2)$, where μ is the mean and σ² is the variance (σ = standard deviation)

PDF: $$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
The Standard Normal

$Z \sim N(0, 1)$ (μ = 0, σ² = 1)

If $X \sim N(\mu, \sigma^2)$, then

$$X = \sigma Z + \mu \qquad Z = \frac{X - \mu}{\sigma}$$
De-scarifying the normal PDF

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

Set μ = 0 and σ = 1 to get the standard normal:

$$f_Z(z) = \frac{1}{1\cdot\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{z-0}{1}\right)^2} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} = C\, e^{-\frac{1}{2}z^2}$$

[Plot: the exponent $-\frac{1}{2}z^2$]

Back in the general PDF, $\frac{1}{\sigma\sqrt{2\pi}}$ is the normalizing constant, and $\frac{x-\mu}{\sigma}$ is the standardized value $Z = \frac{X-\mu}{\sigma}$.
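The normalizing constant C is whatever makes the area under the curve equal to 1. A rough numeric check, assuming a midpoint-rule helper `integrate` and the cutoff range [-10, 10] (both illustrative choices), recovers 1/√(2π):

```python
import math

def integrate(f, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Area under the un-normalized curve e^(-z^2 / 2); the tails beyond |z| = 10 are negligible.
area = integrate(lambda z: math.exp(-0.5 * z * z), -10.0, 10.0)

C = 1.0 / area  # the constant that makes the total area equal to 1
print(C, 1.0 / math.sqrt(2.0 * math.pi))
```

The two printed values should match closely, so C is just bookkeeping and the shape of the curve comes entirely from the exponent.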
Normal: Fact sheet
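$X \sim N(\mu, \sigma^2)$, where μ is the mean and σ² is the variance (σ = standard deviation)

PDF: $$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

CDF: $$F_X(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \int_{-\infty}^{x} f_X(t)\,dt$$
(no closed form)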
The Standard Normal

$Z \sim N(0, 1)$ (μ = 0, σ² = 1)

If $X \sim N(\mu, \sigma^2)$, then

$$X = \sigma Z + \mu \qquad Z = \frac{X - \mu}{\sigma}$$

$$\Phi(z) = F_Z(z) = P(Z \le z)$$
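Although Φ has no closed form, it can be written in terms of the error function, Φ(z) = ½(1 + erf(z/√2)), which Python's `math` module provides. The sketch below uses that identity and then standardizes to get the CDF of any normal. The function names `phi` and `normal_cdf` are illustrative, not from the lecture.

```python
import math

def phi(z):
    """Standard normal CDF, Phi(z) = P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """F_X(x) for X ~ N(mu, sigma^2), computed by standardizing: Z = (X - mu) / sigma."""
    return phi((x - mu) / sigma)

print(phi(0.0))                   # half the mass lies below the mean
print(normal_cdf(7.0, 3.0, 4.0))  # same as phi(1.0), since (7 - 3) / 4 = 1
```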
Symmetry of the normal

P( X≤μ−x)=P( X≥μ+ x)
and don’t forget:

P( X > x)=1−P( X ≤x)


Symmetry of the normal

P(Z≤−z)=P(Z≥z)
and don’t forget:

P(Z > z)=1−P(Z≤z)


Symmetry of the normal

Φ(−z)=P(Z≥z)
and don’t forget:

P(Z > z)=1−Φ( z)
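The symmetry identities are easy to spot-check numerically. The sketch below uses the standard library's `statistics.NormalDist`; the parameters μ = 3, σ = 4 and the offsets tried are arbitrary assumptions.

```python
from statistics import NormalDist

mu, sigma = 3.0, 4.0  # hypothetical parameters
X = NormalDist(mu=mu, sigma=sigma)

pairs = []
for x in (0.5, 2.0, 6.0):
    left_tail = X.cdf(mu - x)         # P(X <= mu - x)
    right_tail = 1.0 - X.cdf(mu + x)  # P(X >= mu + x) = 1 - P(X <= mu + x)
    pairs.append((left_tail, right_tail))
    print(x, left_tail, right_tail)
```

Each row should show the same value twice, up to floating-point rounding.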


The standard normal table

Φ(0.54)=P(Z≤0.54)=0.7054
With today’s technology

scipy.stats.norm(mean, std).cdf(x)

The second argument is the standard deviation, not the variance! You might need math.sqrt here.
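The same caveat applies to Python's built-in `statistics.NormalDist`, which is used in this sketch instead of SciPy so that it runs without extra packages. Like the SciPy call, its second parameter is the standard deviation. The variance of 16 and mean of 3 are example values chosen for illustration.

```python
import math
from statistics import NormalDist

variance = 16.0            # a normal is usually written with its variance, e.g. N(3, 16)
std = math.sqrt(variance)  # but the constructor expects the standard deviation

X = NormalDist(mu=3.0, sigma=std)
print(X.cdf(3.0))          # P(X <= mean)

# Looking up a standard normal table entry: Phi(0.54)
table_value = NormalDist().cdf(0.54)
print(round(table_value, 4))
```

Passing `variance` directly as `sigma` would silently describe a much wider distribution, which is the mistake the slide warns about.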
Break time!
Practice with the Gaussian
X ~ N(3, 16)
μ=3
σ² = 16
σ=4

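$$P(X > 0) = P\left(\frac{X-3}{4} > \frac{0-3}{4}\right) = P\left(Z > -\frac{3}{4}\right)$$

$$= 1 - P\left(Z \le -\frac{3}{4}\right) = 1 - \Phi\left(-\frac{3}{4}\right)$$

$$= 1 - \left(1 - \Phi\left(\frac{3}{4}\right)\right)$$

$$= \Phi\left(\frac{3}{4}\right) \approx 0.7734$$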
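The calculation of P(X > 0) for X ~ N(3, 16) can be cross-checked two ways: directly from the CDF of X, and through the standardized form Φ(3/4). This is a small sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=3.0, sigma=4.0)  # sigma = sqrt(16)
Z = NormalDist()                   # standard normal

p_direct = 1.0 - X.cdf(0.0)        # P(X > 0) = 1 - F_X(0)
p_standardized = Z.cdf(3.0 / 4.0)  # Phi(3/4), after standardizing and using symmetry

print(round(p_direct, 4), round(p_standardized, 4))
```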
Practice with the Gaussian
X ~ N(3, 16)
μ=3
σ² = 16
σ=4

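$$P(|X - 3| > 4) = P(X < -1) + P(X > 7)$$

$$= P\left(\frac{X-3}{4} < \frac{-1-3}{4}\right) + P\left(\frac{X-3}{4} > \frac{7-3}{4}\right)$$

$$= P(Z < -1) + P(Z > 1)$$

$$= \Phi(-1) + (1 - \Phi(1))$$

$$= (1 - \Phi(1)) + (1 - \Phi(1))$$

$$\approx 2 \cdot (1 - 0.8413) \approx 0.3173$$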
Practice with the Gaussian
X ~ N(3, 16)
μ=3
σ² = 16
σ=4

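$$P(|X - \mu| > \sigma) = P(X < \mu - \sigma) + P(X > \mu + \sigma)$$

$$= P\left(\frac{X-\mu}{\sigma} < \frac{\mu-\sigma-\mu}{\sigma}\right) + P\left(\frac{X-\mu}{\sigma} > \frac{\mu+\sigma-\mu}{\sigma}\right)$$

$$= P(Z < -1) + P(Z > 1)$$

$$= \Phi(-1) + (1 - \Phi(1))$$

$$= (1 - \Phi(1)) + (1 - \Phi(1))$$

$$\approx 2 \cdot (1 - 0.8413) \approx 0.3173$$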
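Because the derivation never uses the specific values of μ and σ, the probability of landing more than one standard deviation from the mean should be the same for every normal. The sketch below checks this for a few parameter pairs, which are arbitrary assumptions apart from (3, 4) from the example.

```python
from statistics import NormalDist

def prob_outside_one_sigma(mu, sigma):
    """P(|X - mu| > sigma) for X ~ N(mu, sigma^2)."""
    X = NormalDist(mu=mu, sigma=sigma)
    return X.cdf(mu - sigma) + (1.0 - X.cdf(mu + sigma))

results = [prob_outside_one_sigma(mu, sigma)
           for mu, sigma in [(3.0, 4.0), (0.0, 1.0), (-10.0, 0.25)]]

for r in results:
    print(round(r, 4))
```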
Normal: Fact sheet
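$X \sim N(\mu, \sigma^2)$, where μ is the mean and σ² is the variance (σ = standard deviation)

PDF: $$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$

CDF: $$F_X(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \int_{-\infty}^{x} f_X(t)\,dt$$
(no closed form)

expectation: $E[X] = \mu$

variance: $\mathrm{Var}(X) = \sigma^2$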
Carl Friedrich Gauss
(1777-1855), remarkably influential
German mathematician

Started doing groundbreaking math as a teenager

Didn’t invent the normal distribution (but popularized it)
Noisy wires
Send a voltage of X = 2 or -2 on a wire.
+2 represents 1, -2 represents 0.

Receive voltage of X + Y on other end, where Y ~ N(0, 1).

If X + Y ≥ 0.5, then output 1, else 0.

P(incorrect output | original bit = 1) =

$$P(2 + Y < 0.5) = P(Y < -1.5) = \Phi(-1.5) = 1 - \Phi(1.5) \approx 0.0668$$
Noisy wires
Send a voltage of X = 2 or -2 on a wire.
+2 represents 1, -2 represents 0.

Receive voltage of X + Y on other end, where Y ~ N(0, 1).

If X + Y ≥ 0.5, then output 1, else 0.

P(incorrect output | original bit = 0) =

$$P(-2 + Y \ge 0.5) = P(Y \ge 2.5) = 1 - P(Y < 2.5) = 1 - \Phi(2.5) \approx 0.0062$$
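Both error probabilities for the noisy wire follow from the noise CDF and the decision threshold. The sketch below computes them with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

noise = NormalDist(mu=0.0, sigma=1.0)  # Y ~ N(0, 1)
threshold = 0.5                        # output 1 when X + Y >= 0.5

# Bit 1 is sent as +2: the output is wrong when 2 + Y < 0.5, i.e. Y < -1.5
p_error_given_1 = noise.cdf(threshold - 2.0)

# Bit 0 is sent as -2: the output is wrong when -2 + Y >= 0.5, i.e. Y >= 2.5
p_error_given_0 = 1.0 - noise.cdf(threshold + 2.0)

print(round(p_error_given_1, 4), round(p_error_given_0, 4))
```

The two errors differ because the threshold sits at 0.5 rather than midway between the two voltages, so a sent 1 is more likely to be misread than a sent 0.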
Poisson approximation to binomial
large n, small p

Bin(n, p) ≈ Poi(λ)

[Plot: P(X = k) against k]

Normal approximation to binomial
large n, medium p

Bin(n, p) ≈ N(μ, σ²)

[Plot: P(X = k) against k]
Something is strange...
Continuity correction
X ∼Bin (n , p)
Y ∼N (np , np(1− p))
P ( X ≥55)≈ P (Y >54.5)

When approximating a discrete distribution with a continuous distribution, adjust the bounds by 0.5 to account for the missing half-bar.
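To see what the half-bar adjustment buys, the sketch below compares the exact binomial tail P(X ≥ 55) with the normal approximation, with and without the 0.5 shift. The slide leaves n and p unspecified, so `n = 100` and `p = 0.5` here are assumptions chosen only for illustration.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical binomial parameters
k = 55           # we want P(X >= k)

# Exact tail: sum the binomial PMF from k to n
exact = sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Normal with matching mean np and variance np(1 - p)
Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

with_correction = 1.0 - Y.cdf(k - 0.5)  # P(Y > 54.5)
without_correction = 1.0 - Y.cdf(k)     # P(Y > 55), which drops half of the bar at 55

print(exact, with_correction, without_correction)
```

With these assumed parameters, the corrected value lands much closer to the exact tail than the uncorrected one.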
Miracle diets
100 people placed on a special diet.

Doctor will endorse diet if ≥ 65 people have cholesterol levels decrease.

What is P(doctor endorses | diet has no effect)?

X: # people whose cholesterol decreases

X ~ Bin(100, 0.5)
np = 50
np(1 – p) = 50(1 – 0.5) = 25

≈ Y ~ N(50, 25)

$$P(Y > 64.5) = P\left(\frac{Y-50}{5} > \frac{64.5-50}{5}\right) = P(Z > 2.9) = 1 - \Phi(2.9) \approx 0.00187$$
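For a problem this small, the exact binomial tail can be computed directly and set beside the normal approximation. The following is a sketch using only the standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p, k = 100, 0.5, 65  # 100 people, no effect means p = 0.5, endorse if >= 65

# Exact P(X >= 65) for X ~ Bin(100, 0.5)
exact = sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Normal approximation with continuity correction: P(Y > 64.5), Y ~ N(50, 25)
Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = 1.0 - Y.cdf(k - 0.5)

print(exact, approx)
```

Both values are a fraction of a percent, so a doctor endorsing an ineffective diet by chance is rare under this rule.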
Stanford admissions
Stanford accepts 2480 students.
Each student independently
decides to attend with p = 0.68.

What is
P(at least 1750 students attend)?

X: # of students who will attend.


X ~ Bin(2480, 0.68)
np = 1686.4
σ² = np(1 – p) ≈ 539.65

≈ Y ~ N(1686.4, 539.65)

$$P(Y > 1749.5) = P\left(\frac{Y - 1686.4}{\sqrt{539.65}} > \frac{1749.5 - 1686.4}{\sqrt{539.65}}\right)$$

$$\approx P(Z > 2.72) = 1 - \Phi(2.72) \approx 0.0033$$
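The standardized bound here is (1749.5 − 1686.4)/√539.65, and it can be evaluated directly alongside the exact binomial tail. In the sketch below the exact PMF is computed in log space with `math.lgamma`, because the raw powers of p underflow for n this large. The helper name `log_binom_pmf` is illustrative.

```python
import math
from statistics import NormalDist

n, p, k = 2480, 0.68, 1750  # accepted students, attendance probability, target count

def log_binom_pmf(i):
    """Natural log of P(X = i) for X ~ Bin(n, p), using log-gamma to avoid underflow."""
    log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return log_choose + i * math.log(p) + (n - i) * math.log(1 - p)

# Exact P(X >= 1750)
exact = sum(math.exp(log_binom_pmf(i)) for i in range(k, n + 1))

# Normal approximation with continuity correction
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
z = (k - 0.5 - mu) / sigma  # standardized bound, about 2.72
approx = 1.0 - NormalDist().cdf(z)

print(round(z, 2), approx, exact)
```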
image: Victor Gane
Stanford admissions changes
