
Week 5_Lesson2b_Normal

Data Science

Uploaded by

raicy22205101104
Copyright
© All Rights Reserved
CSE315: Introduction to Data Science

Fall 2024
Probability Distribution

L5b : Normal Distribution


Text: Hayter, A. (2012) Probability and Statistics for Engineers and Scientists
The Normal Distribution
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are Equal
• Location is determined by the mean, μ
• Spread is determined by the standard deviation, σ
• The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell-shaped density f(x) plotted against x, centered at μ with spread σ, where Mean = Median = Mode]
Many Normal Distributions

By varying the parameters μ and σ, we obtain different normal distributions
The Normal Distribution Shape
• Changing μ shifts the distribution left or right.
• Changing σ increases or decreases the spread.
[Figure: density f(x) plotted against x, with μ marking the center and σ the spread]

Given the mean μ and variance σ², we define the normal distribution using the notation X ~ N(μ, σ²)
The Normal Probability Density Function

• The formula for the normal probability density function is

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$$

Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
x = any value of the continuous variable, −∞ < x < ∞
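As a rough illustration, the density formula can be evaluated directly in Python. The function name `normal_pdf` and its default arguments are illustrative choices, not something taken from the slides or the textbook.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Height of the curve at the mean for two different parameter choices.
peak_standard = normal_pdf(0.0)              # mu = 0, sigma = 1
peak_wide = normal_pdf(100.0, 100.0, 50.0)   # mu = 100, sigma = 50

print(f"peak of N(0, 1):       {peak_standard:.5f}")
print(f"peak of N(100, 50^2):  {peak_wide:.5f}")
```

A larger σ gives a lower peak because the same total area of 1 is spread over a wider range.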
Cumulative Normal Distribution

• For a normal random variable X with mean μ and variance σ², i.e., X ~ N(μ, σ²), the cumulative distribution function is

F(x₀) = P(X ≤ x₀)

[Figure: density f(x) with the area to the left of x₀ representing P(X ≤ x₀)]
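The cumulative probability can be written in terms of the error function, F(x₀) = ½[1 + erf((x₀ − μ)/(σ√2))]. The following is a minimal sketch using Python's `math.erf`; the name `normal_cdf` is a hypothetical helper, not part of the course material.

```python
import math


def normal_cdf(x0, mu=0.0, sigma=1.0):
    """P(X <= x0) for a normal X with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x0 - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


half = normal_cdf(0.0)        # area to the left of the mean of N(0, 1)
at_two = normal_cdf(2.0)      # area to the left of Z = 2

print(f"F(0) = {half:.4f}")
print(f"F(2) = {at_two:.4f}")
```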
Finding Normal Probabilities

The probability for a range of values is measured by the area under the curve

P(a < X < b) = F(b) − F(a)

[Figure: density curve over x with points a and b on either side of μ, showing the area between them]
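Finding Normal Probabilities
(continued)

F(b) = P(X < b)
[Figure: area under the curve to the left of b]

F(a) = P(X < a)
[Figure: area under the curve to the left of a]

P(a < X < b) = F(b) − F(a)
[Figure: area under the curve between a and b]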
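The subtraction F(b) − F(a) can be sketched with the standard library's `statistics.NormalDist` (Python 3.8+). The helper name `interval_probability` and the example endpoints are assumptions made for illustration.

```python
from statistics import NormalDist


def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) for a normal X, computed as F(b) - F(a)."""
    if a > b:
        raise ValueError("a must not exceed b")
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)


# Area within one standard deviation of the mean of a standard normal.
within_one_sd = interval_probability(-1.0, 1.0)
print(f"P(-1 < Z < 1) = {within_one_sd:.4f}")
```

Because X is continuous, P(a < X < b) and P(a ≤ X ≤ b) are the same number, so the choice of strict or non-strict inequality does not affect the result.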
The Standardized Normal
• Any normal distribution (with any mean and variance combination) can be transformed into the standardized normal distribution (Z), with mean 0 and variance 1

Z ~ N(0, 1)
[Figure: standardized normal density f(Z), centered at 0 with standard deviation 1]

• Need to transform X units into Z units by subtracting the mean of X and dividing by its standard deviation

Z = (X − μ) / σ
Example

• If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is

Z = (X − μ) / σ = (200 − 100) / 50 = 2.0

• This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
Comparing X and Z units

[Figure: one normal curve with two horizontal scales: X (μ = 100, σ = 50) marked at 100 and 200, and Z (μ = 0, σ = 1) marked at 0 and 2.0]

Note that the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z)
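Finding Normal Probabilities

$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = F\left(\frac{b-\mu}{\sigma}\right) - F\left(\frac{a-\mu}{\sigma}\right)$$

[Figure: density f(x) with a, µ and b on the x scale, and the corresponding points (a − μ)/σ, 0 and (b − μ)/σ on the Z scale]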
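A short check that working in X units and in Z units gives the same probability. This sketch assumes μ = 100 and σ = 50 and picks the interval from 100 to 200 purely for illustration.

```python
from statistics import NormalDist

mu, sigma = 100.0, 50.0   # assumed example parameters
a, b = 100.0, 200.0       # assumed interval endpoints

X = NormalDist(mu, sigma)   # distribution in original units
Z = NormalDist(0.0, 1.0)    # standardized normal

# Standardize the endpoints: subtract the mean, divide by the standard deviation.
z_a = (a - mu) / sigma
z_b = (b - mu) / sigma

p_x_units = X.cdf(b) - X.cdf(a)
p_z_units = Z.cdf(z_b) - Z.cdf(z_a)

print(f"z_a = {z_a}, z_b = {z_b}")
print(f"P in X units: {p_x_units:.4f}")
print(f"P in Z units: {p_z_units:.4f}")
```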
Probability as Area Under the Curve
The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below

P(−∞ < X < μ) = 0.5
P(μ < X < ∞) = 0.5
P(−∞ < X < ∞) = 1.0

[Figure: density f(X) split at μ into two halves of area 0.5 each]
Appendix Table 1
• The Standardized Normal table in the textbook (Appendix Table 1) shows values of the cumulative normal distribution function

• For a given Z-value a, the table shows F(a) (the area under the curve from negative infinity to a)

F(a) = P(Z < a)

[Figure: standardized normal curve with the area to the left of a representing F(a)]
The Standardized Normal Table

• Appendix Table 1 gives the probability F(a) for any value a

Example:
P(Z < 2.00) = .9772

[Figure: standardized normal curve with area .9772 to the left of Z = 2.00]
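The Standardized Normal Table
(continued)

• For negative Z-values, use the fact that the distribution is symmetric to find the needed probability:

Example:
P(Z < −2.00) = 1 − 0.9772 = 0.0228

[Figure: two standardized normal curves. In the first, the area to the left of Z = 2.00 is .9772 and the upper tail is .0228. In the second, the lower tail to the left of Z = −2.00 is .0228 and the area to its right is .9772]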
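The symmetry rule P(Z < −a) = 1 − P(Z < a) can be verified numerically. This is a small sketch with `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

a = 2.0
upper_area = Z.cdf(a)             # P(Z < 2.00), the value a table would list
lower_by_symmetry = 1.0 - upper_area
lower_direct = Z.cdf(-a)          # P(Z < -2.00) computed directly

print(f"P(Z <  2.00) = {upper_area:.4f}")
print(f"1 - P(Z < 2.00) = {lower_by_symmetry:.4f}")
print(f"P(Z < -2.00) = {lower_direct:.4f}")
```

A printed table that lists only non-negative Z-values relies on exactly this identity.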
General Procedure for Finding Probabilities

To find P(a < X < b) when X is distributed normally:

• Translate X-values to Z-values
• Draw the normal curve for the problem in terms of X
• Transform a greater-than type probability into a less-than type probability (if needed)
• Use the Cumulative Normal Table

Reasons for using the Standard Normal table

• The integral of the normal density function has no closed-form expression, so it must be evaluated numerically
• Each combination of mean and variance would need its own table, which would require an infinite number of tables. To overcome this problem, we standardize and use a single standard normal table. The mean and variance of a standard normal variable are 0 and 1, respectively.
Finding Normal Probabilities

• Suppose X is normal with mean 8.0 and standard deviation 5.0
• Find P(X < 8.6)

[Figure: normal curve over X with 8.0 and 8.6 marked]
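Finding Normal Probabilities
(continued)
• Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6)

Z = (X − μ) / σ = (8.6 − 8.0) / 5.0 = 0.12

[Figure: two curves side by side. Left: X scale with μ = 8, σ = 5, marked at 8 and 8.6, showing P(X < 8.6). Right: Z scale with μ = 0, σ = 1, marked at 0 and 0.12, showing P(Z < 0.12)]

P(X < 8.6) = P(Z < 0.12)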
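Solution: Finding P(Z < 0.12)

Standardized Normal Probability Table (Portion)
z     F(z)
.10   .5398
.11   .5438
.12   .5478
.13   .5517

P(X < 8.6) = P(Z < 0.12) = F(0.12) = 0.5478

[Figure: standardized normal curve marked at 0.00 and 0.12, with the area to the left of 0.12 representing F(0.12)]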
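The table lookup can be cross-checked in software. The sketch below rebuilds the four table rows by rounding the standard normal CDF to four decimals and then evaluates the example with mean 8.0 and standard deviation 5.0. It uses `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Rebuild a small portion of the cumulative table.
z_values = [0.10, 0.11, 0.12, 0.13]
table = {z: round(Z.cdf(z), 4) for z in z_values}
for z, f in table.items():
    print(f"{z:.2f}  {f:.4f}")

# The worked example: standardize, then evaluate the CDF.
mu, sigma, x = 8.0, 5.0, 8.6
z = (x - mu) / sigma
p = Z.cdf(z)
print(f"Z = {z:.2f}, P(X < {x}) = {p:.4f}")
```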
Upper Tail Probabilities

• Suppose X is normal with mean 8.0 and standard deviation 5.0.
• Now Find P(X > 8.6)

[Figure: normal curve over X with 8.0 and 8.6 marked]
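Upper Tail Probabilities
(continued)
• Now Find P(X > 8.6)…
P(X > 8.6) = P(Z > 0.12) = 1.0 − P(Z ≤ 0.12)
= 1.0 − 0.5478 = 0.4522

[Figure: two standardized normal curves marked at 0 and 0.12. Left: total area 1.000 with area 0.5478 to the left of 0.12. Right: the upper tail, 1.0 − 0.5478 = 0.4522]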
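An upper-tail probability is the complement of the cumulative value. This is a minimal sketch, assuming the same example values (mean 8.0, standard deviation 5.0, cutoff 8.6); the helper name `upper_tail` is hypothetical.

```python
from statistics import NormalDist


def upper_tail(x, mu, sigma):
    """P(X > x) for a normal X, computed as 1 - F(x)."""
    return 1.0 - NormalDist(mu, sigma).cdf(x)


p_upper = upper_tail(8.6, 8.0, 5.0)
p_lower = NormalDist(8.0, 5.0).cdf(8.6)

print(f"P(X < 8.6) = {p_lower:.4f}")
print(f"P(X > 8.6) = {p_upper:.4f}")
```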
Finding the X value for a Known Probability

• Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

X = μ + Zσ
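Finding the X value for a Known Probability
(continued)

Example:
• Suppose X is normal with mean 8.0 and standard deviation 5.0.
• Now find the X value so that only 20% of all values are below this X

[Figure: normal curve with lower-tail area .2000; the unknown point is marked "?" on both the X scale (mean 8.0) and the Z scale (mean 0)]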
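Find the Z value for 20% in the Lower Tail
1. Find the Z value for the known probability

Standardized Normal Probability Table (Portion)
z     F(z)
.82   .7939
.83   .7967
.84   .7995
.85   .8023

• 20% area in the lower tail is consistent with a Z value of −0.84: the table gives F(0.84) = .7995 ≈ .80, so by symmetry F(−0.84) ≈ .20

[Figure: normal curve with lower-tail area .20 and remaining area .80; the point sits at Z = −0.84, and the corresponding X value (mean 8.0) is still marked "?"]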
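Both steps (find Z, then convert with X = μ + Zσ) can be sketched with the inverse CDF. The snippet assumes the example values μ = 8.0 and σ = 5.0, and compares the table value Z = −0.84 with the unrounded quantile from `NormalDist.inv_cdf`.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0   # example parameters
target = 0.20          # desired lower-tail probability

# Step 1: Z value for the known probability (inverse of the standard normal CDF).
z_exact = NormalDist().inv_cdf(target)
z_table = -0.84        # value read from a two-decimal table

# Step 2: convert to X units with X = mu + Z * sigma.
x_exact = mu + z_exact * sigma
x_table = mu + z_table * sigma

print(f"Z (inverse CDF) = {z_exact:.4f}, X = {x_exact:.3f}")
print(f"Z (table)       = {z_table:.2f}, X = {x_table:.2f}")
```

The small gap between the two X values comes only from rounding Z to two decimals in the table.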
Problems on Normal Distribution (Anthony)
• 5.1.1 Suppose that Z ∼ N(0, 1). Find:
• (a) P(Z ≤ 1.34), (b) P(Z ≥ −0.22), (c) P(−2.19 ≤ Z ≤ 0.43)
• (d) P(0.09 ≤ Z ≤ 1.76), (e) P(|Z| ≤ 0.38)
• (f) The value of x for which P(Z ≤ x) = 0.55
• (g) The value of x for which P(Z ≥ x) = 0.72
• (h) The value of x for which P(|Z| ≤ x) = 0.31
Solution:
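One possible way to check hand-worked answers to 5.1.1 is to evaluate each part numerically. This is a sketch, not the textbook's solution; the dictionary name `answers` is an illustrative choice, and the results may differ from table-based answers in the last decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, Z ~ N(0, 1)

answers = {
    "a": Z.cdf(1.34),                      # P(Z <= 1.34)
    "b": 1.0 - Z.cdf(-0.22),               # P(Z >= -0.22)
    "c": Z.cdf(0.43) - Z.cdf(-2.19),       # P(-2.19 <= Z <= 0.43)
    "d": Z.cdf(1.76) - Z.cdf(0.09),        # P(0.09 <= Z <= 1.76)
    "e": 2.0 * Z.cdf(0.38) - 1.0,          # P(|Z| <= 0.38), by symmetry
    "f": Z.inv_cdf(0.55),                  # x with P(Z <= x) = 0.55
    "g": Z.inv_cdf(1.0 - 0.72),            # x with P(Z >= x) = 0.72
    "h": Z.inv_cdf((1.0 + 0.31) / 2.0),    # x with P(|Z| <= x) = 0.31
}

for part, value in answers.items():
    print(f"({part}) {value:.4f}")
```

Parts (e) and (h) use the identity P(|Z| ≤ x) = 2F(x) − 1, which follows from the symmetry of the standard normal curve.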
Problems on Normal Distribution (Anthony)
• 5.1.3 Suppose that X ∼ N(10, 2). Find:
• (a) P(X ≤ 10.34), (b) P(X ≥ 11.98)
• (c) P(7.67 ≤ X ≤ 9.90), (d) P(10.88 ≤ X ≤ 13.22)
• (e) P(|X − 10| ≤ 3), (f) The value of x for which P(X ≤ x) = 0.81
• (g) The value of x for which P(X ≥ x) = 0.04
• (h) The value of x for which P(|X − 10| ≥ x) = 0.63

• 5.1.4 Suppose that X ∼ N(−7, 14). Find:
• (a) P(X ≤ 0), (b) P(X ≥ −10)
• (c) P(−15 ≤ X ≤ −1), (d) P(−5 ≤ X ≤ 2)
• (e) P(|X + 7| ≥ 8), (f) The value of x for which P(X ≤ x) = 0.75
• (g) The value of x for which P(X ≥ x) = 0.27
• (h) The value of x for which P(|X + 7| ≤ x) = 0.44
Problems on Normal Distribution (Anthony)
5.1.9 The thicknesses of glass sheets produced by a certain process are normally
distributed with a mean of μ = 3.00 mm and a standard deviation of σ = 0.12 mm.
(a) What is the probability that a glass sheet is thicker than 3.2 mm?
(b) What is the probability that a glass sheet is thinner than 2.7 mm?
(c) What is the value of c for which there is a 99% probability that a glass sheet
has a thickness within the interval [3.00 − c, 3.00 + c]?
Solution:
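A numerical check of 5.1.9 might look like the following. It is a sketch rather than the textbook's worked solution, and the variable names are illustrative. Part (c) uses the fact that a symmetric interval holding 99% of the probability leaves 0.5% in each tail, so c = σ · z where F(z) = 0.995.

```python
from statistics import NormalDist

mu, sigma = 3.00, 0.12          # thickness in mm, from the problem statement
thickness = NormalDist(mu, sigma)

# (a) P(X > 3.2): upper tail, so take the complement of the CDF.
p_thicker = 1.0 - thickness.cdf(3.2)

# (b) P(X < 2.7): lower tail, read directly from the CDF.
p_thinner = thickness.cdf(2.7)

# (c) c such that P(3.00 - c <= X <= 3.00 + c) = 0.99.
z_995 = NormalDist().inv_cdf(0.995)
c = z_995 * sigma

print(f"(a) P(X > 3.2) = {p_thicker:.4f}")
print(f"(b) P(X < 2.7) = {p_thinner:.4f}")
print(f"(c) c = {c:.4f} mm")
```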
Problems on Normal Distribution (Anthony)
5.1.10 The amount of sugar contained in 1-kg packets is actually normally
distributed with a mean of μ = 1.03 kg and a standard deviation of σ = 0.014 kg.
(a) What proportion of sugar packets are underweight?
(b) If an alternative package-filling machine is used for which the weights of the
packets are normally distributed with a mean of μ = 1.05 kg and a standard
deviation of σ = 0.016 kg, does this result in an increase or a decrease in the
proportion of underweight packets?
(c) In each case, what is the expected value of the excess package weight above
the advertised level of 1 kg?
Solution:
Problems on Normal Distribution (Anthony)
5.1.13 The resistance in milliohms of 1 meter of copper cable at a certain temperature is
normally distributed with mean μ = 23.8 and variance σ² = 1.28.
(a) What is the probability that a 1-meter segment of copper cable has a resistance less
than 23.0?
(b) What is the probability that a 1-meter segment of copper cable has a resistance greater
than 24.0?
(c) What is the probability that a 1-meter segment of copper cable has a resistance
between 24.2 and 24.5?
(d) What is the upper quartile of the resistance level?
(e) What is the 95th percentile of the resistance level?