Normal Distribution Review

Uploaded by

mathishamache
The Normal Distribution

Review (from page 255 to 283)

Read book §6.1 §6.2

I. The Normal Distribution

 The normal distribution is considered the most prominent probability distribution because it has
several important properties that make it widely applicable in various fields, such as statistics,
finance, and natural sciences.

 It is often used as a first approximation to describe real-valued random variables that tend to
cluster around a single mean.

 The "bell" shape of the normal distribution makes it a convenient choice for modeling a large
variety of random variables encountered in practice.

The normal distribution

 The area under the curve represents the total probability, which equals 1.
 The distribution is symmetric around the mean, meaning that half of the total probability lies to the
left of the mean, and the other half lies to the right.

[Figure: normal curve f(X) centered at μ, with an area of 0.5 on each side of the mean]

P(−∞ < X < μ) = 0.5 and P(μ < X < +∞) = 0.5

P(−∞ < X < +∞) = 1.0
I. The Normal Distribution

 Symmetrical.

 Since it is symmetric: median = mean (50% of the population lies on each side).

 The mean is also the highest point of the curve, so most of the population is around the mean

 so median = mean = mode

1. How does σ affect the shape ?

 The standard deviation describes how far values lie from the mean.
 It affects the concentration of the curve.

The larger the standard deviation, the less concentrated and the more spread out the bell curve is: a small σ gives a narrow, tall curve, and a large σ gives a wide, flat curve.
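
The effect of σ can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the mean of 0 and the three σ values are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist

mu = 0.0                   # illustrative mean (assumption)
sigmas = (0.5, 1.0, 2.0)   # illustrative standard deviations (assumption)

peaks = {}    # height of the curve at the mean, per sigma
central = {}  # probability of landing within 1 unit of the mean, per sigma

for sigma in sigmas:
    dist = NormalDist(mu, sigma)
    peaks[sigma] = dist.pdf(mu)
    central[sigma] = dist.cdf(mu + 1) - dist.cdf(mu - 1)
    print(f"sigma = {sigma}: peak height = {peaks[sigma]:.3f}, "
          f"P(mu - 1 < X < mu + 1) = {central[sigma]:.3f}")
```

As σ grows, the peak gets lower and less probability sits close to the mean, which is what "more spread out" means.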

2. The Empirical Rules

 Approximately 68% of the data in a bell-shaped distribution lies within one standard deviation of the
mean, or µ ± σ

 Approximately 95% of the data in a bell-shaped distribution lies within two standard deviations of
the mean, or µ ± 2σ

 Approximately 99.7% of the data in a bell-shaped distribution lies within three standard deviations
of the mean, or µ ± 3σ

[Figure: bell curves showing 95% of the area within μ ± 2σ and 99.7% within μ ± 3σ]

2. The Empirical Rules

 Suppose that the variable “Stat scores” (noted X) is normally distributed with a mean of 13 and a standard
deviation of 2. Then,

 68% of all test takers scored between 11 and 15 (13 ± 2).

 95% of all test takers scored between 9 and 17 (13 ± 2×2).

 99.7% of all test takers scored between 7 and 19 (13 ± 3×2).
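
The 68 / 95 / 99.7 figures are rounded values of exact normal probabilities. The sketch below recomputes them for the Stat scores example (mean 13, standard deviation 2) with the standard-library `statistics.NormalDist`; the helper name `prob_within` is hypothetical.

```python
from statistics import NormalDist

scores = NormalDist(mu=13, sigma=2)   # "Stat scores" example from the slide

def prob_within(dist, k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal distribution."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

for k in (1, 2, 3):
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    print(f"between {low:g} and {high:g}: {prob_within(scores, k):.4f}")
```

The exact probabilities are about 0.6827, 0.9545 and 0.9973, which the empirical rule rounds to 68%, 95% and 99.7%.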

3. The Standardized Normal distribution

 Any normal distribution (with any mean and standard deviation combination) can be transformed into the
standardized normal distribution (Z).

 The standardized normal distribution (Z) has a mean of 0 and a standard deviation of 1.

 This process simplifies calculations and comparisons between different normal distributions.

3. The Standardized Normal distribution

 Translate from X to the standardized normal (the “Z” distribution) by subtracting the mean of X and
dividing by its standard deviation:

Z = (X − μ) / σ

The Z distribution always has mean = 0 and standard deviation = 1.

 Why Standardize?

 Comparison across different scales: It allows comparison of variables measured in different units
(e.g., height in cm, weight in kg) by bringing them into the same scale.
 Easier analysis: Many statistical techniques require or perform better when data is standardized.

3. The Standardized Normal Distribution

 Also known as the “Z” distribution.

 Mean is 0.
 Standard deviation is 1.

[Figure: standardized normal curve f(Z) centered at Z = 0]

 Values above the mean have positive Z-values.

 Values below the mean have negative Z-values.

3. The Standardized Normal Distribution

Example

 If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is:

X  μ $200  $100
Z  2.0
σ $50

 This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
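
The same standardization can be written as a one-line function. This is a minimal sketch using the numbers of the example; the function name `z_score` is hypothetical, and `statistics.NormalDist` is only used to show that probabilities agree in X units and in Z units.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z = z_score(200, mu=100, sigma=50)
print(z)  # 2.0

# The probability is the same whether computed in X units or in Z units.
p_x = NormalDist(100, 50).cdf(200)   # P(X < 200)
p_z = NormalDist().cdf(z)            # P(Z < 2.0), standardized normal
print(round(p_x, 4), round(p_z, 4))
```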

3. The Standardized Normal Distribution

 Comparing X and Z units

[Figure: the same bell curve on two scales. X axis (μ = 100, σ = 50): 100 and 200. Z axis (μ = 0, σ = 1): 0 and 2.0.]

 Note that the shape of the distribution is the same, only the scale has changed. We can express the
problem in the original units of X or in standardized units (Z).

3. The Standardized Normal Distribution

 The standardized normal distribution (Z distribution) is symmetrical around 0, so:

 P(Z < 0) = P(Z > 0) = 0.5 = 50%
 P(Z > −z) = P(Z < +z)

Same area, same probability: for example, P(Z > −0.9) = P(Z < +0.9).

Exercise 1

a. P(−2 < Z < 2) ≈ 0.95

b. P(Z < 3) ≈ 0.997 + 0.003/2 = 0.9985
c. X is normally distributed with a mean of 100 and a standard deviation of 15.
i. P(X > 115) = P(Z > 1) ≈ (1 − 0.68) / 2 = 0.16
ii. P(70 < X < 130) = P(−2 < Z < 2) ≈ 0.95
iii. P(X > 145) = P(Z > 3) ≈ 0.003/2 = 0.0015
iv. P(X < a) = 0.975
 a = mean + 2 sigma = 100 + 30 = 130, since P(Z < 2) ≈ 0.95 + 0.05/2 = 0.975
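
The answers to Exercise 1 rely on the empirical rule, so they are rounded. As a cross-check, the sketch below computes the exact normal probabilities with the standard-library `statistics.NormalDist`; the dictionary keys are just labels for the exercise parts.

```python
from statistics import NormalDist

Z = NormalDist()           # standardized normal: mean 0, standard deviation 1
X = NormalDist(100, 15)    # the X of part c

answers = {
    "a":     Z.cdf(2) - Z.cdf(-2),      # P(-2 < Z < 2)
    "b":     Z.cdf(3),                  # P(Z < 3)
    "c.i":   1 - X.cdf(115),            # P(X > 115)
    "c.ii":  X.cdf(130) - X.cdf(70),    # P(70 < X < 130)
    "c.iii": 1 - X.cdf(145),            # P(X > 145)
    "c.iv":  X.inv_cdf(0.975),          # a such that P(X < a) = 0.975
}

for part, value in answers.items():
    print(f"{part}: {value:.4f}")
```

The exact value for part iv is slightly below 130 (about 129.4), because the exact multiplier for 97.5% is about 1.96 rather than 2.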

3. The Standardized Normal Distribution

 There are many z tables…


 How to read the table?
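
Z tables come in several layouts. The sketch below assumes the common cumulative layout, where each entry is P(Z < z), the row gives z to one decimal and the column gives the second decimal. The function name `z_table_value` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standardized normal

def z_table_value(row, col):
    """Cumulative table entry P(Z < row + col), rounded to 4 decimals like a printed table."""
    return round(Z.cdf(row + col), 4)

# Reading z = 1.96: row 1.9, column 0.06.
print(z_table_value(1.9, 0.06))   # 0.975

# By symmetry, P(Z > -z) = P(Z < +z), so one table covers both tails.
print(round(1 - Z.cdf(-0.9), 4) == z_table_value(0.9, 0.00))
```

If a table uses a different layout (for example, the area between 0 and z), 0.5 has to be added or subtracted accordingly.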

4. Evaluating Normality

 Not all continuous distributions are normal.

 It is important to evaluate how well the data set is approximated by a normal distribution.

 There are several ways to evaluate the normality of a dataset, including graphical methods,
descriptive statistics…

 Normally distributed data should approximate the theoretical normal distribution:

 The normal distribution is bell shaped (symmetrical) where the mean is equal to the median.

 The empirical rule applies to the normal distribution.

 The interquartile range (Q3 − Q1) of a normal distribution is approximately 1.33 standard deviations (the exact value is closer to 1.35).

4. Evaluating Normality

a. Graphical Methods
 These methods help visually assess how closely the data align with a normal distribution.

 Histogram: Plotting a histogram of the data is a simple way to visually inspect its distribution. If the
data is approximately normally distributed, the histogram should show a bell-shaped curve.

 Box Plot: A box plot can give an indication of normality by showing the symmetry of the data. In a
normal distribution, the median line should be centered within the box, and the whiskers should be
of roughly equal length on either side.
b. Descriptive Statistics
 Do the mean, median and mode have similar values?

 Is the interquartile range approximately 1.33σ?

 Is the range approximately 6σ?
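
The descriptive checks above can be sketched in a few lines. The data set here is hypothetical: 5000 values simulated from a normal distribution with mean 70 and standard deviation 10, with a fixed seed so the run is repeatable. The function name `normality_summary` is also an assumption.

```python
import random
from statistics import mean, median, stdev, quantiles

random.seed(42)  # fixed seed so the illustration is repeatable (assumption)
# Hypothetical data set: 5000 simulated values from N(70, 10).
data = [random.gauss(70, 10) for _ in range(5000)]

def normality_summary(values):
    """Descriptive checks: mean vs median, IQR / s, and range / s."""
    s = stdev(values)
    q1, _, q3 = quantiles(values, n=4)   # the three quartiles
    return {
        "mean": mean(values),
        "median": median(values),
        "iqr_over_s": (q3 - q1) / s,
        "range_over_s": (max(values) - min(values)) / s,
    }

summary = normality_summary(data)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The range check is only a rough guide: in a large sample the most extreme values lie further out, so the range usually exceeds 6 standard deviations.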

4. Evaluating Normality

c. Observe the distribution of the data set:

 Do approximately 68% of the observations lie within mean ±1 standard deviation?

 Do approximately 95% of the observations lie within mean ±2 standard deviations?

 Do approximately 80% of the observations lie within mean ±1.28 standard deviations?
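
These three questions amount to counting observations inside mean ± k standard deviations. A minimal sketch, again on a hypothetical simulated data set (mean 13, standard deviation 2, fixed seed); the function name `share_within` is an assumption.

```python
import random
from statistics import mean, stdev

random.seed(7)  # fixed seed so the illustration is repeatable (assumption)
# Hypothetical data set: 5000 simulated values from N(13, 2).
data = [random.gauss(13, 2) for _ in range(5000)]

def share_within(values, k):
    """Fraction of observations inside mean ± k sample standard deviations."""
    m, s = mean(values), stdev(values)
    inside = sum(1 for v in values if m - k * s <= v <= m + k * s)
    return inside / len(values)

shares = {k: share_within(data, k) for k in (1, 1.28, 2)}
for k, share in shares.items():
    print(f"within mean ± {k} sd: {share:.1%}")
```

Shares close to 68%, 80% and 95% are consistent with normality; large departures suggest the data are not well approximated by a normal distribution.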

5. Review

E(X + Y) = E(X) + E(Y)

V(X + Y) = V(X) + V(Y) + 2 cov(X,Y)

E(X · Y) = E(X) · E(Y) if independent

V(X − Y) = V(X) + V(Y) − 2 cov(X,Y)

If X and Y are independent, cov(X,Y) = 0

5. Review

 Properties

Suppose X ~ N (µx ; σx) and Y ~ N (µY ; σY), where the second parameter is the standard deviation; a and k are constants

 a*X ~ N (a*µx ; |a|*σx)

 (a*X + k) ~ N (a*µx + k ; |a|*σx)

 (X + Y) ~ N (µx + µY ; √(σx² + σY²)) (if X and Y are independent)

 (X – Y) ~ N (µx – µY ; √(σx² + σY²)) (if X and Y are independent)
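
Python's standard-library `statistics.NormalDist` implements these same rules: multiplying or shifting by a constant, and adding or subtracting two `NormalDist` objects, which it treats as independent. The sketch below uses illustrative parameters that are assumptions, not values from the slides.

```python
from statistics import NormalDist

# Illustrative parameters (assumptions): X ~ N(100 ; 15), Y ~ N(50 ; 8)
X = NormalDist(mu=100, sigma=15)
Y = NormalDist(mu=50, sigma=8)
a, k = -2, 10   # illustrative constants (assumptions)

scaled = a * X          # mean a*mu_x, standard deviation |a|*sigma_x
shifted = a * X + k     # adding k moves the mean only
total = X + Y           # independent sum: the variances add
diff = X - Y            # independent difference: the variances still add

for name, dist in [("a*X", scaled), ("a*X + k", shifted),
                   ("X + Y", total), ("X - Y", diff)]:
    print(f"{name}: mean = {dist.mean:g}, sd = {dist.stdev:g}")
```

Note that the standard deviation of X − Y is the same as that of X + Y: subtracting an independent variable still adds its variance.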

Exercise 2

A large group of students took a test in Physics, and the final grades have a mean of 70 and a standard
deviation of 10. If we can approximate the distribution of these grades by a normal distribution, what
percent of the students:
Q1) scored higher than 80?
Q2) should pass the test (grades ≥ 60)?
Q3) should fail the test (grades < 60)?
Q4) Determine the symmetrical interval (midpoint is the mean) that contains 95% of the scores.
Q5) Determine the symmetrical interval (midpoint is the mean) that contains 80% of the scores.
Q6) If we take 5 students with replacement, what is the probability that the sum of the 5 grades is
higher than 350?
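
One possible way to check answers to Exercise 2 numerically is sketched below with the standard-library `statistics.NormalDist`. It assumes that sampling with replacement makes the 5 grades independent, so their sum is normal with mean 5 × 70 and standard deviation 10 × √5. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

grades = NormalDist(mu=70, sigma=10)

q1 = 1 - grades.cdf(80)      # share scoring higher than 80
q2 = 1 - grades.cdf(60)      # share passing (grade >= 60)
q3 = grades.cdf(60)          # share failing (grade < 60)
q4 = (grades.inv_cdf(0.025), grades.inv_cdf(0.975))   # central 95% interval
q5 = (grades.inv_cdf(0.10), grades.inv_cdf(0.90))     # central 80% interval

# Sum of 5 independent grades (assumption: independence from sampling with replacement).
total = NormalDist(mu=5 * 70, sigma=10 * sqrt(5))
q6 = 1 - total.cdf(350)      # 350 is the mean of the sum

print(f"Q1: {q1:.4f}  Q2: {q2:.4f}  Q3: {q3:.4f}")
print(f"Q4: {q4[0]:.1f} to {q4[1]:.1f}")
print(f"Q5: {q5[0]:.1f} to {q5[1]:.1f}")
print(f"Q6: {q6:.4f}")
```

The empirical rule gives close, rounded versions of these results (for example 16% for Q1 and the interval 50 to 90 for Q4).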

Exercise 3

You work in the food-processing industry. You know that the weight of a packet of potatoes is a normally
distributed random variable whose mean is 200 grams and whose standard deviation is 10 grams. You
randomly and independently select 10 packets; the total weight of your 10 packets is defined by the
variable TW.

1- Write the equation of the variable TW. Is TW a random variable? Explain why.

2- Calculate E(TW) and V(TW).

3- What is the probability that the total weight is higher than 2100 grams?
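
A numerical check for Exercise 3 can be sketched as follows, writing TW = X1 + X2 + … + X10 with the ten packet weights independent and each N(200 ; 10). Because the packets are independent, the means add and the variances add. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n = 10          # number of packets
mu = 200        # mean weight of one packet, in grams
sigma = 10      # standard deviation of one packet, in grams

expected_tw = n * mu            # E(TW): the means add
variance_tw = n * sigma ** 2    # V(TW): the variances add (independence)

tw = NormalDist(mu=expected_tw, sigma=sqrt(variance_tw))
p_over_2100 = 1 - tw.cdf(2100)  # P(TW > 2100)

print(f"E(TW) = {expected_tw} g, V(TW) = {variance_tw} g^2")
print(f"P(TW > 2100) = {p_over_2100:.5f}")
```

Here 2100 grams lies a little more than 3 standard deviations of TW above its mean, so the probability is very small.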
