Week 4
Changing σ increases or decreases the spread.
[Figure: normal curve centered at μ on the X axis, with spread σ]
Probability density function (pdf)
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
This is a bell-shaped curve with different centers and spreads depending on μ and σ.
Note constants: π = 3.14159, e = 2.71828
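As a quick check of the density formula, it can be typed in by hand and compared with R's built-in dnorm. This is only a sketch: the helper name normal_pdf and the example values (mean 100, sd 15) are illustrative assumptions, not taken from the slides.

```r
# Hand-written normal density; normal_pdf is a hypothetical helper name.
normal_pdf <- function(x, mu, sigma) {
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

# Illustrative values only (assumed, not from the slides).
x <- c(85, 100, 115)
by_hand  <- normal_pdf(x, mu = 100, sigma = 15)
built_in <- dnorm(x, mean = 100, sd = 15)

# The two calculations should agree up to floating-point error.
all.equal(by_hand, built_in)
```

The curve is highest at x = μ, and a larger σ lowers and widens it.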
Normal Curves
R Code:
The beauty of the normal curve:
Example
But…
How would you find the percentage of people within 1.2 standard
deviations of mean IQ score?
What is the probability that a person has an IQ above 110?
…
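Questions like these can be answered with pnorm. The sketch below assumes IQ scores are normal with mean 100 and standard deviation 15; those parameters are a common convention and are not stated in these notes. The first question needs no parameters, because "within 1.2 standard deviations" is the same area for every normal curve.

```r
# Share of any normal population within 1.2 standard deviations of the mean.
p_within <- pnorm(1.2) - pnorm(-1.2)
p_within        # about 0.77

# ASSUMPTION: IQ ~ Normal(mean = 100, sd = 15); not given in the notes.
iq_mean <- 100
iq_sd   <- 15

# Upper-tail area: probability of an IQ above 110.
p_above_110 <- pnorm(110, mean = iq_mean, sd = iq_sd, lower.tail = FALSE)
p_above_110     # about 0.25
```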
Another example
Somebody calculated all the integrals for the standard normal and
put them in a table! So we never have to integrate! Even better,
computers now do all the integration.
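To see that the software really is doing the integration, the area under the standard normal density can be computed numerically and compared with pnorm. A minimal sketch; the cutoff z = 1.6 is just an example value.

```r
# Area under the standard normal curve to the left of z = 1.6, two ways.
by_integration <- integrate(dnorm, lower = -Inf, upper = 1.6)$value
by_pnorm       <- pnorm(1.6)

by_integration  # numerical integral of the density
by_pnorm        # built-in cumulative probability, about 0.9452
```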
Example:
z-score for Anwar: z = 2.5
z-score for John: z = 1
So Anwar's income percentile within his own country is higher than John's within his.
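The two z-scores can be turned into percentiles with pnorm. This sketch assumes incomes in each country are approximately normal, which the example implies but does not state.

```r
# z-scores from the example.
z_anwar <- 2.5
z_john  <- 1

# ASSUMPTION: incomes in each country are approximately normal,
# so the percentile is the standard normal area to the left of z.
pct_anwar <- pnorm(z_anwar)   # about 0.994
pct_john  <- pnorm(z_john)    # about 0.841

pct_anwar > pct_john          # TRUE
```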
Transformation of Normal to Standard Normal
$140 or higher?
c. What is the probability of selecting someone with an income between
$80 and $125?
d. What are the minimum weekly earnings of a driver in the top 1%?
Solution.
a. What is the probability of selecting someone having weekly
earnings of $140 or lower when sampling drivers at random?
z = (x − μ) / σ = (140 − 100) / 25 = 1.6, using μ = 100 and σ = 25
P(x ≤ 140) = P(z ≤ 1.6) = 0.9452
From the table, z = 1.60 corresponds to a left-tail (smaller than) area of P(z ≤ 1.6) = 0.9452, or a 94.52% chance.
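The standardization step can be checked in R: converting x to a z-score and using the standard normal gives the same area as calling pnorm with the original mean and standard deviation. The values μ = 100 and σ = 25 are the ones used in the pnorm calls in the R code section of these notes.

```r
mu    <- 100   # mean weekly earnings, as used in the R code section
sigma <- 25    # standard deviation, as used in the R code section
x     <- 140

# Transform to the standard normal scale.
z <- (x - mu) / sigma
z                                   # 1.6

# Same left-tail area either way.
p_standard <- pnorm(z)
p_direct   <- pnorm(x, mean = mu, sd = sigma)
all.equal(p_standard, p_direct)
```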
Solution
[Figure: many samples drawn from one population]
Central Limit Theorem (CLT)
If a sample of size n ≥ 30 is taken from a population with any type of distribution that has mean μ and standard deviation σ, the sample means will have an approximately normal distribution.
[Figure: distribution of sample means x̄ centered at μ]
Central Limit Theorem (CLT)
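If the population itself is normally distributed with mean μ and standard deviation σ, the sample means will have a normal distribution for any sample size n.
[Figure: distribution of sample means x̄ centered at μ]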
Central Limit Theorem (CLT)
Mean of the sample means: $\mu_{\bar{x}} = \mu$
Standard deviation of the sample means: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
This is also called the standard error of the mean.
Simulation in R
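One possible simulation of the CLT is sketched below. The population (exponential with rate 1, so mean 1 and standard deviation 1), the sample size, the number of repetitions, and the seed are all assumptions chosen for illustration; they are not from the slides.

```r
set.seed(4)       # arbitrary seed, for reproducibility only

n    <- 40        # sample size (assumed)
reps <- 10000     # number of samples drawn (assumed)

# ASSUMPTION: a skewed population, Exponential(rate = 1), with mean 1 and sd 1.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

mean(sample_means)   # close to the population mean, 1
sd(sample_means)     # close to sigma / sqrt(n) = 1 / sqrt(40), about 0.158

# The histogram of sample means looks roughly bell-shaped
# even though the population is strongly skewed.
hist(sample_means, breaks = 40,
     main = "Sample means, n = 40", xlab = "Sample mean")
```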
Example
According to a recent study, 16-24-year-olds spend an average of 3 hours
per day on social media. Assume the standard deviation is 1.5 hours.
If you randomly select a sample of 40 individuals from among 16-24-year-olds:
a. What is the standard error of the mean in this example?
b. What is the probability that the sample mean time spent is greater than
3.5 hours?
c. What is the probability that the sample mean time spent on social media is
between 2 and 3 hours?
Solution
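One way to work the three parts in R, relying on the CLT: since n = 40 ≥ 30, the sample mean is treated as approximately normal with mean μ = 3 and standard error σ/√n. The variable names are illustrative.

```r
mu    <- 3      # population mean, hours per day
sigma <- 1.5    # population standard deviation, hours
n     <- 40     # sample size

# a. Standard error of the mean.
se <- sigma / sqrt(n)
se                       # about 0.237

# b. P(sample mean > 3.5): upper-tail area.
p_b <- pnorm(3.5, mean = mu, sd = se, lower.tail = FALSE)
p_b                      # about 0.0175

# c. P(2 < sample mean < 3): difference of two left-tail areas.
p_c <- pnorm(3, mean = mu, sd = se) - pnorm(2, mean = mu, sd = se)
p_c                      # just under 0.5
```

Part c is almost exactly 0.5 because 2 hours lies more than 4 standard errors below the mean, so nearly all of the lower half of the curve falls between 2 and 3.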
R code for Normal Distribution
dnorm(x, mean, sd) # height of the probability density at x
pnorm(x, mean, sd) # cumulative distribution function, P(X ≤ x)
qnorm(p, mean, sd) # value whose cumulative probability equals p
rnorm(n, mean, sd) # generates n random numbers from a normal distribution
# Create a sample of 100 numbers which are normally distributed with mean = 10, stdev = 5.
y <- rnorm(100, mean=10, sd = 5)
# What is the probability of selecting someone with an income of $140 or lower?
pnorm(140, mean = 100, sd = 25)
# What is the probability of selecting someone with an income of $130 or higher?
pnorm(130, mean = 100, sd = 25, lower.tail = FALSE) # or, equivalently:
1 - pnorm(130, mean = 100, sd = 25)
# What is the probability of selecting someone with an income between $80 and $125?
pnorm(125, mean = 100, sd = 25) - pnorm(80, mean = 100, sd = 25)
# What are the minimum weekly earnings of a driver in the top 1%?
qnorm(0.99, mean = 100, sd = 25)