Stats2 Normal Distribution

Uploaded by

yuz734341

MMME4071

Studying Human
Performance

The normal distribution


Dr Robert Houghton
What we will cover
• What is the normal distribution?
• Why is it important?
• Calculating standard deviation
• Using Z-scores
– To convert from the normal distribution to values
– Practical applications

• Chapters 3 and 4 in the workbook


The Binomial Distribution
• Bernoulli (1655-1705)
• If we toss a coin 20 times, what is the probability of observing a given number of heads?
• This generates the “binomial distribution”
• This is for discrete variables (heads or tails, yes or no, 0 or 1 etc.)
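
The coin-toss question above can be sketched in a few lines of Python. This is a minimal illustration, assuming a fair coin (p = 0.5); the function name `binomial_pmf` is made up for this example and is not from the workbook.

```python
from math import comb

def binomial_pmf(k: int, n: int = 20, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent tosses.

    Assumes every toss has the same probability p of landing heads.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Print the probability of a few different numbers of heads in 20 tosses.
for heads in (0, 5, 10, 15, 20):
    print(f"P({heads} heads in 20 tosses) = {binomial_pmf(heads):.4f}")
```

Plotting `binomial_pmf(k)` for every k from 0 to 20 gives the symmetrical, bell-like shape that the normal distribution approximates.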
The Normal Distribution
• Continuous variables (i.e., interval or ratio)
– E.g., height, reaction time, body measurements etc.
– Quetelet “Equal causes occurring in opposite directions”???
– Not compatible with ordinal or categorical data (X axis units?) = “distribution free”
– If you fit a curve to the binomial, you get the same shape (not a coincidence)
– Aka “Gaussian” (engineers), “Bell Curve” (Victorians, outdated)
– If you know the mean and can say something about the amount of variability (spoiler: “the standard deviation”) you can describe the whole distribution and use this in advanced statistical calculations…
• Properties:
– Symmetrical
– Bell-shaped
– Most values fall towards the middle
– Mean=Median=Mode
OKCupid Members…
Skewness
Skew = when data are not normally distributed and the majority of scores
fall at one end of the scale.

Many statistical tests assume normally distributed data, so it is not appropriate to perform them on a heavily skewed sample.

[Figure: frequency plotted against scores for negatively and positively skewed data, with the mode, median and mean marked on each curve]

Negatively skewed data: mean < median < mode.
Positively skewed data: mode < median < mean.
Skew - Income distribution
…revisited from last time
My exam has an average of 62%...
But each exam has a different “standard deviation”
68 – 95 – 99.7 rule
• One standard deviation = 68.3 percent of values within
• Two standard deviations = 95.4 percent of values within
• Three standard deviations = 99.7 percent of values within
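
The percentages in the rule can be checked numerically. The sketch below uses the standard result that the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2); the function name `fraction_within` is made up for this example.

```python
from math import erf, sqrt

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    within = fraction_within(k)
    tail = (1 - within) / 2  # fraction beyond k SDs on one side only
    print(f"{k} SD: {100 * within:.1f}% within, {100 * tail:.2f}% in each tail")
```

Because the curve is symmetrical, whatever is not within k standard deviations is split equally between the two tails.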
Percentage above/below…
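• Above the mean: 50%; below the mean: 50%
• More than 1 SD above the mean: 15.87%; more than 1 SD below: 15.87%
• More than 2 SD above the mean: 2.28%; more than 2 SD below: 2.28%
• More than 3 SD above the mean: 0.13%; more than 3 SD below: 0.13%

[Figure: normal curve; horizontal axis is the number of standard deviations above and below the mean, from -3 to 3]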
Calculating the standard deviation?
Sample variance (S²) is an estimate of the variability of a set of data: roughly, the average of the squared differences from the mean.
(1) Work out the mean for the data
(2) For each data entry, subtract the mean from it and square the result
(3) Add all the squared differences together
(4) Divide by N - 1
Standard deviation (S or SD) is an estimate of the average variability of a set of data, measured in the same units of measurement as the original data.
Standard deviation is the square root of the variance.
(5) Take the square root of S²
Description of data: 18 male students, height: mean 182.2 cm, s.d. 8.39 cm
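
The five steps can be followed literally in code. This is a minimal sketch; the function names and the `heights_cm` values are hypothetical and are not the 18 students' data from the slide.

```python
from math import sqrt

def sample_variance(data):
    """Sample variance, following steps (1) to (4)."""
    n = len(data)
    mean = sum(data) / n                            # (1) work out the mean
    squared_diffs = [(x - mean) ** 2 for x in data] # (2) subtract mean, square
    total = sum(squared_diffs)                      # (3) add them together
    return total / (n - 1)                          # (4) divide by N - 1

def sample_sd(data):
    """Sample standard deviation, step (5): square root of the variance."""
    return sqrt(sample_variance(data))

# Hypothetical heights in cm, for illustration only.
heights_cm = [170.0, 175.5, 182.0, 188.5, 191.0]
print(f"variance = {sample_variance(heights_cm):.2f} cm^2")
print(f"SD       = {sample_sd(heights_cm):.2f} cm")
```

Note that the variance is in squared units (cm²), while the standard deviation is back in the original units (cm).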
Two things to note in the
calculation
• IN GENERAL, WHY IS THERE SO MUCH SQUARING AND SQUARE ROOTING GOING ON IN STATISTICAL FORMULAE?
– Usually because we’re working out differences, and some of them will be negative. Squaring removes the sign so that positive and negative differences do not cancel out; taking the square root afterwards gets us back to the original units.

• WHY N-1, WHY NOT JUST CALCULATE THE MEAN OF THE SQUARED DIFFERENCES (i.e., DIVIDE BY N)?
– This is called “Bessel’s correction”
– In some books you’ll see just “N” in the variance formula, but this is the variance of the whole population. If you can measure everyone in “the world”, use N.
– Here we calculate the “sample variance”, and we use N-1. This recognises the fact that we’re assuming our sample is representative, but it is still just a sample. We get a slightly bigger estimated variance because we’re dividing by a smaller number.
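
Python's standard `statistics` module offers both versions, which makes the difference easy to see: `pvariance` divides by N and `variance` divides by N - 1. The `scores` list below is hypothetical, chosen only so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical data for illustration: mean is 5, squared differences sum to 32.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

population_var = statistics.pvariance(scores)  # divides by N     (32 / 8)
sample_var = statistics.variance(scores)       # divides by N - 1 (32 / 7)

print(f"Divide by N:     {population_var:.3f}")
print(f"Divide by N - 1: {sample_var:.3f}")
```

The gap between the two shrinks as N grows, so the correction matters most for small samples.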
Hrm ok.
Assume that we know:
Mean height of French men (not on OK Cupid) = 171.5 cm
SD = 6.9 cm

So, we know that if that is the mean, 50% of the distribution is above, 50% is
below.
Since one SD = 6.9 cm, 171.5+6.9cm = 178.4cm.
15.87% of French men are taller than 178.4cm.

This is useful, but what if I want to talk about percentages other than 15.87%, or about distances from the mean that aren’t exactly 1, 2 or 3 standard deviations?
Using the normal distribution
Once the mean and SD of a normally distributed variable are known, we have a perfect description of the entire distribution. We can then also calculate the percentile rank associated with any score.

In order to perform calculations on normally distributed data we need to convert any measurements or values into Z scores.

Z score = distance from the mean measured in units of Standard Deviation

$$z = \frac{x - \bar{x}}{S}$$

where $x$ is the score, $\bar{x}$ is the mean and $S$ is the SD.
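
As a quick illustration, the z-score formula is a one-line function. The name `z_score` is made up for this sketch; the numbers are the French men's height figures quoted earlier (mean 171.5 cm, SD 6.9 cm).

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, measured in units of standard deviation."""
    return (x - mean) / sd

# 178.4 cm is one SD above the mean of 171.5 cm, so z should be about 1.
print(round(z_score(178.4, mean=171.5, sd=6.9), 2))

# A value below the mean gives a negative z-score.
print(round(z_score(164.6, mean=171.5, sd=6.9), 2))
```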
Normal distribution table relates distances from the mean in SD units (z-scores) to percentages
• 1. It’s called the “D. normal distribution table” in your set of tables…
• 2. First two digits are the rows, next digit is the column.

Look up z = 1.0
• z = 1.0 is 0.34134, i.e. 34.1% of values lie between the mean and 1 SD

Then z = 2.0
• 0.47725
• 34.1 + 13.6 = 47.7

z = 1.5: 0.4332, or 43%
About 87% between +/- 1.5 SD (2 × 43.32% = 86.64%)
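
The table values can be reproduced in code, which is a handy check when reading the printed table. This sketch assumes the table gives the area between the mean and z, as the look-ups above suggest; the function name `area_mean_to_z` is made up for this example.

```python
from math import erf, sqrt

def area_mean_to_z(z: float) -> float:
    """Area under the standard normal curve between the mean and z."""
    return 0.5 * erf(abs(z) / sqrt(2))

for z in (1.0, 1.5, 2.0):
    one_side = area_mean_to_z(z)
    print(f"z = {z}: {one_side:.5f} on one side, {2 * one_side:.4f} between +/- z")
```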
So…
• We can also read the table backwards. Find the value in the
table and match it to the z-score.
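• So, to cut off the bottom 10%, we need 40% between the cut-off and the mean, to the left of the mean. We need to find the closest figure to .4 in the table.
• In the table this is .3997, which is z = 1.28. Because the cut-off is below the mean, the z-score is negative: z = -1.28.

[Figure: normal curve with the bottom 10% marked to the left of the mean]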
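
Reading the table backwards corresponds to an inverse look-up. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) is shown below; the variable names are made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

# z-score below which 10% of the distribution falls.
z_cut = std_normal.inv_cdf(0.10)
print(f"z for the bottom 10%: {z_cut:.2f}")

# Check against the table: area between z = -1.28 and the mean.
area = 0.5 - std_normal.cdf(-1.28)
print(f"area between z = -1.28 and the mean: {area:.4f}")
```

The inverse look-up returns a slightly more precise z than the table, which only lists z to two decimal places.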
Practical application
• If we have anthropometric data (e.g., mean and standard deviation for reach) we can calculate the percentage of people who can reach a given height or, alternatively, how high to put something so that 80% (or 83% or 95% or 99% etc.) of people can reach it.
Examples
• E.g., if French men’s height mean = 171.5cm, SD = 6.9cm.
• How tall are the bottom 10%?

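$$z = \frac{x - \bar{x}}{S}$$

$$-1.28 = \frac{x - 171.5}{6.9}, \quad\text{so}\quad x = 171.5 - 1.28 \times 6.9 = 162.67\ \text{cm}$$

So the bottom 10% are 162.67 cm tall (or less).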
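
The same calculation can be sketched in Python, both with the table value z = -1.28 and with an exact inverse look-up via `statistics.NormalDist` (Python 3.8 or later). The variable names are made up for this example.

```python
from statistics import NormalDist

mean_cm, sd_cm = 171.5, 6.9

# Using the z-score read from the table, rearranged to x = mean + z * SD.
cutoff_table = mean_cm + (-1.28) * sd_cm

# Using an exact inverse look-up instead of the rounded table value.
cutoff_exact = NormalDist(mu=mean_cm, sigma=sd_cm).inv_cdf(0.10)

print(f"table z:  {cutoff_table:.2f} cm")
print(f"exact z:  {cutoff_exact:.2f} cm")
```

The two answers differ by about a tenth of a millimetre, because the table rounds z to two decimal places.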
Examples
• E.g., if French men’s height mean = 171.5cm, SD = 6.9cm.
• If a man is 175cm tall, what percentile is he in?
$$z = \frac{x - \bar{x}}{S}$$

$$z = \frac{175 - 171.5}{6.9} = 0.51$$

Look up 0.51 as a z-score: 0.1950

This is 0.1950 “above” the mean

50%+19.5% = 69.5% (or 70% rounded up)
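
Going from a height to a percentile can be sketched the same way, here with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The variable names are made up for this example.

```python
from statistics import NormalDist

mean_cm, sd_cm = 171.5, 6.9
heights = NormalDist(mu=mean_cm, sigma=sd_cm)

z = (175 - mean_cm) / sd_cm          # z-score for a 175 cm man
percentile = heights.cdf(175) * 100  # percentage of men shorter than 175 cm

print(f"z = {z:.2f}")
print(f"percentile = {percentile:.1f}")
```

The code works with the unrounded z-score, so its answer is slightly lower than the 69.5% obtained from the table with z rounded to 0.51.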


Next time…
• We will look at choosing a statistical test based on what we’ve
now learned
• Chapter 5 in the workbook

• In future weeks, we will learn how to do them!
