Third Lecture

Uploaded by

dinaelkordy

Course: Design and Analysis of Experiments
"Third Lecture"
Dr. Suzan Abdel Rahman (د. سوزان عبد الرحمن)
Lecturer, Faculty of Graduate Studies for Statistical Research
Use Q-Q Plots to Check Normality

▪ A Q-Q (quantile-quantile) plot is used to assess whether a set of data plausibly came from a normal distribution.
▪ If the data are normally distributed, the points in a Q-Q plot will lie approximately on a straight diagonal line.
▪ Conversely, the more the points in the plot deviate from a straight diagonal line, the less likely it is that the data follow a normal distribution.
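To make the idea concrete, here is a minimal sketch of how the points of a normal Q-Q plot can be computed with Python's standard library. The function names `qq_points` and `straightness`, the plotting positions (i - 0.5)/n, and the simulated data are all assumptions chosen for illustration; statistical packages may use slightly different conventions.

```python
import math
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    n = len(data)
    observed = sorted(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are one common convention (an assumption here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))

def straightness(points):
    """Pearson correlation of the Q-Q points; values near 1 indicate a nearly straight line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a simulated normal sample and a right-skewed transform of it.
rng = random.Random(1)
normal_sample = [rng.gauss(0, 1) for _ in range(200)]
skewed_sample = [math.exp(x) for x in normal_sample]

print("normal sample:", round(straightness(qq_points(normal_sample)), 3))
print("skewed sample:", round(straightness(qq_points(skewed_sample)), 3))
```

The correlation is only a rough numerical summary of what the eye judges in the plot: the closer it is to 1, the closer the points lie to a straight line.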
Q-Q Plots vs. Histograms

▪ A Q-Q plot is a way to visually check whether a dataset follows a normal distribution; it is not a formal test.
▪ Another way to visually check for normality is to create a histogram of the dataset. If the histogram roughly follows a bell-curve shape, it is reasonable to treat the dataset as approximately normally distributed.
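As a rough illustration of the histogram check, the sketch below counts observations in equal-width bins and prints a text histogram. The helper `histogram_counts`, the choice of 7 bins, and the artificial data (evenly spaced quantiles of a normal distribution with mean 50 and standard deviation 10) are assumptions made for this example only.

```python
from statistics import NormalDist

def histogram_counts(data, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins  # assumes the data are not all identical
    counts = [0] * n_bins
    for x in data:
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical, perfectly "bell-shaped" data: evenly spaced normal quantiles.
n = 200
data = [NormalDist(50, 10).inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

counts = histogram_counts(data, 7)
for i, c in enumerate(counts):
    print(f"bin {i}: " + "#" * (c // 2))
```

For bell-shaped data the bars are tallest in the middle bin and shrink symmetrically toward both ends.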
How to Test for Normality in SPSS
▪ We can use formal statistical tests to determine whether or not a variable follows a normal distribution. SPSS offers the following tests for normality:
• Shapiro-Wilk Test
• Kolmogorov-Smirnov Test
• The null hypothesis for each test is that a given variable is normally distributed.
• If the p-value of the test is less than the chosen significance level (commonly 0.01, 0.05, or 0.10), then we can reject the null hypothesis and conclude that the variable is not normally distributed.
▪ If the p-value is not less than the significance level, we fail to reject the null hypothesis. In that case, if we wanted to perform a statistical test that assumes variables are normally distributed, we could treat the variable as satisfying this assumption.
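The decision rule can be sketched in Python. This is not SPSS output and not the procedure SPSS runs: it is a simplified one-sample Kolmogorov-Smirnov check against a normal distribution whose mean and standard deviation are assumed to be specified in advance, with an asymptotic p-value that is rough for small samples. When the parameters are estimated from the same data, a correction (such as the Lilliefors correction) is needed, which this sketch omits. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def ks_statistic(data, mu, sigma):
    """Largest gap between the empirical CDF and the N(mu, sigma) CDF."""
    dist = NormalDist(mu, sigma)
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_p_value(d, n, terms=100):
    """Asymptotic Kolmogorov p-value (an approximation), clamped to [0, 1]."""
    lam = math.sqrt(n) * d
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))

def decide(p_value, alpha=0.05):
    """Apply the rule: reject the null hypothesis of normality when p < alpha."""
    if p_value < alpha:
        return "reject H0: the variable is not normally distributed"
    return "fail to reject H0: no evidence against normality"

# Hypothetical data: all values positive, tested against a standard normal.
sample = [-math.log(1 - (i - 0.5) / 100) for i in range(1, 101)]
d = ks_statistic(sample, 0.0, 1.0)
print(decide(ks_p_value(d, len(sample))))
```

Whatever test is used, the final step is the same comparison of the p-value with the chosen significance level.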

Normal distribution

▪ The normal distribution is a theoretical probability distribution that is perfectly symmetric about its mean (which is also its median and mode)
► A “bell”-like shape
▪ Normal distributions are uniquely defined by two quantities: a mean (𝜇) and a standard deviation (σ)
▪ All normal distributions, regardless of mean and standard deviation values, have the same structural properties:
► Mean = median (= mode)
► Values are symmetrically distributed around the mean
► Values “closer” to the mean are more frequent than values “farther” from the mean
Structural Properties of the Normal Distribution

► The entire distribution of values of a variable described by a normal distribution is completely specified by its mean and standard deviation
► Since all normal distributions have the same structural properties, we can use a single reference distribution, called the standard normal distribution
► Any normal distribution with mean 𝜇 and standard deviation σ can be rescaled to a standard normal distribution
► The standard normal distribution is a normal distribution with mean 𝜇 = 0 and standard deviation σ = 1
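The rescaling mentioned above is usually done by subtracting the mean and dividing by the standard deviation, z = (x - 𝜇) / σ. The sketch below uses hypothetical values (𝜇 = 100, σ = 15, x = 130) to show that a probability computed on the original scale matches the one computed on the standard normal scale.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Rescale x from N(mu, sigma) to the standard normal scale."""
    return (x - mu) / sigma

# Hypothetical values chosen only for illustration.
mu, sigma = 100.0, 15.0
x = 130.0

z = standardize(x, mu, sigma)             # number of SDs that x lies above the mean
p_original = NormalDist(mu, sigma).cdf(x)  # P(X <= x) on the original scale
p_standard = NormalDist(0, 1).cdf(z)       # P(Z <= z) on the standard scale

print(f"z = {z}")
print(f"P(X <= {x}) = {p_original:.4f}, P(Z <= {z}) = {p_standard:.4f}")
```

Because the two probabilities agree, a single reference distribution is enough to answer questions about any normal distribution.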
For data whose distribution is approximately normal:
► About 68% of the observations fall within one standard deviation of the mean
► About 95% of the observations fall within two standard deviations of the mean (more precisely, 1.96 standard deviations)
► About 99.7% of the observations fall within three standard deviations of the mean
► The middle 95% of values fall between 𝜇 - 2σ and 𝜇 + 2σ
► 2.5% of the values are smaller than 𝜇 - 2σ (so 97.5% are greater than 𝜇 - 2σ), and 2.5% are greater than 𝜇 + 2σ (so 97.5% are smaller than 𝜇 + 2σ)
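These percentages can be checked numerically from the standard normal distribution. The short sketch below uses Python's `statistics.NormalDist`; the helper name `coverage` is an assumption for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3, 1.96):
    print(f"within {k} SD: {coverage(k):.4f}")

# The multiplier that leaves exactly 2.5% in each tail.
exact_multiplier = std_normal.inv_cdf(0.975)
print(f"exact 95% multiplier: {exact_multiplier:.2f}")
```

The printed coverages are about 0.6827, 0.9545, 0.9973, and 0.9500, which is why "two standard deviations" is a convenient rounding of 1.96 for the middle 95%.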
▪ The normal distribution is a theoretical probability distribution
► No real data are perfectly described by this distribution. However, the distributions of some data are well approximated by a normal distribution. In such situations, we can use the properties of the normal curve to characterize aspects of the data distribution
Example
Using only the sample mean and standard deviation, and assuming normality, let’s estimate the 2.5th and 97.5th percentiles of systolic blood pressure (SBP) in this population.

Based on this sample data, we estimate that most (95%) of the men in this clinical population have systolic blood pressures between 97.8 and 149.4 mmHg.

Note: the observed 2.5th and 97.5th percentiles of the 113 sample values are 100.7 mmHg and 151.2 mmHg, respectively.
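The calculation behind such an estimate is mean ± 2 standard deviations. The sample mean and standard deviation are not given in the text here, so the sketch below uses values back-calculated from the reported interval (mean 123.6 mmHg, standard deviation 12.9 mmHg, assuming a multiplier of 2); treat them as assumptions rather than figures from the lecture. The function name `normal_95_range` is also hypothetical.

```python
def normal_95_range(mean, sd, multiplier=2.0):
    """Estimate the 2.5th and 97.5th percentiles as mean -/+ multiplier * sd."""
    return mean - multiplier * sd, mean + multiplier * sd

# Assumed values, back-calculated from the interval 97.8 to 149.4 mmHg.
sbp_mean, sbp_sd = 123.6, 12.9

low, high = normal_95_range(sbp_mean, sbp_sd)
print(f"multiplier 2:    {low:.1f} to {high:.1f} mmHg")

# Using the more precise multiplier 1.96 narrows the range slightly.
low_exact, high_exact = normal_95_range(sbp_mean, sbp_sd, multiplier=1.96)
print(f"multiplier 1.96: {low_exact:.1f} to {high_exact:.1f} mmHg")
```

With the assumed inputs, the first line reproduces the reported range of 97.8 to 149.4 mmHg, which is reasonably close to the observed percentiles of 100.7 and 151.2 mmHg.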
Another example
▪ A basic histogram of the weights of 236 Nepali children at one year old
▪ Using only the sample mean and standard deviation, and assuming normality, let’s estimate a range of weights for most (95%) Nepali children who were 12 months old

Based on this sample data, we estimate that most (95%) of Nepali children who were 12 months old had weights between 4.7 kg and 9.5 kg.

Note: the empirical 2.5th and 97.5th percentiles of the 236 sample values are 4.4 kg and 9.7 kg, respectively.
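To show how the normal-based estimate and the empirical percentiles are each obtained from a sample, here is a minimal sketch using simulated data. The 236 weights below are hypothetical, generated from a normal distribution with mean 7.1 kg and standard deviation 1.2 kg (values back-calculated from the interval 4.7 kg to 9.5 kg, so assumptions rather than the actual study data).

```python
import random
from statistics import mean, stdev, quantiles

# Hypothetical weights in kg; not the actual study data.
rng = random.Random(0)
weights = [rng.gauss(7.1, 1.2) for _ in range(236)]

# Normal-based estimate: sample mean -/+ 2 sample standard deviations.
m, s = mean(weights), stdev(weights)
est_low, est_high = m - 2 * s, m + 2 * s

# Empirical percentiles: 40 equal groups give cut points 2.5% apart,
# so the first and last cut points are the 2.5th and 97.5th percentiles.
cuts = quantiles(weights, n=40)
emp_low, emp_high = cuts[0], cuts[-1]

print(f"normal-based: {est_low:.1f} kg to {est_high:.1f} kg")
print(f"empirical:    {emp_low:.1f} kg to {emp_high:.1f} kg")
```

When the data are roughly normal, the two ranges should be close, as in the comparison of 4.7 kg to 9.5 kg with 4.4 kg to 9.7 kg above.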
