
OpenStax Chapter 13 Power Point

This document provides an overview of one-way ANOVA, a statistical method used to determine if there are significant differences among the means of three or more groups. It outlines the assumptions required for the test, the formulation of null and alternative hypotheses, and the calculation of the F statistic. Additionally, it includes examples and explanations of interpreting results, including p-values and their significance levels.

Uploaded by

cjhoge70

CHAPTER 13: F DISTRIBUTION AND ONE-WAY ANOVA

This presentation is based on material and graphs from OpenStax and is copyrighted by OpenStax and Georgia Highlands College.
INTRODUCTION
For hypothesis tests comparing averages between more than two
groups, statisticians have developed a method called "Analysis of
Variance" (abbreviated ANOVA). In this chapter, you will study the
simplest form of ANOVA called single factor or one-way ANOVA.
13.1 | ONE-WAY ANOVA
The purpose of a one-way ANOVA test is to determine the existence
of a statistically significant difference among several group means.
The test actually uses variances to help determine if the means are
equal or not.
In order to perform a one-way ANOVA test, there are five basic
assumptions to be fulfilled:
1. Each population from which a sample is taken is assumed to be
normal.
2. All samples are randomly selected and independent.
3. The populations are assumed to have equal standard
deviations (or variances).
4. The factor is a categorical variable.
5. The response is a numerical variable.
THE NULL AND ALTERNATIVE
HYPOTHESES
The null hypothesis is simply that all the group population means
are the same.
The alternative hypothesis is that at least one pair of means is
different.
For example, if there are k groups:
H0: μ1 = μ2 = μ3 = ... = μk
Ha: At least two of the group means μ1, μ2, μ3, ..., μk are not equal.
EXAMPLES OF NULL HYPOTHESIS BY BOX PLOTS
(a) H0 is true. All means are the same; the differences are due to
random variation.
(b) H0 is not true. Not all means are the same; the differences are
too large to be due to random variation.
13.2 | THE F DISTRIBUTION
AND THE F-RATIO
The distribution used for the hypothesis test is a new one.
It is called the F distribution, named after Sir Ronald Fisher, an
English statistician.
The F statistic is a ratio (a fraction).
There are two sets of degrees of freedom; one for the numerator
and one for the denominator.
For example, if F follows an F distribution and the number of
degrees of freedom for the numerator is four, and the number of
degrees of freedom for the denominator is ten,
then F~F4,10.
ONE-WAY ANOVA RESULTS
Source of Variation   Sum of Squares (SS)   Degrees of Freedom (df)   Mean Square (MS)                  F
Factor (Between)      SS(Factor)            k - 1                     MS(Factor) = SS(Factor)/(k - 1)   F = MS(Factor)/MS(Error)
Error (Within)        SS(Error)             n - k                     MS(Error) = SS(Error)/(n - k)
Total                 SS(Total)             n - 1
F RATIO
To calculate the F ratio, two estimates of the variance are made.
1. Variance between samples: An estimate of σ² that is the variance
of the sample means multiplied by n (when the sample sizes are
the same). If the samples are different sizes, the variance between
samples is weighted to account for the different sample sizes. This
variance is also called variation due to treatment or explained
variation.
F RATIO
2. Variance within samples: An estimate of σ² that is the
average of the sample variances (also known as a pooled
variance). When the sample sizes are different, the variance
within samples is weighted. This variance is also called the
variation due to error or unexplained variation.

• SS between = the sum of squares that represents the
variation among the different samples.
• SS within = the sum of squares that represents the
variation within samples that is due to chance.
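The two sums of squares above can be written out as formulas. This is a sketch in standard ANOVA notation; the symbols are assumptions not used on the slides: k groups, n_j values in group j, x_ij the i-th value in group j, x̄_j the mean of group j, and x̄ the mean of all values combined.

```latex
\[
SS_{\text{between}} = \sum_{j=1}^{k} n_j \left(\bar{x}_j - \bar{x}\right)^2,
\qquad
SS_{\text{within}} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left(x_{ij} - \bar{x}_j\right)^2
\]
\[
SS_{\text{total}} = SS_{\text{between}} + SS_{\text{within}}
\]
```

The weights n_j in SS between are what account for different sample sizes.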
F RATIO
To find a "sum of squares" means to add together squared
quantities that, in some cases, may be weighted. We used sums
of squares to calculate the sample variance and the sample
standard deviation in Descriptive Statistics.
MS means "mean square."
MS between is the variance between groups.
MS within is the variance within groups.
F = MS between / MS within

NEVER COMPLETE BY HAND. USE THE WEBSITE LISTED LATER IN
THE POWERPOINT.
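For readers who want to see what the website computes, here is a minimal Python sketch of the ANOVA table quantities. The function name `one_way_anova` and the small `toy` data set are made up for illustration; they are not from the slides.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of samples."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of data values
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # SS between: squared distance of each group mean from the grand mean,
    # weighted by the group's sample size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # SS within: squared distance of each value from its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical toy data, chosen only so the arithmetic is easy to follow
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(toy)
for name, value in table.items():
    print(name, round(value, 4))
```

Because the group sizes enter as weights, the same function handles samples of different sizes.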
NOTATION
The one-way ANOVA hypothesis test is always right-tailed
because larger F-values are way out in the right tail of the
F-distribution curve and tend to make us reject H0.

The notation for the F distribution is
F ~ Fdf(num),df(denom)
where df(num) = df between and df(denom) = df within.
The mean for the F distribution is µ = df(denom) / (df(denom) – 2),
which is defined when df(denom) > 2.
ANALYSIS OF VARIANCE
Using Internet Website
ANOVA: ONE WAY WEBSITE
http://vassarstats.net/anova1u.html
EXAMPLE
A regional manager wants to know if there is a difference
between the mean amounts of time that customers wait
in line at the drive through window for the three stores in
her region. She samples the wait times at each store.
Use an ANOVA test to determine if there is a difference between
the mean wait times for the three stores, at the 0.05
level of significance.

Drive Through Wait Times (In Minutes)
Store 1   2.34   1.23   1.89   2.31   3.02   1.95   2.45
Store 2   2.87   1.94   2.36   1.85   1.75   2.82   3.32
Store 3   1.32   1.45   1.78   2.01   2.45   1.92   1.83
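As a cross-check on the website, the F statistic for the wait-time data can also be computed in a few lines of Python. This is a sketch only; the helper name `f_statistic` is an assumption, and the slides do not report the result for this example, so none is quoted here.

```python
def f_statistic(groups):
    """Return (F, df_num, df_denom, ss_between, ss_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_num, df_den = k - 1, n - k
    f = (ss_between / df_num) / (ss_within / df_den)
    return f, df_num, df_den, ss_between, ss_within


# Drive-through wait times (in minutes) from the table above
store_1 = [2.34, 1.23, 1.89, 2.31, 3.02, 1.95, 2.45]
store_2 = [2.87, 1.94, 2.36, 1.85, 1.75, 2.82, 3.32]
store_3 = [1.32, 1.45, 1.78, 2.01, 2.45, 1.92, 1.83]

f_stat, df_num, df_den, ss_b, ss_w = f_statistic([store_1, store_2, store_3])
print(f"F = {f_stat:.4f} with df(num) = {df_num}, df(denom) = {df_den}")
```

With 3 stores and 21 data values, the degrees of freedom are k - 1 = 2 and n - k = 18, which should match the table shown by the website.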
SETUP
Determine the number of “treatment” groups (categories).
Determine the “maximum” number of data values in any category.
For the given example, there are 3 stores (aka “treatment groups”)
with 7 data values for each store.
1. Click “continue.”
2. Enter treatment group labels.
3. Enter the data values under the appropriate column (category).
4. Select “yes” for display graph.
5. Click “compute.”
6. Results appear on screen. Focus on the table displayed below.
7. Record the F-test statistic given in the table.
8. Record the p-value given in the table.
13.3 | FACTS ABOUT THE F
DISTRIBUTION
Here are some facts about the F distribution.
1. The curve is not symmetrical but skewed to the right.
2. There is a different curve for each set of dfs.
3. The F statistic is greater than or equal to zero.
4. As the degrees of freedom for the numerator and for the
denominator get larger, the curve approximates the normal.
5. Other uses for the F distribution include comparing two variances
and two-way Analysis of Variance. Two-Way Analysis is beyond the
scope of this chapter.
DIFFERENCE IN THE F DISTRIBUTIONS
These graphs show that as the degrees of freedom for the
numerator and for the denominator get larger, the curve
approximates the normal.
HYPOTHESES
The null and alternative hypotheses are:
H0: μ1 = μ2 = μ3 = μ4 = μ5
Ha: Not all of the means μ1, μ2, μ3, μ4, μ5 are equal.
You may see Ha written a few other ways, but for the course
you can write it as above.
SOLVING FOR A ONE-WAY
ANOVA EXAMPLE
Let’s return to the slicing tomato exercise in Try It.
The means of the tomato yields under the five mulching conditions
are represented by μ1, μ2, μ3, μ4, μ5. We will conduct a hypothesis
test to determine if all means are the same or at least one is
different. Using a significance level of 5%, test the null hypothesis
that there is no difference in mean yields among the five groups
against the alternative hypothesis that at least one mean is
different from the rest.
Bare (n1 = 3)   Ground Cover (n2 = 3)   Plastic (n3 = 3)   Straw (n4 = 3)   Compost (n5 = 3)
2625            5348                    6583               7285             6277
2997            5682                    8560               6897             7818
4915            5482                    3830               9230             8677
RESULTS
Hypotheses:
H0: All the means are the same. OR H0: μ1 = μ2 = μ3 = μ4 = μ5
Ha: Not all of the means μ1, μ2, μ3, μ4, μ5 are equal. OR µi ≠ µj for
some i ≠ j
Run ANOVA
Distribution for F-Test: F4,10
df(num) = 4
df(denom) = 10
RESULTS
Test statistic: F = 4.481
What is the p-value? 0.0248
P-value = P(F > 4.481) = 0.0248
Make a decision by comparing the p-value and alpha:
0.0248 < 0.05, so reject H0.
Conclusion: At the 5% level of significance, there is evidence
that the differences in the mean yields for the mulching conditions
are unlikely to be due to chance alone. We may conclude that at
least some of the mulches led to different mean yields.
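The F statistic and p-value above can be reproduced with the Python standard library. This is a sketch under stated assumptions: the helper names `f_pdf` and `f_right_tail` are made up, and the right-tail area is approximated by Simpson's rule on the F density rather than by whatever method the website uses. The integration assumes df(num) > 2 so the density is 0 at x = 0.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom (d1 > 2)."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)


def f_right_tail(f_value, d1, d2, steps=20000):
    """Approximate P(F > f_value) as 1 minus Simpson's-rule area on [0, f_value]."""
    h = f_value / steps                  # steps must be even for Simpson's rule
    total = f_pdf(0, d1, d2) + f_pdf(f_value, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3


# Tomato yields under the five mulching conditions (from the table above)
groups = [
    [2625, 2997, 4915],   # Bare
    [5348, 5682, 5482],   # Ground Cover
    [6583, 8560, 3830],   # Plastic
    [7285, 6897, 9230],   # Straw
    [6277, 7818, 8677],   # Compost
]
k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
means = [sum(g) / len(g) for g in groups]
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
p_value = f_right_tail(f_stat, k - 1, n - k)
print(f"F = {f_stat:.3f}, p-value = {p_value:.4f}")
```

The printed values should agree with the F = 4.481 and p-value = 0.0248 reported on the slide, up to rounding.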
ANOTHER EXAMPLE
Four sororities took a random sample of sisters regarding their mean
grades for the past term. The results are shown in the table below.
Sorority 1 Sorority 2 Sorority 3 Sorority 4
2.17 2.63 2.63 3.79
1.85 1.77 3.78 3.45
2.83 3.25 4.00 3.08
1.69 1.86 2.55 2.26
3.33 2.21 2.45 3.18
Using a significance level of 1%, is there a difference in mean grades
among the sororities?
RESULTS
Hypotheses
H0: μ1 = μ2 = μ3 = μ4
Ha: Not all of the means μ1, μ2, μ3, μ4 are equal.
F distribution: F3,16
F = 2.2303
P-value = P(F > 2.2303) = 0.1241
Compare the p-value and the level of significance:
0.1241 > 0.01, so do not reject the null hypothesis.
There is not sufficient evidence to conclude that there is a difference
among the mean grades for the sororities.
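Because all four samples have the same size, the F ratio here can be checked with the equal-sample-size shortcut from the F RATIO slides: MS between is n times the variance of the sample means, and MS within is the average of the sample variances. The sketch below uses Python's standard `statistics` module; the variable names are assumptions for illustration.

```python
from statistics import mean, variance

# Grades by sorority, read down the columns of the table above
sororities = [
    [2.17, 1.85, 2.83, 1.69, 3.33],   # Sorority 1
    [2.63, 1.77, 3.25, 1.86, 2.21],   # Sorority 2
    [2.63, 3.78, 4.00, 2.55, 2.45],   # Sorority 3
    [3.79, 3.45, 3.08, 2.26, 3.18],   # Sorority 4
]
n = len(sororities[0])                # common sample size (5)

# Variance between samples: n times the sample variance of the group means
ms_between = n * variance([mean(g) for g in sororities])
# Variance within samples: average of the sample variances (pooled variance)
ms_within = mean([variance(g) for g in sororities])

f_stat = ms_between / ms_within
print(f"F = {f_stat:.4f}")
```

This shortcut only applies when every group has the same number of values; with unequal sizes the weighted sums of squares are needed instead.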
