Module 18


9

Advanced Statistics
Quarter 3 – Module 18:
THREE-OR-MORE-SAMPLE
HYPOTHESIS TESTS OF MEANS
Advanced Statistics – Grade 9
Alternative Delivery Mode
Quarter 3 – Module 18: Three-or-More-Sample Hypothesis Tests of Means
First Edition, 2020

Republic Act 8293, section 176 states that: No copyright shall subsist in any work of the
Government of the Philippines. However, prior approval of the government agency or office wherein
the work is created shall be necessary for exploitation of such work for profit. Such agency or office
may, among other things, impose as a condition the payment of royalties.

Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names, trademarks,
etc.) included in this module are owned by their respective copyright holders. Every effort has been
exerted to locate and seek permission to use these materials from their respective copyright owners.
The publisher and authors do not represent nor claim ownership over them.

Published by the Department of Education
Secretary: Leonor Magtolis Briones
Undersecretary: Diosdado M. San Antonio

Development Team of the Module

Author: JANSTEN B. MAPATAC


Editor:
Reviewers:
Illustrator:
Layout Artist:
Management Team:

Printed in the Philippines by

Department of Education – Region II

Office Address: Regional Government Center, Carig Sur, Tuguegarao City, 3500
Telefax: (078) 304-3855 / (078) 396-0677 / (078) 396-9728
E-mail Address: [email protected]
9

Advanced Statistics
Quarter 3 – Module 18:
THREE-OR-MORE-SAMPLE
HYPOTHESIS TESTS OF MEANS
Introductory Message
For the facilitator:
Welcome to the Advanced Statistics – Grade 9 Alternative Delivery Mode (ADM) Module on
Three-or-More-Sample Hypothesis Tests Of Means.

This module was collaboratively designed, developed and reviewed by educators
both from public and private institutions to assist you, the teacher or facilitator, in
helping the learners meet the standards set by the K to 12 Curriculum while
overcoming their personal, social, and economic constraints in schooling.

This learning resource hopes to engage the learners into guided and independent
learning activities at their own pace and time. Furthermore, this also aims to help
learners acquire the needed 21st century skills while taking into consideration
their needs and circumstances.

In addition to the material in the main text, you will also see this box in the body of
the module:

Notes to the Teacher


This contains helpful tips or strategies that will help
you in guiding the learners.

As a facilitator you are expected to orient the learners on how to use this module.
You also need to keep track of the learners' progress while allowing them to
manage their own learning. Furthermore, you are expected to encourage and assist
the learners as they do the tasks included in the module.
For the learner:

Welcome to the Advanced Statistics Alternative Delivery Mode (ADM) Module on
Three-or-More-Sample Hypothesis Tests Of Means.

Through our hands we may learn, create and accomplish. Hence, the hand in this
learning resource signifies that you as a learner are capable and empowered to
successfully achieve the relevant competencies and skills at your own pace and
time. Your academic success lies in your own hands!

This module was designed to provide you with fun and meaningful opportunities
for guided and independent learning at your own pace and time. You will be
enabled to process the contents of the learning resource while being an active
learner.

This module has the following parts and corresponding icons:

What I Need to Know: This will give you an idea of the skills or
competencies you are expected to learn in the module.

What I Know: This part includes an activity that aims to check what you
already know about the lesson to take. If you get all the answers correct
(100%), you may decide to skip this module.

What’s In: This is a brief drill or review to help you link the current
lesson with the previous one.

What’s New: In this portion, the new lesson will be introduced to you in
various ways such as a story, a song, a poem, a problem opener, an activity
or a situation.

What is It: This section provides a brief discussion of the lesson. This
aims to help you discover and understand new concepts and skills.

What’s More: This comprises activities for independent practice to solidify
your understanding and skills of the topic. You may check the answers to the
exercises using the Answer Key at the end of the module.

What I Have Learned: This includes questions or blank sentences/paragraphs
to be filled in to process what you learned from the lesson.

What I Can Do: This section provides an activity which will help you
transfer your new knowledge or skill into real-life situations or concerns.

Assessment: This is a task which aims to evaluate your level of mastery in
achieving the learning competency.

Additional Activities: In this portion, another activity will be given to
you to enrich your knowledge or skill of the lesson learned. This also aids
retention of learned concepts.

Answer Key: This contains answers to all activities in the module.

At the end of this module you will also find:

References: This is a list of all sources used in developing this module.

The following are some reminders in using this module:

1. Use the module with care. Do not put unnecessary mark/s on any part of
the module. Use a separate sheet of paper in answering the exercises.
2. Don’t forget to answer What I Know before moving on to the other activities
included in the module.
3. Read the instruction carefully before doing each task.
4. Observe honesty and integrity in doing the tasks and checking your
answers.
5. Finish the task at hand before proceeding to the next.
6. Return this module to your teacher/facilitator once you are through with it.
If you encounter any difficulty in answering the tasks in this module, do not
hesitate to consult your teacher or facilitator. Always bear in mind that you are
not alone.

We hope that through this material, you will experience meaningful learning
and gain deep understanding of the relevant competencies. You can do it!
What I Need to Know

This module was designed and written with you in mind. It is here to help
you master three-or-more-sample hypothesis tests of means. The scope of this
module permits it to be used in
many different learning situations. The language used recognizes the diverse
vocabulary level of students. The lessons are arranged to follow the standard
sequence of the course. But the order in which you read them can be changed to
correspond with the textbook you are now using.

The module is all about Three-or-More-Sample Hypothesis Tests Of Means.

After going through this module, you are expected to solve problems
involving Three-or-More-Sample Hypothesis Tests Of Means using Analysis of variance.

What I Know
Directions: Choose the letter of the correct answer from the given choices by
writing the chosen letter on a separate sheet of paper.

1. Which type of test is best applied when more than two populations or samples are
to be compared?
A. t-test
B. z-test
C. chi-square
D. analysis of variance

2. Who invented the analysis of variance?
A. Ronald Fisher
B. William S. Gosset
C. Student E. Williams
D. Carl Friedrich Gauss

3. Which is true about the following statements?
I. The null hypothesis in analysis of variance is that the means are all equal.
II. The alternative hypothesis in analysis of variance is that the means are not all equal.
A. Only statement I is correct
B. Only statement II is correct
C. Both statements are correct
D. Both statements are incorrect

4. Which assumption of ANOVA states that the variation within each group being
compared is similar for every group?
A. Dependent variable
B. Homogeneity of variance
C. Independence of observations
D. Normally-distributed response variable

5. Which assumption of ANOVA states that the data were collected using statistically valid
methods and that there are no hidden relationships among observations?
A. Dependent variable
B. Homogeneity of variance
C. Independence of observations
D. Normally-distributed response variable
Lesson 01
ANALYSIS OF VARIANCE (ANOVA)

In the past chapter, you learned the concepts of the One-Sample Hypothesis Test
of Means.

In this lesson, you will learn about Three-or-More-Sample Hypothesis Tests of Means. The
topics to be discussed include the assumptions of ANOVA and solving problems
involving Analysis of Variance (ANOVA).

What’s In
Let’s revisit what you have learned from the previous module.
The z-test is used in comparing two means if the population standard deviation is known. Note
that if the population is normally distributed, the z-test can be used for any sample size n.
However, in many practical cases, the population standard deviation is unknown but the sample is
sufficiently large, that is, n ≥ 30. In that case, the sample standard deviation is used as an
estimator of the population standard deviation.

When the sample is small (n < 30) and the population standard deviation is unknown, use the
sample standard deviation (s) as an estimator of the population standard deviation (σ). In cases
like this, the t-distribution is the appropriate distribution for the test statistic. Using the
t-distribution always assumes that the sampled population is normal or approximately normal.
The t-distribution was developed by William S. Gosset (1876-1937), an employee of an Irish
brewery. He chose to publish his findings using the pen name “Student”. To honor his work, the
distribution is known today as Student's t-distribution.

What’s New
Activity 1. MYSTERY BOX!

Direction: Each box contains a question. Pick a box, then answer the question that corresponds
to the box.

Box A: It is a test that tells you if there are any statistically significant differences
between the means of three or more independent groups.

Box B: It is the assumption that the variation within each group being compared is similar
for every group.

Box C: It is computed by taking the ratio of the "mean square for treatments (MST)"
variability to the "mean square for errors (MSE)" variability.
What is It

The technique to test for a difference in more than two independent means is an extension of
the two independent samples procedure discussed previously which applies when there are exactly
two independent comparison groups.

The ANOVA technique applies when there are two or more than two independent groups.
Analysis of variance is a parametric statistical technique used to compare datasets. This technique
was invented by Ronald Fisher, and is thus often referred to as Fisher’s ANOVA, as well. It is similar
in application to techniques such as t-test and z-test, in that it is used to compare means and the
relative variance between them. The ANOVA procedure is used to compare the means of the
comparison groups and is conducted using the same five step approach used in the scenarios
discussed in previous sections. Because there are more than two groups, however, the computation
of the test statistic is more involved. The test statistic must take into account the sample sizes,
sample means and sample standard deviations in each of the comparison groups.

If one is examining the means observed among, say, three groups, it might be tempting to
perform three separate group-to-group comparisons. This approach is incorrect because each of
these comparisons fails to take into account the total data, and it increases the likelihood of
incorrectly concluding that there are statistically significant differences, since each comparison
adds to the probability of a type I error. Analysis of variance avoids these problems by asking a more
global question, i.e., whether there are significant differences among the groups, without addressing
differences between any two groups in particular (although there are additional tests that can do
this if the analysis of variance indicates that there are differences among the groups).
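
To see how quickly the type I error builds up, the short Python sketch below counts the pairwise comparisons among k groups and estimates the chance of at least one false rejection. It assumes, purely for illustration, that the pairwise tests are independent and each uses α = 0.05; the function name is hypothetical and not part of the module.

```python
from math import comb


def familywise_error_rate(k, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one type I error).

    Illustrative assumption: the pairwise tests are independent.
    """
    m = comb(k, 2)                      # pairwise comparisons among k groups
    return m, 1 - (1 - alpha) ** m      # 1 - P(no false rejection in m tests)


for k in (3, 4, 5):
    m, rate = familywise_error_rate(k)
    print(f"k = {k}: {m} comparisons, overall type I error about {rate:.3f}")
```

With three groups there are already three comparisons, and under these assumptions the overall error rate is roughly 0.14 rather than 0.05, which is why a single global test is preferred.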

The fundamental strategy of ANOVA is to systematically examine variability within the groups being
compared and also examine variability among the groups being compared. Analysis of
variance (ANOVA) is best applied where more than two populations or samples are to
be compared.

Analysis of variance (ANOVA) is used to compare three or more (more than two)
sample means. It can be used for a completely randomized design (CRD). It is also called the
F-test for comparing k (more than 2) population means, on the assumptions that independent
random samples have been drawn from the k normal populations and that each sampled
population has the same variance σ².

Assumptions of ANOVA
The assumptions of the ANOVA test are the same as the general assumptions for any parametric test:

1. Independence of observations: The data were collected using statistically valid methods,
and there are no hidden relationships among observations. If your data fail to meet this
assumption because you have a confounding variable that you need to control for
statistically, use an ANOVA with blocking variables.
2. Normally-distributed response variable: The values of the dependent variable follow
a normal distribution.
3. Homogeneity of variance: The variation within each group being compared is similar for
every group. If the variances are different among the groups, then ANOVA probably isn’t the
right fit for the data.
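
As a rough, informal look at the homogeneity-of-variance assumption, the Python sketch below computes the sample variance of each group and the ratio of the largest to the smallest. It uses the scores from this module's worked example on teaching techniques. The helper name is hypothetical, and the idea of reading a ratio near 1 as "similar" is a common rule of thumb assumed here, not a formal test and not a procedure taken from the module.

```python
from statistics import variance

# Scores from the module's worked example, grouped by teaching technique.
scores = {
    "A": [65, 87, 73, 79, 81, 69],
    "B": [75, 69, 83, 81, 72, 79, 90],
    "C": [59, 78, 67, 62, 83, 76],
    "D": [94, 89, 80, 88],
}


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance (n - 1 divisor)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


for name, values in scores.items():
    print(f"Technique {name}: sample variance = {variance(values):.2f}")

ratio = variance_ratio(list(scores.values()))
print(f"Largest / smallest variance = {ratio:.2f}")
```

A ratio close to 1 suggests similar spread in every group. A very large ratio is a signal to check the assumption more carefully with a formal test in a statistical package.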
In analysis of variance there are two possible hypotheses:

a. The null hypothesis (H0) is that there is no difference between the groups, that is, the
population means are all equal:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$

b. The alternative hypothesis (HA) is that there is a difference among the group means:

HA: at least one pair of population means differ.

The ANOVA F Statistic

The F test statistic assumes equal variability in the k populations (i.e., the population
variances are equal, or $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$). This means that the
outcome is equally variable in each of the comparison populations. This assumption is the same as
that assumed for appropriate use of the test statistic to test equality of two independent means.
It is possible to assess the likelihood that the assumption of equal variances is true, and the test
can be conducted in most statistical computing packages. If the variability in the k comparison
groups is not similar, then alternative techniques must be used.

The F statistic is computed by taking the ratio of what is called the "mean square for
treatments (MST)" variability to the "mean square for errors (MSE)" variability. This is where the
name of the procedure originates. In analysis of variance we are testing for a difference in means
(H0: means are all equal versus HA: means are not all equal) by evaluating variability in the data. The
numerator captures the mean square for treatments (MST) (i.e., differences among the sample means)
and the denominator contains the mean square for errors (MSE) (i.e., variability within the groups).
The test statistic is a measure that allows us to assess whether the differences among the sample
means (numerator) are more than would be expected by chance if the null hypothesis is true. Recall
that in the two independent sample test, the test statistic was computed by taking the ratio of the
difference in sample means (numerator) to the variability in the outcome.
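
The two mean squares can also be written out in definitional form, which makes the "among groups" versus "within groups" reading explicit. The notation here is an assumption introduced for illustration: $x_{ij}$ is the j-th observation in group i, $n_i$ is the size of group i, $\bar{x}_i$ is the mean of group i, $\bar{x}$ is the mean of all n observations, and k is the number of groups.

```latex
\[
\text{MST} = \frac{\sum_{i=1}^{k} n_i\,(\bar{x}_i - \bar{x})^2}{k - 1},
\qquad
\text{MSE} = \frac{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{n - k},
\qquad
F = \frac{\text{MST}}{\text{MSE}}
\]
```

The numerator grows when the group means are far from the overall mean, while the denominator reflects only the spread of observations around their own group mean.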

The decision rule for the F test in ANOVA is set up in a similar way to the decision rules we
established for t-tests. The decision rule again depends on the level of significance and the degrees
of freedom. The F statistic has two degrees of freedom. These are denoted df1 and df2, and called the
numerator and denominator degrees of freedom, respectively. The degrees of freedom are defined
as follows:

$$df_1 = k - 1 \quad \text{and} \quad df_2 = n - k,$$

where k is the number of comparison groups and n is the total number of observations in the
analysis. If the null hypothesis is true, the mean square for treatments (numerator) and
the mean square for errors (denominator) estimate the same variance, so the F statistic will be
small (close to 1). If the null hypothesis is false, then the F statistic will be large.

The analysis of variance F statistic for testing the equality of several means has this form:

$$F = \frac{\text{mean square for treatments (MST)}}{\text{mean square for errors (MSE)}}$$

where F is based on
d.f.1 = (k − 1)
d.f.2 = (n − k)

The rejection region:
• Reject H0 if F ≥ Fcritical

where Fcritical lies in the upper tail of the F-distribution table with d.f.1 = k − 1 and
d.f.2 = n − k.

The following quantities must be computed.

• Total SS = SST + SSE
where Total SS = total sum of squared deviations
SST = sum of squares for treatments
SSE = sum of squares for error

• Total SS = (sum of squares of all X values) − CM
where CM = correction for the mean

• $$CM = \frac{(\text{total of all observations})^2}{n}$$
where n = total number of X values

• SST = [sum of squares of the treatment totals, with each square divided by the number of
observations in the particular total] − CM. Writing $T_i$ for the total of treatment i and
$n_i$ for its number of observations:
$$SST = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - CM$$

• SSE = Total SS − SST

• $$MSE = \frac{SSE}{n_1 + n_2 + \cdots + n_k - k} = \frac{SSE}{n - k}$$
where $n_i$ = number of observations in treatment i
k = number of treatments

• $$MST = \frac{SST}{k - 1}$$

• $$F = \frac{MST}{MSE}$$
where F is based on d.f.1 = (k − 1) and d.f.2 = (n − k)
MSE = mean square for errors

• If F ≥ Fα, H0 is rejected,

where Fα = Fcritical based on (k − 1) and (n − k) degrees of freedom at α = 0.05 or 0.01.
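
The quantities listed above can be sketched as one small Python function. This is a minimal illustration that follows the shortcut (correction for the mean) formulas; the function name, the dictionary keys, and the tiny data set are made up for demonstration and are not part of the module.

```python
def one_way_anova(groups):
    """Compute one-way ANOVA quantities with the shortcut (CM) formulas."""
    k = len(groups)                                   # number of treatments
    n = sum(len(g) for g in groups)                   # total number of X values
    grand_total = sum(sum(g) for g in groups)
    cm = grand_total ** 2 / n                         # correction for the mean
    total_ss = sum(x ** 2 for g in groups for x in g) - cm
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm
    sse = total_ss - sst
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {
        "df1": k - 1, "df2": n - k, "CM": cm,
        "Total SS": total_ss, "SST": sst, "SSE": sse,
        "MST": mst, "MSE": mse, "F": mst / mse,
    }


# Made-up illustrative data: three treatments with three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in result.items():
    print(f"{name:>8}: {value}")
```

The computed F is then compared with the tabulated Fα for d.f.1 = k − 1 and d.f.2 = n − k. The sketch deliberately leaves that lookup to the F table, as the module does.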

The computed values are to be summarized in an ANOVA table such as the one
below.

Table. ANOVA
Source of variation   Degrees of freedom   SS         MS                  Fc
Treatments            k − 1                SST        MST = SST/(k − 1)   MST/MSE
Error                 n − k                SSE        MSE = SSE/(n − k)
Total                 n − 1                Total SS
CRITICAL VALUES FOR F DISTRIBUTION (Fα = Fcritical )
Example
1. Groups of students were randomly assigned to be taught using four different
teaching techniques. They were tested at the end of a specified period of time.
Because of dropouts in the experimental groups, the number of students varied
from group to group. Do the following data present sufficient evidence to indicate a
difference in the mean achievement for students taught using the four teaching
techniques?

A     B     C     D
65    75    59    94
87    69    78    89
73    83    67    80
79    81    62    88
81    72    83
69    79    76
      90

Solution:

a. H0: µA = µB = µC = µD
b. HA: at least one of the population means differs from the others
c. Computations to be done:

Techniques
          A       B       C       D
          65      75      59      94
          87      69      78      89
          73      83      67      80
          79      81      62      88
          81      72      83
          69      79      76
                  90
Total     454     549     425     351
Mean      75.67   78.43   70.83   87.75

Grand Total = 454 + 549 + 425 + 351 = 1779

n = 23

$$CM = \frac{(\text{total of all observations})^2}{n} = \frac{(1779)^2}{23} = 137601.8$$

Total SS = (sum of squares of all X values) − CM

$$\text{Total SS} = (65)^2 + (87)^2 + \cdots + (88)^2 - CM = 139511 - 137601.8 = 1909.2$$

SST = [sum of squares of the treatment totals, with each square divided by the number of
observations in the particular total] − CM

$$SST = \frac{(454)^2}{6} + \frac{(549)^2}{7} + \frac{(425)^2}{6} + \frac{(351)^2}{4} - CM = 138314.4 - 137601.8 = 712.6$$

$$SSE = \text{Total SS} - SST = 1909.2 - 712.6 = 1196.6$$

$$MST = \frac{SST}{k - 1} = \frac{712.6}{4 - 1} = 237.5$$

$$MSE = \frac{SSE}{n - k} = \frac{1196.6}{23 - 4} = 63.0$$

The test statistic for testing the null hypothesis is

$$F = \frac{MST}{MSE} = \frac{237.5}{63.0} = 3.77$$

where d.f.1 = (k − 1) = 4 − 1 = 3
d.f.2 = (n − k) = 23 − 4 = 19

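Fcritical for α = 0.05 with d.f.1 = 3 and d.f.2 = 19 is 3.13.

Since F = 3.77 exceeds F0.05 = 3.13, reject H0.

This means that at least two of the treatment means differ significantly from one another,
so the data present sufficient evidence of a difference in mean achievement among the four
teaching techniques. The ANOVA table is shown below.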

Source of variation   d.f.   SS        MS      F
Treatments            3      712.6     237.5   3.77*
Error                 19     1,196.6   63.0
Total                 22     1,909.2

*Significant at 0.05 level.
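
The hand computations in this example can be double-checked with a few lines of Python. The sketch below simply repeats the shortcut formulas on the four groups of scores; the variable names are hypothetical, and the critical value 3.13 is the table value quoted in the example rather than something the code computes.

```python
# Scores from the worked example, grouped by teaching technique.
scores = {
    "A": [65, 87, 73, 79, 81, 69],
    "B": [75, 69, 83, 81, 72, 79, 90],
    "C": [59, 78, 67, 62, 83, 76],
    "D": [94, 89, 80, 88],
}

groups = list(scores.values())
k = len(groups)                                   # number of treatments
n = sum(len(g) for g in groups)                   # total number of observations
cm = sum(sum(g) for g in groups) ** 2 / n         # correction for the mean
total_ss = sum(x * x for g in groups for x in g) - cm
sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm
sse = total_ss - sst
mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

F_CRITICAL = 3.13  # F(0.05) for d.f. 3 and 19, as quoted in the example

print(f"CM = {cm:.1f}, Total SS = {total_ss:.1f}")
print(f"SST = {sst:.1f}, SSE = {sse:.1f}")
print(f"MST = {mst:.1f}, MSE = {mse:.1f}, F = {f_stat:.2f}")
print("Reject H0" if f_stat >= F_CRITICAL else "Do not reject H0")
```

Small differences in the last decimal place can appear because the hand solution rounds intermediate results, while the code keeps full precision until printing.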


What’s More
Activity 2: Fill me in
Direction: Fill out the table by completing the values for analysis of variance.

Source of variation   Degrees of freedom   SS         MS                  Fc
1. ______             k − 1                SST        MST = SST/(k − 1)   6. ______
Error                 2. ______            4. ______  5. ______
Total                 3. ______            Total SS

What I Have Learned


Directions: Fill in the blanks with the appropriate word or terms.

The ________________ technique applies when there are two or more than two independent
groups. Analysis of variance is a parametric statistical technique used to compare datasets. This
technique was invented by ________________, and is thus often referred to as Fisher’s ANOVA, as
well. It is also called the F-test for comparing k (more than 2) population means, on the
assumptions that independent random samples have been drawn from the k normal populations and
that each sampled population has the same variance σ².
The ___________________ is computed by taking the ratio of what is called the " mean square for
treatments (MST) " variability to the "mean square for errors (MSE)" variability. This is where
the name of the procedure originates. In analysis of variance we are testing for a difference in
means (H0: means are all equal versus H1: means are not all equal) by evaluating variability in the
data. The numerator captures ___________________________ (i.e., differences among the sample means)
and the denominator contains ___________________________. The test statistic is a measure that allows
us to assess whether the differences among the sample means (numerator) are more than would
be expected by chance if the __________________ is true.
What I Can Do
Activity 3. Try this out!

Direction: Answer the given problem using analysis of variance.

Four brands of flashlight batteries are to be compared by testing each brand in five
flashlights. Twenty flashlights are randomly selected and divided randomly into four groups of
five flashlights each. Then each group of flashlights uses a different brand of battery. The
lifetimes of the batteries, to the nearest hour, are as follows.

Brand A Brand B Brand C Brand D


42 28 24 20
30 36 36 32
39 31 28 38
28 32 28 28
29 27 33 25

Preliminary data analyses indicate that the independent samples come from normal
populations with equal standard deviations. At the 5% significance level, does there appear to
be a difference in mean lifetime among the four brands of batteries?

Assessment
Directions: Choose the letter of the correct answer from the given choices by
writing the chosen letter on a separate sheet of paper.

1. To test the equality of means of more than two populations, which of the following
techniques is used?
a. t-test
b. chi-square test
c. Interval estimate
d. Analysis of variance

2. It is an assumption of ANOVA in which the values of the dependent variable follow
a normal distribution.
a. Dependent variable
b. Homogeneity of variance
c. Independence of observations
d. Normally distributed response variable

3. The degrees of freedom associated with the denominator of the F-test in the analysis of
variance are:
a. (k-1)
b. (n-k)
c. n(k-1)
d. none of these

4. The total sum of squared deviations for analysis of variance is given by:

a. SSE + TSS
b. SST + SSE
c. SST – SSE
d. SSE – SST

5. The test statistics for testing the null hypothesis for analysis of variance is:
a. MST/MSE
b. SSC/SST
c. MSE/MST
d. MST/SST

Additional Activities

Directions: Construct a creative acronym for the word ANOVA emphasizing its
function and importance in Statistics. Your output will be assessed by
the following criteria:
Creativity and Originality – 10 points
Thought – 10 points
Answer Key
What I Know
1. D 4. B
2. A 5. C
3. C

What’s New
A. ANALYSIS OF VARIANCE
B. HOMOGENEITY OF VARIANCE
C. F STATISTIC (F-TEST)

What’s More
1. TREATMENTS    4. SSE
2. n − k         5. MSE = SSE/(n − k)
3. n − 1         6. MST/MSE

What I Have Learned


1. ANOVA 4. MEAN SQUARE FOR TREATMENTS
2. RONALD FISHER 5. MEAN SQUARE FOR ERRORS
3. F -STATISTIC 6. NULL HYPOTHESIS

What I Can Do
F = 0.7393
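
The value above can be reproduced with the same shortcut formulas used in the lesson. The Python sketch below is a check under stated assumptions: the variable names are hypothetical, and the critical value 3.24 for d.f. 3 and 16 at the 5% level is read from a standard F table and is not given in the module.

```python
# Battery lifetimes (hours) from Activity 3, grouped by brand.
lifetimes = {
    "A": [42, 30, 39, 28, 29],
    "B": [28, 36, 31, 32, 27],
    "C": [24, 36, 28, 28, 33],
    "D": [20, 32, 38, 28, 25],
}

groups = list(lifetimes.values())
k = len(groups)                                   # number of brands
n = sum(len(g) for g in groups)                   # total number of flashlights
cm = sum(sum(g) for g in groups) ** 2 / n         # correction for the mean
total_ss = sum(x * x for g in groups for x in g) - cm
sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm
sse = total_ss - sst
mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

F_CRITICAL = 3.24  # assumed table value: F(0.05) for d.f. 3 and 16

print(f"SST = {sst:.1f}, SSE = {sse:.1f}")
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_stat:.4f}")
print("Reject H0" if f_stat >= F_CRITICAL else "Do not reject H0")
```

Because F = 0.7393 is well below the critical value, H0 is not rejected: at the 5% significance level the data do not show a difference in mean lifetime among the four brands.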
Assessment
1. D 4. B
2. D 5. A
3. B

For inquiries or feedback, please write or call:

Department of Education - Bureau of Learning Resources (DepEd-BLR) Ground

Floor, Bonifacio Bldg., DepEd Complex

Meralco Avenue, Pasig City, Philippines 1600 Telefax: (632) 8634-1072; 8634-1054;

8631-4985
Email Address: [email protected] * [email protected]
