Topic 18 Identifying The Appropriate Test Statistics Involving Population Mean


SAN AGUSTIN INTEGRATED SCHOOL LEARNING ACTIVITY SHEET

San Agustin Libon, Albay


STATISTICS AND PROBABILITY

Topic 18: Identifying Appropriate Test Statistics Involving Population Mean
In the previous module, you learned more about hypotheses. You identified the two kinds of hypotheses and
the directionality of a test of hypothesis. The module also discussed the notations commonly used in formulating a
hypothesis. You also accomplished activities on identifying the test of hypothesis to be used after formulating null and
alternative hypotheses.
This time, you are ready to identify the test statistic to be used when the population variance is known or unknown.
After going through this module, you are expected to:
1. define the statistical concepts related to test concerning means;
2. identify the appropriate form of test statistics when: (a) the population variance is assumed to be known; (b) the
population variance is assumed to be unknown; and (c) the Central Limit Theorem is to be used; and
3. apply the concepts of test statistic on real-life problems.

Identifying Appropriate Test Statistics Involving Population Mean


Hypothesis testing is a method of testing a claim or hypothesis about a parameter in a population given a data
sample. In this method, we test the hypothesis by determining how likely it is that the observed sample statistic would be
obtained if the hypothesis regarding the population parameter were true. The process of hypothesis testing involves setting up
two contrasting hypotheses: the null hypothesis and the alternative hypothesis. One selects a random sample, computes
summary statistics and the appropriate test statistic, and then assesses whether the sample data provide enough evidence to
support the alternative hypothesis.
In the previous module, you were taught how to formulate null and alternative hypotheses. You are now ready to
analyze statistical hypotheses to determine the correct test statistic to be used in computing the results and making
decisions.

What’s In
Activity 1: Is It Zee or Tee?
Directions: Write the letter “z” if the statement is a characteristic of the standard normal distribution and “t” if the given
characteristic describes the t-distribution.
1. It is best applied if you have a limited sample size (n < 30) as long as the variables are approximately normally
distributed.
2. It is also applicable if you do not know the populations’ standard deviation.
3. This is the best to use in a statistical test if the population standard deviation is known.
4. It is always used for normal distribution.
5. This test is often applied in large samples (n > 30).

Follow-up Questions:
1. In the items above, how did you differentiate the statements describing the standard normal distribution from those
involving the t-distribution?
2. Were you able to answer them easily? If not, which item/s did you find difficult to answer?
3. Were you able to differentiate the statements characterizing normal distribution from those describing t-distribution?

What’s New
Activity 2: Find Me!
Directions: Determine the needed data for each given problem. First, read and understand the examples below before
you proceed to the items that follow.
Examples:
1. A Grade 11 researcher reported that the average allowance of Senior High School students was more than ₱100. A
sample of 40 students had a mean allowance of ₱120. At 𝛼 = 0.01, test the claim that the students had an
allowance of more than ₱100. The standard deviation of the population is ₱50.
𝜇 = 100 𝑥̅ = 120 𝑛 = 40 𝜎 = 50
2. According to a cell phone company, the average price of a cellular phone in the Philippines is ₱12,999. However, in a
sample of 20 customers randomly asked about the price of their cellular phone, the data collected showed an average of
₱9,999 and a standard deviation of ₱7,999. Using the 𝛼 = 0.05 level of significance, is there enough evidence to prove that
the average price of a cellular phone is less than ₱12,999?
𝜇 = 12,999 𝑥̅ = 9,999 𝑛= 20 𝑠 = 7,999
Now, it’s your turn…
1. The average number of ad clicks per day for Facebook before was 192,000 and the standard deviation was 100,000.
Sixty-four (64) days after the redesign, the mean number of ad clicks per day was 200,000.
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝜎 = ______
2. The average life of a typical incandescent bulb is 1,500 hours as claimed by a light bulb company. Thinking that the
average life of the bulbs is less than what the company claimed, a client tested a random sample of 55 light bulbs. The
test resulted in a sample mean of 1,300 hours and a standard deviation of 25 hours. Is there enough evidence to prove
that the average life of the company’s light bulb is less than 1,500 hours?
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝑠 = ______
3. The mean number of close friends for the population of people living in the Philippines is 5. The standard deviation of
scores in this population is 1.2. An investigator predicts that the mean number of close friends for introverts will be
significantly different from the mean of the population.
The mean number of close friends for a sample of 26 introverts is 6.
𝜇 = ______ 𝑥̅ = ______ 𝑛 = ______ 𝜎 = ______
Guide Questions:
1. How did you find the activity?
2. What mathematical concepts did you apply in answering the activity?
3. Were you able to determine the needed data for each notation?
4. Which notation's value seemed most difficult to identify in the given problems?
5. Have you observed the differences in notation among the items? Is the value of 𝑠 the same as σ? If not, how do they
differ?
6. What do you think is the relationship of these notations on determining test statistic in hypothesis testing?

What Is It
Before we move forward to the different test statistics, it is important to define the following terms:
• A population includes all of the elements from a set of data.
• A sample consists of one or more observations drawn from the population.
• Sample mean (𝑥̅) is the mean of sample values collected.
• Population mean (µ) is the mean of all the values in the population. If the sample is randomly selected and sample size
is large, then the sample mean would be a good estimate of the population mean.
• Population standard deviation (𝝈) is a parameter: a measure of variability with a fixed value, calculated from every
individual in the population.
• Sample standard deviation (𝒔) is a statistic: a measure of variability calculated from only some of
the individuals in a population.
• Population variance (𝝈²), in the same sense, indicates how the population data points are spread out. It is the average
of the squared distances from each data point in the population to the mean.
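To see how the population measures differ from the sample measures, here is a minimal Python sketch. The allowance values are made up for illustration only; they do not come from this module. It uses the standard `statistics` module, where `pstdev` and `pvariance` divide by the population size and `stdev` divides by n − 1.

```python
import statistics

# Hypothetical population: daily allowances (in pesos) of all 8 students in a small class.
population = [80, 100, 120, 90, 110, 100, 130, 70]
# Hypothetical sample: 4 students drawn from that class.
sample = [80, 120, 110, 70]

mu = statistics.mean(population)             # population mean
sigma_sq = statistics.pvariance(population)  # population variance (divides by N)
sigma = statistics.pstdev(population)        # population standard deviation
x_bar = statistics.mean(sample)              # sample mean
s = statistics.stdev(sample)                 # sample standard deviation (divides by n - 1)

print(mu, sigma_sq, round(sigma, 2))
print(x_bar, round(s, 2))
```

Notice that 𝑠 is computed from only part of the data, so it generally differs from σ and changes from one sample to another.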

Since we have already defined the important terms used in identifying the test statistic in hypothesis testing, let us now
identify those concepts in a given problem. Let’s use the example in Activity 2.
Example:
A Grade 11 researcher reported that the average allowance of Senior High School students was ₱100. A sample
of 40 students had a mean allowance of ₱120. At 𝛼 = 0.01, test the claim that the students had an allowance of more
than ₱100. The standard deviation of the population is ₱50.

µ = ₱100 the average allowance of the population (Senior High School students)
𝐧 = 𝟒𝟎 the number of students taken from all Senior High School students
𝑥̅ = ₱120 the mean allowance of the sample
𝛔 = ₱50 the standard deviation of the population

Now you already know how to get the data needed in choosing test statistics. This time, you will determine what
test statistic is appropriate in computing test value in the hypothesis testing.

A test statistic is a random variable that is calculated from sample data and used in a hypothesis test. You can use
test statistics to determine whether to reject or fail to reject the null hypothesis. The test statistic compares your data with
what is expected under the null hypothesis.
To identify the test statistic, you must consider whether the population standard deviation/variance is known or
unknown. If the population standard deviation σ is known, then the standardized sample mean has a normal distribution. Use
the z-test. If the population standard deviation σ is unknown, then the sample mean, standardized with the sample standard
deviation s in place of σ, has a t-distribution. Use the t-test.
 z-test
In a z-test, the sample is assumed to come from a normally distributed population. A z-score is calculated with population
parameters such as the population mean and the population standard deviation. It is used to test the hypothesis that the
sample drawn belongs to the same population. When the variance is known and either the distribution is normal or the sample
size is large, use a z-test statistic.
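As an illustration, the z-test value for the allowance example (𝜇 = 100, 𝑥̅ = 120, 𝑛 = 40, 𝜎 = 50) can be computed with a short Python sketch. The function name `z_statistic` is only an illustrative choice, and the formula used is the usual one-sample form, z = (𝑥̅ − 𝜇) / (𝜎 / √𝑛).

```python
import math

def z_statistic(x_bar, mu_0, sigma, n):
    """One-sample z statistic: distance of the sample mean from the
    hypothesized mean, measured in standard errors (sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu_0) / standard_error

# Values taken from the allowance example in Activity 2.
z = z_statistic(x_bar=120, mu_0=100, sigma=50, n=40)
print(round(z, 2))
```

The result tells you how many standard errors the sample mean lies above the hypothesized population mean.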

 t-test
Like a z-test, a t-test also assumes a normal distribution of the sample. A t-test is used when the population
variance or standard deviation is not known. When the variance is unknown and the sample size is less than 30, use a
t-test statistic, assuming that the population is normal or approximately normal.
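The cellular phone example from Activity 2 (𝜇 = 12,999, 𝑥̅ = 9,999, 𝑛 = 20, 𝑠 = 7,999) gives a sketch of how the t-test value could be computed. The function name `t_statistic` is illustrative, and the sketch assumes the usual one-sample form, t = (𝑥̅ − 𝜇) / (𝑠 / √𝑛), with n − 1 degrees of freedom.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """One-sample t statistic and its degrees of freedom.
    Same shape as the z statistic, but the sample standard deviation s
    stands in for the unknown population standard deviation."""
    standard_error = s / math.sqrt(n)
    t = (x_bar - mu_0) / standard_error
    degrees_of_freedom = n - 1
    return t, degrees_of_freedom

# Values taken from the cellular phone example in Activity 2.
t, df = t_statistic(x_bar=9_999, mu_0=12_999, s=7_999, n=20)
print(round(t, 2), df)
```

The only change from the z computation is the use of 𝑠 and the need to look up the t-table with the computed degrees of freedom.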

 Central Limit Theorem


By the Central Limit Theorem, if the population is normally distributed or the sample size is large, and the true
population mean is µ = µ₀, then the test statistic z has (at least approximately) a standard normal distribution.
When the population standard deviation σ is not known, we may still use the z-score by replacing the population standard
deviation σ by its estimate, the sample standard deviation s. Since the sample is large, the resulting test statistic still has a
distribution that is approximately standard normal.
Historically, this was very useful, as most statisticians in the past did not have access to t-tables for a
very large number of degrees of freedom. With modern computers, using a t-test with a very large sample size is
not a problem at all.
However, since you will be using a t-table with only a limited number of degrees of freedom, you will use the z-test
when the sample size is large even though the population standard deviation is unknown.
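The claim that a large sample makes the statistic approximately standard normal, even when s replaces σ and the population is not normal, can be checked with a small simulation. This is only a sketch under assumed settings: the skewed population (exponential with mean 1), the sample size of 50, and the number of trials are all choices made for illustration and are not part of this module.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same result

MU = 1.0       # true mean of the assumed exponential population (rate 1)
N = 50         # sample size, "large" under the n >= 30 rule
TRIALS = 2000  # number of simulated samples

inside = 0
for _ in range(TRIALS):
    sample = [random.expovariate(1.0) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)            # s replaces the unknown sigma
    z = (x_bar - MU) / (s / math.sqrt(N))
    if abs(z) <= 1.96:                      # central 95% of the standard normal
        inside += 1

coverage = inside / TRIALS
print(coverage)
```

If the approximation is reasonable, the printed proportion should be close to 0.95, although it will not match exactly because the population is skewed and s is only an estimate.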
When sample sizes are small, the Central Limit Theorem does not apply. You must then impose stricter assumptions on
the population to give statistical validity to the test procedure. One common assumption is that the population from which
the sample is taken has a normal probability distribution to begin with. Under such circumstances, if the population
standard deviation is known, then the test statistic $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ still has the standard normal distribution.
The table shows what test statistic is appropriate when:
Criterion | Population Variance Is Known | Population Variance Is Unknown | Central Limit Theorem (CLT)
Population | Population is normally distributed. | Population is normal or nearly normally distributed. | Population may not be normally distributed.
Sample size | 𝑛 ≥ 30 | 𝑛 < 30 | 𝑛 ≥ 30 or considered sufficiently large
Standard deviation | Population standard deviation (𝜎) is known. | Sample standard deviation (s) is known. Population standard deviation (𝜎) is unknown. | Variance is known/unknown.
Test statistic | z-test | t-test | Use z-test by replacing population standard deviation (𝜎) by sample standard deviation (𝑠) in the formula.

Identifying Appropriate Test Statistic

When the value of sample size (n)…
• 𝒏 ≥ 𝟑𝟎 and σ is known: z-test
• 𝒏 ≥ 𝟑𝟎 and σ is not known: z-test
• 𝒏 < 𝟑𝟎 and σ is known: z-test
• 𝒏 < 𝟑𝟎 and σ is not known: t-test
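The decision rule above can be written as a small Python function. This is a minimal sketch: the name `choose_test_statistic` is hypothetical, and it assumes, as this module does, that the population is normal or approximately normal whenever the sample size is less than 30.

```python
def choose_test_statistic(n, sigma_known):
    """Return "z-test" or "t-test" following the n >= 30 rule of thumb.

    n           -- sample size
    sigma_known -- True if the population standard deviation is given
    Assumes a normal (or nearly normal) population when n < 30.
    """
    if n >= 30:
        # Large sample: z-test whether or not sigma is known (CLT).
        return "z-test"
    # Small sample: z-test only if sigma is known, otherwise t-test.
    return "z-test" if sigma_known else "t-test"

print(choose_test_statistic(40, True))    # large sample, sigma known
print(choose_test_statistic(40, False))   # large sample, sigma unknown
print(choose_test_statistic(20, True))    # small sample, sigma known
print(choose_test_statistic(20, False))   # small sample, sigma unknown
```

Only one of the four combinations leads to the t-test: a small sample with an unknown population standard deviation.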

Illustrative Examples:
1. A manufacturer claimed that the average life of batteries used in their electronic games is 150 hours. It is known that
the standard deviation of this type of battery is 20 hours. A consumer wished to test the manufacturer’s claim and
accordingly tested 100 electronic games using the battery. It was found that the mean is equal to 144 hours.
Here, the sample size (n) is 100 (large) and the population standard deviation (20 hours) is known, so
the appropriate test statistic to be used is the z-test.
2. An English teacher wanted to test whether the mean reading speed of students is 550 words per minute. A sample of
12 students revealed a sample mean of 540 words per minute with a standard deviation of 5 words per minute. At the
0.05 significance level, is the reading speed different from 550 words per minute?
The sample size (n) is 12, which is less than 30, and the sample standard deviation (5 words per minute) was given.
Therefore, the appropriate test is the t-test.
3. A study was conducted to look at the average time students exercise. A researcher claimed that, on average, students
exercise less than 15 hours per month. In a random sample of size n=115, it was found that the mean time students
exercise is 𝑥̅ = 11.3 hours per month with s = 6.43 hours per month.
Since n=115, the sample size is large and the variance is unknown. Hence, the z-test is the appropriate tool. (Central
Limit Theorem)

Note:
The illustrative examples above used standard deviations instead of variances. Variance is the square of the standard
deviation and conversely, the standard deviation is the square root of the variance. Hence, if the standard deviation
is known in the problem, then basically, variance is also known.

What’s More
Activity 3: Mark My Numbers!
Directions: In each problem, underline the population standard deviation/sample standard deviation and circle the
sample size.
1. A sample of 160 people has a mean age of 27 with a population standard deviation (σ) of 5. Test the hypothesis that
the population mean is 26.7 at α=0.05.
2. An electric lamps manufacturer is testing a new production method that will be considered acceptable if the lamps
produced by this method result in a normal population with an average life of 1,300 hours and a standard deviation
equal to 120. A sample of 100 lamps produced by this method has an average life of 1,250 hours.
3. The cholesterol levels in a certain population have a mean of 210 and a standard deviation of 21. The cholesterol levels for
a random sample of 9 individuals are measured and the sample mean 𝑥̅ is determined. What is the z-score for a sample
mean 𝑥̅ = 180?
4. Mabunga Elementary School has 1,000 students. The principal of the school thinks that the average IQ of students at
Mabunga is at least 110. To prove her point, she administers an IQ test to 20 randomly selected students. Among the
sampled students, the average IQ is 108 with a standard deviation of 10.
5. A new energy-efficient lawn mower engine was developed by a well-known inventor. He claims that the engine will
run continuously for 5 hours on a single gallon of regular gasoline. From his stock of 2,000 engines, the inventor selects
a simple random sample of 50 engines for testing. The engines run for an average of 295 minutes with a standard
deviation of 20 minutes.

Activity 4. Check It Out!


Directions: Read and analyze each problem. On the table below, put a check on the columns of the criteria that correspond
to the given problem.
1. It is claimed that the average age of working students in a certain university is 35. A researcher selected a random
sample of 25 working students. The computation of their ages resulted in an average of 32 years with a standard
deviation of 10 years.
2. A manufacturer of tires claims that their tires have a mean life of at least 50,000 km. A random sample of 28 of these
tires is tested and the sample mean is 33,000 km. Assume that the population standard deviation is 3,000 km and the
lives of the tires are approximately normally distributed.
3. On average, a drink vending machine is adjusted so it dispenses 240 ml of fruit juice. However, the machine tends
to go out of adjustment and periodic checks are made to determine the average amount of fruit juice being dispensed.
A sample of 28 plastic-cup drinks, with a standard deviation of 15 ml, is taken to test the adjustment of the machine.
4. Uber company claims that the mean time to rent a car on their app is 60 seconds with a standard deviation of 30
seconds. A random sample of 36 customers attempted to rent a car on the app. The mean time of renting was 75
seconds. Is this enough evidence to contradict the company's claim?
5. The waiting time to be seated at the restaurant has population standard deviation of 10 minutes. An expensive
restaurant claims that the average waiting time for dinner is approximately 1 hour, but we suspect that this claim is
inflated to make the restaurant appear more exclusive and successful. A random sample of 30 customers yielded a
sample average waiting time of 50 minutes.
Problem | 𝒏 ≥ 𝟑𝟎 | 𝒏 < 𝟑𝟎 | 𝝈 is known. | 𝝈 is unknown. | z-test | t-test
1. | | | | | |
2. | | | | | |
3. | | | | | |
4. | | | | | |
5. | | | | | |

Activity 5. Which is Which?


Directions: Identify the appropriate test statistic to be used in each problem. Write z-test or t-test on a separate sheet of
paper.
___________1. A sample of n=25 is selected from a normal population, 𝑥̅ = 56 and s= 12.
___________2. Based on the report of the school nurse, the average height of Grade 11 students has increased. Five years
ago, the average height of Grade 11 students was 170cm with standard deviation of 38cm. She took a random sample of
150 students and derived the average height of 165cm.
___________3. Knowing from a previous study that the average of athletes is 80, an athletic adviser asked how his soccer
players are academically doing as compared to other student athletes. After an initiative to help improve the average of
student athletes, the adviser randomly selected 15 soccer players and found 85 as the average with standard deviation of
1.25.
___________4. The CEO of a battery manufacturing company claimed that their batteries would last an average of 280
hours under normal use. A researcher randomly selected 20 batteries from the production line and tested them. The
tested batteries had a mean life span of 250 hours with a standard deviation of 40 hours. Do we have enough evidence to
suggest that the claim of an average of 280 hours is false?
___________5. It was known that the number of tickets purchased by students at the ticket window for the volleyball
match of two popular universities followed a distribution that has mean of 500 and standard deviation of 8.9. Suppose
that a few hours before the start of one of these matches, there are 100 eager students standing in line to purchase tickets.
If there are 250 tickets remaining, what is the probability that all 100 students will be able to purchase the tickets they
want?

What I Have Learned


Activity 6.
Complete the following sentences by filling each blank with the correct word or phrase.
1. __________________ is a random variable that is calculated from sample data and is used in a hypothesis test.
2. ____________ includes all of the elements from a set of data while ______________ consists of one or more
observations drawn from the population.
3. ___________ is a measure of variability calculated from every individual in the population while ______________
is calculated from only some of the individuals in a population.
4. The two common test statistics to be computed in hypothesis testing are ________________________ and
____________________________________.
5. A z-score is calculated with population parameters such as population mean and ______________________.
6. A t-test is used when the __________________ or standard deviation is not known.
7. The sample size for a z-test is ________________________ while it is ________________________ for a t-test.
8. If the population standard deviation is known, use ______________________ and if it’s unknown, use
________________________.
9. The notations that need to be considered in identifying test statistics are _____________________ and
____________________.
10. If the number of samples is sufficiently large and the variance is unknown, then ________________________ is
appropriate to be used.

What I Can Do
Activity 7.
Make a comic strip on how to determine the appropriate tool when the variance is known, when the variance is unknown,
and when the Central Limit Theorem is used. Your work will be evaluated using the following rubric.
Clear Understanding of Mathematical Concept 30
Organization and Accuracy of Solution(s) 30
Clear Understanding of Vocabulary 10
Accuracy of Analysis 20
Presentation 10
Total 100

Assessment
Directions: Choose the best answer to the given questions or statements. Write the letter of your choice on a separate
sheet of paper.
1. If the variance is known, what test statistic is appropriate?
A. t-test B. z-test C. two-tailed test D. one-tailed test
2. One-sample t-statistic is used instead of one-sample z-statistic when ___________________.
A. μ is known. B. σ is known. C. μ is unknown. D. σ is unknown.
3. Based on the Central Limit Theorem, when the sample (n) is extremely large and the variance is unknown, what is the
statistical test to be used?
A. t-test B. z-test C. two-tailed test D. one-tailed test
4. Which of the following is NOT a consideration in using z-test/statistic?
A. Variance is known.
B. Sample standard deviation is known.
C. The population mean is less than 30.
D. Population standard deviation is known.
5. What appropriate tool is applicable if the population is normal, sample standard deviation is known, and sample is
less than 30?
A. t-test B. z-test C. normal test D. Central Limit Theorem
6. Which of the following symbols is NOT needed when t-test is used in computing values?
A. 𝑛 B. µ C. 𝜎 D. 𝑠
7. If in a sample n=16 selected from a normal population, 𝑥̅ = 56 and 𝑠 = 12, what statistical test is applicable to be used?
A. f-test B. t-test C. z-test D. Central Limit Theorem
8. Based on Central Limit Theorem, the z-test for single sample may be used when all the following conditions are TRUE
except _________________.
A. Sample size is less than 30.
B. Data are normally distributed.
C. Population standard deviation is known.
D. Population standard deviation is unknown.
9. What is the sample standard deviation if a simple random sample of 220 students is drawn from a population of 2,740
college students? Among the sampled students, the average IQ score is 115 with standard deviation of 10.
A. 10 B. 115 C. 220 D. 2,740
10. The supervisor of a certain company claimed that the mean workday of his workers is 8.3 hours per day. A sample of
20 workers was taken and it was found out that the mean workday is 8 hours with standard deviation of 1 hour. At
0.01 level of significance, is the mean workday less than 8.3 hours?
What test statistic is to be used in the given problem?
A. z-test B. t-test C. right-tailed test D. left-tailed test
11. Based on the problem in no. 10, 8.3 hours is _____________.
A. σ B. µ C. 𝑥̅ D. 𝑠
12. A leader of an association of jeepney drivers claims that the average daily take-home pay of all jeepney drivers in
Caloocan is ₱350.00. A random sample of 100 jeepney drivers in Caloocan was interviewed and the take-home pay was
found to be ₱420.00. If the 0.05 significance level was used to find out whether the average take-home pay is different
from ₱350.00 and the population variance was assumed to be ₱92.00, what is the appropriate test statistic?
A. z-test B. t-test C. left-tailed test D. right-tailed test
13. L.V. Co. has an average sale of ₱37 million per week from their products in all their outlets. An area manager found
out that the average gross sales from the 28 outlets under her jurisdiction is ₱32.5 million per week with standard
deviation of ₱1.5 million. Does the mean sales of all outlets differ from the mean sales of the 28 outlets under her
jurisdiction? In the given problem, what statistical tool is suitable to use?
A. t-test B. z-test C. ANOVA D. chi-square test
14. A cellular battery manufacturer claims that his battery, when fully charged, has a mean life of 24 hours with a standard
deviation of 4 hours. A dealer randomly chose a sample of 35 batteries to be tested, which resulted in a mean life of
22.5 hours. In the given situation, 22.5 hours is __________.
A. sample mean
B. number of sample
C. sample standard deviation
D. population standard deviation
15. According to a study, there is an increase on average monthly expenses of ₱250.00 for cell phone loads of Senior High
School students in the city. Is there a reason to believe that the amount increased if sample of 60 students has an
average monthly expense of ₱280.00 and the population standard deviation is ₱77.00? What is the tool to be used in
computing the test value?
A. z-test B. t-test C. left-tailed test D. alternative test

Additional Activities
Activity 8. Read, Analyze, and Answer!
Directions: Answer the following.
1. In a sample of 𝑛 = 12 selected from a normal population, 𝑥̅ = 50, 𝑠 = 10, and null hypothesis is 𝐻0: µ = 45.
a. What is the number of degrees of freedom?
b. What is the test statistic to be used?
2. In order to test 𝐻0: µ = 26 versus 𝐻𝑎: µ < 26, a random sample of size 𝑛 = 37 is obtained from the population that is
known to be normally distributed with 𝜎 = 3.
a. Based on the given alternative hypothesis, what is the hypothesis test?
b. What test statistic would you apply to compute for the value?
