MATH 403 EDA Chapter 8

This document discusses hypothesis testing for a single sample. It covers: 1) Defining the null and alternative hypotheses, and types of errors that can occur. 2) Choosing a significance level like 0.05 and determining the critical value and region. 3) Calculating a test statistic like z and comparing it to the critical value to determine if the null hypothesis should be rejected or not. 4) Introducing the p-value approach which provides the probability of obtaining test results at least as extreme as what was observed, assuming the null is true. 5) Outlining the general procedure for hypothesis testing which involves stating hypotheses, choosing a test, making a decision rule,

Uploaded by

Marco Yvan

MATH 403- ENGINEERING DATA ANALYSIS

CABACES, DONNALYN C.
MARCAIDA, MARJORIE G.
SOTTO, RODOLFO JR. C.

Chapter 8
TEST ON HYPOTHESIS FOR A SINGLE SAMPLE

Introduction

The previous chapters discussed how a population parameter can be estimated from sample data using a point estimate or a confidence interval. In many situations, however, there are two competing claims about the value of a parameter, and the analyst must decide which claim the data support. This decision is made through statistical inference, the branch of statistics that uses sample statistics to estimate population values, called parameters, and to make statements about them with a stated degree of confidence. Hypothesis testing is the inferential method used to judge how well such generalizations hold. This chapter focuses on the basic principles of hypothesis testing for means, variances, and proportions involving a single sample of data.

Intended Learning Outcomes

At the end of this module, it is expected that the students will be able to:

1. Test hypotheses on the mean of a normal distribution using either a Z-test or a t-test.

2. Test hypotheses on the variance or standard deviation of a normal distribution.

3. Test hypotheses on a population proportion.

4. Use the P-value approach for making decisions in hypothesis tests.

8.1. Hypothesis Testing

Hypothesis testing is a decision-making process for evaluating claims about a population. The goal is to judge whether the difference between a sample statistic and a hypothesized population parameter is larger than chance alone would explain. In this process, the researcher must define the population under study, state the hypotheses to be investigated, give the significance level, select a sample, collect data, perform the required test, and reach a conclusion. The z-test and t-test are used for hypotheses on means, while the chi-square test is used for hypotheses on the variance or standard deviation.

Null and Alternative Hypothesis

The null hypothesis, denoted H0, is a statement of equality indicating that no difference or relationship exists between the quantities under study. It is the statement that is tested, and the test ends with H0 either rejected or not rejected.

The alternative hypothesis, denoted H1 (or Ha), is also called the research hypothesis. It is a statement of the expectation derived from the theory under study.

Type I and Type II Error

In hypothesis testing, there are four possible outcomes.

                 Reject H0           Do not reject H0

H0 is true       Type I error        Correct decision

H0 is false      Correct decision    Type II error

A type I error occurs if one rejects the null hypothesis when it is true. The probability of committing a type I error is called the significance level and is denoted by the Greek letter alpha (α). Common values of α are 1%, 5%, and 10%. A type II error occurs if one does not reject the null hypothesis when it is false. Its probability is denoted by the Greek letter beta (β).

Significance Level and Confidence Level

The level of significance is the maximum probability of committing a type I error; that is, P(type I error) = α. By convention, statisticians use three significance levels: 0.10, 0.05, and 0.01. If H0 is true, a test at one of these levels rejects it wrongly with probability 10%, 5%, or 1%, and reaches the correct decision with probability 90%, 95%, or 99%, respectively. The probability of a correct decision, 1 − α, is the confidence level: the chance of not rejecting the null hypothesis when it is in fact true.
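The meaning of α as a type I error rate can be checked by simulation. The sketch below is only an illustration under assumed settings (a standard normal population, n = 25, 5,000 repetitions, and a fixed random seed, none of which come from the module): it repeatedly samples from a population for which H0 is true and counts how often a two-tailed z-test at α = 0.05 rejects anyway.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, trials=5000, seed=1):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    mu0, sigma = 0.0, 1.0                  # assumed population; H0: mu = mu0 is true
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:                # falls in the critical region
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated P(type I error) = {rate:.3f}")
```

The estimated rate should land close to the chosen α, which is what "maximum probability of committing a type I error" means in practice.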



8.1.1. One-sided and Two-sided Hypothesis

To state the hypotheses correctly, the researcher must translate the claim into mathematical symbols. There are three possible sets of statistical hypotheses.

1. H0: parameter = specific value          This is a two-tailed test

   H1: parameter ≠ specific value

2. H0: parameter = specific value          This is a left-tailed test

   H1: parameter < specific value

3. H0: parameter = specific value          This is a right-tailed test

   H1: parameter > specific value
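The direction of the test decides where the critical region sits. As a small sketch for the case of a standard normal test statistic (the function name `z_critical` and the tail labels are illustrative choices, not part of the module), the critical values for each of the three sets of hypotheses can be computed with Python's standard library:

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Return the critical value(s) of a z-test for the given tail."""
    std_normal = NormalDist()              # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)                     # reject if z < -z or z > +z
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)        # reject if z is below this
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)    # reject if z is above this
    raise ValueError("tail must be 'two', 'left' or 'right'")

for tail in ("two", "left", "right"):
    print(tail, [round(v, 3) for v in z_critical(0.05, tail)])
```

At α = 0.05 the two-tailed test splits the area into 0.025 per tail, which gives critical values near ±1.96, while a one-tailed test puts the whole 0.05 in one tail and gives a value near 1.645 in magnitude.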

8.1.2. P-value in Hypothesis Tests

When the test statistic is discrete, the critical region may be chosen arbitrarily and its size α then determined. If α is too large, it can be reduced by adjusting the critical value, although the sample size may need to be increased to offset the loss of power that results. In statistical analysis it has become customary to choose a significance level of 0.10, 0.05, or 0.01 and to select the critical region accordingly; rejection or non-rejection of the null hypothesis H0 then depends on that region. For example, if the test is two-tailed, α is set at 0.05, and the test statistic follows the standard normal distribution, then a z-value is computed from the data and the critical region is z > 1.96 or z < −1.96, where 1.96 is the value z_{0.025} from the table of Areas Under the Normal Curve. A value of z in the critical region prompts the statement “The value of the test statistic is significant,” which can then be translated into the user’s language. For example, if the hypotheses are H0: μ = 12 and H1: μ ≠ 12, one might say, “The mean differs significantly from the value 12.”

Pre-selecting a significance level is rooted in the philosophy that the maximum risk of making a type I error should be controlled. However, this approach does not account for values of the test statistic that are “close” to the critical region. Suppose, for example, that in the illustration with H0: μ = 12 versus H1: μ ≠ 12, a value of z = 1.84 is observed. Strictly speaking, with α = 0.05, the value is not significant, but the risk of committing a type I error by rejecting H0 in this case could hardly be considered severe. In fact, in a two-tailed scenario, one can quantify this risk as P = 2P(Z > 1.84 when μ = 12) = 2(0.0329) = 0.0658. That is, 0.0658 is the probability of obtaining a value of z as large as or larger in magnitude than 1.84 when in fact μ = 12. This is important information for the user, even though the evidence against H0 is not as strong as that which would result from rejection at the α = 0.05 level. For this reason, the P-value approach is used extensively in applied statistics: it offers an alternative, in terms of a probability, to a mere “reject” or “do not reject” conclusion. The P-value is also informative when the z-value falls well inside the usual critical region. For example, if z = 2.75, then P = 2(0.0030) = 0.0060, so the z-value is significant at a level considerably smaller than 0.05. Under H0, a value of z = 2.75 is an extremely rare event: a value at least that large in magnitude would occur only 60 times in 10,000 experiments.

A P-value is the lowest level of significance at which the observed value of the test statistic is significant. In other words, it is the smallest value of α that would lead to rejection of H0 with the given data.
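The two-tailed P-values quoted for z = 1.84 and z = 2.75 can be reproduced without a printed table. This is a minimal sketch that assumes the test statistic is standard normal under H0; the helper name `two_sided_p_value` is an illustrative choice.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. twice the upper-tail area."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail

for z in (1.84, 2.75):
    p = two_sided_p_value(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"z = {z}: P = {p:.4f} ({verdict} at alpha = 0.05)")
```

Reporting P itself, rather than only the verdict, lets the reader see how far the result is from the chosen α.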

8.1.3. General Procedure for Test of Hypothesis

The following are the steps in hypothesis testing using the fixed probability of type I error approach.

1. State the null and alternative hypotheses.

2. Determine the level of significance and the direction of the test. The direction depends on whether the alternative hypothesis is stated as a left-tailed, right-tailed, or two-tailed test.

3. Determine the appropriate statistical test based on the level of measurement of the data gathered.

4. Write the decision rule, stating the critical region in which the null hypothesis is rejected.

5. Compute the test statistic and compare it with the critical value. The test statistic determines whether the null hypothesis is rejected or not.

6. State the decision based on the comparison of the computed value with the critical value.

7. Draw a scientific or engineering conclusion for the given problem.

If the hypothesis is tested using significance testing (the P-value approach), follow these steps:

1. State the null and alternative hypotheses.

2. Determine the appropriate statistical test based on the level of measurement of the data gathered.

3. Compute the test statistic.

4. Compute the P-value based on the computed value of the test statistic.

5. State the decision based on the resulting P-value and knowledge of the scientific system.

6. Draw a scientific or engineering conclusion for the given problem.

8.2. Test on the Mean of a Normal Distribution, Variance Known

When the steps in hypothesis testing are applied to a single mean, the specific value in the hypotheses is called the hypothesized mean (μ0).

The null hypothesis is stated as:

H0: μ = μ0

The alternative hypothesis can be written as one of:

H1: μ ≠ μ0

H1: μ > μ0

H1: μ < μ0

The decision rule for a two-tailed test is: reject the null hypothesis if the absolute value of the test statistic exceeds the critical value; otherwise, do not reject it. For a one-tailed test, reject the null hypothesis only if the test statistic lies beyond the critical value in the direction stated by H1.

The Z-test is used to draw an inference on a single population mean when the observations are normally distributed and the variance is known. It is also commonly applied when the sample size is 30 or more (n ≥ 30). The Z-statistic, zc, is the test statistic that may lead to rejection of the null hypothesis in favor of the alternative hypothesis. It is computed as:

$$ z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} $$

where X̄ is the mean computed from the gathered data, μ0 is the hypothesized mean, σ is the population standard deviation, which is known or given, and n is the sample size. The critical value is read from the z-table. For a two-tailed test, the critical value is the z-value with cumulative area 1 − α/2, written z_{α/2}. For a one-tailed test, it is the z-value with cumulative area 1 − α, written z_α.

Figure 1. The normal distribution (Z-distribution) for testing the hypothesis H0: μ = μ0, with critical values for (a) H1: μ ≠ μ0, (b) H1: μ > μ0, (c) H1: μ < μ0

Example 1. A random sample of 100 students enrolled in a Statistics course under Professor X shows that the average grade in the midterm examination is 85%. Professor X claims that the average midterm grade of the students is at least 80%, with a standard deviation of 16%. Is there evidence that the claim is correct at the 5% level of significance?

Solution:

1. H0: μ = 80%

   H1: μ > 80%

2. α = 0.05, right-tailed test

3. Test statistic: $z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

4. Critical region: z > 1.645. Reject H0 if zc is greater than 1.645.

5. Computing the z-statistic:

$$ z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{85 - 80}{16 / \sqrt{100}} = 3.125 $$

6. Reject H0 since 3.125 is greater than 1.645.

7. Therefore, Professor X's claim is supported at the 5% level of significance.

Using the P-value approach, the P-value corresponding to z = 3.125 is 0.0009 from the table of Areas Under the Normal Curve. This is evidence in favor of the alternative hypothesis H1 that is much stronger than the 0.05 level of significance requires.

Example 2. A manufacturer of solar lamps claims that the mean useful life of its new product is 8 months with a standard deviation of 0.5 month. To test this claim, a random sample of 50 solar lamps was tested and found to have a mean life of 7.8 months. Test the hypothesis that μ = 8 months against the alternative hypothesis that μ ≠ 8 months using the 1% level of significance.

Solution:

1. H0: μ = 8 months

   H1: μ ≠ 8 months

2. α = 0.01, two-tailed test

3. Test statistic: $z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$

4. Critical region: z < −2.575 or z > 2.575. Reject H0 if zc < −2.575 or zc > 2.575.

5. Computing the z-statistic:

$$ z_c = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{7.8 - 8}{0.5 / \sqrt{50}} = -2.8284 \approx -2.83 $$

6. Reject H0 since −2.83 is less than −2.575.

7. Therefore, the mean useful life of the new product is not equal to 8 months. In fact, it is less than 8 months at the 1% level of significance.

Using the P-value approach and considering that this is a two-tailed test, the P-value is twice the area to the left of z = −2.83. Using the table of Areas Under the Normal Curve,

$$ P = P(|Z| > 2.83) = 2P(Z < -2.83) = 2(0.0023) = 0.0046 $$

This leads to rejection of H0 at a level of α below 1%.
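The two worked examples of this section can be reproduced with one small function. This is a sketch rather than a library routine: the name `z_test` and the `alternative` labels ("two-sided", "greater", "less") are illustrative choices, and the code assumes σ is known, as the section requires.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alternative="two-sided"):
    """One-sample z-test with known sigma; returns (z statistic, P-value)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)              # area to the left of z
    if alternative == "two-sided":         # H1: mu != mu0
        p = 2 * min(cdf, 1 - cdf)
    elif alternative == "greater":         # H1: mu > mu0
        p = 1 - cdf
    elif alternative == "less":            # H1: mu < mu0
        p = cdf
    else:
        raise ValueError("unknown alternative")
    return z, p

# Example 1: midterm grades, right-tailed test at alpha = 0.05
z1, p1 = z_test(85, 80, 16, 100, "greater")
# Example 2: solar lamps, two-tailed test at alpha = 0.01
z2, p2 = z_test(7.8, 8, 0.5, 50, "two-sided")

print(f"Example 1: z = {z1:.3f}, P = {p1:.4f}, reject H0: {p1 < 0.05}")
print(f"Example 2: z = {z2:.3f}, P = {p2:.4f}, reject H0: {p2 < 0.01}")
```

Because the code uses the unrounded z for Example 2, its P-value may differ from the table-based figure in the last decimal place; the decision is the same.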

8.3. Test on the Mean of a Normal Distribution, Variance Unknown

The t-test is used to draw an inference on a single population mean when the observations are normally distributed, the variance is unknown, and the sample size is less than 30. The test statistic is the t-statistic, tc, computed as follows:

$$ t_c = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} $$

where X̄ is the mean computed from the gathered data, μ0 is the hypothesized mean, s is the sample standard deviation, and n is the sample size. The critical value is read from the t-table. For a two-sided test, the critical value is obtained at α/2 with n − 1 degrees of freedom (d.f.), written t_{α/2}(n − 1). For a one-sided test, it is obtained at α with n − 1 degrees of freedom, written t_α(n − 1).

Figure 2. The t-distribution for testing the hypothesis H0: μ = μ0, with critical values for (a) H1: μ ≠ μ0, (b) H1: μ > μ0, (c) H1: μ < μ0

Example 1. The College of Engineering of a State University gives an entrance exam to incoming freshmen. Those whose scores equal or exceed the set passing score are accepted into the College. The average score of incoming freshmen was 80% before the implementation of the K to 12 education system. Because of this implementation, the entrance exam was suspended for two years, and it is thought that the quality of the first-year students has diminished. In keeping with the vision, mission, goals, and objectives of the University and the College toward quality education, the Dean wants to determine whether the quality of freshmen has changed, that is, whether it has improved or diminished. He selects a small random sample of 15 freshmen and administers the same entrance exam. The average score is found to be 83% with a standard deviation of 5%. Determine whether the quality has changed using the 1% level of significance.


Solution:

1. H0: μ = 80%

   H1: μ ≠ 80%

2. α = 0.01, two-tailed test

3. Test statistic: $t_c = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$

4. Critical region: t < −2.977 or t > 2.977. Reject H0 if tc is less than −2.977 or greater than 2.977. The critical value ±2.977 is obtained from the table of Critical Values of the t-Distribution using α/2 = 0.005 and degrees of freedom ν = 15 − 1 = 14.

5. Computing the t-statistic:

$$ t_c = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{83 - 80}{5 / \sqrt{15}} = 2.32 $$

6. Do not reject H0 since 2.32 lies between −2.977 and 2.977.

7. Therefore, at the 1% level of significance, there is not enough evidence to conclude that the quality of freshmen has changed.

Since this is a two-tailed test, the P-value corresponding to t = 2.32 is

$$ P = P(|T| > 2.32) = 2P(T > 2.32) = 0.036 $$

or 3.6%, which is greater than α = 0.01 and agrees with the decision not to reject H0.
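The t-statistic and its two-tailed P-value for the entrance-exam example can be checked numerically. Python's standard library has no t-distribution, so this sketch integrates the t density with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` and the step count are assumptions made for illustration, not a reference implementation.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t               # the density is symmetric about 0

n, xbar, mu0, s = 15, 83, 80, 5            # values from the example
t_c = (xbar - mu0) / (s / sqrt(n))
p_two_sided = 2 * t_upper_tail(abs(t_c), n - 1)

print(f"t = {t_c:.2f}, two-tailed P = {p_two_sided:.3f}")
print("reject H0 at alpha = 0.01:", p_two_sided < 0.01)
```

The same `t_upper_tail` function can be used to confirm a table entry: the upper-tail area beyond 2.977 with 14 degrees of freedom should be close to 0.005.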



8.4. Test on Variance and Standard Deviation of a Normal Distribution

The chi-square distribution is used to test a claim about a single variance or standard deviation. The chi-square test statistic for a single variance is given by:

$$ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} $$

where n is the sample size, s² is the sample variance, and σ² is the population variance stated in H0, with n − 1 degrees of freedom. There are three assumptions for the chi-square test: the sample must be randomly selected from the population, the population must be normally distributed for the variable under study, and the observations must be independent of each other.

Figure 3. The chi-squared distribution for testing the hypothesis H0: σ² = σ0², with critical values for (a) H1: σ² ≠ σ0², (b) H1: σ² > σ0², (c) H1: σ² < σ0²

Example 1. A company claims that the variance of the sugar content of its ice cream is equal to 25 mg/oz. A sample of 20 servings is selected, and the sugar content of each is measured. The variance of the sample is found to be 36. At the 10% level of significance, is there enough evidence to reject the claim?

Solution:

1. H0: σ² = 25 mg/oz

   H1: σ² ≠ 25 mg/oz

2. α = 0.10, two-tailed test

3. Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$

4. Critical region: χ² < 10.117 or χ² > 30.144. Reject H0 if χ² is less than 10.117 or greater than 30.144. These values are obtained from the table of Critical Values of the Chi-Squared Distribution using α/2 = 0.05 and degrees of freedom ν = 20 − 1 = 19.

5. Computing the χ² statistic:

$$ \chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(19)(36)}{25} = 27.36 $$

6. Do not reject H0 since 10.117 < 27.36 < 30.144.

7. Therefore, at the 10% level of significance, there is not enough evidence to reject the company's claim that the variance of the sugar content is 25 mg/oz.
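The chi-square statistic for the ice cream example, and the tail areas behind the tabulated critical values, can be verified numerically. The standard library has no chi-square distribution, so this sketch evaluates its cumulative distribution through the usual power series for the regularized lower incomplete gamma function; the helper name `chi2_cdf` and the stopping tolerance are illustrative assumptions.

```python
from math import exp, lgamma, log

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), by series expansion."""
    if x <= 0:
        return 0.0
    a, y = df / 2, x / 2
    term = 1.0 / a                         # k = 0 term of sum y^k / (a (a+1) ... (a+k))
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= y / (a + k)
        total += term
        k += 1
    # multiply the series by y^a * e^(-y) / Gamma(a), computed in log form
    return total * exp(-y + a * log(y) - lgamma(a))

n, s2, sigma2_0 = 20, 36, 25               # values from the example
chi2_stat = (n - 1) * s2 / sigma2_0
upper_tail = 1 - chi2_cdf(chi2_stat, n - 1)
p_two_sided = 2 * min(upper_tail, 1 - upper_tail)

print(f"chi-square = {chi2_stat:.2f}, two-tailed P = {p_two_sided:.3f}")
print("reject H0 at alpha = 0.10:", p_two_sided < 0.10)
```

Doubling the smaller tail is one common convention for a two-tailed P-value with an asymmetric distribution; it agrees with the critical-region decision in the solution.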

8.5. Test on a Population Proportion

This test considers the hypothesis that the proportion of successes in a binomial experiment equals some specified value. That is, the null hypothesis H0 that p = p0 is tested, where p is the parameter of the binomial distribution. The alternative hypothesis may be one of the usual one-sided or two-sided alternatives:

p < p0, p > p0, or p ≠ p0

The following are the steps in testing a proportion with a small sample:

1. H0: p = p0

   H1: one of p < p0, p > p0, or p ≠ p0

2. Choose a level of significance equal to α.

3. Test statistic: binomial variable X with p = p0.

4. Computations: find x, the number of successes, and compute the appropriate P-value.

5. Decision: draw the appropriate conclusion based on the P-value.

Example 1. A home developer claims that solar panels are installed in 65% of all homes being constructed today in a certain subdivision. Would you agree with this claim if a random survey of new homes in this subdivision shows that 8 out of 15 had solar panels installed? Use a 0.10 level of significance.

Solution:

1. H0: p = 0.65

   H1: p ≠ 0.65

2. α = 0.10, two-tailed test

3. Test statistic: binomial variable X with p = 0.65 and n = 15

4. Computations: x = 8 and np0 = (15)(0.65) = 9.75. Since x is below np0, the P-value is twice the lower-tail probability. Using the table of Binomial Probability Sums,

$$ P = 2P(X \le 8 \text{ when } p = 0.65) = 2 \sum_{x=0}^{8} b(x; 15, 0.65) = 2(0.2452) = 0.4904 $$

5. Do not reject H0, since the P-value is greater than 0.10, and conclude that there is not enough evidence to doubt the claim of the home developer.
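The binomial tail sum in step 4 can be computed directly instead of being read from a table. The sketch below uses only the standard library; the helper name `binom_cdf` is an illustrative choice, and doubling the lower tail follows the two-tailed rule used in the solution for the case x < np0.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for a binomial variable with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

n, x, p0 = 15, 8, 0.65                     # values from the example
alpha = 0.10                               # level stated in the problem

lower_tail = binom_cdf(x, n, p0)           # x is below n*p0 = 9.75
p_value = min(1.0, 2 * lower_tail)         # two-tailed: double the lower tail

print(f"P(X <= {x}) = {lower_tail:.4f}, two-tailed P = {p_value:.4f}")
print("reject H0:", p_value < alpha)
```

The lower tail works out to roughly 0.245, so the doubled P-value is close to 0.49, far above 0.10, and H0 is not rejected.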

For large n, an approximation is required. When the hypothesized value p0 is very close to 0 or 1, the Poisson distribution with parameter μ = np0 may be used. However, the normal-curve approximation, with parameters μ = np0 and σ² = np0q0, where q0 = 1 − p0, is usually preferred for large n and is very accurate as long as p0 is not extremely close to 0 or 1. Using the normal approximation, the z-value for testing p = p0 is given by

$$ z = \frac{x - np_0}{\sqrt{np_0 q_0}} $$

which is a value of the standard normal variable Z. Hence, for a two-tailed test at the α level of significance, the critical region is z < −z_{α/2} or z > z_{α/2}. For the one-sided alternative p < p0, the critical region is z < −z_α, and for the alternative p > p0, the critical region is z > z_α.

Example 1. A semiconductor company produces microcontrollers for robotic applications. The company is said to demonstrate process capability to its customers if the process produces no more than 5% defective items. To determine this, a random sample of 200 microcontrollers was tested, and four defective items were found. Would you agree that the company demonstrates process capability at the 0.05 level of significance? Use the P-value approach.
Solution:

1. H0: p = 0.05

   H1: p < 0.05

2. Test statistic: $z = \frac{x - np_0}{\sqrt{np_0 q_0}}$

3. Computing the z-statistic:

$$ z = \frac{x - np_0}{\sqrt{np_0 q_0}} = \frac{4 - 200(0.05)}{\sqrt{200(0.05)(0.95)}} = -1.95 $$

4. From the table of Areas Under the Normal Curve, the P-value is P(Z < −1.95) = 0.0256.

5. Since the P-value is less than 0.05, reject H0.

6. Therefore, at the 5% level of significance, the company demonstrates process capability to its customers.
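The large-sample proportion test for the microcontroller example can be sketched in a few lines. The variable names are illustrative, and the code relies on the normal approximation described in this section, so it is only appropriate when n is large and p0 is not extremely close to 0 or 1.

```python
from math import sqrt
from statistics import NormalDist

n, x, p0 = 200, 4, 0.05                    # values from the example
q0 = 1 - p0
alpha = 0.05

z = (x - n * p0) / sqrt(n * p0 * q0)       # normal approximation to the binomial
p_value = NormalDist().cdf(z)              # left-tailed test, H1: p < p0

print(f"z = {z:.2f}, P = {p_value:.4f}")
print("reject H0:", p_value < alpha)
```

Since the unrounded z is used, the P-value may differ slightly in the fourth decimal place from the table value for z = −1.95, without changing the decision.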

REFERENCES:

Garcia, George A. Fundamental Concepts and Methods in Statistics, Manila: University of Sto.
Tomas Publishing House, 2004

Montgomery, Douglas C., et al., Applied Statistics and Probability for Engineers, 7th ed., John
Wiley & Sons (Asia) Pte Ltd, 2018

Walpole, Ronald E., et al., Probability and Statistics for Engineers and Scientists, 9th ed.,
Pearson Education Inc., 2016

CHAPTER TEST

Solve the following problems completely.

1. A company producing lubricating oil claims that the average content of its containers is 20 liters. Test this claim if a random sample of ten containers holds 20.4, 19.4, 20.2, 20.6, 20.2, 19.6, 19.8, 20.8, 20.6, and 19.6 liters. Assume a normal distribution and use the 1% level of significance.

2. It is claimed that a personal vehicle is driven 25,000 kilometers per year on average. Would you agree with this claim if a random sample of 100 vehicle owners who were asked to keep records of their travel showed an average of 28,500 kilometers with a standard deviation of 3,950 kilometers? Use a P-value in your conclusion.

3. A marketing expert for mobile operating systems believes that 40% of users prefer Android. If 9 out of 20 users choose Android over iOS, what can you conclude about the marketing expert’s claim? Use the 5% level of significance.

4. Suppose the volume of the containers of the lubricating oil in Problem 1 is known to be normally distributed. Test the hypothesis that the variance is 0.06 liter against the alternative that the variance is not equal to 0.06 liter. Use the 0.01 level of significance.
