Module 1 - Introduction To Hypothesis Testing

Uploaded by

Kanika1908
Copyright
© © All Rights Reserved

Course: Inferential Statistical Analysis

Module 1: Introduction to Hypothesis Testing

Learning Objectives
• Learn to formulate the Null and Alternative Hypotheses and to classify a hypothesis test as two-tailed or one-tailed based on the Alternative Hypothesis.
• Learn the hypothesis testing process through an example.
• Learn about the different types of errors in the hypothesis testing process.

Introduction
Inferential Statistics refers to understanding the characteristics of a population based
on estimates computed from sample data. It involves formulating a research hypothesis,
testing it on a sample, and generalizing the resulting decision to the population.
Hypothesis Testing is a specific application of inferential statistics that involves
testing claims (formulated as hypotheses) about a group or population based on
the sample data. It includes formulating two types of hypotheses as given below.

Null and Alternative Hypothesis


Null Hypothesis (H0) assumes that there is no significant difference, effect, or
relationship between features in a population. It is the default claim that one
tries to disprove.
Alternative Hypothesis (H1) assumes that there is a significant difference, effect, or
relationship between features in a population. It is the opposite of the Null
Hypothesis and is the claim for which one seeks evidence.

Example: A retail store sells multiple products across different geographical
locations. The manager claims that the average daily sales of product A (μA) is
different from that of product B (μB). The Null and Alternative Hypotheses in this case
will be:

Table 1: Example for Types of Hypotheses

NULL HYPOTHESIS (H0) | ALTERNATIVE HYPOTHESIS (H1)
There is no significant difference in the average daily sales of the two products. | The average daily sales are significantly different for the two products.
Mathematically, H0: μA = μB | Mathematically, H1: μA ≠ μB

In hypothesis testing, one tries to reject the Null Hypothesis rather than to prove the
Alternative Hypothesis directly. The Alternative Hypothesis is accepted only when the
sample provides sufficient evidence to reject the Null Hypothesis.

Some of the key terms used in hypothesis testing are explained below.
Table 2: Key Terms in Hypothesis Testing

# | Key Term | Definition
1 | Level of Significance (α) | Represents the probability of rejecting the Null Hypothesis when it is true. Typical Significance Levels are 10%, 5%, or 1%.
2 | p-value | Represents the probability, assuming the Null Hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. The smaller the p-value, the stronger the evidence against the Null Hypothesis. A large p-value does not prove the Null Hypothesis; it only means the sample gives no significant evidence against it.
3 | Confidence Level (1-α) | Represents the level of certainty that the interval constructed contains the true population parameter, in the sense that a proportion (1-α) of the intervals constructed this way from repeated samples would contain it. Typical Confidence Levels are 90%, 95%, or 99%.
4 | Critical Value | A threshold used to determine whether to reject or not reject the Null Hypothesis. It is obtained from the statistical table of the test chosen for hypothesis testing. It defines the boundary between the rejection and non-rejection regions.
5 | Test Statistic | A numerical value calculated from sample data that helps to assess how likely the sample results are if the Null Hypothesis is true. It helps to decide whether to reject or not reject the Null Hypothesis.
6 | Rejection Region | The area under the curve where, if the test statistic falls in it, the Null Hypothesis is rejected. The area is equal to α.
  | Non-Rejection Region | The area under the curve where, if the test statistic falls in it, the Null Hypothesis is not rejected. The area is equal to 1-α.
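
To make the link between α, the confidence level, and the critical value concrete, the following is a minimal sketch that assumes a two-tailed z-test, so the test statistic follows the standard normal distribution under the Null Hypothesis. It uses `statistics.NormalDist` from the Python standard library; the helper name `two_tailed_critical_value` is introduced here for illustration only.

```python
from statistics import NormalDist

# Standard normal distribution (assumption: a z-test is being used).
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def two_tailed_critical_value(alpha: float) -> float:
    """Return the positive z value that leaves alpha/2 in each tail."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


for alpha in (0.10, 0.05, 0.01):
    z_crit = two_tailed_critical_value(alpha)
    # Area between -z_crit and +z_crit is the non-rejection region (1 - alpha).
    non_rejection_area = STD_NORMAL.cdf(z_crit) - STD_NORMAL.cdf(-z_crit)
    print(
        f"alpha={alpha:.2f}  confidence level={1 - alpha:.2f}  "
        f"critical z=±{z_crit:.3f}  non-rejection area={non_rejection_area:.2f}"
    )
```

For other tests (for example a t-test), the same idea applies, but the critical value comes from that test's own distribution rather than the standard normal.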

Based on how the Alternative Hypothesis is defined, the hypothesis test is classified
as either two-tailed or one-tailed.

Two-Tailed and One-Tailed Test


Hypothesis tests are classified by how the Alternative Hypothesis is formulated,
as follows:

Figure 1: Classification of Hypothesis Tests Based on Alternative Hypothesis

Now, consider an example to understand each type of hypothesis test. Suppose a
test needs to be done to see whether a new drug has any effect on the average
blood pressure of patients. The claim is that the drug affects patients' blood pressure.
Let μ1 be the average blood pressure of patients who take the drug and let μ2
be the average blood pressure of those who don't take it.
Based on how the Alternative Hypothesis is formulated, the test will be
classified as two-tailed or one-tailed (left or right).

Two-Tailed Test: The rejection region lies on both ends of the distribution curve. It is
used when one wants to test if there is a significant difference or effect in either
direction without specifying the direction beforehand. In other words, two-tailed tests
are non-directional, and the Alternative Hypothesis uses the ≠ sign, which allows for both
the greater than (>) and the less than (<) possibility.
Figure 2: Rejection and Non-Rejection Region for Two-Tailed test

For instance, if one wants to test whether the drug affects patients without specifying
whether blood pressure will increase or decrease after taking the medicine, it is a
two-tailed test.
The Null and Alternative Hypothesis in this case will be:

Table 3: Example for Two-Tailed Hypothesis Test

NULL HYPOTHESIS (H0) | ALTERNATIVE HYPOTHESIS (H1)
The average blood pressure of patients who take the drug is equal to the average blood pressure of patients who don't take the drug. | The average blood pressure of patients who take the drug is different from the average blood pressure of patients who don't take the drug.
Mathematically, H0: μ1 = μ2 | Mathematically, H1: μ1 ≠ μ2

One-Tailed Test: The rejection region lies in either the left tail or the right tail of the
distribution curve. It is used when one wants to test if there is a significant difference
or effect in one specific direction. In other words, one-tailed tests are directional, and
the Alternative Hypothesis uses either the greater than (>) or the less than (<) sign.
Based on whether the Alternative Hypothesis uses the less than or the greater than sign, the
one-tailed hypothesis test is classified as below:

Left-Tailed Test: The rejection region lies on the left tail of the distribution curve. It is
used when one wants to test if a parameter is significantly less than a certain value.

Figure 3: Rejection and Non-Rejection Region for Left-Tailed test


For instance, if one wants to test whether the drug decreases the average blood
pressure of patients, then it is a left-tailed test.
The Null and Alternative Hypothesis in this case will be:

Table 4: Example for Left-Tailed Hypothesis Test

NULL HYPOTHESIS (H0) | ALTERNATIVE HYPOTHESIS (H1)
The average blood pressure of patients who take the drug is greater than or equal to the average blood pressure of patients who don't take the drug. | The average blood pressure of patients who take the drug is less than the average blood pressure of patients who don't take the drug.
Mathematically, H0: μ1 ≥ μ2 | Mathematically, H1: μ1 < μ2

Right-Tailed Test: The rejection region lies on the right tail of the distribution curve.
It is used when one wants to test if a parameter is significantly greater than a certain
value.
Figure 4: Rejection and Non-Rejection Region for Right-Tailed test

For instance, if one wants to test whether the drug increases the average blood
pressure of patients, then it is a right-tailed test.
The Null and Alternative Hypothesis in this case will be:

Table 5: Example for Right-Tailed Hypothesis Test

NULL HYPOTHESIS (H0) | ALTERNATIVE HYPOTHESIS (H1)
The average blood pressure of patients who take the drug is less than or equal to the average blood pressure of patients who don't take the drug. | The average blood pressure of patients who take the drug is greater than the average blood pressure of patients who don't take the drug.
Mathematically, H0: μ1 ≤ μ2 | Mathematically, H1: μ1 > μ2
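
The three formulations above differ only in which tail(s) of the distribution count as "extreme". The sketch below illustrates this for a test statistic that is assumed to follow the standard normal distribution under the Null Hypothesis. The function name `p_value_from_z` and the value z = -1.80 are hypothetical and are not taken from the drug example.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, assumed for a z-type statistic


def p_value_from_z(z: float, tail: str) -> float:
    """Return the p-value of z for a 'two', 'left', or 'right' tailed test."""
    if tail == "left":   # H1: mu1 < mu2, small z values are extreme
        return STD_NORMAL.cdf(z)
    if tail == "right":  # H1: mu1 > mu2, large z values are extreme
        return 1.0 - STD_NORMAL.cdf(z)
    if tail == "two":    # H1: mu1 != mu2, both directions are extreme
        return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'two', 'left', or 'right'")


z = -1.80  # hypothetical test statistic, for illustration only
for tail in ("two", "left", "right"):
    print(f"{tail:>5}-tailed p-value: {p_value_from_z(z, tail):.4f}")
```

With the same statistic, the left-tailed p-value is half the two-tailed one, which is why the direction of the Alternative Hypothesis should be fixed before looking at the data.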
Hypothesis Testing Process
The steps followed for hypothesis testing are listed below:

Figure 5: Hypothesis Testing Process

The steps in the hypothesis testing process are explained in detail below.
• Hypothesis testing starts with stating the Null and Alternative Hypotheses
based on the claim to be tested.

• Based on the Alternative Hypothesis defined, the test is classified as
two-tailed or one-tailed (left or right).

• An appropriate statistical test is then chosen to test the hypothesis.

• The next step is to choose the level of significance (α) at which the test will be
conducted.

• A decision rule is then defined, based on which the Null Hypothesis will be
rejected or not rejected. It involves defining the areas (probabilities) of the
rejection and non-rejection regions and finding the critical value.

• There are two equivalent ways to decide whether the Null Hypothesis is rejected
or not rejected.
  o The first is comparing the p-value with α.
    - If the p-value ≤ α, then there is statistically significant evidence
      against the Null Hypothesis, so the Null Hypothesis is rejected
      and the Alternative Hypothesis is accepted.
    - If the p-value > α, then there is no statistically significant
      evidence against the Null Hypothesis, so the Null Hypothesis
      is not rejected. This does not prove the Null Hypothesis; it only
      means the sample does not support the Alternative Hypothesis.
  o The second is comparing the test statistic with the critical value.
    - If the test statistic falls in the rejection region (test statistic ≥
      critical value for a right-tailed test, test statistic ≤ critical value
      for a left-tailed test, or |test statistic| ≥ critical value for a
      two-tailed test), then there is statistically significant evidence
      against the Null Hypothesis. So, the Null Hypothesis is
      rejected and the Alternative Hypothesis is accepted.
    - If the test statistic falls in the non-rejection region, then there is
      no statistically significant evidence against the Null Hypothesis.
      So, the Null Hypothesis is not rejected and the Alternative
      Hypothesis is not supported.

• Sample data that is representative of the population is then collected to test
the hypothesis at the chosen level of significance.

• The test statistic and the p-value are calculated from the sample data for the
chosen statistical test.

• A conclusion is drawn from the above statistical analysis, where the Null
Hypothesis will be either rejected or not rejected.
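
The two decision routes described in the steps above can be sketched as two small functions. This is an illustrative sketch only: it assumes a z-type test statistic (standard normal under the Null Hypothesis), and the function names and the sample values z = 2.10 and p = 0.0357 are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: the test statistic is a z statistic


def reject_by_p_value(p_value: float, alpha: float) -> bool:
    """Route 1: reject H0 when the p-value is at most alpha."""
    return p_value <= alpha


def reject_by_critical_value(z: float, alpha: float, tail: str) -> bool:
    """Route 2: reject H0 when z falls in the rejection region for this tail type."""
    if tail == "two":    # alpha is split equally between both tails
        return abs(z) >= STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    if tail == "right":  # all of alpha sits in the right tail
        return z >= STD_NORMAL.inv_cdf(1.0 - alpha)
    if tail == "left":   # all of alpha sits in the left tail
        return z <= STD_NORMAL.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'left', or 'right'")


# Hypothetical two-tailed result: z = 2.10 with a p-value of about 0.0357.
print(reject_by_p_value(0.0357, alpha=0.05))
print(reject_by_critical_value(2.10, alpha=0.05, tail="two"))
```

Both calls lead to the same decision for the same data, because the p-value and the critical value are two views of the same tail area.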

Example:
Suppose a survey found the average annual income of households in a neighborhood to be
$74914. Since this survey data is ten years old, a random sample of 112 households is
collected to check whether the average income has changed over the years.
The population standard deviation of income is assumed to be $14530.

Using the steps for hypothesis testing, one can conclude whether the income has
changed or not.

Step 1: The Null and Alternative Hypotheses are:
H0: μ = 74914
H1: μ ≠ 74914

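Step 2: Since the population standard deviation is known and the sample mean is to
be tested against a hypothesized value, a z-test will be used. Moreover, the
hypothesis test is two-tailed. The test statistic is:
\[ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \]
where \bar{x} is the sample mean, μ is the hypothesized population mean, σ is the
population standard deviation, and n is the sample size.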
Step 3: Assume the level of significance (α) as 5% or 0.05.

Step 4: A decision rule is formulated, based on which the conclusion will be made.
Since it is a two-tailed test and α = 0.05, the area in each tail is α/2 = 0.025.
Thus, the rejection region lies in both tails of the distribution, with 2.5% of the area in each tail.
The corresponding critical value obtained from the z-table is:
\[ z_{\alpha/2} = \pm 1.96 \]

The decision rule is that if the test statistic is either greater than 1.96 or less than
-1.96, then the Null Hypothesis will be rejected. Equivalently, if the p-value
is less than or equal to 0.05, the Null Hypothesis will be rejected.

Step 5: The sample data from 112 randomly selected households is collected.

Step 6: Suppose the sample mean computed from the collected data is
\bar{x} = 78695. The other values used to compute the test statistic are μ = 74914, σ = 14530,
and n = 112. The value of the test statistic z is:

\[ z = \frac{78695 - 74914}{14530 / \sqrt{112}} \approx 2.75 \]

The corresponding p-value is computed as 0.0060.
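
The arithmetic behind these two numbers can be written out step by step. Here Φ denotes the standard normal cumulative distribution function, a symbol introduced for this illustration, and the factor of 2 appears because the test is two-tailed.

```latex
\begin{aligned}
\frac{\sigma}{\sqrt{n}} &= \frac{14530}{\sqrt{112}} \approx 1372.96 \\
z &= \frac{78695 - 74914}{1372.96} = \frac{3781}{1372.96} \approx 2.75 \\
p\text{-value} &= 2\,\bigl(1 - \Phi(2.75)\bigr) \approx 2 \times 0.0030 = 0.0060
\end{aligned}
```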
The figure below shows the rejection and non-rejection regions based on the critical value
of z, along with the corresponding values of the test statistic, p-value, and α.
Figure 6: Rejection and Non-Rejection Region for the test

Step 7: The calculated value z = 2.75 is greater than the critical value of 1.96, so it
lies in the rejection region. Thus, the Null Hypothesis is rejected.
The same conclusion is reached by comparing the p-value with α. Since the p-value
(0.0060) is less than α (0.05), the Null Hypothesis is rejected.

Therefore, it can be concluded that the average income of households has changed
over the years.
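
The worked example can be reproduced with a short script. The sketch below is one possible way to code it, using only the Python standard library. The function name `one_sample_z_test` and the dictionary keys are illustrative choices, and the function handles only the two-tailed case with a known population standard deviation, as in the example.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test with known population standard deviation."""
    std_normal = NormalDist()
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-tailed p-value: area beyond |z| in both tails.
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Positive critical value leaving alpha/2 in each tail.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return {
        "z": z,
        "p_value": p_value,
        "critical_value": z_crit,
        "reject_h0": abs(z) >= z_crit,
    }


# Values taken from the household income example.
result = one_sample_z_test(sample_mean=78695, mu0=74914, sigma=14530, n=112)
print(f"z = {result['z']:.2f}, critical value = ±{result['critical_value']:.2f}")
print(f"p-value = {result['p_value']:.4f}, reject H0: {result['reject_h0']}")
```

The script does not round z to 2.75 before computing the p-value, so its p-value comes out slightly below the 0.0060 quoted above. The decision to reject the Null Hypothesis is the same either way.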

Type of Errors
It may happen that the wrong conclusion is reached after hypothesis testing. Such
wrong conclusions are known as errors, and they are broadly classified as Type I and
Type II errors.
Figure 7: Type of Errors

For example, suppose a person is being prosecuted for a crime, and the Null Hypothesis
is that he is innocent.
Type I error - Rejecting the Null Hypothesis when it is true, i.e., the person is convicted
even though he is innocent. The probability of a Type I error is the level of
significance (α).
Type II error - Not rejecting the Null Hypothesis when it is false, i.e., the person is
acquitted even though he is guilty.
The goal of a hypothesis test is to keep the probability of both errors low. However,
for a fixed sample size, reducing the probability of one error increases the probability
of the other.
So, the probability of Type I error is fixed in advance by choosing α, and one then tries
to reduce the probability of Type II error.
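
The trade-off between the two error types can be illustrated numerically. The sketch below reuses the household income setting (hypothesized mean 74914, σ = 14530, n = 112) and assumes, purely for illustration, that the true mean is 78000. It then computes the probability of a Type II error, written β here (a symbol the text does not use), for several values of α. The function name `type_ii_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Probability of not rejecting H0: mu = mu0 (two-tailed z-test) when the true mean is mu_true."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Under the true mean, z is normal with this mean and standard deviation 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # H0 is not rejected when z lands between -z_crit and +z_crit.
    return STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)


for alpha in (0.10, 0.05, 0.01):
    # mu_true = 78000 is an assumed value chosen only for this illustration.
    beta = type_ii_error(mu0=74914, mu_true=78000, sigma=14530, n=112, alpha=alpha)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

With the sample size held fixed, a smaller α moves the critical values outward, which makes a false Null Hypothesis harder to reject and so raises β.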

Key Takeaways
• This reading explains the Null and Alternative Hypotheses and the two-tailed
and one-tailed tests with examples.
• The hypothesis testing process is explained with an example.
• The types of errors are also explained with an example.
