Session 2 On Hypothesis Testing
Session 2 On Hypothesis Testing
In simple words p-value is a measure of the strength of the evidence against the Null
Hypothesis that is provided by our sample data.
1. Very small p-values (e.g., p < 0.01) indicate strong evidence against the null hypothesis,
suggesting that the observed effect or difference is unlikely to have occurred by chance
alone.
2. Small p-values (e.g., 0.01 ≤ p < 0.05) indicate moderate evidence against the null
hypothesis, suggesting that the observed effect or difference is less likely to have
occurred by chance alone.
3. Large p-values (e.g., 0.05 ≤ p < 0.1) indicate weak evidence against the null hypothesis,
suggesting that the observed effect or difference might have occurred by chance alone,
but there is still some level of uncertainty.
4. Very large p-values (e.g., p ≥ 0.1) indicate weak or no evidence against the null
hypothesis, suggesting that the observed effect or difference is likely to have occurred by
chance alone.
Suppose a company is evaluating the impact of a new training program on the productivity of its employees. The
company has data on the average productivity of its employees before implementing the training program. The
average productivity was 50 units per day. After implementing the training program, the company measures the
productivity of a random sample of 30 employees. The sample has an average productivity of 53 units per day and
the pop std is 4. The company wants to know if the new training program has significantly increased productivity.
Suppose a snack food company claims that their Lays wafer packets contain an average weight of 50 grams per
packet. To verify this claim, a consumer watchdog organization decides to test a random sample of Lays wafer
packets. The organization wants to determine whether the actual average weight differs significantly from the
claimed 50 grams. The organization collects a random sample of 40 Lays wafer packets and measures their
weights. They find that the sample has an average weight of 49 grams, with a pop standard deviation of 5
grams.
A t-test is a statistical test used in hypothesis testing to compare the means of two samples or
to compare a sample mean to a known population mean. The t-test is based on the t-
distribution, which is used when the population standard deviation is unknown and the
sample size is small.
One-sample t-test: The one-sample t-test is used to compare the mean of a single sample to a
known population mean. The null hypothesis states that there is no significant difference
between the sample mean and the population mean, while the alternative hypothesis states
that there is a significant difference.
Independent two-sample t-test: The independent two-sample t-test is used to compare the
means of two independent samples. The null hypothesis states that there is no significant
difference between the means of the two samples, while the alternative hypothesis states that
there is a significant difference.
Paired t-test (dependent two-sample t-test): The paired t-test is used to compare the means of
two samples that are dependent or paired, such as pre-test and post-test scores for the same
group of subjects or measurements taken on the same subjects under two different
conditions. The null hypothesis states that there is no significant difference between the
means of the paired differences, while the alternative hypothesis states that there is a
significant difference.
A one-sample t-test checks whether a sample mean differs from the population mean.
Suppose a manufacturer claims that the average weight of their new chocolate bars is 50
grams, we highly doubt that and want to check this so we drew out a sample of 25 chocolate
bars and measured their weight, the sample mean came out to be 49.7 grams and the sample
std deviation was 1.2 grams. Consider the significance level to be 0.05
2. Normality: The data in each of the two groups should be approximately normally
distributed. The t-test is considered robust to mild violations of normality, especially
when the sample sizes are large (typically n ≥ 30) and the sample sizes of the two groups
are similar. If the data is highly skewed or has substantial outliers, consider using a non-
parametric test, such as the Mann-Whitney U test.
4. Random sampling: The data should be collected using a random sampling method from
the respective populations. This ensures that the sample is representative of the
population and reduces the risk of selection bias.
Suppose a website owner claims that there is no difference in the average time spent on their
website between desktop and mobile users. To test this claim, we collect data from 30
desktop users and 30 mobile users regarding the time spent on the website in minutes. The
sample statistics are as follows:
desktop users = [12, 15, 18, 16, 20, 17, 14, 22, 19, 21, 23, 18, 25, 17, 16, 24, 20, 19, 22, 18, 15,
14, 23, 16, 12, 21, 19, 17, 20, 14]
mobile_users = [10, 12, 14, 13, 16, 15, 11, 17, 14, 16, 18, 14, 20, 15, 14, 19, 16, 15, 17, 14, 12,
11, 18, 15, 10, 16, 15, 13, 16, 11]
Desktop users:
○ Sample size (n1): 30
○ Sample mean (mean1): 18.5 minutes
○ Sample standard deviation (std_dev1): 3.5 minutes
Mobile users:
○ Sample size (n2): 30
○ Sample mean (mean2): 14.3 minutes
○ Sample standard deviation (std_dev2): 2.7 minutes
We will use a significance level (α) of 0.05 for the hypothesis test.
2. Matched or correlated groups: Comparing the performance of two groups that are
matched or correlated in some way, such as siblings or pairs of individuals with similar
characteristics.
Assumptions
1. Paired observations: The two sets of observations must be related or paired in some way,
such as before-and-after measurements on the same subjects or observations from
matched or correlated groups.
Let's assume that a fitness center is evaluating the effectiveness of a new 8 -week weight loss
program. They enroll 15 participants in the program and measure their weights before and
after the program. The goal is to test whether the new weight loss program leads to a
significant reduction in the participants' weight.