AS Lecture 09 (T-Test)
Lecture # 09
𝒕 test
• A 𝑡 test is a statistical test that is used to compare the means of
two groups.
• It is often used in hypothesis testing to determine whether a
process or treatment actually has an effect on the population of
interest, or whether two groups are different from one another.
𝒕 test Example
You want to know whether the mean petal length of iris flowers
differs according to their species. You find two different species of
irises growing in a garden and measure 25 petals of each species. You
can test the difference between these two groups using a t test
and null and alternative hypotheses.
When to use t test
• A t test can only be used when comparing the means of two
groups (a.k.a. pairwise comparison). If you want to compare more
than two groups, or if you want to do multiple pairwise
comparisons, use an ANOVA test or a post-hoc test.
• The t test is a parametric test of difference, meaning that it makes
the same assumptions about your data as other parametric tests.
The t test assumes your data:
• are independent
• are (approximately) normally distributed
• have a similar amount of variance within each group being
compared (a.k.a. homogeneity of variance)
What type of t test should we use?
• When choosing a t test, you will need to consider two things:
whether the groups being compared come from a
single population or two different populations, and whether you
want to test the difference in a specific direction.
One sample, two sample, or paired t test
• If the groups come from a single population (e.g., measuring
before and after an experimental treatment), perform
a paired t test. This is a within-subjects design.
• If the groups come from two different populations (e.g., two
different species, or people from two separate cities), perform
a two-sample t test (a.k.a. independent t test). This is a between-
subjects design.
• If there is one group being compared against a standard value
(e.g., comparing the acidity of a liquid to a neutral pH of 7),
perform a one-sample t test.
One tailed or two tailed t-test
• If you only care whether the two populations are different from
one another, perform a two-tailed t test.
• If you want to know whether one population mean is greater than
or less than the other, perform a one-tailed t test.
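The two choices described above (which design, and which direction) can be sketched as a small helper. This is only an illustration: the function names `choose_t_test` and `choose_tails` and their parameters are hypothetical and do not come from any library.

```python
def choose_t_test(n_groups, same_subjects=False):
    """Pick the t test variant from the study design (hypothetical helper)."""
    if n_groups == 1:
        # One group compared against a standard value (e.g. a neutral pH of 7).
        return "one-sample t test"
    if n_groups == 2:
        # Same subjects measured twice -> paired; separate populations -> two-sample.
        return "paired t test" if same_subjects else "two-sample t test"
    # More than two groups is outside the scope of a t test.
    raise ValueError("use ANOVA or a post-hoc test for more than two groups")


def choose_tails(directional):
    """One-tailed if the question is 'greater/less than', else two-tailed."""
    return "one-tailed" if directional else "two-tailed"


# Before/after measurements on the same subjects, no direction specified.
print(choose_t_test(2, same_subjects=True), "/", choose_tails(False))
# Two separate populations, asking whether one mean is greater.
print(choose_t_test(2), "/", choose_tails(True))
# One group against a standard value.
print(choose_t_test(1), "/", choose_tails(False))
```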
One Sample t-test
• A one-sample t-test is a statistical test that helps us figure out if
the average (or mean) value of a sample of data is different from a
known or expected value. It's like checking if the average result we
got is significantly different from what we thought it should be.
• For example, imagine you have a group of students, and you
believe the average score in a test should be around 75. You then
collect actual test scores from the students and use a one-sample
t-test to see if the average score you got from your group of
students is really different from 75 or if it could be just due to
chance.
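One Sample t-test
$$t_c = \frac{\text{Diff.}}{\text{S.E.}}$$
• $\text{Diff.} = \bar{x} - \mu$
• $\text{S.E.} = \dfrac{s}{\sqrt{n}}$
• $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
• $\bar{x} = \dfrac{\sum x}{n}$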
One Sample t test: Example
An automatic machine was installed to fill each bottle with 170 tablets
of a particular medicine. A sample of 9 bottles was taken from
production, and the number of tablets found in each bottle is given
below. Test whether the machine has been installed properly or not.
$x$        $x - \bar{x}$    $(x - \bar{x})^2$
168        -0.11            0.01
164        -4.11            16.89
166        -2.11            4.45
167        -1.11            1.23
168        -0.11            0.01
169         0.89            0.79
170         1.89            3.57
170         1.89            3.57
171         2.89            8.35
Total 1513                  38.87
We test $H_0: \mu = 170$ against $H_1: \mu \neq 170$.
• $\bar{x} = \dfrac{\sum x}{n} = \dfrac{1513}{9} = 168.11$
• $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{38.87}{9 - 1}} = 2.20$
• $\text{S.E.} = \dfrac{s}{\sqrt{n}} = \dfrac{2.20}{\sqrt{9}} = 0.73$
• $\text{Diff.} = \bar{x} - \mu = 168.11 - 170 = -1.89$, so $|\text{Diff.}| = 1.89$
$$t_c = \frac{|\text{Diff.}|}{\text{S.E.}} = \frac{1.89}{0.73} = 2.59$$
Now we find the table value $t_t$:
• $df = n - 1 = 9 - 1 = 8$
• Level of significance = 0.05
From the t-table, $t_{t,0.05,8} = 2.306$.
If $t_c \le t_t$, we do not reject (accept) $H_0$; otherwise we reject $H_0$.
Here $2.59 > 2.306$, i.e. $t_c > t_t$; therefore we reject $H_0$. The data do
not support the claim that the machine fills 170 tablets per bottle on average.
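The hand calculation for the tablet example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `one_sample_t` is hypothetical, and the critical value 2.306 is simply the t-table value quoted in the example rather than something the code computes. Because the script does not round intermediate steps, its statistic can differ from the hand-calculated one in the second decimal place, and its sign is negative because the sample mean lies below 170.

```python
import math


def one_sample_t(sample, mu):
    """Return (t_c, df) for a one-sample t test against the value mu."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample standard deviation with the n - 1 divisor.
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    se = s / math.sqrt(n)  # standard error of the mean
    return (mean - mu) / se, n - 1


tablets = [168, 164, 166, 167, 168, 169, 170, 170, 171]
t_c, df = one_sample_t(tablets, mu=170)

T_TABLE = 2.306  # two-tailed table value at the 0.05 level for df = 8 (from the lecture)
reject = abs(t_c) > T_TABLE  # two-tailed decision uses |t_c|

print(f"t_c = {t_c:.2f}, df = {df}, reject H0: {reject}")
# prints: t_c = -2.57, df = 8, reject H0: True
```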
One Sample t test: Practice Problem
A random sample drawn from a normal population is given below. Test
the hypothesis $H_0: \mu = 578$ against $H_1: \mu \neq 578$.
$x$: 578, 572, 570, 568, 572, 578, 570, 572, 596, 544
Two Sample t-test
• A two-sample t-test is a statistical test used to find out if there's a
significant difference between the average (or mean) values of
two separate groups. It helps us determine if the differences we
observe in the averages of these groups are likely to be real
differences or if they could have happened by chance.
• For example, if you want to compare the test scores of two
different classes, you can use a two-sample t-test to figure out if
the average scores of the two classes are truly different from each
other or if the difference could be due to random variation.
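Two Sample t-test
$$t_c = \frac{\text{Diff.}}{\text{S.E.}}$$
• $\text{Diff.} = \bar{x}_1 - \bar{x}_2$
• $\text{S.E.} = s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
• $s = \sqrt{\dfrac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}$
• $\bar{x}_1 = \dfrac{\sum x_1}{n_1}$
• $\bar{x}_2 = \dfrac{\sum x_2}{n_2}$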
Two Sample t test: Example
The nicotine content (in mg) of two tobacco samples is shown below:
Sample A    Sample B
24          27
27          30
26          28
21          31
25          32
---         36
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
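$x_1$   $x_2$   $x_1 - \bar{x}_1$   $(x_1 - \bar{x}_1)^2$   $x_2 - \bar{x}_2$   $(x_2 - \bar{x}_2)^2$
24      27      -0.6                0.36                    -3.67               13.47
27      30       2.4                5.76                    -0.67               0.45
26      28       1.4                1.96                    -2.67               7.13
21      31      -3.6                12.96                    0.33               0.11
25      32       0.4                0.16                     1.33               1.77
---     36      ---                 ---                      5.33               28.41
Total
123     184                         21.20                                       51.34
• $\bar{x}_1 = \dfrac{\sum x_1}{n_1} = \dfrac{123}{5} = 24.6$
• $\bar{x}_2 = \dfrac{\sum x_2}{n_2} = \dfrac{184}{6} = 30.67$
• $s = \sqrt{\dfrac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{21.2 + 51.34}{5 + 6 - 2}} = 2.84$
• $\text{S.E.} = s \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 2.84 \sqrt{\dfrac{1}{5} + \dfrac{1}{6}} = 1.72$
• $\text{Diff.} = \bar{x}_1 - \bar{x}_2 = 24.6 - 30.67 = -6.07$, so $|\text{Diff.}| = 6.07$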
$$t_c = \frac{|\text{Diff.}|}{\text{S.E.}} = \frac{6.07}{1.72} = 3.53$$
Now we find the table value $t_t$:
• $df = n_1 + n_2 - 2 = 11 - 2 = 9$
• Level of significance = 0.05
From the t-table, $t_{t,0.05,9} = 2.262$.
If $t_c \le t_t$, we do not reject (accept) $H_0$; otherwise we reject $H_0$.
Here $3.53 > 2.262$, i.e. $t_c > t_t$; therefore we reject $H_0$.
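The nicotine example can be reproduced with a short script. This is a minimal sketch of the pooled (equal-variance) two-sample statistic defined in this lecture, using only the Python standard library. The function name `pooled_two_sample_t` is hypothetical, and the critical value 2.262 is the t-table value quoted in the example, not computed by the code. The statistic is negative because Sample A has the smaller mean; the two-tailed decision uses its absolute value.

```python
import math


def pooled_two_sample_t(a, b):
    """Return (t_c, df) for a two-sample t test with a pooled standard deviation."""
    n1, n2 = len(a), len(b)
    mean1, mean2 = sum(a) / n1, sum(b) / n2
    # Sums of squared deviations within each sample.
    ss1 = sum((x - mean1) ** 2 for x in a)
    ss2 = sum((x - mean2) ** 2 for x in b)
    df = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / df)  # pooled standard deviation
    se = s * math.sqrt(1 / n1 + 1 / n2)  # standard error of the difference
    return (mean1 - mean2) / se, df


sample_a = [24, 27, 26, 21, 25]
sample_b = [27, 30, 28, 31, 32, 36]
t_c, df = pooled_two_sample_t(sample_a, sample_b)

T_TABLE = 2.262  # two-tailed table value at the 0.05 level for df = 9 (from the lecture)
reject = abs(t_c) > T_TABLE

print(f"t_c = {t_c:.2f}, df = {df}, reject H0: {reject}")
# prints: t_c = -3.53, df = 9, reject H0: True
```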
Two Sample t test: Practice Problem
Two horses were made to run a particular distance. The times they
took, in seconds, are given below. Using a t test on these data, test
whether horse A runs faster than horse B.
Horse A    29    30    32    33    32    29    34
Horse B    29    30    30    24    27    29    ---
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 > \mu_2$
Paired t-test
• A paired t-test is a statistical method used to compare the means of two
sets of related data. It's designed for situations where each data point
in one set is paired or matched with a specific data point in the other
set. This pairing could arise from repeated measurements on the same
subjects, individuals, or items under different conditions or times.
• In simpler terms, a paired t-test is used when you have two sets of data
that are somehow connected, and you want to determine if there's a
significant difference between their means.
• For example, if you're testing the effect of a new drug on patients, you
might measure their blood pressure before and after taking the drug.
Each patient's before and after measurements are paired, and a paired
t-test would help you assess whether there's a significant change in
blood pressure due to the drug.
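Paired t-test
$$t_c = \frac{\bar{d}}{s / \sqrt{n}} = \frac{\bar{d}\,\sqrt{n}}{s}$$
• $\bar{d} = \dfrac{\sum d}{n}$
• $d = \text{2nd measurement} - \text{1st measurement}$
• $s = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}}$
Because $s$ already uses the $n - 1$ divisor, the standard error is
$s / \sqrt{n}$, exactly as in the one-sample t test applied to the differences $d$.
Paired t test: Example
The sales in 6 shops before and after a sales campaign are shown
below. Can we say that the campaign was successful? Use the 5% level
of significance.
Shop    Sales before campaign    Sales after campaign
1       53                       58
2       28                       32
3       32                       30
4       48                       50
5       50                       56
6       42                       45
$H_0: \mu_{\text{after}} = \mu_{\text{before}}$
$H_1: \mu_{\text{after}} > \mu_{\text{before}}$ (one-tailed)
$x$     $y$     $d = y - x$    $d - \bar{d}$    $(d - \bar{d})^2$
53      58       5              2                4
28      32       4              1                1
32      30      -2             -5                25
48      50       2             -1                1
50      56       6              3                9
42      45       3              0                0
Total           18                               40
• $\bar{d} = \dfrac{\sum d}{n} = \dfrac{18}{6} = 3$
• $s = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{40}{6 - 1}} = 2.83$
$$t_c = \frac{\bar{d}\,\sqrt{n}}{s} = \frac{3 \cdot \sqrt{6}}{2.83} = 2.60$$
Now we find the table value $t_t$:
• $df = n - 1 = 6 - 1 = 5$
• Level of significance = 0.05 (one-tailed)
From the t-table, $t_{t,0.05,5} = 2.015$.
If $t_c \le t_t$, we do not reject (accept) $H_0$; otherwise we reject $H_0$.
Here $2.60 > 2.015$, i.e. $t_c > t_t$; therefore we reject $H_0$ and conclude
that the campaign was successful.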
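The sales example can be checked with a short script. This is a minimal sketch, assuming the paired statistic is the one-sample t test applied to the differences, $t_c = \bar{d} / (s / \sqrt{n})$, with $s$ computed using the $n - 1$ divisor. The function name `paired_t` is hypothetical, and the critical value 2.015 is the one-tailed t-table value quoted in the example rather than something the code computes.

```python
import math


def paired_t(before, after):
    """Return (t_c, df) for a paired t test on d = after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    d = [y - x for x, y in zip(before, after)]
    n = len(d)
    d_bar = sum(d) / n
    # Standard deviation of the differences with the n - 1 divisor.
    s = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    se = s / math.sqrt(n)  # standard error of the mean difference
    return d_bar / se, n - 1


sales_before = [53, 28, 32, 48, 50, 42]
sales_after = [58, 32, 30, 50, 56, 45]
t_c, df = paired_t(sales_before, sales_after)

T_TABLE = 2.015  # one-tailed table value at the 0.05 level for df = 5 (from the lecture)
reject = t_c > T_TABLE  # one-tailed: only an increase in sales counts

print(f"t_c = {t_c:.2f}, df = {df}, reject H0: {reject}")
# prints: t_c = 2.60, df = 5, reject H0: True
```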
Paired t test: Practice Problem
A group of 10 children were tested to find out how many digits they
could repeat from memory after hearing them once. They were given
practice at the test during the next week and were then re-tested. Is the
difference between the children's performance on the two tests
significant?
Child                       1   2   3   4   5   6   7   8   9   10
Test 1 (before practice)    6   5   4   7   8   6   7   5   6   8
Test 2 (after practice)     7   7   6   7   9   6   8   6   6   10
Acknowledgment
• [Peter Andrew Bruce] Practical Statistics for Data Scientists
• [David Forsyth] Probability and Statistics for Computer Science
• [Michael Baron] Probability and Statistics for Computer Scientists