
Basic Information on the t-Test

 
Hypothesis: The hypothesis is a tentative explanation based on observations you have made.  You may also have searched the literature for more information before developing your hypothesis.
 
Example: Men’s hands are larger than women’s hands OR adding fertilizer to a plant makes it grow better.
 

Null hypothesis:  The null hypothesis is a more formal statement of your original hypothesis.  It is usually written in the following form:  There is no significant difference between population A and population B.
 
Example:  There is no significant difference in hand size between males and females.  OR  There is no
significant difference in the growth of fertilized plants vs. unfertilized plants.
 
The reason we write it in this form is that scientists are basically skeptics and their goal is to prove a hypothesis
false.  In fact, you can never really prove that a hypothesis is true.  In addition, the null hypothesis is used
because it allows you to relate your calculations of the difference between the sample means to a standard of
zero.  

The t-Test:  We use this statistical test to compare our sample populations and determine if there is a
significant difference between their means. The result of the t-test is a ‘t’ value; this value is then used to
determine the p-value (see below).

T-Test: A statistical examination of two population means. A two-sample t-test examines whether two samples
are different and is commonly used when the variances of two normal distributions are unknown and when an
experiment uses a small sample size. For example, a t-test could be used to compare the average floor routine
score of the U.S. women's Olympic gymnastic team to the average floor routine score of China's women's team.
 

P-value: The p-value is the probability of obtaining a ‘t’ value at least as extreme as the one observed if the null hypothesis were true.  In other words, this is the value you use to determine if the difference between the means in your sample populations is significant.  For our purposes, a p-value < 0.05 suggests a significant difference between the means of our sample populations and we would reject our null hypothesis.  A p-value > 0.05 suggests no significant difference between the means of our sample populations and we would not reject our null hypothesis.
 

Types of t-tests:  There are two types of t-tests that we will use in this course: the unpaired and the paired t-test.
 
            Unpaired t-test:  This type of t-test is used when you have independent samples.  In other words,
your samples are not directly related to one another.  Ex.: Index finger length between males and females.
 
            Paired t-test:  In this t-test your samples are related.  You collected data before and after some
manipulation of your subjects.  Ex.: Pulse before and after 3 cups of coffee.
 
Degrees of Freedom: Estimates of parameters can be based upon different amounts of information. The
number of independent pieces of information that go into the estimate of a parameter is called the degrees of
freedom (df). In general, the degrees of freedom of an estimate is equal to the number of independent scores
that go into the estimate minus the number of parameters estimated as intermediate steps in the estimation of the
parameter itself. For example, if the variance, σ², is to be estimated from a random sample of N independent
scores, then the degrees of freedom is equal to the number of independent scores (N) minus the number of
parameters estimated as intermediate steps (one, μ estimated by M) and is therefore equal to N-1. 

How to do simple t-tests


These are statistical tests that will tell you if there is a significant difference between two sets of data, or if
the average of a set of data differs significantly from a predicted value.

The results of these tests are only valid when the data are normally-distributed. If the data are not normally-
distributed, use another statistical test, such as the Mann-Whitney U-test.

Explanation of terms

n = The sample size


x̄ = The mean of a sample
µ = The theoretical mean of a population
s = The standard deviation
var = The variance (equal to s²)

The standard deviation (s) can be calculated using the formula:

s = √[ Σ(x − x̄)² / (n − 1) ]
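As a minimal sketch of this formula in Python (Python is our illustration here, not part of the original handout):

```python
import math

def sample_sd(xs):
    """Sample standard deviation: s = sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
```

Note the n − 1 in the denominator; dividing by the sample size minus one is what makes s an unbiased-variance estimate, and it is also where the "minus one" in the degrees of freedom comes from.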
 One-sample t-test

To test whether the mean of a sample (x̄) differs significantly from a predicted value (µ)...

Calculate the 'standard error of the mean' (SEM):

SEM = s / √n
Calculate the t-statistic:

t = (x̄ − µ) / SEM
Use the table of critical values (below) to find out whether or not the result is significant.
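The two calculation steps (SEM = s / √n, then t = (x̄ − µ) / SEM) can be combined into one small helper; a sketch in Python, which is our illustration language rather than the document's:

```python
import math

def one_sample_t(xs, mu):
    """One-sample t-statistic: t = (mean - mu) / (s / sqrt(n))."""
    n = len(xs)
    mean = sum(xs) / n
    # Sample standard deviation with n - 1 in the denominator
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    sem = s / math.sqrt(n)          # standard error of the mean
    return (mean - mu) / sem
```

The absolute value of the result is then compared against the critical value for d.f. = n − 1 in the table of critical values.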

 Two-sample t-test


To test whether the mean of one sample (x̄₁) differs significantly from the mean of another sample (x̄₂)...
Calculate the 'standard error of the mean' (SEM):

SEM = √( var₁/n₁ + var₂/n₂ )
Calculate the t-statistic:

t = (x̄₁ − x̄₂) / SEM
Use the table of critical values (below) to find out whether or not the result is significant.
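A corresponding sketch of the unpaired two-sample statistic in Python, using the unpooled SEM = √(var₁/n₁ + var₂/n₂) described above:

```python
import math

def two_sample_t(xs, ys):
    """Two-sample t-statistic: t = (m1 - m2) / sqrt(var1/n1 + var2/n2)."""
    def stats(zs):
        n = len(zs)
        m = sum(zs) / n
        var = sum((z - m) ** 2 for z in zs) / (n - 1)   # sample variance
        return n, m, var
    n1, m1, v1 = stats(xs)
    n2, m2, v2 = stats(ys)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
```

A positive t means the first sample's mean is larger; only the magnitude matters when comparing against the critical value.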

 Table of critical values


The table below gives the t-value at which the result has a particular level of 'significance'.

d.f. is the number of 'degrees of freedom'. For the one-sample test, d.f. = n − 1; for the two-sample test, d.f. = n₁ + n₂ − 2.


(If the exact d.f. value that you want is not included in the table, use the closest value below it that is included.)

p is the probability that the difference between two samples, or the difference between a sample and the
theoretical result, is entirely due to chance.

d.f. p=0.1 p=0.05 p=0.01


2 2.92 4.30 9.92
3 2.35 3.18 5.84
4 2.13 2.78 4.60
5 2.02 2.57 4.03
6 1.94 2.45 3.71
7 1.89 2.36 3.50
8 1.86 2.31 3.36
9 1.83 2.26 3.25
10 1.81 2.23 3.17
11 1.80 2.20 3.11
12 1.78 2.18 3.05
13 1.77 2.16 3.01
14 1.76 2.14 2.98
15 1.75 2.13 2.95
16 1.75 2.12 2.92
17 1.74 2.11 2.90
18 1.73 2.10 2.88
19 1.73 2.09 2.86
20 1.72 2.09 2.85
21 1.72 2.08 2.83
22 1.72 2.07 2.82
23 1.71 2.07 2.81
24 1.71 2.06 2.80
25 1.71 2.06 2.79
26 1.71 2.06 2.78
27 1.70 2.05 2.77
28 1.70 2.05 2.76
29 1.70 2.05 2.76
30 1.70 2.04 2.75
35 1.69 2.03 2.72
40 1.68 2.02 2.70
45 1.68 2.01 2.69
50 1.68 2.01 2.68
60 1.67 2.00 2.66
70 1.67 1.99 2.65
80 1.66 1.99 2.64
90 1.66 1.99 2.63
100 1.66 1.98 2.63
Infinity 1.64 1.96 2.58

Example: suppose that a t-test on a sample of 10 individuals (d.f. = 9) produced a t-value of 3.0. The table tells
us that p is between 0.01 and 0.05 in this case (p=0.05 when t=2.26, and p=0.01 when t=3.25; our t-value lies in
between these two). Therefore, the probability of the result arising by chance is less than 5% (p<0.05), so this is
a fairly significant result.
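The look-up in that example can be written out mechanically. A small Python sketch (our illustration, not part of the original), using two rows copied from the table above:

```python
# Two rows from the critical-value table: df -> (t at p=0.1, p=0.05, p=0.01)
CRITICAL = {9: (1.83, 2.26, 3.25), 10: (1.81, 2.23, 3.17)}

def significance(t, df):
    """Return the strongest significance level that |t| reaches for this df."""
    t10, t05, t01 = CRITICAL[df]
    t = abs(t)
    if t >= t01:
        return "p < 0.01"
    if t >= t05:
        return "p < 0.05"
    if t >= t10:
        return "p < 0.1"
    return "not significant at p = 0.1"
```

For the worked example, significance(3.0, 9) returns "p < 0.05", matching the conclusion above.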

T-Test (Independent Samples)

Note: The following discusses the unranked “independent samples t-test”, the most common form of t-test.

Definition
A t-test helps you compare whether two groups have different average values (for example, whether men and women have different average heights).

Example
Let’s say you’re curious about whether New Yorkers and Kansans spend a different amount of money per
month on movies. It’s impractical to ask every New Yorker and Kansan about their movie spending, so instead
you ask a sample of each—maybe 300 New Yorkers and 300 Kansans—and the averages are $14 and $18. The
t-test asks whether that difference is probably representative of a real difference between Kansans and New
Yorkers generally or whether that is most likely a meaningless statistical fluke.

Technically, it asks the following: If there were in fact no difference between Kansans and New Yorkers
generally, what are the chances that randomly selected groups from those populations would be as different as
these randomly selected groups are? For example, if Kansans and New Yorkers as a whole actually spent the
same amount of money on average, it is very unlikely that 300 randomly selected Kansans would average $14
while 300 randomly selected New Yorkers averaged $18. So if your sampling yielded those results, you would
conclude that the difference in the sample groups is most likely representative of a meaningful difference
between the populations as a whole.

Definition

A t-test asks whether a difference between two groups’ averages is unlikely to have occurred because of random
chance in sample selection. A difference is more likely to be meaningful and “real” if
(1) the difference between the averages is large,
(2) the sample size is large, and
(3) responses are consistently close to the average values and not widely spread out (the standard deviation is
low).

The t-test’s statistical significance and the t-test’s effect size are the two primary outputs of the t-test. Statistical
significance indicates whether the difference between sample averages is likely to represent an actual difference
between populations (as in the example above), and the effect size indicates whether that difference is large
enough to be practically meaningful.
The “One Sample T-Test” is similar to the “Independent Samples T-Test” except it is used to compare one
group’s average value to a single number (for example, do Kansans on average spend more than $13 per month
on movies?). For practical purposes you can look at the confidence interval around the average value to gain
this same information.

The “paired t-test” is used when each observation in one group is paired with a related observation in the other
group. For example, do Kansans spend more money on movies in January or in February, where each
respondent is asked about their January and their February spending? In effect a paired t-test subtracts each
respondent’s January spending from their February spending (yielding the increase in spending), then takes the
average of all those increases in spending and looks to see whether that average is statistically significantly
greater than zero (using a one-sample t-test).

The “ranked independent samples t-test” asks a similar question to the typical unranked test but it is more
robust to outliers (a few bad outliers can make the results of an unranked t-test invalid).
Making Sense of the Two-Sample T-Test

The two-sample t-test is one of the most commonly used hypothesis tests in Six Sigma work. It is applied to
compare whether the average difference between two groups is really significant or if it is due instead to
random chance. It helps to answer questions like whether the average success rate is higher after implementing
a new sales tool than before or whether the test results of patients who received a drug are better than test results
of those who received a placebo.

Here is an example starting with the absolute basics of the two-sample t-test. The question being answered is
whether there is a significant (or only random) difference in the average cycle time to deliver a pizza from Pizza
Company A vs. Pizza Company B. This is the data collected from a sample of deliveries of Company A and
Company B. 

Table 1: Pizza Company A Versus Pizza Company B Sample Deliveries


A B
20.4 20.2
24.2 16.9
15.4 18.5
21.4 17.3
20.2 20.5
18.5  
21.5  
To perform this test, both samples must be normally distributed.

Since both samples have a p-value above 0.05 (or 5 percent) it can be concluded that both samples are normally
distributed. The test for normality is here performed via the Anderson-Darling test, for which the null hypothesis
is “Data are normally distributed” and the alternative hypothesis is “Data are not normally distributed.” 

Using the two-sample t-test, statistics software generates the output in Table 2. 

Table 2: Two-Sample T-Test and Confidence Interval for A Sample and B Sample
  N Mean Standard Deviation SE Mean
A Sample 7 20.23 2.74 1.0
B Sample 5 18.68 1.64 0.73
Difference = mu (A Sample) – mu (B Sample)
Estimate for difference: 1.54857
95% CI for difference: (-1.53393, 4.63107)
T-test of difference = 0 (vs not =): T-value = 1.12, P-value = 0.289, DF = 10
Both use pooled StDev = 2.3627 
Since the p-value is 0.289, i.e. greater than 0.05 (or 5 percent), it can be concluded that there is no difference
between the means. To say that there is a difference is taking a 28.9 percent risk of being wrong. 
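The summary statistics and t-value in Table 2 can be reproduced from the raw deliveries in Table 1. A Python sketch (Python is our illustration language; the document used generic statistics software):

```python
import math

a = [20.4, 24.2, 15.4, 21.4, 20.2, 18.5, 21.5]   # Company A deliveries
b = [20.2, 16.9, 18.5, 17.3, 20.5]               # Company B deliveries

def mean_var(zs):
    n = len(zs)
    m = sum(zs) / n
    return n, m, sum((z - m) ** 2 for z in zs) / (n - 1)

n1, m1, v1 = mean_var(a)
n2, m2, v2 = mean_var(b)
# Pooled standard deviation (assumes equal population variances)
sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
print(round(m1 - m2, 5), round(sp, 4), round(t, 2))   # 1.54857 2.3627 1.12
```

These match the "Estimate for difference", "pooled StDev" and "T-value" lines of the software output above.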
If the two-sample t-test is being used as a tool for practical Six Sigma use, that is enough to know. The rest of
the article, however, discusses understanding the two-sample t-test, which is easy to use but not so easy to
understand. 

So How Does It Work?

Actually, if one subtracts the means from two samples, in most cases, there will be a difference. So the real
question is not really whether the sample means are the same or different. The correct question is whether the
population means are the same (i.e., are the two samples coming from the same or different populations)? 

Hence, x̄₁ − x̄₂ will most often be unequal to zero. 

However, if the population means are the same, µ₁ − µ₂ will equal zero. The trouble is, only two samples exist.
The question that must be answered is whether µ₁ − µ₂ is zero or not. 

The first step is to understand how the one-sample t-test works. Knowing this helps to answer questions like in
the following example: A supplier of a part to a large organization claims that the mean weight of this part is 90
grams. The organization took a small sample of 20 parts and found that the mean score is 84 grams and standard
deviation is 11. Could this sample originate from a population of mean = 90 grams?

Figure 2: Equation Values

The organization wants to test this at significance level of 0.05, i.e., it is willing to take only a 5 percent risk of
being wrong when it says the sample is not from the population. Therefore: 

Null Hypothesis (H0): “True Population Mean Score is 90”


Alternative Hypothesis (Ha): “True Population Mean Score is not 90”
Alpha is 0.05
Logically, the farther away the observed or measured sample mean is from the hypothesized mean, the lower
the probability (i.e., the p-value) that the null hypothesis is true. However, what is far enough? In this example,
the difference between the sample mean and the hypothesized population mean is 6. Is that difference big
enough to reject H0? In order to answer the question, the sample mean needs to be standardized and the so-called
t-statistic or t-value needs to be calculated with this formula: 

t = (x̄ − µ) / SE

In this formula, SE is the standard error of the mean (SE mean). Because the population standard deviation is
not known, we have to estimate the SE mean from the sample standard deviation s: 

SE = s / √n

In our example, SE is: 

SE = 11 / √20 ≈ 2.46

Next we obtain the t-value for this sample mean: 

t = (84 − 90) / 2.46 ≈ −2.44

Finally, this t-value must be compared with the critical value of t. The critical t-value marks the threshold that –
if it is exceeded – leads to the conclusion that the difference between the observed sample mean and the
hypothesized population mean is large enough to reject H0. The critical t-value equals the value whose
probability of occurrence is less than or equal to 5 percent. From the t-distribution tables, one can find that the
critical value of t is +/- 2.093.

Figure 3: Finding Critical Value

Since the calculated t-value of -2.44 falls beyond the critical value of -2.093 (i.e., |-2.44| > 2.093), the null
hypothesis must be rejected (i.e., the sample mean is not from the hypothesized population) and the supplier’s
claims must be questioned. 
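The supplier example works directly from summary statistics, so the whole calculation fits in a few lines. A Python sketch of it (our illustration, not from the article):

```python
import math

n, mean, s, mu = 20, 84, 11, 90   # sample size, sample mean, sample sd, claimed mean

se = s / math.sqrt(n)             # standard error of the mean
t = (mean - mu) / se              # one-sample t-statistic
print(round(se, 2), round(t, 2))  # 2.46 -2.44
```

Since |−2.44| exceeds the critical value 2.093 for d.f. = 19 at alpha = 0.05, the null hypothesis is rejected, as the article concludes.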

The Two-Sample T-Test Works in the Same Way


In the two-sample t-test, two sample means are compared to discover whether they come from the same
population (meaning there is no difference between the two population means). Now, because the question is
whether two populations are actually one and the same, the first step is to obtain the SE mean from the sampling
distribution of the difference between two sample means. Again, since the population standard deviations of
both of the two populations are unknown, the standard error of the two sample means must be estimated. 
In the one-sample t-test, the SE mean was computed as such: 

SE = s / √n = √( s² / n )

Hence, for the difference between two sample means: 

SE = √( s₁²/n₁ + s₂²/n₂ )

However, this is only appropriate when samples are large (both greater than 30). Where samples are smaller,
use the following method: 

SE = Sp × √( 1/n₁ + 1/n₂ ),  where  Sp² = [ (n₁ − 1)s₁² + (n₂ − 1)s₂² ] / (n₁ + n₂ − 2)
Sp is a pooled estimate of the common population standard deviation. Hence, in this method it can be assumed
that variances are equal for both populations. If it cannot be assumed, it cannot be used. (Statistical software can
handle unequal variances for the two-sample t-test module, but the actual calculations are complex and beyond
the scope of this article). 

The two-sample t-test is illustrated with this example: 

Table 3: Illustration of Two-Sample T-Test


  N Mean Standard Deviation
A Sample 12 92 15
B Sample 15 84 19

Ho is: “The population means are the same, i.e., µ₁ = µ₂.”

Ha is: “The population means are not the same, i.e., µ₁ ≠ µ₂.”
Alpha is to be set at 0.05. 

In the two-sample t-test, the t-statistic is retrieved by subtracting the difference hypothesized under the null
hypothesis (zero) from the difference between the two sample means, and dividing by the standard error: 

t = (x̄₁ − x̄₂ − 0) / SE
Looking up t-tables (using spreadsheet software, such as Excel’s TINV function, is easiest), one finds that the
critical value of t is 2.06. Again, this means that if the standardized difference between the two sample means
(and that is exactly what the t value indicates) is larger than 2.06, it can be concluded that there is a significant
difference between population means. 

Here, 1.19 is less than 2.06; thus, we fail to reject the null hypothesis that µ₁ − µ₂ = 0. 

Below is the output from statistical software using the same data: 

Table 4: Two-Sample T-Test and Confidence Interval for A Sample and B Sample
  N Mean Standard Deviation SE Mean
1 Sample 12 92.0 15.0 4.3
2 Sample 15 84.0 19.0 4.9
Difference = mu (1) – mu (2)
Estimate for difference: 8.00000
95% CI for difference: (-5.84249, 21.84249)
T-test of difference = 0 (vs not =): T-value = 1.19, P-value = 0.245, DF = 25
Both use pooled StDev = 17.3540
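The Table 4 output can likewise be checked from the Table 3 summary statistics alone; a Python sketch (illustration only) of the pooled small-sample method:

```python
import math

n1, m1, s1 = 12, 92.0, 15.0   # A sample: size, mean, standard deviation
n2, m2, s2 = 15, 84.0, 19.0   # B sample

# Pooled standard deviation, then SE of the difference between the means
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
se = sp * math.sqrt(1 / n1 + 1 / n2)
t = (m1 - m2) / se
print(round(sp, 3), round(t, 2))   # 17.354 1.19
```

These reproduce the pooled StDev and T-value lines of the software output, with d.f. = n₁ + n₂ − 2 = 25.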

The T-Test

The t-test assesses whether the means of two groups are statistically different from each other. This analysis is
appropriate whenever you want to compare the means of two groups, and especially appropriate as the analysis
for the posttest-only two-group randomized experimental design.

Figure 1. Idealized distributions for treated and comparison group posttest values.

Figure 1 shows the distributions for the treated (blue) and control (green) groups in a study. Actually, the figure
shows the idealized distribution -- the actual distribution would usually be depicted with a histogram or bar
graph. The figure indicates where the control and treatment group means are located. The question the t-test
addresses is whether the means are statistically different.

What does it mean to say that the averages for two groups are statistically different? Consider the three
situations shown in Figure 2. The first thing to notice about the three situations is that the difference between
the means is the same in all three. But, you should also notice that the three situations don't look the same --
they tell very different stories. The top example shows a case with moderate variability of scores within each
group. The second situation shows the high-variability case; the third shows the case with low variability.
Clearly, we would conclude that the two groups appear most different or distinct in the bottom or low-
variability case. Why? Because there is relatively little overlap between the two bell-shaped curves. In the high
variability case, the group difference appears least striking because the two bell-shaped distributions overlap so
much.

Figure 2. Three scenarios for differences between means.

This leads us to a very important conclusion: when we are looking at the differences between scores for two
groups, we have to judge the difference between their means relative to the spread or variability of their scores.
The t-test does just this.
