
Inferential Statistics Lecture 2

The document provides an overview of hypothesis testing, focusing on the one-sample t-test and the characteristics of normal distribution. It explains key concepts such as null and alternative hypotheses, significance levels, p-values, and the empirical rule for normal distributions. Additionally, it discusses the t-test's purpose in determining significant differences between data sets and outlines the types of t-tests available.

Uploaded by

msrect14
Copyright
© © All Rights Reserved

ONE-SAMPLE t-TEST
Recall:
[Figure: rejection regions for the LEFT TAIL, TWO-TAIL, and RIGHT TAIL tests]

Source: Hartmann, K., Krois, J., Waske, B. (2018): E-Learning Project SOGA: Statistics and Geospatial Data Analysis. Department of Earth Sciences, Freie Universitaet Berlin.

LEFT TAIL        TWO-TAIL TEST       RIGHT TAIL
Less than        Not equal           Greater than
Below            Different from      Above
Lower than       Changed from        Higher than
Smaller than     Not the same as     Longer than
Shorter than                         Bigger than
Decreased                            Increased
Reduced from                         At least
At most

A statistician makes a decision about claims through a process called "hypothesis testing":
i. A hypothesis test involves collecting data from a sample and evaluating the data.
ii. The statistician then decides whether there is sufficient evidence, based on analysis of the data, to reject the null hypothesis.

Null hypothesis: the hypothesis that assumes there is no difference between the population parameters of the groups being tested. Under this assumption, any apparent difference between sample statistics is the result of sampling error.

NULL HYPOTHESIS                      ALTERNATIVE HYPOTHESIS
Stated as H0                         Stated as Ha or H1
"Nothing happened"                   "Something happened"
"No effect"                          "There was an effect"
"No significant difference" exists   "There was a difference"

Say,

H0: the defendant is innocent
Ha: the defendant is guilty

H0 (innocent) is rejected if H1 (guilty) is supported by evidence beyond "reasonable doubt".

Failure to reject H0 (innocent) does not imply innocence, only that the evidence is insufficient to reject it.

Shapes of Distribution
1. Symmetry (Symmetrical or asymmetrical)
2. Skew (Right or Left)
3. Peak or Modes (Unimodal, bimodal, or multimodal)
4. Spread (Narrow, Wide)

Source: http://homepage.stat.uiowa.edu/~rdecook/stat1010/notes/Section_4.2_distribution_shapes.pdf

Where,
(a) Left-skewed (negatively skewed) distribution: the mean and median are LESS than the mode. It has a long LEFT tail, and the mean lies to the LEFT of the peak.
(b) Right-skewed (positively skewed) distribution: the mean and median are GREATER than the mode. It has a long RIGHT tail, and the mean lies to the RIGHT of the peak.
(c) Symmetric distribution: the mean, median, and mode are EQUAL.

Source: lynnschools.org

Several unimodal distributions plotted on the same graph. The green “bell
curve” is the normal distribution

Source: https://www.statisticshowto.com/shapes-of-distributions/

NORMAL DISTRIBUTION
o Is a bell curve, also called the normal curve
o Describes the tendency for the data to cluster around a central value (the population MEAN μ, which is located in the middle of the curve)
o Parameters affecting the Normal Curve
a. POPULATION MEAN (μ) characterizes the position of the Normal Distribution:

Recall:
The mean is the mathematical average of the data: the sum of all data values divided by the number of values in the data set. It represents the central tendency of the data. Most of the values in normally distributed data are clustered around the mean.

An INCREASE in the mean value shifts the entire bell curve to the right, whereas a DECREASE in the mean value shifts the entire bell curve to the left.
Why?
Because the data always cluster around the mean in normally distributed populations.
b. POPULATION STANDARD DEVIATION (σ) characterizes the spread of the Normal Distribution:

The standard deviation is a measure of the variation in your data. It represents how closely or widely the data values are spread around the mean value.

Source: https://www.omnicalculator.com/statistics/normal-distribution

(a) The LARGER the standard deviation, the more spread out the distribution will be; and (b) the SMALLER the standard deviation, the less spread out the distribution will be.

Source: https://mathbitsnotebook.com/Algebra2/Statistics/STnormalDistribution.html

Notice that when the spread increases the curve becomes much flatter, and when the spread decreases the curve becomes much taller.

Why? Because the normal distribution is a DENSITY CURVE, and the total area under any density curve must remain equal to 1 (100%). Therefore, a change in the width of the curve must be compensated for by a change in its height, and vice versa.

DEVIATION: how FAR a value lies from the mean

o Note: The shape of the bell curve is determined only by these two parameters: MEAN and STANDARD DEVIATION
Characteristics:
Is unimodal
The distribution has a single peak.

Is symmetric about its mean
The distribution can be cut into 2 equal halves.

The parameters
μ (determines the location of the distribution, that is, where the data tend to cluster) and
σ (determines how spread out the distribution will be)
characterize the normal distribution.

The Empirical Rule (68-95-99.7 Rule) for approximating areas under the Normal Distribution

For all normal distributions,

68.2% of the observations fall within plus or minus one standard deviation of the mean;

95.4% of the observations fall within +/- 2 standard deviations of the mean; and

99.7% fall within +/- 3 standard deviations of the mean.

This fact is sometimes referred to as the "empirical rule," a heuristic that describes where most of the data in a normal distribution will appear.

This means that data falling outside three standard deviations ("3-sigma") signify rare occurrences.
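
The percentages in the rule can be checked numerically. The following is a minimal Python sketch, not part of the original lecture; it relies on the standard identity that the area within k standard deviations of the mean equals erf(k/√2), and the function name `area_within` is chosen here only for illustration.

```python
import math

def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    # For any normal distribution, P(|X - mu| <= k*sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
```

The printed areas are close to 0.68, 0.95, and 0.997, which is where the name "68-95-99.7 rule" comes from.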

Source: https://www.omnicalculator.com/statistics/normal-distribution

The Gaussian curve is a symmetric distribution, so the middle 68.2% can be divided in two. The region from 0 to 1 standard deviation above the mean contains 34.1% of the data, and the region on the opposite side (0 to -1 standard deviations) contains the same. Together, these areas add up to about 68% of the data.

Example 1:

The weights of stray dogs at a particular pound average 70 lbs with a standard
deviation of 2.5 lbs. Assuming the weights follow a Gaussian distribution:

1. What weight is 2 standard deviations below the mean?


2. What weight is 1 standard deviation above the mean?
3. The middle 68% of dogs weigh how much?

Example 2:

Findings: The height of all students at a local university is found to be normally distributed with a mean height of 5.5 ft and a standard deviation of 0.5 ft.

Constructing the Normal Distribution:

Source: https://www.youtube.com/watch?v=mtbJbDwqWLE

Create the intervals by the standard deviation:

68-95-99.7 Rule:

The region within 1 standard deviation of the mean contains a total area of 0.68 or 68%.

Therefore, 68% of the population are between 5.0 and 6.0 ft tall.

The region within 2 standard deviations of the mean contains a total area of 0.95 or 95%.

Therefore, 95% of the population are between 4.5 and 6.5 ft tall.

The region within 3 standard deviations of the mean contains a total area of 0.997 or 99.7%.

This means that 99.7% of the population are between 4.0 and 7.0 ft tall.

Note: The normal distribution never touches the x-axis. Its tails continue on to infinity.
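
The intervals in the height example are simply mean ± k standard deviations for k = 1, 2, 3. A short Python sketch of that arithmetic follows; the helper name `empirical_intervals` is hypothetical and used only for this illustration.

```python
def empirical_intervals(mean: float, sd: float) -> dict[int, tuple[float, float]]:
    """Return the (mean - k*sd, mean + k*sd) interval for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

# Approximate coverage attached to each k by the 68-95-99.7 rule.
coverage = {1: "68%", 2: "95%", 3: "99.7%"}

for k, (low, high) in empirical_intervals(5.5, 0.5).items():
    print(f"{coverage[k]} of heights lie between {low:.1f} and {high:.1f} ft")
```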

Exercises:

Q1. The normal distribution below has a standard deviation of 10. Approximately what area is contained between 70 and 90?

Given:

μ = 70 (because it is the center of the distribution); σ = 10 (the graph shows that the intervals go up by 10)

Based on the rule above, an area of 95% is contained within 2 standard deviations of the mean.

(2 standard deviations to the left of the mean is 50, and to the right is 90.)

Note also that we are only interested in the area from 70 to 90, which is half of that region.

Therefore, we divide 95% by 2, giving us 47.5%.

Ans. 47.5%

Q2. For the normal distribution below, approximately what area is contained
between -2 and 1?

Given:

μ = 0; σ = 1

Note: the intervals go up by 1. To approximate the area between -2 and 1, we use the rule above and divide the area into 2 parts at the mean: from -2 to 0 and from 0 to 1.

From the rule, the region within 1 standard deviation of the mean has an area of 68%, and half of this (0 to 1) is 34%.

The region within 2 standard deviations of the mean has an area of 95%, and half of this (-2 to 0) is 47.5%.

To get the total area:

Ans. Total Area = 47.5% + 34% = 81.5%
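
The empirical rule gives rounded areas. As a cross-check, the sketch below computes the areas for Q1 and Q2 from the normal cumulative distribution function, written with the standard library's `math.erf`. The function names are illustrative assumptions, not something the lecture defines.

```python
import math

def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """P(X <= x) for a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def area_between(a: float, b: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Area under the normal curve between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

q1 = area_between(70, 90, mu=70, sigma=10)  # Q1: rule-of-thumb answer is 0.475
q2 = area_between(-2, 1)                    # Q2: rule-of-thumb answer is 0.815
print(f"Q1: {q1:.4f}, Q2: {q2:.4f}")
```

Both results land within about half a percentage point of the rule-of-thumb answers, which is the accuracy the empirical rule is meant to provide.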

Skewness

Skewness measures the degree of asymmetry of a distribution. The normal distribution is symmetric and has a skewness of zero.

If the distribution of a data set instead has a skewness less than zero, or negative skewness (left-skewness), then the left tail of the distribution is longer than the right tail; positive skewness (right-skewness) implies that the right tail of the distribution is longer than the left.

Kurtosis
Kurtosis measures the thickness of the tail ends of a distribution in relation to the tails of a normal distribution. The normal distribution has a kurtosis equal to 3.0.

Distributions with kurtosis greater than 3.0 exhibit tail data exceeding the tails of the normal distribution (e.g., five or more standard deviations from the mean). Such distributions are known in statistics as leptokurtic, and more colloquially as having "fat tails." The occurrence of fat tails in financial markets describes what is known as tail risk.

Distributions with kurtosis less than 3.0 (platykurtic) exhibit tails that are generally less extreme ("skinnier") than the tails of the normal distribution.

Source: https://www.investopedia.com/terms/n/normaldistribution.asp
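
To make the two measures concrete, here is a small Python sketch that computes moment-based skewness and kurtosis for a data set. It assumes the simple population-moment definitions (central moments divided by n); statistical packages often apply small-sample corrections, so their output can differ slightly. The data sets are hypothetical.

```python
def skewness_and_kurtosis(data: list[float]) -> tuple[float, float]:
    """Moment-based skewness (m3 / m2^1.5) and kurtosis (m4 / m2^2)."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4 (dividing by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical data with one large value

print(skewness_and_kurtosis(symmetric))
print(skewness_and_kurtosis(right_skewed))
```

The symmetric data give a skewness of zero, while the single large value in the second data set pulls the right tail out and produces a positive skewness.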

KEY CONCEPTS

The level of significance (α)
• is the probability that the test statistic will fall into the critical region when the null hypothesis is true.
• This level is set by the researcher.
• The levels of significance usually employed in hypothesis testing are 5% and 1%.

Degrees of freedom
• represent the number of values that are free to vary in calculating a statistic
• Formula: df = n - 1, where n is the sample size

t Distribution
• is actually a family of distributions, where the exact shape of each is determined by its degrees of freedom.

To decide whether to reject a null hypothesis, a test statistic is calculated.

HYPOTHESIS TESTING
1. p-Value Approach
• A p-value is used in hypothesis testing to help you decide whether to reject the null hypothesis.
• The p-value is the probability, computed assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one actually observed. It measures the strength of the evidence against the null hypothesis.
• The smaller the p-value, the stronger the evidence that you should reject the null hypothesis.
• It is compared to the level of significance (α).

If
p-value ≤ α, then REJECT H0 at level α
p-value > α, then FAIL TO REJECT H0 at level α

When,
p-value > 0.05: if the null hypothesis were true, a result at least as extreme as the observed one would occur more than 5% of the time through sampling variation alone. The evidence is too weak to call the difference significant at the 5% level.
We want a very small p-value, because it tells us that the observed difference would be unlikely if only random sampling variation were at work, that is, that the difference is statistically significant.
Say,
We calculated a p-value of 0.53. Is that significantly different or not?
Ans. It is NOT significant. If the null hypothesis were true, a difference at least this large would occur in about 53% of samples, so the data give no reason to reject H0.
Scientific studies commonly use a 5% level of significance (a 95% confidence level).

The following table provides guidelines for using the p-value to assess the
evidence against the null hypothesis (Weiss, 2010).

p-Value              Evidence against H0
p > 0.10             Weak or no evidence
0.05 < p ≤ 0.10      Moderate evidence
0.01 < p ≤ 0.05      Strong evidence
p ≤ 0.01             Very strong evidence

2. Critical Value Approach


The critical value (C.V.)
• is the value that defines the rejection zone, or critical region (the test statistic values that would lead to rejection of the null hypothesis).
• It is determined by the level of significance, together with the degrees of freedom and the type of test (one-tailed or two-tailed).

With α = 0.05, the critical region contains the most extreme 5% of test statistic values that could occur when the null hypothesis is true. A test statistic that falls in this region would therefore be unlikely if H0 were true.

Reject the null hypothesis if the test statistic lies in the critical region. Otherwise, retain the null hypothesis.
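
The two decision rules can be written down as two small functions. This is an illustrative Python sketch under stated assumptions: the function names are hypothetical, the critical value is passed in as the positive number read from a t-table, and a test statistic exactly equal to the critical value is treated as lying in the critical region.

```python
def reject_by_p_value(p_value: float, alpha: float = 0.05) -> bool:
    """p-value approach: reject H0 when p <= alpha."""
    return p_value <= alpha

def reject_by_critical_value(t_stat: float, critical: float, tail: str) -> bool:
    """Critical value approach; `critical` is the positive table value."""
    if tail == "left":
        return t_stat <= -critical
    if tail == "right":
        return t_stat >= critical
    if tail == "two":
        return abs(t_stat) >= critical
    raise ValueError("tail must be 'left', 'right' or 'two'")

print(reject_by_p_value(0.53))                         # False: fail to reject H0
print(reject_by_critical_value(-1.84, 1.711, "left"))  # True: reject H0
```

Both approaches lead to the same decision for the same data and the same α; they differ only in whether the comparison is made on the probability scale or on the test statistic scale.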

KINDS OF VARIABILITY
Variance
• is the mean squared difference between each data point and the center of the distribution, as measured by the mean.

Standard Deviation
• is a measure of how spread out numbers are.
• Its symbol is σ (the Greek letter sigma) for the population standard deviation and s for the sample standard deviation.
• It is the square root of the variance.

T-Test
o Also known as Student's t-test
o Invented in 1908 by William Sealy Gosset, who worked for the Guinness company in Ireland, where different varieties of barley were being tested. He wanted to know whether there was a significant difference between the results of their experiments, and he created the t-test to answer that question objectively.
o Purpose: tests for a significant difference between two sets of data. (The t-test tells us whether there is a statistically significant difference between the mean values of two data sets when the data are NORMALLY DISTRIBUTED.) It removes our own bias when we write a conclusion by telling us whether the difference is significant or not significant.
• The t-test looks not only at the MEAN but also at the spread of the data (the STANDARD DEVIATION).
o Considerations:
o Are you conducting a one-tailed or a two-tailed t-test?
Say,

Q: Is there a difference between the height of girls and the height of boys?
(This is an open-ended question, so the answer could lie in either direction: girls or boys could be taller.)
Therefore, we are doing a two-tailed t-test.

But what if,

Q: Are boys significantly taller than girls?
(This is a closed-ended question. The question is limited to one direction.)
Therefore, we are doing a one-tailed t-test.

o Is the data paired or unpaired? (Paired means that the sets of data are linked in each row.)
Unpaired: heights of boys and girls
Paired: a boy's height at age 7 and at age 21 (same individual)
o T-tests are statistical hypothesis tests that analyze one or two sample means.

TYPES OF t-TESTS
1. One-sample t-test: tests the mean of a single group against a known population mean.

2. Paired-sample t-test: compares means from the same individual, group, object, or related units of the same population at different times.

3. Independent-samples t-test: compares the means of two independent groups.
Example:
Effect of drug vs placebo

t-VALUE
o Measures the size of the difference in mean values relative to the variation
in the sample data

ONE-SAMPLE t-TEST
Degrees of freedom (df) = n - 1

To calculate the test statistic:

$$t = \frac{\bar{x} - \mu}{SE}, \qquad SE = \frac{s}{\sqrt{n}}$$

Where:
x̄   Observed mean of the sample
μ   Theoretical mean of the population
s   Standard deviation of the sample
n   Sample size
SE  Standard error of the mean difference. It represents how much random error is in the sample and how well the sample mean estimates the population mean.

High levels of random error increase the probability that your sample mean is farther away from the population mean.

The t ratio is a signal-to-noise ratio:
Signal: the difference between the means; the numerator (the signal in your sample data)
Noise: the standard error of the mean difference; the denominator

The less variability there is in your data set, the smaller the noise.

The lower the SE of the mean difference, the greater the magnitude of your t statistic will be.

Likewise, the larger the mean difference, the larger the magnitude of your t statistic will be.
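
The signal-to-noise reading of the t ratio translates directly into code. The sketch below is a minimal Python illustration (the function name `one_sample_t` is an assumption made here); it is called with the summary values of the life-span example in these notes (sample mean 60, hypothesized mean 67, s = 19, n = 25).

```python
import math

def one_sample_t(sample_mean: float, mu0: float, s: float, n: int) -> tuple[float, int]:
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    standard_error = s / math.sqrt(n)               # the "noise"
    t_stat = (sample_mean - mu0) / standard_error   # "signal" divided by "noise"
    return t_stat, n - 1

t_stat, df = one_sample_t(sample_mean=60, mu0=67, s=19, n=25)
print(f"t = {t_stat:.2f}, df = {df}")
```

Halving s or quadrupling n would each halve the standard error and so double the magnitude of t for the same mean difference.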

Assumptions for the t Test

• The population from which the sample was drawn is normal
• The samples are random
• Homogeneity of variance (applies when two groups are compared, not to the one-sample test)
o Samples have similar variances
o The variance (spread of data) of one group should not be more than two times larger than that of the other.

Example 1:

According to WHO statistics published in 2018, the life span of a person in the Philippines is 67 years. A random sample of 25 obituary notices in the PDI has a mean age of 60 years with a standard deviation of 19 years. If the life span in the Philippines is normally distributed, does this information indicate that the population mean life span of Filipinos is less than 67 years? Use a 5% level of significance.

Solution:

Given:
n = 25
df = 25 – 1 = 24
x̄ = 60
s = 19

Hypotheses:
H0: μ = 67
Ha: μ < 67
Statistical Test: t-Test (Left-Tailed)

From the t-table below: determine the critical value, given the following:
df = 24 (row), α = 0.05 and one-tailed (left) test (column)

t critical value = -1.711

Use the t-table to find the t ratio that must be reached to reject chance at a given level of confidence.

Test statistic:

$$t = \frac{60 - 67}{19/\sqrt{25}} = \frac{-7}{3.8} = -1.84$$

DECISION:
-1.84 < -1.711. Therefore, we reject H0 at the 0.05 level.

As you progress through your university career you will be introduced to statistical packages such as R and Minitab that can perform these tests for you and present the final significance level. However, you may also be introduced to how to conduct and interpret hypothesis tests without using such software (this is good for demonstrating a thorough knowledge of what is really happening with the data).

Example 2.

A weight reduction program claims to be effective in treating obesity. To test this claim, 12 people were put on the program, and the number of pounds of weight gain/loss was recorded for each person after two years, as follows:
23, 15, -5, 7, 1, -10, 12, -8, 20, 8, -2, and -5
Can we conclude that the program is effective if there is some weight loss, at the 95% confidence level (α = 0.05)?

Sol'n a. (Manual)

1. Formulate the hypotheses:
H0: μ = 12
H1: μ ≠ 12

2. T-test (a two-tailed test)

3. Computation:
n = 12
Degrees of freedom (df) = 12 - 1 = 11
Mean (x̄) = 56/12 = 4.67
s = 11.155

$$t = \frac{4.67 - 12}{11.155/\sqrt{12}} = \frac{-7.33}{3.22} = -2.28$$

From the t-table above: determine the critical value, given the following:
df = 11 (row), α = 0.05 and two-tailed test (column)

t critical value = +2.201 or -2.201

DECISION:
-2.28 < -2.201. Therefore, we reject H0 at the 0.05 significance level.
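
The manual computation can be reproduced from the raw data. The following Python sketch is an illustration rather than part of the original solution: it uses the computational formula for the sample standard deviation, the hypothesized mean of 12 from the worked solution, and the table value 2.201 quoted above.

```python
import math

# Weight changes (lb) from the worked example above.
data = [23, 15, -5, 7, 1, -10, 12, -8, 20, 8, -2, -5]
mu0 = 12            # hypothesized mean used in the worked solution
t_critical = 2.201  # two-tailed table value for df = 11, alpha = 0.05

n = len(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

mean = sum_x / n
# Computational formula: s = sqrt((n*sum(x^2) - (sum x)^2) / (n*(n - 1)))
s = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))
t_stat = (mean - mu0) / (s / math.sqrt(n))
reject_h0 = abs(t_stat) > t_critical

print(f"mean = {mean:.2f}, s = {s:.3f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```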

Sol’n b. (Using Excel)

Two-tailed
You may also use the "DATA ANALYSIS" option:

If you do not see the "Data Analysis" option, you will need to install the add-in. Do this by clicking on "File" in the top left corner and selecting the "Options" button. You will then see the Excel Options menu: click on the "Add-Ins" button, select the "Analysis ToolPak", and click the "Go" button to install it. The "Data Analysis" option should then appear in the "Data" menu.

Note: There is no option for the one-sample case, so we use a column of dummy values to make it appear as 2 samples.

Click: Data tab → Data Analysis option → t-Test: Paired Two Sample for Means

Press OK → Enter the ranges for values 1 and 2.

Then, type the value for the hypothesized mean difference and the alpha (see the problem) → Press OK

Another worksheet will be created with the results.

DECISION:
Two-tailed
p-value = 0.044

0.044 < 0.05. Therefore, we reject H0 at the 0.05 significance level.

Sol’n c. (Using Minitab)

1. Open Minitab.
2. Type the values in column C1.
3. Click Stat → Basic Statistics → 1-Sample t.
4. Double-click the data column (C1) to select it.
5. Tick the hypothesis test box, enter the value of the hypothesized mean, and click Options.
6. Type 95 for the confidence level (given in the problem), choose the two-sided alternative since this is a two-tailed test, and press OK.
7. Press OK to run the test.

DECISION:

p-value = 0.044

0.044 < 0.05. Therefore, we reject H0 at the 0.05 significance level.

Example 3
An engineer measured the Brinell hardness of 25 pieces of ductile iron that
were subcritically annealed. The resulting data were:

170 167 174 179 179 187 179 183 179

156 163 156 187 156 167 156 174 170

183 179 174 179 170 159 187

The engineer hypothesized that the mean Brinell hardness of all such ductile iron pieces is greater than 170. Therefore, he was interested in testing the hypotheses at a significance level of α = 0.05.

Solution:

Given:
n = 25
df = 25 -1 = 24

Hypotheses:
H0: μ = 170
Ha: μ > 170

Statistical Test: t-Test (Right-Tailed)

From the t-table above: determine the critical value, given the following:
df = 24 (row), α = 0.05 and one-tailed (right) test (column)

t critical value = +1.711

MANUAL COMPUTATION

Mean (x̄) = 4313/25 = 172.52
Standard deviation (s) = 10.3123

$$t = \frac{172.52 - 170}{10.3123/\sqrt{25}} = \frac{2.52}{2.06246} = 1.22184$$

DECISION:

t critical value = +1.711

1.22184 < 1.711. Therefore, we fail to reject H0 at the 0.05 significance level.
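
For readers without Excel or Minitab, the same test can be sketched in plain Python. This is an illustration under stated assumptions: the right-tail probability is obtained by integrating the Student's t density with Simpson's rule instead of calling a statistics library, and the helper names `t_pdf` and `upper_tail` are hypothetical.

```python
import math
import statistics

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

hardness = [170, 167, 174, 179, 179, 187, 179, 183, 179,
            156, 163, 156, 187, 156, 167, 156, 174, 170,
            183, 179, 174, 179, 170, 159, 187]
mu0, alpha = 170, 0.05

n = len(hardness)
mean = statistics.mean(hardness)
s = statistics.stdev(hardness)           # sample standard deviation (n - 1)
t_stat = (mean - mu0) / (s / math.sqrt(n))
p_value = upper_tail(t_stat, n - 1)      # right-tailed test

print(f"t = {t_stat:.5f}, p = {p_value:.3f}, reject H0: {p_value <= alpha}")
```

Because the p-value is larger than α, this route reaches the same "fail to reject H0" decision as the comparison against the critical value.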

Using Excel (Functions)

Right-tailed

DECISION:

p-value = 0.117

Since 0.117 > 0.05, we fail to reject H0. There is insufficient evidence, at the α = 0.05 level, to conclude that the mean Brinell hardness of all such ductile iron pieces is greater than 170.

USING Data Analysis Tools in EXCEL



DECISION:
Right-tailed
p-value = 0.117 (one-tailed)

Since 0.117 > 0.05, we fail to reject H0. There is insufficient evidence, at the α = 0.05 level, to conclude that the mean Brinell hardness of all such ductile iron pieces is greater than 170.

USING MINITAB

DECISION:

p-value = 0.117

Since 0.117 > 0.05, we fail to reject H0. There is insufficient evidence, at the α = 0.05 level, to conclude that the mean Brinell hardness of all such ductile iron pieces is greater than 170.

Exercises
A consumer group, concerned about the mean fat content of a certain grade of steakburger, submits to an independent laboratory a random sample of 12 steakburgers for analysis. The percentage of fat in each of the steakburgers is as follows:

21 18 19 16 18 24 22 19 24 14 18 15

The manufacturer claims that the mean fat content of this grade of steakburger is less than 20%. Assuming percentage fat content to be normally distributed, carry out an appropriate hypothesis test in order to advise the consumer group as to the validity of the manufacturer's claim.

Sol'n.
Given:
μ = 20%
n = 12

Hypotheses:

H0: μ = 20%
Ha: μ < 20%

One-Tailed t-Test (Left-Tailed)

df = 12 – 1 = 11
x̄ = (21+18+19+16+18+24+22+19+24+14+18+15)/12 = 228/12 = 19
s = 3.25

$$t = \frac{19 - 20}{3.25/\sqrt{12}} = -1.07$$

t critical value = -1.796

DECISION:
Since -1.07 > -1.796, we fail to reject H0 at the 0.05 significance level.

During a particular week, 13 babies were born in a maternity unit. Part of the standard procedure is to measure the length of the baby. Given below is a list of the lengths, in centimeters, of the babies born in this particular week.

49 50 45 51 47 49 48 54 53 55 45 50 48

Assuming that this sample came from an underlying normal population, test, at the 5% significance level, the hypothesis that the population mean length is 50 cm.

A random sample of 12 steel ingots was taken from a production line. The masses, in kilograms, of these ingots are given below:

24.8 30.8 28.1 24.8 27.4 22.2
24.7 27.3 27.5 27.8 23.9 23.2

Assuming that this sample came from an underlying normal population, investigate the claim that its mean exceeds 25.0 kg.

Sol'n.
Given:
μ = 25 kg
n = 12

Hypotheses: (4 pts)

H0: μ ≤ 25
H1: μ > 25

One-Tailed t-Test (Right-Tailed) (2 pts)

df = 12 – 1 = 11 (2 pts)

Sample standard deviation: s = 2.5 (3 pts)
Test statistic: t = 1.43 (2 pts)

Given α = 0.05 and df = 11 (right-tailed), t critical value = +1.796 (2 pts)

DECISION:
Since 1.43 < 1.796, we fail to reject H0 at the 0.05 significance level. (2 pts)

A random sample of 15 workers from a vacuum flask assembly line was selected from a large number of such workers. John, a work-study engineer, asked each of these workers to assemble a one-liter vacuum flask at their normal working speed. The times taken, in seconds, to complete these tasks are given below:

109.2 146.2 127.9  92.0 108.5
 91.1 109.8 114.9 115.3  99.0
112.8 130.7 141.7 122.6 119.9

Assuming that this sample came from an underlying normal population, investigate the claim that the population mean assembly time is less than 2 minutes (120 seconds).

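Computation for the steel ingots exercise:

$$s = \sqrt{\frac{12(8201.42) - 97593.76}{12(12-1)}} = \sqrt{\frac{98417.04 - 97593.76}{12(11)}} = \sqrt{\frac{823.28}{132}} = \sqrt{6.237} = 2.5$$

$$t = \frac{26.03333 - 25}{2.5/\sqrt{12}} = \frac{1.03333}{2.5/3.464} = \frac{1.03333}{0.7217} = 1.43$$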

Lesson Summary

a. Hypothesis testing involves making educated guesses about a population based on a sample drawn from that population. We generate null and alternative hypotheses about the mean of the population to test these guesses.
b. We establish critical regions based on the level of significance, or alpha (α) level. If the value of the test statistic falls in a critical region, we reject the null hypothesis.
c. When we make a decision about a hypothesis, there are four possible outcomes and two different types of error. A Type I error occurs when we reject the null hypothesis although it is true, and a Type II error occurs when we do not reject the null hypothesis although it is false.

A comparison is demonstrated between z-test and t-test relying on specific conditions.

References:

Wow Math. (14 May 2021). SOLVING PROBLEMS INVOLVING TEST OF HYPOTHESIS ON POPULATION MEAN || STATISTICS AND PROBABILITY Q4. Retrieved from https://www.youtube.com/watch?v=SgaG0nTszoA
https://online.stat.psu.edu/statprogram/reviews/statistical-concepts/hypothesis-testing/examples
http://makemeanalyst.com/explore-your-data-variance-and-standard-deviation/
https://www.real-statistics.com/students-t-distribution/one-sample-t-test/
https://www.geo.fu-berlin.de/en/v/soga/Basics-of-statistics/Hypothesis-Tests/Introduction-to-Hypothesis-Testing/Critical-Value-and-the-p-Value-Approach/index.html
https://www.analyticssteps.com/blogs/what-are-differences-between-z-test-and-t-test

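Computation for Example 2 (weight reduction program):

$$s = \sqrt{\frac{12(1630) - 3136}{12(12-1)}} = \sqrt{\frac{19560 - 3136}{12(11)}} = \sqrt{\frac{16424}{132}} = \sqrt{124.4242} = 11.155$$

$$t = \frac{4.67 - 12}{11.155/\sqrt{12}} = \frac{-7.33}{11.155/3.464} = \frac{-7.33}{3.22} = -2.28$$

Sample standard deviation formula:

$$s = \sqrt{\frac{n\sum x^2 - \left(\sum x\right)^2}{n(n-1)}}$$

Computation for the steakburger exercise:

$$s = \sqrt{\frac{12(4448) - 51984}{12(12-1)}} = \sqrt{\frac{53376 - 51984}{12(11)}} = \sqrt{\frac{1392}{132}} = \sqrt{10.5454} = 3.25$$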

https://www.youtube.com/watch?v=R7y1dIRIqq8
https://www.youtube.com/watch?v=fiMFqfatieE
https://www.youtube.com/watch?v=5vmb5zafqNk
https://www.youtube.com/watch?v=Fsa-5_XdIMs
https://www.youtube.com/watch?v=yvHQEJnYZBY
https://www.youtube.com/watch?v=rK3mXS3gHyI&t=738s

Example 1:

The weights of stray dogs at a particular pound average 70 lbs with a standard deviation of 2.5 lbs. Assuming the weights follow a Gaussian distribution:

1. What weight is 2 standard deviations below the mean?
2. What weight is 1 standard deviation above the mean?
3. The middle 68% of dogs weigh how much?

Answers:

1. 2 standard deviations is 2 * 2.5 lbs = 5 lbs. So a dog that is 2 standard deviations below the mean weighs 70 lbs – 5 lbs = 65 lbs.
2. 1 standard deviation is 2.5 lbs, so a dog 1 standard deviation above the mean would weigh 70 lbs + 2.5 lbs = 72.5 lbs.
3. The 68-95-99.7 Rule tells us that 68% of the weights should be within 1 standard deviation either side of the mean. 1 standard deviation above (given in the answer to question 2) is 72.5 lbs; 1 standard deviation below is 70 lbs – 2.5 lbs = 67.5 lbs. Therefore, 68% of dogs weigh between 67.5 and 72.5 lbs.

Source: https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/empirical-rule-2/

Q. A coffee shop relocates to Italy and wants to make sure that all lattes are consistent. They believe that each latte has an average of 4 oz of espresso. If this is not the case, they must increase or decrease the amount. A random sample of 25 lattes shows a mean of 4.6 oz of espresso and a standard deviation of 0.22 oz. Use alpha = 0.05 and run a one-sample t-test to compare with the known population mean.

Solution

1. Hypotheses
H0: μ = 4.0
Ha: μ ≠ 4.0
2. Statistical Tool: t-Test (Two-Tailed)
3. Significance Level: α = 0.05
4. t-Test

n = 25
sample mean = 4.6
μ = 4.0
s = 0.22 oz

t = (4.6 – 4.0)/(0.22/sqrt(25))
= 0.6/(0.22/5)
= 0.6/0.044
= 13.6364

Critical Value:
df = 25 – 1 = 24
From the t-table
CV = ± 2.064

5. Conclusion
Since 13.6364 > 2.064, we reject H0 and conclude that there is a significant difference between the sample mean amount of espresso in the lattes in Italy and the expected population amount. Because the sample mean is above 4 oz, too much espresso is being placed in the lattes in Italy, and the amount should be reduced to meet the expected (population) mean.
