Lecture Slides 3a - Statistical Testing

This document discusses statistical testing and the one-sample t-test. It covers topics like the normal distribution, standard normal distribution, confidence intervals, and hypothesis testing. Specifically, it provides an example of calculating a 95% confidence interval for the average distance visitors live from a shopping center using a one-sample t-test when the population variance is unknown. The steps shown include determining the degrees of freedom, finding the critical t-value, calculating the standard deviation of sample averages, and determining the confidence interval bounds.

Uploaded by

Jasmijn Govaarts


Lecture 3

Statistical testing

- Statistical testing
- One-sample t-test

Freely adapted from dr. ir. P. Heijnen (TU Delft)

Statistical testing
Outline

• Normal distribution

• Standard normal distribution

• Confidence intervals

• The one-sample t-test

The normal distribution
Normal distribution
• X is a continuous random variable and has a normal
distribution with average μ and standard deviation σ

μ average
σ standard deviation
σ² variance
Standard percentages of
the normal distribution

• About 68% of the values lie within μ ± 1σ
• About 95% of the values lie within μ ± 2σ
• About 99.7% of the values lie within μ ± 3σ
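The standard percentages can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `share_within` is made up for this illustration and is not part of the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Shares for 1, 2 and 3 standard deviations around the mean.
shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Because any normal variable can be standardized, the same shares hold for every μ and σ.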

Normal distribution
Example shopping center

• On average, visitors of a shopping center live at a
distance of 4.6 km from the center (σ = 3.06)

[Figure: normal curve with μ = 4.6 and σ = 3.06]
The probability of an interval of X
Example shopping center

• What is the probability of a visitor of the shopping
center living within a distance of 2.5 km of the center?

[Figure: normal curve with μ = 4.6 and σ = 3.06, with x = 2.5 marked]
Standard normal distribution
Standard normal distribution

Transformation of X => Z

$z = \frac{x - \mu}{\sigma}$

• What is the mean of z?

• What is the standard deviation of z?

All probabilities are known in tables

[Figure: standard normal curve with mean 0 and standard deviation 1]
Standard normal distribution
Example shopping center

What is the z-value of a distance of 2.5 km (μ = 4.6
and σ = 3.06)?

$z = \frac{2.5 - 4.6}{3.06} = -0.686$

So we have
z = -0.686

Look up in table
P(Z ≤ -0.686) ≈ 0.245

P(X ≤ 2.5) ≈ 0.245
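The table lookup can be reproduced in code. This is a small sketch with Python's `statistics.NormalDist`; the variable names are chosen for this illustration only.

```python
from statistics import NormalDist

mu, sigma = 4.6, 3.06   # shopping-center example from the slides (km)
x = 2.5                 # distance of interest (km)

# Standardize x, then look up the probability in the standard normal distribution.
z = (x - mu) / sigma
p_via_z = NormalDist().cdf(z)

# The same probability, computed directly on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.3f}")
print(f"P(X <= {x}) = {p_direct:.3f}")
```

The computed probability is about 0.246. A printed table, which rounds z to two decimals, gives the slightly different 0.245 quoted above.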

Probability P(Z ≤ z)
Example Shopping Center
• Which % of visitors of the shopping center live within
a distance of 5 km from the center?

Look up in table

Approximately 55%
Probability P(Z > z)
Example Shopping Center

• Which % of visitors of the shopping center live more

than 5 km from the center?
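Both 5 km questions follow from the same cumulative probability, since P(Z > z) = 1 - P(Z ≤ z). A minimal sketch, assuming the same μ = 4.6 and σ = 3.06 as before:

```python
from statistics import NormalDist

# Distance to the shopping center, modelled as normal (values from the slides).
distance = NormalDist(mu=4.6, sigma=3.06)

p_within_5km = distance.cdf(5.0)      # P(X <= 5)
p_beyond_5km = 1.0 - p_within_5km     # P(X > 5), the complement

print(f"within 5 km: {p_within_5km:.0%}")
print(f"more than 5 km: {p_beyond_5km:.0%}")
```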
Confidence intervals
Confidence intervals

• In a sample, we find an average x̄

• Can we say that x̄ represents the average of the

population?

• Yes, as a best guess, but how confident are we?

• The smaller the sample, the less confident we are

Notation

• Average
  • Sample: x̄
  • Population: μ
  • Estimate of μ: μ̂ = x̄
  so, the average of the sample is an estimate of the average of the population

• Standard deviation
  • Sample: s
  • Population: σ
Confidence interval - the problem
• Find an interval [x1, x2] such that the population average lies
within [x1, x2] with 95% confidence (95% is often used)

• For the shopping example

• What is the 95% confidence interval for the
estimated average distance to the center, x̄ = 4.6?

If we drew many
samples, we would get a
distribution of sample means

[Figure: distribution of sample means centred at 4.6 km, with the central 95% between x1 and x2]
Confidence interval

So, we are looking for the values x1 and x2

For the standard normal distribution
we know the critical values z1 and
z2: the central 95% lies between z1 = -1.96 and z2 = 1.96
Confidence interval
To translate this to x variables
we must know the variance of
the distribution of sample
means
Standard deviation of sample means

Standard deviation of sample averages?

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

σ = standard deviation in
population
n = sample size

So, the standard deviation of sample averages is smaller than σ,
and it shrinks as n grows
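So, we have the formula

$z = \frac{x - \bar{x}}{\sigma / \sqrt{n}}$

... but we don’t know the population
standard deviation σ; we only know the sample
standard deviation s

For the moment, let’s assume that the
population variance equals the sample
variance:
we assume σ = s
Then we get

$z = \frac{x - \bar{x}}{s / \sqrt{n}}$

[Figure: standard normal curve centred at 0, with the central 95% between z1 = -1.96 and z2 = 1.96]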
Confidence interval – population variance known
Example shopping center

Using the formula for z, we have z1 = -1.96 and z2 = 1.96 for the central 95%

Solving for x1 and x2:

$x_{1,2} = \bar{x} \pm 1.96 \cdot \frac{s}{\sqrt{n}}$
Confidence interval – population variance known
Example shopping center

[Figure: distribution of sample means centred at 4.6 km, with the central 95% between x1 = 3.92 and x2 = 5.28]
Confidence interval

Conclusion: the 95% confidence interval for average

distance to the center is: [3.92, 5.28]
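The interval can be recomputed as a check. This sketch assumes σ = s = 3.06, as the slides do at this point, and takes the sample size n = 80 from the degrees of freedom (80 - 1) used later in this example.

```python
import math
from statistics import NormalDist

sample_mean = 4.6   # km, from the slides
s = 3.06            # sample standard deviation, used here in place of sigma
n = 80              # sample size (assumption: taken from df = 80 - 1 later in the slides)

z_crit = NormalDist().inv_cdf(0.975)   # upper critical value, about 1.96
std_error = s / math.sqrt(n)           # standard deviation of sample averages

x1 = sample_mean - z_crit * std_error
x2 = sample_mean + z_crit * std_error
print(f"95% confidence interval: [{x1:.3f}, {x2:.3f}]")
```

Rounded to two decimals this gives about [3.93, 5.27]. The slides report [3.92, 5.28], which is consistent with rounding the bounds outward.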
BUT ... the population variance is unknown

• We do not know the population variance, but we do

have an estimate, namely the variance we find in the
sample

• Because we estimate the population variance, there is

more uncertainty

• As a consequence, we cannot use the standard normal

distribution

• Instead, we must use a slightly different distribution,

known as the student t-distribution
Student t-distribution

• Bell-shaped curve, centred around t = 0

• Larger variance than the
standard normal distribution

• Takes into account the larger
uncertainty, since σ is also
estimated by the sample standard deviation s

• Probability density function has the parameter (N - 1), the number of degrees of freedom

Confidence interval – population variance unknown
Example shopping center
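df = 80 - 1 = 79

The formula for t is the
same as for z:

$t = \frac{x - \bar{x}}{s / \sqrt{N}}$

The critical values -1.99 and 1.99 correspond to the central 95% of the t-
distribution with df = 80 - 1

Solving for x1 and x2 gives a larger interval than with z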

• Conclusion: the 95% confidence interval for average
distance to the center is: [3.91, 5.29]

[Figure: distribution of sample means centred at 4.6, with the central 95% between 3.9 and 5.3]

The sample size is large enough, so
approximately the same interval is
found: [3.91, 5.29]
Summary of steps
Calculate a 95% confidence interval

• Use the Student t-distribution since the population
variance is unknown
1. Calculate the degrees of freedom as df = N – 1
2. Given df, determine the critical value of t for a 95%
confidence interval – this is t0.975
3. Calculate the standard deviation of sample averages
using $s_{\bar{x}} = \frac{s}{\sqrt{N}}$
4. Given the sample average x̄, calculate the interval [x1, x2] as:
$[x_1, x_2] = [\bar{x} - t_{0.975} \cdot s_{\bar{x}},\ \bar{x} + t_{0.975} \cdot s_{\bar{x}}]$
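The steps above can be sketched in a few lines of Python. The function name `t_confidence_interval` is hypothetical. The critical value is passed in as a parameter, because Python's standard library has no t-distribution; here it is the 1.99 quoted in the slides for df = 79.

```python
import math

def t_confidence_interval(sample_mean, sample_sd, n, t_crit):
    """Return (df, (x1, x2)) following the four steps in the summary."""
    df = n - 1                              # step 1: degrees of freedom
    # step 2: t_crit is the critical value t_0.975 for this df (looked up in a table)
    std_error = sample_sd / math.sqrt(n)    # step 3: standard deviation of sample averages
    margin = t_crit * std_error
    return df, (sample_mean - margin, sample_mean + margin)   # step 4: interval bounds

# Shopping-center example: mean 4.6 km, s = 3.06, N = 80, t_0.975 = 1.99 (from the slides).
df, (x1, x2) = t_confidence_interval(4.6, 3.06, 80, t_crit=1.99)
print(f"df = {df}, 95% confidence interval: [{x1:.3f}, {x2:.3f}]")
```

The bounds come out near 3.919 and 5.281, in line with the [3.91, 5.29] reported in the slides once the bounds are rounded outward.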
Calculating a confidence interval in
SPSS

Descriptives: Travel distance in Netherlands (Verplaatsingsafstand in Nederland)

                                      Statistic    Std. Error
Mean                                  123.90       .538
95% Confidence Interval for Mean
  Lower Bound                         122.84
  Upper Bound                         124.95
5% Trimmed Mean                       81.49
Median                                35.00
Variance                              64130.129
Std. Deviation                        253.239
Minimum                               1
Maximum                               6950
Range                                 6949
Interquartile Range                   110
Skewness                              5.033        .005
Kurtosis                              40.666       .010

Lower and upper bound of 95% confidence interval

for variable Travel distance in Netherlands
One-sample t-test:

Student t-test for averages

The concept of hypothesis testing
Example: body length

• Someone says the average length of an adult person in

the Netherlands is 1.70 m

• In a sample (n = 100) we find an average length of

1.75 m and a standard deviation of 0.15 m

• Do we believe the person?

• We use a statistical test to make a decision

• Assume the person is right; then what is the probability
that we find an average in our sample that differs as
strongly as 1.75 m does from this claimed average?

[Figure: distribution of sample means centred at μ0 = 1.70, with the observed 1.75 marked. Quite extreme, so it seems unlikely]
Assuming the claim is right
• We want to know the probability that we find an
average in our sample that differs as strongly as 1.75 m
does from this claimed average

• So, on both sides

[Figure: distribution of sample means centred at μ0 = 1.70, with 1.65 and 1.75 marked on either side]

• If the probability is smaller than 5%, we decide not to
believe the person (this percentage is often chosen)
Translation to formal terms
• The average length is 1.70 m
• Null hypothesis (H0)

• The average length is not equal to 1.70 m

• Alternative hypothesis (H1)

• The maximum probability of making a wrong decision

that we still accept is 5%
• Alpha (α = 5%)

• The probability that we find an average length in the sample that
deviates from 1.70 m at least as strongly as 1.75 m does, while the null hypothesis is
true
• p-value
The way the test can be performed -
confidence intervals

• Calculate a 95% confidence interval [μ1, μ2] around

the test value
• If the sample average falls outside the interval then
reject the null hypothesis

Because we use the
sample standard
deviation, we should use a
Student t-distribution with df = N – 1 = 99

The critical values -1.98 and 1.98 correspond to the central 95% of the t-
distribution with df = 99, leaving 2.5% in each tail

[Figure: t-distribution centred at μ0 = 1.70, with 2.5% in each tail outside the bounds μ1 and μ2]

• We have found a mean of 1.75 m

What is the t-value of this mean?
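• The t-value is the standardized value, just like the z-value,
but for the t-distribution

• Calculated as:

$t = \frac{\bar{x} - \mu_0}{s / \sqrt{N}}$, where $\bar{x}$ is the sample average and $\mu_0$ is the test value

• For the sample we find

$t = \frac{1.75 - 1.70}{0.15 / \sqrt{100}} = \frac{0.05}{0.015} = 3.33$

(the denominator $s / \sqrt{N}$ is very small, which is typical for large N)

[Figure: t-distribution with df = N – 1 = 99; the critical values -1.98 and 1.98 leave 2.5% in each tail]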

• Because 3.33 > 1.98, we reject the null hypothesis and

accept the alternative hypothesis that the average is
different from 1.70 m
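The body-length test can be retraced numerically, both as a t-value and as an interval [μ1, μ2] around the test value. This is a sketch only: the critical value 1.98 is the one quoted in the slides for df = 99, and the variable names are chosen for this illustration.

```python
import math

mu_0 = 1.70                             # test value (claimed average length, m)
sample_mean, s, n = 1.75, 0.15, 100     # sample results from the slides
t_crit = 1.98                           # two-tailed critical value for df = 99 (from the slides)

std_error = s / math.sqrt(n)                 # standard deviation of sample averages
t_value = (sample_mean - mu_0) / std_error   # standardized distance from the test value

# Equivalent check: 95% interval around the test value.
mu_1 = mu_0 - t_crit * std_error
mu_2 = mu_0 + t_crit * std_error

reject_h0 = abs(t_value) > t_crit
print(f"t = {t_value:.2f}, interval around test value: [{mu_1:.4f}, {mu_2:.4f}]")
print("reject H0" if reject_h0 else "do not reject H0")
```

Both checks agree: the t-value exceeds the critical value, and the sample average of 1.75 m lies outside the interval around 1.70 m.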
Choosing the alternative hypothesis
• In the example we tested whether the average is
different from 1.70 m

• We could also test whether the average length is

larger than 1.70 m (instead of just different)

• This makes sense if we are interested in that particular

question

• Then the alternative hypothesis is:

• The average length is larger than 1.70 m

• Does this make a difference for the test?

• Because we now test whether it is larger we look at
one side

[Figure: t-distribution with df = N – 1 = 99; the critical value 1.66 has 95% of the distribution below it and 5% in the right tail]

• Because 3.33 > 1.66, we reject the null hypothesis

and accept the alternative hypothesis that the actual
average is larger than 1.70 m
Summary (α = 5%) – we have three
possibilities
• H1: μ < μ0, called a one-tailed test (5% in the left tail)
• H1: μ > μ0, called a one-tailed test (5% in the right tail)
• H1: μ ≠ μ0, called a two-tailed test (2.5% in each tail)
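The three possibilities differ only in which tail, or tails, the critical value guards. A minimal sketch follows; the function `reject_null` and its labels "less", "greater" and "two-sided" are hypothetical names. The default critical values are the ones quoted in the slides for df = 99 and alpha = 5%.

```python
def reject_null(t_value, alternative, t_two_tailed=1.98, t_one_tailed=1.66):
    """Decide whether to reject H0 for one of the three alternative hypotheses.

    Default critical values are for df = 99 and alpha = 5% (taken from the slides).
    """
    if alternative == "less":        # H1: average is smaller than the test value
        return t_value < -t_one_tailed
    if alternative == "greater":     # H1: average is larger than the test value
        return t_value > t_one_tailed
    if alternative == "two-sided":   # H1: average differs from the test value
        return abs(t_value) > t_two_tailed
    raise ValueError(f"unknown alternative: {alternative!r}")

# Body-length example: t = 3.33 with df = 99.
decisions = {alt: reject_null(3.33, alt) for alt in ("less", "greater", "two-sided")}
print(decisions)
```

For the body-length example, H0 is rejected against "greater" and "two-sided", but not against "less", because the sample average lies above the test value.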

Another way the test can be
performed - p-value
• The p-value is the probability of finding a t-value at least as extreme
as the observed one while the null hypothesis is true

• What is the probability that we find 1.75 m or

anything as strongly deviating from 1.70 m?

[Figure: distribution of sample means centred at μ0 = 1.70, with the tails below 1.65 and above 1.75 marked: what is their probability?]

The t-value is t = 3.33
The degrees of freedom is
df = 99

The corresponding p-value
is p = 0.0012

P-value calculators are available on the internet, for
example https://www.graphpad.com/quickcalcs/pvalue1.cfm

In this case, we did a two-tailed test. Does it make a
difference if we do a one-tailed test instead
(when the alternative hypothesis H1 says the average is larger instead
of just different)?
Yes, that makes a difference

Two-tailed: both tails count (below 1.65 and above 1.75), each with probability p1
The p-value is
p = p1 + p1 = 0.0012

One-tailed: only the tail above 1.75 counts, with probability p1
The p-value is
p = p1 = 0.0006

So, in a one-tailed test the p-value is half as large!
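The two p-values can be reproduced without an online calculator. This sketch integrates the Student t density numerically with Simpson's rule, because Python's standard library has no t-distribution. The helper names `t_pdf` and `t_upper_tail` are made up for this illustration; in practice a statistics library would normally be used.

```python
import math

def t_pdf(t: float, df: int) -> float:
    """Density of the Student t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0: one half minus the area between 0 and t (Simpson's rule)."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_value = (1.75 - 1.70) / (0.15 / math.sqrt(100))   # about 3.33, as in the slides
p_one_tailed = t_upper_tail(t_value, df=99)          # one tail only
p_two_tailed = 2 * p_one_tailed                      # both tails, by symmetry

print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
```

The doubling in the last step works because the t-distribution is symmetric around 0, so both tails have the same probability p1.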

Student t-test
Another example: satisfaction measurement

• In a survey, we ask a sample of N =50 visitors to indicate their

satisfaction on a 5 point scale (1 = very dissatisfied, 5 = very
satisfied)

• Null hypothesis: visitors are neutral (i.e., neither satisfied nor

dissatisfied): μ0 = 3.0

• Alternative hypothesis: visitors are not neutral: μ ≠ 3.0

• We find a mean score of x̄ = 3.45 with standard deviation s =

1.51, so the estimate of the population mean is 3.45

• Can we conclude that on average visitors are not neutral?

Student t-test SPSS
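Example satisfaction measurement
One-Sample Test (Test Value = 3.0, the value under H0)

                                                                        95% Confidence Interval
                                                                        of the Difference
                              t       df    Sig. (2-tailed)   Mean Difference   Lower    Upper
Average Satisfaction score    2.107   49    .0403             0.45              0.025    0.875

• df: degrees of freedom, N - 1
• Sig. (2-tailed): p-value of this t-value (or a more extreme one) if H0 is true
• If 0 lies in the confidence interval of the difference, then H0 is not rejected

Here 0 lies outside [0.025, 0.875] and p = .0403 is smaller than 5%, so H0 is rejected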
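The t-value and the mean difference in the SPSS output can be retraced from the summary numbers on the previous slide. A short sketch, with variable names chosen for this illustration:

```python
import math

n = 50                 # number of visitors in the sample
sample_mean = 3.45     # mean satisfaction score
s = 1.51               # sample standard deviation
test_value = 3.0       # neutral score under H0

df = n - 1
mean_difference = sample_mean - test_value
std_error = s / math.sqrt(n)            # standard deviation of sample averages
t_value = mean_difference / std_error

print(f"t = {t_value:.3f}, df = {df}, mean difference = {mean_difference:.2f}")
```

This reproduces the t, df and mean difference columns of the table. The p-value and the interval bounds additionally require the t-distribution with 49 degrees of freedom.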
Critical t-values

[Table not recovered: critical t-values for alpha = 5%, including the values for large samples]
Summary of important concepts

• Student t-distribution: population variance is estimated

→ more uncertainty compared to z-distribution

• Confidence interval for an estimate of population

average based on a sample

• One-sample t-test:
• Is the average different from a known average?

• One-tailed or two-tailed testing depends on the

formulation of the alternative hypothesis
