
Introduction Statistics II

Simone Tonin

1/29
Estimation

2/29
Estimation

The values of population parameters are often unknown.

We use a representative sample of the population to estimate the
population parameters.
There are two types of estimation:
Point Estimation
Interval Estimation

3/29
Point estimation

A point estimate is a single numerical value used to estimate the
corresponding population parameter. A point estimate is obtained by
selecting a suitable statistic (a suitable function of the data) and
computing its value from the given sample data. The selected statistic
is called the point estimator.

The point estimator is a random variable, so it has a distribution,
mean, variance etc.

e.g. the sample mean X̄ = (1/n) Σⁿᵢ₌₁ Xᵢ is one possible point
estimator of the population mean µ.
The point estimate is x̄ = (1/n) Σⁿᵢ₌₁ xᵢ.

4/29
Point estimation: properties

Let θ be the unknown population parameter and θ̂ be its estimator. The
parameter space is denoted by Θ.

An estimator θ̂ is called an unbiased estimator of θ if E(θ̂) = θ.

The bias of the estimator θ̂ is defined as Bias(θ̂) = E(θ̂) − θ.

5/29
Point estimation: properties

The Mean Square Error (MSE) is a measure of how close θ̂ is, on
average, to the true θ:

MSE = E[(θ̂ − θ)²] = Var(θ̂) + [Bias(θ̂)]²
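As a quick check of this identity, here is a minimal Python sketch (the
values µ, σ, n are illustrative, not from the slides) that estimates the
MSE of the sample mean by simulation and compares it with
Var(θ̂) + [Bias(θ̂)]².

```python
# Minimal Monte Carlo sketch: the empirical MSE of the sample mean X-bar,
# as an estimator of mu, should match Var(theta-hat) + Bias(theta-hat)^2.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 25, 100_000   # illustrative values

# Draw many samples and compute the estimator on each one.
estimates = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

mse = np.mean((estimates - mu) ** 2)
decomposition = np.var(estimates) + (np.mean(estimates) - mu) ** 2
print(mse, decomposition)   # both approximately sigma^2 / n = 0.16
```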

6/29
Interval estimation

An interval estimate (confidence interval) is an interval, or range of
values, used to estimate a population parameter.
This interval will contain the population parameter with probability
1 − α, i.e., P(A < θ < B) = 1 − α.
The level of confidence 100(1 − α)% is the probability that the
interval estimate contains the population parameter.
Interval estimate components:

point estimate ± (critical value × standard error)

7/29
Confidence intervals for the population mean

When sampling is from a normal distribution with known variance σ²,
a 100(1 − α)% confidence interval for the population mean µ is

x̄ ± zα/2 (σ/√n)

where zα/2 can be obtained from the standard normal distribution table.

100(1 − α)%    α      zα/2
90%            0.10   1.645
95%            0.05   1.96
99%            0.01   2.58

If σ is unknown and n ≥ 30, the sample standard deviation

s = √( Σⁿᵢ₌₁ (xᵢ − x̄)² / (n − 1) )

can be used in place of σ.
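A minimal Python sketch of this interval, assuming scipy is available;
the helper name ci_mean_z and the numbers passed to it are illustrative,
not part of the slides. The first print also reproduces the zα/2 values
in the table above.

```python
# Sketch of the z-interval x-bar ± z_{alpha/2} * sigma / sqrt(n).
from math import sqrt
from scipy.stats import norm

def ci_mean_z(xbar, sigma, n, conf=0.95):
    z = norm.ppf(1 - (1 - conf) / 2)   # z_{alpha/2}, e.g. 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# z_{alpha/2} for 90%, 95%, 99% confidence (compare with the table above)
print([round(norm.ppf(1 - a / 2), 3) for a in (0.10, 0.05, 0.01)])
print(ci_mean_z(xbar=50.0, sigma=8.0, n=36, conf=0.95))   # roughly (47.4, 52.6)
```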
8/29
Confidence intervals for the population mean

9/29
Confidence intervals for the population mean

If sampling is from a non-normal distribution and n ≥ 30, then the
sampling distribution of x̄ is approximately normal (central limit
theorem) and we can use the same formula, x̄ ± zα/2 (σ/√n), to
construct an approximate confidence interval for the population mean.

10/29
Confidence intervals for the population mean

When sampling is from a normal distribution whose standard deviation σ
is unknown and the sample size is small, the 100(1 − α)% confidence
interval for the population mean µ is

x̄ ± tα/2 (s/√n)

where tα/2 can be obtained from the t distribution table with
df = n − 1 and s is the sample standard deviation, given by

s = √( Σⁿᵢ₌₁ (xᵢ − x̄)² / (n − 1) )

If σ is unknown, and we have neither a normal population nor a large
sample, then we should use nonparametric statistics (not covered in
this course).
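A short Python sketch of the t-interval, assuming scipy is available;
the data set is made up for illustration.

```python
# Sketch of the small-sample t-interval x-bar ± t_{alpha/2} * s / sqrt(n).
import numpy as np
from scipy import stats

data = np.array([4.1, 5.2, 3.8, 4.9, 5.5, 4.4, 4.7, 5.0])   # hypothetical sample
n = len(data)
xbar, s = data.mean(), data.std(ddof=1)          # s uses the n - 1 denominator
t_crit = stats.t.ppf(0.975, df=n - 1)            # t_{alpha/2} for a 95% interval
margin = t_crit * s / np.sqrt(n)
print(xbar - margin, xbar + margin)

# scipy can produce the same interval directly:
print(stats.t.interval(0.95, df=n - 1, loc=xbar, scale=stats.sem(data)))
```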

11/29
Interpreting confidence intervals

Probabilistic interpretation: In repeated sampling from some
population, 100(1 − α)% of all the intervals we construct will, in the
long run, include the population parameter.

Practical interpretation: When sampling is from some population, we
have 100(1 − α)% confidence that the single computed interval contains
the population parameter.

12/29
Confidence interval for a population proportion

The 100(1 − α)% confidence interval for a population proportion π is
given by

π̂ ± zα/2 √( π̂(1 − π̂)/n )

where π̂ is the sample proportion.
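A short Python sketch of this proportion interval, assuming scipy is
available; the counts used are hypothetical.

```python
# Sketch of the large-sample interval pi-hat ± z_{alpha/2} * sqrt(pi-hat(1 - pi-hat)/n).
from math import sqrt
from scipy.stats import norm

successes, n, conf = 420, 1000, 0.95   # hypothetical survey counts
p_hat = successes / n                  # sample proportion
z = norm.ppf(1 - (1 - conf) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - margin, p_hat + margin)  # roughly 0.389 to 0.451
```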

13/29
Example 15

Suppose an Italian car rental firm wants to estimate the average
number of kilometres travelled per day by each of its cars rented in
Florence. A random sample of 20 cars rented in Florence reveals that
the sample mean travel distance per day is 85.5 kilometres, with a
population standard deviation of 19.3 kilometres. Compute a 99%
confidence interval to estimate µ.

For a 99% level of confidence, a z value of 2.575 is obtained (from
the standard normal table). Assume that the number of kilometres
travelled per day is normally distributed.

x̄ ± zα/2 (σ/√n)
85.5 ± 2.575 (19.3/√20)
85.5 ± 11.1

thus 74.4 ≤ µ ≤ 96.6
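For reference, a few lines of Python (assuming scipy) that reproduce
this calculation with the same numbers as on the slide:

```python
# Reproducing Example 15: 99% z-interval for the mean daily distance.
from math import sqrt
from scipy.stats import norm

xbar, sigma, n = 85.5, 19.3, 20
z = norm.ppf(0.995)                   # about 2.576 for a 99% interval
margin = z * sigma / sqrt(n)          # about 11.1 km
print(xbar - margin, xbar + margin)   # roughly 74.4 to 96.6
```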
14/29
Hypothesis testing

15/29
Motivation

We often encounter statements or claims like these:

A newspaper claims that the average starting salary of MBA graduates
is over £50K. (one-sample test)

A claim about the effectiveness of a particular diet program: the
average weight after the program is less than the average weight
before the program. (two paired-samples test)

On average, female managers earn less than male managers, given that
they have the same qualifications and skills. (two independent-samples
test)

So we have claims about populations’ means (averages) and we would
like to verify or examine these claims.

This is the kind of problem that hypothesis testing is designed to
solve.
16/29
The nature of hypothesis testing

We often use inferential statistics to make decisions or judgments
about the value of a parameter, such as a population mean.
Typically, a hypothesis test involves two hypotheses:
1 Null hypothesis: a hypothesis to be tested, denoted by H0.
2 Alternative hypothesis (or research hypothesis): a hypothesis to be
considered as an alternative to the null hypothesis, denoted by H1 or Ha.
The problem in a hypothesis test is to decide whether or not the null
hypothesis should be rejected in favour of the alternative hypothesis.
The choice of the alternative hypothesis should reflect the purpose of
performing the hypothesis test.

17/29
The nature of hypothesis testing

How do we decide whether or not to reject the null hypothesis in
favour of the alternative hypothesis?
Very roughly, the procedure for deciding is the following:
Take a random sample from the population.
If the sample data are consistent with the null hypothesis, then do
not reject the null hypothesis; if the sample data are inconsistent
with the null hypothesis, then reject the null hypothesis and conclude
that the alternative hypothesis is true.
Test statistic: the statistic used as a basis for deciding whether the
null hypothesis should be rejected.
The test statistic is a random variable, which therefore has a
sampling distribution with a mean and a standard deviation (the
so-called standard error).

18/29
Type I and Type II Errors

Type I error: rejecting the null hypothesis when it is in fact true.
Type II error: not rejecting the null hypothesis when it is in fact
false.
The significance level, α, of a hypothesis test is defined as the
probability of making a Type I error, that is, the probability of
rejecting a true null hypothesis.
19/29
Type I and Type II Errors

Relation between Type I and Type II error probabilities:
For a fixed sample size, the smaller the Type I error probability, α,
of rejecting a true null hypothesis, the larger the Type II error
probability of not rejecting a false null hypothesis, and vice versa.
Possible conclusions for a hypothesis test:
We use the terms reject and fail to reject for the possible decisions
about a null hypothesis.
You should keep in mind that failing to reject the null hypothesis
leads to much greater uncertainty because we do not know the
probability of a Type II error. (It is better to say “do not reject”
than “accept”.)
When the null hypothesis is rejected in a hypothesis test performed at
the significance level α, we say that the results are statistically
significant at level α.
20/29
Hypothesis tests for one population mean

In order to test the hypothesis that the population mean µ is equal to
a particular value µ0, we are going to test the null hypothesis

H0 : µ = µ0

against one of the following alternatives:

H1 : µ ≠ µ0 (Two-tailed)
H1 : µ < µ0 (Left-tailed)
H1 : µ > µ0 (Right-tailed)

21/29
Hypothesis tests for one population mean

In order to test H0, we need to use one of the following test
statistics; we should choose the one that satisfies the assumptions.

If σ is known, and we have a normally distributed population or a
large sample (n ≥ 30), then the test statistic (the so-called z-test)
is

z = (x̄ − µ0) / (σ/√n)

where σ is the standard deviation of the population.

If σ is unknown, and we have a normally distributed population or a
large sample (n ≥ 30), then the test statistic (the so-called t-test)
is

t = (x̄ − µ0) / (s/√n)    with df = n − 1,

where s is the standard deviation of the sample.
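A minimal Python sketch of the two test statistics; the helper names
and the summary numbers passed to them are illustrative, not from the
slides.

```python
# Sketch of the z and t test statistics for H0: mu = mu0.
from math import sqrt

def z_stat(xbar, mu0, sigma, n):
    # z-test: sigma known, normal population or n >= 30
    return (xbar - mu0) / (sigma / sqrt(n))

def t_stat(xbar, mu0, s, n):
    # t-test: sigma unknown, estimated by the sample standard deviation s;
    # compare with the t table at df = n - 1
    return (xbar - mu0) / (s / sqrt(n))

print(z_stat(xbar=52.1, mu0=50, sigma=6.0, n=36))   # about 2.1
print(t_stat(xbar=52.1, mu0=50, s=6.0, n=16))       # about 1.4
```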
22/29
Critical-value approach to hypothesis testing

For any specific significance level α, one can obtain the critical
values ±zα/2 and ±zα from the standard normal table.

z0.10    z0.05    z0.025   z0.01    z0.005
1.282    1.645    1.960    2.326    2.576

If the value of the test statistic falls in the rejection region,
reject H0; otherwise do not reject H0.
23/29
Critical-value approach to hypothesis testing

For any specific significance level α, one can obtain the critical
values ±tα/2 and ±tα from the t distribution table. For example, for
df = 9 and α = .05, the critical values are ±t0.025 = ±2.262 and
±t0.05 = ±1.833.
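For reference, a short Python sketch (assuming scipy) reproducing the
critical values quoted on the last two slides:

```python
# Critical values via scipy's inverse-CDF (ppf) functions.
from scipy.stats import norm, t

# Standard normal: z_0.10, z_0.05, z_0.025, z_0.01, z_0.005
print([round(norm.ppf(1 - a), 3) for a in (0.10, 0.05, 0.025, 0.01, 0.005)])
# -> [1.282, 1.645, 1.96, 2.326, 2.576]

# t distribution with df = 9: t_0.025 and t_0.05
print(round(t.ppf(1 - 0.025, df=9), 3), round(t.ppf(1 - 0.05, df=9), 3))
# -> 2.262 1.833
```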

24/29
The p-value approach to hypothesis testing

The p-value is the smallest significance level at which the null
hypothesis would be rejected. The p-value is also known as the
observed significance level.

The p-value measures how well the observed sample agrees with the null
hypothesis. A small p-value (close to zero) indicates that the sample
is not consistent with the null hypothesis and the null hypothesis
should be rejected. On the other hand, a large p-value (larger than
.10) generally indicates a reasonable level of agreement between the
sample and the null hypothesis.

As a rule of thumb, if the p-value ≤ α then reject H0; otherwise do
not reject H0.
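A minimal Python sketch of this rule for a two-tailed t-test; the
observed statistic and degrees of freedom are illustrative.

```python
# p-value rule for a two-tailed t-test: the p-value is the probability,
# under H0, of a statistic at least as extreme as the one observed.
from scipy.stats import t

t_obs, df, alpha = 2.3, 24, 0.05          # illustrative values
p_value = 2 * t.sf(abs(t_obs), df)        # sf = 1 - cdf (upper tail)
print(p_value)                            # about 0.03
print("reject H0" if p_value <= alpha else "do not reject H0")
```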
25/29
Hypothesis testing and confidence intervals

Hypothesis tests and confidence intervals are closely related.
Consider, for instance, a two-tailed hypothesis test for a population
mean at the significance level α. It can be shown that the null
hypothesis will be rejected if and only if the value µ0 given for the
mean in the null hypothesis lies outside the 100(1 − α)% confidence
interval for µ.

Example:
At significance level α = 0.05, we want to test H0 : µ = 40 against
H1 : µ ≠ 40 (so here µ0 = 40).
Suppose that the 95% confidence interval for µ is 35 < µ < 38.
As µ0 = 40 lies outside this confidence interval, we reject H0.

26/29
Test of Normality

One of the assumptions for using the z-test or t-test is that the
population we sampled from is normally distributed. However, we have
not yet tested this assumption; to do so, we should perform a
so-called test of normality:
We can plot our sample data, e.g. a histogram or boxplot,
or use normality tests such as the Kolmogorov-Smirnov test or the
Shapiro-Wilk test (see the sketch below). The null and alternative
hypotheses are
H0 : the population being sampled is normally distributed.
H1 : the population being sampled is not normally distributed.
If σ is unknown, and we have neither a normal population nor a large
sample, then we should use nonparametric tests instead of the z-test
or t-test (not covered in this course).
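A minimal sketch of a Shapiro-Wilk test with scipy; the sample is
simulated here purely for illustration.

```python
# Shapiro-Wilk normality test: a large p-value gives no evidence against
# H0 (the population being sampled is normally distributed).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=10, scale=2, size=40)   # hypothetical data

stat, p_value = stats.shapiro(sample)
print(stat, p_value)   # typically p > 0.05 for normal data: do not reject H0
```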

27/29
Example 16

Each year, manufacturers perform mileage tests on new car models and
submit the results to the Environmental Protection Agency (EPA). The
EPA then tests the vehicles to determine whether the manufacturers are
correct. In 1992 one company reported that a particular model equipped
with a four-speed manual transmission averaged 29 mpg on the highway.
Suppose the EPA tested 15 of the cars and obtained the following gas
mileages.

27.3  30.9  25.9  31.2  29.7
28.8  29.4  28.5  28.9  31.6
27.8  27.8  28.6  27.3  27.6

What decision would you make regarding the company’s report on the gas
mileage of the car? Perform the required hypothesis test at the 5%
significance level.
28/29
Example 16 (cont.)

The null and alternative hypotheses:

H0 : µ = 29 mpg vs. H1 : µ ≠ 29 mpg

The value of the test statistic is

t = (x̄ − µ0) / (s/√n) = (28.753 − 29) / (1.595/√15) = −0.599

Since the p-value = 0.559 > α = 0.05, we cannot reject H0.

At the 5% significance level, the data do not provide sufficient
evidence to conclude that the company’s report was incorrect.
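For reference, a few lines of Python (assuming scipy) reproducing this
test from the mileage data on the previous slide:

```python
# Reproducing Example 16 with scipy's one-sample t-test.
from scipy import stats

mpg = [27.3, 30.9, 25.9, 31.2, 29.7,
       28.8, 29.4, 28.5, 28.9, 31.6,
       27.8, 27.8, 28.6, 27.3, 27.6]

t_stat, p_value = stats.ttest_1samp(mpg, popmean=29)   # two-sided by default
print(round(t_stat, 3), round(p_value, 3))             # about -0.599 and 0.559
```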

29/29
