3 Module Notes

The document discusses the Central Limit Theorem (CLT) and confidence intervals in the context of estimating population means from sample data. It emphasizes that the sample mean is an unbiased estimator of the population mean and that the distribution of the sample mean approaches normality as sample size increases. Additionally, it explains how to construct confidence intervals for both population means and proportions, highlighting the use of the t-distribution when the population standard deviation is unknown.

MAST 6474 Introduction to Data Analysis I

Central Limit Theorem and Confidence Intervals

Sampling: Using Data to Estimate the Population Mean

Until now, we have been given complete information about each random variable. We have been told
its distribution along with the “true” mean and variance of the random variable, even if some
calculations were required.

In almost every problem that we encounter in business analysis, however, we do not know the
distribution of the random variable(s) of interest. Yet we can draw a sample from the random
variable(s), then use that sample data to learn about the variable’s true expected value or mean. We
will illustrate this approach with an example.

Example: Hail — Dents from Above! (source: Jon Danklefs, SMU MBA Class 45D). In March of
2000, a hailstorm ripped through North Texas and did serious damage to a number of homes and
businesses. Kia Motor’s Midlothian distribution center was particularly hard hit. Nearly 5000 vehicles
were hail-damaged.

The distributor immediately authorized a local dent-removal service to repair up to 200 damaged
vehicles. Each car was fixed panel by panel using a paintless dent removal process, and a detailed
invoice was prepared with documented repair costs. After fixing 180 vehicles to the distributor’s
satisfaction, it was mutually agreed that the remaining cars should be repaired using a fixed rate per
car.

Jon Danklefs was responsible for negotiating the repair rate per car. He already had a sample of 180
cars whose repair costs were known. The repair costs for all of these vehicles are contained in the file
Module 3 Notes Dataset 1 found in your Student Resources folder.

Using information from the sample of 180 repaired vehicles, come up with an estimate of the true
expected cost for damaged vehicles.

Solution: We begin by noting that we know nothing about the distribution of repair costs for the
nearly 5000 cars that were damaged. Taken together, these cars represent the population of
interest. We gather information that will help us learn about the true expected cost by selecting a
random sample of damaged cars. Specifically, we choose cars sequentially (one-by-one) for the
sample, with every car not yet included in the sample having an equal probability of being chosen.
Copyright Edward Fox and John Semple 2019

Assuming that 180 cars were selected this way, we have a random sample “without replacement”
from a finite population. If each one of the damaged vehicles had an equal probability of selection,
even if it was already selected for the sample, we would have a random sample “with replacement.”
When the random sample is small compared to the population (less than 5%), this distinction is not
critical. Assuming that we are drawing a relatively small sample from a virtually infinite (i.e., large)
population simplifies the formulas we use.¹ Choosing 180 damaged cars from a population of nearly
5000 justifies this assumption. It will be our standard assumption throughout the course.

The second part of our solution requires that we convert the repair costs of the sample of damaged
cars into an estimate of repair costs for the population. Assuming that every damaged vehicle has the
same probability of appearing in the sample, we can use the average repair cost of vehicles in the
sample:

x̄ = (x_1 + x_2 + x_3 + ⋯ + x_178 + x_179 + x_180) / 180 = $215.67
The computed value x̄ is called the sample mean; we will use the sample mean to estimate the true mean cost, μcost, of repair for the entire population of nearly 5000 damaged cars.
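As a quick illustration, the sample-mean calculation can be reproduced in a few lines of Python. The repair costs below are made-up placeholder values, not figures from the actual Module 3 Notes Dataset 1:

```python
import statistics

# Hypothetical repair costs (illustrative only; NOT the real 180-car dataset)
repair_costs = [180.00, 250.50, 199.99, 310.25, 145.75, 220.00]

# The sample mean is just the sum divided by the sample size n
x_bar = sum(repair_costs) / len(repair_costs)

# statistics.mean computes the same quantity
assert abs(x_bar - statistics.mean(repair_costs)) < 1e-9
print(f"sample mean = ${x_bar:.2f}")
```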

How good an estimate of μcost is $215.67? Based on the sample mean, the Kia distributor could
refuse to pay a fixed price higher than $215.67 per car. But μcost could actually be higher than
$215.67, in which case $215.67 might be a good deal for Kia. On the other hand, μcost is just as likely
to be lower than $215.67, in which case $215.67 might be a bad deal for Kia. We will never know for
certain how close $215.67 is to μcost because the true cost of repairs for the remaining vehicles will
never be documented. But we can determine, probabilistically, how close we are.

To understand how this is done, imagine we were to draw another random sample of size n = 180 and compute another sample estimate of the true mean repair cost. How likely is this second estimate to be exactly $215.67? What if we took a third random sample of size n = 180? We will think of our original estimate x̄ = $215.67 as a single draw from a random variable that is the average repair cost for any sample of 180 cars. In the language of statistics, we are interested in the distribution of the estimator

X̄ = (X_1 + X_2 + X_3 + ⋯ + X_178 + X_179 + X_180) / 180

Capital X̄ will be used to represent this estimator, a random variable for the mean of a sample of size 180; lower case x̄ will be used to represent an estimate, the computed mean of a particular sample.

¹ More complicated formulas are needed for the case of small finite populations from which samples are drawn without replacement.

It turns out that we know quite a lot about the distribution of the estimator X̄, which we will call the sampling distribution. In fact, we know enough to make precise statements about how close the computed sample mean x̄ = $215.67 is to the true (but unknown) population mean μcost. Specifically, we know that

1. The expected value of X̄ is E(X̄) = μcost. In simple terms, this means the mean of the sampling distribution is exactly the same as the true (unknown) mean of the population, the value that the sample was drawn to estimate. Estimators that have this property are said to be unbiased. On the other hand, this is like knowing that a manufacturing machine makes parts that are "correct on average" without knowing what that average actually is.

2. The distribution of X̄ is approximately Normal with a mean of μcost and a variance of σ²/n, where σ² is the true (but unknown) variance of the population we are sampling from. This is not at all obvious and is a consequence of the Central Limit Theorem (CLT).

We will focus on the second fact. The reason that this fact is such a profound result is that it doesn’t
even depend on the distribution of the underlying population we are sampling from.

The Central Limit Theorem

The Central Limit Theorem (CLT) holds that, if the sample size n is not too small, the distribution of the estimator X̄, the sample mean, is approximately Normal. Note that the live session exercise highlights two other important points: (1) that the sample mean is unbiased and (2) that the dispersion of the sample mean decreases as the sample size n increases.

Let (X_1, X_2, ⋯, X_n) be a random sample from any infinite population with mean μ and variance σ². As n becomes large, the distribution of

X̄ = (X_1 + X_2 + ⋯ + X_n) / n

is approximately Normal with mean μ and variance σ²/n; in our notation, X̄ ∼ N(μ, σ²/n), approximately. The square root of the variance, σ/√n, is known as the standard error of the mean.

Remarkably, the Central Limit Theorem approximation doesn't require the population of the random variable X to have any particular distribution. If the underlying random variable is Normally distributed, however, then the distribution of X̄ is exactly Normal (for any sample size n), and we can dispense with the

word “approximately.” This follows directly from the combination rule for independent Normal random
variables. How big does n need to be for this approximation to be very precise? There is general
agreement that n = 100 is big enough, although (as the exercise showed) n = 30 provides a pretty
good approximation. Nevertheless, the bigger the sample size, the better the approximation.

Using the Central Limit Theorem often means converting X̄ to a standard Normal. The standardized version of the Central Limit Theorem is

(X̄ − μ) / (σ/√n) ∼ N(0, 1), approximately,

for large n.
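To see the theorem in action, the short simulation below (a sketch, not part of the original notes) draws repeated samples from a decidedly non-Normal population, an exponential distribution, and checks that the sample means behave as the CLT predicts:

```python
import random
import statistics

random.seed(42)  # for reproducibility

# Exponential population with mean mu = 1; for an exponential, sigma = mu,
# so the CLT predicts X-bar ~ N(1, 1/n), i.e. a standard error of 1/sqrt(n).
mu, n, trials = 1.0, 100, 2000
means = [statistics.mean(random.expovariate(1 / mu) for _ in range(n))
         for _ in range(trials)]

print(round(statistics.mean(means), 3))   # should be close to mu = 1.0
print(round(statistics.stdev(means), 3))  # should be close to 1/sqrt(100) = 0.1
```

Histogramming `means` would show the familiar bell shape even though the underlying exponential population is heavily skewed.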

General Confidence Interval

In statistics, it is often useful to construct a confidence interval for a population parameter (we will focus on the mean, μ): an interval that, with some stated level of confidence, contains the true population parameter. Unfortunately, even after calculating the confidence interval, we never really know whether the true population parameter is within the interval or not.

Based on the Central Limit Theorem and the standard Normal distribution (Z), the following probability
statement is approximately true for a large enough sample n:

Pr( −z_{α/2} < (X̄ − μ)/(σ/√n) < z_{α/2} ) = 1 − α

where the "cutoff" value z_{α/2} from the standard Normal distribution is chosen so that the area under the curve above z_{α/2} in the upper tail is α/2 and the area under the curve below −z_{α/2} in the lower tail is also α/2.


[Figure: the standard Normal density, with the area above z_{α/2} and the area below −z_{α/2} each exactly α/2.]

Q: How do you compute the value of z_{α/2} needed here?
A: Take the given α and compute either z_{α/2} = −NORM.INV(α/2, 0, 1) or, equivalently, z_{α/2} = NORM.INV(1−α/2, 0, 1).

The equation above can be rewritten as

Pr( X̄ − z_{α/2} · σ/√n < μ < X̄ + z_{α/2} · σ/√n ) = 1 − α

a probability statement about the true population mean, μ. This probability statement shows that the interval X̄ ± z_{α/2} · σ/√n contains the true population mean, μ, 100(1−α)% of the time. If we replace X̄ with an observed sample mean x̄, we get the interval

x̄ ± z_{α/2} · σ/√n

which is called a 100(1−α)% confidence interval for the mean, where confidence is expressed as a percentage. If the underlying population that we draw from is Normally distributed, the confidence interval is said to be an exact 100(1−α)% confidence interval. Otherwise we say it is an approximate 100(1−α)% confidence interval. α is the probability that the true mean falls outside the confidence interval; α is always small, usually .05 or less.
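The algebra above translates directly into code. The sketch below uses Python's standard-library `statistics.NormalDist` (its `inv_cdf` plays the role of Excel's NORM.INV); the numbers plugged in at the bottom, including the population standard deviation σ = 80, are hypothetical:

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """CI for the mean when the population sigma is (unrealistically) known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}; about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)        # z_{alpha/2} * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical numbers: x-bar = $215.67, assumed sigma = $80, n = 180
lo, hi = z_confidence_interval(215.67, 80.0, 180)
print(f"95% CI: (${lo:.2f}, ${hi:.2f})")
```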

In practice, we almost never know the true population standard deviation, σ (or equivalently σ 2). We
address this missing parameter differently when calculating confidence intervals for proportions than
when calculating confidence intervals for other parameters. We will discuss proportions first, then the
more general case.


Confidence Interval for a Population Proportion

We constantly see TV, online, and newspaper polls that report sample statistics about the percentage of people who believe something, suffer from some condition, would vote for some candidate, and so forth. The next time you see a poll on TV, glance at the bottom of the screen. You will typically see the poll's "margin of error" reported; this is the sampling error of the estimate. More specifically, this is the uncertainty about the true population proportion because it is being estimated from a sample, not determined from the entire population.

Suppose you want to determine the percentage of American households that own firearms. If you take a random sample of n people, a certain sample percentage will own guns. How does this sample percentage compare to the true population percentage? Let p denote the true population percentage and let p̂ denote the percentage estimated from the sample. Each individual in the sample represents a random draw from a population with a Bernoulli distribution:

X                      Probability
1 (Own gun)            p
0 (Do not own gun)     1 − p

From our discussion of the Bernoulli distribution, recall that the expected value is p and the variance is p(1−p). Substituting these values into the confidence interval formula, we find that a 100(1−α)% confidence interval for p is

p̂ ± z_{α/2} · √( p(1−p)/n )

The quantity that is added to / subtracted from p̂ is known as the margin of error. Observe the unknown population proportion p under the square root sign, a reflection of the fact that we don't know the true population variance. We replace the population proportion p under the square root sign with the sample proportion p̂. This approximation is acceptable if np̂ ≥ 5 and n(1−p̂) ≥ 5.

Confidence Interval for a Population Proportion

A 100(1−α)% confidence interval for the true population proportion p is given by

p̂ ± z_{α/2} · √( p̂(1−p̂)/n )

where p̂ is the sample proportion.
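A sketch of this calculation in Python (the 320-out-of-1000 poll result below is invented for illustration):

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    p_hat = successes / n
    # The Normal approximation is acceptable only if n*p_hat >= 5 and n*(1 - p_hat) >= 5
    assert n * p_hat >= 5 and n * (1 - p_hat) >= 5, "sample too small"
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)  # the poll's "margin of error"
    return p_hat - margin, p_hat + margin

# Made-up poll: 320 of 1000 sampled households report owning a firearm
lo, hi = proportion_ci(320, 1000)
print(f"95% CI for p: ({lo:.3f}, {hi:.3f})")
```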


Confidence Intervals for a Population Mean (Not a Proportion)

When not dealing with proportions, we use a more general approximation for the unknown population standard deviation σ (or equivalently, the population variance σ²). Given a random sample of size n from the population, (X_1, X_2, ⋯, X_n), we will use the sample variance as an estimate for the population variance. The sample variance is denoted s² and given by the formula

s² = (1/(n−1)) · Σ_{i=1}^{n} (x_i − x̄)²

Note that the sample variance calculation divides the sum of squared differences from the sample mean by n − 1; it is therefore not the average squared distance from the sample mean, x̄.² When the sample size is small (e.g., n = 3, 4, or 5), the result is very different from the average; however, when the sample size is large (e.g., n > 100), the result is very close to the average. The sample variance can be computed in Excel with the function VAR.S(number1, number2, …). The sample standard deviation is denoted by s and can be computed with the function STDEV.S(number1, number2, …). Of course, s can also be calculated by taking the square root of the sample variance.
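The n − 1 divisor is easy to verify in code. This sketch mirrors Excel's VAR.S / VAR.P pair using Python's standard library:

```python
import statistics

data = [4.0, 7.0, 13.0, 16.0]   # tiny illustrative sample
n = len(data)
x_bar = sum(data) / n            # sample mean = 10.0

# Sample variance: divide the sum of squared deviations by n - 1 (like VAR.S)
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)
assert s2 == statistics.variance(data)   # 90 / 3 = 30.0

# Dividing by n instead gives the population variance (like VAR.P)
assert statistics.pvariance(data) == s2 * (n - 1) / n   # 90 / 4 = 22.5

# Sample standard deviation (like STDEV.S) is the square root of s2
assert abs(statistics.stdev(data) - s2 ** 0.5) < 1e-12
```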

t distribution

Remember that our initial expression for the confidence interval assumed that the population
standard deviation σ is known. Usually, however, σ is unknown. It is tempting to simply insert the
sample standard deviation s in its place. We can do this, but it requires us to use a variation of the
standard Normal distribution called Student’s t, or just the t distribution.

If a random sample of size n is drawn from a Normal distribution, then

(X̄ − μ) / (s/√n) follows a t distribution with n − 1 degrees of freedom (df).

Note that the denominator includes an s instead of a σ . “Degrees of freedom,” or df, does not need to
be estimated—it depends on the sample size n. The t distribution is symmetric about 0 and looks a lot
like the standard Normal distribution (especially as the degrees of freedom become large) as shown
in the applet: http://www.stat.uiowa.edu/~mbognar/applets/t.html.

² Dividing by n − 1 rather than n is known as Bessel's correction.


Confidence Interval using the t Distribution

If we continue to assume that the sampling distribution is Normal (or approximately Normal), then we can construct a 100(1−α)% confidence interval for the mean using the following formula:

x̄ ± t_{α/2,df} · s/√n

Recall that the right-most quantity in this formula, s/√n (equivalently written √(s²/n)), is known as the standard error; we will use this term frequently throughout the remainder of the course. The expression on the right side of the ± sign, t_{α/2,df} · s/√n, is known as the margin of error.
We can use Excel to calculate confidence intervals. The function CONFIDENCE.T(α, s, n) computes the margin of error, where 100(1 − α)% is the confidence level, s is the sample standard deviation, and n is the sample size. To calculate the confidence interval, we simply add/subtract the margin of error to/from the sample mean.
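In code, the only new ingredient is the t critical value t_{α/2,df}. Real work would use Excel's CONFIDENCE.T (or a statistics library such as scipy.stats.t); the sketch below instead builds the quantile from scratch by numerically integrating the t density and bisecting, just to make the mechanics visible. The numbers at the bottom (x̄ = 100, s = 10, n = 81) are hypothetical:

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1.0 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=4000):
    """CDF via trapezoidal integration from 0 to |x|, using symmetry about 0."""
    if x < 0:
        return 1.0 - t_cdf(-x, df, steps)
    h = x / steps
    area = sum(0.5 * h * (t_pdf(i * h, df) + t_pdf((i + 1) * h, df))
               for i in range(steps))
    return 0.5 + area

def t_quantile(p, df):
    """Bisection for the value q with t_cdf(q, df) = p (requires p >= 0.5)."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(x_bar, s, n, confidence=0.95):
    t = t_quantile(1 - (1 - confidence) / 2, n - 1)  # t_{alpha/2, df}
    margin = t * s / math.sqrt(n)  # this margin is what CONFIDENCE.T returns
    return x_bar - margin, x_bar + margin

# Hypothetical numbers: x-bar = 100, s = 10, n = 81 (so df = 80)
lo, hi = t_confidence_interval(100.0, 10.0, 81)
```

Notice that t_{0.025,80} comes out near 1.99, slightly wider than the z value of 1.96; the t distribution's fatter tails are the price paid for estimating σ with s.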

Example: How Rough Is Right? (Brent Pope, SMU MBA Class 46P). Airco has inspected 81
armature hubs and measured their roughness. The data is provided in the file Module 3 Notes
Dataset. Calculate a 95% confidence interval for the mean roughness. Calculate a 98% confidence
interval for the mean roughness.

