Statistics 1 - Session 12 - Sampling Distribution, Estimation and Confidence Interval

This document discusses sampling methods and the central limit theorem. It explains why sampling a population is necessary due to constraints like time, cost and feasibility. It describes different probability sampling methods like simple random sampling, systematic random sampling, stratified random sampling and cluster sampling. It defines the sampling distribution of the sample mean and explains how according to the central limit theorem, the sampling distribution of means from any population will be approximately normally distributed regardless of the population's distribution. It provides examples to illustrate concepts like finding the population mean, constructing the sampling distribution of the sample mean, and using it to determine probabilities.

Uploaded by

Dwi Suprapti
Sampling Methods

and
the Central Limit Theorem
STATISTICS 1
Online Business
Even Semester 2022/2023
GOALS
• Explain why a sample is the only feasible way to learn about
a population.
• Describe methods to select a sample.
• Define and construct a sampling distribution of the sample
mean.
• Explain the central limit theorem.
• Use the Central Limit Theorem to find probabilities of
selecting possible sample means from a specified
population.
Why Sample the Population?
• The physical impossibility of checking all items in the
population.
• The cost of studying all the items in a population.
• The sample results are usually adequate.
• Contacting the whole population would often be time-
consuming.
• The destructive nature of certain tests.
Probability Sampling
• A probability sample is a sample selected such that
each item or person in the population being studied
has a known likelihood of being included in the
sample.
Methods of Probability Sampling
• Simple Random Sample: A sample formulated so that each
item or person in the population has the same chance of
being included.
• Systematic Random Sampling: The items or individuals of the
population are arranged in some order. A random starting
point is selected and then every kth member of the
population is selected for the sample.
• Stratified Random Sampling: A population is first divided into
subgroups, called strata, and a sample is selected from each
stratum.
• Cluster Sampling: A population is first divided into primary
units then samples are selected from the primary units.
Methods of Probability Sampling
• In a nonprobability sample, inclusion in the sample is based on
the judgment of the person selecting the sample.
• The sampling error is the difference between a sample
statistic and its corresponding population parameter.
Methods of Probability Sampling
• In a probability sample, all members of the population have a known
chance of being selected for the sample.
• There are several probability sampling methods.
• In a simple random sample, all members of the population have the
same chance of being selected for the sample.
• In a systematic sample, a random starting point is selected, and then
every kth item is selected for the sample.
• In a stratified sample, the population is divided into several groups, or
strata, and then a sample is selected from each stratum.
• In cluster sampling, the population is divided into primary units, and
then samples are drawn from the primary units.
• In nonprobability sampling, inclusion in the sample is based on the
judgment of the person selecting the sample. Nonprobability
samples may lead to biased results.
SIMPLE RANDOM SAMPLING
• Suppose a population consists of 845 employees of Nitra
Industries. A sample of 52 employees is to be selected from
the population.
• One way of ensuring that every employee has a chance of
being chosen is to first write the name of each one on a small
slip of paper and deposit all of the slips in a box. After they
have been thoroughly mixed, the first selection is made by
drawing a slip out of the box without looking at it. This
process is repeated until the sample of 52 is chosen.
• A more convenient method of selecting a random sample is to
use the identification number of each employee and a table of
random numbers.
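As a rough sketch of the same idea in software, Python's standard `random.sample` can stand in for the box of slips or the table of random numbers. The numbering of employees from 1 to 845 is an assumption made here for illustration, and the seed is fixed only so the run is repeatable.

```python
import random

POPULATION_SIZE = 845   # employees of Nitra Industries
SAMPLE_SIZE = 52

# Hypothetical ID scheme: employees are numbered 1..845.
employee_ids = range(1, POPULATION_SIZE + 1)

rng = random.Random(12)  # fixed seed, only for reproducibility

# Draw 52 distinct IDs; every employee is equally likely to be included.
sample_ids = sorted(rng.sample(employee_ids, SAMPLE_SIZE))

print(len(sample_ids), "employees selected; first five IDs:", sample_ids[:5])
```

Because the draw is made without replacement, no employee can appear twice in the sample.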
SIMPLE RANDOM SAMPLING
(cont.)
50525 57454 28455 68226 34656 38884 39018
72507 53380 53827 42486 54465 71819 91199
34986 74297 00144 38676 89967 98869 39744
68851 27305 03759 04723 96108 78489 18910
06738 62879 03910 17350 49169 03850 19801
11448 10734 05837 24397 10420 16712 94496

• First, choose a starting point in the table. Any starting point will
do.
SYSTEMATIC RANDOM
SAMPLING
• Suppose the population of interest consists of 2000 invoices
located in file drawers. Drawing a simple random sample
would first require numbering the invoices from 0000 to
1999. This would be a very time-consuming task.
• Instead, a systematic random sample could be selected by
simply going through the file drawers and selecting every
20th invoice for study.
• The first invoice should be chosen using a random process. If
the 10th invoice were chosen as the starting point, the sample
would consist of the 10th, 30th, 50th, 70th, ... invoices.
• Since the first item is chosen at random, all items have the
same likelihood of being selected for the sample. Thus, it is a
probability sample.
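The invoice example can be sketched in a few lines of Python. The helper name `systematic_sample` is made up for this illustration; it returns the 1-based positions of the selected invoices for a given starting point and interval k.

```python
import random

N_INVOICES = 2000
K = 20  # sampling interval: every 20th invoice


def systematic_sample(population_size, k, start):
    """Return the 1-based positions start, start + k, start + 2k, ..."""
    return list(range(start, population_size + 1, k))


# The case described in the text: the 10th invoice is the starting point.
example = systematic_sample(N_INVOICES, K, start=10)
print(example[:4], "...", len(example), "invoices in total")

# In practice the starting point is drawn at random from the first k items.
random_start = random.Random(7).randint(1, K)
sample = systematic_sample(N_INVOICES, K, random_start)
print("random start:", random_start, "sample size:", len(sample))
```

Whatever the starting point, the sample contains 2000/20 = 100 invoices.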
STRATIFIED RANDOM
SAMPLING
• A population is divided into subgroups, called strata, and a
sample is selected from each stratum.
• As the name implies, a proportional sampling procedure
requires that the number of items in each stratum be in the
same proportion as in the population.
• Suppose the objective of the study is to determine whether
firms with high returns on equity (a measure of profitability)
spent more of each sales dollar on advertising than firms
with a low return.
• Assume that the 352 firms were divided into five strata and
that 50 firms are to be selected for intensive study.
STRATIFIED RANDOM
SAMPLING
Stratum  Profitability  Number of firms  Number sampled
1        30% or more      8                8/352 x 50 =  1
2        20% – 30%       35               35/352 x 50 =  5
3        10% – 20%      189              189/352 x 50 = 27
4        0 – 10%        115              115/352 x 50 = 16
5        Deficit          5                5/352 x 50 =  1
Total                   352                             50
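The proportional allocation in the table can be checked with a short Python sketch. The stratum labels are taken from the table; rounding each share to the nearest whole firm is the rule the table appears to use, and it is worth noting that simple rounding does not always add up to the target sample size, although it does here.

```python
# Number of firms in each profitability stratum (from the table).
strata = {
    "30% or more": 8,
    "20% - 30%": 35,
    "10% - 20%": 189,
    "0 - 10%": 115,
    "Deficit": 5,
}
total_sample = 50
population = sum(strata.values())  # 352 firms

# Each stratum receives the same share of the sample as it has of the population.
allocation = {
    name: round(count / population * total_sample)
    for name, count in strata.items()
}

for name, n_sampled in allocation.items():
    print(f"{name:12s} {n_sampled}")
print("total sampled:", sum(allocation.values()))
```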
CLUSTER SAMPLING
• It is often employed to reduce the cost of sampling a
population scattered over a large geographic area.
• Suppose, you want to determine the views of industrialists in
Texas about state and federal environmental protection
policies.
• Selecting a random sample of industrialists in Texas and
personally contacting each one would be time-consuming and
very expensive. Instead, you could employ cluster sampling by
subdividing the state into small units, either counties or
regions. These are often called primary units.
• Suppose you divided Texas into 12 primary units, then selected
4 regions at random (say regions 2, 7, 4, and 12) and concentrated
your efforts in these primary units. You could then take a random
sample of industrialists in each of these regions and interview them.
SAMPLING “ERROR”
• It is unlikely that the mean of a sample would be identical
to the population mean. Likewise, the sample standard
deviation or other measure computed from a sample would
probably not be exactly equal to the corresponding
population value.
• Therefore, we can expect some difference between a sample
statistic, such as the sample mean or sample standard deviation,
and the corresponding population parameter.
• The difference between a sample statistic and a population
parameter is called sampling error.
EXAMPLE
• Jane and Joe Miley operate the Foxtrot Inn, a bed and
breakfast in Tryon, North Carolina. There are eight rooms
available for rent at this B&B. Listed below is the number of
these eight rooms that was rented each day during June
2015.
• Find the mean of the population.
• Select three random samples of five days.
• Calculate the mean of each sample and compare it to the
population mean.
• What is the sampling error in each case?
EXAMPLE
June Rentals June Rentals June Rentals
1 0 11 3 21 3
2 2 12 4 22 2
3 3 13 4 23 3
4 2 14 4 24 6
5 3 15 7 25 0
6 4 16 0 26 4
7 2 17 5 27 1
8 3 18 3 28 1
9 4 19 6 29 3
10 7 20 2 30 3
EXAMPLE
             Sample 1  Sample 2  Sample 3
             4         3         0
             7         3         0
             4         2         3
             3         3         3
             1         6         3
Total        19        17        9
Sample mean  3.8       3.4       1.8
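The questions posed for the Foxtrot Inn can be worked through with a short Python sketch that uses only the rental counts and the three samples listed above. The sampling error is computed as the sample mean minus the population mean.

```python
# Rooms rented on each of the 30 days of June (the population).
rentals = [
    0, 2, 3, 2, 3, 4, 2, 3, 4, 7,   # June 1-10
    3, 4, 4, 4, 7, 0, 5, 3, 6, 2,   # June 11-20
    3, 2, 3, 6, 0, 4, 1, 1, 3, 3,   # June 21-30
]
population_mean = sum(rentals) / len(rentals)  # 94 / 30

# The three samples of five days shown in the table.
samples = {
    "Sample 1": [4, 7, 4, 3, 1],
    "Sample 2": [3, 3, 2, 3, 6],
    "Sample 3": [0, 0, 3, 3, 3],
}

# Sampling error = sample statistic - population parameter.
sampling_errors = {
    name: sum(values) / len(values) - population_mean
    for name, values in samples.items()
}

print(f"population mean = {population_mean:.2f}")
for name, error in sampling_errors.items():
    print(f"{name}: sampling error = {error:+.2f}")
```

The population mean is about 3.13 rooms per day, so the sampling errors are roughly +0.67, +0.27, and -1.33 rooms.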
EXAMPLE
Sampling Distribution of the
Sample Means
• The sampling distribution of the sample mean is a
probability distribution consisting of all possible
sample means of a given sample size selected from
a population.
Sampling Distribution of the
Sample Means - Example
Tartus Industries has seven production employees (considered the
population). The hourly earnings of each employee are given in the table
below.
1. What is the population mean?
2. What is the sampling distribution of the sample mean for samples of
size 2?
3. What is the mean of the sampling distribution?
4. What observations can be made about the population and the sampling
distribution?
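The earnings table for this example is not reproduced in the text, so the sketch below uses placeholder hourly earnings for seven employees (an assumption, not the Tartus figures). It shows the general procedure: list every possible sample of size 2, compute each sample mean, and tabulate how often each value occurs.

```python
from collections import Counter
from itertools import combinations
from statistics import fmean

# Placeholder hourly earnings in dollars; NOT the values from the Tartus table.
earnings = [7, 7, 8, 8, 7, 8, 9]
population_mean = fmean(earnings)

# Every possible sample of size 2 without replacement: C(7, 2) = 21 samples.
sample_means = [fmean(pair) for pair in combinations(earnings, 2)]

# The sampling distribution: each distinct sample mean with its probability.
counts = Counter(sample_means)
distribution = {m: c / len(sample_means) for m, c in sorted(counts.items())}

for mean_value, probability in distribution.items():
    print(f"sample mean {mean_value:.2f}: probability {probability:.4f}")

print(f"population mean         = {population_mean:.4f}")
print(f"mean of the sample means = {fmean(sample_means):.4f}")
```

Two observations carry over to any population: the mean of the sample means equals the population mean, and the sample means are less spread out than the individual population values.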
Sampling Distribution of the
Sample Means - Example
Central Limit Theorem
• For a population with mean μ and variance σ², the
sampling distribution of the means of all possible samples of
size n drawn from the population will be approximately
normally distributed, and the approximation improves as n increases.
• The mean of the sampling distribution is equal to μ and its
variance is equal to σ²/n.
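A small simulation can make the theorem concrete. The sketch below assumes a deliberately skewed population (exponential with mean 10, chosen here only for illustration), draws many samples of size 40, and compares the mean and variance of the sample means with μ and σ²/n.

```python
import random
import statistics

rng = random.Random(2023)  # fixed seed, only for reproducibility

# Hypothetical skewed population: exponential with mean 10.
population = [rng.expovariate(1 / 10) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)

n = 40  # sample size
sample_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(5_000)
]

print(f"population mean mu        = {mu:.3f}")
print(f"mean of sample means      = {statistics.fmean(sample_means):.3f}")
print(f"sigma^2 / n               = {sigma2 / n:.3f}")
print(f"variance of sample means  = {statistics.variance(sample_means):.3f}")
```

Even though the population is far from normal, a histogram of `sample_means` would look roughly bell-shaped, centered near μ.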
Central Limit Theorem
Using the Sampling
Distribution of the Sample Mean (Sigma
Known)
• If a population follows the normal distribution, the sampling
distribution of the sample mean will also follow the normal
distribution.
• To determine the probability that a sample mean falls within a
particular region, use:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Using the Sampling Distribution of the
Sample Mean (Sigma Known) - Example
• The Quality Assurance Department for Cola, Inc., maintains records
regarding the amount of cola in its Jumbo bottle. The actual amount of
cola in each bottle is critical, but varies a small amount from one bottle
to the next. Cola, Inc., does not wish to underfill the bottles. On the
other hand, it cannot overfill each bottle. Its records indicate that the
amount of cola follows the normal probability distribution. The mean
amount per bottle is 31.2 ounces and the population standard deviation
is 0.4 ounces. At 8 A.M. today the quality technician randomly selected
16 bottles from the filling line. The mean amount of cola contained in
the bottles is 31.38 ounces.
• Is this an unlikely result? Is it likely the process is putting too much soda
in the bottles? To put it another way, is the sampling error of 0.18
ounces unusual?
Using the Sampling Distribution of the
Sample Mean (Sigma Known) - Example
Step 1: Find the z-value corresponding to the sample mean
of 31.38:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{31.38 - 31.20}{0.4 / \sqrt{16}} = 1.80$$
Using the Sampling Distribution of the
Sample Mean (Sigma Known) - Example
Step 2: Find the probability of observing a z-value equal to or
greater than 1.80:
$$P(z \ge 1.80) = 0.5000 - 0.4641 = 0.0359$$
Using the Sampling Distribution of the
Sample Mean (Sigma Known) - Example
• What do we conclude?
• It is unlikely, less than a 4 percent chance, we could select a
sample of 16 observations from a normal population with a
mean of 31.2 ounces and a population standard deviation of
0.4 ounces and find the sample mean equal to or greater
than 31.38 ounces.
• We conclude the process is putting too much cola in the
bottles.
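The two steps of the Cola, Inc. example can be reproduced with Python's standard library. This is a minimal sketch using `statistics.NormalDist` for the standard normal distribution in place of a printed z table.

```python
from math import sqrt
from statistics import NormalDist

mu = 31.2       # population mean, ounces
sigma = 0.4     # population standard deviation, ounces
n = 16          # bottles sampled
x_bar = 31.38   # observed sample mean, ounces

# Step 1: z-value of the sample mean, using the standard error sigma / sqrt(n).
standard_error = sigma / sqrt(n)
z = (x_bar - mu) / standard_error

# Step 2: probability of a sample mean at least this large.
p_upper = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(Z >= z) = {p_upper:.4f}")
```

The probability is about 0.036, which is the "less than 4 percent" chance quoted in the conclusion.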
Using the Sampling
Distribution of the Sample Mean (Sigma
Unknown)
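• If the population does not follow the normal distribution,
but the sample contains at least 30 observations, the sample
means will approximately follow the normal distribution.
• To determine the probability that a sample mean falls within a
particular region when σ is unknown, use the sample standard
deviation s in its place:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$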
Estimation
and
Confidence Intervals
STATISTICS 1
ONLINE BUSINESS
2021
GOALS
• Define a point estimate.
• Define level of confidence.
• Construct a confidence interval for the population mean
when the population standard deviation is known.
• Construct a confidence interval for a population mean when
the population standard deviation is unknown.
• Construct a confidence interval for a population proportion.
• Determine the sample size for attribute and variable
sampling.
Point and Interval Estimates
• A point estimate is the statistic, computed from sample
information, which is used to estimate the population
parameter.
• A confidence interval estimate is a range of values
constructed from sample data so that the population
parameter is likely to occur within that range at a specified
probability. The specified probability is called the level of
confidence.
Factors Affecting Confidence Interval
Estimates
The factors that determine the width of a confidence interval
are:
• The sample size, n.
• The variability in the population, usually σ, estimated by s.
• The desired level of confidence.
Interval Estimates - Interpretation
• For a 95% confidence interval, about 95% of similarly constructed
intervals will contain the parameter being estimated.
• Also, 95% of the sample means for a specified sample size will lie within
1.96 standard deviations of the hypothesized population mean.
Characteristics of the t-distribution
• It is, like the z distribution, a continuous distribution.
• It is, like the z distribution, bell-shaped and symmetrical.
• There is not one t distribution, but rather a family of t
distributions. All t distributions have a mean of 0, but their
standard deviations differ according to the sample size, n.
• The t distribution is more spread out and flatter at the
center than the standard normal distribution.
• As the sample size increases, however, the t distribution
approaches the standard normal distribution.
Comparing the z and t Distributions
when n is small
Confidence Interval Estimates for the
Mean
• Use the z-distribution if the population standard deviation is known or
the sample is greater than 30.
• Use the t-distribution if the population standard deviation is unknown
and the sample is less than 30.
When to Use the z or t Distribution for
Confidence Interval Computation
Confidence Interval for the Mean –
Example using the z-distribution
• A study involves selecting a random sample of 256 sales representatives
under the age of 35. One item of interest is their annual income. The
sample mean is $55,420, and the sample standard deviation is $2,050.
• What is the point estimate of the mean income of all the representatives
under the age of 35?
• What is the 95% confidence interval for the population mean (rounded
to the nearest $10)?
• Because the sample is large, the t value with 255 degrees of freedom
at α/2 = 0.05/2 is 1.969, which is approximately the z value of 1.96.
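The interval for the income example can be worked out as follows. This sketch uses the multiplier 1.96 given above and the usual form of the interval, sample mean plus or minus z times s divided by the square root of n.

```python
from math import sqrt

x_bar = 55_420   # sample mean income, dollars (the point estimate)
s = 2_050        # sample standard deviation, dollars
n = 256          # sample size
z = 1.96         # multiplier for a 95% level of confidence

margin = z * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"point estimate = ${x_bar:,}")
print(f"95% CI         = ${lower:,.2f} to ${upper:,.2f}")
print(f"rounded to $10 = ${round(lower, -1):,.0f} to ${round(upper, -1):,.0f}")
```

Rounded to the nearest $10, the endpoints are $55,170 and $55,670.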
Confidence Interval for the Mean –
Example using the t-distribution
• A tire manufacturer wishes to investigate the tread life of its tires.
• A sample of 10 tires driven 50,000 miles revealed a sample mean of
0.32 inch of tread remaining with a standard deviation of 0.09 inch.
• Construct a 95 percent confidence interval for the population mean.
• Would it be reasonable for the manufacturer to conclude that after
50,000 miles the population mean amount of tread remaining is 0.30
inches?

• The manufacturer can be reasonably sure (95% confident) that the
mean remaining tread depth is between 0.256 and 0.384 inches.
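The tread-life interval can be reproduced with a short sketch. Python's standard library has no t quantile function, so the critical value 2.262 (95% confidence, 10 - 1 = 9 degrees of freedom) is entered by hand from a standard t table.

```python
from math import sqrt

x_bar = 0.32   # sample mean tread remaining, inches
s = 0.09       # sample standard deviation, inches
n = 10         # tires sampled
t = 2.262      # t table value: 95% confidence, n - 1 = 9 degrees of freedom

margin = t * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI: {lower:.3f} to {upper:.3f} inches")
print("0.30 inside interval:", lower <= 0.30 <= upper)
```

Since 0.30 lies inside the interval, it would be reasonable for the manufacturer to conclude that the population mean could be 0.30 inch.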
Student’s t-distribution Table
CONFIDENCE INTERVAL for a
PROPORTION
• The career services director at Southern Technical Institute
reports that 80 percent of its graduates enter the job
market in a position related to their field of study.
• A company representative claims that 45 percent of Burger
King sales are made at the drive-through window.
• A survey of homes in the Chicago area indicated that 85
percent of the new construction had central air
conditioning.
• A recent survey of married men between the ages of 35 and
50 found that 63 percent felt that both partners should earn
a living.
CONFIDENCE INTERVAL for a
PROPORTION
The factors that determine a confidence interval for a
proportion are:
• The number of observations in the sample, n.
• The sample proportion p, computed by dividing the number of
successes in the sample, X, by the number of observations
in the sample, n.
• The level of confidence, which determines the z-value.
CONFIDENCE INTERVAL for a
PROPORTION
• A proportion is a ratio, fraction, or percent that indicates the
part of the sample or population that has a particular
characteristic.
• A sample proportion is found by dividing X, the number of
successes, by n, the number of observations.
• The standard error of the sample proportion reports the
variability in the distribution of sample proportions. It is
found by:
$$\sigma_p = \sqrt{\frac{p(1 - p)}{n}}$$
• The confidence interval for a population proportion is
found by:
$$p \pm z \sqrt{\frac{p(1 - p)}{n}}$$
EXAMPLE
• The union representing the Bottle Blowers of America (BBA) is
considering a proposal to merge with the Teamsters Union. According to
BBA union bylaws, at least three fourths of the union membership must
approve any merger.
• A random sample of 2,000 current BBA members reveals 1,600 plan to
vote for the merger proposal.
• What is the estimate of the population proportion?
• Develop a 95 percent confidence interval for the population proportion.
• Basing your decision on this sample information, can you conclude
that the necessary proportion of BBA members favor the merger? Why?
EXAMPLE
• Calculate the sample proportion:
$$p = \frac{X}{n} = \frac{1{,}600}{2{,}000} = 0.80$$
• We estimate that 80 percent of the population favor the merger proposal.
To determine the 95% confidence interval, use the z value corresponding to
the 95% level of confidence, 1.96:
$$p \pm z \sqrt{\frac{p(1 - p)}{n}} = 0.80 \pm 1.96 \sqrt{\frac{0.80(1 - 0.80)}{2{,}000}} = 0.80 \pm 0.018$$
• The endpoints of the confidence interval are .782 and .818. The lower
endpoint is greater than 0.75. Hence, we conclude that the merger
proposal will likely pass because every value in the interval estimate is
greater than 75 percent of the union membership.
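The BBA calculation can be checked with a few lines of Python. This sketch uses only the figures stated in the example (1,600 of 2,000 members in favor, z = 1.96) and the three-fourths threshold from the bylaws.

```python
from math import sqrt

x = 1_600          # members planning to vote for the merger
n = 2_000          # members sampled
z = 1.96           # multiplier for a 95% level of confidence
required = 0.75    # at least three fourths must approve

p = x / n                                # sample proportion
standard_error = sqrt(p * (1 - p) / n)   # standard error of the proportion
margin = z * standard_error
lower, upper = p - margin, p + margin

print(f"p = {p:.2f}, 95% CI: {lower:.3f} to {upper:.3f}")
print("whole interval above 0.75:", lower > required)
```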
FINITE-POPULATION
CORRECTION FACTOR
• A finite population can be rather small; it could be all the
students registered for this class.
• It can also be very large, such as all senior citizens living in
Florida.
• For a finite population, where the total number of objects is
N and the size of the sample is n, the standard error is adjusted
by a factor called the finite-population correction factor:
$$\text{FPC} = \sqrt{\frac{N - n}{N - 1}}$$

• The usual rule is if the ratio of n/N is less than .05, the
correction factor is ignored.
EXAMPLE
• There are 250 families in Scandia, Pennsylvania. A poll of 40 families
reveals the mean annual church contribution is $450 with a standard
deviation of $75.
• Construct a 90% confidence interval for the mean annual contribution.
• First, note that the population is finite. That is, there is a limit to the
number of people in Scandia.
• Second, note that the sample constitutes more than 5 percent of
the population; that is, n/N = 40/250 = 0.16.
• The endpoints of the confidence interval are $432.03 and $467.97.
It is likely that the population mean falls within this interval.
Finite-Population Correction
Factor
• A population that has a fixed upper bound is said to be
finite.
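• For a finite population, where the total number of objects is
N and the size of the sample is n, the following adjustment
is made to the standard errors of the sample means and the
proportion:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} \qquad \sigma_p = \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$$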
• However, if n/N < .05, the finite-population correction factor
may be ignored.
Effects on FPC when n/N Changes
Observe that FPC approaches 1 when n/N becomes smaller
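The behaviour described here can be tabulated directly. The sketch below assumes a population of N = 1,000 (an arbitrary size picked for illustration) and evaluates the correction factor, the square root of (N - n)/(N - 1), for several sample sizes.

```python
from math import sqrt


def fpc(population_size, sample_size):
    """Finite-population correction factor: sqrt((N - n) / (N - 1))."""
    return sqrt((population_size - sample_size) / (population_size - 1))


N = 1_000  # hypothetical population size
sample_sizes = [10, 25, 50, 100, 200, 500]
factors = [fpc(N, n) for n in sample_sizes]

for n, factor in zip(sample_sizes, factors):
    print(f"n = {n:4d}   n/N = {n / N:.3f}   FPC = {factor:.4f}")
```

For n/N below .05 the factor stays within a few percent of 1, which is why the correction is usually ignored in that range.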
Confidence Interval Formulas for Estimating
Means and Proportions with Finite Population
Correction
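• C.I. for the Mean (µ)
$$\bar{x} \pm t \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$
• C.I. for the Proportion (π)
$$p \pm z \sqrt{\frac{p(1 - p)}{n}} \sqrt{\frac{N - n}{N - 1}}$$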
CI For Mean with FPC - Example
• There are 250 families in Scandia, Pennsylvania. A random sample of 40
of these families revealed the mean annual church contribution was
$450 and the standard deviation of this was $75.
• Develop a 90 percent confidence interval for the population mean.
• Interpret the confidence interval.
• Since n/N = 40/250 = 0.16, the finite-population correction factor must
be used. The population standard deviation is not known, so use
the t-distribution (the z-distribution may also be used since n > 30).
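The Scandia interval can be sketched as below. The multiplier 1.65 is an assumption: it is the rounded z value that reproduces the endpoints of $432.03 and $467.97 quoted for this example. A t table value for 39 degrees of freedom (roughly 1.685) would give a slightly wider interval.

```python
from math import sqrt

N = 250        # families in Scandia
n = 40         # families sampled
x_bar = 450    # sample mean contribution, dollars
s = 75         # sample standard deviation, dollars
multiplier = 1.65   # assumed rounded z value for 90% confidence

# Standard error of the mean, shrunk by the finite-population correction.
fpc = sqrt((N - n) / (N - 1))
standard_error = s / sqrt(n) * fpc

margin = multiplier * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"FPC = {fpc:.4f}")
print(f"90% CI: ${lower:.2f} to ${upper:.2f}")
```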
Selecting a Sample Size
There are 3 factors that determine the size of a sample, none
of which has any direct relationship to the size of the
population. They are:
• The degree of confidence selected.
• The maximum allowable error.
• The variation in the population.
Sample Size Determination for a
Variable
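To find the sample size for a variable:
$$n = \left( \frac{z \sigma}{E} \right)^2$$
where E is the maximum allowable error, z is the z-value for the selected
level of confidence, and σ is the population standard deviation (or an
estimate of it).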
Sample Size Determination for a
Variable-Example
• A student in public administration wants to determine the mean
amount members of city councils in large cities earn per month as
remuneration for being a council member.
• The error in estimating the mean is to be less than $100 with a 95
percent level of confidence.
• The student found a report by the Department of Labor that estimated
the standard deviation to be $1,000.
• What is the required sample size?
Sample Size Determination for a Variable-
Another Example
• A consumer group would like to estimate the mean monthly
electricity charge for a single family house in July within $5
using a 99 percent level of confidence.
• Based on similar studies the standard deviation is estimated
to be $20.00. How large a sample is required?
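Both sample-size examples follow the same pattern: square the quantity z times the standard deviation divided by the allowable error, then round up to the next whole observation. The sketch below assumes table values of 1.96 (95 percent) and 2.58 (99 percent) for z; the function name is made up for this illustration.

```python
from math import ceil


def sample_size_for_mean(z, sigma, max_error):
    """n = (z * sigma / E)^2, rounded up to the next whole observation."""
    return ceil((z * sigma / max_error) ** 2)


# City council remuneration: E = $100, 95% confidence, sigma = $1,000.
n_council = sample_size_for_mean(z=1.96, sigma=1_000, max_error=100)

# Monthly electricity charge: E = $5, 99% confidence, sigma = $20.
n_electricity = sample_size_for_mean(z=2.58, sigma=20, max_error=5)

print("council members needed:", n_council)
print("houses needed:", n_electricity)
```

The results are 385 council members and 107 houses. Rounding is always upward, because rounding down would leave the error slightly above the allowed maximum.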
Sample Size for Proportions
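The formula for determining the sample size in the
case of a proportion is:
$$n = p(1 - p) \left( \frac{z}{E} \right)^2$$
where p is an estimate of the population proportion (use 0.50 if no
estimate is available), z is the z-value for the selected level of
confidence, and E is the maximum allowable error.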
Another Example

• The American Kennel Club wanted to estimate the
proportion of children that have a dog as a pet.
• If the club wanted the estimate to be within 3% of the
population proportion, how many children would they need
to contact?
• Assume a 95% level of confidence and that the club
estimated that 30% of the children have a dog as a pet.
Another Example
• A study needs to estimate the proportion of cities that have
private refuse collectors.
• The investigator wants the margin of error to be within .10
of the population proportion, the desired level of
confidence is 90 percent, and no estimate is available for
the population proportion.
• What is the required sample size?
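The two proportion examples can be sketched as follows, using n = p(1 - p)(z/E)² rounded up. The z values are assumptions taken from a standard normal table (1.96 for 95 percent, 1.65 as a common rounding for 90 percent), and p = 0.50 is used when no estimate is available because it gives the largest, most conservative sample size.

```python
from math import ceil


def sample_size_for_proportion(p, z, max_error):
    """n = p * (1 - p) * (z / E)^2, rounded up to the next whole observation."""
    return ceil(p * (1 - p) * (z / max_error) ** 2)


# Kennel Club: within 3%, 95% confidence, prior estimate p = 0.30.
n_kennel = sample_size_for_proportion(p=0.30, z=1.96, max_error=0.03)

# Refuse collectors: within .10, 90% confidence, no prior estimate.
n_refuse = sample_size_for_proportion(p=0.50, z=1.65, max_error=0.10)

# The answer is sensitive to how z is rounded.
n_refuse_alt = sample_size_for_proportion(p=0.50, z=1.645, max_error=0.10)

print("children to contact:", n_kennel)
print("cities to sample (z = 1.65):", n_refuse)
print("cities to sample (z = 1.645):", n_refuse_alt)
```

With these inputs the club would need 897 children, and the refuse study would need 69 cities with z = 1.65 or 68 with z = 1.645.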
