
CH 4.1 Notes

The document describes a study to estimate the average number of M&Ms per pile for 100 piles. Participants are instructed to first guess the average without looking at the piles, then select and calculate the average of 5 representative piles, and finally calculate the average of 5 randomly selected piles by the calculator. The purpose is to see if random sampling produces a less biased estimate than judgmental sampling. Random sampling is found to be less likely to produce bias by taking into account unknown population characteristics.

Uploaded by

Sasha Tucakov

• Goal: To estimate the average number of m&m’s per pile for the 100 piles


• When I give you the signal, you will have 10 seconds to
look at all the piles of m&m’s and make a guess as to the
average number of m&m’s per pile.
• Input your guess in the chat (stick with whole numbers)
• Go to STAPPLET.com and input your guess.
• Class code EB0D37

• Now select 5 piles that are, in your judgment, representative of the entire population.
• Calculate the average pile size (round to the nearest
whole number) and write down the result in the chat.
• 60sec for this part!
• Go to STAPPLET.com and input your answer.
• Class code C86561

• Now let the calculator choose five piles randomly.


• math → prb → randInt(1,100) → enter

• Calculate the average number of m&m’s for these five piles and type your answer in stapplet
• Use code: 559F06
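For anyone without a calculator, the random-pile step can be sketched in Python. This is only an illustration: the function names and the `pile_sizes` lookup are hypothetical, and unlike repeated calls to randInt(1,100), `random.sample` never returns the same pile twice.

```python
import random

def pick_random_piles(n_piles=100, k=5, seed=None):
    """Return k distinct pile numbers chosen at random from 1..n_piles."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no pile number repeats
    return sorted(rng.sample(range(1, n_piles + 1), k))

def average_pile_size(pile_sizes, chosen):
    """Average count for the chosen piles, rounded with Python's round().

    pile_sizes is a hypothetical dict mapping pile number -> number of m&m's.
    """
    return round(sum(pile_sizes[p] for p in chosen) / len(chosen))

if __name__ == "__main__":
    print(pick_random_piles())
```

Look up the count for each pile number that comes back, then average those five counts.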


Chapter 4.1 – Sample Surveys and Bias
• Big Idea
• How do we uncover the truths about a population when all we have is a sample?

What’s the point of the M&M Activity?
• Random sampling (hopefully) worked where the other methods failed
• Both dotplots are usually centered at the wrong place – that’s bias.
• Only the random samples produced a dotplot centered in the right place, with the sampling variability we expect to always be present.

• Conclusion - Only random selection can combat bias.
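The point of the M&M activity can be checked with a small simulation. Everything here is made up for illustration: the pile sizes are hypothetical, and "judgment" is modeled by one assumed mechanism (larger piles are more likely to catch the eye). Real human judgment may be biased in other ways.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 100 piles, mostly small with a few large ones.
piles = ([rng.choice([1, 2, 3, 4, 5]) for _ in range(80)]
         + [rng.choice([10, 15, 20]) for _ in range(20)])
true_mean = statistics.mean(piles)

def random_sample_mean(k=5):
    """Mean of k piles chosen completely at random (no repeats)."""
    return statistics.mean(rng.sample(piles, k))

def judgment_sample_mean(k=5):
    """Mean of k distinct piles, where bigger piles are more likely to be
    picked. This weighting is an assumed stand-in for human judgment."""
    chosen = set()
    while len(chosen) < k:
        chosen.add(rng.choices(range(len(piles)), weights=piles)[0])
    return statistics.mean(piles[i] for i in chosen)

random_means = [random_sample_mean() for _ in range(2000)]
judgment_means = [judgment_sample_mean() for _ in range(2000)]

if __name__ == "__main__":
    print("true mean:", true_mean)
    print("center of random-sample means:", statistics.mean(random_means))
    print("center of judgment-sample means:", statistics.mean(judgment_means))
```

Under these assumptions the random-sample means center near the true mean, while the judgment-sample means center too high. Both sets of means still vary from sample to sample; only the center differs.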

Point 1 – Why do we need a sample?
• Often times it is difficult, impractical or impossible to do a census of entire populations so we have to settle for samples.
• Just think Gelato
• Opinion polls are examples of sample surveys.

Why do we need a sample?
• By analyzing samples we can make inferences about the population.
[Diagram: Population and Sample]
Bias
• Sampling methods that, by their nature, tend to over- or under-emphasize some characteristics of the population are said to be biased.
• Bias is the bane of sampling and results in unrepresentative or inaccurate estimates: the one thing above all to avoid.

Undercoverage Bias
The homepage of a college website boasts:
“Our overall graduation placement rate is 98.5%, with 91% working in their field of study.”
-southwest.tn.edu (6/9/2020)

Population
Sample → Only Graduates

Within 8 years…
Total: 27% of students had positive outcomes.
Among full-time, first-time degree or certificate-seeking students who entered in 2010/2011. Source: IPEDS (2020)

Undercoverage: When part of the population has a reduced chance of being included in a sample.
Nonresponse Bias
Rogers State University – Oklahoma
In a recent report, Rogers State University found that about 75% of graduates were pursuing another degree or had found full-time employment by their final semester.
How did the University collect this data? Surveys
Population → All Grads
Sample → Survey Respondents
Nonresponse: When individuals chosen for a sample don’t respond.
-Leads to bias if these individuals differ from respondents.

Voluntary Response Bias
Occurs when a sample is composed of volunteers, who may differ from individuals who don’t choose to volunteer.
Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.
Ex: You want to study heart rate during exercise. You recruit volunteers to run a mile and then measure their pulse. The people who actually like to run are the ones who volunteer, so they’re healthier on average than the population → bias

Types of Survey Bias
Question wording bias: When survey questions are confusing or leading.
Ex: “How amazing was your experience with our customer service team?”
Self-reported response bias: The tendency of a person to answer questions on a survey untruthfully.
Ex: How well did you understand “statistics for dummies”?
Ex: How much alcohol do you usually drink per week?

Important:
These categories of bias can often overlap. On an FRQ, if you’re unsure, don’t try to use one of these vocab terms. Instead, just describe the bias, how it arises, and whether it leads to an under- or overestimate.

Point 2 - Randomize
• In order to avoid bias you must sample randomly.
• Randomizing protects us from the influences of all the features of our population, even ones that we may not have thought about.
• Randomizing makes sure that on the average the sample looks like the rest of the population.
• And, as we’ll see later in the course, randomizing is a condition for inference.
• The statistics we compute from the sample must reflect the corresponding parameters accurately.
• A sample that does this is said to be representative.

Point 3: Sample Size
• It’s the size of the sample, not the size of the population, that makes the difference in sampling.
• Gelato!
• Make sure your sample is large enough:
• “We found that 66.67% of the rats improved their performance with training. The other rat died.”
• Exception: The sample cannot be more than 10% of the population when sampling without replacement.

Random Sampling
Simple Random Sample (SRS): a sampling method in which every possible group of individuals of the same size in the population has an equal chance of being selected.
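The Point 3 claim, that sample size matters and population size does not, can be checked with a rough simulation. The populations here are hypothetical (60% of individuals answer "yes"), and the function name is made up for this sketch. Every sample stays within 10% of its population.

```python
import random
import statistics

rng = random.Random(1)

def sd_of_sample_proportions(pop_size, n, p=0.6, reps=300):
    """Draw `reps` simple random samples of size n (without replacement)
    and return the standard deviation of the sample proportions.

    The population is hypothetical: individuals 0..cutoff-1 say "yes".
    """
    cutoff = round(p * pop_size)
    props = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)
        props.append(sum(i < cutoff for i in sample) / n)
    return statistics.stdev(props)

# Keys are (population size, sample size); values are the spread of estimates.
results = {(N, n): sd_of_sample_proportions(N, n)
           for N in (10_000, 1_000_000)
           for n in (100, 1000)}

if __name__ == "__main__":
    for (N, n), sd in sorted(results.items()):
        print(f"population {N:>9,}  sample {n:>5}  spread of estimates {sd:.4f}")
```

In this setup, raising the sample size from 100 to 1,000 shrinks the spread of the estimates noticeably, while making the population 100 times larger with the same sample size barely changes it.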
COVID‐19 and Convenience Sampling
When COVID‐19 spread to NYC, the city only provided tests to people who showed symptoms. Some infected people don’t show symptoms. So, the sampling method led to an underestimate of the number of people infected. It was biased.

Instead: We could have randomly sampled the NYC population, tested those who were sampled, and gotten an unbiased estimate of the number of people infected.

Example inspired by the work of Dartmouth professors Daniel Rockmore and Michael Herron:
https://theconversation.com/want-to-know-how-many-people-have-the-coronavirus-test-randomly-135784
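The underestimate in the COVID‐19 example can be illustrated with a toy simulation. All the numbers are hypothetical (population size, infection rate, share of infected people who show symptoms), and the model is simplified: only infected people show symptoms, and the test is assumed to be perfectly accurate.

```python
import random

rng = random.Random(2)

N = 100_000              # hypothetical population size
INFECTION_RATE = 0.10    # hypothetical share of people infected
SYMPTOMATIC_RATE = 0.50  # hypothetical share of infected people with symptoms

# Each person is a pair: (infected?, shows symptoms?)
population = []
for _ in range(N):
    infected = rng.random() < INFECTION_RATE
    symptomatic = infected and rng.random() < SYMPTOMATIC_RATE
    population.append((infected, symptomatic))

true_infected = sum(inf for inf, _ in population)

# Convenience approach: only symptomatic people get tested, so infected
# people without symptoms are never counted.
convenience_estimate = sum(inf for inf, sym in population if sym)

# SRS approach: test 1,000 randomly chosen people and scale up.
sample = rng.sample(population, 1000)
srs_estimate = N * sum(inf for inf, _ in sample) / 1000

if __name__ == "__main__":
    print("truly infected:      ", true_infected)
    print("symptom-only count:  ", convenience_estimate)
    print("SRS-based estimate:  ", round(srs_estimate))
```

The symptom-only count misses every infected person without symptoms, so it is too low on every run. The SRS estimate varies from sample to sample but is not pushed in one direction.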

COVID‐19 and Sampling

Describe how you would implement a simple random sample (SRS) of 1,000 NYC residents to test for COVID.
1. Assign every individual in NYC an integer 1 – N (where N is the population size of NYC).
2. Use a random number generator to obtain 1,000 integers between 1 – N, skipping repeats.
3. Administer the COVID test to the 1,000 individuals whose numbers were selected.
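The three steps above can be sketched in Python. The function name is hypothetical, and the population size of 8,000,000 in the usage line is only a placeholder for N, not a figure from these notes.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return `sample_size` distinct integers from 1..population_size.

    Mirrors the steps above: people are labeled 1..N, random integers are
    generated, and repeats are skipped until enough labels are collected.
    """
    if sample_size > population_size:
        raise ValueError("sample cannot be larger than the population")
    rng = random.Random(seed)
    selected = set()
    while len(selected) < sample_size:
        # A repeated number is already in the set, so adding it changes nothing.
        selected.add(rng.randint(1, population_size))
    return sorted(selected)

if __name__ == "__main__":
    # 8_000_000 is a placeholder value for N.
    ids = simple_random_sample(8_000_000, 1000)
    print(ids[:10])
```

The individuals whose labels appear in the returned list are the ones who would be tested.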
So how do we go about getting a good sample?
• Simple Random Sample (SRS) ‐ Every possible sample of the size we plan to draw has an equal chance to be selected.
• Stratified Sampling ‐ The population is first sliced into homogeneous groups, called strata, before the sample is selected.
• Reduces variability and highlights differences among groups.
• Cluster Sampling ‐ Splitting the population into similar, internally heterogeneous parts (clusters) can make sampling more practical.
• Multistage Sampling ‐ Sampling schemes that combine several methods are called multistage samples.
• Systematic Sampling ‐ Sometimes we draw a sample by selecting individuals systematically.
• You might survey every 10th person on an alphabetical list of students.
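The stratified, cluster, and systematic methods listed above can be compared in a short sketch. The student roster, its `grade` and `homeroom` fields, and the function names are all hypothetical and exist only to make the example runnable.

```python
import random

rng = random.Random(3)

# Hypothetical roster: 200 students, each with a grade (9-12) and homeroom (0-9).
students = [{"id": i, "grade": 9 + i % 4, "homeroom": i % 10} for i in range(200)]

def stratified_sample(people, key, per_stratum):
    """Split people into strata by `key`, then take an SRS from every stratum."""
    strata = {}
    for person in people:
        strata.setdefault(person[key], []).append(person)
    return [p for group in strata.values() for p in rng.sample(group, per_stratum)]

def cluster_sample(people, key, n_clusters):
    """Randomly choose whole clusters by `key` and keep everyone in them."""
    clusters = sorted({p[key] for p in people})
    chosen = set(rng.sample(clusters, n_clusters))
    return [p for p in people if p[key] in chosen]

def systematic_sample(people, step):
    """Pick a random starting point, then take every `step`-th person."""
    start = rng.randrange(step)
    return people[start::step]

if __name__ == "__main__":
    print(len(stratified_sample(students, "grade", 5)))   # 5 students per grade
    print(len(cluster_sample(students, "homeroom", 2)))   # 2 whole homerooms
    print(len(systematic_sample(students, 10)))           # every 10th student
```

The stratified sample takes a few students from every grade, while the cluster sample takes every student from a few homerooms. A multistage design would chain steps like these, for example choosing homerooms at random and then taking an SRS within each chosen homeroom.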

The good, the bad, and the ugly!

• A valid survey yields the information we are seeking about the population we are interested in.
• “How many hours did you sleep last night?” vs. “How much do you
usually sleep?”
• “Did you have too much to drink last night?”
• “Could you understand our new Instructions for Dummies manual, or
was it too difficult for you?”
• “How did you like the movie?”
