STT 100 Chapter 1 Notes
Prepare
1. Context
• What do the data represent?
• What is the goal of the study?
2. Source of the data
• Are the data from a source with a special interest so that there is pressure to obtain
results that are favorable to the source?
3. Sampling Method
• Were the data collected in a way that is unbiased, or were the data collected in a way
that is biased (such as a procedure in which respondents volunteer to participate)?
Analyze
Conclude
1. Significance
• Do the results have statistical significance?
• Do the results have practical significance?
Sample Data Reported Instead of Measured: When collecting data from people,
it is better to take measurements yourself instead of asking subjects to report
results. Ask people what they weigh, and you are likely to get their desired
weights, not their actual weights. People tend to round down, sometimes way down.
When asked, someone with a weight of 187 lb might respond that he or she
weighs 160 lb. Accurate weights are collected by using a scale to measure weights,
not by asking people what they weigh.
1. Computer Virus In an AOL survey of Internet users, this question was posted
online: “Have you ever been hit by a computer virus?” Among the 170,063
responses, 63% answered “yes.” A) What term is used to describe this type of
survey in which the people surveyed consist of those who chose to respond?
B) What is wrong with this type of sampling method?
A) Voluntary response sample (also called a self-selected sample)
B) This sampling method is flawed because the respondents are only those
people who chose to respond. People who have been hit by a computer virus
are likely more motivated to answer, so they are probably overrepresented.
The sample is also not random: not everyone had an equal chance of
participating in the survey. A sample like this is prone to strong bias
and can lead to misleading conclusions. For example, concluding from this
survey that 63% of all Internet users have been hit by a computer virus
would be unreliable.
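A small simulation can make the bias in B) concrete. This is only a sketch: the true rate and the two response probabilities are made-up assumptions, not figures from the AOL survey. It compares a voluntary response sample with a simple random sample drawn from the same simulated population.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# All three numbers below are assumptions chosen for illustration only.
TRUE_RATE = 0.30         # share of users who were ever hit by a virus
P_RESPOND_IF_HIT = 0.50  # affected users are assumed eager to respond
P_RESPOND_IF_NOT = 0.10  # unaffected users are assumed to rarely respond

# Simulated population: True means "has been hit by a virus".
population = [random.random() < TRUE_RATE for _ in range(100_000)]

# Voluntary response sample: each person decides whether to answer.
voluntary = [
    hit for hit in population
    if random.random() < (P_RESPOND_IF_HIT if hit else P_RESPOND_IF_NOT)
]

# Simple random sample: every person has the same chance of being chosen.
srs = random.sample(population, 2000)

population_rate = sum(population) / len(population)
voluntary_rate = sum(voluntary) / len(voluntary)
srs_rate = sum(srs) / len(srs)

print(f"population rate:       {population_rate:.3f}")
print(f"voluntary sample rate: {voluntary_rate:.3f}")
print(f"random sample rate:    {srs_rate:.3f}")
```

Under these assumptions the voluntary sample overstates the "yes" rate by a wide margin, while the random sample stays close to the population rate. A large number of responses does not fix this.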
4. Correlation: One study showed that for a recent period of 10 years, there was
a strong correlation (or association) between the per capita consumption of
margarine and the divorce rate in Maine (based on data from National Vital
Statistics reports and the U.S. Department of Agriculture). Does this imply
that increasing margarine consumption is the cause of an increase in the
divorce rate in Maine? Why or why not?
No, it does not imply that increasing margarine consumption is the cause of
an increase in the divorce rate in Maine. Just because two variables
change together does not mean that a change in one variable directly
causes a change in the other. Correlation does not imply causation.
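To see how a strong correlation can appear without any causal link, here is a rough sketch. The two series are invented numbers that both happen to decline over 10 years. They are not the real margarine or divorce figures, and `pearson_r` is a helper written just for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented yearly values for two unrelated quantities (10 years each).
series_a = [10.0, 9.4, 9.1, 8.3, 8.0, 7.6, 6.9, 6.6, 6.0, 5.4]
series_b = [50, 48, 47, 44, 43, 41, 40, 37, 36, 33]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")
```

Both series simply trend downward over time, so r comes out close to 1 even though neither one was built to depend on the other.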
Data Set 1 “Body Data” in Appendix B includes pulse rates of subjects, and
those pulse rates were recorded by examiners as part of a study conducted by
the National Center for Health Statistics.
There could be bias if the sampling method is not random. There could also
be examiner bias, since different examiners may differ in how consistently
and accurately they measure pulse rates.
An article in Journal of Nutrition (Vol. 130, No. 8) noted that chocolate is rich
in flavonoids. The article notes “regular consumption of foods rich in
flavonoids may reduce the risk of coronary heart disease.” The study received
funding from Mars, Inc., the candy company, and the Chocolate Manufacturers
Association.
There could be bias because of the source of the funding. The study was
funded by Mars, Inc. and the Chocolate Manufacturers Association, which
could influence the study and its objectives and expectations. Both have a
financial incentive to reach a conclusion that is favorable to chocolate
and therefore beneficial to Mars.
12. Social Media Usage In a survey of social media usage, the Pew Research
Center randomly selected 2002 adults in the United States.
The sampling method appears to be sound because the adults were randomly selected.
13. Diet and Exercise Program In a study of the Ornish weight loss program, 40 subjects
lost a mean of 3.3 lb after 12 months (based on data from “Comparison of the Atkins, Ornish,
Weight Watchers, and Zone Diets for Weight Loss and Heart Disease Risk Reduction,” by
Dansinger et al., Journal of the American Medical Association, Vol. 293, No. 1). Methods of
statistics can be used to show that if this diet had no effect, the likelihood of getting these
results is roughly 3 chances in 1000.
The results appear to have statistical significance: the 40 subjects lost a mean of 3.3 lb
after 12 months, and a result like this would be very unlikely (roughly 3 chances in 1000) if
the diet had no effect. However, the results lack practical significance: a mean loss of
3.3 lb over 12 months is small, so many overweight people would not consider it worth the
effort and commitment required.
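The two judgments in exercise 13 can be written as a short calculation. This is a sketch only: the 0.05 cutoff is an assumed, commonly used significance level that the exercise does not state, and the variable names are made up for illustration.

```python
# Figures taken from the exercise statement.
chance_if_no_effect = 3 / 1000  # "roughly 3 chances in 1000"
mean_loss_lb = 3.3
months = 12

# Assumption: 0.05 is a common cutoff; the exercise does not specify one.
ALPHA = 0.05

# Statistical significance: would the result be unlikely by chance alone?
statistically_significant = chance_if_no_effect < ALPHA

# Practical significance is a judgment call; the size of the effect helps.
loss_per_month_lb = mean_loss_lb / months

print("statistically significant:", statistically_significant)
print(f"mean loss per month: {loss_per_month_lb:.3f} lb")
```

The first check is a comparison against a cutoff, while the second only puts the effect on a scale a person can judge. Whether about a quarter of a pound per month is worth the effort is not something the calculation can decide.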
The results appear to have statistical significance because the high success rate is
unlikely to have occurred by chance. Practical significance cannot be judged without more
information.
A parameter is a numerical measurement describing some characteristic of a
population.
Page 14
Discrete data: the values are quantitative, and the number of possible values is finite or
countable.
Continuous data: the values are quantitative, with infinitely many possible values that
cannot be counted.