
R2 Sarah

This document provides examples and problems related to classifying variables as qualitative or quantitative, determining if variables are discrete or continuous, identifying populations and samples, and potential sources of bias in sampling. Specifically, it discusses: - Classifying variables like nation of origin, number of siblings, grams of carbohydrates as qualitative or quantitative. - Determining if variables like goals scored in a season, length of a song, internet connection speed are discrete or continuous. - Identifying the population and sample in a study examining bottles of Coca-Cola filled by a machine on a given date. - Potential sources of bias, like only sampling vehicles at a casino, in a city council's decision about speed

Uploaded by

sara

College of Engineering

NGN 111 – INTRODUCTION TO STATISTICAL ANALYSIS

Chapter 1 Recitation
Section 1.1 Problems:
In Problems 15-22, classify the variable as qualitative or quantitative.
15. Nation of origin
Qualitative
16. Number of siblings
Quantitative
17. Grams of carbohydrates in a doughnut
Quantitative
18. Number on a football player's jersey
Qualitative
19. Number of un-popped kernels in a bag of ACT microwave popcorn
Quantitative
20. Assessed value of a house
Quantitative
In Problems 23-30, determine whether the quantitative variable is discrete or continuous.
23. Goals scored in a season by a soccer player
Discrete
24. Volume of water lost each day through a leaky faucet
Continuous
25. Length (in minutes) of a country song
Continuous
26. Number of Sequoia trees in a randomly selected acre of Yosemite National Park
Discrete
27. Temperature on a randomly selected day in Memphis, Tennessee
Continuous
28. Internet connection speed in kilobytes per second
Continuous
29. Points scored in an NCAA basketball game
Discrete
30. Air pressure in pounds per square inch in an automobile tire
Continuous
Identify the population and the sample in the following study:
• 40. A quality-control manager randomly selects 50 bottles of Coca-Cola that were filled on October 15 to assess the calibration of the filling machine.
• The population consists of all bottles of Coca-Cola filled by that particular machine on October 15. The sample consists of the 50 bottles of Coca-Cola that were selected by the quality-control manager.
Section 1.2 Problems:
18. Daily Coffee Consumption: Researchers wanted to determine if there was an association between daily coffee
consumption and the occurrence of skin cancer. The researchers looked at 93,676 women enrolled in the Women’s Health
Initiative Observational Study and asked them to report their coffee-drinking habits. The researchers also determined which of
the women had nonmelanoma skin cancer. After their analysis, the researchers concluded that consumption of six or more cups
of caffeinated coffee per day was associated with a reduction in nonmelanoma skin cancer.
(a) What type of observational study was this? Explain.
This is a cross-sectional study because the researchers collected information about the individuals at a specific point in time.
(b) What is the response variable in the study? What is the explanatory variable?
The response variable is whether the woman has nonmelanoma skin cancer or not. The explanatory variable is the daily
amount of caffeinated coffee consumed.
(c) In their report, the researchers stated that "After adjusting for various demographic and lifestyle variables, daily
consumption of six or more cups was associated with a 30% reduced prevalence of nonmelanoma skin cancer." Why was it
important to adjust for these variables?
It was necessary to account for these variables to avoid confounding due to lurking variables.
19. Television in the Bedroom: Researchers Christelle Delmas and associates wanted to determine if having a
television (TV) in the bedroom is associated with obesity. The researchers administered a questionnaire to
379 twelve-year-old French adolescents. After analyzing the results, the researchers determined that the body
mass index of the adolescents who had a TV in their bedroom was significantly higher than that of the
adolescents who did not have a TV in their bedroom.
(a) Why is this an observational study? What type of observational study is this?
This is an observational study because the researchers simply administered a questionnaire to obtain their data. No attempt was made to
manipulate or influence the variable(s) of interest. This is a cross-sectional study because the researchers are observing participants at a
single point in time.
(b) What is the response variable in the study? What is the explanatory variable?
The response variable is body mass index. The explanatory variable is whether a TV is in the bedroom or not.
(c) Can you think of any lurking variables that may affect the results of the study?
Answers will vary. Some lurking variables might be the amount of exercise per week and eating habits. Both of these variables can affect the
body mass index of an individual.
(d) In the report, the researchers stated, "These results remain significant after adjustment for socioeconomic status." What does this mean?
The researchers attempted to avoid confounding due to lurking variables by taking into account such variables as 'socioeconomic status'.
(e) Can we conclude that a television in the bedroom causes a higher body mass index? Explain.
No. Since this was an observational study, we can only say that a television in the bedroom is associated with a higher body mass index.
Section 1.4: Sampling Methods
27. The human resource department at a certain company wants to conduct a survey
regarding worker morale. The department has an alphabetical list of all 4502
employees at the company and wants to conduct a systematic sample.
(a) Determine k (skip interval) if the sample size is 50.
k = 4502/50 = 90.04, rounded down to k = 90.
(b) Determine the individuals who will be administered the survey. More than one
answer is possible.
Randomly select a number between 1 and 90. Suppose that we select 15. Then the
individuals to be surveyed will be the 15th, 105th, 195th, 285th, and so on up to the
4425th employee on the company list.
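The systematic sample above can be sketched in Python. This is only an illustration: the function name systematic_sample and its parameters are made up for this example, and the skip interval is taken as the population size divided by the sample size, rounded down.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the skip interval k and the 1-based positions selected."""
    # Skip interval: population size divided by sample size, rounded down.
    k = population_size // sample_size
    if start is None:
        # Random starting position between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    # Take the start-th individual and every k-th individual after it.
    positions = [start + i * k for i in range(sample_size)]
    return k, positions

# Reproduce the worked answer: 4502 employees, sample of 50, start at 15.
k, positions = systematic_sample(4502, 50, start=15)
print(k)              # 90
print(positions[:4])  # [15, 105, 195, 285]
print(positions[-1])  # 4425
```

Leaving start unset draws the starting position at random, which is why more than one answer is possible.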
Which Method?
The mathematics department at a university wishes to administer a survey to a sample
of students taking college algebra. The department is offering 32 sections of college
algebra, similar in class size and makeup, with a total of 1280 students. They would
like the sample size to be roughly 10% of the population of college algebra students
this semester.
• How might the department obtain a simple random sample?
• A stratified sample?
• A cluster sample?
• Which method do you think is best in this situation?
Simple Random Sample:
• Number the students from 1 to 1280. Use a table of random digits or a random-number generator to randomly select 128 students to survey.
Stratified Sample:
• Since class sizes are similar, we would want to randomly select 128/32 = 4 students from each class to be included in the sample.
Cluster Sample:
• In essence, we use cluster sampling when the population is already broken up into groups (clusters), and each cluster represents the population. That way, we just select a certain number of clusters.
• Since classes are similar in size and makeup, we would want to randomly select whole classes and survey all the students in those classes.
• Each class will have 1280/32 = 40 students.
• 3 classes (120 students) will give roughly 10% of the total population.
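The three sampling plans for the college algebra survey can be compared side by side in a short Python sketch. The roster of (section, seat) pairs is hypothetical, and the sketch assumes every section has exactly 40 students, which the problem only states approximately ("similar in class size").

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

NUM_SECTIONS = 32
SECTION_SIZE = 1280 // NUM_SECTIONS  # 40 students per section (assumed equal)

# Hypothetical roster: each student is a (section, seat) pair.
roster = [(sec, seat)
          for sec in range(1, NUM_SECTIONS + 1)
          for seat in range(1, SECTION_SIZE + 1)]

# Simple random sample: 128 students drawn from the whole roster.
srs = rng.sample(roster, 128)

# Stratified sample: 128 / 32 = 4 students drawn from every section.
stratified = []
for sec in range(1, NUM_SECTIONS + 1):
    section_students = [s for s in roster if s[0] == sec]
    stratified.extend(rng.sample(section_students, 4))

# Cluster sample: 3 whole sections drawn at random, every student surveyed.
chosen_sections = rng.sample(range(1, NUM_SECTIONS + 1), 3)
cluster = [s for s in roster if s[0] in chosen_sections]

print(len(srs), len(stratified), len(cluster))  # 128 128 120
```

The cluster sample needs only three classroom visits, while the other two plans spread the 128 students across many sections.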
Sample Design:
• A school board at a local community college is considering raising the student services fees. The board wants to obtain the opinion of the student body before proceeding. Design a sampling method to obtain the individuals in the sample. Be sure to support your choice. (Cluster, simple random, stratified, systematic?)
• One design would be a cluster sample, with classes as the clusters. Randomly
select clusters and then survey all the students in the selected classes. However,
care would need to be taken to make sure that no one was polled twice. Since
this would negate some of the ease of cluster sampling, a simple random sample
might be the more suitable design.
Sample Design:
• The county sheriff wishes to determine if a certain highway has a high proportion of speeders traveling on it. Design a sampling method to obtain the individuals in the sample. Be sure to support your choice.
• One appropriate design would be a systematic sample, clocking the speed of every tenth car, for example.
Section 1.5: Bias in Sampling
Putting It Together

Speed Limit: In the state of California, speed limits are established through traffic engineering surveys. One aspect of the survey is for city officials to measure the speed of the vehicles on a particular road.
(a) What is the population of interest for this portion of the engineering
survey?
The population of interest is all vehicles that travel on the road (or portion
of the road) in question.
(b) What is the variable of interest for this portion of the engineering
survey?
The variable of interest is the speed of the vehicles.
(c) Is the variable qualitative or quantitative?
The variable is quantitative.
Speed Limit cont.:
(e) Is a census feasible in this situation? Explain why or why not.
A census is not feasible. It would be impossible to obtain a list of all the
vehicles that travel on the road.
(f) Is a sample feasible in this situation? If so, explain what type of
sampling plan could be used. If not, explain why not.
A sample is feasible, but not a simple random sample (since a complete
frame is impossible). A systematic random sample would be a feasible
alternative.
Speed Limit cont.:

(g) In July 2007, the Temecula City Council refused a request to increase the speed limit on
Pechanga Parkway from 40 to 45 mph despite survey results indicating that the prevailing
speed on the parkway favored the increase. Opponents were concerned that it was visitors
to a nearby casino who were driving at the increased speeds and that city residents actually
favored the lower speed limit. Explain how bias might be playing a role in the city council’s
decision.
One bias is sampling bias. If the city council wants to use the cars of residents who live in
the neighborhood to gauge the prevailing speed, then individuals who are not part of the
population were in the sample (likely a huge portion), so the sample is not representative of
the intended population.
Section 1.6: Pharmacy
10. Pharmacy: A pharmaceutical company has developed an experimental drug meant to relieve symptoms
associated with the common cold. The company identifies 300 adult males 25 to 29 years old who have a
common cold and randomly divides them into two groups. Group 1 is given the experimental drug, while group 2
is given a placebo. After 1 week of treatment, the proportions of each group that still have cold symptoms are
compared.
(a) What is the response variable in this experiment?
The response variable is the proportion of subjects who still have cold symptoms after 1 week of treatment.
(b) Think of some of the factors in the study. How are they controlled?
Some factors are gender, age, geographic location, overall health, and drug intervention.
Fixed: gender, age, location
Set at predetermined levels: drug intervention
Pharmacy Cont.:

(c) What are the treatments? How many treatments are there?
The treatments are the experimental drug and the placebo. There are 2 levels of treatment.
(d) How are the factors that are not controlled dealt with?
The factors that are not controlled are dealt with by random assignment into the two groups.
(e) What type of experimental design is this?
The experiment has a completely randomized design.
(f) Identify the subjects.
The subjects are the 300 adult males aged 25 to 29 who have the common cold.
Pharmacy Cont.:
(g) Draw a diagram to illustrate the design.
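The random division of the 300 subjects in the pharmacy experiment could be sketched as follows. The function name completely_randomized and the subject numbering 1 to 300 are made up for this example, and splitting into two equal groups of 150 is an assumption, since the problem only says the subjects are randomly divided into two groups.

```python
import random

def completely_randomized(subject_ids, seed=None):
    """Randomly split subjects into a drug group and a placebo group."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)  # random order removes any pattern in the list
    half = len(shuffled) // 2  # assumption: equal group sizes
    return {"drug": shuffled[:half], "placebo": shuffled[half:]}

# Subjects are numbered 1 to 300 for illustration only.
groups = completely_randomized(range(1, 301), seed=42)
print(len(groups["drug"]), len(groups["placebo"]))  # 150 150
```

After 1 week, the proportion of each group that still has cold symptoms would be compared.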
Assessment:
To help assess student learning in her developmental math courses, a mathematics
professor at a community college implemented pre- and posttests for her students. A
knowledge-gained score was obtained by taking the difference of the two test scores.
(a) What type of experimental design is this?
The experiment has a matched-pairs design.
(b) What is the response variable in this experiment?
The response variable is the difference in test scores.
(c) What is the treatment?
The treatment is the mathematics course.
Golf:
18. Golf Anyone? A local golf pro wanted to compare two styles of golf clubs. One golf club had a graphite shaft
and the other had the latest style of steel shaft. It is a common belief that graphite shafts allow a player to hit
the ball farther, but the manufacturer of the new steel shaft said the ball travels just as far with its new
technology. To test this belief, the pro recruited 10 golfers from the driving range. Each player was asked to hit
one ball with the graphite-shafted club and one ball with the new steel-shafted club. The distance that the ball
traveled was determined using a range finder. A coin flip was used to determine whether the player hit with the
graphite club or the steel club first. Results indicated that the distance the ball was hit with the graphite club
was no different than the distance when using the steel club.
(a) What type of experimental design is this?
The experiment has a matched-pairs design.
(b) What is the response variable in this study?
The response variable is the distance the ball is hit.
(c) What is the factor that is set to predetermined levels? What is the treatment?
The explanatory variable is the shaft type. The treatment is graphite shaft versus steel shaft.
Golf cont.:
(d) Identify the experimental units.
The experimental units are the 10 golfers.
(e) Why did the golf pro use a coin flip to determine whether the golfer should hit with the graphite first or the
steel first?
The golf pro used a coin flip to eliminate bias due to the type of shaft used first.
Golf cont.:
(f) Draw a diagram to illustrate the design.
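The coin flip and the paired comparison in the golf experiment could be sketched as below. The function names assign_order and paired_differences are made up for this example, and the golfers are simply numbered 1 to 10.

```python
import random

def assign_order(golfers, seed=None):
    """Flip a fair coin for each golfer to decide which club is hit first."""
    rng = random.Random(seed)
    orders = {}
    for golfer in golfers:
        heads = rng.random() < 0.5  # simulated coin flip
        orders[golfer] = ("graphite", "steel") if heads else ("steel", "graphite")
    return orders

def paired_differences(graphite, steel):
    """Graphite distance minus steel distance, golfer by golfer."""
    return [g - s for g, s in zip(graphite, steel)]

orders = assign_order(range(1, 11), seed=7)
for golfer, order in orders.items():
    print(golfer, order)
```

Because each golfer hits with both clubs, the analysis works on the ten within-golfer differences rather than on two separate groups.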
Social Work:
A social worker wants to examine methods that can be used to deter truancy. Three hundred chronically
truant students from District 103 volunteer for the study. Because the social worker believes that
socioeconomic class plays a role in truancy, she divides the 300 volunteers according to household income.
Of the 300 students, 120 fall in the low-income category, 132 fall in the middle-income category, and the
remaining 48 fall in the upper-income category. The students within each income category are randomly
divided into three groups. The students in group 1 receive no intervention. The students in group 2 are
treated with positive reinforcement in which, for each day the student is not truant, he or she receives a
star that can be traded in for rewards. The students in group 3 are treated with negative reinforcement
such that each truancy results in a 1-hour detention. However, the hours of detention are cumulative,
meaning that the first truancy results in 1 hour of detention, the second truancy results in 2 hours, and so
on. After a full school year, the total number of truancies are compared.
(a) What type of experimental design is this?
This experiment has a randomized block design.
(b) What is the response variable in this experiment?
The response variable is the total number of truancies.
Social Work cont.:
(c) What are the treatments?
The explanatory variable is the type of intervention. The treatments are no intervention, positive reinforcement,
and negative reinforcement.
(d) What variable serves as the block?
Income is the variable that serves as the block.
Draw a diagram to illustrate the design:
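The randomized block design in the truancy study can be sketched in Python: students are shuffled within each income block and then split among the three treatments. The function name randomized_block and the student numbering are made up for this example, and equal-sized treatment groups within each block are an assumption, since the problem only says each category is randomly divided into three groups.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign the members of each block to the treatments."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, size in blocks.items():
        ids = list(range(1, size + 1))  # students numbered within the block
        rng.shuffle(ids)
        # Assumption: equal groups; every block size here is divisible by 3.
        per_group = size // len(treatments)
        assignment[block_name] = {
            t: ids[i * per_group:(i + 1) * per_group]
            for i, t in enumerate(treatments)
        }
    return assignment

blocks = {"low": 120, "middle": 132, "upper": 48}
treatments = ["no intervention", "positive reinforcement",
              "negative reinforcement"]
design = randomized_block(blocks, treatments, seed=1)
for name, groups in design.items():
    print(name, [len(members) for members in groups.values()])
# low [40, 40, 40]
# middle [44, 44, 44]
# upper [16, 16, 16]
```

Randomizing separately inside each income category keeps income from being confounded with the type of intervention.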
Coke or Pepsi:

• Coke or Pepsi: Suppose you want to perform an experiment whose goal is to determine whether people prefer Coke or Pepsi.
• (A) Design an experiment that utilizes the completely randomized design.
• (B) Design an experiment that utilizes the matched-pairs design.
• In both designs be sure to identify the response variable, the role of blinding, and randomization. Which design do you prefer? Why?
Completely Randomized Design:
• The researcher would randomly assign each subject to either drink Coke or Pepsi. The response variable would be whether the subject likes the soda or not. Preference rates would be compared at the end of the experiment. The subject would be blinded, but the researcher would not. Therefore, this would be a single-blind experiment.
Matched-Pairs Design:
• The researcher would randomly determine whether each subject drinks Coke first or Pepsi first. To avoid confounding, subjects should eat something bland between drinks to remove any residual taste. The response variable would be either the proportion of subjects who prefer Coke or the proportion of subjects who prefer Pepsi. This would also be a single-blind experiment since the subject would not know which drink was first but the researcher would. The matched-pairs design is likely superior.
