Super - Seven Answers

This document contains summaries of several statistical problems and questions. It includes summaries of studies on calorie intake in urban vs. rural students, income and age relationships, aircraft production trends over time, telephone line usage probabilities, bauxite ore car weights, and a proposed cholesterol study design. For each problem, it lists the questions asked and provides short answers to the questions, often including calculations and brief interpretations.

Uploaded by

Patrick
Copyright
© All Rights Reserved

The Super Six

2005 #1: Urban and Rural Calories

a) The Urban distribution has a lower center: urban students typically consume
fewer calories than rural students. The spreads of the two distributions are
about the same. The Urban distribution is skewed right; the Rural distribution
is roughly uniform.
b) No. A random sample of US schools was not selected; only one school from
each region was used.
c) Plan II would be better. Any given day might have more or fewer calories than
normal. A 7-day average would smooth out the day-to-day variability and
more accurately estimate the true average.
d) Construct parallel boxplots to compare the calorie distribution of the rural
vs. the urban students.

e) Verify whether or not there are outliers in either data set.


Rural:
IQR: 45.5 – 35.5 = 10
lower fence = 35.5 – 1.5(10) = 35.5 – 15 = 20.5
upper fence = 45.5 + 1.5(10) = 45.5 + 15 = 60.5
Urban:
IQR: 36 – 29 = 7
lower fence = 29 – 1.5(7) = 29 – 10.5 = 18.5
upper fence = 36 + 1.5(7) = 36 + 10.5 = 46.5
No outliers in either data set because all data values fall within the fences.
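
The fence arithmetic in part (e) can be sketched in Python. This is only a minimal illustration, assuming the quartiles quoted above; the function names are hypothetical, and because the raw calorie data are not reproduced here, the values passed to find_outliers are invented purely for demonstration.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = tukey_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Quartiles taken from the answer to part (e).
rural_fences = tukey_fences(35.5, 45.5)
urban_fences = tukey_fences(29, 36)
print("Rural fences:", rural_fences)
print("Urban fences:", urban_fences)

# Invented values (NOT the real data) to show how a check would work.
demo_outliers = find_outliers([19, 30, 50], 29, 36)
print("Outliers in the invented list:", demo_outliers)
```

Any value below the lower fence or above the upper fence would be flagged; with the real data, both lists of outliers would be empty.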
f) Describe how a researcher might use schools as clusters to gather data in a
given county.
A researcher could take a list of all the schools in the county of interest,
randomly select some of the schools, and then survey all the students in those
schools.
g) One researcher observed that rural students ate more home cooked meals
than urban students. A journalist wrote an article stating that home cooked
meals caused an increase in calorie intake. Describe a confounding variable
that may be the cause of the higher calorie intake in rural students.
It is possible that rural students eat more calories because they are more active
and thus eat larger portions. We would then be unable to determine whether the
increase in calories for rural students was due to the home-cooked
meals or to their higher activity level and larger portions.
h) Describe how you would use your calculator and a list of 9th grade students
from your school to conduct a simple random sample. Include a description
of how you would implement your procedure.
Take a list of all 9th graders. Number the list from 1 to n. Use the calculator to
generate a random integer from 1 to n and survey that student. Repeat the process,
skipping repeats, until you have the desired sample size.
i) Describe one variable that might be important to create strata and why you
chose that variable.
Because males and females tend to consume different amounts of calories, it
would be helpful to stratify by gender. This would reduce variability created by
gender differences in calorie consumption (ensuring that the correct
proportion of males and females are in the sample.)
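
The sampling procedures in parts (h) and (i) can be sketched in Python instead of on a calculator. This is a rough illustration under stated assumptions: the roster of 40 students, the helper names, and the 25% sampling fraction are all hypothetical and do not come from the original problem.

```python
import random


def simple_random_sample(roster, n, seed=None):
    """Draw n distinct students; every group of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(roster, n)  # sampling without replacement skips repeats


def stratified_sample(roster, stratum_of, fraction, seed=None):
    """Draw the same fraction from each stratum, each by its own SRS."""
    rng = random.Random(seed)
    strata = {}
    for student in roster:
        strata.setdefault(stratum_of(student), []).append(student)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical roster: (student number, gender), 20 of each gender.
roster = [(i, "F" if i % 2 else "M") for i in range(1, 41)]

srs = simple_random_sample(roster, 10, seed=1)
strat = stratified_sample(roster, lambda s: s[1], 0.25, seed=1)
print("SRS:", sorted(srs))
print("Stratified:", sorted(strat))
```

Sampling the same fraction within each gender keeps the sample's gender proportions equal to those of the roster, which is the point of the stratification described in part (i).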

2003B #2 Income & Age

a) 89/207 = 43%
b) 35/96 = 36.5%
c) They are not independent. 43% of the sample is 31-45, but only 36.5% of
those making over $50,000 are in this age group. Because these two percentages
differ, age and income are not independent.
d) Make a graphical display to examine the relationship between Age and
Income. Describe this graph.

[Graph not reproduced; its percentage axis runs from 0% to 100%.]

e) Name an inference procedure that could be carried out to answer the
independence question in part (c).
Chi-square test for independence.
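
The comparison behind parts (a) through (c) can be checked with a few lines of Python. This is a sketch that uses only the counts appearing in the answers above; reading 96 as the number of people earning over $50,000 is an assumption inferred from the denominators, and the variable names are hypothetical.

```python
total = 207      # everyone in the sample
age_31_45 = 89   # people aged 31-45
over_50k = 96    # assumed: people earning over $50,000
both = 35        # aged 31-45 AND earning over $50,000

p_age = age_31_45 / total            # P(31-45)
p_age_given_income = both / over_50k  # P(31-45 | over $50,000)

# Under independence the two probabilities would match, and the
# expected count in the shared cell would be (row total)(column total)/n.
expected_both = age_31_45 * over_50k / total

print(f"P(31-45) = {p_age:.3f}")
print(f"P(31-45 | over $50,000) = {p_age_given_income:.3f}")
print(f"Expected count if independent = {expected_both:.1f}, observed = {both}")
```

A chi-square test for independence would compare observed and expected counts like these across every cell of the full table, which is not reproduced here.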

1999 #1 Aircraft in the 90’s

a) Yes, because the residual plot has no pattern.
b) 233.5: The model predicts an increase of approximately 233.5 aircraft per year.
c) 2939.9: In 1990, the model predicts 2939.9 aircraft.
d) 2939.9 + 233.5(2) = 3406.9 aircraft predicted
e) 40 = y – 3406.9, so y = 3446.9, or about 3447 actual aircraft.
f) Interpret s in the context of this problem.
s = 33.43: The actual number of aircraft typically differs from the regression
line's prediction by about 33.43 aircraft.
I am 95% confident that the true slope (aircraft per year) is between 233.3 and 243.7.
g) R^2 = 87.4%. Interpret this value in context.
87.4% of the variation in the number of aircraft is explained by the linear regression
on years.
h) Find and interpret the correlation coefficient.
r = √0.874 = .935 (positive because the slope is positive): There is a strong,
positive, linear relationship between the number of aircraft and years.
i) If each new aircraft costs the FAA an additional $1000 in regulatory costs,
how much are the costs increasing each year, on average?
$1000*233.5 = $233,500 more per year.
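
The regression arithmetic in this problem can be reproduced with a short Python sketch. It assumes only the coefficients quoted in the answers above (intercept 2939.9, slope 233.5, with x counted in years since 1990); the function and variable names are hypothetical.

```python
import math

INTERCEPT = 2939.9  # predicted aircraft when x = 0 (1990)
SLOPE = 233.5       # predicted change in aircraft per year


def predict(years_since_1990):
    """Predicted number of aircraft from the fitted line."""
    return INTERCEPT + SLOPE * years_since_1990


predicted = predict(2)           # part (d): two years after 1990
residual = 40                    # part (e): residual = actual - predicted
actual = predicted + residual

r = math.sqrt(0.874)             # part (h): sign matches the positive slope
yearly_cost = 1000 * SLOPE       # part (i): $1000 per additional aircraft

print(f"Predicted: {predicted:.1f}")
print(f"Actual: about {round(actual)}")
print(f"r = {r:.3f}")
print(f"Added cost per year: ${yearly_cost:,.0f}")
```

Because a residual is defined as actual minus predicted, adding the residual back to the prediction recovers the observed value.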

2005 #2 Telephone Lines

a) Yes, this is a legitimate probability distribution, as all of the probabilities sum
to 1 and all are between 0 and 1 inclusive.
b) What is the probability that 3 or more lines are in use at noon?
15% + 10% + 5% = 30%
c) What is the probability that at least 1 line is in use at noon?
1 – 35% = 65%
d) Given that 3 or more lines are in use at noon, what is the probability that all 5
are in use?
P(5 in use | 3 or more in use) = 5%/30% = 16.7%
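
The three probability calculations above can be sketched in Python. This uses only the probabilities that appear in the answers (0 lines: 35%; 3, 4, and 5 lines: 15%, 10%, and 5%); the probabilities for 1 and 2 lines are not given here, so they are left out, and the variable names are hypothetical.

```python
p_zero = 0.35
p_three, p_four, p_five = 0.15, 0.10, 0.05

# Part (b): add the probabilities of the disjoint outcomes 3, 4, and 5.
p_three_or_more = p_three + p_four + p_five

# Part (c): complement rule, 1 - P(no lines in use).
p_at_least_one = 1 - p_zero

# Part (d): conditional probability. "All 5 in use" is contained in
# "3 or more in use", so P(5 and 3+) is just P(5).
p_five_given_three_or_more = p_five / p_three_or_more

print(f"P(3 or more) = {p_three_or_more:.2f}")
print(f"P(at least 1) = {p_at_least_one:.2f}")
print(f"P(5 | 3 or more) = {p_five_given_three_or_more:.3f}")
```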

2004B #3 Bauxite Ore Cars

a) z = (70.7 – 70)/0.9 = 0.7778; normalcdf gives an upper-tail probability of 21.8%
b) No, because a value of 70.7 or more could happen by chance about 22% of the time.
c) Draw a careful sketch to show your answer to part (a)

d) Given the initial mean and standard deviation, how full are the heaviest
10% of the cars?
invnormal(0.90) = 1.28; 1.28 = (x – 70)/0.9, so x = 70 + 1.28(0.9) = 71.152 tons or more

g) mean = 70.7 × 907.186 = 64,138.05 kg. standard deviation = 0.9 × 907.186 = 816.47 kg.

h) mean = 70.7 + 70.7 = 141.4, standard deviation = √(0.9² + 0.9²) = 1.27
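
The normal-distribution calculations in this problem can be sketched with Python's standard library (Python 3.8 or later). The sketch assumes the mean of 70 tons and standard deviation of 0.9 tons used in part (d), the mean of 70.7 tons used in parts (g) and (h), and the conversion factor 907.186 kg per ton as quoted in part (g); the variable names are hypothetical.

```python
from statistics import NormalDist

KG_PER_TON = 907.186  # conversion factor as quoted in part (g)

car = NormalDist(mu=70, sigma=0.9)

# Part (a): upper-tail probability beyond 70.7 tons.
p_above = 1 - car.cdf(70.7)

# Part (d): weight that cuts off the heaviest 10% (90th percentile).
cutoff = car.inv_cdf(0.90)

# Parts (g) and (h) use a mean of 70.7 tons.
shifted = NormalDist(mu=70.7, sigma=0.9)

# Part (g): multiplying by a constant rescales both mean and SD.
in_kg = shifted * KG_PER_TON

# Part (h): adding two NormalDist objects treats them as independent,
# so the means add and the variances (not the SDs) add.
two_cars = shifted + shifted

print(f"P(weight > 70.7) = {p_above:.3f}")
print(f"90th percentile = {cutoff:.3f} tons")
print(f"In kg: mean = {in_kg.mean:.2f}, SD = {in_kg.stdev:.2f}")
print(f"Two cars: mean = {two_cars.mean:.1f}, SD = {two_cars.stdev:.2f}")
```

The percentile computed here uses an unrounded z-score, so it can differ in the third decimal place from the hand calculation, which rounds z to 1.28.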


2000 #5 Cholesterol and Exercise

a) I would randomly assign the volunteers to 2 groups. One group would take the
new drug and the other the current drug. Compare cholesterol levels at the end.
b) Since exercise affects cholesterol level, I would block by the volunteers’ exercise
level. Divide the volunteers into high, medium, and low exercise levels. Within each
block, randomly assign half of the volunteers to each treatment.
c) Yes. An assistant can set up the medications so neither the evaluators nor the
subjects know which treatment they are receiving.
d) Describe a method for implementing your design in part (c).
Put all the volunteers’ names in a hat. Stir. Randomly select half the names and
those subjects receive the new drug. The rest receive the current drug.
e) Identify the subjects, the treatment(s), the factor(s), the level(s), and the
response variable in this experiment.
The subjects are the volunteers.
The factor is the drug.
Two levels: current and new.
Two treatments.
Response variable: cholesterol level.
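
The random assignment described in parts (a), (b), and (d) can be sketched in Python as a software version of drawing names from a hat. This is a minimal sketch under stated assumptions: the volunteer labels, block sizes, and function names are hypothetical and are not part of the original problem.

```python
import random

TREATMENTS = ("new drug", "current drug")


def randomize(subjects, seed=None):
    """Completely randomized design: shuffle, then split in half."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {TREATMENTS[0]: shuffled[:half], TREATMENTS[1]: shuffled[half:]}


def randomize_within_blocks(blocks, seed=None):
    """Randomized block design: repeat the shuffle-and-split per block."""
    rng = random.Random(seed)
    assignment = {t: [] for t in TREATMENTS}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        assignment[TREATMENTS[0]] += shuffled[:half]
        assignment[TREATMENTS[1]] += shuffled[half:]
    return assignment


# Hypothetical volunteers, blocked by exercise level.
blocks = {
    "high": ["V1", "V2", "V3", "V4"],
    "medium": ["V5", "V6", "V7", "V8"],
    "low": ["V9", "V10", "V11", "V12"],
}
everyone = [v for members in blocks.values() for v in members]

simple = randomize(everyone, seed=7)
blocked = randomize_within_blocks(blocks, seed=7)
print(simple)
print(blocked)
```

Blocking guarantees that each exercise level is split evenly between the two drugs, whereas the completely randomized version only balances exercise level on average.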

2008 #2

See the scoring guidelines on the College Board website.
