PracticeforTest3 s24

Uploaded by

hayly.dewitt

Practice for Test 3

1. A Type I error occurs if you reject the null hypothesis but the null was false. TRUE / FALSE
2. When making a decision, you compare the test statistic to the p-value. TRUE / FALSE
3. If p is less than α, your decision is that H0 is true. TRUE / FALSE
4. Conclusions are written in terms of H0 – stating whether H0 is true or not true. TRUE / FALSE
5. Describe the strength & direction of each scatterplot:

6. When describing a scatterplot, we would never describe the relationship as both strong and negative. TRUE/FALSE
7. A new diet is being tested. Six individuals have agreed to participate on the diet for a pre-determined number of
weeks. At the conclusion of their participation, the number of pounds lost for each individual is determined. Let x
denote the number of weeks that a subject has been on a special diet, and y denote the number of pounds lost
during that period.

Individual   Weeks   Pounds
1            3       9
2            7       12
3            2       2
4            4       13
5            8       12
6            3       8

Construct a scatter plot of Pounds vs Weeks. Describe the information.
Find the least squares equation of the line.
Find r and interpret.
What would you predict the pounds lost to be if a person is on the diet for 5 weeks?
Is there evidence that the slope is not equal to 0? Complete the hypothesis test using a 5% level of significance.
Find the residuals and plot residuals (y) versus weeks (x). Describe what you see.

8. Professor Moore is an avid swimmer. For 23 days, he records the time (in minutes) it takes him to swim 2000 yards
and his pulse rate (beats per minute) after swimming. These data are in StatCrunch in a dataset named Swim.
Time (x) 34.12 35.72 34.72 34.05 34.13 35.72 36.17 35.57
Pulse (y) 152 124 140 152 146 128 136 144
Time (x) 35.37 35.57 35.43 36.05 34.85 34.70 34.75 33.93
Pulse (y) 148 144 136 124 148 144 140 156
Time (x) 34.60 34.00 34.35 35.62 35.68 35.28 35.97
Pulse (y) 136 148 148 132 124 132 139
a) Create a scatter plot of Pulse vs Time. Describe what you see (direction, form, strength, outliers).
b) Using time (x) to predict Professor Moore’s pulse rate (y), find the equation of the least-squares regression line.
c) If Professor Moore completes his 2000-yard swim in 35.00 minutes, what would you predict his pulse rate
would be?
d) What is the value of r for this linear relationship? Interpret this value.
e) Interpret the value of the slope in terms of the question.
f) A test of the null hypothesis H0: Slope= 0 (versus Ha: Slope ≠ 0) results in a test statistic t= –5.133 and a p-value
of 0.0000438. What decision and conclusion can you draw from the results of this hypothesis test?
g) Examine the residuals. Create a plot of Residuals vs Time. What do you observe?
h) For the time = 34.12, what would you predict the pulse to be? The actual pulse = 152. What is the residual?

9. Earlier, our class participated in a survey providing responses to a variety of questions. These data are in StatCrunch
in a dataset named MAT175 – Survey Fall 2023 (available in our group).
a) Create a scatter plot of Salary vs Minutes_SocialMedia. Describe what you see (direction, form, strength, outliers).
b) Using Minutes_SocialMedia (x) to predict Salary (y), find the equation of the least-squares regression line.
c) If a student spent 240 minutes on social media, what would you predict her future salary would be?
d) What is the value of r for this linear relationship? Interpret this value.
e) Interpret the value of the slope in terms of the question.
f) A test of the null hypothesis H0: Slope = 0 (versus Ha: Slope ≠ 0) results in a test statistic t = -1.6164683 and a p-value
of 0.1234. What decision and conclusion can you draw from the results of this hypothesis test?
g) Examine the residuals. Create a plot of Residuals vs Minutes_SocialMedia. What do you observe?
h) For Minutes_SocialMedia = 120, what would you predict the Salary to be? The actual Salary = $67153. What
is the residual?
10. When we sample a population, we want to randomize if at all possible. TRUE/FALSE
11. A good way to select a sample is to just ask those that you know. TRUE/FALSE
12. We want a sample with a lot of bias. TRUE/FALSE
13. A surveyor asks: “Many people think this playground is too small and in need of repair. Would you agree?” This is
an example of a leading question. TRUE/FALSE
14. Stopping students leaving BDH is a good way to collect a sample if we want to know the quality of the food.
TRUE/FALSE
15. The true proportion of students who enjoy statistics is called the ‘population statistic’. TRUE/FALSE
16. You must always take a census and never sample. TRUE/FALSE
17. A scatter plot is created for the population vs storks. The linear regression equation is:
Population = 35652.888 + 149.76712 * Storks (or y = 35652.888 + 149.76712 x)
Add the line to the scatter plot.

18. Which of these has the higher variability? Which of these has the higher bias?

19. A statistics professor has been collecting student pulse data (resting) and their pulse after jogging. A sample of 82 students is available in a
StatCrunch dataset called Pulse. The professor is interested in Pulse_afterJog VS Pulse_Resting. Complete the following using the Pulse
dataset.
a. Generate a scatter plot. Describe the direction, form, strength, and any outliers.
b. Find r (correlation coefficient). What is the value of r? Describe r.
c. Find the least squares regression equation that would predict Pulse_afterJog from Pulse_Resting. Write the full equation (round
the slope and y-intercept to 2 decimal places).
d. What is the value of the slope? Interpret this value in terms of this equation.
e. Find r². What is the value of r²? Describe r².
f. A student’s resting pulse is 60. What would you predict her Pulse after jogging to be?
g. If the actual value of her pulse after jogging is 75, what is her residual (use your prediction above)?
h. Test for a non-zero slope. What is your decision? What is your conclusion?
i. Create a scatter plot of residuals vs Pulse_Resting. Describe what you see.
ANSWERS:

1. FALSE – a Type I error occurs if you reject the null hypothesis, but the null was true.
2. FALSE -- When making a decision, you compare the p-value to the level of significance.
3. FALSE -- If p is less than α, your decision is to reject H0 (meaning you think H0 is not true)
4. FALSE -- Conclusions are written in terms of Ha – stating whether or not there is evidence to support the alternative hypothesis
5. Strong positive Weak negative

6. False – it is possible for a negative relationship to be strong


7. Enter the X and Y data into StatCrunch; Stat → Regression → Simple Linear. It looks like there may be a relationship, but linear may not be
the best description

The equation of the line is: y= 3.9943503 + 1.1864407 x

r= 0.70590728 (indicates a moderate and positive relationship)

If x = 5 weeks, predicted pounds lost (y) = 3.9943503 + 1.1864407 * 5 = 9.9265537 pounds

H0: slope = 0
Ha: slope ≠ 0
α=0.05
t= 1.9932318
p= 0.117
Do not reject H0
At α = 0.05, there is not enough evidence that the slope is different from 0 (so we cannot say the slope is non-zero).

Notice the pattern in the residuals: an upside-down V. This indicates that a straight line may not be the best model for the
relationship between pounds lost and weeks
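The StatCrunch values quoted for problem 7 can be reproduced by hand with the usual least-squares formulas. The sketch below is only an illustration in plain Python, not the StatCrunch procedure. The function name `least_squares` and the variable names are made up for this example, and the test statistic uses the identity t = r·sqrt(n - 2)/sqrt(1 - r²).

```python
import math

# Data from the table in problem 7
weeks = [3, 7, 2, 4, 8, 3]
pounds = [9, 12, 2, 13, 12, 8]


def least_squares(x, y):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = least_squares(weeks, pounds)
n = len(weeks)

# Test statistic for H0: slope = 0, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Prediction for 5 weeks and residuals (actual minus predicted)
predicted_at_5 = intercept + slope * 5
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(weeks, pounds)]

print(f"y = {intercept:.7f} + {slope:.7f} x")
print(f"r = {r:.8f}, t = {t_stat:.7f}")
print(f"predicted pounds at 5 weeks = {predicted_at_5:.7f}")
for xi, res in sorted(zip(weeks, residuals)):
    print(f"weeks = {xi}, residual = {res:+.3f}")
```

Sorting the residuals by weeks makes the upside-down V visible: they are negative at the smallest and largest x values and positive in between.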

This is the Output from StatCrunch:


Scatterplot – notice that it doesn’t look very linear – more curved

General StatCrunch Output:


With the line on the scatterplot
Residuals (data and scatterplot) (notice the shape of the residuals)

8. Stat → Regression → Simple Linear


a) There is a negative, moderate, linear relationship with no obvious outliers
b) Pulse(y) = 479.93415 - 9.6949034 Time(x)
or y = 479.93415 - 9.6949034 (x)
c) if x = 35.00, y = 479.93415 - 9.6949034 (35)
predict y to be 140.6 or ~141
d) r = -0.74598413, which indicates a negative linear relationship that is neither strong nor weak, so moderate
e) the slope is -9.69, which means that for every increase of 1 minute in swim time, the predicted pulse goes down by
about 9.7 (almost 10) beats per minute
f) if the p-value is 0.0000438, I would Reject H0 and conclude that there is evidence that the
slope is not 0
g) there is not an obvious pattern to the residuals – which is what we want to see when we
look at residuals
h) if x = 34.12, predicted y = 479.93415 - 9.6949034 (34.12), predicted y = 149.14404
residual = actual y – predicted y = 152 - 149.14404 = 2.85596
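Parts c), f) and h) of problem 8 are all arithmetic on the fitted equation, so they can be checked with a few lines of Python. This is a rough sketch that takes the StatCrunch coefficients and r from the answers above as given. The helper names are hypothetical, and the last step assumes the standard identity t = r·sqrt(n - 2)/sqrt(1 - r²) with n = 23 days.

```python
import math

# Coefficients quoted in answer 8 b)
INTERCEPT = 479.93415
SLOPE = -9.6949034


def predict_pulse(time_minutes):
    """Predicted pulse (beats per minute) for a given 2000-yard swim time."""
    return INTERCEPT + SLOPE * time_minutes


def residual(actual, predicted):
    """Residual = actual y minus predicted y."""
    return actual - predicted


pred_35 = predict_pulse(35.00)      # part c)
pred_3412 = predict_pulse(34.12)    # part h)
res_3412 = residual(152, pred_3412)

# Cross-check of the test statistic in part f) from r and n
r = -0.74598413
n = 23
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"predicted pulse at 35.00 min: {pred_35:.2f}")
print(f"predicted pulse at 34.12 min: {pred_3412:.5f}, residual: {res_3412:.5f}")
print(f"t computed from r and n: {t_from_r:.3f}")
```

The positive residual in part h) means the line under-predicts the pulse for that swim.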

9.
a) Create a scatter plot of Salary vs Minutes_SocialMedia. Describe what you see (direction, form,
strength, outliers).
Direction is negative
Form is possibly linear
Strength is weak
Outliers – maybe the value near (180, 24500)
b) Least-squares regression line:
Salary = 70299.131 - 49.017492 Minutes_SocialMedia
c) If a student spent 240 minutes on social media:
Predicted Salary = 70299.131 - 49.017492 (240)
Predicted Salary = $58534.933
d) r is -0.35603846. This is a weak negative linear relationship.
e) The slope is -49.017492. This means that for each increase of 1 minute on social media, the predicted salary decreases by about
$49.02.
f) A test of the null hypothesis H0: Slope = 0 (versus Ha: Slope ≠ 0) results in a test statistic t = -1.6164683 and a p-value of 0.1234. The
decision is Do Not Reject H0. The conclusion is that there is not evidence that the slope is different from 0 (meaning we do not have evidence
that the slope is non-zero).
g) Examine the residuals. Create a plot of Residuals vs Minutes_SocialMedia. What do you observe?
There is not a distinct pattern, but there is more variability/spread for the lower values of
Minutes_SocialMedia than there is for the larger values (however, this may be due to the lack of data
for the larger values of Minutes_SocialMedia).
h) For Minutes_SocialMedia = 120, what would you predict the Salary to be?
Predicted Salary = 70299.131 - 49.017492 (120)
Predicted Salary = $64417.032
The actual Salary = 67153. What is the residual?
Residual = actual – predicted
Residual = 67153 - 64417.032
Residual = $2,735.968
(remember a positive residual implies an under prediction)
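The p-values in problems 7, 8 and 9 come from a t distribution with n - 2 degrees of freedom. The sketch below shows one way to approximate a two-sided p-value using only the Python standard library, by integrating the t density with Simpson's rule. It is not how StatCrunch computes it, and the function names are invented for this example. The degrees of freedom are assumptions: 4 and 21 follow from the stated sample sizes (6 and 23), while 18 for the survey is inferred from the reported r and t rather than stated in the problem.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t| (Simpson's rule, steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t


def decide(p_value, alpha=0.05):
    """Decision rule: compare the p-value to the level of significance."""
    return "Reject H0" if p_value < alpha else "Do not reject H0"


# (label, test statistic, assumed degrees of freedom)
tests = [
    ("Problem 7 (diet)", 1.9932318, 4),
    ("Problem 8 (swim)", -5.133, 21),
    ("Problem 9 (survey)", -1.6164683, 18),
]

for label, t_stat, df in tests:
    p = two_sided_p(t_stat, df)
    print(f"{label}: t = {t_stat}, df = {df}, p = {p:.7f}, decision: {decide(p)}")
```

The decision depends only on whether p is below α, which is why the test statistic itself is never compared to the p-value.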

10. TRUE – randomization is one of the key principles for sampling


11. FALSE – asking only those you know may not represent the population of interest
12. FALSE – bias is a systematic failure of the sampling method to represent the population.
13. TRUE -- the question points out two issues that the responder may not have noticed and this may lead them to believe that they should
agree.
14. FALSE – this sample is probably biased since students who don’t like the food at BDH might choose not to eat there
15. FALSE -- it is the population parameter (remember that statistics describe samples, parameters describe populations).
16. FALSE – a well-designed sample is acceptable.

17. Solve the equation for two different values of x


If x = 140, then y = 35652.888 + 149.76712(140), y = 56,620.2848 (140, 56,620)
If x = 240, then y = 35652.888 + 149.76712(240), y = 71,596.9968 (240, 71,597)
Add these two new points to the plot and connect these two points to form the line
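The two endpoints used to draw the line can be computed directly from the regression equation. A minimal sketch follows; the function name is hypothetical, and x = 140 and x = 240 are simply the two values chosen in the answer above (any two x values within the plot's range would work).

```python
def predicted_population(storks):
    """Regression equation from problem 17."""
    return 35652.888 + 149.76712 * storks


# Two points are enough to draw a straight line
endpoints = [(x, predicted_population(x)) for x in (140, 240)]

for x, y in endpoints:
    print(f"({x}, {y:.4f})")
```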

18. High variability -- The one on the right has high variability
High bias -- The one on the left has high bias.
19.
a. Direction: positive
Form: linear
Strength: moderate
Outliers: None that appear obvious

b. r = 0.65743631 ; positive moderate relationship


c. y = 37.45 + 0.89x
or
Pulse_afterJog = 37.45 + 0.89 Pulse_Resting

d. The slope is 0.89. For each 1 beat increase in Pulse_Resting, the predicted Pulse_afterJog increases by 0.89 beats

e. r² = 0.4322225; only about 43.2% of the variability in Pulse_afterJog is explained by the variability in Pulse_Resting

f. If x = 60; y = 37.45 + 0.89(60); predicted y = 90.85


g. residual = actual – predicted; 75-90.85 = -15.85
h.
H0: slope = 0
Ha: slope ≠ 0
α = 0.05
test statistic = 7.8018621
p-value < 0.0001
Decision = Reject H0
Conclusion = there is evidence that the slope is not 0

i. Residual plot – fairly random scatter (which is what we want to see when looking at the residuals)
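Parts e, f and g of problem 19 can be checked with a short calculation. The sketch below assumes the rounded equation y = 37.45 + 0.89x used in part f and the value of r from part b. The names are illustrative only, and the over/under label follows the convention residual = actual – predicted.

```python
# Correlation from part b
r = 0.65743631
r_squared = r ** 2  # share of the variability in y explained by the linear relationship with x

# Rounded coefficients as used in part f (an assumption; the unrounded values are not given)
INTERCEPT = 37.45
SLOPE = 0.89


def predict_after_jog(resting_pulse):
    """Predicted pulse after jogging from resting pulse."""
    return INTERCEPT + SLOPE * resting_pulse


predicted = predict_after_jog(60)
residual = 75 - predicted  # actual minus predicted

# A negative residual means the line predicted too high (over-prediction)
label = "over-prediction" if residual < 0 else "under-prediction"

print(f"r^2 = {r_squared:.7f} (about {100 * r_squared:.1f}%)")
print(f"predicted = {predicted:.2f}, residual = {residual:.2f} ({label})")
```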
