
Stat 302 practice final

Brad McNeney
2017-04-15

Introduction

This practice final is made from the datasets that were analyzed for the final exam of Stat 302 in 2015. We
used a different textbook then, so I have had to modify the questions from their original form. Please note: the
analyses in this document represent a subset of the topics listed in the review document. You are responsible
for all the topics in the review. This practice exam is intended to give you an idea of the style of questions.

Questions

Short questions

1. (1 mark) Which of the following summary statistics measures the spread of a distribution: first quartile,
third quartile, inter-quartile range, or median?
2. (1 mark) Briefly, define the sampling distribution of a statistic.

Question 1: Flicker frequency

An individual’s critical flicker frequency is the highest frequency at which the flicker in a flickering light
source can be detected. At frequencies above the critical frequency, the light source appears to be continuous
even though it is actually flickering. This investigation recorded critical flicker frequency and iris colour of
the eye for 19 subjects. A summary of the flicker frequencies by group (eye colour) is as follows:
dat <- read.table("flicker.txt",header=TRUE)
library(dplyr)
dat %>% group_by(Colour) %>%
summarize(n=n(),mean=mean(Flicker),sd=sd(Flicker))

## # A tibble: 3 × 4
##   Colour     n     mean       sd
##   <fctr> <int>    <dbl>    <dbl>
## 1 Blue       6 28.16667 1.527962
## 2 Brown      8 25.58750 1.365323
## 3 Green      5 26.92000 1.843095
a. (1 mark) Is this a balanced or unbalanced design?
b. (1 mark) Comment on the constant error SD assumption.
c. (2 marks) Using baseline coding with brown eyes as the baseline group, write down the dummy variables
needed for an ANOVA model.
d. Using your dummy variables from (c), write down the model for a one-way ANOVA for these data,
including the error terms. You do not need to define any regression coefficients.
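As a rough sketch of the baseline coding asked about in (c) and (d), R can build the dummy variables itself. The factor below is reconstructed from the group sizes in the summary table (6 Blue, 8 Brown, 5 Green) rather than read from flicker.txt, and the name `colour` is illustrative only.

```r
# Rebuild the eye-colour factor from the group sizes in the summary table
# (an assumption: the real values would come from flicker.txt).
colour <- factor(rep(c("Blue", "Brown", "Green"), times = c(6, 8, 5)))

# Make Brown the baseline (reference) level.
colour <- relevel(colour, ref = "Brown")

# Design matrix: an intercept plus one 0/1 dummy per non-baseline group.
X <- model.matrix(~ colour)
print(colnames(X))
print(colSums(X))  # intercept column sums to n; each dummy sums to its group size
```

Each dummy equals 1 for subjects in the named group and 0 otherwise, so the brown-eyed subjects have 0 on both dummies.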
e. (4 marks) The model from (d) is fit to the data and we obtain the following Q-Q plot:
library(broom)    # provides augment()
library(ggplot2)  # provides ggplot() and geom_qq()
ffit<-aov(Flicker~Colour,data=dat)
augment(ffit) %>% ggplot(aes(sample=.resid)) + geom_qq()

[Figure: normal Q-Q plot of the residuals (x-axis: theoretical, y-axis: sample)]
What assumption does this plot assess? Does the assumption appear plausible? Justify your answer.
f. (4 marks) The ANOVA summary for the model fit in (e) is as follows:
anova(ffit)

## Analysis of Variance Table
##
## Response: Flicker
##           Df Sum Sq Mean Sq F value  Pr(>F)
## Colour     2 22.997 11.4986  4.8023 0.02325 *
## Residuals 16 38.310  2.3944
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
State the null and alternative hypotheses being tested by the F test, and report the results of a test at the
5% level in technical language and language that anyone can understand.
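To see how the entries of the ANOVA table fit together, the following sketch recomputes the mean squares, the F statistic and its p-value from the sums of squares and degrees of freedom printed in the table. The variable names are illustrative, and the inputs are the rounded values from the output, so the results agree only up to rounding.

```r
# Values copied from the ANOVA table.
ss_colour <- 22.997; df_colour <- 2
ss_resid  <- 38.310; df_resid  <- 16

# Mean square = sum of squares / degrees of freedom.
ms_colour <- ss_colour / df_colour
ms_resid  <- ss_resid / df_resid

# F statistic and upper-tail p-value from the F(2, 16) distribution.
f_stat  <- ms_colour / ms_resid
p_value <- pf(f_stat, df_colour, df_resid, lower.tail = FALSE)

print(round(c(F = f_stat, p = p_value), 4))
```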
g. (3 marks) The following raw p-values are obtained from pairwise comparisons.
with(dat,pairwise.t.test(Flicker,Colour,p.adjust.method="none"))

##
## Pairwise comparisons using t tests with pooled SD
##
## data: Flicker and Colour
##
##       Blue   Brown
## Brown 0.0071 -
## Green 0.2020 0.1504
##
## P value adjustment method: none
What are the Bonferroni-corrected p-values?
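The Bonferroni correction multiplies each raw p-value by the number of comparisons (three here) and caps the result at 1. A minimal sketch using base R's `p.adjust()`, with the raw p-values typed in from the output above and illustrative names for the three pairs:

```r
# Raw p-values from the pairwise comparison table (names are illustrative).
raw_p <- c(Blue_Brown = 0.0071, Blue_Green = 0.2020, Brown_Green = 0.1504)

# Bonferroni adjustment: multiply by the number of comparisons, cap at 1.
bonf_p <- p.adjust(raw_p, method = "bonferroni")
print(bonf_p)

# The same calculation written out by hand.
by_hand <- pmin(1, length(raw_p) * raw_p)
print(all.equal(bonf_p, by_hand))
```

The same adjusted values should also come from rerunning `pairwise.t.test()` with `p.adjust.method="bonferroni"` on the original data.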

h. (1 mark) In light of (g), to what do you attribute the significant F test in (f)?

Question 2: Football punts

A football team records data on the length of punts made by 13 players at a tryout for the team. The distance
measure for each punter is the average of ten punts. They also record the following information on each
player:
• Hang: Time in air in seconds
• RStrength: Right leg strength in pounds
• LStrength: Left leg strength in pounds
• RFlexibility: Right leg flexibility in degrees
• LFlexibility: Left leg flexibility in degrees
• OStrength: Overall leg strength in pounds
You perform stepwise regression with the BIC criterion to build a predictive model of distance. The largest
model in your search includes all main effects; the smallest model includes only an intercept.
a. (1 mark) What is the BIC penalty term for model selection in this example? Report your answer to
four digits.
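Assuming the "penalty term" here means the per-parameter multiplier that BIC uses in place of AIC's 2, it is the natural log of the sample size. A one-line sketch with n = 13 punters:

```r
# BIC replaces AIC's penalty of 2 per parameter with log(n).
n <- 13                      # number of punters in the tryout data
bic_penalty <- log(n)
print(signif(bic_penalty, 4))
```

This is the value that would be passed as `k` to `step()` to request BIC-based selection.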
b. (2 marks) After several iterations of stepwise selection you obtain a model that includes RStrength,
RFlexibility, LFlexibility and OStrength. Here is a summary of the next iteration:
Distance ~ RStrength + RFlexibility + LFlexibility + OStrength

                 Df Sum of Sq    RSS    BIC
- RStrength       1    221.49 1728.3 73.829
- RFlexibility    1    228.58 1735.3 73.882
- LFlexibility    1     64.68 1571.5 72.592
- OStrength       1    660.99 2167.8 76.774
<none>                        1506.8 74.611
+ Hang            1      5.81 1501.0 77.126
+ LStrength       1      4.45 1502.3 77.137
What action would you take next? Justify your answer.
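The BIC column appears consistent with the criterion that `step()` computes for linear models, n log(RSS/n) + log(n) p, where p counts the fitted coefficients including the intercept. The sketch below checks two rows of the table under that assumption; `bic_lm` is a hypothetical helper, not a function from any package, and the RSS values are the rounded ones printed above.

```r
# Hypothetical helper: BIC for a linear model up to an additive constant,
# in the form n * log(RSS / n) + log(n) * (number of coefficients).
bic_lm <- function(rss, n_coef, n) n * log(rss / n) + log(n) * n_coef

n <- 13

# Current model: 4 predictors plus an intercept.
bic_current <- bic_lm(1506.8, n_coef = 5, n = n)

# Candidate model with LFlexibility dropped: 3 predictors plus an intercept.
bic_drop_lflex <- bic_lm(1571.5, n_coef = 4, n = n)

print(round(c(current = bic_current, drop_LFlexibility = bic_drop_lflex), 3))
```

Smaller values of the criterion are preferred, so each candidate move is judged by comparing its BIC with the `<none>` row.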
c. (1 mark) After finishing stepwise selection we obtain the following model:
tidy(pfitbyBIC)

##          term   estimate  std.error statistic    p.value
## 1 (Intercept) 12.7675932 24.9925728 0.5108555 0.62054007
## 2  R_Strength  0.5563157  0.2104269 2.6437479 0.02457537
## 3  O_Strength  0.2716885  0.1003015 2.7087170 0.02198197
From this model, what is the predicted Distance for a punter with RStrength 170 and OStrength 266? Round
the coefficients to three significant digits and report your answer to three digits.
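The prediction is the intercept plus each coefficient times its explanatory value. A sketch of the arithmetic, rounding the coefficients to three significant digits as instructed (variable names are illustrative):

```r
# Coefficients from the fitted model, rounded to three significant digits.
b0      <- signif(12.7675932, 3)  # intercept
b_rstr  <- signif(0.5563157, 3)   # right leg strength
b_ostr  <- signif(0.2716885, 3)   # overall leg strength

# Predicted Distance for RStrength = 170 and OStrength = 266.
pred <- b0 + b_rstr * 170 + b_ostr * 266
print(signif(pred, 3))
```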

Question 3: Soft drinks

A soft drink vendor collects data on the relationship between the time in minutes a delivery takes and two
explanatory variables: (i) the number of cases delivered (Cases), and (ii) the walking distance in feet to make
the delivery (Distance). The following summaries are obtained for 25 deliveries:
dat <-read.table("softdrin.txt",header=TRUE)
summary(dat)

## Time Cases Distance
## Min. : 8.00 Min. : 2.00 Min. : 36.0
## 1st Qu.:13.75 1st Qu.: 4.00 1st Qu.: 150.0
## Median :18.11 Median : 7.00 Median : 330.0
## Mean :22.38 Mean : 8.76 Mean : 409.3
## 3rd Qu.:21.50 3rd Qu.:10.00 3rd Qu.: 605.0
## Max. :79.24 Max. :30.00 Max. :1460.0
ggplot(dat,aes(x=Time)) + geom_histogram(binwidth=10)
[Figure: histogram of Time (x-axis: Time, y-axis: count)]
ggplot(dat,aes(x=Cases)) + geom_histogram(binwidth=5)

[Figure: histogram of Cases (x-axis: Cases, y-axis: count)]
ggplot(dat,aes(x=Distance)) + geom_histogram(binwidth=100)

[Figure: histogram of Distance (x-axis: Distance, y-axis: count)]
a. (1 mark) How would you describe the distribution of the Distance variable?
b. (2 marks) Write out a linear model for mean Time that includes interaction between Cases and Distance.
Define any notation you use.
c. In terms of your model from (b), state formal hypotheses for testing for statistical
interaction.

d. (2 marks) Computer software reports the following VIFs. Do you have any concerns about collinearity?
If so, why? If not, why not?
sfit<-lm(Time~Cases*Distance,data=dat)
library(car)

##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
##     recode
vif(sfit)

##          Cases       Distance Cases:Distance
##       6.932817       4.842433      10.765414
e. (6 marks) The model is refit with the Cases and Distance variables centred by their means. The fitted
model yields the following residual plots. For each plot, state the assumptions being checked, and give
your opinion about whether or not each assumption is plausible:
dat <- mutate(dat,Cases = Cases-mean(Cases), Distance=Distance-mean(Distance))
sfit<-lm(Time~Cases*Distance,data=dat)
augment(sfit) %>% ggplot(aes(x=.fitted,y=.resid)) + geom_point() +
geom_smooth()

## geom_smooth() using method = loess

[Figure: residuals versus fitted values with a loess smooth (x-axis: .fitted, y-axis: .resid)]
augment(sfit) %>% ggplot(aes(sample=.resid)) + geom_qq()

[Figure: normal Q-Q plot of the residuals (x-axis: theoretical, y-axis: sample)]
f. (2 marks) The following graphic, called a dotplot, shows the hat values for the fitted model. Each
observation is represented by a dot.
augment(sfit) %>% ggplot(aes(x=.hat)) + geom_dotplot(binwidth=.03)

[Figure: dotplot of hat values (x-axis: .hat, with ticks from 0.00 to 0.75; y-axis: count)]
Are there any observations with very high leverage? If so, why? If not, why not?
g. (2 marks) The Cook’s Distance values are:
round(augment(sfit)$.cooksd,2)

## [1] 0.09 0.00 0.00 0.06 0.00 0.00 0.04 0.00 2.76 0.17 0.19 0.01 0.00 0.01
## [15] 0.01 0.03 0.00 0.05 0.01 0.11 0.02 0.13 0.03 0.07 0.01
Are there any highly influential observations? If so, why? If not, why not?
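One way to screen the Cook's distances is to flag any observation above a cutoff. The sketch below uses a cutoff of 1, which is a common rule of thumb and an assumption here; the course may use a different threshold. The values are typed in from the output above.

```r
# Cook's distances copied from the printed output (25 deliveries).
cooksd <- c(0.09, 0.00, 0.00, 0.06, 0.00, 0.00, 0.04, 0.00, 2.76, 0.17,
            0.19, 0.01, 0.00, 0.01, 0.01, 0.03, 0.00, 0.05, 0.01, 0.11,
            0.02, 0.13, 0.03, 0.07, 0.01)

# Flag observations whose Cook's distance exceeds the assumed cutoff of 1.
cutoff  <- 1
flagged <- which(cooksd > cutoff)
print(flagged)
print(cooksd[flagged])
```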
h. (2 marks) A summary of the fitted model is as follows:
tidy(sfit)

##             term     estimate    std.error statistic      p.value
## 1    (Intercept) 2.107030e+01 0.5795254762 36.357857 1.895179e-20
## 2          Cases 1.318060e+00 0.1462440700  9.012740 1.157667e-08
## 3       Distance 1.232658e-02 0.0027574849  4.470225 2.111072e-04
## 4 Cases:Distance 7.419211e-04 0.0001749773  4.240100 3.659326e-04
Test for interaction at the 5% level and write two sentences to report your conclusion, one using technical
language and one in language that anyone can understand.
i. (BONUS question) Write a sentence to interpret the effect of increasing Distance by 100 feet for a
delivery of 10 cases. Ten cases translates to a value of 1.24 of the centred Cases explanatory variable.
To calculate the effect, round coefficients to three significant digits and report the effect size to three
digits.
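With an interaction in the model, the slope for Distance depends on the value of Cases: it is the Distance coefficient plus the interaction coefficient times the centred Cases value. A sketch of that arithmetic with the coefficients rounded to three significant digits (variable names are illustrative):

```r
# Coefficients from the fitted model, rounded to three significant digits.
b_dist <- signif(1.232658e-02, 3)  # Distance main effect
b_int  <- signif(7.419211e-04, 3)  # Cases:Distance interaction

# Centred Cases value corresponding to a delivery of 10 cases.
cases_c <- 1.24

# Slope of mean Time in Distance at this Cases value, then scaled to 100 feet.
slope_dist <- b_dist + b_int * cases_c
effect_100 <- 100 * slope_dist
print(signif(effect_100, 3))  # change in mean Time, in minutes
```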

Question 4: Maple seeds

Maple tree seeds look like spinning helicopters when they fall from the tree. A forest scientist studied the
relationship between how fast they fall (Velocity) and their size (Load), taking a total of 35 measurements
on three trees (12 on two of them and 11 on the third). We will analyze the relationship between Velocity
and Load, allowing for different relationships in the three trees (Tree).
a. (1 mark) How many dummy variables are required to include Tree in regression models?
b. (2 marks) A model that allows different lines for mean Velocity as a function of Load for each Tree
gives the following summary.
dat <- read.table("samara.txt",header=TRUE)
dat$Tree <- factor(dat$Tree)
fit1<-lm(Velocity~Load*Tree,data=dat)
tidy(fit1)

##          term   estimate std.error  statistic    p.value
## 1 (Intercept)  0.5414479 0.2632359  2.0568924 0.04879063
## 2        Load  3.0628684 1.1599016  2.6406278 0.01318748
## 3       Tree2 -0.8407505 0.3356459 -2.5048736 0.01812005
## 4       Tree3 -0.2986812 0.4454446 -0.6705239 0.50782912
## 5  Load:Tree2  3.7342611 1.4999857  2.4895311 0.01877360
## 6  Load:Tree3  0.8204951 2.2836681  0.3592882 0.72198244
What are the estimated intercept and slope for the line for Tree2? Use three significant digits in your
calculations and report your answer to three digits.
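Under R's default baseline coding, the line for a non-baseline tree is obtained by adding that tree's dummy coefficient to the baseline intercept and its interaction coefficient to the baseline slope. A sketch of that arithmetic for Tree2, using coefficients rounded to three significant digits (variable names are illustrative):

```r
# Coefficients from the fitted model, rounded to three significant digits.
b_int        <- signif(0.5414479, 3)   # baseline (Tree1) intercept
b_load       <- signif(3.0628684, 3)   # baseline (Tree1) slope
b_tree2      <- signif(-0.8407505, 3)  # intercept shift for Tree2
b_load_tree2 <- signif(3.7342611, 3)   # slope shift for Tree2

# Line for Tree2: add the shifts to the baseline intercept and slope.
intercept_tree2 <- b_int + b_tree2
slope_tree2     <- b_load + b_load_tree2
print(signif(c(intercept = intercept_tree2, slope = slope_tree2), 3))
```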
c. (2 marks) The results of a multiple partial F test for interaction are summarized as follows.
fit2 <- lm(Velocity ~ Load+Tree,data=dat)
anova(fit2,fit1)

## Analysis of Variance Table
##
## Model 1: Velocity ~ Load + Tree
## Model 2: Velocity ~ Load * Tree
##   Res.Df     RSS Df Sum of Sq     F  Pr(>F)
## 1     31 0.20344
## 2     29 0.16549  2  0.037949 3.325 0.05011 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What are the degrees of freedom for the F statistic?
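The partial F statistic compares the drop in residual sum of squares per added parameter with the residual mean square of the larger model. The sketch below recomputes it from the RSS and residual degrees of freedom printed in the table; the variable names are illustrative and the inputs are rounded, so agreement is only up to rounding.

```r
# Values copied from the ANOVA comparison table.
rss_reduced <- 0.20344; df_reduced <- 31  # Velocity ~ Load + Tree
rss_full    <- 0.16549; df_full    <- 29  # Velocity ~ Load * Tree

# Numerator df = number of extra parameters; denominator df = full-model residual df.
df_num <- df_reduced - df_full
df_den <- df_full

f_stat  <- ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

print(c(df_num = df_num, df_den = df_den))
print(round(c(F = f_stat, p = p_value), 4))
```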
d. (2 marks) State the results of testing the no-interaction null hypothesis at the 5% level in (i) technical
language and (ii) non-technical language that anyone can understand.
