HW 6
(a) Split the data set into a training set and a test set. Fix the random seed
to the value 234, choose 30% (rounded down to the nearest integer) of the
data at random for testing, and use the rest for training. Define a new
response variable Accept/Apps. Plot this variable against every other
in the dataset (make sure you use the appropriate type of plot for each
predictor). Comment on which variables appear to be most predictive.
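For concreteness, a minimal plotting sketch in Python. It assumes the
ISLR College data in a file College.csv (hypothetical name) with columns
Apps and Accept plus a categorical Private; numeric predictors get
scatterplots and factors get boxplots:

    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv("College.csv", index_col=0)  # hypothetical file name
    rate = df["Accept"] / df["Apps"]              # new response Accept/Apps

    for col in df.columns.drop("Accept"):
        fig, ax = plt.subplots()
        if df[col].dtype == object:    # categorical (e.g. Private): boxplot
            levels = sorted(df[col].unique())
            ax.boxplot([rate[df[col] == lev] for lev in levels],
                       labels=levels)
        else:                          # numeric predictor: scatterplot
            ax.scatter(df[col], rate, s=8)
        ax.set_xlabel(col)
        ax.set_ylabel("Accept/Apps")
    plt.show()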
(b) Fit a linear model using least squares on the training set, with
Accept/Apps as the response variable and all other variables as
predictors, and report the training and test errors obtained.
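One way to set this up with scikit-learn (a sketch under assumptions: the
file name, mean squared error as the error measure, Private recoded to
0/1, and Accept dropped from the predictors since it defines the
response). Note that the exact split depends on the software's random
number generator, so seed 234 only pins down the split within one
implementation:

    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("College.csv", index_col=0)  # hypothetical file name
    df["Private"] = (df["Private"] == "Yes").astype(int)
    y = df["Accept"] / df["Apps"]                 # response: acceptance rate
    X = df.drop(columns=["Accept"])               # Accept defines the response

    # 30% of the rows (rounded down) for testing, seed fixed at 234
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=int(0.3 * len(df)), random_state=234)

    lm = LinearRegression().fit(X_tr, y_tr)
    print("train MSE:", mean_squared_error(y_tr, lm.predict(X_tr)))
    print("test  MSE:", mean_squared_error(y_te, lm.predict(X_te)))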
(c) Perform forward and backward selection over the full set of
predictors, using the p-value threshold α = 0.05, to select a potentially
smaller model. Report which model each method chose, and the training and
test errors for the chosen models.
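p-value-based stepwise selection is not built into scikit-learn; a common
hand-rolled version uses statsmodels. A sketch, reusing X_tr and y_tr
from the split under (b):

    import statsmodels.api as sm

    def pval(X, y, cols, c):
        """p-value of predictor c in the OLS fit on columns `cols`."""
        return sm.OLS(y, sm.add_constant(X[cols])).fit().pvalues[c]

    def forward_select(X, y, alpha=0.05):
        """Add the predictor with the smallest p-value while it is < alpha."""
        chosen, remaining = [], list(X.columns)
        while remaining:
            pvals = {c: pval(X, y, chosen + [c], c) for c in remaining}
            best = min(pvals, key=pvals.get)
            if pvals[best] >= alpha:
                break
            chosen.append(best)
            remaining.remove(best)
        return chosen

    def backward_select(X, y, alpha=0.05):
        """Drop the predictor with the largest p-value while it is >= alpha."""
        chosen = list(X.columns)
        while chosen:
            m = sm.OLS(y, sm.add_constant(X[chosen])).fit()
            pv = m.pvalues.drop("const")
            if pv.max() < alpha:
                break
            chosen.remove(pv.idxmax())
        return chosen

    print("forward: ", forward_select(X_tr, y_tr))
    print("backward:", backward_select(X_tr, y_tr))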
(d) Use AIC, BIC, and adjusted R² to select a potentially smaller model
instead, from the set of all possible predictors used in (b). Report
which model each criterion chose, and the training and test errors for
the chosen model(s).
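With statsmodels, AIC, BIC, and adjusted R² come straight off a fitted
OLS object. Exhaustive best-subset search is expensive with this many
predictors, so the sketch below uses a greedy forward search scored by
each criterion (smaller-is-better convention, so adjusted R² enters with
a minus sign); X_tr and y_tr are from the split under (b):

    import numpy as np
    import statsmodels.api as sm

    def forward_by(X, y, score):
        """Greedy forward search minimizing `score` (a function of a fit)."""
        chosen, remaining = [], list(X.columns)
        best = score(sm.OLS(y, np.ones(len(y))).fit())  # intercept-only start
        improved = True
        while remaining and improved:
            improved = False
            scores = {c: score(sm.OLS(y, sm.add_constant(X[chosen + [c]])).fit())
                      for c in remaining}
            c = min(scores, key=scores.get)
            if scores[c] < best:               # strict improvement only
                best = scores[c]
                chosen.append(c)
                remaining.remove(c)
                improved = True
        return chosen

    for name, score in {"AIC":   lambda m: m.aic,
                        "BIC":   lambda m: m.bic,
                        "adjR2": lambda m: -m.rsquared_adj}.items():
        print(name, forward_by(X_tr, y_tr, score))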
(e) Use 5-fold cross-validation to estimate the test error from the training
data, for the candidate smaller model(s) you found so far, and for the full
model from (b). Compare the training, CV, and test errors and comment
on the results.
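A sketch of the 5-fold CV estimates with scikit-learn, again reusing the
training split from (b); extend `subsets` with whatever column lists (c)
and (d) produced:

    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, cross_val_score

    # Add the subsets chosen in (c) and (d) alongside the full model.
    subsets = {"full": list(X_tr.columns)}
    cv = KFold(n_splits=5, shuffle=True, random_state=234)
    for name, cols in subsets.items():
        mse = -cross_val_score(LinearRegression(), X_tr[cols], y_tr,
                               scoring="neg_mean_squared_error", cv=cv).mean()
        print(f"{name}: 5-fold CV MSE = {mse:.4f}")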
(f) Fit a ridge regression model on the training set, with λ chosen by
cross-validation. Report the training and test errors.
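scikit-learn's RidgeCV tunes λ (called alpha there) over a grid by
cross-validation; standardizing the predictors first is the usual
practice. A sketch, reusing the split from (b) and an assumed grid range:

    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.metrics import mean_squared_error
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    alphas = np.logspace(-4, 4, 100)          # assumed λ grid
    ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=5))
    ridge.fit(X_tr, y_tr)
    print("chosen λ:", ridge.named_steps["ridgecv"].alpha_)
    print("train MSE:", mean_squared_error(y_tr, ridge.predict(X_tr)))
    print("test  MSE:", mean_squared_error(y_te, ridge.predict(X_te)))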
(g) Fit a lasso model on the training set, with λ chosen by cross-validation.
Report which variables are included in the model, and the training and
test errors obtained.
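LassoCV picks its own λ path by cross-validation; the variables included
in the model are those whose coefficients are nonzero at the chosen λ.
A sketch, reusing the split from (b):

    from sklearn.linear_model import LassoCV
    from sklearn.metrics import mean_squared_error
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    lasso = make_pipeline(StandardScaler(), LassoCV(cv=5, max_iter=10000))
    lasso.fit(X_tr, y_tr)
    lcv = lasso.named_steps["lassocv"]
    print("chosen λ:", lcv.alpha_)
    print("kept predictors:", list(X_tr.columns[lcv.coef_ != 0]))
    print("train MSE:", mean_squared_error(y_tr, lasso.predict(X_tr)))
    print("test  MSE:", mean_squared_error(y_te, lasso.predict(X_te)))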
(h) Fit a PCR model on the training set, with M chosen by cross-validation.
Report the test error obtained, along with the value of M selected by
cross-validation.
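scikit-learn has no dedicated PCR estimator; the standard construction is
a pipeline of standardization, PCA, and least squares, with the number of
components M tuned by GridSearchCV. A sketch, reusing the split from (b):

    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    pcr = make_pipeline(StandardScaler(), PCA(), LinearRegression())
    grid = GridSearchCV(pcr,
                        {"pca__n_components": range(1, X_tr.shape[1] + 1)},
                        scoring="neg_mean_squared_error", cv=5)
    grid.fit(X_tr, y_tr)
    print("M chosen by CV:", grid.best_params_["pca__n_components"])
    print("test MSE:", mean_squared_error(y_te, grid.predict(X_te)))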
(i) Fit a PLS model on the training set, with M chosen by cross-validation.
Report the test error obtained, along with the value of M selected by
cross-validation.
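The same tuning pattern works for PLS via scikit-learn's PLSRegression,
which standardizes internally (scale=True by default). A sketch, reusing
the split from (b):

    from sklearn.cross_decomposition import PLSRegression
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import GridSearchCV

    grid = GridSearchCV(PLSRegression(),
                        {"n_components": range(1, X_tr.shape[1] + 1)},
                        scoring="neg_mean_squared_error", cv=5)
    grid.fit(X_tr, y_tr)
    print("M chosen by CV:", grid.best_params_["n_components"])
    print("test MSE:", mean_squared_error(y_te, grid.predict(X_te)))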
(j) Comment on the results obtained. How accurately can we predict the
acceptance rate? How much difference is there among the test errors
resulting from different approaches? Which approach would you recommend
for this dataset and why?