Chapter 4 Exercise 11

This exercise analyzes mpg data for automobiles. mpg is negatively correlated with cylinders, weight, displacement, and horsepower. LDA, QDA, logistic regression, and KNN models are then used to predict high vs. low mpg from those four variables, giving test error rates between 12% and 16%. Logistic regression gave the lowest test error (12.1%); among the KNN fits, k = 100 performed best at 14.3%.


library(ISLR)
summary(Auto)

## mpg cylinders displacement horsepower
## Min. : 9.0 Min. :3.00 Min. : 68 Min. : 46.0
## 1st Qu.:17.0 1st Qu.:4.00 1st Qu.:105 1st Qu.: 75.0
## Median :22.8 Median :4.00 Median :151 Median : 93.5
## Mean :23.4 Mean :5.47 Mean :194 Mean :104.5
## 3rd Qu.:29.0 3rd Qu.:8.00 3rd Qu.:276 3rd Qu.:126.0
## Max. :46.6 Max. :8.00 Max. :455 Max. :230.0
##
## weight acceleration year origin
## Min. :1613 Min. : 8.0 Min. :70 Min. :1.00
## 1st Qu.:2225 1st Qu.:13.8 1st Qu.:73 1st Qu.:1.00
## Median :2804 Median :15.5 Median :76 Median :1.00
## Mean :2978 Mean :15.5 Mean :76 Mean :1.58
## 3rd Qu.:3615 3rd Qu.:17.0 3rd Qu.:79 3rd Qu.:2.00
## Max. :5140 Max. :24.8 Max. :82 Max. :3.00
##
## name
## amc matador : 5
## ford pinto : 5
## toyota corolla : 5
## amc gremlin : 4
## amc hornet : 4
## chevrolet chevette: 4
## (Other) :365

attach(Auto)
mpg01 = rep(0, length(mpg))
mpg01[mpg > median(mpg)] = 1 # 1 if above the median mpg, 0 otherwise
Auto = data.frame(Auto, mpg01)

cor(Auto[, -9]) # drop the qualitative name column (column 9)
## mpg cylinders displacement horsepower weight
## mpg 1.0000 -0.7776 -0.8051 -0.7784 -0.8322
## cylinders -0.7776 1.0000 0.9508 0.8430 0.8975
## displacement -0.8051 0.9508 1.0000 0.8973 0.9330
## horsepower -0.7784 0.8430 0.8973 1.0000 0.8645
## weight -0.8322 0.8975 0.9330 0.8645 1.0000
## acceleration 0.4233 -0.5047 -0.5438 -0.6892 -0.4168
## year 0.5805 -0.3456 -0.3699 -0.4164 -0.3091
## origin 0.5652 -0.5689 -0.6145 -0.4552 -0.5850
## mpg01 0.8369 -0.7592 -0.7535 -0.6671 -0.7578
## acceleration year origin mpg01
## mpg 0.4233 0.5805 0.5652 0.8369
## cylinders -0.5047 -0.3456 -0.5689 -0.7592
## displacement -0.5438 -0.3699 -0.6145 -0.7535
## horsepower -0.6892 -0.4164 -0.4552 -0.6671
## weight -0.4168 -0.3091 -0.5850 -0.7578
## acceleration 1.0000 0.2903 0.2127 0.3468
## year 0.2903 1.0000 0.1815 0.4299
## origin 0.2127 0.1815 1.0000 0.5137
## mpg01 0.3468 0.4299 0.5137 1.0000

pairs(Auto) # doesn't work well since mpg01 is 0 or 1


mpg01 is negatively correlated with cylinders, weight, displacement, and horsepower (and, of course, strongly correlated with mpg itself).
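These associations can also be seen graphically. A minimal, self-contained sketch using base-R boxplots (it rebuilds mpg01 so it can run on its own):

```r
library(ISLR)
# Boxplots of each candidate predictor, split by the binary mpg class
Auto$mpg01 = as.numeric(Auto$mpg > median(Auto$mpg))
par(mfrow = c(2, 2))
for (v in c("cylinders", "weight", "displacement", "horsepower")) {
  boxplot(Auto[[v]] ~ Auto$mpg01, main = v, xlab = "mpg01")
}
```

Each predictor shows a clearly lower distribution in the high-mpg class, which is what makes them useful for classification below.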

train = (year%%2 == 0) # if the year is even


test = !train
Auto.train = Auto[train, ]
Auto.test = Auto[test, ]
mpg01.test = mpg01[test]

# LDA
library(MASS)
lda.fit = lda(mpg01 ~ cylinders + weight + displacement + horsepower,
    data = Auto, subset = train)
lda.pred = predict(lda.fit, Auto.test)
mean(lda.pred$class != mpg01.test)

## [1] 0.1264

12.6% test error rate.

# QDA
qda.fit = qda(mpg01 ~ cylinders + weight + displacement + horsepower,
    data = Auto, subset = train)
qda.pred = predict(qda.fit, Auto.test)
mean(qda.pred$class != mpg01.test)

## [1] 0.1319

13.2% test error rate.

# Logistic regression
glm.fit = glm(mpg01 ~ cylinders + weight + displacement + horsepower,
    data = Auto, family = binomial, subset = train)
glm.probs = predict(glm.fit, Auto.test, type = "response")
glm.pred = rep(0, length(glm.probs))
glm.pred[glm.probs > 0.5] = 1
mean(glm.pred != mpg01.test)

## [1] 0.1209

12.1% test error rate.
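The raw error rate hides which kind of mistake the classifier makes. A confusion matrix gives that detail; a short sketch (the objects glm.pred and mpg01.test come from the code above):

```r
# Cross-tabulate predictions against the true classes on the test set
conf = table(predicted = glm.pred, actual = mpg01.test)
conf
# Overall test error is 1 minus the trace over the total
1 - sum(diag(conf)) / sum(conf)
```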

library(class)
train.X = cbind(cylinders, weight, displacement, horsepower)[train, ]
test.X = cbind(cylinders, weight, displacement, horsepower)[test, ]
train.mpg01 = mpg01[train]
set.seed(1)
# KNN(k=1)
knn.pred = knn(train.X, test.X, train.mpg01, k = 1)
mean(knn.pred != mpg01.test)

## [1] 0.1538

# KNN(k=10)
knn.pred = knn(train.X, test.X, train.mpg01, k = 10)
mean(knn.pred != mpg01.test)

## [1] 0.1648

# KNN(k=100)
knn.pred = knn(train.X, test.X, train.mpg01, k = 100)
mean(knn.pred != mpg01.test)

## [1] 0.1429

k = 1 gives a 15.4% test error rate, k = 10 gives 16.5%, and k = 100 gives 14.3%. Of the values tried, k = 100 (100 nearest neighbors) performs best.
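Rather than checking a handful of k values by hand, the comparison can be scripted. A self-contained sketch (it rebuilds the split so it runs on its own; the set of k values is an arbitrary choice):

```r
library(ISLR)
library(class)
# Rebuild the binary response, the even-year training split, and the predictors
mpg01 = as.numeric(Auto$mpg > median(Auto$mpg))
train = Auto$year %% 2 == 0
X = as.matrix(Auto[, c("cylinders", "weight", "displacement", "horsepower")])
# Sweep over several k values and record the test error for each
ks = c(1, 5, 10, 50, 100, 200)
errs = sapply(ks, function(k) {
  set.seed(1) # knn() breaks distance ties at random
  pred = knn(X[train, ], X[!train, ], mpg01[train], k = k)
  mean(pred != mpg01[!train])
})
setNames(errs, ks) # inspect and pick the k with the smallest test error
```

Note that the predictors are on very different scales (weight in the thousands vs. cylinders in single digits), so weight dominates the Euclidean distance; standardizing the columns with scale(X) before calling knn() would be a natural refinement.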
