
An introduction to multiple regression in R

Figure 1 – Well…

Last week we worked through the basics of simple linear regression analysis. Today and the next class
we’re going to take this into the multivariate realm and look at multiple regression. There are several
types of multiple regression, but we’ll focus on the most straightforward, linear case.

Background to the Problem

Extending OLS regression, we will move on to quantifying the relationships between one dependent
variable and two or more independent variables. As well as quantifying these more complex
relationships, we will learn about model selection – a concept that becomes more important the more
complicated a dataset we analyse. Through the course of this worksheet we will learn about the
following concepts, and how to implement them in R:

- Multiple Linear Regression and adjusted r2
- Model selection and the trade-off between model fit and number of explanatory variables
- Data transformation
Exercise

As with all of my exercises, the questions you will be expected to answer are interspersed throughout
the text. Please read carefully to make sure you don’t miss anything, and be sure to include answers to
every question in your write-up.

Part 1 – Preparation

As you did last week, create a directory to hold this week’s data and analyses somewhere convenient.
Switch to this directory as your working directory – Click “Misc”, “Change Working Directory” and
then find the directory you’ve just created. Remember that after each session you’ll want to save your
results by clicking “Workspace” and then “Save Workspace File”.

Part 2 – Analysis of US Life Expectancy

After I had spent a little while trying to compile some US demographic data by state for this case study,
it was pointed out to me that these (legacy) data already exist as a test dataset in R. So, thanks to
William B. King, your life and mine start more easily this class!

> state.data <- as.data.frame(state.x77)

You can look at the original data if you'd like (state.x77), but it's a matrix rather than a data frame, so it
isn't as useful to us. One useful way to view a summary of some data that you've imported is to use the
structure command:

> str(state.data)

This dataframe needs a little tidying up before we can start analysing it, so…:

Question 1) What does each variable mean, and in what units is it measured? Rename your columns
to remove the spaces in the variable names (you could just substitute the spaces for
periods) – that will save some hassle later. Add an appropriately named population
density column to your data set. Provide your answers and commands.

You may find the help command in R useful here. If you type a question mark, followed by any R
command, you will display the help file. For example, ?str() will tell you about the structure command.
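
If you get stuck on the renaming, here is a minimal sketch of one possible approach (using gsub() and
calling the new column Density are my choices, not requirements of the exercise):

> # Replace the spaces in the column names with periods, e.g. "Life Exp" becomes "Life.Exp"
> names(state.data) <- gsub(" ", ".", names(state.data))
> # Add a population density column (population per unit area)
> state.data$Density <- state.data$Population / state.data$Area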

As a first step, we're interested in determining if there are any potential causal relationships among the
variables in our dataset. In comes rule number one: if in doubt, plot it up. We could plot each pair
of variables separately, but that would be pretty tedious. We could also write a script to plot each pair
and output them into a matrix of plots, but you saw how much trouble we had with that last time…
Instead, fortunately, someone has already done the job for us. Try this:

> pairs(state.data)

Pretty nifty, eh?

Question 2) From the plot you have produced, state three pairs of variables that you would expect to
show moderate to strong correlations, and three pairs that you would expect to show no
correlation.

We can actually calculate these correlations very easily in R using the function cor(). This returns
Pearson's correlation coefficient, r – the square of which is the r2 we obtained from the OLS regressions
last week. When applied to a data frame, cor() will produce a matrix of the pairwise correlations of the
variables in the data frame. These will lie between -1 (perfect negative correlation) and 1 (perfect
positive correlation). We could simply output these using cor(state.data), but we can visualise them
even more easily using the following commands:

> library(lattice)
> levelplot(cor(state.data))

The first function tells R to use commands from an additional pre-existing, but not yet loaded, library of
commands called lattice. There are a huge number of these command libraries (packages) available;
some come packaged with R (like lattice), while others you have to download separately. The second line of code
uses one of these new functions to plot up a visual representation of the correlation matrix with strong
positive correlations in blue and strong negative correlations in pink. This may be the first time you’ve
used explicitly nested commands like this in R, but they work the same way as they do in algebra,
starting from the innermost set of parentheses and working outwards.
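
If the nesting looks confusing, you can always unpack it into separate steps – the two lines below are
exactly equivalent to the single nested command (the intermediate name cor.mat is just my choice):

> cor.mat <- cor(state.data)   # calculate the correlation matrix first...
> levelplot(cor.mat)           # ...then plot it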

Question 3) Looking at your correlation plot, do your predictions from question 2 hold? What are the
strongest positive and negative correlations? Remember that a variable will always
correlate 100% positively with itself, and so that information isn’t that useful…

If we were to simply carry out OLS regressions between these pairs of variables, we would be ignoring
the fact that variables left out of a bivariate model can influence the relationship we are trying to
understand, perhaps in complex and unexpected ways. To steal an example from our friend William
King, teacher pay is negatively correlated with SAT scores. Huh?? What are we paying these people
for? The issue here, which would not be clear from a bivariate model, is that SAT scores are strongly
negatively correlated with the proportion of students who sit the test (i.e. when only a few students sit
the test it is usually the best few, so scores are high; when everyone sits, scores decrease), and teacher
pay is positively correlated with the proportion of students sitting the SAT. Higher-paid teachers get
more of their students to sit the test, but this lowers the overall score.

Multiple regression overcomes this problem by looking at all the variables together and asking "what
would the influence of variable x be if I held the influence of all the other variables constant?" In the
case above, this restores the world to rights and shows that students of more highly paid teachers have
significantly higher SAT scores once the confounding effects are accounted for.

Let’s apply this to our dataset. It is reasonable to hypothesise that many of our variables will influence
life expectancy in some fashion. Fortunately for us, it is a simple matter to add these extra variables
into the function we called last week to carry out OLS regression, transforming it into multiple
regression:

> complete.model <- lm(Life.Exp ~ Population + Income + Illiteracy + Murder + HS.Grad + Frost
+ Area, data = state.data)

In this model, we are asking R to calculate the influence of each of the independent variables on the
dependent variable, accounting for the influence of all of the other independent variables. Nifty, eh?
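
As an aside, R has a shorthand for "use every other column as an IV". A sketch (the model name is
mine; note that if you added a density column in Question 1 this version would sweep it in too, so it is
not necessarily identical to the model written out above):

> # "~ ." means "regress Life.Exp on all of the other columns in the data frame"
> all.columns.model <- lm(Life.Exp ~ ., data = state.data)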

Question 4) Output a summary of your complete regression model. Is the model statistically
significant, and if so, what does this mean? What proportion of the variation in the
dependent variable (life expectancy) is explained by this combination of independent
variables? Which of the IVs appear significantly related to the DV and which don’t?
Provide a copy of your summary output to support your explanations.

As you can see, not all of our IVs are significantly related to our DV. The most obvious solution to this
problem would be to re-cast the model without all of the non-significant IVs at once. This would,
however, be inappropriate: as we remove IVs from our model, we remove their confounding effects, and
the relationships of the remaining IVs to our DV can change. To be more rigorous about this, we remove
the variables one at a time, from least significant to most significant, and see what influence this has on
our model.

Looking at our IVs, perhaps unsurprisingly, illiteracy shows the least significant correlation with life
expectancy. Let’s remove it and rerun the model. We could simply retype our previous command
without the “+ Illiteracy”. R has a shortcut for us, however:

> model2 <- update(complete.model, . ~ . - Illiteracy)

Here the periods mean "keep everything the same as in the previous model", so we're producing a new
model with the same DV and the same IVs, except without Illiteracy.

Question 5) Provide a summary of the new model. What has changed in the relationships of the IVs,
the p-values and the adjusted r2 value? How do you interpret this?

Repeat the step for the next least significant variable and take a look at your regression summary again.
Are we seeing a pattern? Continue on until you have removed all variables that aren't significant in your
model. Have you chosen a p-value that you are willing to accept as statistically significant? If not, you
should have done so, and stated it earlier. P = 0.05 is our standard, but we now know that we don't have
to stick to that.
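
To illustrate the pattern, a sketch only – check your own summary to see which variable really is the
next least significant; it may well not be the one I have assumed here:

> # Drop the next least significant IV from model2 (Income is an assumption -- use your own)
> model3 <- update(model2, . ~ . - Income)
> summary(model3)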

Question 6) Provide a summary of your final model. Which variables influence life expectancy?
Does this make sense? What issues did you encounter in narrowing down to this model?
Provide a table containing the p-values and adjusted r2 values for each model you
examined.

As you can see, r2 (the raw goodness of fit) generally decreases as we remove variables from the model
(i.e. models with fewer variables do a worse job of describing your data than models with more
variables), but not by much. However, a model with lots of parameters, most of which don't have a
large impact, isn't particularly useful when we are trying to understand general controlling principles.
What we want is a model that balances the best explanatory power with the fewest variables. How do
we figure out what this is?

A simple way is to use adjusted r2 – the “adjusted” in its title indicates that the goodness of fit of the
DV to the IVs has been modified to account for the number of parameters in the model, providing a
measure of the balance of model complexity and explanatory power. R2 is a tricky parameter in general,
however, as there is some portion of the variation in any real dataset that is random and hence can
never be explained by the model. Additionally, small differences in r2 can lead to large differences in
model fit, as illustrated on the following page (courtesy Charles Annis).
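
To make the adjustment concrete, here is a quick sketch of how the adjusted value is calculated from the
ordinary r2 (n is the number of observations, p the number of IVs); the result should match the adjusted
r2 reported by summary():

> r2 <- summary(complete.model)$r.squared
> n  <- nrow(state.data)
> p  <- length(coef(complete.model)) - 1     # number of IVs, excluding the intercept
> 1 - (1 - r2) * (n - 1) / (n - p - 1)       # adjusted r2, calculated by hand
> summary(complete.model)$adj.r.squared      # should agree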

Another method, which I prefer, is to use a technique based in information theory to assess model fit.
The model with the lowest calculated "Information Criterion" value (most commonly Akaike's "An
Information Criterion", or AIC) contains the most information for the least model complexity, and so is
the "best" model.
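
For reference, AIC is calculated as 2k - 2*log(likelihood), where k counts every estimated parameter
(the regression coefficients plus the residual variance), so the 2k term is the penalty for complexity. A
quick sanity check in R:

> 2 * (length(coef(complete.model)) + 1) - 2 * as.numeric(logLik(complete.model))
> AIC(complete.model)                        # should give the same number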

We can calculate AIC values for each of our candidate models using a command that looks something
like this:

> AIC(complete.model, …………., smallest.model)

Question 7) Provide a summary of the best fitting model according to AIC. Which variables
influence life expectancy now? How would you interpret this? How does this differ
from the model you selected in question 6?

You can actually carry out AIC-based model selection starting with the complete model using a single
command:

> step(complete.model, direction = "backward")

Check that this gives you the same answer as before. We now know that the factors controlling life
expectancy in the US are state population, murder rate, high school graduation rate and number of days
per year below freezing. Interesting and hurrah for R!
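
One practical tip (the name best.model is just my choice): step() returns the final fitted model as well as
printing the selection trace, so it is worth storing the result for the diagnostics that follow:

> best.model <- step(complete.model, direction = "backward")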

Wait, wait! Hang on. What did we forget? How about testing that we don't violate any of the
assumptions of linear regression? Oops. Let's do that now.
Question 8) Provide a 2x2 diagnostic plot of the regression test statistics for your best fitting model,
and evaluate whether any assumptions have been violated.
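
If you are unsure where to start, a minimal sketch, assuming you stored the selected model as
best.model as suggested above:

> par(mfrow = c(2, 2))    # a 2x2 grid of panels
> plot(best.model)        # residual, Q-Q, scale-location and leverage plots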

Looks like there might be a couple of issues, doesn't it? Firstly, I'd recommend removing the couple of
states with the highest leverage from your dataset, providing you can justify it... Maybe something else
is going on with those states that isn't captured by the variables we measured.
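
A sketch of one way to drop rows by state name – "Alaska" and "Hawaii" below are placeholders for
illustration only; substitute whichever states your own leverage plot actually flags:

> drop.states <- c("Alaska", "Hawaii")       # placeholders -- use the states you identified
> state.trimmed <- state.data[!(rownames(state.data) %in% drop.states), ]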

Question 9) Which states did you remove, and why? Provide the code you used to remove the states.

One other issue is that some of our individual IVs might not be normally distributed. Unlike with OLS
regression, we can't directly see which IVs are violating our assumptions. A quick way to eyeball this is
with box and whisker plots: if these are roughly symmetrical, our data are at least not badly skewed; if
not, something else is going on.

Let’s do this now:

> par(mfrow = c(3, 3))
> for(i in 1:ncol(state.data)) { boxplot(state.data[, i], main = names(state.data)[i]) }

Three of our variables seem to be strongly skewed away from normality. We have several options here –
ignore it (which we can do, as I talked about earlier), use a non-linear regression model (about which
more in a couple of weeks' time), or transform the variables. The latter is the simplest to pull off, so
let's give it a go.

Log transformation will substantially reduce even quite strong positive (right) skews, and can be
achieved in R using the log() or log10() functions, depending on the base you wish to use.
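
As a sketch of the mechanics, using Population as an example (your box plots will tell you whether it
really is one of the three offenders):

> state.data$log.Population <- log10(state.data$Population)
> boxplot(state.data$log.Population, main = "log10(Population)")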

Question 10) Log transform the three offending variables, plot new box plots showing that the
normality of these data has improved, and then repeat the regression/model selection
exercise using these new variables in place of the non-transformed variables. What
influence does this have on your model? Is the AIC better than before (i.e. does log
transforming provide a better model than the non-transformed data)? How do you
interpret your results? Provide all the plots and output necessary to support your
arguments.
