MLR Version2

This document describes conducting multiple linear regression analysis in R to predict stock index prices using interest rates and unemployment rates as predictor variables. It includes steps to import and prepare the data, fit three regression models (multiple linear regression, random forest regression, and support vector regression), compare their performance using RMSE, and visualize the predicted versus actual values. The assignment is to replicate this analysis for a marketing dataset to predict sales using three advertising variables as predictors and compare the regression models.

Uploaded by

Melanie Samsona

# copy and paste to a blank file in R

# your cellphone may block this file if given in R format

## Your assignment is found at the bottom of the document

## If you want to see the corresponding output then
# run R code line by line

## Full Illustration of Multiple Linear Regression Using R

## You have to run every line of code to see the output

## Because of him ...

## "Life is really simple, but we insist on making it complicated."
## --- **Confucius**

## Topic: Prediction of Stock Index Price using Interest_Rate and

## Unemployment_Rate as predictor variables

## Methods: Multiple Linear Regression, Random Forest Regression and
## Support Vector Regression

# Prepared by:
# Carlito O. Daarol
# Faculty/Statistician/Data Scientist
# Mathematics Department
# Mindanao State University
# General Santos city
# January 1, 2021

# Step 1: Enter the data.

(Year <- c(2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017,
           2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016))
(Month <- c(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1,
            12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1))
(Interest_Rate <- c(2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2, 2,
                    2, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75))
(Unemployment_Rate <- c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1))
(Stock_Index_Price <- c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075,
                        1047, 965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719))

## combine all variables as a table

data <-
as.data.frame(cbind(Year,Month,Interest_Rate,Unemployment_Rate,Stock_Index_Price))

# check first 6 rows

head(data)
# load variable names to memory
attach(data)

# --------------Here is how to save file---------------------

# save table as Excel csv file for future retrieval
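# row.names = FALSE keeps the row index out of the file, so that
# read.csv does not bring it back as an extra column named X
write.csv(data, file = "Multreg.csv", row.names = FALSE)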

# --------------Here is how to load file---------------------

filedata <- read.csv("Multreg.csv")
filedata
# -----------------------------------------------------------

# Step 2: Check for linearity of relationship by inspection

# response variable = Stock_Index_Price versus
# predictor1 = Interest_Rate
# predictor2 = Unemployment_Rate

plot(x=Interest_Rate, y=Stock_Index_Price)
plot(x=Unemployment_Rate, y=Stock_Index_Price)
# the plot should suggest a linear pattern
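
# The visual check above can be backed up with numbers. The sketch below
# computes the Pearson correlation of each predictor with the response and
# draws a scatterplot matrix. The name stock_demo is hypothetical (it is not
# part of the original script); the values are the Step 1 data.

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)

# Pearson correlation of each predictor with the response
r_interest <- cor(stock_demo$Interest_Rate, stock_demo$Stock_Index_Price)
r_unemployment <- cor(stock_demo$Unemployment_Rate, stock_demo$Stock_Index_Price)
print(c(interest = r_interest, unemployment = r_unemployment))

# scatterplot matrix of all three variables in one figure
pairs(stock_demo)
```

# A correlation close to +1 or -1 supports a linear pattern; the sign tells
# you whether the response rises or falls with the predictor.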

# Step 3: Use this template, apply the multiple linear regression in R

# model <- lm(Dependent_variable ~ First_independent_variable + Second_independent_variable + ...)
# summary(model)

model <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate)

summary(model)
multiple_linear_prediction <- predict(model, filedata)

# Step 4: Inspect the results

# You should see this kind of results

# Call:
# lm(formula = Stock_Index_Price ~ Interest_Rate + Unemployment_Rate)

# Residuals:
#      Min       1Q   Median       3Q      Max
# -158.205  -41.667   -6.248   57.741  118.810

# Coefficients:
#                   Estimate Std. Error t value Pr(>|t|)
# (Intercept)         1798.4      899.2   2.000  0.05861 .
# Interest_Rate        345.5      111.4   3.103  0.00539 **
# Unemployment_Rate   -250.1      117.9  -2.121  0.04601 *
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Residual standard error: 70.56 on 21 degrees of freedom
# Multiple R-squared: 0.8976, Adjusted R-squared: 0.8879
# F-statistic: 92.07 on 2 and 21 DF, p-value: 4.043e-11

# Step 5: You can use the coefficients in the summary
# to build the multiple linear regression equation as follows:

# Stock_Index_Price = (Intercept) + (Interest_Rate coef)*X1 + (Unemployment_Rate coef)*X2
# Stock_Index_Price = (1798.4) + (345.5)*X1 + (-250.1)*X2
# Replace X1 and X2 with particular values, say X1 = 1.5 and X2 = 5.8

# Step 6: Make a prediction

# For example, imagine that you want to predict the stock index price
# after you collected the following data:

# Interest Rate = 1.5 (i.e., X1= 1.5)

# Unemployment Rate = 5.8 (i.e., X2= 5.8)
Stock_Index_Price1 = (1798.4) + (345.5)*(1.5) + (-250.1)*(5.8)
Stock_Index_Price1
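
# The hand calculation above uses coefficients rounded to one decimal place.
# As a sketch, the same prediction can be obtained from the fitted model with
# predict(), which uses the unrounded coefficients and can also return a
# prediction interval. The names stock_demo, fit_demo, new_obs and pred_demo
# are hypothetical (not part of the original script).

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)

# same model as in Step 3, fitted on the demo data frame
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)

# the new observation must use the same column names as the predictors
new_obs <- data.frame(Interest_Rate = 1.5, Unemployment_Rate = 5.8)

# point prediction from the unrounded coefficients
pred_demo <- predict(fit_demo, newdata = new_obs)
print(pred_demo)

# point prediction together with a 95% prediction interval
print(predict(fit_demo, newdata = new_obs, interval = "prediction", level = 0.95))
```

# The result differs a little from the hand calculation because of rounding.
# Note that an interest rate of 1.5 lies below the smallest observed value
# (1.75), so this prediction is an extrapolation.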

# Step 7: Some additional statistics to consider in the output summary:

# pickup the value of each statistic

# 1. Adjusted R-squared reflects the fit of the model,

# where a higher value generally indicates a better fit

# 2. Intercept coefficient is the Y-intercept

# 3. Interest_Rate coefficient is the change in Y due to a change of

# one unit in the interest rate (everything else held constant)

# 4. Unemployment_Rate coefficient is the change in Y due to a

# change of one unit in the unemployment rate
# (everything else held constant)

# 5. Std. Error is the estimated standard error of each coefficient;
# a smaller value means a more precise estimate

# 6. Pr(>|t|) is the p-value. A p-value of less than 0.05 is
# conventionally considered statistically significant
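
# The statistics listed above do not have to be read off the printed summary.
# The sketch below pulls them out of the summary object and adds 95%
# confidence intervals for the coefficients. The names stock_demo, fit_demo
# and s_demo are hypothetical (not part of the original script).

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)
s_demo <- summary(fit_demo)

# goodness of fit
print(s_demo$r.squared)      # Multiple R-squared
print(s_demo$adj.r.squared)  # Adjusted R-squared
print(s_demo$sigma)          # Residual standard error

# coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
print(coef(s_demo))

# p-values only (fourth column of the coefficient table)
print(coef(s_demo)[, "Pr(>|t|)"])

# 95% confidence intervals for the coefficients
print(confint(fit_demo, level = 0.95))
```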

# Step 8: Extract all results in the regression model

# extract variables under the model object

names(model)
# [1] "coefficients"  "residuals"    "effects"  "rank"  "fitted.values"  "assign"
# [7] "qr"            "df.residual"  "xlevels"  "call"  "terms"          "model"

# extract fitted.values or forecast values

model$fitted.values

# extract model coefficients

model$coefficients

# extract the variables in the model

model$model
model$model$Stock_Index_Price
model$model$Interest_Rate
model$model$Unemployment_Rate
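
# Besides the $ components shown above, base R provides accessor functions
# that return the same pieces. The sketch below compares the two and checks
# that fitted value + residual gives back the observed response. The names
# stock_demo and fit_demo are hypothetical (not part of the original script).

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)

# accessor functions and their $ equivalents
print(coef(fit_demo))               # same as fit_demo$coefficients
print(head(fitted(fit_demo)))       # same as fit_demo$fitted.values
print(head(residuals(fit_demo)))    # same as fit_demo$residuals
print(head(model.frame(fit_demo)))  # same as fit_demo$model

# fitted value + residual reproduces the observed response
rebuilt <- fitted(fit_demo) + residuals(fit_demo)
print(all.equal(unname(rebuilt), stock_demo$Stock_Index_Price))
```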

# Step 9: check for normality of the regression error terms

plot(model$residuals)
hist(model$residuals)
shapiro.test(model$residuals)
# reject normality of the residuals if p-value < 0.05
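
# A normal Q-Q plot is a common companion to the histogram and the
# Shapiro-Wilk test. The sketch below draws one and stores the test result so
# that the p-value can be used in code. The names stock_demo, fit_demo,
# res_demo and sw_demo are hypothetical (not part of the original script).

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)
res_demo <- residuals(fit_demo)

# points close to the reference line suggest approximately normal residuals
qqnorm(res_demo)
qqline(res_demo)

# keep the test result so the p-value can be compared with 0.05 in code
sw_demo <- shapiro.test(res_demo)
print(sw_demo$p.value)
print(sw_demo$p.value < 0.05)  # TRUE would mean normality is rejected
```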

# Step 10: compute the root mean square error (RMSE) of the linear model

rmse_mlr <- sqrt(mean(model$residuals^2))
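
# The RMSE computed here is not the same number as the "Residual standard
# error" in the summary. Both come from the residual sum of squares (RSS):
# the RMSE divides by the number of observations n, while the residual
# standard error divides by the residual degrees of freedom (n minus the
# number of coefficients). The names stock_demo, fit_demo, rss_demo,
# rmse_demo and rse_demo are hypothetical (not part of the original script).

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)

n_demo   <- nrow(stock_demo)             # 24 observations
rss_demo <- sum(residuals(fit_demo)^2)   # residual sum of squares

rmse_demo <- sqrt(rss_demo / n_demo)                 # divides by n
rse_demo  <- sqrt(rss_demo / df.residual(fit_demo))  # divides by n - 3 = 21

print(c(rmse = rmse_demo, residual_standard_error = rse_demo))
```

# With the values reported in this document: 70.56 * sqrt(21/24) is about 66.00.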

# Step 11: compute the RMSE of two other models,

# Random Forest regression and Support Vector regression

library(randomForest)
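# note: the formula "~ ." uses every other column of filedata as a predictor
# (Year and Month included), not only the two rates used in the linear model
random_forest <- randomForest(Stock_Index_Price ~ ., data = filedata, ntree = 5000)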
random_forest_prediction <- predict(random_forest, filedata)
residual <- random_forest_prediction - filedata$Stock_Index_Price
rmse_rf <- sqrt(mean(residual^2))

library(e1071)
svr <- svm(Stock_Index_Price ~., data = filedata)
# grid search over epsilon and cost (tune uses cross-validation by default)
OptModelsvm <- tune(svm, Stock_Index_Price ~ ., data = filedata,
                    ranges = list(epsilon = seq(0, 1, 0.1), cost = 1:100))
BestModel=OptModelsvm$best.model
svr_prediction <- predict(BestModel, filedata)
residual <- svr_prediction - filedata$Stock_Index_Price
rmse_svr <- sqrt(mean(residual^2))

# present model root mean square error for all models

c(rmse_mlr,rmse_rf,rmse_svr)
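#[1] 66.00463 42.85300 17.79912
# (values from the original run; the random forest and the cross-validated
# SVR tuning involve random sampling, so your numbers will differ)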
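
# All three RMSE values above are computed on the same 24 rows that were used
# to fit the models, which tends to favour flexible models. One hedged way to
# check this is leave-one-out cross-validation: refit the model 24 times,
# each time predicting the single row that was left out. The sketch below
# does this for the linear model only, using base R. This is an addition for
# illustration, not part of the original analysis; the names stock_demo,
# fit_demo, loo_pred, rmse_in and rmse_loo are hypothetical.

```r
# stock_demo is a hypothetical name; the values are copied from Step 1
stock_demo <- data.frame(
  Interest_Rate = c(2.75, rep(2.5, 6), rep(2.25, 3), rep(2, 3), rep(1.75, 11)),
  Unemployment_Rate = c(5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                        6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1),
  Stock_Index_Price = c(1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167,
                        1130, 1075, 1047, 965, 943, 958, 971, 949, 884, 866, 876,
                        822, 704, 719)
)

# in-sample RMSE, as in Step 10
fit_demo <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
               data = stock_demo)
rmse_in <- sqrt(mean(residuals(fit_demo)^2))

# leave-one-out: fit without row i, then predict row i
n_demo   <- nrow(stock_demo)
loo_pred <- numeric(n_demo)
for (i in seq_len(n_demo)) {
  fit_i <- lm(Stock_Index_Price ~ Interest_Rate + Unemployment_Rate,
              data = stock_demo[-i, ])
  loo_pred[i] <- predict(fit_i, newdata = stock_demo[i, ])
}
rmse_loo <- sqrt(mean((stock_demo$Stock_Index_Price - loo_pred)^2))

print(c(in_sample = rmse_in, leave_one_out = rmse_loo))
```

# The same loop can be wrapped around randomForest() or svm() to compare the
# three models on rows they have not seen.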

# Step 12: Plot Stock_Index_Price and the predicted values

# MLR model versus RandomForest Model versus Support Vector model

num_obs <- length(Stock_Index_Price)

x_vals = seq(from = 0, to = 2000, length.out = num_obs)
par(mfrow =c(1,3))
# observed values as points, model predictions as a line
plot(x_vals, Stock_Index_Price, xlab = "Multiple Linear Regression")
lines(x_vals, multiple_linear_prediction, col = "blue", lwd = 2)

plot(x_vals, Stock_Index_Price, xlab = "Random Forest Regression")
lines(x_vals, random_forest_prediction, col = "green", lwd = 2)

plot(x_vals, Stock_Index_Price, xlab = "Support Vector Regression")
lines(x_vals, svr_prediction, col = "red", lwd = 2)
par(mfrow =c(1,1))

# Another display using ggplot2

data <- as.data.frame(cbind(x_vals,Stock_Index_Price))
colnames(data) <- c("X","Y")

library(ggplot2)

title = paste0("Multiple Linear Regression RMSE = ", round(rmse_mlr,2))

ggplot2::ggplot() +
  ggplot2::geom_point(data = data, size = 2,
                      ggplot2::aes(x = X, y = Y, color = "Stock_Index_Price")) +
  # Multiple Linear Regression Predictions
  ggplot2::geom_line(data = data, size = 2, alpha = 0.7,
                     ggplot2::aes(x = X, y = multiple_linear_prediction,
                                  color = "Predicted with MLR")) +
  ggplot2::ggtitle(title)

title = paste0("Random Forest Regression RMSE = ", round(rmse_rf,2))

ggplot2::ggplot() +
  ggplot2::geom_point(data = data, size = 2,
                      ggplot2::aes(x = X, y = Y, color = "Stock_Index_Price")) +
  # Random Forest Regression Predictions
  ggplot2::geom_line(data = data, size = 2, alpha = 0.7,
                     ggplot2::aes(x = X, y = random_forest_prediction,
                                  color = "Predicted with RandomForest")) +
  ggplot2::ggtitle(title)

title = paste0("Support Vector Regression RMSE = ", round(rmse_svr,2))

ggplot2::ggplot() +
  ggplot2::geom_point(data = data, size = 2,
                      ggplot2::aes(x = X, y = Y, color = "Stock_Index_Price")) +
  # Support Vector Regression Predictions
  ggplot2::geom_line(data = data, size = 2, alpha = 0.7,
                     ggplot2::aes(x = X, y = svr_prediction,
                                  color = "Predicted with Support Vector Regression")) +
  ggplot2::ggtitle(title)

# Your group assignment, to be submitted as a Word file

# Fit a multiple regression model to the given dataset
# (the marketing dataset)

data <- read.csv("E:/Advance Models/marketing.csv")

data <- data[-1]
head(data)
dim(data)

# Your problem: Develop the 3 regression models and get

# the corresponding rmse

# Response variable: sales

# Predictor variable1 = youtube advertising
# Predictor variable2 = facebook advertising
# Predictor variable3 = newspaper advertising
