
LINEAR REGRESSION

Background

A linear regression test shows the relationship between 2 variables to determine whether an independent variable significantly affects a dependent variable.

Purpose: to evaluate the null hypothesis (i.e. there is no relationship between the 2 variables) against the alternative hypothesis (i.e. there is a significant linear relationship between the 2 variables).

P-values indicate the significance of the relationship. A low p-value (< 0.05) means the null hypothesis is rejected.
Assumptions

1. Linear relationship
The relationship between the independent and dependent variables must be linear (check with a scatter plot).

2. Little or no multicollinearity between explanatory variables

3. No auto-correlation of errors

4. Normality
Residuals should be normally distributed. Check with a Q-Q plot of the residuals (points should lie close to the line) or with a histogram.

5. Homoscedasticity
Homogeneity of variances. In the Residuals vs Fitted and Scale-Location plots, the residuals should be randomly scattered; a pattern in the distribution means there is no homoscedasticity.

6. No outliers
The Residuals vs Leverage plot uses Cook's distance to calculate whether any significant outliers could affect the analysis results.

Coding in R

1. Create/Import the data:
>data <- data.frame(hours = c(3, 7, 9, 8, 10, 12, 12, 14, 16, 18, 20, 28, 30, 38, 42),
                    score = c(64, 66, 76, 73, 74, 81, 83, 82, 80, 88, 84, 82, 91, 93, 89))
>View(data)
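As an optional quick check before plotting (not part of the original steps), base R's str() and summary() confirm the data frame has the expected structure and ranges:
>str(data)      #15 observations of 2 numeric variables
>summary(data)  #min/max and quartiles for hours and score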
2. Visualise the data
This allows us to check 2 assumptions behind the linear regression model: the variables having a roughly linear relationship, and the absence of outliers.
>scatter.smooth(data$score ~ data$hours,
    main = "Scatter graph: Hours studied vs. Exam Score",
    xlab = "Time (hours)",
    ylab = "Score (%)")
Though the relationship is not entirely linear, it is still worth carrying out a linear regression.
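To put a number on the strength of that linear relationship (an optional addition to the sheet's workflow), base R's cor() computes the Pearson correlation coefficient:
>cor(data$hours, data$score)  #values near +1 or -1 indicate a strong linear relationship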
3. Plot a boxplot
Visualises the distribution of exam scores, allowing the identification of any outliers.
>boxplot(data$score,
    main = 'Boxplot: Distribution of Scores',
    ylab = 'Score (%)')
No outliers are present here.
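If you want that outlier check as output rather than a picture (again optional), base R's boxplot.stats() returns the points a boxplot would draw as outliers:
>boxplot.stats(data$score)$out  #an empty result (numeric(0)) means no outliers were flagged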
4. Perform the linear regression
>linearmodel <- lm(data$score ~ data$hours)  #fitting the linear regression model
>summary(linearmodel)  #for quantitative output
>plot(data$score ~ data$hours,
    main = "Hours studied vs. Exam score",
    xlab = 'Time (hours)',
    ylab = 'Score (%)')  #plot the 2 variables
>abline(linearmodel, col = 'blue')  #adding the linear regression line
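One optional refinement, not part of the original sheet: fitting with the formula interface and a data argument keeps the variable names clean ("hours" rather than "data$hours") and lets predict() score new observations. The name linearmodel2 and the value hours = 25 are arbitrary choices for this sketch:
>linearmodel2 <- lm(score ~ hours, data = data)  #same fit, cleaner variable names
>predict(linearmodel2, newdata = data.frame(hours = 25))  #predicted exam score after 25 hours of study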
5. Create residual plots and run diagnostic plots
2 further assumptions of linear regression:
The residuals are roughly normally distributed.
The residuals are homoscedastic.
To verify these assumptions, plot the 4 diagnostic plots:
>par(mfrow = c(2, 2))  #arrange the 4 plots in a 2x2 grid
>plot(linearmodel)
Interpretation

The fitted model follows y = mx + c:
m = slope = 0.6049
c = intercept = 69.6326
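Rather than reading these off the printed summary, the coefficients can be extracted directly (an optional addition; coef() and confint() are base R):
>coef(linearmodel)     #intercept and slope as a named vector
>confint(linearmodel)  #95% confidence intervals for both coefficients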
Residual standard error:
The average distance between the data values and the regression line. The lower the value, the more closely the regression line fits the data values.

Multiple R-squared:
In this case, only 68.42% of the variation in scores can be explained by hours studied, suggesting that this might not be the only variable affecting exam scores.

Adjusted R-squared:
The value is lower than the multiple R-squared. Here, it is 65.99%.

F-statistic & p-value:
The p-value is less than 0.05 (p-value = 0.000142). The model is statistically significant, and hours studied is deemed a useful explanation for the variation in exam scores.
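These figures can also be pulled out programmatically, an optional step; sigma(), r.squared and adj.r.squared are standard parts of base R's lm/summary.lm objects:
>s <- summary(linearmodel)
>sigma(linearmodel)  #residual standard error
>s$r.squared         #multiple R-squared (0.6842)
>s$adj.r.squared     #adjusted R-squared (0.6599)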
Q-Q plot: for normal distribution.
If the data values roughly follow the dotted straight line at a 45-degree angle, the residuals are normally distributed. Here, normal distribution can be assumed.

Residuals vs Fitted plot: for homoscedasticity.
The x-axis displays the fitted values; the y-axis displays the residuals. The residuals should appear randomly and evenly scattered around zero; otherwise, homoscedasticity is violated. Due to the parabola shape here, homoscedasticity seems to be violated.

Scale-Location plot: for homoscedasticity.
You should see a horizontal line with equally spread points. Here, the line curves off, suggesting that the variance of the residuals differs across data points: a violation of homoscedasticity.

Residuals vs Leverage plot: for influential outliers.
Values outside the dashed lines are influential to the regression; the regression results would be altered if these values were excluded. Here, there are no influential outliers, so the regression line is not affected by them.

Conclusion: the residuals show low homoscedasticity. As this is one of the main assumptions of linear regression, the fit is weak, suggesting that another model may be more appropriate for this dataset.
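As an optional cross-check on the visual diagnostics (not part of the original sheet), formal tests exist for both assumptions: shapiro.test() in base R tests the residuals for normality, and bptest() from the lmtest package (which must be installed separately) runs the Breusch-Pagan test for heteroscedasticity:
>shapiro.test(residuals(linearmodel))  #p > 0.05 supports normality of the residuals
>library(lmtest)  #assumes install.packages("lmtest") has been run
>bptest(linearmodel)  #p < 0.05 indicates heteroscedasticity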
