Supervised Learning Algorithms: Simple
and Multiple Linear Regression
In [1]: # Importing necessary libraries
import numpy as np # For numerical computing, linear algebra, etc.
import pandas as pd # For data manipulation, like Excel
import matplotlib.pyplot as plt # For plotting and visualization
from sklearn.linear_model import LinearRegression # Scikit-learn (sklearn) is a machine learning library
Estimating Parameters using Ordinary Least Squares
and Normal Equations
Simple Linear Regression
In [2]: # Creating a dataframe
x = [2, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
y = [196, 221, 136, 255, 244, 230, 232, 255, 267]
d = {'EngineSize':x, 'CO2emissions':y}
df = pd.DataFrame(data = d)
In [3]: # Printing the dataframe df
df
Out[3]: EngineSize CO2emissions
0 2.0 196
1 2.4 221
2 1.5 136
3 3.5 255
4 3.5 244
5 3.5 230
6 3.5 232
7 3.7 255
8 3.7 267
In [5]: # Plotting the scatter plot of the dataframe "df"
plt.scatter(x = df.EngineSize, y = df.CO2emissions)
Out[5]: <matplotlib.collections.PathCollection at 0x194f73c1b50>
[Figure: scatter plot of EngineSize vs. CO2emissions]
In [6]: # Computing the mean value of X and y using mean() function in numpy (np) library
x_bar = np.mean(x)
y_bar = np.mean(y)
In [7]: # Printing the values of x_bar and y_bar
x_bar, y_bar
Out[7]: (3.033333333333333, 226.22222222222223)
Reminder: For simple linear regression, we use one feature to predict the output:
y = theta_0 + theta_1 * X, where theta_0 is the intercept and theta_1 is the slope of X.
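The cell below implements the standard closed-form ordinary least squares estimates (a worked reminder of the formulas, with x_bar and y_bar denoting the sample means of x and y):
theta_1 = sum( (x_i - x_bar) * (y_i - y_bar) ) / sum( (x_i - x_bar) ** 2 )
theta_0 = y_bar - theta_1 * x_bar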
In [9]: # Computing theta_0 and theta_1 (the intercept and the slope of X)
theta_1 = np.sum( (x - x_bar) * (y - y_bar) ) / np.sum( (x - x_bar) ** 2 )
theta_0 = y_bar - (theta_1 * x_bar )
In [10]: # Printing the values of theta_0 and theta_1
theta_0, theta_1
Out[10]: (92.80266825965751, 43.98446833930705)
In [11]: # Drawing the simple linear regression line
X = df.EngineSize # X is the input feature (simple linear regression = one input)
y_my_model = theta_0 + theta_1 * X # y_my_model is the developed simple linear model
plt.scatter(x = df.EngineSize, y = df.CO2emissions) # Scattering the data points in the dataframe (df)
plt.plot(X, y_my_model, color = "red") # Plotting the developed linear model y_my_model
Out[11]: [<matplotlib.lines.Line2D at 0x194f8418d30>]
[Figure: scatter plot of the data with the fitted regression line in red]
In [12]: # Let's compare your results with scikit-learn
LR_model = LinearRegression() # Initializing an instance of the LinearRegression class
In [19]: # The fit() method fits the input X to the output y; in other words, it computes the parameters (the thetas)
LR_model.fit(X = df[["EngineSize"]], y = df.CO2emissions)
Out[19]: LinearRegression()
In [20]: # Remember your thetas!
theta_0, theta_1
Out[20]: (92.80266825965751, 43.98446833930705)
In [21]: # Printing the thetas computed using sklearn LinearRegression
LR_model.intercept_, LR_model.coef_
Out[21]: (92.80266825965754, array([43.98446834]))
See, they are the same values! But why? Because sklearn's LinearRegression uses the same
approach: least squares and the normal equations!
Multiple Linear Regression
Reminder: For multiple linear regression, we use more than one input feature (two or more)
to predict the output.
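For the dataframe built below, the model is y = theta_0 + theta_1 * EngineSize + theta_2 * Cylinders + theta_3 * FuelConsumptionComb, where theta_0 is the intercept and theta_1, theta_2, theta_3 are the slopes of the three features.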
In [22]: # Creating a dataframe
x1 = [2, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
x2 = [4, 4, 4, 6, 6, 6, 6, 6, 6]
x3 = [8.5, 9.6, 5.9, 11.1, 10.6, 10.0, 10.1, 11.1, 11.6]
y = [196, 221, 136, 255, 244, 230, 232, 255, 267]
d = {'EngineSize':x1, 'Cylinders':x2, 'FuelConsumptionComb':x3, 'CO2emissions':y}
df = pd.DataFrame(data = d)
In [23]: # Printing the dataframe df
df
Out[23]: EngineSize Cylinders FuelConsumptionComb CO2emissions
0 2.0 4 8.5 196
1 2.4 4 9.6 221
2 1.5 4 5.9 136
3 3.5 6 11.1 255
4 3.5 6 10.6 244
5 3.5 6 10.0 230
6 3.5 6 10.1 232
7 3.7 6 11.1 255
8 3.7 6 11.6 267
In [24]: ## TO-DO Task: Compute the coefficients (theta_0, theta_1, theta_2, and theta_3) using least squares and the normal equations
## Note:
# theta_0 is the intercept,
# theta_1, theta_2, and theta_3 are the slopes of EngineSize, Cylinders, and FuelConsumptionComb, respectively
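One possible way to tackle the TO-DO (a sketch, not the only valid solution) is to build the design matrix with a leading column of ones for the intercept and solve the least-squares problem with np.linalg.lstsq, which is numerically more stable than explicitly inverting X^T X. The names X_mat, y_vec, and thetas below are illustrative, not part of the original notebook:
# Design matrix: a column of ones (for the intercept) followed by the three features
X_mat = np.column_stack([np.ones(len(df)), df.EngineSize, df.Cylinders, df.FuelConsumptionComb])
y_vec = df.CO2emissions.to_numpy()
# Least-squares solution of X_mat @ thetas ~= y_vec (equivalent to solving the normal equations)
thetas, residuals, rank, sv = np.linalg.lstsq(X_mat, y_vec, rcond=None)
theta_0, theta_1, theta_2, theta_3 = thetas
A quick check is to compare these thetas against LinearRegression().fit(df[['EngineSize', 'Cylinders', 'FuelConsumptionComb']], df.CO2emissions), whose intercept_ and coef_ should match.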
Estimating Parameters using Gradient Descent
Optimization Algorithm
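Reminder: Instead of a closed-form solution, gradient descent starts from initial guesses for theta_0 and theta_1 and repeatedly nudges them against the gradient of the mean squared error (MSE) loss. For a model y_predictions = theta_0 + theta_1 * X and n data points, the gradients and update rules (implemented in the gradient_descent function below) are:
d_theta_0 = (-2/n) * sum( y - y_predictions )
d_theta_1 = (-2/n) * sum( (y - y_predictions) * X )
theta_0 = theta_0 - learning_rate * d_theta_0
theta_1 = theta_1 - learning_rate * d_theta_1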
In [26]: # Importing a dataset using pandas' read_csv method
df2 = pd.read_csv("./datasets/random_linear_data.csv")
In [28]: # Printing the dataframe df2
df2
Out[28]: X y
0 32.502345 31.707006
1 53.426804 68.777596
2 61.530358 62.562382
3 47.475640 71.546632
4 59.813208 87.230925
... ... ...
95 50.030174 81.536991
96 49.239765 72.111832
97 50.039576 85.232007
98 48.149859 66.224958
99 25.128485 53.454394
100 rows × 2 columns
In [29]: # Defining the features X and the output y
X = df2.X
y = df2.y
In [30]: # Scattering the data points in the dataframe
plt.scatter(df2.X, df2.y)
Out[30]: <matplotlib.collections.PathCollection at 0x194f857b7f0>
[Figure: scatter plot of df2.X vs. df2.y]
In [31]: # Gradient Descent Optimizer
'''
X: the input
y: the output
learning_rate: the size of the step; it determines how fast or slow we move toward the optimal parameters
nbr_iterations: how many times/iterations to repeat the optimization (update) step
'''
def gradient_descent(X, y, learning_rate, nbr_iterations):
    # Initializing the parameters randomly or by setting the values to 0
    theta_0 = 0
    theta_1 = 0
    # n contains the total number of items/data points
    n = len(X)
    # Repeat for nbr_iterations (updating the parameters/weights/coefficients theta_0 and theta_1 each time)
    for i in range(nbr_iterations):
        # Predictions of the current model
        y_predictions = theta_0 + theta_1 * X
        # Gradient/partial derivative of the loss function MSE with respect to theta_0
        d_theta_0 = (-2 / n) * np.sum(y - y_predictions)
        # Gradient/partial derivative of the loss function MSE with respect to theta_1
        d_theta_1 = (-2 / n) * np.sum((y - y_predictions) * X)
        # Updating the coefficients theta_0 and theta_1
        theta_0 = theta_0 - learning_rate * d_theta_0
        theta_1 = theta_1 - learning_rate * d_theta_1
    return theta_0, theta_1
In [41]: # Computing the thetas theta_0 and theta_1 using the gradient descent optimization algorithm
theta_0, theta_1 = gradient_descent(df2.X, df2.y, 0.0001, 500000)
In [42]: print("theta_0 = ",theta_0 , "\ntheta_1 = ",theta_1)
theta_0 = 7.808193346466124
theta_1 = 1.326024444231642
In [47]: # Drawing the simple linear regression line
y_my_model_GD = theta_0 + theta_1 * X # y_my_model_GD is the simple linear model developed with gradient descent
plt.scatter(x = df2.X, y = df2.y) # Scattering the data points in the dataframe (df2)
plt.plot(X, y_my_model_GD, color = "red") # Plotting the developed linear model y_my_model_GD
Out[47]: [<matplotlib.lines.Line2D at 0x194f89b59d0>]
[Figure: scatter plot of df2 with the gradient-descent regression line in red]
In [45]: # Let's compare your results with scikit-learn
# Remember: sklearn's LinearRegression uses least squares and the normal equations.
LR_model = LinearRegression()
LR_model.fit(df2[['X']], df2.y)
Out[45]: LinearRegression()
In [48]: # Remember the gradient descent results!
print("theta_0 = ",theta_0 , "\ntheta_1 = ",theta_1)
theta_0 = 7.808193346466124
theta_1 = 1.326024444231642
In [49]: LR_model.intercept_, LR_model.coef_
Out[49]: (7.991020982270399, array([1.32243102]))
See! Almost the same values! The small difference is expected: gradient descent approximates the least-squares solution, and how close it gets depends on the learning rate and the number of iterations.
In [51]: # Let's plot the developed linear models: Gradient Descent vs. sklearn.linear_model.LinearRegression
# sklearn.linear_model.LinearRegression
y_pred_sklearn = df2.X * LR_model.coef_[0] + LR_model.intercept_
# Our linear model using Gradient Descent
y_pred_grad_desc = df2.X * theta_1 + theta_0
# Scattering the data points in the dataframe df2
plt.scatter(df2.X, df2.y)
# Plotting the sklearn LinearRegression model
plt.plot(df2.X, y_pred_sklearn, color = 'green')
# Plotting our model (Gradient Descent)
plt.plot(df2.X, y_pred_grad_desc, color = 'red')
plt.show()