
Supervised Learning Algorithms: Simple and Multiple Linear Regression


In [1]: # Importing necessary libraries

import numpy as np # For numerical computing, linear algebra, etc.
import pandas as pd # For data manipulation, like Excel
import matplotlib.pyplot as plt # For plotting and visualization
from sklearn.linear_model import LinearRegression # Sklearn (scikit-learn) is a machine learning library

Estimating Parameters using Ordinary Least Squares and Normal Equations

Simple Linear Regression


In [2]: # Creating a dataframe
x = [2, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]
y = [196, 221, 136, 255, 244, 230, 232, 255, 267]
d = {'EngineSize':x, 'CO2emissions':y}
df = pd.DataFrame(data = d)

In [3]: # Printing the dataframe df


df

Out[3]:    EngineSize  CO2emissions
        0         2.0           196
        1         2.4           221
        2         1.5           136
        3         3.5           255
        4         3.5           244
        5         3.5           230
        6         3.5           232
        7         3.7           255
        8         3.7           267

In [5]: # Plotting the scatter plot of the dataframe "df"


plt.scatter(x = df.EngineSize, y = df.CO2emissions)

Out[5]: <matplotlib.collections.PathCollection at 0x194f73c1b50>
In [6]: # Computing the mean value of X and y using mean() function in numpy (np) library
x_bar = np.mean(x)
y_bar = np.mean(y)

In [7]: # Printing the values of x_bar and y_bar


x_bar, y_bar

Out[7]: (3.033333333333333, 226.22222222222223)

Reminder: For simple linear regression, we use one feature to predict the output:
y = theta_0 + theta_1 * X, where theta_0 is the intercept and theta_1 is the slope of X.
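The next cell uses the standard least-squares (normal-equation) estimates of these two parameters:

theta_1 = sum( (x_i - x_bar) * (y_i - y_bar) ) / sum( (x_i - x_bar) ** 2 )
theta_0 = y_bar - theta_1 * x_bar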

In [9]: # Computing theta_0 and theta_1 (the intercept and the slope of X)
theta_1 = np.sum( (x - x_bar) * (y - y_bar) ) / np.sum( (x - x_bar) ** 2 )
theta_0 = y_bar - (theta_1 * x_bar )

In [10]: # Printing the values of theta_0 and theta_1


theta_0, theta_1

Out[10]: (92.80266825965751, 43.98446833930705)

In [11]: # Drawing the simple linear regression line

X = df.EngineSize # X is the input feature (simple linear regression = one input)

y_my_model = theta_0 + theta_1 * X # y_my_model is the developed simple linear model

plt.scatter(x = df.EngineSize, y = df.CO2emissions) # Scattering the data points in the dataframe

plt.plot(X, y_my_model, color = "red") # Plotting the developed linear model y_my_model

Out[11]: [<matplotlib.lines.Line2D at 0x194f8418d30>]
In [12]: # Let's compare your results with scikit-learn

LR_model = LinearRegression() # Initializing an instance of the LinearRegression class

In [19]: # The fit method fits the input X to the output y, in other words it computes the parameters (the thetas)
LR_model.fit(X = df[["EngineSize"]], y = df.CO2emissions)

Out[19]: LinearRegression()

In [20]: # Remember your thetas!


theta_0, theta_1

Out[20]: (92.80266825965751, 43.98446833930705)

In [21]: # Printing the thetas computed using sklearn LinearRegression


LR_model.intercept_, LR_model.coef_

Out[21]: (92.80266825965754, array([43.98446834]))

See, they are the same values! But why? Because sklearn's LinearRegression uses the same
approach: least squares and the normal equations!

Multiple Linear Regression


Reminder: For multiple linear regression, there is more than one input feature (2 or more) to predict the output.
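With the three features below, the model takes the form y = theta_0 + theta_1 * x1 + theta_2 * x2 + theta_3 * x3. In matrix form, the least-squares parameters are given by the normal equation theta = (X^T X)^(-1) X^T y, where X is the design matrix with a leading column of ones.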

In [22]: # Creating a dataframe

x1 = [2, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7]


x2 = [4, 4, 4, 6, 6, 6, 6, 6, 6]
x3 = [8.5, 9.6, 5.9, 11.1, 10.6, 10.0, 10.1, 11.1, 11.6]
y = [196, 221, 136, 255, 244, 230, 232, 255, 267]
d = {'EngineSize':x1, 'Cylinders':x2, 'FuelConsumptionComb':x3, 'CO2emissions':y}
df = pd.DataFrame(data = d)

In [23]: # Printing the dataframe df


df

Out[23]:    EngineSize  Cylinders  FuelConsumptionComb  CO2emissions
         0         2.0          4                  8.5           196
         1         2.4          4                  9.6           221
         2         1.5          4                  5.9           136
         3         3.5          6                 11.1           255
         4         3.5          6                 10.6           244
         5         3.5          6                 10.0           230
         6         3.5          6                 10.1           232
         7         3.7          6                 11.1           255
         8         3.7          6                 11.6           267

In [24]: ## TO-DO Task: Compute the coefficients (theta_0, theta_1, theta_2, and theta_3) using least squares and the normal equations
## Note:
# theta_0 is the intercept,
# theta_1, theta_2, and theta_3 are the slopes of EngineSize, Cylinders, and FuelConsumptionComb, respectively
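A possible way to complete this task (a minimal sketch, one of several valid approaches) is to build the design matrix by hand and solve the normal equation theta = (X^T X)^(-1) X^T y with NumPy, assuming the dataframe df defined in In [22]:

# Design matrix: a leading column of ones (for the intercept) followed by the three features
X_mat = np.column_stack([np.ones(len(df)), df.EngineSize, df.Cylinders, df.FuelConsumptionComb])
y_vec = df.CO2emissions.to_numpy()

# Normal equation: solve (X^T X) theta = X^T y rather than inverting X^T X explicitly
thetas = np.linalg.solve(X_mat.T @ X_mat, X_mat.T @ y_vec)
theta_0, theta_1, theta_2, theta_3 = thetas
print(theta_0, theta_1, theta_2, theta_3)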

Estimating Parameters using Gradient Descent


Optimization Algorithm
In [26]: # Importing a dataset using pandas' read_csv method
df2 = pd.read_csv("./datasets/random_linear_data.csv")

In [28]: # Printing the dataframe df2


df2
Out[28]:            X          y
         0  32.502345  31.707006
         1  53.426804  68.777596
         2  61.530358  62.562382
         3  47.475640  71.546632
         4  59.813208  87.230925
        ..        ...        ...
        95  50.030174  81.536991
        96  49.239765  72.111832
        97  50.039576  85.232007
        98  48.149859  66.224958
        99  25.128485  53.454394

100 rows × 2 columns

In [29]: # Defining the features X and the output y


X = df2.X
y = df2.y

In [30]: # Scattering the data points in the dataframe


plt.scatter(df2.X, df2.y)

Out[30]: <matplotlib.collections.PathCollection at 0x194f857b7f0>
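Reminder: Gradient descent minimizes the mean squared error loss, MSE = (1/n) * sum( (y_i - y_pred_i)^2 ), by repeatedly moving the parameters in the opposite direction of the gradient. For the simple linear model y_pred = theta_0 + theta_1 * X, the partial derivatives and update rules are:

d_theta_0 = (-2/n) * sum( y - y_pred )
d_theta_1 = (-2/n) * sum( X * (y - y_pred) )
theta_0 = theta_0 - learning_rate * d_theta_0
theta_1 = theta_1 - learning_rate * d_theta_1

These are the quantities computed inside the gradient_descent function below.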

In [31]: # Gradient Descent Optimizer

'''
X: the input
y: the output
learning_rate: the size of the step, it determines how fast or slow we move toward the optimum
nbr_iterations: how many times/iterations the optimization step is repeated
'''
def gradient_descent(X, y, learning_rate, nbr_iterations):

    # Initializing the parameters randomly or by setting the values to 0
    theta_0 = 0
    theta_1 = 0

    # n contains the total number of items/data points in the dataframe
    n = len(X)

    # Repeat for nbr_iterations (updating the parameters/weights/coefficients theta_0 and theta_1)
    for i in range(nbr_iterations):

        # y_predictions
        y_predictions = theta_0 + theta_1 * X

        # Gradient/Partial derivative of the loss function MSE with respect to theta_0
        d_theta_0 = (-2 / n) * np.sum(y - y_predictions)
        # Gradient/Partial derivative of the loss function MSE with respect to theta_1
        d_theta_1 = (-2 / n) * np.sum(X * (y - y_predictions))

        # Updating the coefficients theta_0 and theta_1
        theta_0 = theta_0 - learning_rate * d_theta_0
        theta_1 = theta_1 - learning_rate * d_theta_1

    return theta_0, theta_1

In [41]: # Computing the thetas theta_0 and theta_1 using gradient descent optimization algo
theta_0, theta_1 = gradient_descent(df2.X, df2.y, 0.0001, 500000)

In [42]: print("theta_0 = ",theta_0 , "\ntheta_1 = ",theta_1)

theta_0 = 7.808193346466124
theta_1 = 1.326024444231642

In [47]: # Drawing the simple linear regression line

y_my_model_GD = theta_0 + theta_1 * X # y_my_model_GD is the developed simple linear model

plt.scatter(x = df2.X, y = df2.y) # Scattering the data points in the dataframe (df2)
plt.plot(X, y_my_model_GD, color = "red") # Plotting the developed linear model y_my_model_GD

Out[47]: [<matplotlib.lines.Line2D at 0x194f89b59d0>]
In [45]: # Let's compare your results with scikit-learn
# Remember: sklearn LinearRegression uses least squares and normal equations

LR_model = LinearRegression()
LR_model.fit(df2[['X']], df2.y)

Out[45]: LinearRegression()

In [48]: # Remember the gradient descent results!


print("theta_0 = ",theta_0 , "\ntheta_1 = ",theta_1)

theta_0 = 7.808193346466124
theta_1 = 1.326024444231642

In [49]: LR_model.intercept_, LR_model.coef_

Out[49]: (7.991020982270399, array([1.32243102]))

See, they are almost the same values! The small remaining difference is expected: gradient descent approaches the exact least-squares solution iteratively, so the estimates depend on the learning rate and the number of iterations.

In [51]: # Let's plot the developed linear models using Gradient Descent vs. sklearn.linear_model.LinearRegression

# sklearn.linear_model.LinearRegression
y_pred_sklearn = df2.X * LR_model.coef_[0] + LR_model.intercept_

# Our linear model using Gradient Descent
y_pred_grad_desc = df2.X * theta_1 + theta_0

# Scattering the data points in the dataframe df2


plt.scatter(df2.X, df2.y)
# Plotting the sklearn LinearRegression model
plt.plot(df2.X, y_pred_sklearn, color = 'green')
# Plotting our model (Gradient Descent)
plt.plot(df2.X, y_pred_grad_desc, color = 'red')
plt.show()
