Lec 3: Regression

Linear regression is a supervised machine learning algorithm that finds the best linear relationship between a dependent variable and one or more independent variables. It works by minimizing the sum of the squares of the differences between the actual and predicted values. Overfitting and underfitting occur when the model is too complex or too simple, respectively, and techniques such as regularization can help address overfitting.


Linear Regression

 Linear Regression is a supervised machine learning algorithm.


 It finds the best linear relationship that describes the data you have.
 It assumes that a linear relationship exists between a dependent variable and one or more independent variables.
 The dependent variable of a linear regression model takes continuous values, i.e., real numbers.
Linear Regression
We want to find the best line (a linear function y = f(X)) to explain the data.

[Figure: scatter of data points with a fitted line; y on the vertical axis, X on the horizontal axis]
Simple Linear Regression
Simple Linear Regression Equation

The equation below describes how individual y values relate to X.


The predicted value of y is given by: ŷ = b0 + b1X, where:

 y is the dependent variable.
 ŷ is the predicted value of y.
 X is an independent variable.
 b0 and b1 are the regression coefficients.
 b0 is the intercept, or bias, which fixes the offset of the line.
 b1 is the slope, or weight, which specifies the factor by which X has an impact on y.
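
As a quick sketch, the prediction equation translates directly into a one-line Python function. The coefficient values below are illustrative assumptions, not estimates fitted from data:

```python
# Prediction for simple linear regression: y_hat = b0 + b1 * X
def predict(X, b0, b1):
    return b0 + b1 * X

# Illustrative coefficients (assumed, not estimated from data)
print(predict(82, b0=2.0, b1=0.5))  # 2.0 + 0.5 * 82 = 43.0
```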
Error for Simple Linear Regression model

 Y = β0 + β1X + ε is called the regression model.
 ε reflects how individual Y values deviate from others with the same value of X.

For example, at X = 82 the fitted value is Ŷ₈₂ = b0 + b1(82), and the residual is e₈₂ = Y₈₂ − Ŷ₈₂.
Estimated Simple Linear Regression Equation
 Recall: The estimated simple linear regression equation is:
Ŷ = b0 + b1X
 b0 is the estimate for β0

 b1 is the estimate for β1

 Ŷ is the estimated (predicted) value of Y for a given X value.


Least Squares method

• Least Squares Criterion: Choose the “best” β0 and β1 to minimize

• S = Σ(Yi − (β0 + β1Xi))²

• Use calculus: take the derivative with respect to β0 and with respect to β1, set the two resulting equations equal to zero, and solve for β0 and β1.

• Of all possible lines, pick the one that minimizes the sum of the squared vertical distances of each point from the line.
Least Squares Solution

Slope: b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²

Intercept: b0 = Ȳ − b1X̄
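
A minimal NumPy sketch of these two formulas on a small made-up data set (the X and Y values here are assumptions for illustration):

```python
import numpy as np

# Example data (made up for illustration)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = X.mean(), Y.mean()

# Slope: b1 = sum((Xi - x_bar)(Yi - y_bar)) / sum((Xi - x_bar)^2)
b1 = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)

# Intercept: b0 = y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

print(b0, b1)  # fitted intercept and slope
```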
Estimating the Variance s²

• An estimate of s²:
The mean square error (MSE) provides the estimate of s², and the notation s² is also used.
s² = MSE = SSE/(n − 2)
where SSE is the Sum of Squared Errors: SSE = Σ(Yi − Ŷi)²

If points are close to the regression line, SSE will be small.
If points are far from the regression line, SSE will be large.
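
Continuing the NumPy sketch above, the variance estimate follows directly from the fitted line:

```python
# Estimate the error variance: s^2 = MSE = SSE / (n - 2)
Y_hat = b0 + b1 * X                # predictions from the fitted line
SSE = np.sum((Y - Y_hat) ** 2)     # sum of squared errors
MSE = SSE / (len(Y) - 2)           # estimate of s^2
print(SSE, MSE)
```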


Bias, variance tradeoff

Bias
 Regression predictions should be unbiased. That is:
"average of predictions" should ≈ "average of observations"
 Bias measures how far the mean of predictions is from the mean of actual
values
Bias = mean of predictions − mean of actual values (ground truth labels)

 A high-bias model can't fit the data (it is usually too simplistic).

 Increase the complexity of the model to reduce bias.
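
Using the fitted values from the earlier NumPy sketch, this definition of bias is a one-line computation:

```python
# Bias = mean of predictions - mean of actual values
bias = np.mean(Y_hat) - np.mean(Y)
print(bias)  # essentially 0 here: a least squares line always passes through (x_bar, y_bar)
```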
Variance
 Variance indicates how much the estimate of the target function would change if different training data were used.
 It describes how much a random variable differs from its expected value.
 Any single fit is based on a single training set; variance measures the inconsistency of predictions obtained using different training sets.
 Different samples of training data yield different model fits.

 Increase the size of the training data set to reduce variance.
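
One way to see this is to refit the line on several independently drawn training samples and watch the slope estimate change. The data-generating process below (a true line plus Gaussian noise) is an assumption for illustration:

```python
# Refit on different random training samples; the spread of the
# fitted slopes reflects the model's variance.
rng = np.random.default_rng(0)
slopes = []
for _ in range(5):
    Xs = rng.uniform(0, 10, size=20)
    Ys = 1.0 + 2.0 * Xs + rng.normal(0.0, 2.0, size=20)  # assumed true line + noise
    xb, yb = Xs.mean(), Ys.mean()
    slopes.append(np.sum((Xs - xb) * (Ys - yb)) / np.sum((Xs - xb) ** 2))
print(slopes)  # five slightly different slope estimates around the true value 2.0
```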
Overfitting vs. Model Complexity

• Models with high bias will have low variance.

• Models with high variance will have low bias.
Over-fitting

 Overfitting is an undesirable behavior where a learning model gives accurate predictions for training data but not for new data.
 The model fits all the data points, or more data points than required, in the training set.
 The model starts capturing noise.
How to minimize Overfitting

 Reduce model complexity
 Train with more data
 Remove features
 Stop training early
 Apply regularization
Regularization and Over-fitting
Adding a regularizer:

[Figure: model error vs. number of iterations, with and without a regularizer]
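
As a concrete sketch of regularization, ridge regression adds an L2 penalty λ·Σβj² to the least squares objective, which shrinks the weights. The data (reused from the earlier sketch) and the λ value below are assumptions for illustration:

```python
# Ridge regression: minimize sum((Y - Xd @ b)^2) + lam * ||b||^2.
# Closed-form solution: b = (Xd^T Xd + lam * I)^(-1) Xd^T Y.
lam = 1.0
Xd = np.column_stack([np.ones_like(X), X])   # design matrix with intercept column
I = np.eye(Xd.shape[1])
I[0, 0] = 0.0                                # common convention: do not penalize the intercept
b_ridge = np.linalg.solve(Xd.T @ Xd + lam * I, Xd.T @ Y)
print(b_ridge)  # [b0, b1] with the slope shrunk toward zero
```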
Under-fitting

 The model is not able to capture the underlying trend of the data.
How to minimize underfitting

• Increase the training time of the model.
• Increase the number of features.
• Increase model complexity.
2. Multiple Linear Regression
 In multiple linear regression, the dependent variable depends on more than one independent variable.

 The predicted value of y is given by:

ŷ = β0 + β1X1 + β2X2 + β3X3 + … + βnXn
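
A short NumPy sketch, fitting the coefficients by least squares on a made-up two-feature data set (the values are assumptions for illustration):

```python
# Multiple linear regression: y_hat = b0 + b1*X1 + b2*X2
Xm = np.array([[1.0, 0.5],
               [2.0, 1.5],
               [3.0, 2.0],
               [4.0, 2.5],
               [5.0, 3.0]])          # two features per sample (assumed data)
ym = np.array([3.1, 6.0, 8.2, 10.1, 12.3])

Xd = np.column_stack([np.ones(len(Xm)), Xm])    # prepend intercept column
beta, *_ = np.linalg.lstsq(Xd, ym, rcond=None)  # [b0, b1, b2]
print(beta)
print(Xd @ beta)                                # predicted values
```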
