Class 5 - LinearRegression.pptx

Uploaded by Ngoc Anh

Chapter 4

Supervised Learning

1
Contents

Types of Machine learning
3.1 Regression:
● Linear Regression
3.2 Classification
● K-nearest neighbors
● Decision Tree
● Naive Bayesian
● SVM
● Ensemble Learning

3
Linear Regression
● Simple linear regression
● Multiple linear regression
● Examples
● Advantages of linear regression
● Limitations of linear regression

4
Linear Regression

● Linear regression is a linear model, i.e. a model that
assumes a linear relationship between the input
variables x and the single output variable y.

5
Simple linear regression
● Linear Regression with one variable
● The target variable y and the input
variable x1 have the following linear
relationship:

ŷ = w0 + w1x1

● w0, w1 are unknown constants => we
estimate their values from the input data
● y is the actual value of the outcome (from
the statistics we have in the training
data set)
● ŷ is the value that the model predicts.

6
Simple linear regression
● How do we estimate the coefficients (“fit the model”)?
● How do we evaluate the model fit on the observed data?

7
Prediction error
● We want the difference between the
true value y and the predicted value ŷ
to be as small as possible
⇒ The sum of squared errors is given by:

L(w|X) = Σᵢ (yᵢ − ŷᵢ)² = Σᵢ (yᵢ − w0 − w1x1ᵢ)²

L(w|X): the loss function

The parameters w = (w0, w1) are estimated by
minimizing this sum of squared errors
8
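The loss above can be sketched in code (a minimal illustration, not from the slides; the toy data points are hypothetical):

```python
import numpy as np

def sse_loss(w0, w1, x, y):
    """Sum of squared errors: L(w|X) = sum_i (y_i - (w0 + w1 * x_i))^2."""
    y_hat = w0 + w1 * x          # model predictions ŷ
    return float(np.sum((y - y_hat) ** 2))

# Hypothetical toy data lying exactly on the line y = 1 + 2x,
# so the loss is 0 at (w0, w1) = (1, 2) and grows as we move away.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
print(sse_loss(1.0, 2.0, x, y))  # 0.0
print(sse_loss(0.0, 2.0, x, y))  # each of the 4 points is off by 1 -> 4.0
```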
Linear Regression with one variable
● Setting the derivative of the loss function to zero:

∂L/∂w0 = −2 Σᵢ (yᵢ − w0 − w1x1ᵢ) = 0
∂L/∂w1 = −2 Σᵢ x1ᵢ (yᵢ − w0 − w1x1ᵢ) = 0

● We have the solution:

w1 = Σᵢ (x1ᵢ − x̄)(yᵢ − ȳ) / Σᵢ (x1ᵢ − x̄)²
w0 = ȳ − w1x̄
9
Linear Regression with one variable
Example:
● Given a data set consisting of population information and profit earned when opening
restaurants in 15 cities, the data is distributed as follows:

Population | Profit
1.7        | 139
2.1        | 150
3.5        | 160
3.9        | 162
5          | 161
6.5        | 170
2.5        | 152
3.5        | 154
4          | 162
5          | 160
1          | 135
6.6        | 171
4.5        | 157
5.5        | 160
3          | 150
● Predict the profit of a certain restaurant, given the population of the city in which the
restaurant is located?
10
Linear Regression with one variable
11
Linear Regression with one variable

12
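As a sketch, the closed-form solution can be applied to the restaurant data from the slides (NumPy is assumed; the population value 4.0 in the prediction at the end is one illustrative query, not part of the slides):

```python
import numpy as np

# Population and profit for the 15 cities from the example table.
population = np.array([1.7, 2.1, 3.5, 3.9, 5.0, 6.5, 2.5, 3.5,
                       4.0, 5.0, 1.0, 6.6, 4.5, 5.5, 3.0])
profit = np.array([139.0, 150.0, 160.0, 162.0, 161.0, 170.0, 152.0, 154.0,
                   162.0, 160.0, 135.0, 171.0, 157.0, 160.0, 150.0])

# Closed-form least-squares estimates:
# w1 = sum((x - x̄)(y - ȳ)) / sum((x - x̄)^2),  w0 = ȳ - w1 * x̄
x_bar, y_bar = population.mean(), profit.mean()
w1 = np.sum((population - x_bar) * (profit - y_bar)) / np.sum((population - x_bar) ** 2)
w0 = y_bar - w1 * x_bar

# Predict the profit for a city with population 4.0.
prediction = w0 + w1 * 4.0
print(w0, w1, prediction)
```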
Multiple linear regression
● Multivariable linear regression (linear regression with
multiple variables): a model with more than one input
variable is used to predict the target variable (output):
ŷ = f(w,x) = w0 + w1x1 + w2x2 + … + wnxn

● Loss function:

L(w|X) = Σᵢ (yᵢ − ŷᵢ)²
13
Multiple linear regression
● Setting the derivative of the loss function to zero:

∂L/∂w = −2Xᵀ(y − Xw) = 0

● We have the solution (the normal equation):

w = (XᵀX)⁻¹Xᵀy
14
Multiple linear regression
● Example: Predicting House Prices

Area (m2) | Number of bedrooms | Number of floors | Sale Price ($1000)
2100      | 5                  | 1                | 460
1416      | 3                  | 2                | 232
1534      | 3                  | 2                | 315
852       | 2                  | 1                | 178
1600      | 3                  | 2                | 329
1985      | 5                  | 1                | 420
1535      | 4                  | 2                | 330
1050      | 2                  | 1                | 195
2300      | 4                  | 2                | 450
1200      | 3                  | 2                | 250

15
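The normal-equation fit on the house-price table can be sketched as follows (NumPy assumed; `lstsq` is used instead of an explicit matrix inverse for numerical stability, but it solves the same least-squares problem):

```python
import numpy as np

# Features: area (m2), number of bedrooms, number of floors.
X_raw = np.array([
    [2100, 5, 1], [1416, 3, 2], [1534, 3, 2], [852, 2, 1], [1600, 3, 2],
    [1985, 5, 1], [1535, 4, 2], [1050, 2, 1], [2300, 4, 2], [1200, 3, 2],
], dtype=float)
# Target: sale price in $1000.
y = np.array([460, 232, 315, 178, 329, 420, 330, 195, 450, 250], dtype=float)

# Prepend a column of ones so that w[0] is the intercept w0, then solve
# the least-squares problem w = (X^T X)^-1 X^T y.
X = np.hstack([np.ones((len(X_raw), 1)), X_raw])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the price of the first house: 2100 m2, 5 bedrooms, 1 floor.
prediction = X[0] @ w
print(prediction)
```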
Data normalization
● A house with an area of 2100 m², 5 bedrooms and
1 floor. Forecast the house price:

● Large differences in the ranges of the variables can
reduce the accuracy of the model in some cases.
● Solution: normalize the data to the same range.

16
Data normalization
Normalize the data to the same range
● Min-max scaling:

x' = (x − min(x)) / (max(x) − min(x))

● Standard scaling:

x' = (x − μ) / σ

● Unit length scaling:

x' = x / ‖x‖

Apply normalization in linear regression:
17
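The three scalings can be sketched as small functions (a minimal NumPy illustration; the `area` sample values are taken from the house-price example):

```python
import numpy as np

def min_max_scale(x):
    """Min-max: x' = (x - min) / (max - min), mapping values into [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def standard_scale(x):
    """Standard: x' = (x - mean) / std, giving zero mean and unit variance."""
    return (x - x.mean()) / x.std()

def unit_length_scale(x):
    """Unit length: x' = x / ||x||, giving a vector of Euclidean norm 1."""
    return x / np.linalg.norm(x)

area = np.array([2100.0, 1416.0, 852.0, 2300.0])
print(min_max_scale(area))                      # all values in [0, 1]
print(standard_scale(area).mean())              # approximately 0
print(np.linalg.norm(unit_length_scale(area)))  # approximately 1
```

In practice the scaling statistics (min, max, mean, std) are computed on the training data only and then reused on new inputs before prediction.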
Advantages of linear regression
● Simple model, easy to understand
● Predicts continuous variables
● Has a simple closed-form optimal solution
● Easy to interpret the model through the regression
coefficients

18
Limitations of linear regression
● The model is simple, so it is not flexible enough to
represent complex data relationships.
● Very sensitive to outliers (noise)

19
Practical exercises

20
