03 Regression
Machine Learning Tasks
                  Supervised              Unsupervised
Discrete Data     Classification          Clustering
                  (predict a label)       (group similar items)
Continuous Data   Regression              Dimensionality Reduction
                  (predict a quantity)    (reduce number of variables)
Classification vs Regression
• Classification:
- supervised
- requires a set of labelled training samples
- predicts a label
• Regression:
- supervised
- requires a set of labelled training samples
- predicts a quantity
Section Agenda
• Regression Algorithms
• Linear Regression
• Evaluation
Introduction to Regression
Regression Analysis
• A set of processes for estimating the relationships between variables
• … Predict a quantity
Variables
• Dependent variables:
- variables that we want to forecast
- their values depend on something else
- denoted as Y (desired output)
• Independent variables:
- variables that explain the dependent ones
- their values are independent
- denoted as X (input)
- a.k.a. features, predictors, covariates
Regression Model
• b: unknown parameters
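A regression model is commonly written in the general form below. This is a sketch in the notation defined above (Y for the dependent variable, X for the independent variables, b for the unknown parameters); the function symbol f is an assumption and is not taken from the slides.

```latex
Y \approx f(X, b)
```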
Regression Example
• Problem: predict the price of a house given its size
• Training data: samples (size, price), e.g. (100 sq. m, $200K)
• Training: find the curve that best fits the data
• Prediction: given the size of a new house (120 sq. m), estimate the price ($250K)
• Error: difference between the estimated price ($250K) and the real price ($220K)
[Figure: plot of Price against Size]
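The house-price workflow (training data, training, prediction, error) can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available and that a straight line is used as the "curve". The training samples are made-up values chosen for illustration, and `predict_price` is a hypothetical helper name. Only the 120 sq. m size and the $220K real price come from the slides, so the estimate will not necessarily match the $250K shown there.

```python
import numpy as np

# Hypothetical training data (illustrative values, not from the slides):
# house sizes in sq. m and prices in thousands of dollars.
sizes = np.array([60.0, 80.0, 100.0, 140.0, 160.0])
prices = np.array([130.0, 165.0, 200.0, 270.0, 310.0])

# Training: least-squares fit of a straight line (degree-1 polynomial).
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(sizes, prices, deg=1)


def predict_price(size):
    """Estimate the price (in $K) of a house of the given size (sq. m)."""
    return intercept + slope * size


# Prediction: estimate the price of a new 120 sq. m house.
estimated = predict_price(120.0)

# Error: here taken as real minus estimated (an assumed sign convention).
real = 220.0
error = real - estimated

print(f"estimated: {estimated:.1f}K, real: {real:.1f}K, error: {error:.1f}K")
```

A straight line is only one possible choice of curve; the same steps apply if a higher polynomial degree or a different model is used.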
Regression Algorithms
Linear Regression
• From school (calculus?): the equation of a straight line
• In statistics: a model with two parameters
- Slope
- Intercept
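For reference, the two forms of the straight line are commonly written as below. The symbols m and q, and b_0 and b_1, are assumed notation chosen for illustration; the slides' own symbols may differ.

```latex
\begin{align*}
  y &= m x + q     && \text{school form: } m \text{ is the slope, } q \text{ the intercept} \\
  Y &= b_0 + b_1 X && \text{statistics form: } b_1 \text{ is the slope, } b_0 \text{ the intercept}
\end{align*}
```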
Linear Regression Error
• Error: difference between estimated and actual value
[Figure: plot of Price against Size, marking an actual value, the corresponding estimated value, and the error between them]
Linear Regression Error (2)
• Actual value
• Estimated value
• Error, a.k.a. residual
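As a small illustration, residuals can be computed by subtracting each estimated value from the corresponding actual value. The sketch below assumes NumPy. The numbers are hypothetical, except the $220K actual and $250K estimated pair taken from the earlier house example. The sign convention (actual minus estimated) is the usual statistical one, not something the slides state.

```python
import numpy as np

# Actual and estimated prices in thousands of dollars.
# Only the middle pair (220 actual, 250 estimated) comes from the slides;
# the other values are made up for illustration.
actual = np.array([200.0, 220.0, 310.0])
estimated = np.array([205.0, 250.0, 300.0])

# Residual = actual value - estimated value, one per sample.
residuals = actual - estimated

print(residuals)
```

A negative residual means the model overestimated that sample; a positive one means it underestimated.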
Linear Regression Ingredients
• Training data