CS464 Ch9 LinearRegression

CS464 covers linear regression. Linear regression finds the linear relationship between features (X) and an outcome (Y) by estimating parameters to minimize error. The parameters are estimated using ordinary least squares, which finds the slope and intercept that minimize the sum of squared errors between predicted and actual Y values. Gradient descent can also be used to iteratively estimate optimal parameters by minimizing a loss function.

CS464

Linear Regression

(slides based on the slides provided by Öznur Taştan and Mehmet Koyutürk)
Regression
•  Some historical sales data of houses
•  Base our predictions of housing sale prices (Y) on the
observable features such as the size of the house (X)

[Figure: housing sale price (Y) plotted against house size (X)]
Regression
•  Assume the data is generated by a function that produces a
value for the outcome variable (y) from the values of the
features (x), plus some error

•  The outcome variable is real-valued
•  So this is not classification

[Figure: noisy observations of the outcome y plotted against the feature x]
Regression

[Figure: two panels plotting target Y against feature X]

•  Assume a functional form for f(x)


•  Find a good f(x) within that family of functions
Linear Regression

•  We will focus on linear regression


Linear Regression

•  Linear regression

The parameters need to be estimated


Slope and the Intercept
•  The slope is a number that indicates how slanted the
regression line is and the direction in which it slants

[Figure: regression line over feature x1, illustrating its slope and Y-intercept]

•  The Y-intercept is the value of y where the regression line
crosses the Y axis (that is, where X equals zero)
Housing Example
•  Suppose we have a dataset giving the living areas
and prices of 47 houses from Portland, Oregon:
Single-Feature Model

[Figure: price (in $1000) plotted against living area (square feet) for the 47 houses]

If we regress on a single variable (feature), living area:

w0 = 140.27, w1 = 0.1345
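As a quick check of what these coefficients mean, the minimal Python sketch below plugs them into the fitted model price = w0 + w1 ⋅ living area (price in $1000, area in square feet). The 1650 sq ft example house is hypothetical, not one of the 47 houses.

# Single-feature model from the slide: price (in $1000) = w0 + w1 * living_area
w0, w1 = 140.27, 0.1345

def predict_price(living_area_sqft):
    """Predicted sale price in $1000 for a house of the given living area."""
    return w0 + w1 * living_area_sqft

# Hypothetical 1650 sq ft house: 140.27 + 0.1345 * 1650 ~= 362.2, i.e. about $362,200
print(predict_price(1650))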
Two-Feature Model

If the number of bedrooms were included as one of the input
features as well, we get:
w0 = 89.60
w1 = 0.1392
w2 = −8.738
How do we interpret this result, given that the
regression coefficients for a single variable were:
w0 = 140.27, w1 = 0.1345?
Multiple Linear Regression
•  The reason for this difference in the relationship between y
and x2 is that x1 and x2 are strongly correlated.
•  That is: when x2 increases, so does x1.

•  Combining several simple regressions (each using the
method of least squares) generally gives the same result as a
multiple regression only if the explanatory variables are
orthogonal (see the numerical sketch below)
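A small numerical sketch of this point (NumPy assumed; the synthetic data below is invented for illustration and is not the Portland housing data): when x1 and x2 are strongly correlated, regressing y on x2 alone yields a very different coefficient than the multiple regression that includes both features.

import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)                         # e.g. living area (standardized)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)        # strongly correlated with x1
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)    # y really depends only on x1

def ols(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

print(ols(x2.reshape(-1, 1), y))            # simple regression on x2: large positive slope
print(ols(np.column_stack([x1, x2]), y))    # multiple regression: x2's coefficient is near 0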
Optimization Problem

•  Linear regression (like any regression) is an optimization problem

•  We need to define our constraints and an objective function
–  In linear regression, the constraint is the linearity of the
function

•  What is our objective?
–  Minimize our error in approximating the outcome variable
Loss Functions
Measure of Error
•  We can measure the prediction loss in terms of
squared error. Loss on one example:

(yi − f(xi))²   where f(xi) is the predicted value and yi is the actual value

•  Loss on n training examples: the sum of squared errors, Σi (yi − f(xi))²
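A minimal Python sketch of these two quantities (NumPy assumed; the toy numbers are illustrative only):

import numpy as np

y_true = np.array([3.0, 5.0, 4.0])   # actual values
y_pred = np.array([2.5, 5.5, 4.0])   # predicted values

loss_one = (y_true[0] - y_pred[0]) ** 2     # squared loss on a single example: 0.25
loss_all = np.sum((y_true - y_pred) ** 2)   # sum of squared losses on all n examples: 0.5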
Ordinary Least Squares (OLS)
Loss Function on Two Features
Least Squares Linear Fit to Data
•  The most popular estimation method is least squares:
–  Determine the linear coefficients w0, w that minimize the sum
of squared loss (SSL)
–  Use standard (multivariate) differential calculus:
•  differentiate SSL with respect to w0, w
•  set each partial derivative to zero
•  solve for w0, w
Minimize the Squared Loss
•  Minimize the empirical squared loss:
Direct Minimization
•  Minimize the empirical squared loss:

•  To get the optimal parameter values, take the derivative and set it to zero
Finding Optimal Parameters

This is a system of linear equations!


Regression in Matrix Notation
Solution in Matrix Form
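The matrix-form solution is the normal-equation solution w = (XᵀX)⁻¹ Xᵀy. A minimal NumPy sketch of what it computes (the design matrix here is a made-up toy example; the column of ones provides the intercept w0):

import numpy as np

# Toy design matrix: a column of ones (intercept) plus one feature column
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([2.0, 3.0, 5.0])

# Solve the normal equations (X^T X) w = X^T y; np.linalg.solve avoids an explicit inverse
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)   # w[0] is the intercept, w[1] the slope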
Probabilistic View of Linear Regression
•  In a statistical regression model we model both the
function and noise

•  Whatever we cannot capture with our chosen family of
functions will be interpreted as noise
Maximum Likelihood Estimation
•  Given observations:

•  Find the parameters w that maximize the (conditional)
likelihood of the outputs
Likelihood of the Observed Outputs
•  Likelihood of the observed data:

•  It is often easier (but equivalent) to try to maximize
the log-likelihood:

•  The maximum likelihood estimate of w is the one that
minimizes the mean squared residual error!
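A small numerical sketch of this equivalence (NumPy assumed, synthetic data; assuming the usual Gaussian noise model implied by the probabilistic view): the log-likelihood is a constant minus the sum of squared residuals divided by 2σ², so for any fixed σ the w that maximizes it is exactly the least-squares w.

import numpy as np

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.3, size=n)

def log_likelihood(w, sigma2):
    resid = y - X @ w
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum(resid ** 2) / (2 * sigma2)

w_ls = np.linalg.lstsq(X, y, rcond=None)[0]   # least-squares estimate
w_other = w_ls + np.array([0.1, -0.1])        # any other parameter setting

print(log_likelihood(w_ls, 0.09) > log_likelihood(w_other, 0.09))   # True

# The MLE of the noise variance (next slide) is the mean squared residual at w_ls
sigma2_hat = np.mean((y - X @ w_ls) ** 2)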
The MLE of σ
•  The maximum likelihood estimate of the noise variance σ² is
σ̂² = (1/n) Σi (yi − f(xi))², with the predictions taken at the
maximum likelihood setting of the parameters,
i.e. the mean squared prediction error.
Numerical Solution
•  Matrix inversion is computationally very expensive
–  Θ(n³) for n features

•  Using the analytical form to compute the optimal
solution may not be feasible even for moderate
values of n

•  Also only possible if XᵀX is not singular → multi-
collinearity problem
–  Determinant is zero
–  Not full rank
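A brief NumPy illustration of the singular (multicollinear) case, on made-up toy data: when one column of X duplicates another, XᵀX is singular and an explicit inverse is unusable, whereas a least-squares routine based on the pseudo-inverse still returns a solution.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
X = np.column_stack([np.ones(4), x, 2 * x])   # third column duplicates the second, so X^T X is singular
y = np.array([1.0, 3.0, 5.0, 7.0])

print(np.linalg.det(X.T @ X))                 # (numerically) zero determinant
# np.linalg.inv(X.T @ X) would raise or be numerically meaningless here;
# lstsq uses a pseudo-inverse (SVD) and still returns a minimum-norm solution
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)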
Gradient Descent
•  General algorithm for optimization
–  Assign values to the Θi's to minimize J(Θ)
[Figure: sum of squares of error J(Θ) plotted against the regression coefficients (wi)]
Gradient Descent
Gradient descent in more dimensions

Gradient descent: step size and stopping condition
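The gradient descent figures from these slides are not reproduced here; as a stand-in, here is a minimal batch gradient descent sketch for linear regression (NumPy assumed; the learning rate, tolerance, and toy data are illustrative choices, not values from the slides):

import numpy as np

def gradient_descent(X, y, lr=0.01, tol=1e-6, max_iters=10000):
    """Minimize the sum of squared errors J(w) = ||y - Xw||^2 by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        grad = -2 * X.T @ (y - X @ w)          # gradient of J(w) with respect to w
        w_new = w - lr * grad                  # update = step size (learning rate) times gradient
        if np.linalg.norm(w_new - w) < tol:    # stopping condition: parameters stop changing
            return w_new
        w = w_new
    return w

# Toy data: y = 1 + 2x plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
X = np.column_stack([np.ones(100), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=100)
print(gradient_descent(X, y, lr=0.005))        # close to [1, 2]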
Comments on gradient descent
Variations on gradient descent
•  Batch Gradient Descent
–  Update the weights after calculating the error over all
training examples (one epoch)
–  Pros: Stable convergence, computationally efficient
–  Cons: Might get stuck in a local minimum
•  Stochastic Gradient Descent
–  Recalculate the weights for each sample
–  Pros: Might avoid local minima
–  Cons: Error jumps around, computationally expensive
•  Mini-batch Gradient Descent
–  Hybrid approach (see the sketch below)
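A compact sketch of the three variants (NumPy assumed; the learning rate, epoch count, and batch size are illustrative, not from the slides). The only difference between them is how many samples contribute to each weight update.

import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=50, batch_size=1, rng=None):
    """Mini-batch gradient descent for squared loss. batch_size=len(X) gives batch
    gradient descent; batch_size=1 gives stochastic gradient descent."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)                         # shuffle samples each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = -2 * Xb.T @ (yb - Xb @ w) / len(idx)    # mean gradient over the batch
            w -= lr * grad
    return w

# Batch: batch_size=len(X); stochastic: batch_size=1; mini-batch: e.g. batch_size=32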
Extending Application of Linear Regression
•  The inputs X for linear regression can be:
–  Original quantitative inputs
–  Transformation of quantitative inputs, e.g. log, exp,
square root, square, etc.
–  Polynomial transformation
•  Example: y = w0 + w1⋅x + w2⋅x² + w3⋅x³
–  Dummy coding of categorical inputs
•  Binary variable for each value of the categorical variable
–  Interactions between variables
•  Example: x3 = x1 ⋅ x2 (this is what statisticians call an interaction)
•  This allows linear regression techniques to fit much more
complicated, non-linear datasets (see the polynomial-fit example below).
Non-linear functions
Basis Functions
Different Basis Functions
Example of fitting polynomial curve with linear model
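The slide's polynomial-fit figure is not reproduced here; as a stand-in, a minimal NumPy sketch of the same idea (the cubic target and noise level are made-up illustrative choices). The non-linearity lives entirely in the transformed inputs, so the fit itself is still ordinary linear least squares:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 1.0 + 0.5 * x - 2.0 * x**2 + 3.0 * x**3 + rng.normal(scale=0.1, size=200)

# Polynomial basis expansion: columns 1, x, x^2, x^3 -> the model stays linear in the weights
Phi = np.column_stack([np.ones_like(x), x, x**2, x**3])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(w)   # approximately [1.0, 0.5, -2.0, 3.0]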
