w3 - Linear Model - Linear Regression

The document provides an overview of linear regression models. It discusses that regression analysis is used to predict the value of a response variable from attribute variables. The key aspects covered include:
- Regression models involve parameters, independent variables, dependent variables, and error terms. The objective is to estimate the function that best fits the data.
- The least squares method is commonly used to estimate the parameters by minimizing the sum of squared errors between observed and predicted values.
- Gradient descent is another approach to determine the optimal parameter weights by iteratively updating the weights in steps that reduce the error function. It avoids problems with singular matrices and large computations compared to the pseudoinverse method.


LINEAR MODEL - LINEAR REGRESSION
Dr. Srikanth Allamsetty
Formulation & Mathematical Foundation of the Regression Problem
What is Regression
• Regression – predict value of response variable from attribute variables.
• Variables – continuous numeric values
• Regression analysis – a set of statistical processes for estimating the relationships
between a dependent variable and one or more independent variables.
• The dependent variable is often called the 'predictand', 'outcome' or 'response' variable;
• independent variables are often called 'predictors', 'covariates', 'explanatory variables' or 'features'.
• Regression analysis is a way of mathematically sorting out which of the independent variables actually has an impact on the response.
• It is also used for modeling and forecasting the relationship between the variables.
• Statistical process – the science of collecting, organizing, analyzing and interpreting data, and of exploring patterns and trends, in order to answer questions and make decisions (a broad area).
Basics of Regression Models
• Regression models predict a value of the Y variable given known values
of the X variables.
• Prediction within the range of values in the dataset used for model-fitting
is known as interpolation.
• Prediction outside this range of the data is known as extrapolation.
• First, a model to estimate the outcome needs to be chosen.
• Then the parameters of that model need to be estimated using any
chosen method (e.g., least squares).
Formulation of Regression Models
• Regression models involve the following components:
• The unknown parameters, often denoted as β or ω or w.
• The independent variables, which are observed in data and are often
denoted as a vector Xi (where i denotes a row of data).
• The dependent variable, which is observed in data and often denoted
using the scalar Yi.
• The error terms, which are not directly observed in data and are often
denoted using the scalar ei.
Formulation of Regression Models
• Most regression models propose that Yi is a function of Xi and β, with ei representing an additive error term that may stand in for random statistical noise:

Yi = f(Xi , β) + ei

• Our objective is to estimate the function f(Xi , β) that most closely fits the data.
• To carry out regression analysis, the form of the function f must be specified.
• Sometimes the form of this function is based on knowledge about the
relationship between Yi and Xi .
• If no such knowledge is available, a flexible or convenient form for f is chosen.
Formulation of Regression Models
• You may start with a simple univariate linear regression:

Yi = β0 + β1 Xi + ei

• It indicates that you believe that a reasonable approximation for Yi is:

Yi ≈ β0 + β1 Xi

• Now the next objective is to estimate the parameters β0 and β1:
• maybe using the least squares method;
• or you may go with other alternatives such as least absolute deviations, least trimmed squares, the quantile regression estimator, the Theil–Sen estimator, M-estimation (maximum likelihood type) or S-estimation (scale).
Formulation of Regression Models
• Find the value of β that minimizes the sum of squared errors:

SSE(β) = Σi=1..N (Yi − f(Xi , β))²

• A given regression method will ultimately provide an estimate of β, usually denoted β̂.
• Using this estimate, you can then find the fitted value for prediction or to
assess the accuracy of the model in explaining the data.
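As a small illustration of this objective (a minimal sketch, assuming NumPy and made-up toy data that are not from the slides), the least-squares estimates for a univariate model can be computed in closed form and used to obtain fitted values and the SSE:

```python
import numpy as np

# Hypothetical toy data, only to illustrate the least-squares objective.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form least-squares estimates for Yi = b0 + b1*Xi + ei:
# b1 = cov(X, Y) / var(X), b0 = mean(Y) - b1 * mean(X)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X                   # fitted values used for prediction
sse = np.sum((Y - Y_hat) ** 2)        # the sum of squared errors being minimized

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.4f}")
```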
Variants in Regression Models
• The most common models are simple linear and multiple linear (discussed in last class):
• Simple linear: Y = a + bX + ϵ
• Multiple linear: Y = a + bX1 + cX2 + dX3 + ϵ
• Nonlinear regression analysis is commonly used for more complicated data sets in which the dependent and independent variables show a nonlinear relationship.
• Examples: logistic regression; the Michaelis–Menten model for enzyme kinetics, Y = β1X / (β2 + X).
Multiple Linear Regression
x y
Size (feet2) Price ($1000)
Number of bedrooms Number of floors Age of home (years)
i 1 2 3 4
1 2104 5 1 45 460
2 1416 3 2 40 232
3 1534 3 2 30 315
4 852 2 1 36 178
… … … … …
N

Notation:
= number of features
= input (features) of training example.
= value of feature in training example.
N = number of training examples
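To make this notation concrete, here is a minimal sketch (NumPy assumed; the numbers are the four example rows from the table above) that builds the feature matrix and reads off N, n, x(i) and xj(i). Note that the table uses 1-based indices while Python arrays are 0-based:

```python
import numpy as np

# The four training examples from the table: size, bedrooms, floors, age -> price.
X_raw = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [ 852, 2, 1, 36],
], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)

N, n = X_raw.shape              # N = number of training examples, n = number of features
print("N =", N, ", n =", n)     # N = 4, n = 4

# x^(2): the input (feature vector) of the 2nd training example.
print("x^(2) =", X_raw[1])      # [1416. 3. 2. 40.]

# x_3^(2): the value of feature 3 (number of floors) in the 2nd training example.
print("x_3^(2) =", X_raw[1, 2]) # 2.0
```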
The Regression Model & The Concepts of Least Squares
What is Least Square Method
• The least-squares method is a statistical method used to find a regression line, or best-fit line, for the given pattern.
• The method of least squares is used in regression.
• In regression analysis, this method is said to be a standard approach for the
approximation of sets of equations having more equations than the number of
unknowns (overdetermined systems).
• It is used to approximate the solution by minimizing the sum of the squares of the
residuals made in the results of each individual equation.
• Residual: the difference between an observed value and the fitted value provided by a model
• The problem of finding a linear regressor function will be formulated as a problem
of minimizing a criterion function.
• The widely-used criterion function for regression purposes is the sum-of-error-squares.
Least Square Method with Linear Regression
• In general, regression methods are used to predict the value of the response (dependent) variable from attribute (independent) variables.
• The linear regressor model fits a linear function (relationship) between the dependent (output) variable and the independent (input) variables:

ŷ = w0 + w1 x1 + w2 x2 + … + wn xn

• where {w0, w1, …, wn} are the parameters of the model.
• The method of linear regression is to choose the (n + 1) coefficients w0, w1, …, wn, to minimize the residual sum of squares (the squared differences between observed and fitted values) over all the N training instances.
Minimal Sum-of-Error-Squares
• The sum-of-error-squares criterion over the N training instances is

E(w) = ½ Σi=1..N (y(i) − ŷ(i))² = ½ Σi=1..N (y(i) − wT x(i))²

• For an optimum solution for w, the following equations need to be satisfied:

∂E(w)/∂wj = 0,  j = 0, 1, …, n
Minimal Sum-of-Error-Squares
• In this least-squares estimation task, the objective is to find the optimal w* that minimizes E(w).
• The solution to this classic problem in calculus is found by setting the gradient of E(w), with respect to w, to zero:

∇w E(w) = 0
Minimal Sum-of-Error-Squares
• Setting the gradient to zero gives the normal equations XXT w = Xy, where X is the (n + 1) x N matrix whose columns are the augmented input vectors x(i) and y is the vector of the N target values.
• The (n + 1) x N matrix X+ = (XXT)–1X is called the pseudoinverse matrix of the matrix XT. Thus, the optimal solution is

w* = X+y
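As a minimal sketch of this pseudoinverse solution (NumPy assumed, with small hypothetical data and the document's convention that X is the (n + 1) x N matrix whose columns are the augmented inputs):

```python
import numpy as np

# Hypothetical data: N = 5 examples, n = 2 features.
rng = np.random.default_rng(0)
N, n = 5, 2
inputs = rng.normal(size=(N, n))               # one example per row
X = np.vstack([np.ones(N), inputs.T])          # (n + 1) x N, columns are augmented inputs x^(i)
y = np.array([1.0, 2.0, 0.5, 1.5, 2.5])

# X+ = (X X^T)^(-1) X, so w* = X+ y.
X_plus = np.linalg.inv(X @ X.T) @ X            # (n + 1) x N pseudoinverse of X^T
w_star = X_plus @ y

# The same result via NumPy's built-in (and numerically safer) pseudoinverse of X^T.
w_star_pinv = np.linalg.pinv(X.T) @ y

print(w_star)
print(w_star_pinv)                             # agrees when X X^T is nonsingular
```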
Unique solution?
• It might happen that the input features (the rows of X, i.e., the columns of XT) are not linearly independent.
• Then XXT is singular and the least-squares coefficients w* are not uniquely defined.
• The singular case occurs most often when two or more inputs are perfectly correlated.
• A natural way to resolve the non-unique representation is by dropping the redundant inputs.
• Most regression software packages detect these redundancies and
automatically implement some strategy for removing them.
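A small sketch of the singular case (NumPy, hypothetical numbers): when one input is an exact multiple of another, XXT loses rank, the plain inverse is no longer available, and np.linalg.pinv falls back to a minimum-norm solution:

```python
import numpy as np

# Feature 2 is exactly twice feature 1, so the two inputs are perfectly correlated.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1
X = np.vstack([np.ones_like(x1), x1, x2])      # (n + 1) x N with a redundant input
y = np.array([3.0, 5.0, 7.0, 9.0])             # here y = 1 + 2*x1 exactly

print(np.linalg.matrix_rank(X @ X.T))          # 2, so the 3 x 3 matrix X X^T is singular

# np.linalg.inv(X @ X.T) would typically raise LinAlgError here; the pseudoinverse
# still returns one valid (minimum-norm) coefficient vector among the many that fit.
w = np.linalg.pinv(X.T) @ y
print(w)
print(X.T @ w)                                 # fitted values still reproduce y
```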
Error Reduction-Gradient Descent
Basics of Gradient Descent
• Gradient descent search helps determine a weight vector that minimizes E
by starting with an arbitrary initial weight vector and then altering it again
and again in small steps.
• Batch gradient descent: when the weight update is calculated based on all examples in the training dataset, it is called batch gradient descent (a minimal sketch follows this slide).
• Stochastic gradient descent: when the weight update is calculated incrementally after each training example, or a small group of training examples, it is called stochastic gradient descent.
• Gradient descent procedure has two advantages over merely computing
the pseudoinverse:
• (1) it avoids the problems that arise when XXT is singular (it always yields a
solution regardless of whether or not XXT is singular);
• (2) it avoids the need for working with large matrices.
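A minimal batch-gradient-descent sketch for the linear regressor (NumPy assumed; the data, learning rate, and epoch count are hypothetical choices made only for illustration):

```python
import numpy as np

def batch_gradient_descent(X, y, eta=0.05, epochs=2000):
    """X: (N, n + 1) design matrix with a leading column of ones; y: (N,) targets."""
    N, d = X.shape
    w = np.zeros(d)                       # arbitrary initial weight vector
    for _ in range(epochs):
        y_hat = X @ w                     # predictions for all N training examples
        grad = -(X.T @ (y - y_hat)) / N   # gradient of the mean squared-error cost
        w -= eta * grad                   # small step opposite to the gradient
    return w

# Hypothetical data generated from y = 1 + 2*x plus a little noise.
rng = np.random.default_rng(1)
x = rng.uniform(0, 5, size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=50)
X = np.column_stack([np.ones_like(x), x])

print(batch_gradient_descent(X, y))       # approximately [1.0, 2.0]
```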
Basics of Gradient Descent
• The error surface may have multiple local minima but a single global minimum.
• The objective is to find the global minimum.
What is Gradient Descent
Gradient Descent Optimization Schemes
● The Gradient Descent Method is an optimization method used for minimization tasks. Changes of the weights are made according to the following algorithm:

w(k + 1) = w(k) − η ∇E(w(k))

where η denotes the learning rate, and k stands for the actual iteration step.
Note:
● Need to choose η.
● Needs many iterations.
● Works well even when n is large.
● Gradient descent serves as the basis for learning algorithms that search
the hypothesis space of possible weight vectors to find the weights that
best fit the training examples.
What is Gradient Descent
Gradient Descent Optimization Schemes
● The same weight-update rule, w(k + 1) = w(k) − η ∇E(w(k)), can be applied with different definitions of the iteration step k.
Approaches for deciding the iteration step k:
1. Batch methods use all the data in one shot.
● The iteration step k means the kth presentation of the training dataset.
● The gradient is calculated across the entire set of training patterns.
2. Online methods:
● The iteration step k is incremented after a single data pair is presented (a sketch follows below).
● They share almost all the good features of the recursive least-squares algorithm, with reduced computational complexity.
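A corresponding online (stochastic) sketch, where the weights are updated after every single data pair instead of once per pass over the dataset (same hypothetical setup and NumPy as in the batch sketch above):

```python
import numpy as np

def online_gradient_descent(X, y, eta=0.02, epochs=100, seed=0):
    """Online updates: one small step after each (x^(i), y^(i)) pair is presented."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(N):       # shuffle the presentation order each pass
            error = y[i] - X[i] @ w        # residual for this single pair
            w += eta * error * X[i]        # w_j <- w_j + eta * (y - y_hat) * x_j
    return w

rng = np.random.default_rng(2)
x = rng.uniform(0, 5, size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=50)
X = np.column_stack([np.ones_like(x), x])

print(online_gradient_descent(X, y))       # approximately [1.0, 2.0]
```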
The gradient descent training rule

Linear classification using the regression technique:

Performance Criterion:

E(w) = ½ Σi=1..N (y(i) − ŷ(i))²   --- To be minimized. Called the cost function.

(The factor ½ is used for computational convenience.)
The gradient descent training rule
• The error surface is parabolic with a single global minimum.
• The specific parabola will depend on the particular set of training examples.
• The direction of steepest descent along the error surface can be found by computing the derivative of E with respect to each component of the vector w.
• This vector derivative is called the gradient of E w.r.t. w, written ∇E(w). Remember, it can be applied to any objective function, not just to squared distances.
• The negative of this vector gives the direction of steepest decrease.
• Therefore, the training rule for gradient descent is

w ← w − η ∇E(w)

where η is the learning rate, a positive constant that determines the step size in the search.
The gradient descent training rule
• This training rule can also be written in its component form:

wj ← wj − η ∂E/∂wj

• which shows that steepest descent is achieved by altering each component wj of w in proportion to ∂E/∂wj (a numerical check is sketched below).
• Starting with an arbitrary initial weight vector, w is changed in the direction producing the steepest descent along the error surface.
• The process goes on till the global minimum error is attained.
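A small numerical check of this component-wise rule (NumPy, hypothetical data): the analytic ∂E/∂wj used in the update should agree with a finite-difference estimate of the gradient of the cost E(w):

```python
import numpy as np

def cost(w, X, y):
    """E(w) = 1/2 * sum_i (y^(i) - w^T x^(i))^2, the cost function above."""
    return 0.5 * np.sum((y - X @ w) ** 2)

def analytic_grad(w, X, y):
    """dE/dw_j = -sum_i (y^(i) - y_hat^(i)) * x_j^(i)."""
    return -X.T @ (y - X @ w)

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(10), rng.normal(size=10)])   # hypothetical design matrix
y = rng.normal(size=10)
w = rng.normal(size=2)

# Central finite differences for each component dE/dw_j.
eps = 1e-6
numeric = np.array([
    (cost(w + eps * np.eye(2)[j], X, y) - cost(w - eps * np.eye(2)[j], X, y)) / (2 * eps)
    for j in range(2)
])
print(analytic_grad(w, X, y))
print(numeric)      # the two should agree to several decimal places
```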
The gradient with respect to weight wj
• For the cost function E(w) = ½ Σi=1..N (y(i) − ŷ(i))² with ŷ(i) = wT x(i), the gradient with respect to weight wj is

∂E/∂wj = − Σi=1..N (y(i) − ŷ(i)) xj(i)
The gradient with respect to weight wj
• Substituting this gradient into the training rule gives

wj ← wj + η Σi=1..N (y(i) − ŷ(i)) xj(i)

• An epoch is a complete run through all the N associated pairs.
The gradient with respect to weight wj
• Once an epoch is completed, the pair (x(1), y(1)) is presented again and a run is performed through all the pairs again.
• After several epochs, the output error is expected to be sufficiently small.
• Here k corresponds to the epoch number, i.e., the number of times the set of N pairs is presented and the cumulative error is compounded (a training-loop sketch follows below).
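Putting the epoch idea together, a short sketch (NumPy; the data, learning rate, and stopping threshold are hypothetical) that repeats full passes over the N pairs until the cumulative error of an epoch is sufficiently small:

```python
import numpy as np

def train_by_epochs(X, y, eta=0.02, tol=1e-3, max_epochs=10_000):
    """Repeat epochs (complete runs through all N pairs) until the epoch error is small."""
    N, d = X.shape
    w = np.zeros(d)
    for k in range(1, max_epochs + 1):             # k = epoch number
        epoch_error = 0.0
        for i in range(N):                         # present every pair (x^(i), y^(i)) once
            error = y[i] - X[i] @ w
            w += eta * error * X[i]
            epoch_error += 0.5 * error ** 2        # cumulative error over this epoch
        if epoch_error < tol:                      # stop once the error is sufficiently small
            return w, k
    return w, max_epochs

# Hypothetical noise-free data from y = 1 + 2*x, so the epoch error can become tiny.
rng = np.random.default_rng(4)
x = rng.uniform(0, 5, size=30)
y = 1.0 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])

w, epochs_used = train_by_epochs(X, y)
print(w, "after", epochs_used, "epochs")           # w approximately [1.0, 2.0]
```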
