
Augustine Joseph

Statistics for Data Science
Linear Regression 101
What is Linear Regression?
Linear regression is a machine learning
algorithm used to predict a number (continuous
outcome) based on one or more input factors
(features). It’s widely used in data science for
tasks like predicting sales, pricing, and
forecasting.
Why is it Important?
Linear regression is easy to understand and
implement, making it one of the first algorithms
data scientists learn. It helps you see how
changes in one or more features impact the
outcome, which is useful in many real-world
scenarios.
Our Example:
In this guide, we will use linear regression to
predict house prices based on features like
square footage, lot size, number of rooms, and
the city/cost of living.
Supervised Learning

Definition and Overview:

In supervised learning, the machine is given data where the correct answers (labels) are already known. The model's job is to learn the pattern between the input data and the output labels, so it can predict the right answers for new, unseen data.

Example - House Price Prediction:

Imagine you have historical data on house prices.

● You know the square footage, lot size, number of rooms, and cost of living for each house, and you also know what the actual sale price was.

● You feed this data to the model (supervised learning), and the model learns how these features (inputs) relate to the house price (output).

● After training, it can predict the price for a new house with similar features, as in the sketch below.
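To make this concrete, here is a minimal sketch of that train-then-predict workflow using scikit-learn. The feature columns follow the example above, but the numbers and variable names are made up for illustration, not taken from this guide.

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical data: each row is one house
# columns: square footage, lot size, number of rooms, cost-of-living index
X_train = np.array([
    [1500, 4000, 3, 1.0],
    [2200, 6000, 4, 1.2],
    [1100, 3000, 2, 0.9],
    [2800, 7500, 5, 1.3],
    [1900, 5200, 3, 1.1],
])
# The known answers (labels): actual sale prices
y_train = np.array([300_000, 450_000, 220_000, 580_000, 390_000])

# Supervised learning: the model learns how the features relate to the price
model = LinearRegression()
model.fit(X_train, y_train)

# Predict the price of a new, unseen house with similar features
new_house = np.array([[2000, 5000, 3, 1.1]])
print(model.predict(new_house))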
Supervised Learning

Before diving into Linear Regression, it’s important to first understand its broader category: Supervised Learning. Linear regression is one of the key methods within supervised learning.
What is Supervised Learning?

Supervised learning is a method in machine learning where the model learns by being shown examples that have both input data and the correct answers (called labels).

● Think of it like teaching a child to recognize objects by showing them a picture of a cat and telling them it’s a cat.

● The goal is for the model to predict the right answer for new, unseen data based on what it learned from the labeled examples.
Linear Regression

Linear regression is one of the simplest algorithms in machine learning, and understanding the math behind it establishes a solid foundation.
Mathematics Behind Linear Regression:
General Equation: Linear regression models the relationship between inputs and an outcome using the equation of a straight line. With multiple input features, the equation is:

y = β0 + β1x1 + β2x2 + ⋯ + βnxn + ε

Here,
● y (like house price) is what we are trying to predict.
● x1, x2, …, xn represent the input features (like square footage, lot size, etc.).
● β1, β2, …, βn are the coefficients that tell us how much each feature affects the outcome.
● β0 is the intercept, the baseline value of y when all features are zero.
● ε is the error term, the part of y that the features do not explain.
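As a quick numeric sketch of the equation above, the prediction is simply the intercept plus a weighted sum of the features. All coefficient and feature values below are invented for illustration:

import numpy as np

# Hypothetical coefficients: β0 (intercept) and β1..β4 (one per feature)
beta0 = 50_000.0
beta = np.array([150.0, 20.0, 10_000.0, 80_000.0])

# One house's features: square footage, lot size, number of rooms, cost-of-living index
x = np.array([2000, 5000, 3, 1.1])

# y = β0 + β1*x1 + β2*x2 + ... + βn*xn (ε is the unobserved error)
y_hat = beta0 + beta @ x
print(y_hat)  # predicted house price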
Mathematics Behind Linear Regression

House Price Prediction Example:

For our house price prediction, the equation could look like this:

House Price = β0 + β1(Square Footage) + β2(Lot Size) + β3(Number of Rooms) + β4(City/Cost of Living) + ε
Each coefficient (like β1 for square footage) shows how much the price changes when that feature increases by one unit, with the other features held fixed. For example, adding 100 square feet changes the predicted price by 100 × β1.
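As a concrete illustration with a made-up coefficient value, suppose β1 were 150 dollars per square foot:

ΔPrice = β1 × ΔSquare Footage = 150 × 100 = $15,000

So an extra 100 square feet would raise the predicted price by about $15,000, with all other features held fixed.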

Linear Algebra Representation:


We also represent the data in matrix form. This allows us
to handle large datasets and multiple features more
efficiently. The vector notation is simply a compact way
of writing the same equation.
Matrix Form of Linear Regression

The general linear regression equation is:

y = β0 + β1x1 + β2x2 + ⋯ + βnxn + ε

This can be written in matrix form as:

y = Xβ + ε

Where:

● y is the vector of target values (house prices).
● X is the matrix of feature values (square footage, lot size, number of rooms, etc.).
● β is the vector of coefficients.
● ε is the vector of error terms.
Matrix Form of Linear Regression
Example:
Let’s say we have data for 3 houses and 4 features: square footage, lot size, number of rooms, and cost of living (city). We can represent this in matrix form. Each row of X holds one house's feature values, and the "1" in the first column corresponds to the intercept β0:

X = | 1  x11  x12  x13  x14 |       y = | y1 |
    | 1  x21  x22  x23  x24 |           | y2 |
    | 1  x31  x32  x33  x34 |           | y3 |

Now, the equation becomes:

y = Xβ + ε

Where y holds the three house prices, X is the 3 × 5 matrix above, β = (β0, β1, β2, β3, β4) holds the intercept and the four feature coefficients, and ε holds the three error terms.

This representation allows you to solve for the vector β (the coefficients) in a more efficient and scalable way when you have many features and data points.
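A small NumPy sketch of this setup is shown below. The house data is invented, and np.linalg.lstsq is used simply as one standard way to solve for β in the least-squares sense:

import numpy as np

# Hypothetical data for 6 houses
# columns: square footage, lot size, number of rooms, cost-of-living index
features = np.array([
    [1500, 4000, 3, 1.0],
    [2200, 6000, 4, 1.2],
    [1100, 3000, 2, 0.9],
    [2800, 7500, 5, 1.3],
    [1900, 5200, 3, 1.1],
    [2500, 6800, 4, 1.25],
])
prices = np.array([300_000, 450_000, 220_000, 580_000, 390_000, 510_000])

# Prepend a column of ones so the first entry of beta acts as the intercept β0
X = np.column_stack([np.ones(len(features)), features])

# Solve y = Xβ + ε for β (ordinary least squares)
beta, *_ = np.linalg.lstsq(X, prices, rcond=None)
print(beta)  # [β0, β1, β2, β3, β4]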
Finding the Right Coefficients
Ordinary Least Squares (OLS):
In linear regression, we use a method called Ordinary
Least Squares (OLS) to find the best coefficients (β1,
β2,…βn) that minimize the error between the predicted
and actual values. OLS tries to draw a line through the
data that fits it as closely as possible.

What is a Loss Function?


A loss function is a measure of how far off the model's
predictions are from the actual outcomes. In the case of
linear regression, we use Mean Squared Error (MSE) as
the loss function. It calculates the average of the
squared differences between the predicted values and
the actual values.

Mean Squared Error (MSE):


The MSE tells us how much our model's predictions
deviate from the actual outcomes. Squaring the errors
makes sure that large errors are penalized more. The
equation is:

MSE = (1/n) Σ (yi − ŷi)²

where yi is the actual value for house i, ŷi is the model's predicted value, and n is the number of data points.
Mean Squared Error (MSE)

House Price Example:


In our house price prediction example, if the actual price
of a house is $500,000 but our model predicts $450,000,
the error is $50,000. The MSE gives us a way to calculate
the average of these errors across all the houses in our
dataset.
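Here is a minimal sketch of that calculation. The first actual/predicted pair mirrors the $500,000 vs. $450,000 example; the other values are made up:

import numpy as np

y_actual = np.array([500_000, 320_000, 410_000])
y_pred = np.array([450_000, 330_000, 400_000])

errors = y_actual - y_pred        # the first error is 50,000
mse = np.mean(errors ** 2)        # average of the squared errors
print(mse)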
Optimization Techniques

Concept of Optimization: Optimization is the process of adjusting the coefficients (like β1, β2, …, βn) so that the loss function (MSE) is minimized. The goal is to find the best line (or hyperplane in multiple dimensions) that fits the data.

Gradient Descent: Gradient Descent is a popular method for optimization. It works by taking small steps in the direction that reduces the loss function. Think of it like walking downhill to find the lowest point in a valley.

How Gradient Descent Works: Gradient Descent updates the coefficients using the formula:

βj := βj − α · ∂MSE/∂βj

Where α is the learning rate, which controls how big each step is. Too big, and you might overshoot the minimum. Too small, and it will take a long time to get there.

House Price Example: In predicting house prices, gradient descent adjusts the coefficients for square footage, lot size, etc., with each iteration to reduce the prediction error (MSE) until the model finds the best-fitting line.
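Below is a minimal sketch of batch gradient descent for this problem. The function name and default settings are my own choices, and in practice the features would be standardized first so a fixed learning rate behaves well:

import numpy as np

def gradient_descent(X, y, alpha=0.01, n_iters=1000):
    """Fit linear regression coefficients by gradient descent on the MSE loss.
    X is assumed to already include a leading column of ones for the intercept."""
    n_samples, n_features = X.shape
    beta = np.zeros(n_features)
    for _ in range(n_iters):
        y_pred = X @ beta
        # Gradient of MSE = (1/n) * sum((y_pred - y)^2) with respect to beta
        grad = (2.0 / n_samples) * (X.T @ (y_pred - y))
        beta -= alpha * grad  # take a small step downhill
    return beta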
Assumptions of Linear Regression

Linearity of Relationships: The relationship between the features and the target variable should be linear, meaning that as a feature changes, the predicted value changes by a proportional amount.

Independence of Errors: The errors (or residuals) should not be correlated with each other. If they are, there may be a pattern in the errors that the model is missing.

Homoscedasticity: The variance of the errors should be constant across all levels of the input features. If the spread of the errors grows or shrinks for different values of the features, the model may not be well specified.

No Multicollinearity: The input features should not be too highly correlated with each other. If two features are very similar (e.g., square footage and lot size), it’s hard to tell which one is actually affecting the outcome, making the coefficient estimates unstable. A quick correlation check is sketched below.

Normal Distribution of Errors: The errors should follow a normal distribution; this is what makes the usual confidence intervals and hypothesis tests on the coefficients valid.
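One simple way to screen for multicollinearity is to look at the correlations between the feature columns; values close to +1 or −1 are a warning sign. The data below is invented for illustration:

import numpy as np

# Hypothetical feature matrix (columns: square footage, lot size, rooms, cost-of-living index)
features = np.array([
    [1500, 4000, 3, 1.0],
    [2200, 6000, 4, 1.2],
    [1100, 3000, 2, 0.9],
    [2800, 7500, 5, 1.3],
    [1900, 5200, 3, 1.1],
])

# Correlation matrix between feature columns; entries near +/-1 suggest
# two features carry almost the same information
corr = np.corrcoef(features, rowvar=False)
print(np.round(corr, 2))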
Advanced Topics

● Overfitting and Underfitting:


○ Overfitting: When the model is too complex and
captures not just the true relationship but also
the noise in the data. This means it works well on
the training data but fails to generalize to new
data.

○ Underfitting: When the model is too simple and doesn’t capture the true relationship, leading to poor performance even on the training data.

● Regularization (Lasso and Ridge): Regularization techniques like Lasso (L1) and Ridge (L2) add a penalty to the loss function to prevent overfitting by reducing the size of the coefficients.
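As a sketch, both penalties are available in scikit-learn as drop-in replacements for ordinary linear regression. The data and the alpha values are arbitrary examples, and in practice the features would usually be standardized before fitting:

import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Same hypothetical layout as before: square footage, lot size, rooms, cost-of-living index
X_train = np.array([
    [1500, 4000, 3, 1.0],
    [2200, 6000, 4, 1.2],
    [1100, 3000, 2, 0.9],
    [2800, 7500, 5, 1.3],
])
y_train = np.array([300_000, 450_000, 220_000, 580_000])

ridge = Ridge(alpha=1.0).fit(X_train, y_train)   # L2 penalty: shrinks coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X_train, y_train)   # L1 penalty: can set some coefficients exactly to zero

print(ridge.coef_)
print(lasso.coef_)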
Augustine Joseph

Was this Helpful?


Save it
Follow Me
github.com/augustine-aj
