
Introduction To ML Linear Regression

Machine learning is a process that enables computers to learn from data without being explicitly programmed. It involves using algorithms to search for patterns in large amounts of data and then using the patterns to predict or classify new data. Machine learning is useful when algorithms cannot be written to identify patterns or when there is too much data or complexity. It is used in areas like fraud detection, recommendation systems, and image recognition. For machine learning to work, rich data representing the real world as well as skills in programming, statistics, and domain knowledge are required. Machine learning algorithms treat data as mathematical points in a feature space defined by attribute dimensions. They generate models, such as planes or hyperplanes, that classify or predict new data based on its position in

Uploaded by

Srinidhi Adya


Introduction to machine learning

Machine Learning


“[Machine Learning is the] field of study that gives computers the ability to
learn without being explicitly programmed.” (Arthur Samuel, 1959)


What is machine learning?

1. It is a process of enabling a computer-based system to learn to do tasks based on well-defined statistical and mathematical methods

2. The ability to do the tasks comes from the underlying model, which is the result of the learning process. Sometimes the ability comes from a mathematical algorithm

3. The model generated represents the behaviour of the processes as they were performed before machine learning was introduced

4. The model is generated from a huge volume of data, huge both in breadth and depth, reflecting the real world in which the processes are performed

5. The more representative the data is of the real world, the better the model will be. The challenge is how to make it truly representative


What do machine learning algorithms do?

1. Search through data to look for patterns

2. Patterns take the form of trends, cycles, associations, classes etc.

3. Express these patterns as mathematical structures such as probability equations or polynomial equations

When is machine learning useful?

1. We cannot express our knowledge about patterns as a program, e.g. character recognition or natural language processing

2. We do not have an algorithm to identify a pattern of interest, e.g. spam mail detection

3. The problem is too complex and dynamic, e.g. weather forecasting

4. Too many permutations and combinations are possible, e.g. genetic code mapping

5. There is no prior experience or knowledge, e.g. Mars rover

6. Patterns are hidden in humongous data, e.g. recommendation systems


Where are machine learning based systems used (examples only)

1. Fraud detection

2. Sentiment analysis

3. Credit risk management

4. Prediction of equipment failures

5. New pricing models / strategies

6. Network intrusion detection

7. Pattern and image recognition

8. Email spam filtering

Machine Learning & Data Science

1. Machine learning is part of a larger discipline called Data Science

2. Data science is the process of applying science and domain expertise to data to extract useful information from it.

3. It includes the application of statistical and mathematical tools and techniques to glean useful information from data using machine learning

Machine Learning Pre-requisites

1. Rich set of data representing the real world

2. Knowledge and skills in

a. Maths and statistics
b. Programming (Python, R, Java, Go)
c. Tools / frameworks such as Keras / TensorFlow
d. Domain knowledge


Real World as Mathematical Space

Machine learning happens in mathematical space / feature space:

1. A data set representing the real world is a collection of attributes that define an entity

2. Each entity is represented as one record / line in the data set

[Figure: data table whose columns are labelled Attributes / Dimensions]

Machine learning happens in mathematical space / feature space:

1. Each attribute becomes a dimension

2. Each record becomes a point in the space

[Figure: feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]

Machine learning happens in mathematical space / feature space:

1. The position of a point in space is defined with respect to the origin

2. The position is decided by the values of the attributes for that point

[Figure: feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]
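As a minimal sketch of the idea above, a record can be held as a tuple of attribute values and treated as a point whose position is measured from the origin. The attribute order (sugar, BP level), the numbers and the helper name `distance_from_origin` are made up for illustration.

```python
import math

# Hypothetical records: (sugar, bp_level). Values are illustrative only.
records = [(90.0, 120.0), (150.0, 160.0)]

def distance_from_origin(point):
    """Euclidean distance of a point from the origin of the feature space."""
    return math.sqrt(sum(v * v for v in point))

for point in records:
    # Each record is one point; its attribute values are its coordinates.
    print(point, round(distance_from_origin(point), 2))
```

Adding a third attribute (for example age) only lengthens the tuple; the same function then works in three dimensions.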

Machine learning happens in mathematical space / feature space:

3. A model represents the real-world process that generated the different sets of data points

4. The model could be a simple plane, a more complex surface, or a hyperplane

5. But multiple planes can do the job, each representing an alternate hypothesis

6. The learning algorithm selects the hypothesis that minimizes errors on the training data; the test data then shows how well that hypothesis generalizes

[Figure: feature space with axes Sugar and BP level and several candidate separating planes; legend: Heart healthy, Potential heart ailments, Erroneous classification]

Machine learning happens in mathematical space / feature space:

7. In the figure, since the separator is a plane, the model will be the equation representing the plane

ax + by + cz = d

8. x, y, z represent the three dimensions, i.e. BP, Age, Sugar, while d represents the color, i.e. healthy or ailing heart

[Figure: separating plane in the feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]

Machine learning happens in mathematical space / feature space:

9. A new data point enters the system

10. Its x, y and z values will be fed into the model to get the value of d (healthy or ailing)

11. The data point will be placed above or below the plane based on d

[Figure: plane ax + by + cz = d in the feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]
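One way to read the steps above is to evaluate ax + by + cz for the new point and compare the result with the plane's d. The sketch below does exactly that; the coefficients, the value of d, the sample point and the helper name `side_of_plane` are placeholders, and which side corresponds to "healthy" depends on the fitted model, so it is not assumed here.

```python
def side_of_plane(point, coeffs, d):
    """Return 'above' if ax + by + cz > d, otherwise 'below'.

    Points lying exactly on the plane are reported as 'below'.
    """
    a, b, c = coeffs
    x, y, z = point
    value = a * x + b * y + c * z
    return "above" if value > d else "below"

# Hypothetical plane and data point; all numbers are placeholders.
coeffs = (1.0, 1.0, 1.0)            # a, b, c
d = 300.0
new_point = (130.0, 45.0, 110.0)    # BP, Age, Sugar
print(side_of_plane(new_point, coeffs, d))
```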

Machine learning happens in mathematical space / feature space:

12. Whether the new data point is correctly placed (above or below the plane), i.e. correctly classified as an ailing or healthy heart, will be known only after direct observation

[Figure: plane ax + by + cz = d in the feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]

Machine learning happens in mathematical space / feature space:

13. Only a direct test on the object of interest will tell whether the classification is correct or not

14. If the majority of new data points are correctly classified, the model is good; otherwise it is not

[Figure: plane ax + by + cz = d in the feature space with axes Sugar and BP level; legend: Heart healthy, Potential heart ailments]


Introduction to Supervised Machine Learning

Characteristics of Supervised Machine Learning -

a. A class of machine learning algorithms that work on externally supplied instances (data) in the form of predictor attributes and associated target values

b. They produce a model representing an alternate hypothesis, i.e. the distribution of class labels in terms of predictor variables in the feature space

c. The model thus generated is used to make predictions about future instances where the predictor feature values are known but the target / class value is unknown
I. E.g. 1: building a model to predict the resale value of a car based on its current mileage, age, color etc.
II. E.g. 2: predicting final-year scores based on student performance in previous years.

Data Science Machine Learning Steps -
1. Identify data required: identify what type of data, source of data and how to ingest data into your system. Needs domain expertise and lateral thinking.

2. Pre-process data: address data quality issues such as missing values, outliers, data pollution etc. Establish veracity of the data. Select attributes for the model. Needs domain expertise.

3. Create training & test set: split the data into a training set and a test set. Generally a 70:30 ratio is used.

4. Select appropriate algorithm/s: select appropriate algorithm/s to model, e.g. Random Forest, K Nearest Neighbors etc. Depends on the data.

5. Train & build the model: build the model in Python or Spark or R.

6. Evaluate with test data: evaluate the model on test data; ensure it is not overfit or underfit and is likely to generalize well.

7. OK? If no, go back and iterate; if yes, continue.

8. Productionize & calibrate: deploy at scale.
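The 70:30 split mentioned in the steps above can be sketched with the standard library alone. The helper name `split_train_test`, the fixed seed and the stand-in records are assumptions made for this illustration; libraries such as scikit-learn provide their own splitting utilities.

```python
import random

def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # stand-in for ten records
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting matters when the records are stored in some order (for example by date or by class), since otherwise the test set would not resemble the training set.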

Linear Regression

Linear Regression Models -

a. The term "regression" generally refers to predicting a real number. However, it can also be used for classification (predicting a category or class).

b. The term "linear" in the name "linear regression" refers to the fact that the method models data with a linear combination of the explanatory variables.

c. A linear combination is an expression where one or more variables are scaled by a constant factor and added together.

d. In the case of linear regression with a single explanatory variable, the linear combination can be expressed as:

response = intercept + constant ∗ explanatory

e. In its most basic form, linear regression fits a straight line to the data. The model is designed to fit the line that minimizes the sum of squared differences between the observed and predicted responses (these differences are also called errors or residuals).
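The linear combination described above can be written directly as code. This is only an illustration: the function names `predict` and `predict_multi` and all numeric values are invented here.

```python
def predict(explanatory, intercept, constant):
    """response = intercept + constant * explanatory"""
    return intercept + constant * explanatory

def predict_multi(values, intercept, constants):
    """Linear combination of several explanatory variables plus an intercept."""
    return intercept + sum(c * v for c, v in zip(constants, values))

print(predict(3.0, intercept=1.0, constant=2.0))      # 1 + 2*3 = 7.0
print(predict_multi([1.0, 2.0], 0.5, [2.0, 3.0]))     # 0.5 + 2*1 + 3*2 = 8.5
```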

Linear Regression Models -

a. Before we generate a model, we need to understand the degree of relationship between the attributes Y and X

b. Mathematically, the correlation between two variables indicates how closely their relationship follows a straight line. By default we use Pearson's correlation, which ranges between -1 and +1.

c. A correlation at either extreme, -1 or +1, indicates a perfectly linear relationship between X and Y, whereas a correlation of 0 indicates the absence of a linear relationship
I. When the r value is small, one needs to test whether it is statistically significant before concluding that a correlation exists

Linear Regression Models -

d. Coefficient of correlation: Pearson's coefficient ρ(x, y) = Cov(x, y) / (StdDev(x) × StdDev(y))

[Figure: three scatter plots, with r near 0, r near -1 and r near +1]

e. Generating a linear model for cases where r is near 0 makes no sense. The model will not be reliable. For a given value of X, there can be many values of Y! Nonlinear models may be better in such cases
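Pearson's coefficient as defined above (covariance divided by the product of the standard deviations) can be computed in a few lines. This sketch uses the population forms (dividing by n) in both numerator and denominator; the function name `pearson_r` and the sample values are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: Cov(x, y) / (StdDev(x) * StdDev(y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

xs = [1.0, 2.0, 3.0, 4.0]
r_up = pearson_r(xs, [2.0, 4.0, 6.0, 8.0])      # perfectly linear, increasing
r_down = pearson_r(xs, [8.0, 6.0, 4.0, 2.0])    # perfectly linear, decreasing
# A symmetric V shape: Y clearly depends on X, yet there is no linear trend.
r_none = pearson_r([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
print(r_up, r_down, r_none)
```

The last case shows why a value of r near 0 rules out only a linear relationship, not a relationship of any kind.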

Linear Regression Models (Recap) -

f. Coefficient of correlation: Pearson's coefficient ρ(x, y) = Cov(x, y) / (StdDev(x) × StdDev(y))

[Figure: scatter plots split into four quadrants, labelled -ve quad (upper left), +ve quad (upper right), +ve quad (lower left) and -ve quad (lower right); captions "= 0" and "> 0"]

http://www.socscistatistics.com/tests/pearson/Default2.aspx

Linear Regression Models -
g. Given Y = f(x) and a scatter plot that shows apparent correlation between X and Y, let's fit a line into the scatter, which shall be our model

h. But there is an infinite number of lines that can be fit in the scatter. Which one should we consider as the model?

i. This and many other algorithms use gradient descent or variants of the gradient descent method for finding the best model

j. Gradient descent methods use partial derivatives with respect to the parameters (slope and intercept) to minimize the sum of squared errors
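A minimal sketch of gradient descent on the slope and intercept, under some assumptions: it minimizes the mean of the squared errors (which has the same minimizer as the sum), and the learning rate, number of steps, function name `gradient_descent` and sample data are chosen only for this illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = m*x + c by gradient descent on the mean squared error."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals for the current line: predicted minus observed.
        residuals = [(m * x + c) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. m and c.
        grad_m = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_c = (2.0 / n) * sum(residuals)
        # Step against the gradient to reduce the error.
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]   # generated from y = 2x + 1
m, c = gradient_descent(xs, ys)
print(round(m, 3), round(c, 3))
```

If the learning rate is too large the updates overshoot and diverge; if it is too small, many more steps are needed to reach the same line.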

Linear Regression Models (Recap) -
k. Whichever line we consider as the model, it will not pass through all the points.
l. The vertical distance between a point and the line (shown in yellow) is the error in prediction
m. The line that gives the least sum of squared errors is considered the best line

Error = T - (mx + C), where T is the observed (target) value
The sum of all errors can cancel out and give 0

We therefore square all the errors and sum them up. The line that gives us the least sum of squared errors is the best fit
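The points above can be checked numerically. The sketch below uses the standard closed-form least-squares formulas for a single explanatory variable rather than gradient descent; the function names and the four data points are invented for the illustration.

```python
def sum_squared_errors(xs, ys, m, c):
    """Sum of squared errors, with Error = T - (m*x + c) for each point."""
    return sum((t - (m * x + c)) ** 2 for x, t in zip(xs, ys))

def least_squares_line(xs, ys):
    """Closed-form slope and intercept that minimize the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    c = y_bar - m * x_bar
    return m, c

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]
m, c = least_squares_line(xs, ys)
errors = [t - (m * x + c) for x, t in zip(xs, ys)]
best_sse = sum_squared_errors(xs, ys, m, c)
other_sse = sum_squared_errors(xs, ys, 1.0, 1.0)   # some other candidate line
print(m, c)
print(sum(errors))            # raw errors cancel out (close to 0)
print(best_sse, other_sse)    # the fitted line has the smaller squared error
```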

Linear Regression Models -
n. Coefficient of determination: measures the fitness of a linear model. The closer the points get to the line, the closer R^2 (the coefficient of determination) gets to 1 and the better the model is

The model line always passes through the point (Xbar, Ybar)

[Figure: scatter plot with the fitted line passing through the point (Xbar, Ybar)]
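A short derivation of why the least-squares line passes through (Xbar, Ybar), using the slope m and intercept C from the error expression Error = T - (mx + C). The index i, the count n and the per-point symbols T_i and x_i are notation introduced here for the sketch.

```latex
\begin{aligned}
S(m, C) &= \sum_{i=1}^{n} \bigl(T_i - (m x_i + C)\bigr)^2 \\
\frac{\partial S}{\partial C} &= -2 \sum_{i=1}^{n} \bigl(T_i - m x_i - C\bigr) = 0
\;\Longrightarrow\; C = \bar{Y} - m \bar{X} \\
m \bar{X} + C &= m \bar{X} + \bar{Y} - m \bar{X} = \bar{Y}
\end{aligned}
```

So the value predicted at x = Xbar is exactly Ybar, whatever the slope turns out to be.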

Linear Regression Models -
o. Coefficient of determination (contd.)
I. There are a variety of errors for all those points that don't fall exactly on the line.
II. It is important to understand these errors to judge the goodness of fit of the model, i.e. how representative the model is likely to be in general
III. Let us look at point P1, which is one of the given data points, and the associated errors due to the model
1. P1 - Original y data point for given x
2. P2 - Estimated y value for given x
3. Ybar - Average of all Y values in the data set
4. SST - Total sum of squares: variance of P1 from Ybar, (P1 - Ybar)^2
5. SSR - Regression sum of squares, (P2 - Ybar)^2 (the portion of SST captured by the regression model)
6. SSE - Residual error, (P1 - P2)^2

[Figure: fitted line on x-y axes with Xbar and Ybar marked; for point P1 and its estimate P2, the distances SST, SSR and SSE are shown]
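Summed over all data points, the three quantities above can be written as follows. Here y_i stands for the original value (P1), \hat{y}_i for the estimated value (P2) and \bar{y} for Ybar; the index notation is introduced for this sketch. The last line holds for a least-squares line that includes an intercept.

```latex
\begin{aligned}
\mathrm{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2 \\
\mathrm{SSR} &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \\
\mathrm{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \\
\mathrm{SST} &= \mathrm{SSR} + \mathrm{SSE}
\end{aligned}
```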
Linear Regression Models -

p. Coefficient of determination (contd.)

1. The best-fitting model is the one where every data point lies on the line, i.e. SSE = 0 for all data points

2. Hence SSR should be equal to SST, i.e. SSR/SST should be 1.

3. A poor fit will mean a large SSE. SSR/SST will be close to 0

4. SSR / SST is called r^2 (r squared) or the coefficient of determination

5. r^2 is always between 0 and 1 and is a measure of the utility of the regression model

[Figure: fitted line on x-y axes with Xbar and Ybar marked; for point P1 and its estimate P2, the distances SST, SSR and SSE are shown]
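A small numeric check of SSR / SST, assuming predictions that come from a least-squares line with an intercept (here y = 1.4x + 0.5, which is the least-squares fit for these four made-up points). The function name `fit_statistics` is hypothetical.

```python
def fit_statistics(ys, predictions):
    """Return (SST, SSR, SSE, r_squared) with r_squared = SSR / SST."""
    y_bar = sum(ys) / len(ys)
    sst = sum((y - y_bar) ** 2 for y in ys)                    # total
    ssr = sum((p - y_bar) ** 2 for p in predictions)           # explained by the line
    sse = sum((y - p) ** 2 for y, p in zip(ys, predictions))   # residual
    return sst, ssr, sse, ssr / sst

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]
predictions = [1.4 * x + 0.5 for x in xs]   # least-squares line for this data
sst, ssr, sse, r2 = fit_statistics(ys, predictions)
print(sst, ssr, sse, r2)
```

For predictions that do not come from a least-squares fit, SSR + SSE need not equal SST, and 1 - SSE/SST is the more commonly used definition.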

Linear Regression Models -

q. Coefficient of determination (contd.)

[Figure: two panels, each showing Point A and Point B relative to the fitted line]

In the case of point "A", the line explains the variance of the point

Whereas for point "B" there is a small area (light grey) which the line does not represent.

The percentage of total variance that is represented by the line is the coefficient of determination

Linear Regression Model -

Advantages -
1. Simple to implement, and the output coefficients are easy to interpret

Disadvantages -
1. Assumes a linear relationship between dependent and independent variables. That is, it assumes there is a straight-line relationship between them
2. Outliers can have huge effects on the regression
3. Linear regression assumes independence between attributes
4. Linear regression looks at a relationship between the mean of the dependent variable and the independent variables.
5. Just as the mean is not a complete description of a single variable, linear regression is not a complete description of relationships among variables
6. Boundaries are linear

Linear Regression Model -

Lab 1: Estimating mileage based on features of a second-hand car

Description: Sample data is available at

https://archive.ics.uci.edu/ml/datasets/Auto+MPG

The dataset has 9 attributes, listed below, that describe each car
1. mpg: continuous
2. cylinders: multi-valued discrete
3. displacement: continuous
4. horsepower: continuous
5. weight: continuous
6. acceleration: continuous
7. model year: multi-valued discrete
8. origin: multi-valued discrete
9. car name: string (unique for each instance)

Solution: mpg-linear regression.ipynb
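As a starting point for the lab, the sketch below parses one record into the nine attributes listed above. It assumes a whitespace-separated layout with the car name in double quotes and "?" marking a missing value; that layout, the sample rows and the helper name `parse_row` are assumptions for illustration and are not taken from the solution notebook.

```python
import shlex

FIELDS = ["mpg", "cylinders", "displacement", "horsepower", "weight",
          "acceleration", "model year", "origin", "car name"]

def parse_row(line):
    """Parse one whitespace-separated record into a dict keyed by attribute name."""
    tokens = shlex.split(line)  # keeps the quoted car name as a single token
    row = {}
    for name, token in zip(FIELDS, tokens):
        if name == "car name":
            row[name] = token
        elif token == "?":
            row[name] = None    # assumed marker for a missing value
        else:
            row[name] = float(token)
    return row

# Illustrative rows in the assumed layout.
sample = '18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"'
sample_missing = '25.0 4 98.0 ? 2046. 19.0 71 1 "ford pinto"'
row = parse_row(sample)
print(row["mpg"], row["car name"])
```

Rows with a missing value would need to be dropped or imputed before fitting a regression, in line with the pre-processing step described earlier.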
