(Unit-04) Part-01 - ML Algo
(CSE3007)
Unit – 04 (Part-I)
The Main Idea of Least Squares and Linear Regression

So far the distance between the data points and the line is:
Finally…

To make the cost positive and more mathematically meaningful, each difference term is squared and the squares are added together to measure the fit:

Sum of squared differences = Σ (observed − predicted)² = 24.62
We want to minimize the sum of the squared distances between the observed values and the line y = mx + c, where m is the slope and c is the y-intercept. The best-fit line passes through the centroid (x̄, ȳ) = (3, 3.6).

Slope: m = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² = 4/10 = 0.4

y-intercept: c = ȳ − m·x̄ = 3.6 − 0.4 × 3 = 2.4
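The closed-form slope and intercept above can be sketched in Python. The five-point dataset below is hypothetical (the slide's original data table did not survive extraction); it was chosen so the sums match the slide's numbers: Σ(xᵢ − x̄)(yᵢ − ȳ) = 4, Σ(xᵢ − x̄)² = 10, centroid (3, 3.6).

```python
def least_squares_fit(xs, ys):
    """Fit y = m*x + c by ordinary least squares using the closed-form
    formulas  m = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    and       c = y_bar - m * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    m = num / den          # slope
    c = y_bar - m * x_bar  # y-intercept (line passes through the centroid)
    return m, c

# Hypothetical data matching the slide's sums (4/10 = 0.4, centroid (3, 3.6)).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 4, 4, 4]

m, c = least_squares_fit(xs, ys)
print(m, c)  # slope ≈ 0.4, intercept ≈ 2.4
```

Note that the fitted line always passes through (x̄, ȳ), which is why c can be recovered from the centroid once m is known.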
The Predicted Line…

ŷ = 0.4x + 2.4
Goodness of Fit – R²

What is R-squared?
R-squared is a statistical measure of how close the data are to the fitted
regression line.
It is also known as the coefficient of determination, or the coefficient of
multiple determination for multiple regression.
The definition of R-squared is fairly straightforward: it is the percentage of
the response-variable variation that is explained by a linear model.
R-squared = Explained variation / Total variation
Calculation of R²

R² ≈ 0.3
Interpretation of values of R²

R² = 1: the regression line is a perfect fit; predictions coincide with the actual values.
R² = 0: the model explains none of the variation; there is a large distance between the actual and predicted values.
Advantages and Disadvantages

Advantages:
• Linear regression performs exceptionally well for linearly separable data
• Easy to implement and interpret, and efficient to train
• It handles overfitting pretty well using dimensionality-reduction techniques, regularization, and cross-validation
• It allows extrapolation beyond a specific data set

Disadvantages:
• It assumes linearity between the dependent and independent variables
• It is often quite prone to noise and overfitting
• Linear regression is quite sensitive to outliers
• It is prone to multicollinearity
Solve it
• Use least-squares regression to fit a straight line to