0% found this document useful (0 votes)
2 views69 pages

ML Unit2

The document covers supervised learning techniques, focusing on various regression methods including linear regression, multiple linear regression, and logistic regression. It explains the concepts of independent and dependent variables, the mathematical representation of regression models, and the use of cost functions and gradient descent in optimizing these models. Additionally, it highlights applications of logistic regression in classification problems and provides examples of its use in predicting outcomes based on input variables.

Uploaded by

pm626560211
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views69 pages

ML Unit2

The document covers supervised learning techniques, focusing on various regression methods including linear regression, multiple linear regression, and logistic regression. It explains the concepts of independent and dependent variables, the mathematical representation of regression models, and the use of cost functions and gradient descent in optimizing these models. Additionally, it highlights applications of logistic regression in classification problems and provides examples of its use in predicting outcomes based on input variables.

Uploaded by

pm626560211
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 69

SUPERVISED

LEARNING
UNIT-2
Unit-2

SUPERVISED LEARNING: Linear


Regression, Multiple Linear
Regression, Logistic
Regression, K Nearest
Neighbours, Decision Trees:
ID3, Classification and
Regression Trees, Support
Vector Machines: Linear and
Nonlinear, Kernel Functions.
WHAT ARE INDEPENDENT AND DEPENDENT VARIABLES?

What's an independent variable?


Answer: An independent variable is exactly what it sounds like. It is a
variable that stands alone and isn't changed by the other variables
you are trying to measure. For example, someone's age might be an
independent variable. Other factors (such as what they eat, how much they
go to school, how much television they watch) aren't going to change a
person's age. In fact, when you are looking for some kind of relationship
between variables you are trying to see if the independent variable causes
some kind of change in the other variables, or dependent variables.
WHAT ARE INDEPENDENT AND
DEPENDENT VARIABLES?
What's a dependent variable?
Answer: Just like an independent variable, a dependent variable is exactly
what it sounds like. It is something that depends on other factors. For
example, a test score could be a dependent variable because it could
change depending on several factors such as how much you studied,
how much sleep you got the night before you took the test, or
even how hungry you were when you took it. Usually when you are
looking for a relationship between two things you are trying to find out
what makes the dependent variable change the way it does.
WHAT ARE INDEPENDENT AND
DEPENDENT VARIABLES?
• (Independent variable) causes a change in (Dependent Variable) and it isn't
possible that (Dependent Variable) could cause a change in (Independent
Variable).
• For example:
• (Time Spent Studying) causes a change in (Test Score) and it isn't possible that
(Test Score) could cause a change in (Time Spent Studying).
• We see that "Time Spent Studying" must be the independent variable and
"Test Score" must be the dependent variable because the sentence doesn't
make sense the other way around.
WHAT ARE INDEPENDENT AND DEPENDENT VARIABLES?
WHAT IS LINEAR REGRESSION?

• Linear regression analysis is used to predict the value of a variable based on the
value of another variable.
• The variable you want to predict is called the dependent variable. The
variable you are using to predict the other variable's value is called the
independent variable.
• Linear regression is a supervised machine learning model majorly used in
forecasting. Supervised machine learning models are those where we use the
training data to build the model and then test the accuracy of the model using
the loss function.
• As the name suggests, it assumes a linear relationship between a set of
independent variables to that of the dependent variable (the variable of interest).
A LINE OF
LINEAR
REGRESSION

Positive Regression Line: Negative Regression


X axis independent Line: X axis independent
variable value is variable value is
increasing and so does increasing and so does
the Y axis dependent the Y axis dependent
variable value is variable value is
increasing decreasing
• Below is the equation to the linear regression model, we all
D
are well aware of it
e
y=mx+c p
e
• Where y is the dependent variable n
d
• M refers to the slope of the line e
n
• X is the independent variable and t
• C is the constant-coefficient of the line
Independent variable
• When there is only one independent feature, it is known as
Simple Linear Regression, and when there are more than
one feature, it is known as Multiple Linear Regression.
• Similarly, when there is only one dependent variable, it is
considered Univariate Linear Regression, while when there
are more than one dependent variables, it is known as
Multivariate Regression.
Uses of Regression

• Determining the strength of predictors


• Relationship between Strength of sales and marketing
spending
• Relationship between age and income
• Forecasting an effect
• How much additinational sale income will I get for each
Thousands of Rupee spent on marketing
• Trend Forecasting /Point Estimates
• What will be the price of GOLD in next six months
Mathematical Representation of Linear Regression :
y= a0+a1x+ e
Where
y=taget variable i.e Dependent variable
x=preditor variable i.e Independent variable
a0=intercept of the line
a1=linear regression coefficient
e = random error
PROBLEM TO SOLVE
Linear Regression using Matrix method
Linear Regression using Matrix method
Linear Regression using Matrix method
Linear Regression using Matrix method
Finding the Best fit Line

• The error between predicted values and actual values


are minimized
• The best fit line will have least errors
• The different values for weights or the coefficient of
lines(a0,a1) gives different lines of regression,So we
need to find the calculate the best fist values for a0
and a1 to find bestfit line
• So to calculate this we use cost function
Need of cost function
Need of cost function
Need for cost function
Cost function-

○ The different values for weights or coefficient of lines (a0, a1) gives the

different line of regression, and the cost function is used to estimate the
values of the coefficient for the best fit line.
○ Cost function optimizes the regression coefficients or weights. It measures
how a linear regression model is performing.
○ We can use the cost function to find the accuracy of the mapping function,
which maps the input variable to the output variable. This mapping function is
also known as Hypothesis function.
The cost function used in linear regression are
• Mean Squared error(MSE)
Gradient Descent
○ Gradient descent is used to minimize the MSE by
calculating the gradient of the cost function.
○ A regression model uses gradient descent to update the
coefficients of the line by reducing the cost function.
○ It is done by a random selection of values of coefficient
and then iteratively update the values to reach the
minimum cost function.
Multiple linear regression

Simple Linear Regression:


If a single independent variable is used to predict the
value of a numerical dependent variable, then such a
Linear Regression algorithm is called Simple Linear
Regression.
Multiple Linear Regression :
It involves multiple independent variables/predictors
and one dependent variable
• The multiple regression of two variables x1 and x2 is given as

y = f(x1,x2)
y = ao+a1x1+a2x2
• Similarly,for a given ‘n’ independent variables,the equation is

y = f(x1,x2,..xn)
y = ao+a1x1+a2x2+..+anxn+ε
X1 X2 Y
- - - ● Apply multiple Regression
Product Product Weekly
for the values given in Table
1 Sales 2 sales sales
where weekly sales along
1 4 1 with sales for products x1
2 5 6 and x2 are provided
3 8 8 ● Matrix approach is used to
solve the problem
4 2 12
X1 X2 Y X= 1 1 4
1
- - - 1 2 5 Y= 6
1 3 8
Product 1 Product 2 Weekly 1 4 2
8
12
Sales sales sales CO
1 4 1 B I L1
AS - – X2
L 3
2 5 6 CO

3 8 8 COL2-X1
4 2 12
• The coefficient of multiple linear regression equation is
a0
a= a1
a2

• The regression coefficient for multiple linear regression is


calculated as:
XTX= 1 1 1 1 1 1 4
4 10 19
1 2 3 4 1 2 5
4 5 8 2 1 3 8 = 10 30 46
19 46 109
1 4 2

3.15 -0.59 -0.30


4 10 19 -0.59 0.20 0.016
10 30 -1
46 -0.30 0.016 0.054
19 46 109
(XTX)-1=

A
Now calculate
1 1 1 1
(X X) X =
T -1 T
A * 1 2 3 4
4 5 8 2

0.05 0.47 -1.02 0.19


-0.32 -0.098 0.155 0.26
-0.065 0.005 0.185 -0.125
Final Ans is..

a1 a0
a2
y=a0+a1x1+a2x2

y=-1.69+3.48x1-0.05x2

In the above equation ,if we know the values of


independent variables x1 and x2,then we can predict y
Logistic Regression
Logistic Regression

• Linear regression is suitable when we need to


predict the numerical responses but it is not
suitable for categorical responses
• When categorical problems are involved ,it is
called as classification problem
• Logistic Regression is suitable for Binary
classification problem
The sigmoid function transforms the continuous real number into a range of ( 0 , 1 )
Some of the applications of Logistic
regression are…
• Fraud Detection in credit card
• Email spam or not
• Sentiment Analysis in Twitter analysis
• Image segmentation,recognition, and classification-X-rays,scans
• Object detection through video
• Handwriting recognition
• Disease prediction-Diabetes,cancer etc
• and many more….
For example ,The organization wants to determine the
increase of sal based on the employee performance
In this case linear regression will help where
• emp rating-x-independent variable
• performance-y-dependent variable
• Now,what if the organization wants to know
whether an employee would be given promotion or
not based on performance
• Now the problem statement response is not in
numerical form ..it is in categorical form
• So the linear graph will not be suitable for this kind
of problem stmts
• Solution is SIGMOID CURVE(S
Curve)
• Based on the threshold
values,the organization decides
to give promotion or not
Example:

• The student dataset has entrance marks based on the historic data of those who are

Based on the logistic regression ,the values of the learnt parameters are 𝜷0=1 and
selected or not selected

𝜷1=8
• Assume the marks of X=60 and threshold value=0.5,compute the resultant class
p(x)=1/(1+e-(𝜷0+𝜷1x))
𝜷0+𝜷1x=481 where
𝜷0=1,𝜷1=8,x=60
p(x)=1/(1+e-481) =0.44
• 0.44<0.5 ,therefore for the given marks the student will come under the class “NOT
SELECTED”
Logistic Regression example

You might also like