Logistic Regression in Machine Learning
o Logistic regression is one of the most popular machine learning algorithms and belongs to the
supervised learning family. It is used to predict a categorical dependent variable from a
given set of independent variables.
o Logistic regression predicts the output of a categorical dependent variable, so the outcome must
be a categorical or discrete value such as Yes or No, 0 or 1, or True or False. Instead of
returning exactly 0 or 1, however, it returns a probability that lies between 0 and 1.
o Logistic regression is very similar to linear regression; the difference lies in how they are used.
Linear regression is used for solving regression problems, whereas logistic regression is used for
solving classification problems.
o In logistic regression, instead of fitting a straight regression line, we fit an "S"-shaped logistic
function whose output is bounded by the two limiting values 0 and 1.
o The curve of the logistic function indicates the likelihood of something, such as whether cells are
cancerous or not, or whether a mouse is obese or not based on its weight.
o Logistic regression is a significant machine learning algorithm because it can provide
probabilities and classify new data using both continuous and discrete data.
o Logistic regression can classify observations using different types of data and can help
identify the variables that contribute most to the classification. The image below shows the
logistic function:
Note: Logistic regression uses the same predictive-modeling approach as regression, which is why it is called
logistic regression; because it is used to classify samples, however, it is considered a classification algorithm.
o The sigmoid function maps any real value to a value in the range 0 to 1.
o The output of logistic regression must lie between 0 and 1 and cannot go beyond these limits, so it
forms an "S"-shaped curve. This S-shaped curve is called the sigmoid function or the logistic
function.
o In logistic regression, we use a threshold value to turn the predicted probability into a class of
0 or 1: values above the threshold are assigned to 1, and values below the threshold are
assigned to 0.
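The mapping and thresholding described above can be sketched in a few lines of Python. This is only an illustration: the function names `sigmoid` and `classify` are made up for this example, and the threshold of 0.5 is an assumed default, since the text does not fix a particular value.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real value z to a value between 0 and 1."""
    # Two branches so that math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(probability: float, threshold: float = 0.5) -> int:
    """Turn a probability into class 1 or 0 (threshold 0.5 is an assumption)."""
    return 1 if probability >= threshold else 0

for z in (-4.0, 0.0, 4.0):
    p = sigmoid(z)
    print(f"z={z:+.1f}  probability={p:.3f}  class={classify(p)}")
```

Large negative inputs give probabilities close to 0 and large positive inputs give probabilities close to 1, which is what produces the S shape.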
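The logistic regression equation can be obtained from the linear regression equation. The mathematical steps
to get the logistic regression equation are given below:
o We know that the equation of a straight line can be written as:
y = a0 + a1x1 + a2x2 + ... + anxn
o In logistic regression y can only be between 0 and 1, so let us divide y by (1 - y):
y / (1 - y), which is 0 for y = 0 and tends to infinity as y approaches 1
o But we need a range from -∞ to +∞, so we take the logarithm of this ratio, and the equation becomes:
log[y / (1 - y)] = a0 + a1x1 + a2x2 + ... + anxn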
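As a small numerical check of these steps, the sketch below computes the ratio y / (1 - y) and its logarithm for a few probabilities, and confirms that taking the logarithm of the ratio undoes the sigmoid function. The helper names `odds` and `log_odds` are chosen for this example only.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a real value z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(y: float) -> float:
    """The ratio y / (1 - y), defined for 0 <= y < 1; ranges from 0 upward."""
    return y / (1.0 - y)

def log_odds(y: float) -> float:
    """log(y / (1 - y)), defined for 0 < y < 1; ranges over all real numbers."""
    return math.log(odds(y))

for y in (0.1, 0.5, 0.9):
    print(f"y={y:.1f}  odds={odds(y):.3f}  log-odds={log_odds(y):+.3f}")

# Round trip: the log of the ratio recovers the linear value z that was fed to the sigmoid.
z = 2.0
print(log_odds(sigmoid(z)))
```

The round trip is why the right-hand side of the final equation can stay linear in the inputs while y itself stays between 0 and 1.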
On the basis of the categories, logistic regression can be classified into three types:
o Binomial: In binomial logistic regression, the dependent variable has only two possible values,
such as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial logistic regression, the dependent variable has three or more possible
unordered categories, such as "cat", "dog", or "sheep".
o Ordinal: In ordinal logistic regression, the dependent variable has three or more possible ordered
categories, such as "Low", "Medium", or "High".
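For the binomial case, a minimal sketch of fitting a model to data might look like the following. Everything specific here is an assumption made for illustration: the mouse weights and labels are invented, the coefficients are named a0 and a1, and batch gradient descent on the log loss is just one common fitting method, not one prescribed by the text.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real value z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_binomial(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit p = sigmoid(a0 + a1 * x) by batch gradient descent on the log loss."""
    a0, a1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(a0 + a1 * x) - y  # predicted probability minus label
            grad0 += error
            grad1 += error * x
        a0 -= learning_rate * grad0 / n
        a1 -= learning_rate * grad1 / n
    return a0, a1

# Invented data: mouse weight in grams and whether the mouse is obese (1) or not (0).
weights = [18, 20, 22, 24, 26, 28, 30, 32]
obese = [0, 0, 0, 1, 0, 1, 1, 1]

# Center the feature so that gradient descent behaves well with this step size.
mean_weight = sum(weights) / len(weights)
centered = [w - mean_weight for w in weights]
a0, a1 = fit_binomial(centered, obese)

def predict_proba(weight):
    """Estimated probability that a mouse of the given weight is obese."""
    return sigmoid(a0 + a1 * (weight - mean_weight))

for w in (18, 25, 32):
    print(f"weight={w} g  P(obese)={predict_proba(w):.3f}")
```

The multinomial and ordinal cases need different model forms and are not covered by this sketch.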
Linear Regression:
o Linear regression is one of the simplest machine learning algorithms. It belongs to the
supervised learning family and is used for solving regression problems.
o It is used to predict a continuous dependent variable with the help of independent variables.
o The goal of linear regression is to find the best-fit line that can accurately predict the
output for the continuous dependent variable.
o If a single independent variable is used for prediction, it is called simple linear regression;
if there is more than one independent variable, it is called multiple linear regression.
o By finding the best-fit line, the algorithm establishes the relationship between the dependent
variable and the independent variable. This relationship should be linear.
o The output of linear regression should only be continuous values such as price, age, or salary.
The relationship between the dependent variable and the independent variable can be shown in
the image below:
In the above image, the dependent variable (salary) is on the y-axis and the independent variable
(experience) is on the x-axis. The regression line can be written as:
y = a0 + a1x + ε
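To make the regression line concrete, here is a minimal sketch that estimates a0 and a1 for a single independent variable. It assumes ordinary least squares as the meaning of "best fit", which the text does not state explicitly, and the experience and salary figures are invented for the example. The error term ε is the part of y the line does not explain, so it does not appear in the prediction.

```python
def fit_line(xs, ys):
    """Estimate intercept a0 and slope a1 of y = a0 + a1*x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Invented data: years of experience and salary in thousands.
experience = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

a0, a1 = fit_line(experience, salary)
predicted = a0 + a1 * 6  # predicted salary for 6 years of experience
print(a0, a1, predicted)
```

Unlike the logistic model, the prediction here is an unbounded continuous value rather than a probability.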
Logistic Regression: