Logistic Regression
Logistic regression, despite its name, is a classification model rather than a regression model. It is a
simple and efficient method for binary classification with a linear decision boundary. It is easy to
implement and performs very well when the classes are linearly separable, and it is widely used for
classification in industry.
What is a sigmoid function?
The logistic function used in logistic regression is a type of sigmoid, a class of functions that share
the same characteristic S-shaped curve.
The sigmoid is a mathematical function that takes any real number and maps it to a value between
0 and 1, which can be read as a probability.
The formula of the sigmoid function is:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
The sigmoid function forms an S-shaped graph: as x approaches infinity, the probability
approaches 1, and as x approaches negative infinity, the probability approaches 0. The model
sets a threshold that decides which range of probabilities is mapped to which binary outcome.
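The limiting behaviour described above can be checked numerically. The following is a minimal sketch, assuming the standard form 1 / (1 + e^(-x)); the function name `sigmoid` and the sample inputs are illustrative choices, not part of the original text.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the equivalent form e^x / (1 + e^x) so exp() never overflows.
    z = math.exp(x)
    return z / (1.0 + z)

# Sample inputs chosen for illustration: outputs rise from near 0 to near 1.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"sigmoid({x:+.1f}) = {sigmoid(x):.5f}")
```

Splitting on the sign of x is one common way to keep the exponential from overflowing; the plain one-line formula gives the same values for moderate inputs.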
Suppose we have two possible outcomes, true and false, and have set the threshold as 0.5. A
probability less than 0.5 would be mapped to the outcome false, and a probability greater than or
equal to 0.5 would be mapped to the outcome true.
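The threshold rule above can be sketched in a few lines. The helper name `classify` and the sample probabilities are hypothetical, chosen only to illustrate the mapping.

```python
def classify(probability: float, threshold: float = 0.5) -> bool:
    """Map a probability to True/False; a value equal to the threshold counts as True."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return probability >= threshold

# Illustrative probabilities below, at, and above the threshold.
for p in (0.2, 0.5, 0.9):
    print(p, classify(p))
```

The threshold is kept as a parameter because 0.5 is only the value chosen in this example; an application may prefer a different cut-off.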
Example
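Suppose a regression model is fit using some training data to obtain β, and x represents the input
features. The predicted probability of the positive outcome is then:
$$P(y = 1 \mid x) = \frac{1}{1 + e^{-\beta^T x}}$$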
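As a rough sketch of this example, the snippet below computes the predicted probability as the sigmoid of the dot product of β and x. The coefficient and feature values are made up for illustration, and `predict_proba` is a hypothetical helper name. An intercept, if needed, is assumed to be handled by including a constant feature equal to 1 in x.

```python
import math

def predict_proba(beta, x):
    """Return sigmoid(beta . x), the estimated probability of the positive outcome."""
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    z = sum(b * xi for b, xi in zip(beta, x))
    # Plain formula; fine for moderate z (a very negative z would overflow exp()).
    return 1.0 / (1.0 + math.exp(-z))

beta = [0.5, -1.0, 2.0]   # hypothetical fitted coefficients
x = [1.0, 2.0, 0.5]       # hypothetical input features (x[0] = 1 acts as the intercept term)

p = predict_proba(beta, x)  # z = 0.5 - 2.0 + 1.0 = -0.5
print(f"P(y = 1 | x) = {p:.4f}")
```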
Significance of Sigmoid Function in Logistic Regression
Logistic regression, built on the sigmoid function, is a regression technique that can solve various
classification problems. Although it is a classification model, it is termed “regression” because its
fundamental techniques are similar to those of linear regression.
Binary classification problems, such as whether a tumour is malignant or not or whether an email is
spam or not, and multiclass classification problems, such as classifying a fruit as an apple, mango or
orange, can be solved using logistic regression.
In linear regression problems we simply predict a non-discrete outcome (y ∈ R) for an example
by fitting the best line to the data, given a hypothesis (h) that maps x to y.
For the sigmoid function, we see that even though x ranges from -∞ to ∞, the outcome remains in the
range 0 to 1, with the value 0.5 when x = 0.
Let’s see how we can use this function to adapt linear regression, an algorithm used for prediction,
into logistic regression, which can be used for classification problems, and how we can define the
decision boundary.
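We want our hypothesis to satisfy the condition 0 ≤ hθ(x) ≤ 1. The vectorized general form of the
linear regression hypothesis is
$$h_\theta(x) = \theta^T x$$
and if we apply the sigmoid function to it:
$$h_\theta(x) = g(\theta^T x)$$
where
$$g(z) = \frac{1}{1 + e^{-z}}$$
So our hypothesis becomes
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$$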
We prefer the sigmoid function over other functions of a similar nature because, paired with the
log loss (cross-entropy), it gives a convex cost function that is straightforward to minimize.
Defining a Decision Boundary with the Sigmoid Function
Now that we have applied the sigmoid function to the linear regression hypothesis and have
obtained a hypothesis for logistic regression, let’s see how classification becomes possible.
We can predict:
y = 1 if hθ(x) ≥ 0.5
y = 0 if hθ(x) < 0.5
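Because the sigmoid equals 0.5 exactly when its input is 0 and increases with its input, the rule hθ(x) ≥ 0.5 is equivalent to θ·x ≥ 0, so the decision boundary is the set of points where θ·x = 0. The sketch below illustrates this with hypothetical parameters and points. The names `hypothesis` and `predict` are illustrative, and the first feature is assumed to be a constant 1 that plays the role of the intercept.

```python
import math

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta . x)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """Predict 1 when theta . x >= 0, which is equivalent to h_theta(x) >= 0.5."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if z >= 0 else 0

# Hypothetical parameters: the boundary is the line x1 + x2 = 3.
theta = [-3.0, 1.0, 1.0]
# Each point starts with the constant feature 1, followed by (x1, x2).
points = [[1.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 3.0, 2.0]]

for x in points:
    print(x[1:], round(hypothesis(theta, x), 3), predict(theta, x))
```

The second point lies exactly on the boundary, so its probability is 0.5 and it is assigned to class 1, matching the "greater than or equal to" convention used in the text.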
Advantages and Disadvantages of Logistic Regression
Advantages: It extends easily to multiple classes (multinomial regression) and gives a natural
probabilistic view of class predictions.
Disadvantages: The major limitation of logistic regression is the assumption of a linear relationship
between the log-odds of the dependent variable and the independent variables.