Logistic Regression
Sigmoid Function:
The sigmoid is a mathematical function used to map a predicted value to the probability of a class. It maps any input into the range 0 to 1 and never goes beyond this limit, because the sigmoid's shape is a fixed "S" curve:
sigmoid(x) = 1 / (1 + e^-x)
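To make the mapping concrete, here is a minimal Python sketch of the sigmoid (the function name and sample inputs are only illustrative):

import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1), giving the fixed "S" shape.
    return 1.0 / (1.0 + math.exp(-x))

# Large negative inputs approach 0, large positive inputs approach 1.
for x in [-6, -2, 0, 2, 6]:
    print(x, round(sigmoid(x), 4))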
Now, before we go into depth on how logistic regression actually works, let's see what probability actually is.
1. Numerical:
Probability comes directly in numerical format, as a value between 0 and 1.
2. Likelihood:
The closer the value is to 0, the more likely the input belongs to the 0th class; the closer it is to 1, the more likely it belongs to the 1st class.
For example:
|1 - 0.1| = 0.9
Here the predicted probability is 0.1, so the input is eager to join the 0th class: its distance from the 1st class label is 0.9 out of 1, which means 90% eagerness to join the 0th class.
|0 - 0.9| = 0.9
Here the predicted probability is 0.9, so the input is eager to join the 1st class: its distance from the 0th class label is 0.9 out of 1, which means 90% eagerness to join the 1st class.
Suppose the probability of one particular input is 0.6 and the input lies in the 1st class. Mathematically, we can check this by subtracting the opposite class label from that input's probability:
|0 - 0.6| = 0.6
This 0.6 is more than the default threshold of 0.5, and since the input is eager to join the 1st class, we can conclude that it belongs to the 1st class.
Likewise, for an input with probability 0.4:
|1 - 0.4| = 0.6
This 0.6 is more than the default threshold of 0.5, and since the input is eager to join the 0th class, we can conclude that it belongs to the 0th class.
3. Events:
An event is nothing but an outcome of an experiment. For example, suppose we want to find the probability of a coin toss; it is counted on the basis of events and samples:
P = n(events) / n(samples)
This probability p always lies in the range (0, 1).
We know log(0) = -∞, so to reach the full range (-∞, +∞) we use odds:
Odds(A) = p / (1 - p), which lies in the range (0, +∞).
Taking the log gives the log-odds (logit): log(p / (1 - p)), which lies in (-∞, +∞).
Inverting this equation converts any value in (-∞, +∞) back into the range 0 to 1, and that inverse is exactly the sigmoid function.
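A small Python sketch of this chain of transformations (the values are made up for illustration):

import math

p = 0.9                      # a probability, in the range (0, 1)
odds = p / (1 - p)           # odds, in the range (0, +inf)
log_odds = math.log(odds)    # log-odds (logit), in the range (-inf, +inf)

# Applying the sigmoid inverts the logit, mapping it back into (0, 1).
recovered = 1.0 / (1.0 + math.exp(-log_odds))
print(odds, log_odds, recovered)   # ~9.0, ~2.197, ~0.9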
Evaluation Parameters:
Once the model is built, we evaluate it to find out how well it actually performs: is it a good fit or a poor fit?
To calculate the log loss for this example, we'll use the formula:
Log Loss = -(1/n) * Σ [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]
where n is the number of instances (in this case, n = 4), Σ denotes the summation over all instances, y_i is the true label, and p_i is the predicted probability.
So, in this example, the log loss for the given predicted probabilities and true labels is approximately 0.299.
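The worked table of four predictions is not reproduced here, but a minimal sketch of the calculation looks like this (the labels and probabilities below are made up, so the result will not be exactly 0.299):

import math

y_true = [1, 0, 1, 1]          # hypothetical actual labels
y_prob = [0.8, 0.2, 0.6, 0.9]  # hypothetical predicted probabilities

n = len(y_true)
log_loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, y_prob)) / n
print(round(log_loss, 3))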
Lower values of log loss indicate better model performance, as they mean the predicted probabilities are closer to the true labels. If the model finds a higher loss, its built-in optimizer, the Gradient Descent algorithm, automatically comes into play: it reduces the log loss by updating the weights using partial derivatives scaled by the learning rate parameter.
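A bare-bones sketch of that update rule, assuming a single feature and made-up training data:

import math

X = [0.5, 1.5, 2.0, 3.0]   # hypothetical feature values
y = [0, 0, 1, 1]           # hypothetical binary labels

w, b = 0.0, 0.0
lr = 0.1                   # learning rate parameter

for _ in range(1000):
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-(w * xi + b)))
        # Partial derivatives of the log loss for one example.
        w -= lr * (p - yi) * xi
        b -= lr * (p - yi)

print(w, b)   # parameters that (roughly) minimize the log loss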
Confusion Matrix:
A confusion matrix is a matrix that shows how many data points are truly classified and how many are misclassified. It is a 2×2 matrix for binary classification and an n×n matrix for multiclass classification, where n is the number of classes.
Figure: layout of the confusion matrix.
In the above figure you can see four main parameters, named TP, TN, FP, and FN.
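In practice the matrix is usually computed with a library call; a minimal scikit-learn sketch with made-up labels:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]   # hypothetical actual labels
y_pred = [0, 1, 1, 1, 0, 0]   # hypothetical predicted labels

# In scikit-learn, rows are actual classes and columns are predicted classes.
print(confusion_matrix(y_true, y_pred))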
Precision:
Consider a stock market prediction model that tells us when the market is going to collapse and when it is not (market collapse = 0th class, no collapse = 1st class). Now suppose the market is going to collapse, but the model says it is not. This is a misclassification: treating "no collapse" as the positive class, the truth is negative but the model predicts positive, i.e., a false positive. If the user relied on the model and stayed relaxed, he might face a huge loss. Since we need a model that keeps such false positives low, we measure precision:
Precision = TP / (TP + FP)
Recall:
Suppose we have a model that predicts whether a mail is spam or not (spam = 0th class, not spam = 1st class). Now suppose one mail is genuine, but the model predicts it as spam, so there is a possibility of losing a genuine mail. Here the mail is actually genuine (positive), but the model classifies it as spam (negative), i.e., a false negative. Recall measures how many of the actual positives the model catches:
Recall = TP / (TP + FN)
There is a trade-off between precision and recall: raising the classification threshold typically increases precision at the cost of a lower recall, and lowering it does the opposite. Which metric to prioritize depends on whether false positives or false negatives are more costly.
Accuracy Score:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
F1 Score:
The F1 score is nothing but the harmonic mean of precision and recall:
F1 = 2 × (Precision × Recall) / (Precision + Recall)
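Putting the four evaluation formulas together, a small sketch with hypothetical counts taken from a confusion matrix:

TP, TN, FP, FN = 40, 45, 5, 10   # made-up counts

precision = TP / (TP + FP)                    # penalizes false positives
recall = TP / (TP + FN)                       # penalizes false negatives
accuracy = (TP + TN) / (TP + TN + FP + FN)
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

print(precision, recall, accuracy, f1)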