CO 2 Session 3
CONTENTS
Classification
LOGISTIC REGRESSION INTRODUCTION
In linear regression modeling, the outcome variable is continuous – e.g., income ~ age and education.
In logistic regression, the outcome variable is categorical; this chapter focuses on two-valued outcomes such as true/false, pass/fail, or yes/no.
LOGISTIC REGRESSION USE CASES
Medical – e.g., determine the likelihood of a patient's successful response to a specific treatment
Marketing – e.g., predict the likelihood that a customer will respond to a purchase offer or churn
LOGISTIC REGRESSION
Logistic regression is a supervised learning algorithm.
Predicted probabilities are converted into class labels using a threshold (commonly 0.5).
It proceeds in 3 main steps:
LOGISTIC REGRESSION
SAMPLE DATA
STEP 1: LOGISTIC (SIGMOID) FUNCTION
f(x) = 1 / (1 + e^(-x))
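Step 1's logistic (sigmoid) function can be sketched as:

```python
import math

# The logistic (sigmoid) function maps any real input into (0, 1).
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))  # 0.5: the midpoint of the S-curve
```

Large positive inputs give values near 1, large negative inputs give values near 0, which is what lets the output be read as a probability.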
LOGISTIC REGRESSION
Logistic regression is one of the most popular machine learning algorithms and comes under the supervised learning technique. It is used for predicting a categorical dependent variable from a given set of independent variables.
Because logistic regression predicts a categorical dependent variable, the outcome must be a categorical or discrete value: yes/no, 0/1, true/false, etc. Rather than outputting exactly 0 or 1, the model gives probabilistic values that lie between 0 and 1.
LOGISTIC REGRESSION
Logistic regression is similar to linear regression except in how it is used: linear regression solves regression problems, whereas logistic regression solves classification problems.
In logistic regression, instead of fitting a straight regression line, we fit an "S"-shaped logistic function, whose predictions are bounded by two limiting values (0 and 1).
The curve from the logistic function indicates the likelihood of an event – e.g., whether cells are cancerous or not, or whether a mouse is obese based on its weight.
LOGISTIC REGRESSION
Logistic regression is a significant machine learning algorithm because it can provide probabilities and classify new data using both continuous and discrete datasets.
Logistic regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification. The image below shows the logistic function:
LOGISTIC REGRESSION
[Figure: the logistic function curve]
LOGISTIC REGRESSION MODEL DESCRIPTION
f(y) = e^y / (1 + e^y), for -∞ < y < ∞
LOGISTIC REGRESSION MODEL DESCRIPTION
With the range of f(y) as (0,1), the logistic function models the probability of an outcome occurring.
In contrast to linear regression, the values of
y = β0 + β1x1 + β2x2 + ... + β(p-1)x(p-1)
are not directly observed; only the values of f(y) in terms of success or failure are observed. Thus the probability of success is
p(x1, x2, ..., x(p-1)) = e^y / (1 + e^y), for -∞ < y < ∞
Equivalently,
ln(p/(1-p)) = y = β0 + β1x1 + β2x2 + ... + β(p-1)x(p-1)
• ln(p/(1-p)) is called the log odds ratio, or logit of p. Maximum Likelihood Estimation (MLE) is used to estimate the model parameters; MLE is beyond the scope of this book.
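The logit and the logistic function are inverses of each other, which a short sketch can verify:

```python
import math

def logit(p):
    # log odds ratio: ln(p / (1 - p))
    return math.log(p / (1.0 - p))

def inv_logit(y):
    # the logistic function e^y / (1 + e^y), inverse of the logit
    return math.exp(y) / (1.0 + math.exp(y))

p = 0.8
y = logit(p)        # log odds of an 80% probability
print(inv_logit(y)) # recovers p
```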
LOGISTIC REGRESSION MODEL DESCRIPTION:
CUSTOMER CHURN EXAMPLE
DIAGNOSTICS MODEL DESCRIPTION: CUSTOMER
CHURN EXAMPLE
> Churn_logistic1 <- glm(Churned ~ Age + Married + Cust_years + Churned_contacts,
    data = churn_input, family = binomial(link = "logit"))
DIAGNOSTICS MODEL DESCRIPTION:
CUSTOMER CHURN EXAMPLE
> Churn_logistic3 <- glm(Churned ~ Age + Churned_contacts,
    data = churn_input, family = binomial(link = "logit"))
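As a rough illustration of what the glm fit above is doing, here is a hedged sketch in Python that fits a one-feature logistic model (churn ~ age) by gradient ascent on the log-likelihood. The toy ages and churn labels are hypothetical, and R's glm actually uses iteratively reweighted least squares rather than this simple loop.

```python
import math

# Hypothetical toy data: in this sketch, younger customers churn more often.
ages = [25.0, 30.0, 35.0, 45.0, 50.0, 55.0]
churned = [1, 1, 1, 0, 0, 0]

# Center the age feature so gradient ascent is well behaved.
mean_age = sum(ages) / len(ages)
xs = [a - mean_age for a in ages]

b0, b1 = 0.0, 0.0          # intercept and slope (the beta coefficients)
lr = 0.01                  # learning rate
for _ in range(5000):
    g0 = g1 = 0.0
    for x, y in zip(xs, churned):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += y - p        # gradient of the log-likelihood w.r.t. b0
        g1 += (y - p) * x  # gradient w.r.t. b1
    b0 += lr * g0
    b1 += lr * g1

def churn_prob(age):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * (age - mean_age))))

p_young = churn_prob(28.0)  # high probability of churn
p_old = churn_prob(52.0)    # low probability of churn
```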
DIAGNOSTICS DEVIANCE AND THE PSEUDO-R2
In logistic regression, deviance = -2 log L,
• where L is the maximized value of the likelihood function used to obtain the parameter estimates.
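As a small numeric sketch, deviance and a deviance-based pseudo-R² can be computed from the maximized log-likelihoods of the null (intercept-only) model and the fitted model. The log-likelihood values below are hypothetical, not computed from the churn data.

```python
# Hypothetical maximized log-likelihoods; deviance = -2 * log(L).
log_L_null = -4004.0
log_L_model = -3748.1

null_deviance = -2.0 * log_L_null        # deviance of the intercept-only model
residual_deviance = -2.0 * log_L_model   # deviance of the fitted model

# pseudo-R^2: the fraction of the null deviance explained by the model
pseudo_r2 = 1.0 - residual_deviance / null_deviance
print(round(pseudo_r2, 4))
```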
DIAGNOSTICS DEVIANCE AND THE PSEUDO-R2
y = β0 + β1·Age + β2·Churned_contacts
DIAGNOSTICS RECEIVER OPERATING
CHARACTERISTIC (ROC) CURVE
Logistic regression is often used to classify.
• In the churn example, a customer can be classified as Churn if the model predicts a high probability of churning.
• Although 0.5 is often used as the probability threshold, other values can be chosen based on the desired error tradeoff.
For two classes, C and nC, we have:
DIAGNOSTICS RECEIVER OPERATING
CHARACTERISTIC (ROC) CURVE
• True Positive: predict C, when actually C
• True Negative: predict nC, when actually nC
• False Positive: predict C, when actually nC
• False Negative: predict nC, when actually C
False Positive Rate (FPR) = (# of false positives) / (# of negatives)
True Positive Rate (TPR) = (# of true positives) / (# of positives)
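The two rates can be computed directly from predicted and actual class labels; the small label vectors below are hypothetical.

```python
# "C" = churn, "nC" = no churn (hypothetical labels)
actual    = ["C", "C", "nC", "nC", "C", "nC", "nC", "C"]
predicted = ["C", "nC", "nC", "C", "C", "nC", "nC", "C"]

tp = sum(1 for a, p in zip(actual, predicted) if a == "C" and p == "C")
fp = sum(1 for a, p in zip(actual, predicted) if a == "nC" and p == "C")
positives = actual.count("C")    # actual churners
negatives = actual.count("nC")   # actual non-churners

tpr = tp / positives   # true positive rate
fpr = fp / negatives   # false positive rate
print(tpr, fpr)
```

Sweeping the probability threshold from 0 to 1 and plotting (FPR, TPR) at each value traces out the ROC curve.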
REASONS TO CHOOSE AND CAUTIONS
ADDITIONAL REGRESSION MODELS
Multicollinearity is the condition in which several input variables are highly correlated.
• Lasso regression applies a penalty proportional to the sum of the absolute values of the coefficients.
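A minimal sketch of the lasso objective: the ordinary least-squares loss plus the L1 penalty on the coefficients. The numbers and the penalty weight `lam` (often written λ) are hypothetical.

```python
def lasso_loss(residuals, coefficients, lam):
    # residual sum of squares + penalty proportional to the sum of
    # the absolute values of the coefficients
    rss = sum(r * r for r in residuals)
    l1_penalty = lam * sum(abs(b) for b in coefficients)
    return rss + l1_penalty

# Hypothetical residuals and coefficients
print(lasso_loss([1.0, -2.0], [0.5, -1.5, 2.0], lam=0.1))
```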
CLASSIFICATION
• Decision trees
• Naïve Bayes
DECISION TREES
Decision Trees (DTs) are a non-parametric supervised learning method used for
classification and regression.
Decision trees learn simple if-then-else decision rules from the data features to approximate the target. The deeper the tree, the more complex the decision rules and the closer the fit to the training data.
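A decision tree's learned rules are literally nested if-then-else tests. The sketch below is hand-written (not learned from data) for a hypothetical churn decision, just to show the rule structure:

```python
# A hypothetical two-level decision tree expressed as if-then-else rules.
def classify(age, churned_contacts):
    if churned_contacts > 2:
        return "Churn"
    else:
        if age < 30:
            return "Churn"
        else:
            return "No churn"

print(classify(age=25, churned_contacts=1))  # "Churn"
print(classify(age=45, churned_contacts=0))  # "No churn"
```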
DECISION TREES
[Figure: example decision trees]
SIMPLE DECISION TREES
[Figure: simple decision tree examples]
PRUNING
• Pruning is the shortening of branches of the tree: the process of reducing the size of the tree by turning some branch nodes into leaf nodes and removing the leaf nodes under the original branch. Pruning is useful because classification trees may fit the training data well but do a poor job of classifying new values. A simpler tree often avoids overfitting.
PRUNING EXAMPLE
WHY DECISION TREES?
They are easily interpretable and follow a pattern similar to human thinking. In other words, a decision tree can be explained as a set of questions or business rules.
Prediction is fast.
DECISION TREE : THE GENERAL ALGORITHM
• At each split, the most informative attribute is typically chosen using entropy methods (information gain)
DECISION TREE : THE GENERAL ALGORITHM
H_X = -Σ P(x) log2 P(x), the entropy of X
InfoGain_A = H_S - H_S,A
where H_S is the base entropy of the dataset S and H_S,A is the conditional entropy of S given attribute A; the attribute with the largest information gain is chosen for the split.
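A minimal sketch of entropy and information gain on a toy label set (the labels and the split are hypothetical):

```python
import math
from collections import Counter

def entropy(labels):
    # H = -sum over classes of P(x) * log2(P(x))
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

labels = ["yes", "yes", "no", "no"]
print(entropy(labels))  # 1.0 for a 50/50 split

# Conditional entropy after splitting on a hypothetical attribute:
# each subset's entropy, weighted by the subset's share of the data.
split = {"left": ["yes", "yes"], "right": ["no", "no"]}
n = len(labels)
h_sa = sum(len(subset) / n * entropy(subset) for subset in split.values())
info_gain = entropy(labels) - h_sa
print(info_gain)  # a perfectly pure split recovers the full entropy: 1.0
```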
DECISION TREE : THE GENERAL ALGORITHM
• Construct subtrees T1, T2, ... for the subsets of S recursively until one of the following occurs:
  • All the leaf nodes in the tree satisfy the minimum purity threshold
  • The tree cannot be further split with the preset minimum purity threshold
  • Any other stopping criterion is satisfied (such as the maximum depth of the tree)
SUMMARY
This chapter explained logistic regression and its use cases, diagnostics (deviance and the pseudo-R2), and classification.
SELF-ASSESSMENT QUESTIONS
Reference Books:
1. Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data – EMC Education Services
2. Tom White, "Hadoop: The Definitive Guide", O'Reilly Publications, Fourth Edition, 2015
3. Seema Acharya, Subhashini Chellappan, "Big Data and Analytics", Wiley Publications, First Edition, 2015
Sites and Web links:
4. https://www.geeksforgeeks.org
5. https://www.javatpoint.com (Big Data Analytics)
6. https://www.analyticsvidhya.com/
THANK YOU