Chapter 2 - Logistic Regression
• Logistic Regression
• Support Vector Machine
• Decision Tree
Logistic regression: Introduction
• In classification, we seek to identify the categorical class Ck associated with a given input vector x.
Examples (binary classification, with 0 the "Negative Class" and 1 the "Positive Class"):
• Vehicle features / budget: buy or not?
• Online transactions: fraudulent (yes / no)?
• Tumor: malignant or not?
[Figures: malignant (yes = 1) vs. benign (no = 0) tumors plotted against tumor size, and example decision boundaries in the (x1, x2) plane marking the regions where y = 1 and y = 0 are predicted.]
• For a given input, the hypothesis hθ(x) always outputs a value between 0 and 1.
if hθ(x) < 0.5, predict y = 0
else if hθ(x) >= 0.5, predict y = 1
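A minimal sketch of this hypothesis in Python (numpy assumed; the function names are illustrative, not taken from these slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): the estimated probability that y = 1."""
    return sigmoid(np.dot(theta, x))

def predict(theta, x):
    """Threshold the probability at 0.5 to get a class label."""
    return 1 if hypothesis(theta, x) >= 0.5 else 0
```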
Logistic regression – hypothesis
Training set: {(x(1), y(1)), …, (x(m), y(m))}, m examples, with y ∈ {0, 1}.
The cost for a single example is
Cost(hθ(x), y) = −log(hθ(x)) if y = 1, and −log(1 − hθ(x)) if y = 0.
Example: if y = 1, the −log(z) curve gives zero cost when hθ(x) = 1 and a cost growing without bound as hθ(x) → 0; the −log(1 − z) curve behaves symmetrically for y = 0.
Want: θ that minimizes the overall cost
J(θ) = −(1/m) Σᵢ [ y(i) log(hθ(x(i))) + (1 − y(i)) log(1 − hθ(x(i))) ]
Gradient descent: repeat until convergence {
θj := θj − α · (1/m) · Σᵢ (hθ(x(i)) − y(i)) · xj(i)
} (updating all θj simultaneously)
The update rule has the same form as in linear regression, but hθ(x) is now the logistic function.
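A compact sketch of batch gradient descent for this cost (numpy; the learning rate and iteration count are illustrative):

```python
import numpy as np

def train_logistic(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent for logistic regression.
    X: (m, n) design matrix (prepend a column of ones for the intercept).
    y: (m,) labels in {0, 1}.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = 1.0 / (1.0 + np.exp(-X @ theta))   # h_theta(x) for all m examples
        grad = X.T @ (h - y) / m               # gradient of J(theta)
        theta -= alpha * grad                  # simultaneous update of all theta_j
    return theta
```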
Multi-class classification: one-vs-all
[Figures: three (x1, x2) scatter plots, one per class, each showing the boundary separating that class from the other two.]
• Class 1: train hθ(1)(x) to separate class 1 from classes 2 and 3
• Class 2: train hθ(2)(x) to separate class 2 from classes 1 and 3
• Class 3: train hθ(3)(x) to separate class 3 from classes 1 and 2
On a new input x, predict the class i that maximizes hθ(i)(x); a sketch follows below.
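A minimal one-vs-all sketch using scikit-learn's LogisticRegression (the wrapper functions below are illustrative, not part of any library):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def one_vs_all(X, y):
    """Train one binary logistic classifier per class."""
    models = {}
    for c in np.unique(y):
        clf = LogisticRegression()
        clf.fit(X, (y == c).astype(int))  # class c vs. the rest
        models[c] = clf
    return models

def predict_ova(models, x):
    """Pick the class whose classifier is most confident that x belongs to it."""
    x = np.asarray(x).reshape(1, -1)
    return max(models, key=lambda c: models[c].predict_proba(x)[0, 1])
```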
Support Vector Machine
Given training points xi with labels yi ∈ {−1, +1}, a linear classifier has the form
f(x) = sign(w · x + b),
and a separating hyperplane is the set of points satisfying w · x + b = 0, with the yi = +1 points on one side and the yi = −1 points on the other.
[Figure: two linearly separable classes in the (x1, x2) plane with several candidate separating hyperplanes.]
Many hyperplanes separate the training data. Which one should we choose? The one that promises good generalization to unseen points x′.
SVM – Choosing a separating hyperplane
The SVM approach: linearly separable case
• The SVM idea is to maximize the distance (the margin) between the hyperplane and the closest sample points.
[Figure: the optimal hyperplane with margin M; the closest points xi, at distance d on each side, are marked.]
• These closest points are the support vectors. We will see later that the optimal hyperplane is completely defined by the support vectors.
SVM – Margin Decision Boundary
[Figure: Class 1 and Class 2 separated by the decision boundary, with margin m between the two classes.]
For a separating hyperplane scaled so that yi(w · xi + b) ≥ 1, with equality holding for the support vectors, the margin is m = 2 / ‖w‖; maximizing the margin is therefore equivalent to minimizing ‖w‖.
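Written out, this gives the standard primal formulation of the linearly separable SVM (textbook statement, not specific to these slides):

```latex
\min_{w,\,b}\ \tfrac{1}{2}\,\lVert w \rVert^2
\quad \text{subject to} \quad
y_i\,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, m
```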
SVM - Example
[Multi-slide worked example; content lost in extraction.]
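In place of the lost worked example, a small sketch with scikit-learn's linear SVC; the toy points below are made up for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data (illustrative only)
X = np.array([[1, 1], [2, 2], [2, 0],    # class -1
              [4, 4], [5, 5], [5, 3]])   # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # large C approximates the hard-margin case
clf.fit(X, y)

print("w =", clf.coef_[0])                # normal vector of the hyperplane
print("b =", clf.intercept_[0])           # offset
print("support vectors:\n", clf.support_vectors_)
print("margin m = 2/||w|| =", 2 / np.linalg.norm(clf.coef_[0]))
```

As the slides note, only the support vectors determine w and b; deleting any other training point leaves the fitted hyperplane unchanged.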
Decision Tree and Naïve Bayesian
Decision Tree
Issues Regarding Classification and Prediction
• Scalability: the ability to construct the classifier or predictor efficiently given large amounts of data.
Classification by Decision Tree Induction
• Decision tree
• A flow-chart-like tree structure
• Internal node denotes a test on an attribute
• Branch represents an outcome of the test
• Leaf nodes represent class labels or class distribution
• Decision tree generation consists of two phases
• Tree construction
• At start, all the training examples are at the root
• Partition examples recursively based on selected attributes
• Tree pruning
• Identify and remove branches that reflect noise or outliers
• Use of decision tree: classifying an unknown sample
• Test the attribute values of the sample against the decision tree (a library sketch follows below)
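Both phases map onto a single library call in practice; a minimal sketch with scikit-learn (the encoded dataset and parameters are illustrative only):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative encoded samples: each row is (age, income, student, credit_limit)
X = np.array([[0, 2, 0, 0], [0, 2, 0, 1], [1, 2, 0, 0], [2, 1, 0, 0]])
y = np.array([0, 0, 1, 1])  # 0 = does not buy, 1 = buys

# Tree construction: recursive partitioning on the selected attributes;
# ccp_alpha > 0 would enable cost-complexity pruning
tree = DecisionTreeClassifier(criterion="entropy", ccp_alpha=0.0)
tree.fit(X, y)

# Classifying an unknown sample: its attribute values are tested against the tree
print(tree.predict([[1, 1, 1, 0]]))
print(export_text(tree, feature_names=["age", "income", "student", "credit_limit"]))
```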
Training Dataset
[Table lost in extraction: 14 training tuples with attributes age, income, student, and credit_limit, and class label buys_computer.]
I(p, n) = −(p / (p + n)) · log₂(p / (p + n)) − (n / (p + n)) · log₂(n / (p + n))
Information Gain in Decision Tree Induction
• Assume that using attribute A, a set S will be partitioned into sets {S1, S2, …, Sv}. Then
Gain(A) = I(p, n) − E(A)
The class label attribute, buys_computer, has two distinct values (namely, yes and no). There are nine tuples of class yes and five tuples of class no.
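Plugging these counts into the definition above gives the expected information needed to classify a tuple (standard arithmetic, rounded to three decimals):

I(9, 5) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) ≈ 0.940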
E(A) = infoA(D) = Σᵢ₌₁ᵛ ((pᵢ + nᵢ) / (p + n)) · I(pᵢ, nᵢ)
Implementation
First, compute the gain of age: Gain(age) = I(9, 5) − E(age) = 0.940 − 0.694 = 0.246.
Now, compute the gain of income: Gain(income) = 0.029.
Now, compute the gain of student: Gain(student) = 0.151.
Now, compute the gain of credit_limit: Gain(credit_limit) = 0.048.
Age has the highest information gain, so it is selected as the splitting attribute at the root; a short script reproducing these numbers follows below.
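A short script computing these gains from the per-value class counts. The counts are taken from the standard 14-tuple buys_computer example; treat them as an assumption, since the table itself was lost from these slides:

```python
from math import log2

def info(p, n):
    """Expected information I(p, n) for p positive and n negative tuples."""
    total = p + n
    return sum(-c / total * log2(c / total) for c in (p, n) if c)

def gain(partitions, p, n):
    """Gain(A) = I(p, n) - E(A), where E(A) is the weighted info of each subset."""
    e = sum((pi + ni) / (p + n) * info(pi, ni) for pi, ni in partitions)
    return info(p, n) - e

# (yes, no) counts per attribute value, assumed from the standard dataset
age     = [(2, 3), (4, 0), (3, 2)]   # youth, mid_age, senior
student = [(6, 1), (3, 4)]           # yes, no

print(f"Gain(age)     = {gain(age, 9, 5):.3f}")      # 0.247 (0.246 with rounded intermediates)
print(f"Gain(student) = {gain(student, 9, 5):.3f}")  # 0.152 (0.151 with rounded intermediates)
```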
[Figure: the root of the tree splits on AGE; the Age = mid_age branch leads directly to the leaf Class: C1.]
Implementation
[Figure: the final decision tree. The root splits on AGE; the Age = mid_age branch is a leaf labelled Class: C1; the remaining branches test Student (Student = Yes / Student = No) and Credit_limit (= fair / = excellent) before reaching their leaf classes.]
Bayesian Classifiers: