Lecture 6
by
Rajdip Nayek
Assistant Professor,
Applied Mechanics Department,
IIT Delhi
• Linear regression
• A loss-based perspective, using the least-squares error
• A statistical perspective based on maximum likelihood, where the log-likelihood function was used
• We will see that in logistic regression, we will not obtain a closed-form solution
How to handle categorical input variables?
▪ We had mentioned earlier that input variables $\mathbf{x}$ can be numerical, categorical, or mixed
▪ Assume that an input variable is categorical and takes only two classes, say A and B
▪ We can represent such an input variable $x$ using 1 and 0:
$$x = \begin{cases} 0, & \text{if A} \\ 1, & \text{if B} \end{cases}$$
▪ If the input is a categorical variable with more than two classes, let’s say A, B, C, and D, use one-hot encoding
$$\mathbf{x} = \begin{bmatrix}1\\0\\0\\0\end{bmatrix} \text{ if A}, \quad \mathbf{x} = \begin{bmatrix}0\\1\\0\\0\end{bmatrix} \text{ if B}, \quad \mathbf{x} = \begin{bmatrix}0\\0\\1\\0\end{bmatrix} \text{ if C}, \quad \mathbf{x} = \begin{bmatrix}0\\0\\0\\1\end{bmatrix} \text{ if D}$$
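To make the encoding concrete, here is a minimal NumPy sketch (my illustration, not part of the slides); the category list and the helper name `one_hot` are assumptions for the example.

```python
import numpy as np

def one_hot(label, categories=("A", "B", "C", "D")):
    # hypothetical helper: return a one-hot vector for `label`
    x = np.zeros(len(categories))
    x[categories.index(label)] = 1.0
    return x

print(one_hot("A"))   # [1. 0. 0. 0.]
print(one_hot("C"))   # [0. 0. 1. 0.]
```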
A statistical view of the Classification problem
▪ Classification → learn relationships between some input variables $\mathbf{x} = [x_1\; x_2\; \dots\; x_p]^T$ and a categorical output $y$
▪ The goal in classification is to take an input vector $\mathbf{x}$ and to assign it to one of $M$ discrete classes $1, 2, \dots, M$
▪ From a statistical perspective, classification amounts to predicting the conditional class probabilities
$$p(y = m \mid \mathbf{x}), \qquad m = 1, 2, \dots, M$$
▪ $p(y = m \mid \mathbf{x})$ describes the probability for class $m$ given that we know the input $\mathbf{x}$
▪ A probability over output 𝑦 implies the output label 𝑦 is a random variable (r.v.)
▪ We consider $y$ a r.v. because real-world data will always involve a certain amount of randomness (much like the output of linear regression, which was probabilistic due to the random error $\epsilon$)
▪ How to construct a classifier which can not only predict classes but also learn the class probabilities $p(y \mid \mathbf{x})$?
▪ We wish to learn a function $g(\mathbf{x})$ that approximates the conditional probability of the positive class, $p(y = 1 \mid \mathbf{x})$
Logistic Regression
▪ Idea of Logistic Regression: we start with the linear regression model, but without the noise term $\epsilon$
▪ Define the logit, $z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_p x_p = \mathbf{x}^T \boldsymbol{\theta}$
▪ The logit takes values on the entire real line, but we need a function that returns a value in the interval $(0, 1)$
▪ Squash the logit $z = \mathbf{x}^T \boldsymbol{\theta}$ into the interval $(0, 1)$ by using the logistic function
$$h(z) = \frac{e^z}{1 + e^z}$$
▪ The randomness in classification is statistically modelled by the class probability $p(y = m \mid \mathbf{x})$, instead of additive noise $\epsilon$
▪ Like linear regression, logistic regression is also a parametric model, and we learn the parameters 𝜽 from training data
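As a small illustration (not from the slides), the sketch below computes the logit $z = \mathbf{x}^T\boldsymbol{\theta}$ and squashes it with the logistic function; the parameter and input values are made up, and $h(z) = e^z/(1+e^z)$ is written in the equivalent form $1/(1+e^{-z})$.

```python
import numpy as np

def logistic(z):
    # h(z) = e^z / (1 + e^z), written equivalently as 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

theta = np.array([0.5, -1.2, 2.0])   # assumed parameters [theta_0, theta_1, theta_2]
x = np.array([1.0, 0.3, -0.7])       # assumed input, with a leading 1 for the intercept
z = x @ theta                        # logit z = x^T theta
print(logistic(z))                   # a value in (0, 1), modelling p(y = 1 | x)
```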
Training binary classification model with Maximum Likelihood
▪ The logistic function is a nonlinear function of the parameters $\boldsymbol{\theta}$
▪ Therefore, a closed-form solution for logistic regression cannot be derived
▪ Similar to linear regression, we assume that the training data points are independent, and we consider the logarithm of
the likelihood function for numerical reasons
$$\widehat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \ln p(\mathbf{y} \mid \mathbf{X}; \boldsymbol{\theta}) = \arg\max_{\boldsymbol{\theta}} \sum_{i=1}^{N} \ln p\big(y^{(i)} \mid \mathbf{x}^{(i)}; \boldsymbol{\theta}\big) = \arg\min_{\boldsymbol{\theta}} \sum_{i=1}^{N} -\ln p\big(y^{(i)} \mid \mathbf{x}^{(i)}; \boldsymbol{\theta}\big)$$
$$-\ln p\big(y^{(i)} \mid \mathbf{x}^{(i)}; \boldsymbol{\theta}\big) = \begin{cases} -\ln g\big(\mathbf{x}^{(i)}; \boldsymbol{\theta}\big) & \text{if } y^{(i)} = 1 \\ -\ln\big(1 - g(\mathbf{x}^{(i)}; \boldsymbol{\theta})\big) & \text{if } y^{(i)} = -1 \end{cases}$$
▪ $p(y = 1 \mid \mathbf{x}; \boldsymbol{\theta})$ is modelled using $g(\mathbf{x}; \boldsymbol{\theta})$
▪ The cross-entropy loss can be used for any binary classifier that predicts class probabilities $g(\mathbf{x}; \boldsymbol{\theta})$, not just logistic regression
For $y^{(i)} = -1$:
$$1 - g\big(\mathbf{x}^{(i)}; \boldsymbol{\theta}\big) = \frac{1}{1 + e^{\mathbf{x}^{(i)T}\boldsymbol{\theta}}} = \frac{e^{-\mathbf{x}^{(i)T}\boldsymbol{\theta}}}{1 + e^{-\mathbf{x}^{(i)T}\boldsymbol{\theta}}} = \frac{e^{y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}}{1 + e^{y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}}$$
▪ Hence, we get the same expression in both cases and can write the cost function compactly as:
$$J(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^{N} \begin{cases} -\ln g\big(\mathbf{x}^{(i)}; \boldsymbol{\theta}\big) & \text{if } y^{(i)} = 1 \\ -\ln\big(1 - g(\mathbf{x}^{(i)}; \boldsymbol{\theta})\big) & \text{if } y^{(i)} = -1 \end{cases}$$
$$= -\frac{1}{N}\sum_{i=1}^{N} \ln \frac{e^{y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}}{1 + e^{y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}} = -\frac{1}{N}\sum_{i=1}^{N} \ln \frac{1}{1 + e^{-y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}} = \frac{1}{N}\sum_{i=1}^{N} \ln\Big(1 + e^{-y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}\Big)$$
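A minimal sketch (assumed toy data, not from the lecture) of evaluating the compact cost $J(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^{N}\ln\big(1 + e^{-y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}\big)$ with labels in $\{-1, +1\}$:

```python
import numpy as np

def cost(theta, X, y):
    # J(theta) = (1/N) * sum_i ln(1 + exp(-y_i * x_i^T theta)), with y_i in {-1, +1}
    margins = y * (X @ theta)
    return np.mean(np.log1p(np.exp(-margins)))

# assumed toy data; each row of X has a leading 1 for the intercept
X = np.array([[1.0, 0.5],
              [1.0, -1.5],
              [1.0, 2.0]])
y = np.array([1, -1, 1])
theta = np.array([0.1, 0.8])
print(cost(theta, X, y))
```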
Training Logistic Regression model with Maximum Likelihood
▪ Cost function in logistic regression is given by:
$$J(\boldsymbol{\theta}) = \frac{1}{N}\sum_{i=1}^{N} \ln\Big(1 + e^{-y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}\Big)$$
▪ Learning a logistic regression model thus amounts to solving the optimization problem:
$$\widehat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} J(\boldsymbol{\theta}) = \arg\min_{\boldsymbol{\theta}} \frac{1}{N}\sum_{i=1}^{N} \ln\Big(1 + e^{-y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}\Big)$$
▪ In contrast to linear regression with the squared-error loss, the above problem has no closed-form solution, so we have to use numerical optimization instead
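Since there is no closed-form solution, $\widehat{\boldsymbol{\theta}}$ is found numerically. Below is a plain gradient-descent sketch (my illustration; the lecture does not prescribe a specific optimizer), using the gradient $\nabla J(\boldsymbol{\theta}) = -\frac{1}{N}\sum_{i=1}^{N} \frac{y^{(i)}\mathbf{x}^{(i)}}{1 + e^{y^{(i)}\mathbf{x}^{(i)T}\boldsymbol{\theta}}}$; the step size and iteration count are assumed, untuned values.

```python
import numpy as np

def grad(theta, X, y):
    # gradient of J: -(1/N) * sum_i y_i * x_i / (1 + exp(y_i * x_i^T theta))
    margins = y * (X @ theta)
    weights = -y / (1.0 + np.exp(margins))      # one scalar weight per data point
    return (X * weights[:, None]).mean(axis=0)

def fit_logistic(X, y, lr=0.1, n_iters=5000):
    # plain gradient descent; lr and n_iters are assumed, untuned values
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        theta -= lr * grad(theta, X, y)
    return theta

# assumed toy data (leading column of ones for the intercept), labels in {-1, +1}
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.5]])
y = np.array([-1, -1, 1, 1])
theta_hat = fit_logistic(X, y)
print(theta_hat)
```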
Predictions using Logistic Regression
▪ Logistic regression predicts class probabilities for a test input $\mathbf{x}^*$
▪ by first learning $\boldsymbol{\theta}$ from training data, and
▪ then computing $g(\mathbf{x}^*)$, which is the model for $p(y^* = 1 \mid \mathbf{x}^*)$
▪ However, sometimes we want to make a “hard” prediction for the test input 𝐱 ∗
▪ E.g., is $\hat{y}(\mathbf{x}^*) = 1$ or $\hat{y}(\mathbf{x}^*) = -1$ in binary classification?
▪ Recall, in 𝑘NN and decision trees, we made “hard” predictions
▪ To make hard predictions with logistic regression model, we add a final step, in which the predicted probabilities are
turned into a class prediction
▪ The most common approach is to let 𝑦ො 𝐱 ∗ be the most probable class ← the class having the highest probability
▪ For binary classification, we can express this as:
$$\hat{y}(\mathbf{x}^*) = \begin{cases} 1 & \text{if } g(\mathbf{x}^*) > r \\ -1 & \text{if } g(\mathbf{x}^*) \le r \end{cases} \qquad \text{with decision threshold } r = 0.5 \text{ (why?)}$$
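A small sketch (assumed parameters and test inputs) of turning predicted probabilities into hard class labels with the threshold $r = 0.5$:

```python
import numpy as np

def predict_proba(theta, X):
    # g(x*) = p(y* = 1 | x*), the logistic function of the logit x*^T theta
    return 1.0 / (1.0 + np.exp(-(X @ theta)))

def predict(theta, X, r=0.5):
    # hard prediction: +1 if g(x*) > r, otherwise -1
    return np.where(predict_proba(theta, X) > r, 1, -1)

theta = np.array([0.2, 1.5])                     # assumed learned parameters
X_test = np.array([[1.0, -1.0], [1.0, 0.5]])     # assumed test inputs (with intercept column)
print(predict(theta, X_test))                    # e.g. [-1  1]
```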
Decision Boundaries of Logistic Regression
▪ Decision boundary ← The point(s) where the prediction changes from one class to another
▪ The decision boundary for binary classification can be computed by solving the equation
$$g(\mathbf{x}) = 1 - g(\mathbf{x}) \qquad \text{meaning} \qquad p(y = 1 \mid \mathbf{x}; \boldsymbol{\theta}) = p(y = -1 \mid \mathbf{x}; \boldsymbol{\theta})$$
▪ The solutions to this equation are points in the input space for which the two classes are predicted to be equally probable
▪ For binary logistic regression, it means
$$\frac{e^{\mathbf{x}^T\boldsymbol{\theta}}}{1 + e^{\mathbf{x}^T\boldsymbol{\theta}}} = \frac{1}{1 + e^{\mathbf{x}^T\boldsymbol{\theta}}} \;\Longleftrightarrow\; e^{\mathbf{x}^T\boldsymbol{\theta}} = 1 \;\Longleftrightarrow\; \mathbf{x}^T\boldsymbol{\theta} = 0$$
Prediction and Decision Boundaries of Logistic Regression
▪ Choosing 𝑟 = 0.5 minimises the so-called misclassification rate
▪ Compactly, one can write the test output prediction for a test input $\mathbf{x}^*$ from logistic regression as
$$\hat{y}(\mathbf{x}^*) = \operatorname{sign}\big(\mathbf{x}^{*T}\boldsymbol{\theta}\big)$$
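A quick numerical check (my illustration, with made-up numbers) that thresholding $g(\mathbf{x}^*)$ at $r = 0.5$ gives the same labels as $\operatorname{sign}(\mathbf{x}^{*T}\boldsymbol{\theta})$:

```python
import numpy as np

theta = np.array([0.3, -0.8, 1.1])                 # assumed parameters
X = np.array([[1.0, 0.2, 0.1],
              [1.0, 2.0, 0.5],
              [1.0, -1.0, 1.5]])                   # assumed test inputs

g = 1.0 / (1.0 + np.exp(-(X @ theta)))             # predicted p(y = 1 | x)
pred_threshold = np.where(g > 0.5, 1, -1)          # threshold g at r = 0.5
pred_sign = np.where(X @ theta > 0, 1, -1)         # sign of the logit x^T theta
print(np.array_equal(pred_threshold, pred_sign))   # True
```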
Linear vs Non-linear classifiers
▪ A classifier whose decision boundaries are linear hyperplanes is a linear classifier
▪ Logistic regression is a linear classifier
▪ 𝑘NN and Decision Trees are non-linear classifiers
[Figure: decision boundaries in the $(x_1, x_2)$ plane; left panel: linear classifier, right panel: non-linear classifier]
▪ Note that the term ‘linear’ has a different sense for linear regression and for linear classification
▪ Linear regression is a model that is linear in its parameters
▪ A linear classifier is a model whose decision boundaries are linear
Logistic Regression for more than two classes
▪ For the binary problem, we used the logistic function to design a model for $g(\mathbf{x})$
▪ $g(\mathbf{x})$ is a scalar-valued function representing $p(y = 1 \mid \mathbf{x})$
▪ For a multi-class problem ($M$ classes), the classifier should return a vector-valued function $\boldsymbol{g}(\mathbf{x})$, where
$$\begin{bmatrix} p(y = 1 \mid \mathbf{x}) \\ p(y = 2 \mid \mathbf{x}) \\ \vdots \\ p(y = M \mid \mathbf{x}) \end{bmatrix} \text{ is modelled by } \boldsymbol{g}(\mathbf{x}) = \begin{bmatrix} g_1(\mathbf{x}) \\ g_2(\mathbf{x}) \\ \vdots \\ g_M(\mathbf{x}) \end{bmatrix}$$
Since $\boldsymbol{g}(\mathbf{x})$ models a probability vector, each element $g_m(\mathbf{x}) \ge 0$ and $\sum_{m=1}^{M} g_m(\mathbf{x}) = 1$
$$\operatorname{softmax}(\mathbf{z}) \triangleq \frac{1}{\sum_{m=1}^{M} e^{z_m}} \begin{bmatrix} e^{z_1} \\ e^{z_2} \\ \vdots \\ e^{z_M} \end{bmatrix}$$
• $\mathbf{z}$ is an $M$-dimensional vector
• $\operatorname{softmax}(\mathbf{z})$ also returns a vector of the same dimension
• By construction, the output vector always sums to 1, and each element is always $\ge 0$
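A minimal softmax sketch (my illustration); subtracting $\max(\mathbf{z})$ before exponentiating is a standard numerical-stability trick that leaves the result unchanged:

```python
import numpy as np

def softmax(z):
    # softmax(z)_m = exp(z_m) / sum_j exp(z_j); shifting by max(z) avoids overflow
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, -0.5])   # assumed logit vector with M = 3
g = softmax(z)
print(g, g.sum())                # non-negative entries that sum to 1
```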
Multi-class Logistic Regression model
▪ We have now combined linear regression and softmax function to model multi-class probabilities
$$\boldsymbol{g}(\mathbf{z}) = \operatorname{softmax}(\mathbf{z}), \quad \text{where} \quad \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_M \end{bmatrix} = \begin{bmatrix} \boldsymbol{\theta}_1^T \mathbf{x} \\ \boldsymbol{\theta}_2^T \mathbf{x} \\ \vdots \\ \boldsymbol{\theta}_M^T \mathbf{x} \end{bmatrix}$$
▪ Equivalently, we can write out the individual class probabilities, that is, the elements $g_m(\mathbf{x})$ of the vector $\boldsymbol{g}(\mathbf{x})$
$$g_m(\mathbf{x}) = \frac{e^{\boldsymbol{\theta}_m^T \mathbf{x}}}{\sum_{j=1}^{M} e^{\boldsymbol{\theta}_j^T \mathbf{x}}}, \qquad m = 1, 2, \dots, M$$
▪ Note that this construction uses $M$ parameter vectors $\boldsymbol{\theta}_1, \dots, \boldsymbol{\theta}_M$ (one for each class)
▪ Note that the number of parameters to be learned grows with $M$
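A sketch of the full multi-class model (assumed toy parameter matrix and input): one parameter vector $\boldsymbol{\theta}_m$ per class stacked as rows, logits $z_m = \boldsymbol{\theta}_m^T\mathbf{x}$, and class probabilities via softmax:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# assumed toy model: M = 3 classes, inputs with a leading 1 for the intercept
Theta = np.array([[ 0.1,  0.5, -0.3],   # row m holds theta_m^T
                  [-0.2,  0.1,  0.8],
                  [ 0.4, -0.6,  0.2]])
x = np.array([1.0, 0.7, -1.2])

z = Theta @ x                    # z_m = theta_m^T x, one logit per class
g = softmax(z)                   # g_m(x) = p(y = m | x)
print(g, g.sum())                # probability vector summing to 1
```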