The document covers logistic regression, focusing on its application as a binary classifier, the formulation of its likelihood, and the derivation of its gradient and Hessian. It discusses the importance of regularization to prevent overfitting in models, emphasizing methods to reduce the number of features or penalize model parameters. Additionally, it outlines the cost function used in logistic regression, which is the cross-entropy loss, and provides insights into the implications of regularization on model performance.

Logistic Regression - Regularization

Mourad Gridach
Department of Computer Science
High Institute of Technology - Agadir
Last Lecture

• Gradient descent for linear regression

• Linear regression with multiple variables

2
Today’s Lecture
• We will cover classification

• Binary classifiers using a technique called Logistic Regression:

– Apply logistic regression to discriminate between two classes.

– Formulate the logistic regression likelihood.

– Derive the gradient and Hessian of logistic regression.

– How to do logistic regression with the softmax link.

3
Notes on Notation
• In linear regression we used w to refer to weights or
parameters of the model

• In this lecture, we will use θ to refer to the model parameters, since it is the most common notation in probability theory.

4
McCulloch-Pitts Model of a Neuron

5
Sigmoid Function

• We want 0 ≤ h_θ(x) ≤ 1

• Logistic regression: h_θ(x) = f(θᵀx)

• Where: f(z) = 1 / (1 + e^(−z))

• "f" is called the sigmoid function, also known as the logistic function

• Question: what are the max and min values that this function can take? (see the sketch below)
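
A minimal sketch in Python/NumPy (the names sigmoid and hypothesis are illustrative, not from the lecture) showing the function and answering the question numerically:

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: maps any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    # Logistic regression hypothesis h_theta(x) = sigmoid(theta^T x)
    return sigmoid(np.dot(theta, x))

# The sigmoid approaches 0 for very negative inputs and 1 for very positive ones,
# so its values stay strictly between 0 and 1.
print(sigmoid(-10.0), sigmoid(0.0), sigmoid(10.0))  # ~0.000045, 0.5, ~0.999955
```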
Example of Data in Logistic Regression

x1      x2      y
34.1    10.12   0
30.11   43.21   1
35.1    72.12   1
60.2    86.78   1
79.23   75.23   0
45.08   96.67   1
75.89   46.23   0
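
As a quick sanity check (not part of the lecture), here is a hedged sketch that fits an off-the-shelf logistic regression to the seven examples above using scikit-learn, assuming it is installed:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# The (x1, x2) -> y examples from the table above
X = np.array([[34.1, 10.12], [30.11, 43.21], [35.1, 72.12], [60.2, 86.78],
              [79.23, 75.23], [45.08, 96.67], [75.89, 46.23]])
y = np.array([0, 1, 1, 1, 0, 1, 0])

clf = LogisticRegression().fit(X, y)
# Predicted probabilities P(y=0) and P(y=1) for a new, made-up point
print(clf.predict_proba([[50.0, 60.0]]))
```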
Applications

• Spam vs Not Spam

• Sentiment classification

• Medical diagnosis

8
Probabilistic Interpretation
• Logistic regression: h_θ(x) = P(y = 1 | x; θ)
• Consequently: P(y = 0 | x; θ) = 1 − h_θ(x)
Logistic Regression Hypothesis

Formula: h_θ(x) = 1 / (1 + e^(−θᵀx))
Linear Separating Hyper-planes

• θᵀx = 0 is the equation of the separating hyperplane (the decision boundary where h_θ(x) = 0.5)


12
Next
what is the Cost/Loss Function?

13
Bernoulli Distribution: a model of coins

• A Bernoulli random variable (r.v.) X takes values in {0, 1}

• P(X = 1) = θ, where θ ∈ [0, 1]. We can write this probability as follows:

  P(X = x) = θ if x = 1, and 1 − θ if x = 0
  (equivalently, P(X = x) = θ^x (1 − θ)^(1 − x))
14
Entropy
• In information theory, entropy H is a measure of the uncertainty
  associated with a random variable. It is defined as:

  H(X) = − Σ_x P(X = x) log P(X = x)

• For a Bernoulli variable X with parameter θ, the entropy is:

  H(X) = − θ log θ − (1 − θ) log(1 − θ)
15
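
A small sketch of the Bernoulli entropy in Python/NumPy (base-2 logarithms are assumed here so the result is in bits; the slides may use natural log instead):

```python
import numpy as np

def bernoulli_entropy(theta):
    # H(X) = -theta*log2(theta) - (1-theta)*log2(1-theta), with 0*log(0) taken as 0
    if theta == 0.0 or theta == 1.0:
        return 0.0
    return -theta * np.log2(theta) - (1 - theta) * np.log2(1 - theta)

# Entropy is largest (1 bit) for a fair coin (theta = 0.5) and 0 for a deterministic one
for theta in [0.0, 0.1, 0.5, 0.9, 1.0]:
    print(theta, round(bernoulli_entropy(theta), 4))
```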
Logistic Regression
• The logistic regression model specifies the probability of a binary output
  y_i ∈ {0, 1} given the input x_i as follows:

  p(y_i | x_i, θ) = h_θ(x_i)^(y_i) (1 − h_θ(x_i))^(1 − y_i),  where h_θ(x_i) = 1 / (1 + e^(−θᵀx_i))

16
Logistic Regression – Cost Function

Cross-Entropy loss will be the cost function:

  J(θ) = − (1/m) Σ_{i=1..m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

Let us prove that, starting from the likelihood of the training data.

17
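
A minimal vectorized sketch of this cost function in Python/NumPy (the small eps term guards against log(0) and is a numerical safeguard, not part of the formula on the slide):

```python
import numpy as np

def cross_entropy_loss(theta, X, y):
    # Binary cross-entropy J(theta), averaged over the m training examples.
    # X has shape (m, n), y has shape (m,) with entries in {0, 1}.
    h = 1.0 / (1.0 + np.exp(-X @ theta))   # h_theta(x^(i)) for every example
    eps = 1e-12                            # avoid log(0)
    return -np.mean(y * np.log(h + eps) + (1 - y) * np.log(1 - h + eps))
```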
(Binary) Cross Entropy Loss - Intuition
Summary:
Correct answer → low loss
Wrong answer → high loss

Compute the loss for each example (a worked check follows the table):

y    y_pred
1    0.2
1    0.8
0    0.1
0    0.9
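
A worked check of the table above, assuming natural logarithms and the per-example loss −[y log ŷ + (1 − y) log(1 − ŷ)]:

```python
import numpy as np

def example_loss(y, y_pred):
    # Per-example binary cross-entropy loss
    return -(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

for y, y_pred in [(1, 0.2), (1, 0.8), (0, 0.1), (0, 0.9)]:
    print(y, y_pred, round(example_loss(y, y_pred), 3))
# (1, 0.2) -> 1.609  confident but wrong: high loss
# (1, 0.8) -> 0.223  correct: low loss
# (0, 0.1) -> 0.105  correct: low loss
# (0, 0.9) -> 2.303  confident but wrong: high loss
```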
Gradient of binary logistic regression – Exercise

Cost function:

  J(θ) = − (1/m) Σ_{i=1..m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

Prove that the gradient of the loss is:

  ∂J(θ)/∂θ_j = (1/m) Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) ) x_j^(i),   i.e.   ∇J(θ) = (1/m) Xᵀ ( h_θ(X) − y )

19
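
One way to implement (and numerically verify) this gradient, as a sketch in Python/NumPy:

```python
import numpy as np

def gradient(theta, X, y):
    # Gradient of the cross-entropy loss: (1/m) * X^T (h_theta(X) - y)
    m = X.shape[0]
    h = 1.0 / (1.0 + np.exp(-X @ theta))
    return (X.T @ (h - y)) / m
```

A quick way to check the derivation is to compare this against a finite-difference approximation of the cost function on random data.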
Hessian of binary logistic regression - Exercise

Cost function:

  J(θ) = − (1/m) Σ_{i=1..m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

Prove that the Hessian (second derivative) of logistic regression is:

  H = (1/m) Xᵀ S X,   where S = diag( h_θ(x^(i)) (1 − h_θ(x^(i))) )

• One can show that H is positive definite;

• Therefore, the NLL is convex → it has a unique global minimum.
20
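
A corresponding sketch of the Hessian in Python/NumPy, with a numerical convexity check (the notation follows the formula above; this is a sketch, not the lecture's reference implementation):

```python
import numpy as np

def hessian(theta, X):
    # Hessian of the cross-entropy loss: (1/m) * X^T S X,
    # where S = diag(h_i * (1 - h_i)) and h_i = sigmoid(theta^T x_i)
    m = X.shape[0]
    h = 1.0 / (1.0 + np.exp(-X @ theta))
    S = np.diag(h * (1 - h))
    return (X.T @ S @ X) / m

# Since every h_i * (1 - h_i) > 0, the eigenvalues of H are non-negative;
# numerically: np.all(np.linalg.eigvalsh(hessian(theta, X)) >= -1e-10)
```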
Summary so far
Hypothesis of a logistic regression model:

  h_θ(x) = f(θᵀx) = 1 / (1 + e^(−θᵀx))

OR, written as a probability: P(y = 1 | x; θ) = h_θ(x)

Cost/Loss/Objective function, called the Cross-Entropy Loss:

  J(θ) = − (1/m) Σ_{i=1..m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

21
Regularization:

The Problem of Overfitting

22
Training vs. Testing

• Students vs Exams

• Training: students learn new concepts during the lectures → Training data (training set)

• Testing: the professor tests them during exams → Test data (test set)
23
Linear Regression Revisited
[Figure: three plots of output (y) vs. input (x), showing fits of increasing complexity]

• See white board

24
Linear Regression Revisited
[Figure: three plots of output (y) vs. input (x), showing fits of increasing complexity]

Overfitting: if we have too many features, the learned hypothesis may fit the
training set very well but fail to generalize to new examples (e.g., predicting
prices for examples it has not seen).
25
Logistic Regression and Overfitting

26
Underfitting

27
Overfitting

28
Best Solution

29
How to Solve Overfitting
1. Reduce the number of features.
   – Manually select which features to keep.
   – Model selection algorithms (out of the scope of this course).

2. Regularization
   – Keep all the features, but reduce the magnitude/values of the parameters θ_j.
   – Works well when we have a lot of features, each of which
     contributes a bit to predicting y.

30
So let us apply the second solution:

Regularization

31
Intuition

[Figure: two plots of output (y) vs. input (x)]

32
Intuition

[Figure: two plots of output (y) vs. input (x)]

• Suppose we penalize θ3 and θ4 and make them really small by adding large penalty terms on θ3² and θ4² to the cost function.

33
Regularization
• Small values for parameters
– “Simpler” hypothesis
– Smooth function
– Less prone to Overfitting
• In the last example, we penalize θ3 and θ4
• Let us take the last example of car prices
– Features: x1, x2, …, xn
– Parameters: θ0, θ1, …, θn
• Question: how to choose which parameters to penalize ?
34
Regularization – General Mathematical Formula
• How to choose which parameters to penalize ?

• Solution: shrink all the parameters without focusing on specific ones

• We will try to keep all the parameters small

• The general cost function will be:

  J(θ) = (1/(2m)) [ Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) )² + λ Σ_{j=1..n} θ_j² ]

35
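
A minimal sketch of this regularized cost in Python/NumPy, following the common convention (assumed here) that the intercept θ0 is not penalized:

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    # J(theta) = (1/(2m)) * [ sum_i (h_theta(x^(i)) - y^(i))^2 + lam * sum_{j>=1} theta_j^2 ]
    m = X.shape[0]
    errors = X @ theta - y                  # h_theta(x) = theta^T x for linear regression
    penalty = lam * np.sum(theta[1:] ** 2)  # theta_0 (the intercept) is not penalized
    return (np.sum(errors ** 2) + penalty) / (2 * m)
```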
Remark
• What if λ (the regularization term) is very large (e.g., λ = 10^20)?
• Let us take this example with five parameters θ0, θ1, θ2, θ3, θ4
  – With such a large λ, minimizing the cost forces θ1 ≈ θ2 ≈ θ3 ≈ θ4 ≈ 0,
    leaving essentially only θ0
  – What will happen?

[Figure: plot of output (y) vs. input (x) showing an almost flat fit h_θ(x) ≈ θ0]

  – Answer: underfitting

36
Regularization for Linear/Logistic Regression

• The new Linear Regression cost function after adding the regularization term will be:

  J(θ) = (1/(2m)) [ Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) )² + λ Σ_{j=1..n} θ_j² ]

• Recall: between linear and logistic regression, the only differences are the hypothesis
  h_θ(x) and the cost function J(θ); the gradient descent update has the same form.
37
Gradient Descent in Action
• Add the derivative of the regularization term to the gradient descent update for linear regression:

Repeat {
  θ_0 := θ_0 − α (1/m) Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) ) x_0^(i)
  θ_j := θ_j − α [ (1/m) Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) ) x_j^(i) + (λ/m) θ_j ]
}  for j = 1, …, n

38
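
A sketch of one regularized gradient descent step for linear regression in Python/NumPy, matching the update above (θ0 left unregularized, as in the j = 1, …, n loop):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    # theta_j := theta_j - alpha * [ (1/m) * sum_i (h(x^(i)) - y^(i)) x_j^(i) + (lam/m) * theta_j ]
    m = X.shape[0]
    grad = (X.T @ (X @ theta - y)) / m   # unregularized gradient for all j
    reg = (lam / m) * theta              # regularization term
    reg[0] = 0.0                         # do not shrink the intercept theta_0
    return theta - alpha * (grad + reg)
```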
Summary

• Logistic Regression

• Deep understanding of Classification problems

• Hypothesis for Logistic regression

• Cost function for Logistic regression

• Regularization problem
39
Questions

40
