Lecture - 42
Logistic regression
Just to recap what we have seen before: we have talked about the binary classification problem. We said classification is the task of identifying what category a new data point, or an observation, belongs to. There could be many categories to which the data could belong, but when the number of categories is 2, it is what we call the binary classification problem. We can also think of binary classification problems as simple yes or no problems, where you either say something belongs to a particular category, or no, it does not belong to that category.
So, you notice that these are quantitative numbers while these are qualitative features. Now, you could convert these all into quantitative features by coding yes as 1 and no as 0; then those also become numbers. This is a very crude way of doing it; there might be much better ways of coding qualitative features into quantitative features and so on. You also have to remember that there are some data analytics approaches that can directly handle these qualitative features without a need to convert them into numbers.
So, you should keep in mind that that is also possible. Now that we have these features, we go back to our pictorial understanding of these things. Just for the sake of illustration, let us take this example, where we have this 2-dimensional data. So, here we would say X is (x1, x2); two variables, let us say x1 is here and x2 is here. So, the data is organized like this. Now let us assume that all the circular data belong to one category and all the starred data belong to another category. Notice that the circled data would have a certain x1 and a certain x2 and, similarly, the starred data would have a certain x1 and a certain x2. So, in other words, all values of x1 and x2 such that the data is here belong to one class and, such that the data is here, belong to another class.
Now, if we were able to come up with a hyperplane such as the one that is shown here, we learnt from our linear algebra module that each side of this hyperplane is a half space and, depending on the way the normal is defined, you would have a positive value on one side of the hyperplane and a negative value on the other. So, this is something that we have dealt with in detail in one of the linear algebra classes.
So, I could have a data point whose true value is here; however, because of noise it could slip to the other side and so on. As I come closer and closer to this boundary, the probability, or the confidence with which I can say it belongs to a particular class, intuitively comes down. So, simply saying yes, this data point and this data point belong to class 0 is one answer, but that is pretty crude. The question that logistic regression answers is: can we do something better using probabilities? I would like to say that the probability that this belongs to class 1 is much higher than for this one, because it is far away from the decision boundary. So, how do we do this is the question that logistic regression addresses.
So, one could say yes, this belongs to a class; a more nuanced answer could be that yes, it belongs to that class, but with a certain probability. As the probability gets higher, you feel more confident about assigning that class to the data. On the other hand, if we model through probabilities, we do not want to lose the binary yes or no answer either. So, if I have probabilities for something, I can easily convert them to yes or no answers through some thresholding, which we will see when we describe the logistic regression methodology. So, by modelling this probability we do not lose the ability to categorically say whether a data point belongs to a particular class or not; on the other hand, we get the benefit of a more nuanced answer, instead of just saying yes or no.
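As a minimal sketch (not part of the lecture), thresholding a predicted probability at 0.5 recovers a yes/no answer while the probability itself remains available:

```python
# Minimal illustration: convert a predicted probability into a hard class label.
# The 0.5 threshold is the usual default; it is an assumption, not from the lecture.
def to_label(p, threshold=0.5):
    return 1 if p >= threshold else 0

print(to_label(0.92))  # far from the boundary -> confidently class 1
print(to_label(0.51))  # close to the boundary -> still class 1, but with low confidence
```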
So, you could think of something slightly different and then say, look, instead of saying p(x) is this, let me say log(p(x)) = β0 + β1x. In this case you will notice that it is bounded only on one side. In other words, if I write log(p(x)) = β0 + β1x, I will ensure that p(x) never becomes negative; however, on the positive side p(x) can go to ∞. That again is a problem, because we need to bound p(x) between 0 and 1. So, this is an important thing to remember: it only bounds this on one side.
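Written out (a restatement of the argument above, not from the slide), the one-sided bound looks like this:

```latex
\log\big(p(x)\big) = \beta_0 + \beta_1 x
\;\;\Longrightarrow\;\;
p(x) = e^{\beta_0 + \beta_1 x} > 0,
\quad\text{but } p(x) \to \infty \text{ as } \beta_0 + \beta_1 x \to \infty .
```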
(Refer Slide Time: 11:16)
So, is there something more sophisticated we can do? The next idea is to write p(X) as what is called a sigmoidal function. The sigmoidal function has relevance in many areas; it is the function that is used in neural networks and other very interesting applications.
So, the sigmoid has an interesting form, which is here. Now, let us look at this form right here. I want you to notice two things. Number one, we are still trying to stick this hyperplane equation into the probability expression, because that is the decision surface. Remember, intuitively we are trying to convert that hyperplane into a probability interpretation; that is the reason why we are still sticking to this β0 + β1x. Now let us look at this equation and then see what happens.
Now, what has happened is that this number is put back into this expression and, depending on what value you get, you get a probabilistic interpretation. That is the beauty of this idea. You can rearrange this in this form and then say log(p(X)/(1 − p(X))) = β0 + β1x. The reason why I show you this form is that the left-hand side can be interpreted as the log of the odds ratio, which is an idea that is used in several places. So, that is the connection here.
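For reference, the sigmoid form referred to on the slide and its log-odds rearrangement can be written as:

```latex
p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}
\qquad\Longleftrightarrow\qquad
\log\left(\frac{p(X)}{1 - p(X)}\right) = \beta_0 + \beta_1 X .
```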
So, we still have to figure out what the values of these parameters are. Once we have values for them, any time I get a new point I simply put it into the p(x) equation that we saw in the last slide and get a probability. So, these still need to be identified and, obviously, if we are looking at a classification problem where I have circles on this side and stars on this side, I want to identify these β0, β11 and β12 in such a way that this classification problem is solved. So, I need to have some objective for identifying these values. Remember, in the optimization lectures I told you that all machine learning techniques can be interpreted, in some sense, as optimization problems.
So, here again we come back to the same thing and we say, look, we want to identify this hyperplane, but I need to have some objective function that I can use to identify these values. So, these β0, β11 and β12 will become the decision variables, but I still need an objective function. And as we discussed before, when we were talking about the optimization techniques, the objective function has to reflect what we want to do with this problem. So, here is an objective function; it looks a little complicated, but I will explain it as we go along. I said in the optimization lectures we could look at maximizing or minimizing. In this case, what we are going to say is: I want to find values for β0, β11 and β12 such that this objective is maximized.
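The objective referred to here is the likelihood of the data. Written out explicitly (with m data points and yi the class label of point i), it is:

```latex
\max_{\beta_0,\,\beta_{11},\,\beta_{12}} \;\; \prod_{i=1}^{m} p(x_i)^{\,y_i}\,\big(1 - p(x_i)\big)^{\,1 - y_i}
```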
So, take a minute to look at this objective and see why someone might want to do something like this. When I look at this objective function, let us say I again draw this picture, and let us say I have these points on one side and the other points on the other side. Let us call this class 0 and let us call this class 1. So, what I would like to do is convert this decision function into probabilities. The way I am going to think about this is: when I am on this line I should have the probability equal to 0.5, which basically says that if I am on the line I cannot make a choice between class 0 and class 1.
So, in other words, we can paraphrase this and say: for any data point on this side belonging to class 0, we want to minimize p(x) when x is substituted into that probability function and, for any point on the other side, when we substitute those data points into the probability function, we want to maximize the probability. So, if you look at this here, what it says is: if a data point belongs to class 0, then yi is 0. Whenever a data point belongs to class 0, anything to the power 0 is 1, so the term p(xi)^yi will become 1 and vanish from the product. The factor that remains is of the form 1 − p(xi), and because yi is 0, its exponent 1 − yi is 1, so the only thing that will remain is 1 − p(xi). If we try to maximize 1 − p(xi), that is equivalent to minimizing p(xi). So, for all the points that belong to class 0 we are minimizing p(xi).
Now, let us look at the other case of a data point belonging to class 1, in which case yi is 1, so 1 − yi is 0. The 1 − p(xi) term will be something to the power 0, which becomes 1, so it drops out. The only thing that will remain is p(xi)^yi and, since yi is 1, we are just left with p(xi). And since this data point belongs to class 1, I want this probability to be very large. So, when I maximize this, it will be a large number.
So, you have to think carefully about this equation. There are many things going on here. Number one, this is a multiplication of the probabilities for each of the data points, so it includes data points from class 0 and class 1. The other thing that you should remember is this: let us say I have a product of several numbers; if I am guaranteed that every number is positive, then the product will be maximized when each of these individual numbers is maximized. That is the principle that is also operating here, and that is why we take this product over all the probabilities.
If a data point belongs to class 1, I want its probability to be high, and the individual term is just written as p(xi), so this term is high for class 1. When a data point belongs to class 0, the individual term is 1 − p(xi); I still want this number to be high, which means p(xi) will be small. So, it automatically takes care of both class 0 and class 1. While this looks a little complicated, it is written in this way because it is easier to write it as one expression.
Now let us take a simple example to see how this will look. Let us say in class 0 I have two data points, X1 and X2, and in class 1 I have two data points, X3 and X4. So, this objective function, when it is written out, would look something like this. When we take the points belonging to class 0, I said the only thing that will be remaining is the 1 − p term; so this will be 1 − p(X1), and for the second data point it will be 1 − p(X2). Then for the third data point it will be p(X3) and for the fourth data point it will be p(X4).
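For these four points the objective therefore expands to the following product (writing out explicitly what was just described):

```latex
\big(1 - p(X_1)\big)\,\big(1 - p(X_2)\big)\; p(X_3)\; p(X_4)
```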
So, this would be the expression from here. Now, when we maximize this, since the p(X)s are bounded between 0 and 1, each of these factors is a positive number and, if the product has to be maximized, then each number has to be individually maximized. That means p(X4) has to be maximized; it will go closer and closer to 1, and the closer to 1 it is, the better. So, you notice that X4 would be pushed towards class 1. Similarly, X3 would be pushed towards class 1 and, when you come to the first two factors, you would see that 1 − p(X1) would be a large number if p(X1) is a small number. A small p(X1) basically means that X1 is pushed towards class 0, and similarly X2 is pushed towards class 0. So, this is an important idea that we have to understand in terms of how this objective function is generated.
Now, one simple trick you can do is take that objective function, take the log of it, and then maximize that. If I am maximizing a positive number X, that is equivalent to maximizing log of X as well; whenever one is maximized, the other is also maximized. The reason why you do this is that it turns the product into a sum, which makes it look simpler. So, remember from our optimization lectures, we said we have got to maximize this objective, and we always write the objective in terms of decision variables; the decision variables in this case are β0, β11 and β12, as we described before. So, what happens is that each of these probability expressions, if you recall from the previous slides, will have these 3 variables, and the xi are the points that are already given. So, you simply substitute them into this expression.
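Taking the log turns the product into the sum below, which is the form that is usually maximized in practice (again, just a restatement of the objective above):

```latex
\log L(\beta) = \sum_{i=1}^{m} \Big[ y_i \log p(x_i) + (1 - y_i)\,\log\big(1 - p(x_i)\big) \Big]
```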
So, in general I will have something like β0 + β11x1 + β12x2 + ... + β1nxn; this will be an (n + 1)-variable problem. There are n + 1 decision variables, and these will be identified by solving this optimization problem. And for any new data point, once we put that data point into the p(x) function, the sigmoidal function that we have described, we get the probability that it belongs to class 0 or class 1.
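To make the whole procedure concrete, here is a minimal sketch in Python (not part of the lecture, which uses R later for the case study; the toy data, learning rate and iteration count are assumptions made purely for illustration). It maximizes the log-likelihood by gradient ascent and then scores new points with the sigmoid:

```python
import numpy as np

def sigmoid(z):
    # p(x) = e^z / (1 + e^z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Find beta (intercept plus one coefficient per feature) that maximizes
    the log-likelihood sum( y*log p + (1-y)*log(1-p) ) by gradient ascent."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend a column of 1s for beta_0
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ beta)
        gradient = Xb.T @ (y - p)                    # gradient of the log-likelihood
        beta += lr * gradient / len(y)
    return beta

def predict_proba(X, beta):
    # probability that each row of X belongs to class 1
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return sigmoid(Xb @ beta)

# Two toy clusters in 2-D: class 0 near the origin, class 1 shifted away (made-up data).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0, 0], scale=0.5, size=(20, 2))
X1 = rng.normal(loc=[2, 2], scale=0.5, size=(20, 2))
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(20), np.ones(20)])

beta = fit_logistic(X, y)
print("beta_0, beta_11, beta_12 =", beta)

# A new point far on the class-1 side gets a probability close to 1,
# a point on the class-0 side gets a probability close to 0.
print(predict_proba(np.array([[2.5, 2.5], [-0.5, -0.5]]), beta))
```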
So, this is the basic idea of logistic regression. In the next lecture, I will take a very simple example with several data points to show you how this works in practice, and I will also introduce the notion of regularization, which helps in avoiding overfitting when we do logistic regression. I will explain what overfitting means in the next lecture as well. With that you will have a theoretical understanding of how logistic regression works and, in a subsequent lecture, Dr. Hemanth Kumar will illustrate how to use this technique in R on a case study problem.
So, that will give you the practical experience of how to use logistic regression and how to make sense of the results that you get from applying logistic regression to an example problem.