Lecture – 41
Classification
But here, if I now get a new point, say a point like this, then I would like the classifier to say that this point most likely belongs to the class which is denoted by the star points here, and that is what would happen. Similarly, if I have a point here, I would like that to be classified to this class, and so on. Now, we can think of two types of problems. The simpler type of classification problem is what is called the binary classification problem. Binary classification problems are those where there are only 2 classes, such as yes and no.
For example, if you have data for a process or an equipment, then you might want to simply classify this data as belonging to normal behaviour of the equipment or abnormal behaviour of the equipment. That is binary: if I get new data, I want to say from this data whether the equipment is working properly or not. Another example would be: let us say you get a tissue sample and you characterize that sample through certain means, and then using those characteristics you want to classify it as a cancerous or a non-cancerous sample. So, that is another binary classification example.
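The binary case above can be sketched in code. Here is a minimal perceptron, a classic way to learn a separating hyperplane from labelled data; the 2D "equipment" readings, labels, and learning rate are illustrative assumptions, not from the lecture.

```python
# Minimal perceptron sketch for binary classification.
# Labels: +1 = "normal" equipment behaviour, -1 = "abnormal" (illustrative).

def train_perceptron(points, labels, epochs=20, lr=0.1):
    # w is the hyperplane normal, b the offset; decision is sign(w.x + b)
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:  # misclassified point
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b

def classify(w, b, x):
    # which half space of the learned hyperplane does x fall in?
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

# toy data: "normal" readings near (1, 1), "abnormal" near (-1, -1)
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
          (-1.0, -0.8), (-0.9, -1.1), (-1.2, -1.0)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_perceptron(points, labels)
print(classify(w, b, (0.9, 1.1)))   # a new point near the "normal" cluster
```

On this toy data the perceptron converges quickly because the two clusters are well separated; the same code would loop forever on non-separable data unless the epoch limit stops it.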
So, if we were to draw a hyperplane here and then say this is all class 1 and this is class 2, then these points are classified correctly, these points are classified correctly, and these points are poorly classified or misclassified. Now, if I were to come up with a hyperplane like this instead, you will see by similar arguments that these are points that will be poorly classified.
So, whatever you do, if you try to come up with something like this, then these are points that would be misclassified. So, there is no way in which I can simply generate a hyperplane that would classify this data into 2 regions. However, this does not mean the problem is not solvable; it only means that this problem is not linearly separable, or what I have called here a linearly non-separable problem.
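A small experiment makes the "no hyperplane works" claim concrete. The XOR pattern is the textbook linearly non-separable example; the coarse grid search below over hyperplane parameters (w1, w2, b) is my own illustrative check, and it never gets all four points right.

```python
# XOR is linearly non-separable: no hyperplane sign(w1*x1 + w2*x2 + b)
# classifies all four points correctly. Demonstrate with a coarse grid search.
import itertools

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]              # XOR: equal coordinates -> class -1

best = 0
grid = [i / 4 for i in range(-8, 9)]  # candidate values -2.0 .. 2.0 in steps of 0.25
for w1, w2, b in itertools.product(grid, repeat=3):
    correct = sum(
        1 for (x1, x2), y in zip(points, labels)
        if y * (w1 * x1 + w2 * x2 + b) > 0
    )
    best = max(best, correct)
print(best)   # at most 3 of the 4 XOR points can be classified correctly
```

The grid search is only suggestive, of course; the full impossibility argument for XOR is a short geometric proof, but the experiment matches it.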
So, you need to come up with, not a hyperplane, but, in simple layman terms, a curved surface. Here, for example, if you were able to generate a curve like this, then you could use that as a decision function and say that on one side of the curve I have data points belonging to class 2 and on the other side I have data points belonging to class 1.
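One common way to get such a curved decision function is to add nonlinear features and fit a hyperplane in the enlarged space; the linear boundary there is a curve in the original coordinates. The product feature and hand-picked coefficients below are illustrative choices that happen to separate the XOR pattern.

```python
# A curved decision function via a nonlinear feature map: a hyperplane in the
# augmented space (x1, x2, x1*x2) separates XOR, which no plain hyperplane can.

def decision(x1, x2):
    # linear in (x1, x2, x1*x2); coefficients chosen by hand for illustration
    return x1 + x2 - 2 * (x1 * x2) - 0.5

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]              # the XOR labelling
for (x1, x2), y in zip(points, labels):
    predicted = 1 if decision(x1, x2) > 0 else -1
    print((x1, x2), predicted == y)  # True for all four points
```

This feature-map trick is exactly what kernel methods, mentioned later in the lecture, do implicitly and efficiently.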
So, in general, when we look at classification problems, we look at whether they are linearly separable or not, and from a data science viewpoint the linearly non-separable problem becomes a lot harder, because in a linearly separable problem we know the functional form of the decision function, it is a hyperplane, and that is all we are going to look for in the binary classification case.
So, when I have something like this here, let us say this is class 1, this is class 2, and this is class 3, what I could possibly do is the following. I could draw a hyperplane like this and a hyperplane like this. Once I have these 2 hyperplanes, I have basically 4 combinations that are possible. For example, if I take hyperplane 1 and hyperplane 2 as the 2 decision functions, then I could generate 4 regions: + +, + -, - +, and - -. You know that for a particular hyperplane you have 2 half spaces, a positive half space and a negative half space.
So, when I have something like this here, it basically says that the point is in the positive half space of both hyperplane 1 and hyperplane 2; when I have a point like this, it says the point is in the positive half space of hyperplane 1 and the negative half space of hyperplane 2; and in this case you would say it is in the negative half space of both hyperplanes.
So, now you see that when we go to multi-class problems, if you were to use more than one hyperplane, then depending on the hyperplanes you get a certain number of possibilities. In this case, when I use these 2 hyperplanes, I get basically 4 regions, as I show here. So, in this multi-class problem, where I have 3 classes, I could have data belonging to one class falling here, data belonging to another class falling here, and, let us say, data belonging to the third class falling here, for example.
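The two-hyperplane region idea can be sketched directly: each point gets a two-character sign code from its half space relative to each hyperplane, giving the 4 regions described above. The two hyperplanes here are my own illustrative choices, not the ones in the lecture figure.

```python
# Encode a point by its half space with respect to two hyperplanes:
#   h1: x1 + x2 - 1 = 0      h2: x1 - x2 = 0
# Each point maps to one of 4 region codes: ++, +-, -+, --.

def half_space_code(x1, x2):
    h1 = x1 + x2 - 1          # signed distance-like value for hyperplane 1
    h2 = x1 - x2              # signed distance-like value for hyperplane 2
    return ('+' if h1 > 0 else '-') + ('+' if h2 > 0 else '-')

print(half_space_code(2, 1))    # ++ : positive half space of both
print(half_space_code(0, 2))    # +- : positive side of h1, negative side of h2
print(half_space_code(1, -1))   # -+ : negative side of h1, positive side of h2
print(half_space_code(-2, -1))  # -- : negative half space of both
```

In a 3-class problem, a simple assignment rule could then map, say, ++ to class 1, +- to class 2, and -+ to class 3, with the fourth region left empty or assigned by convention.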
So, this is another important idea that one should remember when we go to multi-class problems. When we solve multi-class problems, we can treat them directly as multi-class problems, or we could solve many binary classification problems and come up with a logic on top of these classification results to label the result as one of the multiple classes that we have.
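One common instance of the "many binary classifiers plus a logic on top" approach is the one-vs-rest scheme: one linear score per class, with the most confident binary classifier winning. The weight vectors below are hand-picked for three well-separated toy clusters, purely for illustration.

```python
# One-vs-rest sketch: one (w, b) per class; score_k(x) = w_k . x + b_k,
# and the decision logic picks the class with the highest score.

classifiers = {
    "class 1": ([1.0, 0.0], 0.0),    # fires for points with large x1
    "class 2": ([-1.0, 0.0], 0.0),   # fires for points with small x1
    "class 3": ([0.0, 1.0], 0.0),    # fires for points with large x2
}

def predict(x):
    scores = {
        label: w[0] * x[0] + w[1] * x[1] + b
        for label, (w, b) in classifiers.items()
    }
    return max(scores, key=scores.get)   # the "logic on the other side"

print(predict((3.0, 0.0)))   # class 1
print(predict((-3.0, 0.0)))  # class 2
print(predict((0.0, 3.0)))   # class 3
```

In practice each (w, b) would be learned by training a binary classifier of that class against all the others; here they are fixed by hand to keep the sketch self-contained.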
So, kernel methods are important when we have linearly non-separable problems. With this, I just wanted to give you a brief idea of the conceptual underpinnings of classification algorithms. The math behind all of this, at least some of it, is what we will try to teach in this course, and in more advanced machine learning courses you will see the math behind all of this in much more detail.
Thank you.