
18.1 - How "Classification" Works - mp4

knn

Uploaded by

NAKKA PUNEETH

So in this chapter, we will learn what classification is, what regression is, and we'll learn a very simple yet very powerful machine learning technique called k-nearest neighbors, often abbreviated as KNN. But before we get to all of that, let's first take classification itself and understand how it works.

Let's take our Amazon Fine Food Reviews data set, which we saw earlier as an example. We will keep using it as a running real-world example throughout this course, so that you can see how well different algorithms perform and why we learn new algorithms. We'll also use MNIST, which you may recall from when we learned about PCA and t-SNE. Just to quickly recap, in MNIST you're given a vector representation of images of handwritten digits, and you have to determine whether the digit is 0, 1, 2, 3, and so on up to 9. So we'll keep using the Amazon Fine Food Reviews data set and the MNIST data set as running examples when we learn different algorithms. Before we go further, what does classification actually mean?
To put it very simply, let's use our Amazon Fine Food Reviews data to understand classification. We have many reviews, roughly 360K of them. For each review, we use the review text, because we felt that the review text is the most informative signal (or feature, or variable). We converted each text into a vector using one of the several techniques we learned: bag of words, TF-IDF, or Word2Vec. So now, for each review, I have a vector, and for each review I also have a label saying whether it is a positive review or a negative review. So what is classification all about?
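To make the data concrete, here is a minimal sketch of how a review text could become a vector with a plain bag-of-words count, and how each vector is paired with its label. The two toy reviews, the whitespace tokenizer, and the names `tokenize`, `vocabulary`, `bag_of_words`, and `dataset` are assumptions made up for illustration; they are not from the actual data set or the course code.

```python
from collections import Counter

# Hypothetical toy reviews with labels; NOT taken from the real data set.
reviews = [
    ("great taste great price", "positive"),
    ("stale and bad taste", "negative"),
]

def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

# Vocabulary: every distinct word across all reviews, in sorted order.
vocabulary = sorted({word for text, _ in reviews for word in tokenize(text)})

def bag_of_words(text):
    """Return one count per vocabulary word, in vocabulary order."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

# Each review becomes a (vector, label) pair.
dataset = [(bag_of_words(text), label) for text, label in reviews]

for vector, label in dataset:
    print(vector, label)
```

Every review ends up as a vector of the same length (one entry per vocabulary word), which is what lets an algorithm treat reviews as points it can compare.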
The problem of classification is this, and it is very, very important. It is the crux of all the machine learning algorithms that fall under classification. We have our roughly 364K reviews, and for each review we have a vector representation and we also know whether the review is positive or negative. That is the data we have.

Classification is all about this: given a new review text, in our case, determine or predict whether the review is positive or negative. That is the task of classification. Why is it called classification? Given a new review, let's call it r_q, the query review, you are asking: tell me whether this is a positive review or a negative review. So we are classifying a new review into one of two classes. The first class is the positive class, and the second is the negative class. That's why it's called classification. Very simple.

Now let's put all of this mathematically. Classification can be thought of as finding a function. Actually, most of machine learning is about finding a function. Let me connect those two dots. Imagine each review is represented by a vector called x, okay?
Instead of v_i, the notation we will use is x_i, which is the standard notation. Every data point we are given, which is every review in our case, is represented by a mathematical vector called x_i. Now, given an x, I want to find a function f that will return a y for me. What is y here? y is whether the review is positive or negative. What is x here? x is my review text, represented as a vector. So, mathematically speaking, the objective of classification is to find this magic function f such that, when I apply f to a review text, I get y, which says whether the review is positive or not. This is the crux, which in English basically means the central concept, for those of you who didn't know the word. It is the central concept of all of machine learning, and specifically of classification. It's all about what we do when we are given a new review.
Let's call the new review x_q. This is the query review. Why is it called a query review? Because you're querying your machine learning algorithm, saying: this is the review that I have, now tell me, what is its class? It will return a y_q, and this y_q should say whether the review is positive or negative. This is the whole objective. This is the problem we are trying to solve. Most problems in machine learning look like this. Not all of them do; there are problems where it's not exactly this. But classification is the case where y_q takes one of a few classes. In our case, the classes are the positive class and the negative class. What if y_q is something else? We'll come to that a little later in this chapter. For now, all you have to remember is that this is the crux of machine learning.

So how do classification algorithms work? Let's assume this is your classification algorithm. We don't understand it yet, so let's keep it as a black box for now.
Let's not worry about what is inside. You give it something called a data set, also called a training data set. The training data set contains pairs of x_i and y_i, many, many such pairs. Each pair says: if this is the x_i, this is the corresponding y_i. Let's say i goes from 1 to, say, 100K. To put it in our perspective, you give it a lot of data, where each example says: this is my review, and this is the result, whether it's positive or not. The algorithm takes all of this as input. It is called the training data because the algorithm trains on this data. Here you're giving both x_i and y_i. Now the algorithm learns the function f.
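Written out compactly, the setup described so far could look like the following. The symbols x_i, y_i, f, x_q, y_q and n are the ones used in the lecture; the name \mathcal{D} for the training data set is an assumption added here for convenience.

```latex
% Training data: n labeled pairs (\mathcal{D} is an assumed name for the data set)
\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{n},
\qquad y_i \in \{\text{positive}, \text{negative}\}

% Goal of training: learn a function f that maps each x_i to its y_i
f(x_i) = y_i \quad \text{for } i = 1, \dots, n

% Use of the learned function: predict the class of a query point x_q
y_q = f(x_q)
```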
The algorithm learns because it sees a lot of examples of x_i and y_i. By looking at all these examples, it learns this function. After it has learned the function f, if I give it any new point, it will return the corresponding class. This is the crux of classification. This is how classification works.

The stage where the algorithm learns from the data is called the training stage. The stage where we give it a new point is called the testing or evaluation stage. What are we doing in the training stage? We give the algorithm some data and say, here is the mapping: here is my x_1 and my y_1, here is my x_2 and my y_2, here is my x_3 and y_3, and so on up to x_n and y_n. Now, using all of this data, try to learn the mapping, that is, a function f such that f(x_i) = y_i. That's what the algorithm tries to do. It trains on this data and learns this function. Once it learns this function, our training is over.

Now, in the test or evaluation stage, we give it new points that it has not seen. Remember, this x_q may not be in the training data set. If you give it new data and it can predict y_q accurately, then you say: wow, I've learned the function that I care about. Of course, machine learning is not perfect. It will not learn the perfect function here. It will try to do its best job using various techniques. Some techniques will learn better functions and some will learn worse ones. But this is the core idea of the whole of classification.
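The two stages can be sketched in a few lines of Python. The lecture deliberately keeps the algorithm a black box at this point, so the block below swaps in a 1-nearest-neighbor rule purely as a stand-in; the class name `NearestNeighborClassifier`, the 2-D vectors, and the labels are all hypothetical and chosen only to keep the example small.

```python
import math

class NearestNeighborClassifier:
    """Stand-in black box: predicts the label of the closest training point."""

    def fit(self, X, y):
        # Training stage: this simple method just stores the (x_i, y_i) pairs.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict(self, x_q):
        # Return the label of the training vector nearest to x_q (Euclidean distance).
        distances = [math.dist(x_i, x_q) for x_i in self.X]
        nearest = distances.index(min(distances))
        return self.y[nearest]

# Hypothetical 2-D vectors standing in for vectorized reviews.
X_train = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
y_train = ["positive", "positive", "negative", "negative"]

# Training stage: the algorithm sees both x_i and y_i.
f = NearestNeighborClassifier().fit(X_train, y_train)

# Test stage: query points that are not in the training data.
x_q = [2.5, 0.5]
y_q = f.predict(x_q)
print(y_q)

# Evaluation: compare predictions with known labels for held-out points.
X_test = [[2.5, 0.5], [0.5, 2.5]]
y_test = ["positive", "negative"]
correct = sum(f.predict(x) == y for x, y in zip(X_test, y_test))
accuracy = correct / len(y_test)
print(accuracy)
```

The point of the sketch is the shape of the workflow, not the particular rule inside `predict`: any classification technique fits the same pattern of learning from labeled pairs and then answering queries on unseen points.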
