38.7 Matrix Factorization for Feature Engineering

So matrix factorization can also be used for feature engineering, as follows. Imagine I have a matrix A which contains my user-item ratings, where A is an n × m matrix, n is the number of users and m is the number of items. Suppose I want to generate a vector representation for each user i and also for each item j; let's say a d-dimensional vector for each user and a d-dimensional vector for each item. Remember, this is important: I want to create these representations by leveraging the ratings data, the entries A_ij. How do I do it? Suppose I can decompose my matrix A into the product of two matrices, A = B C^T, where B is an n × d matrix and C is an m × d matrix.
Now, one thing you'll notice is that B is an n × d matrix, which means it has n rows and d columns. If you take the ith row of B, it is a d-dimensional vector, and because n is the number of users, there is one such row corresponding to each user. So I can take the row vectors of B, b_i, as my user vectors: u_i = b_i belongs to R^d, and there is one vector for every user. Similarly, my matrix C is m × d, where m is the number of items, so there is one row corresponding to each item. I can take c_i, the ith row of C, as the representation of the ith item, which is also in R^d. So just by performing matrix factorization, I can arrive at a d-dimensional representation for every user and every item.
This d can be anything you like. The only constraint from matrix decomposition is that d has to be greater than zero, and d has to be less than or equal to min(m, n). So if you want a three-dimensional representation, you can get it; if you want a ten-dimensional representation, you can get it. You can get any d-dimensional representation as long as d satisfies these two constraints.
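To make this concrete, here is a minimal sketch in Python. The lecture does not fix a particular factorization algorithm; a truncated SVD is one standard way to obtain A ≈ B C^T, and the toy ratings matrix and the choice d = 2 below are my own illustrative assumptions, not values from the lecture.

import numpy as np

# Toy user-item ratings matrix A: n = 4 users, m = 3 items (illustrative values).
A = np.array([
    [5.0, 3.0, 0.0],
    [4.0, 0.0, 1.0],
    [1.0, 1.0, 5.0],
    [0.0, 1.0, 4.0],
])

n, m = A.shape
d = 2  # must satisfy 0 < d <= min(n, m)

# Truncated SVD: A ≈ U_d diag(s_d) V_d^T. Fold the singular values into B
# so that A ≈ B @ C.T, with B of shape (n, d) and C of shape (m, d).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
B = U[:, :d] * s[:d]   # row i of B = d-dimensional vector for user i
C = Vt[:d, :].T        # row j of C = d-dimensional vector for item j

print(B.shape, C.shape)  # (4, 2) (3, 2)
user_0 = B[0]            # u_0 in R^d
item_2 = C[2]            # item vector for item index 2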
Now, the most important point is this: the d-dimensional representations you arrive at using matrix factorization, for both users and items, have the following underlying behavior. If two users u_i and u_j are very similar in their ratings of products, then the distance between the vectors u_i and u_j, which are the d-dimensional vectors you got from the factorization, is also going to be small. Similarly, if two items are very similar based on the ratings data, then the distance between their item vectors c_i and c_j is also going to be very small. Why? Because of how we arrived at these d-dimensional representations: we took the ratings data, we decomposed it, and now we are using the rows of B and C as our representations. Let's not forget that fundamental fact. That is the high-level idea.
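As a quick illustrative check of this behavior, continuing the hypothetical B from the sketch above: users 0 and 1 rated items similarly in the toy matrix, while user 2 did not, so u_0 should come out closer to u_1 than to u_2.

# Distances between factorized user vectors (rows of B from the sketch above).
dist_01 = np.linalg.norm(B[0] - B[1])
dist_02 = np.linalg.norm(B[0] - B[2])
print(dist_01 < dist_02)  # expected: True for this toy data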
So given any ratings data like this, you can arrive at d-dimensional representations for users and items very quickly. But this matrix factorization idea need not be limited to ratings data. Later in this chapter, we will look at how this core idea of matrix factorization for feature engineering can be extended beyond recommender systems. We will see how it can be used to get word vectors that are very similar, not exactly the same, but similar to Word2Vec. We studied Word2Vec a long time back, at the very start, when we learnt about text data; word vectors are a way to featurize your text. Similarly, we will also see how matrix factorization is related to an idea called eigenfaces, which was one of the early techniques for face recognition. It is no longer used extensively, but eigenfaces is nothing but matrix factorization, or SVD, used for feature engineering: it is a feature representation, a feature engineering scheme, for face images. So we will see more concrete examples later in this chapter of how this one core mathematical tool can be used to arrive at word vectors and at features for images, especially face images.
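As a small preview of that word-vector connection (my own illustrative sketch, not the exact construction covered later in the chapter), the same recipe applies if you swap the ratings matrix for a word-word co-occurrence matrix: factorize it and take the rows of the left factor as word vectors.

import numpy as np

# Hypothetical toy word-word co-occurrence counts for 4 words (illustrative only).
X = np.array([
    [0.0, 8.0, 1.0, 0.0],
    [8.0, 0.0, 2.0, 1.0],
    [1.0, 2.0, 0.0, 7.0],
    [0.0, 1.0, 7.0, 0.0],
])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
W = U[:, :2] * s[:2]  # row i = 2-dimensional vector for word i
# Words with similar co-occurrence patterns end up with nearby rows in W.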
