UNIT III
User-based collaborative filtering: This method identifies users who have similar
preferences or behavior to the target user and recommends items that they have
liked or interacted with. For example, if User A and User B have both liked similar
movies in the past, and User A likes a new movie, the system might recommend that
movie to User B.
∙ Data Collection: The system collects data on user interactions with items, such as
ratings, purchases, clicks, likes, etc. This data is typically represented in a matrix
where rows represent users and columns represent items. The entries in the matrix
represent the users' interactions with the items (e.g., ratings, binary indicators of
likes, etc.).
∙ Similarity Calculation: The system then calculates the similarity between users or
items based on their past interactions. This is typically done using similarity metrics
such as cosine similarity, Pearson correlation, or Jaccard similarity.
∙ User-based Collaborative Filtering: For user-based collaborative filtering, the
system calculates the similarity between users. Users who have similar interaction
patterns are considered similar to each other.
∙ Item-based Collaborative Filtering: For item-based collaborative filtering, the
system calculates the similarity between items. Items that are frequently interacted
with by the same users are considered similar to each other.
∙ Neighborhood Selection: Based on the calculated similarities, the system selects a
neighborhood of users or items that are most similar to the target user or item.
∙ Prediction: Once the neighborhood is identified, the system predicts the user's
preference for items they have not yet interacted with. This prediction is typically based
on a weighted average of the ratings or interactions of the neighbors, where the weights
are the similarities between the target user or item and the neighbors.
∙ Recommendation Generation: Finally, the system generates recommendations by
selecting the top-rated items from the predicted preferences for the target user.
∙ Feedback Incorporation: As users interact with the recommended items, their
feedback is incorporated back into the system to update the recommendations for
future users.
This process iterates continuously, with the system refining its recommendations as
more data becomes available and as users' preferences evolve over time. Collaborative
filtering is powerful because it does not rely on explicit knowledge about items or
users but instead learns from the implicit feedback provided by user interactions.
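As a sketch, the pipeline above might look as follows in Python. This is illustrative only: the cosine similarity over co-rated items and the neighborhood size k are assumed choices, not prescribed by the notes.

```python
import numpy as np

def user_based_predict(R, u, j, k=2):
    """Predict user u's rating for item j from ratings matrix R
    (np.nan marks missing ratings) using user-based CF."""
    mask_u = ~np.isnan(R[u])
    sims = np.zeros(R.shape[0])
    for v in range(R.shape[0]):
        if v == u:
            continue
        common = mask_u & ~np.isnan(R[v])          # co-rated items
        if common.sum() == 0:
            continue
        a, b = R[u, common], R[v, common]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        sims[v] = a @ b / denom if denom else 0.0
    # neighborhood: top-k most similar users who have rated item j
    rated_j = ~np.isnan(R[:, j])
    neighbors = [v for v in np.argsort(-sims) if rated_j[v] and v != u][:k]
    num = sum(sims[v] * R[v, j] for v in neighbors)
    den = sum(abs(sims[v]) for v in neighbors)
    return num / den if den else np.nan
```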
∙ Matrix Factorization: Matrix factorization techniques decompose the ratings matrix
into user and item latent factors. These latent factors capture the underlying patterns
in the data and are used to make predictions about user preferences.
∙ Factorization Machines: Factorization machines generalize matrix factorization
techniques by incorporating additional user and item features into the model.
They model interactions between user and item features to make more accurate
predictions.
The choice of algorithm depends on factors such as the characteristics of the data,
the scalability requirements, and the specific recommendation problem being
addressed. Experimentation and evaluation are essential to determine the most
effective algorithm for a given application.
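To make the latent-factor idea concrete, here is a minimal matrix factorization sketch trained with stochastic gradient descent. The factor dimensionality, learning rate, regularization constant, and epoch count are arbitrary assumptions for illustration, not values from the notes.

```python
import numpy as np

def factorize(R, n_factors=2, lr=0.01, reg=0.02, epochs=200):
    """Factor ratings matrix R (np.nan = missing) into user factors P and
    item factors Q so that P @ Q.T approximates the observed entries."""
    m, n = R.shape
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(m, n_factors))
    Q = rng.normal(scale=0.1, size=(n, n_factors))
    observed = [(u, i) for u in range(m) for i in range(n)
                if not np.isnan(R[u, i])]
    for _ in range(epochs):
        for u, i in observed:
            pu = P[u].copy()                    # cache before updating
            err = R[u, i] - pu @ Q[i]
            P[u] += lr * (err * Q[i] - reg * pu)
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q  # predicted rating for (u, i) is P[u] @ Q[i]
```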
13. User-based and item-based CF
Collaborative filtering techniques can be broadly categorized into two main types:
user-based and item-based collaborative filtering. Additionally, hybrid approaches
that combine elements of both user-based and item-based methods are also common.
Here's an overview of each technique:
∙ Nearest Neighbor: This approach identifies a set of users similar to the target user
based on their past interactions with items. Recommendations are then generated
by aggregating the preferences of these similar users. Common similarity metrics
include cosine similarity and Pearson correlation.
∙ User-User Collaborative Filtering: In this technique, the system calculates the
similarity between users and uses this similarity to predict the preferences of the
target user for items they have not yet interacted with. The predictions are typically
generated by averaging or weighted averaging the ratings of similar users for the
target items.
∙ Feature Combination: Hybrid models can also incorporate additional features such
as content-based features or demographic information to enhance recommendation
quality. These features are combined with collaborative filtering models to generate
more personalized recommendations.
Each collaborative filtering technique has its strengths and weaknesses, and the choice
of technique depends on factors such as the characteristics of the data, the scalability
requirements, and the specific recommendation problem being addressed.
Experimentation and evaluation are essential to determine the most effective approach
for a given application.
Suppose we have a small dataset containing information about user ratings for a few
movies:
User     Movie A   Movie B   Movie C   Movie D
User 1   5         4         -         3
User 2   -         3         4         -
User 3   4         -         5         2
User 4   2         5         3         -
In this dataset, users have rated movies on a scale of 1 to 5, where "-" indicates that the
user has not rated that particular movie.
Now, let's say we want to recommend movies to User 1, who has rated "Movie A" with 5
stars and "Movie B" with 4 stars. We can use user-based collaborative filtering to find
other users similar to User 1 and recommend movies that those similar users have rated
highly.
Similarity Calculation:
We calculate the similarity between User 1 and each of the other users using a similarity
metric such as cosine similarity or Pearson correlation coefficient. Using cosine
similarity over the commonly rated movies:
Similarity(User 1, User 2) ≈ 1.000 (User 1 and User 2 have rated only one movie in
common, Movie B, so this estimate is unreliable)
Similarity(User 1, User 3) ≈ 0.997 (User 1 and User 3 have rated two movies in
common, Movies A and D)
Similarity(User 1, User 4) ≈ 0.870 (User 1 and User 4 have rated two movies in
common, Movies A and B)
Neighborhood Selection:
We select Users 3 and 4 as the neighborhood for User 1: both share two rated movies
with User 1, whereas User 2's similarity rests on a single common rating.
Prediction:
We predict User 1's rating for "Movie C" by averaging the ratings of Users 3 and 4 for
that movie. Let's say User 3 rated "Movie C" with 5 stars and User 4 rated it with 3 stars.
So, our predicted rating for "Movie C" for User 1 would be (5 + 3) / 2 = 4 stars.
Recommendation Generation:
Finally, we recommend "Movie C" to User 1 since it has the highest predicted rating
among the movies User 1 has not yet rated.
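The similarity values and the predicted rating above can be verified with a short script. This is a sketch; similarities are computed over co-rated movies only, matching the convention used in the example.

```python
import numpy as np

R = np.array([[5, 4, np.nan, 3],       # User 1
              [np.nan, 3, 4, np.nan],  # User 2
              [4, np.nan, 5, 2],       # User 3
              [2, 5, 3, np.nan]])      # User 4

def cosine_common(a, b):
    """Cosine similarity restricted to co-rated items."""
    common = ~np.isnan(a) & ~np.isnan(b)
    x, y = a[common], b[common]
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))

for v in (1, 2, 3):
    print(f"Sim(User 1, User {v + 1}) = {cosine_common(R[0], R[v]):.3f}")

# Predict User 1's rating for Movie C from neighbors User 3 and User 4
print("Predicted rating:", (R[2, 2] + R[3, 2]) / 2)   # (5 + 3) / 2 = 4.0
```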
Now consider an item-based example with a binary user-product interaction matrix:
User     Product A   Product B   Product C   Product D
User 1   1           0           1           0
User 2   0           1           1           1
User 3   1           1           0           0
User 4   0           1           0           1
In this dataset, each row represents a user, and each column represents a product. The
entries of the matrix indicate whether a user has purchased or interacted with a
particular product (1 indicates interaction, 0 indicates no interaction).
Now, let's say we want to recommend products similar to "Product A" to User 1.
Similarity Calculation:
We calculate the similarity between "Product A" and each of the other products based
on the interactions of users who have interacted with both products. For example:
Similarity(Product A, Product D) = 0 (no user has interacted with both "Product A" and
"Product D")
Neighborhood Selection:
We form the neighborhood of "Product D" from the products most similar to it; for
example, "Product D" and "Product C" are similar because User 2 has interacted with
both.
Prediction:
We predict User 1's likelihood of interacting with "Product D" by aggregating User 1's
interactions with the products in "Product D's" neighborhood, weighted by the
similarities between "Product D" and those products. The prediction could be computed
using a weighted average or another aggregation method.
Recommendation Generation:
Finally, we recommend "Product D" to User 1 since it has the highest predicted
likelihood of interaction among the products similar to those already interacted with by
User 1.
This example demonstrates how item-based collaborative filtering can be applied in the
e-commerce domain to recommend products to users based on the similarity of their
interactions with other products.
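For a binary matrix like the one above, item-to-item cosine similarity reduces to a couple of numpy operations. A minimal sketch:

```python
import numpy as np

# Rows = users, columns = Products A-D (1 = interaction, 0 = none)
M = np.array([[1, 0, 1, 0],
              [0, 1, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 1]])

norms = np.linalg.norm(M, axis=0)                # column (item) norms
item_sim = (M.T @ M) / np.outer(norms, norms)    # cosine between item columns
print(np.round(item_sim, 2))
# Similarity(Product A, Product D) = 0: no user interacted with both
```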
Suppose we have a small dataset representing user ratings for movies.
Each row represents a user, and each column represents a movie. A rating of 0 indicates
that the user has not rated the movie.
Now, let's say we want to recommend movies to User E based on collaborative filtering,
i.e., by finding users with similar tastes and recommending movies that they liked but
User E hasn't seen yet.
Steps to follow:
Compute Similarity: Calculate the similarity between User E and other users using a
similarity metric such as cosine similarity.
Find Neighbors: Select the top-k most similar users to User E as neighbors.
Generate Recommendations: Recommend movies to User E based on the movies liked by
their neighbors that User E hasn't seen yet.
Compute Similarity:
Calculate the cosine similarity between User E and each of the other users, for example
between User E and User A.
Find Neighbors:
Suppose we choose the top 2 most similar users as neighbors. Let's say User A and User
D are the closest neighbors.
Generate Recommendations:
Recommend movies liked by User A and User D that User E hasn't seen yet. For example,
User A liked Movie 1 and Movie 2, and User D liked Movie 5 and Movie 4. So, we can
recommend Movie 1, Movie 2, Movie 5, and Movie 4 to User E.
These recommended movies are based on the assumption that users with similar tastes
tend to like similar items. Collaborative filtering leverages this idea to provide
personalized recommendations to users.
Predicting Ratings with Neighborhood-Based Methods
For the m × n ratings matrix R = [ruj] with m users and n items, let Iu denote the set of
item indices for which ratings have been specified by user (row) u. For example, if the
ratings of the first, third, and fifth items (columns) of user (row) u are specified
(observed) and the remaining are missing, then we have Iu = {1, 3, 5}. Therefore, the set
of items rated by both users u and v is given by Iu ∩ Iv. For example, if user v has rated
the first four items, then Iv = {1, 2, 3, 4}, and Iu ∩ Iv = {1, 3, 5} ∩ {1, 2, 3, 4} = {1, 3}. It is
possible (and quite common) for Iu ∩ Iv to be an empty set because ratings matrices are
generally sparse. The set Iu ∩ Iv defines the mutually observed ratings, which are used
to compute the similarity between the uth and vth users for neighborhood computation.
In this case, the ratings of five users 1 . . . 5 are indicated for six items denoted by 1 . . . 6.
Each rating is drawn from the range {1 . . . 7}. Consider the case where the target user
index is 3, and we want to make item predictions on the basis of the ratings in the table.
We need to compute the predictions ˆr31 and ˆr36 of user 3 for items 1 and 6 in order to
determine the top recommended item.
The first step is to compute the similarity between user 3 and all the other users.
Two possible ways of computing similarity are shown in the last two columns of the
same table: the second-last column shows the similarity based on the raw cosine
between the ratings, and the last column shows the similarity based on the Pearson
correlation coefficient.
For example, the values of Cosine(1, 3) and Pearson(1, 3) are computed using the
following formulas:
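$$\mathrm{Cosine}(u, v) = \frac{\sum_{k \in I_u \cap I_v} r_{uk}\, r_{vk}}{\sqrt{\sum_{k \in I_u \cap I_v} r_{uk}^2}\;\sqrt{\sum_{k \in I_u \cap I_v} r_{vk}^2}}$$

$$\mathrm{Pearson}(u, v) = \frac{\sum_{k \in I_u \cap I_v} (r_{uk} - \mu_u)(r_{vk} - \mu_v)}{\sqrt{\sum_{k \in I_u \cap I_v} (r_{uk} - \mu_u)^2}\;\sqrt{\sum_{k \in I_u \cap I_v} (r_{vk} - \mu_v)^2}}$$

where μu and μv are the mean ratings of users u and v, and the sums run over the items
in Iu ∩ Iv that both users have rated.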
The Pearson and raw cosine similarities of user 3 with all other users are illustrated in
the final two columns of the table. The Pearson correlation coefficient is much more
discriminative, and the sign of the coefficient provides information about similarity and
dissimilarity.
The top-2 closest users to user 3 are users 1 and 2 according to both measures. By using
the Pearson-weighted average of the raw ratings of users 1 and 2, the following
predictions are obtained for user 3 with respect to her unrated items 1 and 6:
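In general form, with Pu(j) denoting the closest peers of user u who have rated item j,
this prediction is the similarity-weighted average of the neighbors' raw ratings:

$$\hat{r}_{uj} = \frac{\sum_{v \in P_u(j)} \mathrm{Sim}(u, v)\, r_{vj}}{\sum_{v \in P_u(j)} |\mathrm{Sim}(u, v)|}$$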
Let us now examine the impact of mean-centered ratings on the prediction. The mean-
centered ratings are illustrated in the table below.
The corresponding predictions with the mean-centered equation are as follows:
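In general form, each neighbor's rating is first centered on that neighbor's own mean
rating, and the target user's mean is added back after averaging:

$$\hat{r}_{uj} = \mu_u + \frac{\sum_{v \in P_u(j)} \mathrm{Sim}(u, v)\,(r_{vj} - \mu_v)}{\sum_{v \in P_u(j)} |\mathrm{Sim}(u, v)|}$$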
Thus, the mean-centered computation also provides the prediction that item 1 should
be prioritized over item 6 as a recommendation to user 3. There is, however, one crucial
difference from the previous recommendation.
In this case, the predicted rating of item 6 is only 0.86, which is less than all the other
items that user 3 has rated. This is a drastically different result than in the previous
case, where the predicted rating for item 6 was greater than all the other items that user
3 had rated.
Upon visually inspecting Table 1 (or Table 2), it is indeed evident that item 6 ought to be
rated very low by user 3 (compared to her other items), because her closest peers
(users 1 and 2) have also rated it lower than their other items. Thus, the mean-centering
process enables a much better relative prediction with respect to the ratings that have
already been observed.
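A compact sketch of this mean-centered prediction in Python. The helper below assumes the similarities and the neighbor set have already been computed; it is illustrative, not code from the source.

```python
import numpy as np

def predict_mean_centered(R, sims, u, j, neighbors):
    """Mean-centered user-based prediction of R[u, j].
    sims[v]  : precomputed similarity between users u and v
    neighbors: indices of u's peers who have rated item j."""
    mu_u = np.nanmean(R[u])
    num = sum(sims[v] * (R[v, j] - np.nanmean(R[v])) for v in neighbors)
    den = sum(abs(sims[v]) for v in neighbors)
    return mu_u + num / den if den else mu_u
```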
Example 2
1. Calculate Similarity: Compute the cosine similarity between User E and each of the
other users.
2. Sort Users: Sort the users based on their similarity with User E in descending order.
For example, after sorting, the similarities between User E and the other users might be:
∙ User A: 0.947
∙ User D: 0.802
∙ User C: 0.576
∙ User B: 0.465
3. Select Neighbors: Select the top-k users with the highest similarity scores as
neighbors.
For example, if we choose k = 2, then the selected neighbors for User E would be User A
and User D.
These recommended movies are based on the assumption that users with similar
tastes tend to like similar items. User-based neighborhood selection leverages this
idea to provide personalized recommendations to users.
In item-based models, peer groups are constructed in terms of items rather than users.
Therefore, similarities need to be computed between items (or columns in the ratings
matrix).
Before computing the similarities between the columns, each row of the ratings matrix
is centered to a mean of zero. As in the case of user-based ratings, the average rating of
each item in the ratings matrix is subtracted from each rating to create a mean-centered
matrix.
First, the similarities between items are computed after adjusting for mean-centering.
The mean-centered ratings matrix is illustrated in Table 2. The corresponding adjusted
cosine similarities of each item to items 1 and 6, respectively, are indicated in the final
two rows of the table.
For example, the value of the adjusted cosine between items 1 and 3, denoted by
AdjustedCosine(1, 3), is computed using the following formula:
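In general form, with sui = rui − μu denoting the user-mean-centered ratings and
Ui ∩ Uj the set of users who have rated both items i and j:

$$\mathrm{AdjustedCosine}(i, j) = \frac{\sum_{u \in U_i \cap U_j} s_{ui}\, s_{uj}}{\sqrt{\sum_{u \in U_i \cap U_j} s_{ui}^2}\;\sqrt{\sum_{u \in U_i \cap U_j} s_{uj}^2}}$$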
Therefore, the weighted average of the raw ratings of user 3 for items 2 and 3 is used to
predict the rating ˆr31 of item 1, whereas the weighted average of the raw ratings of
user 3 for items 4 and 5 is used to predict the rating ˆr36 of item 6:
Thus, the item-based method also suggests that item 1 is more likely to be preferred by
user 3 than item 6.
However, in this case, because the ratings are predicted using the ratings of user 3
herself, the predicted ratings tend to be much more consistent with the other ratings of
this user.
Let's say we want to select the top-k items most similar to Movie 1 to form the item-
based neighborhood. We'll again use cosine similarity as the similarity metric.
2. Calculate Similarity: Compute the cosine similarity between Movie 1 and each of the
other movies, for example between Movie 1 and Movie 2.
3. Sort Items: Sort the items based on their similarity with Movie 1 in descending order.
For example, after sorting, the similarities between Movie 1 and the other movies might
be:
∙ Movie 2: 0.71
∙ Movie 4: 0.56
∙ Movie 3: 0.48
∙ Movie 5: 0.48
4. Select Neighbors: Select the top-k items with the highest similarity scores as
neighbors.
For example, if a user has interacted with Movie 1, we can recommend Movie 2 and
Movie 4 to the user based on the item-based neighborhood model.
This approach leverages the idea that similar items are likely to be preferred by users
who have interacted with the same item. Item-based neighborhood selection provides
personalized recommendations to users based on the similarity between items.
∙ For example, similar items to a target historical movie might be a set of other
historical movies. In such cases, the user’s own recommendations for the similar set
might be highly indicative of her preference for the target. This is not the case for
user-based methods in which the ratings are extrapolated from other users, who
might have overlapping but different interests. As a result, item-based methods
often exhibit better accuracy.
∙ Diversity refers to the fact that the items in the ranked list tend to be somewhat
different from one another. If the items are not diverse, then a user who does not like
the first item may well dislike all of the other items in the list too.
Comparison of User-Based and Item-Based Models
1. Data Representation: Both models operate on the same user-item rating matrix; the
user-based model compares its rows (user profiles), while the item-based model
compares its columns (item profiles).
2. Neighborhood Selection:
∙ User-Based Model: Selects a subset of users most similar to the target user as
neighbors.
∙ Item-Based Model: Selects a subset of items most similar to the target item as
neighbors.
3. Recommendation Generation: Predictions are aggregated from the ratings of the
selected neighbors: neighboring users in the user-based model, neighboring items in
the item-based model.
4. Sparsity:
∙ Item-Based Model: Better at handling sparsity, since items are usually rated by
multiple users.
5. Cold Start Problem:
∙ User-Based Model: Faces a cold start problem when a new user joins the system
with little or no rating history.
∙ Item-Based Model: Faces the corresponding problem when a new item with few
ratings is added to the system.
1. Predicting the rating value of a user-item combination: This is the simplest and
most primitive formulation of a recommender system. In this case, the missing rating
ruj of the user u for item j is predicted.
2. Determining the top-k items or top-k users: In most practical settings, the
merchant is not necessarily looking for specific ratings values of user-item
combinations.
Rather, it is more interesting to learn the top-k most relevant items for a
particular user, or the top-k most relevant users for a particular item.
The problem of determining the top-k items is more common than that of finding the
top-k users, because the former formulation is used to present ranked lists of
recommended items to users in Web-centric scenarios.
o Moreover, once a few ratings have been entered for a new item, only the
similarities between this item and the ones already in the system need to
be computed.
Types of ratings
Ordinal Ratings: Ordinal ratings involve ranking items based on their perceived
quality or preference, without specifying the magnitude of the difference
between ranks. For example, users might rank products from best to worst or rate
them on a scale such as "excellent," "good," "average," "poor," etc. Ordinal ratings
capture the relative ordering of preferences but do not provide information about the
magnitude of differences between items.
Binary Ratings: Binary ratings are ratings that indicate whether a user likes or
dislikes an item. They are typically represented as binary values (e.g., 1 for like, 0
for dislike) or as true/false values. Binary ratings are simple to collect and interpret
but provide limited information about the strength or intensity of user preferences.
Unary Ratings: Unary ratings capture only a single type of positive feedback, such as
a like, a purchase, or a click; there is no explicit way to express dislike, and the
absence of feedback does not necessarily mean the user disliked the item. For
example, users might give a movie a thumbs-up, and nothing is recorded otherwise.
Unary ratings are straightforward to collect and can be useful for quick feedback but
lack granularity compared to multi-level rating systems.
Each type of rating has its advantages and limitations, and the choice of rating type
depends on factors such as the complexity of the recommendation task, user
preferences, and the specific goals of the recommendation system. Effective
recommendation systems often incorporate multiple types of ratings to capture
diverse aspects of user preferences and behavior.
Neighborhood methods in collaborative filtering recommendation systems involve
several components that work together to provide personalized recommendations
based on the preferences of users or the characteristics of items. Here are the key
components of neighborhood methods:
1. Rating Matrix:
∙ The rating matrix represents the interactions between users and items, where
each cell contains the rating given by a user to an item. It forms the basis for
similarity calculations in neighborhood methods.
2. Similarity Metric
3. Neighborhood Selection
4. Prediction or Recommendation Generation
5. Aggregation Function
6. Normalization
7. Scalability Techniques:
∙ Scalability techniques are employed to handle large-scale datasets efficiently. This
may include dimensionality reduction techniques like Singular Value
Decomposition (SVD).
8. Cold Start Handling:
∙ Cold start handling addresses the challenges posed by new users or items that
have limited or no interaction history. Techniques such as item popularity-
based recommendations, content-based recommendations, or hybrid
approaches are used to provide recommendations in cold start scenarios.
1. Rating Normalization
When it comes to assigning a rating to an item, each user has their own personal scale.
Even if an explicit definition of each of the possible ratings is supplied (e.g., 1 = "strongly
disagree", 2 = "disagree", 3 = "neutral", etc.), some users might be reluctant to give
high/low scores to items they liked/disliked.
Two of the most popular rating normalization schemes that have been proposed to
convert individual ratings to a more universal scale are mean-centering and Z-score.
Mean-centering
The idea of mean-centering is to determine whether a rating is positive or negative by
comparing it to the mean rating. In user-based recommendation, the mean-centered
normalization of a rating rui is given as

$$h(r_{ui}) = r_{ui} - \bar{r}_u$$

where r̄u is the average of the ratings given by user u. In the same way, the
item-mean-centered normalization of rui is given by

$$h(r_{ui}) = r_{ui} - \bar{r}_i$$

where r̄i corresponds to the mean rating given to item i by the users in Ui. This
normalization technique is most often used in item-based recommendation, where a
rating rui is predicted as

$$\hat{r}_{ui} = \bar{r}_i + \frac{\sum_{j \in N_u(i)} w_{ij}\,(r_{uj} - \bar{r}_j)}{\sum_{j \in N_u(i)} |w_{ij}|}$$

where Nu(i) denotes the items rated by user u that are most similar to item i, and
wij is the similarity weight between items i and j.
Consider the following ratings matrix:
User   Item1   Item2   Item3
1      4       3       5
2      2       5       4
3      3       2       3
4      5       4       2
5      1       2       3
1. Compute the mean rating for each user. For User 1, the mean is (4 + 3 + 5) / 3 = 4.
2. Subtract the user's mean rating from each of that user's ratings. For User 1:
New rating (User 1, Item 1) = 4 - 4 = 0
New rating (User 1, Item 2) = 3 - 4 = -1
New rating (User 1, Item 3) = 5 - 4 = 1
So, after mean-centering all users, the dataset would look something like this:
User   Item1   Item2   Item3
1      0       -1      1
2      -1.67   1.33    0.33
3      0.33    -0.67   0.33
4      1.33    0.33    -1.67
5      -1      0       1
For item-mean-centering, the steps are analogous:
1. Compute the mean rating for each item. For Item 1, the mean is
(4 + 2 + 3 + 5 + 1) / 5 = 3.
2. Subtract the mean rating for each item from the corresponding ratings. For Item 1:
New rating (Item 1, User 1) = 4 - 3 = 1
New rating (Item 1, User 2) = 2 - 3 = -1
New rating (Item 1, User 3) = 3 - 3 = 0
New rating (Item 1, User 4) = 5 - 3 = 2
New rating (Item 1, User 5) = 1 - 3 = -2
After mean-centering each item, the dataset would look something like this:
User   Item1   Item2   Item3
1      1       -0.2    1.6
2      -1      1.8     0.6
3      0       -1.2    -0.4
4      2       0.8     -1.4
5      -2      -1.2    -0.4
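Both normalization schemes are one-liners with numpy. A minimal sketch using the example matrix above (np.nanmean would also handle a sparser matrix with missing entries):

```python
import numpy as np

R = np.array([[4, 3, 5],
              [2, 5, 4],
              [3, 2, 3],
              [5, 4, 2],
              [1, 2, 3]], dtype=float)

user_centered = R - np.nanmean(R, axis=1, keepdims=True)  # subtract row means
item_centered = R - np.nanmean(R, axis=0, keepdims=True)  # subtract column means
print(np.round(user_centered, 2))
print(np.round(item_centered, 2))
```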
z-score normalization
In user-based methods, the z-score normalization of a rating rui divides the user-mean-
centered rating by the standard deviation σu of the ratings given by user u:

$$z(r_{ui}) = \frac{r_{ui} - \bar{r}_u}{\sigma_u}$$

A user-based prediction of rating rui using this normalization approach would therefore
be obtained as

$$\hat{r}_{ui} = \bar{r}_u + \sigma_u\,\frac{\sum_{v \in N_i(u)} w_{uv}\, z(r_{vi})}{\sum_{v \in N_i(u)} |w_{uv}|}$$

where Ni(u) denotes the neighbors of user u who have rated item i. Likewise, the
z-score normalization of rui in item-based methods divides the item-mean-centered
rating by the standard deviation σi of the ratings given to item i:

$$z(r_{ui}) = \frac{r_{ui} - \bar{r}_i}{\sigma_i}$$
2. Correlation-based similarity
A measure of the similarity between two objects a and b, often used in information
retrieval, consists in representing these objects in the form of two vectors xa and xb and
computing the Cosine Vector (CV) (or Vector Space) similarity between these vectors:
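$$\mathrm{CV}(a, b) = \cos(x_a, x_b) = \frac{x_a \cdot x_b}{\|x_a\|\,\|x_b\|}$$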
In the context of item recommendation, this measure can be employed to compute user
similarities by representing each user as the vector of his or her ratings.
A popular measure that compares ratings where the effects of mean and variance have
been removed is the Pearson Correlation (PC) similarity:
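$$\mathrm{PC}(u, v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2 \sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}}$$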
Note that this is different from computing the CV similarity on the Z-score normalized
ratings, since the standard deviation of the ratings is evaluated only on the common
items Iuv, not on the entire set of items rated by u and v, i.e. Iu and Iv. The same idea can
be used to obtain similarities between two items i and j, this time by comparing the
ratings made by users that have rated both of these items:
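$$\mathrm{PC}(i, j) = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)^2 \sum_{u \in U_{ij}} (r_{uj} - \bar{r}_j)^2}}$$

where Uij is the set of users who have rated both items i and j.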
While the sign of a similarity weight indicates whether the correlation is direct or
inverse, its magnitude (ranging from 0 to 1) represents the strength of the correlation.
Let's consider a dataset where we have ratings for three movies (Movie A, Movie B, and
Movie C) from three users (User 1, User 2, and User 3):
We compute the cosine similarity between each pair of movies (Movie A vs. Movie B,
Movie A vs. Movie C, and Movie B vs. Movie C) by treating each movie's column of user
ratings as a vector and applying

$$\cos(u, v) = \frac{u \cdot v}{\|u\|\,\|v\|}$$

where the numerator is the dot product of the two rating vectors and the denominator
is the product of their Euclidean norms. Arranging the three pairwise values gives a
3 x 3 similarity matrix.
This matrix quantifies the similarity between each pair of movies based on user
ratings. Higher values indicate greater similarity.
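A pairwise movie-movie cosine matrix can be computed in a few lines. Since the original ratings table is not reproduced here, the matrix below uses hypothetical stand-in ratings purely for illustration:

```python
import numpy as np

# Hypothetical ratings: rows = Users 1-3, columns = Movies A-C
R = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 2.0, 5.0]])

norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)   # cosine between movie columns
print(np.round(sim, 3))                    # 3x3 movie-movie similarity matrix
```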
3. Neighborhood selection
The number of nearest neighbors to select and the criteria used for this selection can
also have a serious impact on the quality of the recommender system. The selection of
the neighbors used in the recommendation of items is normally done in two steps: 1) a
global filtering step where only the most likely candidates are kept, and 2) a per-
prediction step which chooses the best candidates for this prediction.
Pre-filtering of neighbors
∙ In large recommender systems that can have millions of users and items, it is usually
not possible to store the (non-zero) similarities between each pair of users or items,
due to memory limitations.
∙ Moreover, doing so would be extremely wasteful as only the most significant of these
values are used in the predictions.
∙ The pre-filtering of neighbors is an essential step that makes neighborhood-based
approaches practicable by reducing the number of similarity weights to store, and
limiting the number of candidate neighbors to consider in the predictions. There are
several ways in which this can be accomplished:
• Top-N filtering:
For each user or item, only a list of the N nearest neighbors and their respective
similarity weights is kept. To avoid problems with efficiency or accuracy, N should be
chosen carefully. If N is too large, an excessive amount of memory will be required to
store the neighborhood lists and predicting ratings will be slow. On the other hand,
selecting too small a value for N may reduce the coverage of the recommendation
method, which causes some items to never be recommended.
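A minimal sketch of Top-N filtering: for each row of a precomputed similarity matrix, keep the N largest weights and zero out the rest. The helper name and the choice of N are illustrative assumptions:

```python
import numpy as np

def top_n_filter(W, n=2):
    """Keep the n largest weights in each row of similarity matrix W."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)                   # ignore self-similarity
    for u in range(W.shape[0]):
        keep = np.argpartition(W[u], -n)[-n:]  # indices of the n largest
        mask = np.ones_like(W[u], dtype=bool)
        mask[keep] = False
        W[u, mask] = 0.0                       # discard all other weights
    return W
```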
• Threshold filtering:
Instead of keeping a fixed number of nearest-neighbors, this approach keeps all the
neighbors whose similarity weight has a magnitude greater than a given threshold
wmin. While this is more flexible than the previous filtering technique, as only the most
significant neighbors are kept, the right value of wmin may be difficult to determine.
• Negative filtering:
In general, negative rating correlations are less reliable than positive ones. Intuitively,
this is because strong positive correlation between two users is a good indicator of their
belonging to a common group (e.g., teenagers, science-fiction fans, etc.). However,
although negative correlation may indicate membership to different groups, it does not
tell how different these groups are, or whether these groups are compatible for other
categories of items.