
Recommender Systems
Lecture 2: Neighborhood-Based Collaborative Filtering
Imranuddin
Part 1
Neighborhood-based Collaborative
Filtering algorithms

• Also referred to as memory-based algorithms
• These were among the earliest algorithms developed for
Collaborative Filtering
• They are based on the observation that similar users display similar
patterns of rating behavior and similar items receive similar
ratings.
• They are of two types:
• User-Based Collaborative Filtering
• Item-Based Collaborative Filtering
Types of Collaborative Filtering

• 1. User-based collaborative filtering: In this case, the ratings provided by similar
users to a target user A are used to make recommendations for A. The predicted
ratings of A are computed as the weighted average values of these “peer group”
ratings for each item.
• 2. Item-based collaborative filtering: In order to make recommendations for target
item B, the first step is to determine a set S of items, which are most similar to
item B. Then, in order to predict the rating of any particular user A for item B, the
ratings in set S, which are specified by A, are determined. The weighted average
of these ratings is used to compute the predicted rating of user A for item B.

• Note: An important distinction between user-based collaborative filtering and item-based collaborative filtering algorithms is that
the ratings in the former case are predicted using the ratings of neighboring users, whereas the ratings in the latter case are
predicted using the user’s own ratings on neighboring (i.e., closely related) items. In the former case, neighborhoods are defined by
similarities among users (rows of ratings matrix), whereas in the latter case, neighborhoods are defined by similarities among items
(columns of ratings matrix).
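
As a rough sketch of this row/column distinction (not from the lecture), the snippet below stores a hypothetical ratings matrix R with NaN for missing entries and compares two users by their rows and two items by their columns; the placeholder similarity is just a dot product over mutually observed entries, standing in for the Pearson measure defined later.

```python
import math

NAN = float("nan")
# Hypothetical 3x3 ratings matrix R: rows are users, columns are items.
R = [[5.0, 3.0, NAN],
     [4.0, NAN, 2.0],
     [NAN, 3.0, 1.0]]

def observed_dot(x, y):
    # Placeholder similarity: dot product over entries observed in both vectors.
    return sum(a * b for a, b in zip(x, y)
               if not math.isnan(a) and not math.isnan(b))

rows = R                          # user-based CF compares rows of R
cols = list(map(list, zip(*R)))   # item-based CF compares columns of R
print("sim(user 0, user 1):", observed_dot(rows[0], rows[1]))
print("sim(item 0, item 1):", observed_dot(cols[0], cols[1]))
```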
Collaborative Filtering Problem
Formulation
• We assume that the user-item ratings matrix is an incomplete m × n matrix R = [ruj ]
containing m users and n items. It is assumed that only a small subset of the ratings matrix
is specified or observed. Neighborhood-based collaborative filtering algorithms can be
formulated in one of two ways:
• 1. Predicting the rating value of a user-item combination: This is the simplest and most
primitive formulation of a recommender system. In this case, the missing rating ruj of the
user u for item j is predicted.
• 2. Determining the top-k items or top-k users: In most practical settings, the merchant is not
necessarily looking for specific ratings values of user-item combinations. Rather, it is more
interesting to learn the top-k most relevant items for a particular user, or the top-k most
relevant users for a particular item. The problem of determining the top-k items is more
common than that of finding the top-k users. This is because the former formulation is used
to present lists of recommended items to users. In traditional recommender algorithms, the
“top-k problem” almost always refers to the process of finding the top-k items, rather than
the top-k users. However, the latter formulation is also useful to the merchant because it can
be used to determine the best users to target with marketing efforts.
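
A minimal sketch of the top-k items formulation, assuming a predict(user, item) function (such as the neighborhood-based predictor developed later) and a hypothetical set of unrated items:

```python
import heapq

def top_k_items(user, unrated_items, predict, k=5):
    # Rank the user's unrated items by predicted rating and keep the best k.
    return heapq.nlargest(k, unrated_items, key=lambda item: predict(user, item))

# Usage with a toy stand-in for the real prediction function:
fake_scores = {("u3", 1): 6.5, ("u3", 6): 4.0, ("u3", 9): 5.2}
predict = lambda u, j: fake_scores.get((u, j), 0.0)
print(top_k_items("u3", [1, 6, 9], predict, k=2))   # -> [1, 9]
```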
Key Properties of Ratings Matrices

• We assume that the ratings matrix is denoted by R, and it is an m × n
matrix containing m users and n items. Therefore, the rating of user u for
item j is denoted by ruj. Only a small subset of the entries in the ratings
matrix are typically specified.
• The specified entries of the matrix are referred to as the training data,
whereas the unspecified entries of the matrix are referred to as the test
data.
• This definition has a direct analog in classification, regression, and
semi-supervised learning algorithms.
• In those problems, all the unspecified entries belong to a special column, which is
known as the class variable or dependent variable. Therefore, the
recommendation problem can be viewed as a generalization of the problem
of classification and regression.
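
A small sketch of this train/test view of the matrix, assuming a hypothetical partially observed R stored with NaN for unspecified entries: the observed cells play the role of training data, and the missing cells are the positions the recommender must predict.

```python
import math

NAN = float("nan")
R = [[7.0, 6.0, NAN],    # hypothetical, partially observed ratings matrix
     [NAN, 3.0, 3.0]]

# Observed entries = training data; missing positions = what we must predict.
train = {(u, j): r for u, row in enumerate(R)
         for j, r in enumerate(row) if not math.isnan(r)}
test_positions = [(u, j) for u, row in enumerate(R)
                  for j, r in enumerate(row) if math.isnan(r)]
print("training entries:", train)
print("positions to predict:", test_positions)
```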
Ratings Types

1. Continuous ratings
2. Interval-based ratings
3. Ordinal ratings
4. Binary ratings
5. Unary ratings

• Note: Indirect derivation of unary ratings from customer actions is also
referred to as implicit feedback, because the customer does not explicitly
provide feedback.
• Such types of “ratings” are often easier to obtain because users are far more
likely to interact with items on an online site than to explicitly rate them.
Long-Tail Property

• The distribution of ratings among
items often satisfies a property in
real-world settings, which is
referred to as the long-tail
property. According to this
property, only a small fraction of
the items are rated frequently.
Such items are referred to as
popular items. The vast majority of
items are rated rarely. This results
in a highly skewed distribution of
the underlying ratings.
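
A quick illustrative sketch (hypothetical data, not from the lecture) of how the long-tail property shows up when counting ratings per item:

```python
from collections import Counter

# Hypothetical rating log: each entry names the item that received a rating.
rating_log = ["i1", "i1", "i1", "i1", "i1", "i2", "i2", "i3", "i4"]
for item, n in Counter(rating_log).most_common():
    print(f"{item}: {'#' * n}")   # crude bar chart: few popular items, long tail
```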
Predicting Ratings with Neighborhood-Based
Methods

• There are two basic principles used in neighborhood-based
models:
• 1. User-based models: Similar users have similar ratings on the
same item. Therefore, if Alice and Bob have rated movies in a
similar way in the past, then one can use Alice’s observed
ratings on the movie Terminator to predict Bob’s unobserved
ratings on this movie.
• 2. Item-based models: Similar items are rated in a similar way
by the same user. Therefore, Bob’s ratings on similar science
fiction movies like Alien and Predator can be used to predict his
rating on Terminator.
Example

For the m× n ratings matrix R = [ruj ] with m users and n items, let Iu denote the set of item indices for
which ratings have been specified by user (row) u.

For example, if the ratings of the first, third, and fifth items (columns) of user (row) u are specified
(observed) and the remaining are missing, then we have Iu = {1, 3, 5}. Therefore, the set of items
rated by both users u and v is given by Iu ∩ Iv. For example, if user v has rated the first four items,
then Iv = {1, 2, 3, 4}, and Iu ∩ Iv = {1, 3, 5} ∩ {1, 2, 3, 4} = {1, 3}. It is possible (and quite common)
for Iu ∩ Iv to be an empty set because ratings matrices are generally sparse. The set Iu ∩ Iv defines the
mutually observed ratings, which are used to compute the similarity between the uth and vth users for
neighborhood computation. The similarity is typically measured with the Pearson correlation
coefficient, computed over the mutually observed items:

Pearson(u, v) = Σk∈Iu∩Iv (ruk − μu)·(rvk − μv) / [ √(Σk∈Iu∩Iv (ruk − μu)²) · √(Σk∈Iu∩Iv (rvk − μv)²) ]

Strictly speaking, the traditional definition of Pearson(u, v) mandates that the values of μu and μv
should be computed only over the items that are rated both by users u and v.
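
The computation above can be sketched in Python as follows, with each user's ratings stored as an {item: rating} dict. Note that, as is common in practice, the means μu and μv below are taken over each user's own rated items rather than strictly over Iu ∩ Iv, and the rating values in the usage example are hypothetical.

```python
import math

def pearson(ru, rv):
    # Pearson(u, v) over the mutually observed items Iu ∩ Iv.
    common = set(ru) & set(rv)            # Iu ∩ Iv
    if not common:
        return 0.0                        # empty overlap: treat as zero similarity
    mu_u = sum(ru.values()) / len(ru)     # μu over all of u's rated items
    mu_v = sum(rv.values()) / len(rv)
    num = sum((ru[k] - mu_u) * (rv[k] - mu_v) for k in common)
    den = (math.sqrt(sum((ru[k] - mu_u) ** 2 for k in common))
           * math.sqrt(sum((rv[k] - mu_v) ** 2 for k in common)))
    return num / den if den else 0.0

# Usage with the index sets from the example: Iu = {1, 3, 5}, Iv = {1, 2, 3, 4}.
ru = {1: 7.0, 3: 7.0, 5: 5.0}             # hypothetical rating values
rv = {1: 6.0, 2: 7.0, 3: 4.0, 4: 4.0}
print(pearson(ru, rv))                    # computed over Iu ∩ Iv = {1, 3}
```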
Example

The mean-centered rating suj of a user u for item j is
defined by subtracting her mean rating from the raw
rating ruj:

suj = ruj − μu,  where μu = ( Σj∈Iu ruj ) / |Iu|

The overall neighborhood-based prediction function is:

r̂uj = μu + [ Σv∈Pu(j) Sim(u, v) · svj ] / [ Σv∈Pu(j) |Sim(u, v)| ]

where Pu(j) denotes the set of the k most similar users to u who have rated item j.
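
A sketch of this prediction function under the same {item: rating} representation as the previous snippet; the peer group Pu(j) is approximated by the k most similar users to u who have rated item j, and the small ratings dictionary at the bottom is hypothetical.

```python
import math

def pearson(ru, rv):
    # Same Pearson similarity as in the previous sketch.
    common = set(ru) & set(rv)
    if not common:
        return 0.0
    mu_u, mu_v = sum(ru.values()) / len(ru), sum(rv.values()) / len(rv)
    num = sum((ru[k] - mu_u) * (rv[k] - mu_v) for k in common)
    den = (math.sqrt(sum((ru[k] - mu_u) ** 2 for k in common))
           * math.sqrt(sum((rv[k] - mu_v) ** 2 for k in common)))
    return num / den if den else 0.0

def predict(R, u, j, k=2):
    # r̂uj = μu + Σ Sim(u,v)·svj / Σ |Sim(u,v)|, summed over the peer group Pu(j).
    mu_u = sum(R[u].values()) / len(R[u])
    peers = sorted((v for v in R if v != u and j in R[v]),     # users who rated j
                   key=lambda v: pearson(R[u], R[v]), reverse=True)[:k]
    num = den = 0.0
    for v in peers:
        s = pearson(R[u], R[v])
        mu_v = sum(R[v].values()) / len(R[v])
        num += s * (R[v][j] - mu_v)       # Sim(u,v) times mean-centered svj
        den += abs(s)
    return mu_u + num / den if den else mu_u

# Hypothetical ratings; user "u3" has not rated item 1.
R = {"u1": {1: 7.0, 2: 6.0, 6: 5.0},
     "u2": {1: 6.0, 2: 7.0, 6: 4.0},
     "u3": {2: 3.0, 6: 1.0}}
print(predict(R, "u3", 1))
```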
Example

The mean-centered ratings (rating − mean):

For item 1:
User 1: 7 − 5.5 = 1.5
User 2: 6 − 4.8 = 1.2
For item 6:
User 1: 5 − 5.5 = −1.5
User 2: 4 − 4.8 = −0.8

By using the Pearson-weighted average of the raw
ratings of users 1 and 2, the following predictions
are obtained for user 3 with respect to her unrated
items 1 and 6:
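
The slide's mean-centering arithmetic can be checked with a few lines. Only the entries quoted on the slide are used here, since the full ratings matrix appeared as a figure in the original deck, and the means 5.5 and 4.8 are taken from the slide rather than recomputed:

```python
# Entries and means quoted on the slide (the full matrix was a figure).
ratings = {"user1": {1: 7, 6: 5}, "user2": {1: 6, 6: 4}}
means = {"user1": 5.5, "user2": 4.8}

for u in ratings:
    for item, r in sorted(ratings[u].items()):
        print(f"{u}, item {item}: {r} - {means[u]} = {r - means[u]:+.1f}")
# user1, item 1: 7 - 5.5 = +1.5      user1, item 6: 5 - 5.5 = -1.5
# user2, item 1: 6 - 4.8 = +1.2      user2, item 6: 4 - 4.8 = -0.8
```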
Similarity Function Variants
• Several other variants of the similarity function are
used in practice. One variant is to use the cosine
function on the raw ratings rather than the mean-
centered ratings:

RawCosine(u, v) = Σk∈Iu∩Iv ruk · rvk / [ √(Σk∈Iu∩Iv ruk²) · √(Σk∈Iu∩Iv rvk²) ]

• In some implementations of the raw cosine, the
normalization factors in the denominator are based on
all the specified items and not the mutually rated items.
• The reliability of the similarity function Sim(u, v) is
often affected by the number of common ratings
|Iu ∩ Iv| between users u and v.
• When the two users have only a small number of
ratings in common, the similarity function should be
reduced with a discount factor to de-emphasize the
importance of that user pair. This method is referred to
as significance weighting.
• The discount factor kicks in when the number of
common ratings between the two users is less than a
particular threshold β:

DiscountedSim(u, v) = [ min(|Iu ∩ Iv|, β) / β ] · Sim(u, v)
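
A sketch of significance weighting under the threshold formulation above; the base similarity function and the threshold β are parameters, and the discount min(|Iu ∩ Iv|, β)/β leaves the similarity untouched once the two users share at least β ratings:

```python
def discounted_sim(ru, rv, sim_fn, beta=5):
    # DiscountedSim(u, v) = min(|Iu ∩ Iv|, beta) / beta * Sim(u, v)
    n_common = len(set(ru) & set(rv))               # |Iu ∩ Iv|
    return (min(n_common, beta) / beta) * sim_fn(ru, rv)

# Usage with a stand-in similarity of 1.0: two common items, beta = 5.
ru, rv = {1: 7, 3: 7, 5: 5}, {1: 6, 2: 7, 3: 4, 4: 4}
print(discounted_sim(ru, rv, lambda a, b: 1.0))     # -> 0.4
```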
Variants of the Prediction Function
