Collaborative Filtering & Content-Based Recommending: CS 293S. T. Yang Slides Based On R. Mooney at UT Austin
Recommendation Systems
• Systems for recommending items (e.g. books,
movies, music, web pages, newsgroup messages)
to users based on examples of their preferences.
– Examples: Amazon, Netflix. Recommendations increase sales at online stores.
• Basic approaches to recommending:
– Collaborative Filtering (a.k.a. social filtering)
– Content-based
• Recommenders are instances of personalization software:
– adapting to the individual needs, interests, and
preferences of each user through recommending, filtering,
and predicting.
Process of Book Recommendation
[Figure: book titles (Red Mars, Foundation, Jurassic Park, Lost World, Difference Engine) shown alongside "Machine Learning" and "User Profile" components.]
Collaborative Filtering
• Maintain a database of many users’ ratings of a
variety of items.
• For a given user, find other similar users whose
ratings strongly correlate with the current user.
• Recommend items rated highly by these similar
users, but not rated by the current user.
• Almost all existing commercial recommenders use
this approach (e.g. Amazon).
[Figure: several user ratings feeding into an item recommendation.]
Collaborative Filtering
[Figure: each user in the database is a vector of ratings over items A to Z. The active user's vector (A 9, B 3, C unrated, Z 5) is matched by correlation against the database, and item C is extracted as a recommendation from the matching vector.]
Collaborative Filtering Method
1. Weight all users with respect to similarity
with the active user.
2. Select a subset of the users (neighbors) to
use as predictors.
3. Normalize ratings and compute a
prediction from a weighted combination of
the selected neighbors’ ratings.
4. Present items with highest predicted
ratings as recommendations.
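
Steps 2 and 4 of the method above amount to "keep the best few" selections. The following is a minimal sketch, assuming similarity weights and predicted ratings are already available as plain dictionaries; the function names `select_neighbors` and `top_items` and all sample values are hypothetical and do not come from the slides.

```python
import heapq

def select_neighbors(weights, n):
    """Step 2: keep the n users with the highest similarity weight.

    weights maps a user id to its similarity with the active user.
    """
    return heapq.nlargest(n, weights, key=weights.get)

def top_items(predictions, k):
    """Step 4: return the k items with the highest predicted rating."""
    return heapq.nlargest(k, predictions, key=predictions.get)

# Hypothetical similarity weights (step 1) and predicted ratings (step 3).
weights = {"u1": 0.9, "u2": -0.2, "u3": 0.4}
predictions = {"item1": 7.5, "item2": 3.0, "item3": 9.1}

neighbors = select_neighbors(weights, 2)
recommended = top_items(predictions, 2)
```

Selecting a fixed number of neighbors is only one possible policy; a similarity threshold would be an equally reasonable choice under the same description.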
Find users with similar ratings/interests
[Figure: the active user's rating vector r_a (A 9, B 3, C unrated, Z 5) is compared with each rating vector r_u in the user database.]
Similarity Weighting
• Similarity of two rating vectors for active user, a,
and another user, u.
– Pearson correlation coefficient:
$$c_{a,u} = \frac{\operatorname{covar}(r_a, r_u)}{\sigma_{r_a}\,\sigma_{r_u}}$$
– a cosine similarity formula
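Definition: Covariance and Standard Deviation
• Covariance:
$$\operatorname{covar}(r_a, r_u) = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{m}$$
where the mean rating of user x is
$$\bar{r}_x = \frac{\sum_{i=1}^{m} r_{x,i}}{m}$$
• Standard Deviation:
$$\sigma_{r_x} = \sqrt{\frac{\sum_{i=1}^{m} (r_{x,i} - \bar{r}_x)^2}{m}}$$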
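
The Pearson weight can be computed directly from the covariance and standard deviation definitions. This is a minimal sketch, not the slides' implementation: it assumes each user's ratings are stored as a dictionary from item to rating, that the m items are those rated by both users, and that an undefined correlation (no shared items, or a user with constant ratings) is treated as 0. The name `pearson` and the sample ratings are hypothetical.

```python
import math

def pearson(ra, ru):
    """Pearson correlation of two rating dicts over their co-rated items."""
    common = [i for i in ra if i in ru]   # assumption: only co-rated items count
    m = len(common)
    if m == 0:
        return 0.0
    mean_a = sum(ra[i] for i in common) / m
    mean_u = sum(ru[i] for i in common) / m
    covar = sum((ra[i] - mean_a) * (ru[i] - mean_u) for i in common) / m
    sd_a = math.sqrt(sum((ra[i] - mean_a) ** 2 for i in common) / m)
    sd_u = math.sqrt(sum((ru[i] - mean_u) ** 2 for i in common) / m)
    if sd_a == 0 or sd_u == 0:
        return 0.0                        # assumption: undefined treated as 0
    return covar / (sd_a * sd_u)

# Hypothetical ratings: `shifted` rates every shared item one point higher.
active = {"x": 9, "y": 3, "z": 5}
shifted = {"x": 10, "y": 4, "z": 6, "w": 1}
opposite = {"x": 1, "y": 7, "z": 5}
constant = {"x": 4, "y": 4, "z": 4}

c_shifted = pearson(active, shifted)    # close to 1: same pattern, offset level
c_opposite = pearson(active, opposite)  # close to -1: reversed preferences
c_constant = pearson(active, constant)  # 0.0 by the assumption above
```

Because the correlation subtracts each user's mean, a user who rates everything one point higher still counts as a perfect match, which is the property the mean-offset prediction formula relies on.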
Significance Weighting
Rating Prediction (Version 0)
• Predict a rating, $p_{a,i}$, for each item i, for active user, a,
by using the n selected neighbor users, u ∈ {1, 2, …, n}.
• Weight users’ ratings contribution by their similarity to
the active user.
$$p_{a,i} = \frac{\sum_{u=1}^{n} w_{a,u}\, r_{u,i}}{\sum_{u=1}^{n} w_{a,u}}$$
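
The Version 0 prediction is a similarity-weighted average, which might be sketched as follows. The sketch assumes that neighbors who have not rated the item are skipped (the slide formula sums over all n neighbors) and that the selected weights are positive so the denominator is nonzero; `predict_v0` and the sample data are hypothetical.

```python
def predict_v0(weights, ratings, item):
    """Weighted average of the neighbors' ratings for `item`.

    weights: {user: w_au}, similarity of each neighbor to the active user.
    ratings: {user: {item: rating}}.
    Returns None when no neighbor contributes (assumption, not in the slides).
    """
    num = 0.0
    den = 0.0
    for user, w in weights.items():
        if item in ratings[user]:         # assumption: skip missing ratings
            num += w * ratings[user][item]
            den += w
    return num / den if den else None

# Hypothetical neighbors: u2 is three times as similar as u1.
weights_v0 = {"u1": 1.0, "u2": 3.0}
ratings_v0 = {"u1": {"i3": 4}, "u2": {"i3": 8}}

p_v0 = predict_v0(weights_v0, ratings_v0, "i3")   # (1*4 + 3*8) / (1 + 3)
```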
Rating Prediction (Version 1)
• Predict a rating, $p_{a,i}$, for each item i, for active user, a,
by using the n selected neighbor users, u ∈ {1, 2, …, n}.
• To account for users’ different rating levels, base
predictions on differences from a user’s average rating.
• Weight users’ ratings contribution by their similarity to
the active user.
$$p_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{n} w_{a,u}\,(r_{u,i} - \bar{r}_u)}{\sum_{u=1}^{n} w_{a,u}}$$
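
The Version 1 prediction adds a weighted average of the neighbors' deviations from their own means to the active user's mean. The sketch below is one possible reading: it assumes each user's mean is taken over all of that user's ratings, that neighbors without a rating for the item are skipped, and that the prediction falls back to the active user's mean when no neighbor contributes. The names `mean_rating` and `predict_v1` and the sample data are hypothetical.

```python
def mean_rating(user_ratings):
    """Average of one user's ratings (assumption: over all items they rated)."""
    values = list(user_ratings.values())
    return sum(values) / len(values)

def predict_v1(active, weights, ratings, item):
    """Active user's mean plus the weighted mean-offset of the neighbors."""
    r_a = mean_rating(ratings[active])
    num = 0.0
    den = 0.0
    for user, w in weights.items():
        if item in ratings[user]:         # assumption: skip missing ratings
            num += w * (ratings[user][item] - mean_rating(ratings[user]))
            den += w
    if den == 0:
        return r_a                        # assumption: fall back to the mean
    return r_a + num / den

# Hypothetical data: u1 rates low overall, u2 rates high overall.
ratings_v1 = {
    "a":  {"i1": 9, "i2": 3},             # mean 6
    "u1": {"i1": 4, "i2": 2, "i3": 6},    # mean 4, i3 is +2 above it
    "u2": {"i1": 8, "i3": 10},            # mean 9, i3 is +1 above it
}
weights_v1 = {"u1": 1.0, "u2": 3.0}

p_v1 = predict_v1("a", weights_v1, ratings_v1, "i3")  # 6 + (1*2 + 3*1) / 4
```

Here both neighbors rate the item above their own averages, so the prediction lands above the active user's mean of 6 even though u1's raw rating is exactly 6.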
Problems with Collaborative Filtering
• Cold Start: There needs to be enough other users
already in the system to find a match.
• Sparsity: If there are many items to be
recommended, even if there are many users, the
user/ratings matrix is sparse, and it is hard to find
users that have rated the same items.
• First Rater: Cannot recommend an item that has
not been previously rated.
– New items, esoteric items
• Popularity Bias: Cannot recommend items to
someone with unique tastes.
– Tends to recommend popular items.
Recommendation vs Web Ranking
[Figure: item recommendation compared with web page ranking.]
Content-Based Recommendation
Example: LIBRA System
[Figure labels: Amazon Book Pages, Information Extraction, LIBRA Database.]
Content-Boosted Collaborative Filtering
[Figure labels: Movie Content Database, Active User Ratings, Collaborative Filtering, Recommendations.]
Content-Boosted Collaborative Filtering
[Figure labels: User-ratings Vector, Training Examples, Content-Based Predictor.]
Content-Boosted Collaborative Filtering
Conclusions
• Recommending and personalization are
important approaches to combating
information overload.
• Machine Learning is an important part of
systems for these tasks.
• Collaborative filtering has problems (cold start, sparsity, first rater, popularity bias).
• Content-based methods address these
problems (but have problems of their own).
• Integrating both is best.