Collaborative Filtering & Content-Based Recommending: CS 293S. T. Yang Slides Based On R. Mooney at UT Austin

This document summarizes recommendation systems and two main approaches: collaborative filtering and content-based recommending. Collaborative filtering recommends items based on ratings from similar users, while content-based recommending bases recommendations on item attributes rather than user ratings. The document outlines the collaborative filtering process and some limitations of collaborative filtering including cold start, sparsity, and popularity bias issues. It then introduces content-based recommendation as an alternative approach.

Collaborative Filtering & Content-Based Recommending

CS 293S. T. Yang
Slides based on R. Mooney at UT Austin

Recommendation Systems
• Systems for recommending items (e.g. books,
movies, music, web pages, newsgroup messages)
to users based on examples of their preferences.
– Examples: Amazon, Netflix. Recommendations increase sales at online stores.
• Basic approaches to recommending:
– Collaborative Filtering (a.k.a. social filtering)
– Content-based
• Instances of personalization software.
– adapting to the individual needs, interests, and
preferences of each user with recommending, filtering,
& predicting
Process of Book Recommendation

[Figure: rated example books (Red Mars, Foundation, Jurassic Park, Lost World, 2001, Neuromancer, 2010, Difference Engine) are fed to a machine learning component, which produces a user profile.]

Collaborative Filtering
• Maintain a database of many users’ ratings of a
variety of items.
• For a given user, find other similar users whose
ratings strongly correlate with the current user.
• Recommend items rated highly by these similar
users, but not rated by the current user.
• Almost all existing commercial recommenders use
this approach (e.g. Amazon).

[Figure: ratings from many users feed into an item recommendation.]

Collaborative Filtering

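User database (ratings of items A to Z by six users; "-" = not rated; the column labels U1 to U6 are added here for readability):

Item  U1  U2  U3  U4  U5  U6
A      9   -   5   -   6  10
B      3   -   3   -   4   4
C      -   9   -   8   -   8
:      :   :   :   :   :   :
Z      5  10   7   -   -   1

Active user: A = 9, B = 3, C unrated, Z = 5.

Correlation match: the active user's ratings are compared with those of each database user; the figure shows the match with the user who rated A = 10, B = 4, C = 8, Z = 1.

Extract recommendations: item C, which the matched user has rated and the active user has not.
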
Collaborative Filtering Method
1. Weight all users with respect to similarity
with the active user.
2. Select a subset of the users (neighbors) to
use as predictors.
3. Normalize ratings and compute a
prediction from a weighted combination of
the selected neighbors’ ratings.
4. Present items with highest predicted
ratings as recommendations.

Find users with similar ratings/interests

[Figure: the active user's rating vector r_a (A = 9, B = 3, C unrated, Z = 5) is compared with the rating vector r_u of each user in the user database.]

Which users have similar ratings?

Similarity Weighting
• Similarity of two rating vectors for active user, a,
and another user, u.
– Pearson correlation coefficient: $c_{a,u} = \frac{\mathrm{covar}(r_a, r_u)}{\sigma_{r_a} \sigma_{r_u}}$
– Alternatively, a cosine similarity formula.

$r_a$ and $r_u$ are the ratings vectors for the m items rated by both a and u.

Definition: Covariance and Standard
Deviation
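• Covariance:

$$\mathrm{covar}(r_a, r_u) = \frac{\sum_{i=1}^{m} (r_{a,i} - \bar{r}_a)(r_{u,i} - \bar{r}_u)}{m}$$

where the mean rating of user x over the m items is

$$\bar{r}_x = \frac{\sum_{i=1}^{m} r_{x,i}}{m}$$

• Standard Deviation:

$$\sigma_{r_x} = \sqrt{\frac{\sum_{i=1}^{m} (r_{x,i} - \bar{r}_x)^2}{m}}$$

• Pearson correlation coefficient:

$$c_{a,u} = \frac{\mathrm{covar}(r_a, r_u)}{\sigma_{r_a} \sigma_{r_u}} = \mathrm{Cosine}(r_a - \bar{r}_a,\; r_u - \bar{r}_u)$$
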
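The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: ratings are assumed to be stored as dictionaries from item to rating, the function names are made up for this sketch, and the sample numbers are the active user and matched user as read from the collaborative filtering figure.

```python
import math

def co_rated(ra, ru):
    """Paired ratings for the m items rated by both users (dicts item -> rating)."""
    items = sorted(set(ra) & set(ru))
    return [ra[i] for i in items], [ru[i] for i in items]

def mean(xs):
    return sum(xs) / len(xs)

def covar(xs, ys):
    """Covariance with divisor m, as in the definition above."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def std(xs):
    """Standard deviation with divisor m."""
    mx = mean(xs)
    return math.sqrt(sum((x - mx) ** 2 for x in xs) / len(xs))

def pearson(ra, ru):
    """c_{a,u} over co-rated items; returns 0.0 when it is undefined (assumption)."""
    xs, ys = co_rated(ra, ru)
    if not xs:
        return 0.0
    denom = std(xs) * std(ys)
    return covar(xs, ys) / denom if denom else 0.0

def cosine(xs, ys):
    dot = sum(x * y for x, y in zip(xs, ys))
    norm = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return dot / norm if norm else 0.0

# Sample ratings read from the figure (C is unrated by the active user).
active = {"A": 9, "B": 3, "Z": 5}
other = {"A": 10, "B": 4, "C": 8, "Z": 1}

c = pearson(active, other)
print(round(c, 4))  # 0.7857

# Pearson equals the cosine of the mean-centered co-rated vectors.
xs, ys = co_rated(active, other)
centered = cosine([x - mean(xs) for x in xs], [y - mean(ys) for y in ys])
```

Only items A, B and Z enter the computation, because C is not co-rated. Returning 0.0 when a user's co-rated ratings have zero variance is a choice made for this sketch; the slides do not say how to handle that case.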
Neighbor Selection
• For a given active user, a, select correlated
users to serve as source of predictions.
– Standard approach is to use the most similar n
users, u, based on similarity weights, $w_{a,u}$.
– Alternate approach is to include all users whose
similarity weight is above a given threshold, t:
$\mathrm{Sim}(r_a, r_u) > t$

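Both selection rules are short enough to sketch directly. The snippet below assumes the similarity weights for the active user are already available as a dictionary from user id to weight; the function names, user ids and weights are hypothetical.

```python
def top_n_neighbors(weights, n):
    """Return the n users with the largest similarity weight w_{a,u}."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    return [user for user, _ in ranked[:n]]

def threshold_neighbors(weights, t):
    """Return every user whose similarity weight is strictly above t."""
    return [user for user, w in weights.items() if w > t]

# Hypothetical similarity weights between the active user and four others.
weights = {"u1": 0.9, "u2": 0.2, "u3": 0.75, "u4": -0.4}

print(top_n_neighbors(weights, 2))        # ['u1', 'u3']
print(threshold_neighbors(weights, 0.5))  # ['u1', 'u3']
```

The top-n rule always yields n neighbors when at least n users are available, however weak the correlations. The threshold rule can return few or no neighbors for a user with unusual tastes.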
Significance Weighting

• Important not to trust correlations based on
very few co-rated items.
• Include significance weights, $s_{a,u}$, based on
number of co-rated items, m.

$$w_{a,u} = s_{a,u} \, c_{a,u}$$

$$s_{a,u} = \begin{cases} 1 & \text{if } m > 50 \\ \frac{m}{50} & \text{if } m \le 50 \end{cases}$$

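As a small sketch of the significance weight, the following Python applies the piecewise rule with the cutoff of 50 from the slide. The function and parameter names are illustrative only.

```python
def significance(m, cutoff=50):
    """s_{a,u}: 1 when more than `cutoff` items are co-rated, else m / cutoff."""
    return 1.0 if m > cutoff else m / cutoff

def similarity_weight(c, m, cutoff=50):
    """w_{a,u} = s_{a,u} * c_{a,u} for correlation c based on m co-rated items."""
    return significance(m, cutoff) * c

# A correlation of 0.8 based on only 5 co-rated items is scaled down to 0.08;
# with 60 co-rated items the same correlation is kept at 0.8.
print(round(similarity_weight(0.8, 5), 2))   # 0.08
print(similarity_weight(0.8, 60))            # 0.8
```

The two branches meet at m = 50, where m / 50 is already 1, so the weight has no jump at the cutoff.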
Rating Prediction (Version 0)
• Predict a rating, $p_{a,i}$, for each item i, for active user, a,
by using the n selected neighbor users, $u \in \{1, 2, \ldots, n\}$.
• Weight users’ ratings contribution by their similarity to
the active user.

$$p_{a,i} = \frac{\sum_{u=1}^{n} w_{a,u} \, r_{u,i}}{\sum_{u=1}^{n} w_{a,u}}$$

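A minimal sketch of this weighted average follows. It assumes ratings are stored as nested dictionaries (user, then item) and that neighbors who have not rated the item are skipped; the slide does not specify the latter, and all names and numbers are hypothetical.

```python
def predict_v0(item, neighbors, weights, ratings):
    """p_{a,i} as the similarity-weighted average of the neighbors' ratings.

    Neighbors without a rating for `item` are skipped (an assumption of this
    sketch). Returns None when no prediction can be formed.
    """
    num = 0.0
    den = 0.0
    for u in neighbors:
        if item in ratings[u]:
            num += weights[u] * ratings[u][item]
            den += weights[u]
    return num / den if den else None

# Hypothetical neighbors of the active user and their weights w_{a,u}.
ratings = {"u1": {"A": 10, "C": 8}, "u2": {"A": 2, "C": 4}}
weights = {"u1": 0.75, "u2": 0.25}

print(predict_v0("C", ["u1", "u2"], weights, ratings))  # 7.0
```

The prediction of 7.0 is (0.75 * 8 + 0.25 * 4) / (0.75 + 0.25), so it sits closer to the rating of the more similar neighbor.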
Rating Prediction (Version 1)
• Predict a rating, $p_{a,i}$, for each item i, for active user, a,
by using the n selected neighbor users, $u \in \{1, 2, \ldots, n\}$.
• To account for users’ different rating levels, base
predictions on differences from a user’s average rating.
• Weight users’ ratings contribution by their similarity to
the active user.

$$p_{a,i} = \bar{r}_a + \frac{\sum_{u=1}^{n} w_{a,u} (r_{u,i} - \bar{r}_u)}{\sum_{u=1}^{n} w_{a,u}}$$

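The mean-offset version can be sketched the same way. The assumptions here are mine: each neighbor's average is taken over all of that neighbor's ratings, neighbors who have not rated the item are skipped, and the denominator is the plain sum of weights as written in the formula. All names and numbers are hypothetical.

```python
def predict_v1(item, active_mean, neighbors, weights, ratings):
    """p_{a,i} = active user's mean + weighted average of neighbors' deviations.

    A neighbor's deviation is its rating of `item` minus its own mean rating.
    Falls back to the active user's mean when no neighbor rated the item.
    """
    num = 0.0
    den = 0.0
    for u in neighbors:
        if item in ratings[u]:
            mean_u = sum(ratings[u].values()) / len(ratings[u])
            num += weights[u] * (ratings[u][item] - mean_u)
            den += weights[u]
    return active_mean + num / den if den else active_mean

# u1 rates generously (mean 9), u2 rates harshly (mean 3).
ratings = {"u1": {"A": 10, "C": 8}, "u2": {"A": 2, "C": 4}}
weights = {"u1": 0.75, "u2": 0.25}

print(predict_v1("C", 5.0, ["u1", "u2"], weights, ratings))  # 4.5
```

Here u1 rated C one point below its own average and u2 one point above, so the weighted deviation is -0.5 and the prediction is 4.5 for an active user whose average is 5.0. The result is expressed on the active user's own rating scale rather than the neighbors'.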
Problems with Collaborative Filtering
• Cold Start: There needs to be enough other users
already in the system to find a match.
• Sparsity: If there are many items to be
recommended, even if there are many users, the
user/ratings matrix is sparse, and it is hard to find
users that have rated the same items.
• First Rater: Cannot recommend an item that has
not been previously rated.
– New items, esoteric items
• Popularity Bias: Cannot recommend items to
someone with unique tastes.
– Tends to recommend popular items.
Recommendation vs Web Ranking

[Figure: inputs compared. Web page ranking: text content, link popularity, user click data. Item recommendation: user rating, content.]

Content-Based Recommendation

• Recommendations are based on information on
the content of items rather than on other users’
opinions.
– Less dependence on data from other users.
• Able to recommend to users with unique tastes.
• Able to recommend new and unpopular items.
– No first-rater problem.
– No cold-start or sparsity problems.

Example: LIBRA System
[Figure: LIBRA architecture. Amazon book pages pass through information extraction into the LIBRA database. A machine learning learner takes rated examples and builds a user profile; a predictor applies the profile to produce a ranked list of recommendations.]

Uses information:
– Author
– Title
– Editorial reviews
– Customer comments
– Subject terms
– Related authors
– Related titles

Combining Content and Collaboration
• Content-based and collaborative methods have
complementary strengths and weaknesses.
• Combine methods to obtain the best of both.
• Various hybrid approaches:
– Apply both methods and combine recommendations.
– Use collaborative data as content.
– Use content-based predictor as another collaborator.
– Use content-based predictor to complete
collaborative data.

Content-Boosted Collaborative Filtering

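[Figure: a web crawler collects IMDb content for EachMovie titles into a movie content database. A content-based predictor uses it to turn the sparse user ratings matrix into a full user ratings matrix. Collaborative filtering then combines the full matrix with the active user's ratings to produce recommendations.]
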
Content-Boosted Collaborative Filtering

[Figure: the user-rated items in a user-ratings vector serve as training examples for the content-based predictor, which predicts ratings for the unrated items. The actual and predicted ratings together form the pseudo user-ratings vector.]

Content-Boosted Collaborative Filtering

[Figure: user ratings matrix → content-based predictor → pseudo user ratings matrix.]

• Compute pseudo user ratings matrix
– Full matrix – approximates actual full user ratings matrix
• Perform collaborative filtering
– Using Pearson corr. between pseudo user-rating vectors

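The first step, building the pseudo user ratings matrix, can be sketched as below. The content-based predictor is passed in as a function; the one used here (the average of the user's ratings on items of the same genre) is a toy stand-in invented for this sketch, and the genre tags and ratings are hypothetical. The slides do not say which learner the real system uses.

```python
# Hypothetical item content: one genre tag per item.
GENRES = {"A": "scifi", "B": "drama", "C": "scifi", "Z": "drama"}

def genre_mean_predictor(user_ratings, item):
    """Toy content-based predictor (an assumption, not the slides' learner):
    mean of the user's ratings on same-genre items, else the user's overall mean."""
    same = [r for i, r in user_ratings.items() if GENRES[i] == GENRES[item]]
    pool = same or list(user_ratings.values())
    return sum(pool) / len(pool) if pool else None

def pseudo_user_ratings(user_ratings, all_items, content_predict):
    """Keep actual ratings; fill each unrated item with a content-based prediction."""
    return {
        item: user_ratings[item] if item in user_ratings
        else content_predict(user_ratings, item)
        for item in all_items
    }

def pseudo_matrix(ratings, all_items, content_predict):
    """Full (dense) pseudo ratings matrix built from the sparse user ratings matrix."""
    return {u: pseudo_user_ratings(r, all_items, content_predict)
            for u, r in ratings.items()}

sparse = {"u1": {"A": 9, "B": 3, "Z": 5}, "u2": {"C": 9, "Z": 10}}
full = pseudo_matrix(sparse, sorted(GENRES), genre_mean_predictor)

print(full["u1"])  # {'A': 9, 'B': 3, 'C': 9.0, 'Z': 5}
print(full["u2"])  # {'A': 9.0, 'B': 10.0, 'C': 9, 'Z': 10}
```

Every pseudo vector now covers all items, so the Pearson correlation between two users can be computed over the full item set instead of only the items both users happened to rate.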
Conclusions
• Recommending and personalization are
important approaches to combating
information overload.
• Machine Learning is an important part of
systems for these tasks.
• Collaborative filtering has problems.
• Content-based methods address these
problems (but have problems of their own).
• Integrating both is best.
