DM Lect 6 - Recommender Systems

The document discusses recommender systems, which provide personalized recommendations to users based on their preferences and behaviors. It covers various types of recommender systems, including collaborative filtering, content-based, and hybrid systems, along with their inputs, outputs, and challenges such as cold start and sparsity. Additionally, it highlights evaluation metrics and advanced techniques like context-aware and trust-based systems.

Data Mining

Recommender Systems

Dr. Wedad Hussein



Dr. Mahmoud Mounir


Web Personalization

• Definition: the process of customizing
a Web site to the needs of specific
users.
• The success of Web personalization
depends on its ability to anticipate the
users’ needs or next moves and
recommend the suitable objects.
How many times have you seen this
statement?
“People who downloaded / bought /
liked this item also downloaded / bought
/ liked items X and Y”

Association Rules
Recommender Systems!
Recommender Systems
• Definition: systems that produce
individualized recommendations as output.
• They can guide the user in a personalized
way to interesting or useful objects in a
large space of possible options.
• Recommender Systems vs. Information
Retrieval Systems??
• "individuality" and "personalized
recommendations"
Taxonomy

Recommender Systems
• Collaborative Filtering
• Content-Based
• Hybrid
Inputs
1. Ratings: The opinions of users on the
items available in the system.
2. Demographic Data: Data about the
user like age, gender and education.
3. Content Data: Textual data related to
the contents of the items to be
recommended.
Collaborative Filtering Systems
Collaborative Filtering

• Identify like-minded users.
• Search for the “Neighborhood” of the user, that
is, the group of users exhibiting similar behavior
to the current user.
• Build a user-item matrix containing the users'
ratings of all items, wherever available.
• E-commerce systems such as eBay and Amazon
use collaborative filtering to present their
users with a recommended list of products.
Outputs

• Collaborative filtering systems can be
used for one of two purposes:
• Prediction: Generates a value indicating
the expected rating of an item by the
current user.
• Recommendation: Produces a list of N
items that the user is expected to like (Top
N recommendations).
User-item matrix

         Item 1   Item 2   …   Item m
User 1   R11      R12      …   R1m
User 2   R21      R22      …   R2m
…        …        …        …   …
User n   Rn1      Rn2      …   Rnm
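As a minimal sketch, the user-item matrix can be held in a NumPy array. The 1 to 5 rating scale, the use of `np.nan` for ratings that are not available, and all values below are assumptions made for illustration only.

```python
import numpy as np

# Rows are users, columns are items; np.nan marks a rating that is not available.
# All values are illustrative, not data from the lecture.
R = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
])

n_users, n_items = R.shape
rated = ~np.isnan(R)                     # boolean mask of known ratings
sparsity = 1.0 - rated.sum() / R.size    # fraction of missing entries

print(n_users, n_items, round(sparsity, 2))
```

Keeping missing ratings as `np.nan` rather than 0 avoids confusing "not rated" with a low rating.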
Representation
[Figure: each user, including the current user, is shown as a column of ratings over 14 items, from the 1st item rate to the 14th item rate; 1 = Like, 0 = Dislike, ? = Unknown.]
Neighborhood Formation

• Similarity between users in the user-item
matrix should be calculated.
• Users similar to the active user form a
proximity-based neighborhood with that user.
• Implemented in two steps:
1. Similarity between all users is calculated.
2. Similarities of users are processed to find
neighbors.
Similarity Measures
• Cosine Similarity
Similarity Measures
• Pearson Correlation
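The formulas for these two measures are not reproduced here, so the following is only a sketch based on their standard textbook definitions. It assumes that both measures are computed over co-rated items only and that missing ratings are stored as `np.nan`; the function names and sample vectors are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine of the angle between two rating vectors, using co-rated items only."""
    both = ~np.isnan(a) & ~np.isnan(b)
    x, y = a[both], b[both]
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0

def pearson_sim(a, b):
    """Pearson correlation over co-rated items (means taken over those items)."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if both.sum() < 2:
        return 0.0
    x = a[both] - a[both].mean()
    y = b[both] - b[both].mean()
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0

# Illustrative rating vectors for three users over four items
u1 = np.array([5.0, 3.0, np.nan, 1.0])
u2 = np.array([4.0, np.nan, 2.0, 1.0])
u3 = np.array([1.0, 1.0, np.nan, 5.0])

print(cosine_sim(u1, u2), pearson_sim(u1, u3))
```

Pearson correlation subtracts each user's mean first, so it can report that two users have opposite tastes (a negative value) even when all ratings are positive numbers.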
Predicting Ratings

• Select from the set of nearest neighbors
the users that rated the target item.
• The predicted rating is given by:

• Where n is the neighborhood size and
sim_uj is the similarity between current user
u and user j.
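The exact prediction formula is not reproduced here. As a sketch only, the code below assumes one common form: a similarity-weighted average of the neighbors' ratings of the target item. Other variants, such as adding mean-centered deviations to the current user's mean rating, are also widely used. The function name and sample values are hypothetical.

```python
def predict_rating(neighbors):
    """neighbors: list of (sim_uj, r_ji) for nearest neighbors j that rated item i."""
    num = sum(sim * r for sim, r in neighbors)
    den = sum(abs(sim) for sim, _ in neighbors)
    # No usable neighbors means no prediction can be made
    return num / den if den > 0 else None

# Hypothetical neighborhood of size n = 3 for the current user u
neighbors = [(0.9, 4.0), (0.6, 5.0), (0.3, 2.0)]
print(predict_rating(neighbors))
```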
Top-N Recommendation

• Perform a frequency count of the items
that each neighbor user has purchased or
rated.
• Exclude items already rated by the active
user.
• Sort the remaining items according to their
frequency counts.
• Return the N most frequent items as the
recommendation for the active user.
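The four steps above can be sketched as follows. The item names are made up, and breaking ties alphabetically is an assumption added only to make the output deterministic.

```python
from collections import Counter

def top_n(neighbor_items, active_items, n):
    """neighbor_items: one set of purchased/rated items per neighbor."""
    # Step 1: frequency count over all neighbors
    counts = Counter(item for items in neighbor_items for item in items)
    # Step 2: exclude items the active user has already rated
    for item in active_items:
        counts.pop(item, None)
    # Step 3: sort by descending frequency (ties broken by item name)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Step 4: return the N most frequent items
    return [item for item, _ in ranked[:n]]

neighbor_items = [{"A", "B", "C"}, {"B", "C", "D"}, {"C", "D"}]
print(top_n(neighbor_items, active_items={"C"}, n=2))
```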
Item-based CF

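• The item-based approach works by
comparing items based on their pattern
of ratings across users. The similarity of
items i and j is computed as follows:

$$ sim(i, j) = \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u)(r_{u,j} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,i} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,j} - \bar{r}_u)^2}} $$

where U is the set of users who rated both items i and j, and \bar{r}_u is the mean rating of user u.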
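A minimal sketch of this item similarity, assuming a users-by-items NumPy array with `np.nan` for unrated entries and each user's mean taken over all items that user rated. The function name and the sample matrix are hypothetical.

```python
import numpy as np

def adjusted_cosine(R, i, j):
    """Similarity of item columns i and j of R (users x items, np.nan = unrated)."""
    user_mean = np.nanmean(R, axis=1)                  # mean rating of each user
    both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])     # users who rated both items
    di = R[both, i] - user_mean[both]
    dj = R[both, j] - user_mean[both]
    denom = np.sqrt((di ** 2).sum()) * np.sqrt((dj ** 2).sum())
    return float((di * dj).sum() / denom) if denom > 0 else 0.0

# Illustrative ratings: items 0 and 1 are rated alike, item 2 the opposite way
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
])
print(adjusted_cosine(R, 0, 1), adjusted_cosine(R, 0, 2))
```

Subtracting the user mean compensates for users who rate everything high or everything low.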
Item-based Recommendation
• After computing the similarity between items, we
select the set of k items most similar to the target
item and generate a predicted value of user u’s
rating:

$$ p(u, i) = \frac{\sum_{j \in J} r_{u,j} \cdot sim(i, j)}{\sum_{j \in J} sim(i, j)} $$

where J is the set of k similar items

• Advantages?
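The item-based prediction can be sketched as below. It assumes the item similarities have already been computed and that J is restricted to items the user has actually rated; the function name, item names, and values are hypothetical.

```python
def predict_item_based(user_ratings, sims_to_target, k):
    """user_ratings: {item j: r_uj}; sims_to_target: {item j: sim(i, j)}."""
    # J: the k items most similar to the target among those the user has rated
    rated = [j for j in user_ratings if j in sims_to_target]
    J = sorted(rated, key=lambda j: sims_to_target[j], reverse=True)[:k]
    num = sum(user_ratings[j] * sims_to_target[j] for j in J)
    den = sum(sims_to_target[j] for j in J)
    return num / den if den != 0 else None

user_ratings = {"i1": 4.0, "i2": 2.0, "i3": 5.0}
sims_to_target = {"i1": 0.8, "i2": 0.1, "i3": 0.4}
print(predict_item_based(user_ratings, sims_to_target, k=2))
```

With k = 2 the prediction uses only i1 and i3, the two items most similar to the target.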
Evaluation

• Mean Absolute Error (MAE):

• Root Mean Square Error (RMSE):

• Where ar_ij is the actual rating provided by user i for
item j, r_ij is the predicted rating, and n_i is the
number of items already rated by the user.
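The two formulas are not reproduced here. The sketch below uses the standard definitions of MAE and RMSE for a single user i, averaging over that user's n_i rated items; whether the lecture additionally averages over all users is not stated, so that step is left out. The sample ratings are made up.

```python
import math

def mae(actual, predicted):
    """Mean absolute error over one user's n_i rated items."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error over the same items; penalizes large errors more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [4.0, 2.0, 5.0, 3.0]      # ar_ij: actual ratings of user i
predicted = [3.5, 2.5, 4.0, 3.0]   # r_ij: predicted ratings for the same items
print(mae(actual, predicted), rmse(actual, predicted))
```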
Evaluation

• Coverage: the percentage of items for which
the system can provide recommendations.

• Where m is the number of users, np_i is the
number of items for which the system was able
to provide recommendations and n_i is the
number of items already rated by the user.
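The coverage formula is not reproduced here. As an assumption consistent with the symbols defined above, the sketch computes coverage as the sum of np_i over all m users divided by the sum of n_i, expressed as a percentage. The counts are made up.

```python
def coverage(np_per_user, n_per_user):
    """Percentage of rated items for which a prediction could be produced."""
    return 100.0 * sum(np_per_user) / sum(n_per_user)

np_i = [8, 5, 9]      # items with a prediction, per user (m = 3 users)
n_i = [10, 10, 10]    # items already rated, per user
print(round(coverage(np_i, n_i), 2))
```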
Challenges with Collaborative Filtering Systems
Cold Start Problem
• Cold start refers to newly added items or users.
• An item cannot be recommended until it has been
rated by a number of users.
• For a user, a system can’t find his set of nearest
neighbors unless he has rated a number of items.

• Solution?
• Integrating other sources of information.
• Collecting preferences / profiles over multiple sites
Sparsity of User-item Matrix
• A user typically only rates a very small
portion of the items.
• We need to find commonly rated items to
locate neighbors.
• Given the sparsity of the matrix the decisions
are usually based on very few items making
the predictions inaccurate and unreliable.
• Solution?
• Default Voting
• Clustering
• Dimensionality Reduction
Other Challenges
• Scalability: The calculations grow with the
number of users and items.
• Solution?
• Clustering

• Popularity Bias (The Gray Sheep Problem): The
system is not capable of offering accurate
recommendations for users with unique tastes.
Content-Based Systems
Content-Based Recommendation

• In content-based recommendations the
system tries to recommend items that
match the User Profile.
• The Profile is based on items the user has
liked in the past or explicit interests that
the user defines.
• A content-based recommender system
matches the profile of the item to the user
profile to decide on its relevance to the
user.
Content-Based Systems

• They usually use the vector-space
model to represent items.
• Advantage: They can overcome the
cold start problem (new items).
• Disadvantage: Overspecialization.
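A minimal sketch of content-based matching in the vector-space model. It assumes each item is a vector of term weights, the user profile is the mean of the vectors of liked items, and relevance is measured by cosine similarity; these choices, the item names, and the weights are illustrative assumptions.

```python
import numpy as np

# Hypothetical item vectors over three content terms
items = {
    "item1": np.array([1.0, 0.0, 1.0]),
    "item2": np.array([1.0, 1.0, 0.0]),
    "item3": np.array([0.0, 1.0, 1.0]),
    "item4": np.array([1.0, 0.0, 0.9]),
}

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

liked = ["item1"]
# User profile: average of the vectors of items the user liked
profile = np.mean([items[i] for i in liked], axis=0)

# Score every item the user has not seen yet against the profile
scores = {i: cosine(profile, v) for i, v in items.items() if i not in liked}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], round(scores[ranking[0]], 3))
```

The top result is the item that is nearly identical to what the user already liked, which is the overspecialization effect. No ratings from other users are needed, which is why a new item can be recommended immediately.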
Approaches

A. Case-Based Reasoning:
• Calculates similarity based on the attributes of the
item.
• Recommends items that are most similar to the
items the user has liked before.
• Still suffers from the new user problem.
B. Attribute-Based Techniques:
• Include information about the user in the
recommendation process.
• Overcome the new user problem.
• Disadvantage: do not adapt to newly added ratings,
since user information is static.
Hybrid Systems
Hybrid Systems

• Hybrid schemes attempt to combine user
ratings and content information to yield
better recommendations.
• Methods:
• Generate recommendations from both
techniques separately and then combine the
recommendation lists.
• Incorporate content information into the data
collected by collaborative filtering systems.
1. Feature Augmentation

• Additional ratings are created based on
content features.
• E.g. User X likes Items 1 and 3:
• Item 7 is similar to Items 1 and 3 by a degree of
0.75.
• Thus User X likes Item 7 by a degree of 0.75.
• User-item matrix becomes less sparse.
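The example above can be sketched as follows. Representing a "like" as a rating of 1.0 and using the content similarity directly as the pseudo-rating are assumptions; the function name is hypothetical.

```python
def augment(ratings, content_sim):
    """Add a pseudo-rating for each unrated item that resembles the liked items."""
    augmented = dict(ratings)
    for item, sim in content_sim.items():
        if item not in augmented:      # never overwrite a real rating
            augmented[item] = sim
    return augmented

ratings_x = {"Item1": 1.0, "Item3": 1.0}   # User X likes Items 1 and 3
content_sim = {"Item7": 0.75}              # content similarity of Item 7 to Items 1 and 3
print(augment(ratings_x, content_sim))
```

The augmented row has three entries instead of two, which is how the user-item matrix becomes less sparse.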
2. Weighted Hybrid

[Figure: a candidate item is scored separately by Recommender 1 and Recommender 2; a weighted combination of the two scores gives the combined score.]
Weighted Hybrid Example
$$ rec_{weighted}(u, i) = \sum_{k=1}^{n} \beta_k \times rec_k(u, i) $$

        Recommender 1     Recommender 2     Weighted (0.5 : 0.5)
        Score   Rank      Score   Rank      Score   Rank
Item1   0.5     1         0.8     2         0.65    1
Item2   0       -         0.9     1         0.45    2
Item3   0.3     2         0.4     3         0.35    3
Item4   0.1     3         0       -         0.05    4
Item5   0       -         0       -         0.00    -

How are weights assigned?
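A short sketch that reproduces the weighted example with equal weights of 0.5. The scores come from the example; the function name and the rule of leaving zero-score items unranked are assumptions.

```python
rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3, "Item4": 0.1, "Item5": 0.0}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4, "Item4": 0.0, "Item5": 0.0}

def weighted_hybrid(recs, weights):
    """Combined score of each item: sum over recommenders of weight * score."""
    items = recs[0].keys()
    return {i: sum(w * r[i] for w, r in zip(weights, recs)) for i in items}

combined = weighted_hybrid([rec1, rec2], [0.5, 0.5])

# Rank only items with a non-zero combined score, highest score first
ranked = sorted((i for i in combined if combined[i] > 0), key=combined.get, reverse=True)
for rank, item in enumerate(ranked, start=1):
    print(item, round(combined[item], 2), rank)
```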
Assigning Weights

• The weights are assigned in one of two
ways:
• Training Phase: During this phase training
data are fed to the two recommender systems
and weights are assigned according to the
accuracy of the predictions.
• Adjustable Weights: Start with equal
weights and these weights are adjusted
periodically to reflect the accuracy of
prediction.
3. Switching Hybrid

[Figure: based on the user profile, a selection criterion chooses between Recommender 1 and Recommender 2; the selected recommender produces the score.]
4. Cascade Hybrid

[Figure: the primary recommender scores the candidate item; the secondary recommender refines that score to give the combined score.]
Method
• Each recommender system filters the list of
items produced by the previous one.
• A subsequent recommender may not introduce
additional items.
• For all k > 1:

$$ rec_k(u, i) = \begin{cases} rec_k(u, i) & : rec_{k-1}(u, i) \neq 0 \\ 0 & : \text{otherwise} \end{cases} $$
Cascade Hybrid Example
Recommender 1: removing no-go items
Recommender 2: ordering and refinement

        Recommender 1     Recommender 2     Recommender 3
        Score   Rank      Score   Rank      Score   Rank
Item1   0.5     1         0.8     2         0.80    1
Item2   0       -         0.9     1         0.00    -
Item3   0.3     2         0.4     3         0.40    2
Item4   0.1     3         0       -         0.00    -
Item5   0       -         0       -         0.00    -
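The cascade rule can be sketched as below, using the scores from the example. The function name is hypothetical, and the sketch assumes every recommender scores the same set of items.

```python
rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3, "Item4": 0.1, "Item5": 0.0}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4, "Item4": 0.0, "Item5": 0.0}

def cascade(recs):
    """Apply recommenders in order; an item scored 0 by an earlier stage stays 0."""
    current = dict(recs[0])
    for rec in recs[1:]:
        current = {i: (rec[i] if current[i] != 0 else 0.0) for i in current}
    return current

result = cascade([rec1, rec2])
print(result)
```

Item2 has the highest score under Recommender 2 but is dropped because Recommender 1 ruled it out, which shows that a later stage can only reorder or remove items, never add them.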
Other Techniques
1. Context-Aware Recommender Systems

• Recommend a vacation
• Winter vs. summer
• Recommend a purchase
• Gift vs. for yourself
• Recommend a movie
• With friends vs. with family
What Other Techniques Ignore

• What is the user doing when asking for
a recommendation?
• Where (and when) is the user located?
• What does the user really want (e.g.,
to improve his knowledge or to actually buy a
product)?
• Is the user alone or with other people?
Challenges

• Obtaining sufficient and reliable data
describing the user context.
• Selecting the right information.
• Computational model: how to extend
Collaborative Filtering to include
contextual dimensions?
2. Social (Trust based) Systems

• Intuition: Users tend to receive advice
from people they trust, i.e., from their
friends.
• Trusted friends can be defined explicitly
by the users or inferred from the social
networks they are registered with.
Trust-based Collaborative Filtering
[Figure: the ratings given by the active user's trusted friends are used to predict the active user's unknown rating.]
Recommended Readings

• Vozalis, E., Margaritis, K., Analysis of
Recommender Systems’ Algorithms, In The Sixth
Hellenic European Conference on Computer
Mathematics and its Applications (HERCMA 2003),
Athens, Greece, 2003, pp. 1-14.
• Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (Eds.),
Recommender Systems Handbook, 2011.
• Burke, R.D., Hybrid Recommender Systems:
Survey and Experiments, User Modeling and User-
Adapted Interaction 12(4), 2002, pp. 331-370.
Thank You
