Recommendation Systems

Pawan Goyal

CSE, IITKGP

October 29-30, 2015

Pawan Goyal (IIT Kharagpur) Recommendation Systems October 29-30, 2015 1 / 61

Recommendation System?

Recommendation in Social Web

Why use Recommender Systems?

Value for the customers

Find things that are interesting
Narrow down the set of choices
Discover new things
Entertainment ...

Value for the provider

Additional and unique personalized service for the customer
Increase trust and customer loyalty
Increase sales, click-through rates, conversion, etc.
Opportunity for promotion, persuasion
Obtain more knowledge about customers

Real-world check

Myths from industry

Amazon.com generates X percent of their sales through the
recommendation lists (X > 35%)
Netflix generates X percent of their sales through the recommendation
lists (X > 30%)

There must be some value in it

See recommendation of groups, jobs or people on LinkedIn
Friend recommendation and ad personalization on Facebook
Song recommendation at last.fm
News recommendation at Forbes.com (+37% CTR)

Recommender Systems as a function

What is given?
User model: ratings, preferences, demographics, situational context
Items: with or without description of item characteristics

Find
Relevance score: used for ranking

Final Goal
Recommend items that are assumed to be relevant

But
Remember that relevance might be context-dependent
Characteristics of the list might be important (diversity)

Paradigms of Recommender Systems

Comparison across the paradigms

Collaborative Filtering (CF)

The most prominent approach to generate recommendations

Used by large, commercial e-commerce sites
Well understood; various algorithms and variations exist
Applicable in many domains (books, movies, ...)

Approach
Use the “wisdom of the crowd” to recommend items

Basic assumption and idea

Users give ratings to catalog items (implicitly/explicitly)
Customers who had similar tastes in the past are likely to have similar tastes in the future

User-based Collaborative Filtering

Given an active user Alice and an item i not yet seen by Alice
The goal is to estimate Alice’s rating for this item, e.g., by
  Find a set of users who liked the same items as Alice in the past and who have rated item i
  Use, e.g., the average of their ratings to predict whether Alice will like item i
  Do this for all items Alice has not seen and recommend the best-rated ones

User-based Collaborative Filtering

Some first questions

How do we measure similarity?
How many neighbors should we consider?
How do we generate a prediction from the neighbors’ ratings?

Popular similarity model

Pearson Correlation
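\[
\mathrm{sim}(a, b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2} \, \sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}
\]

a, b: users
$r_{a,p}$: rating of user a for item p
P: set of items rated by both a and b
$\bar{r}_a$, $\bar{r}_b$: the users' average ratings
Possible similarity values are between -1 and 1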

For the example considered

sim(Alice, User1) = 0.85
sim(Alice, User4) = -0.79
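
As a rough illustration of the Pearson formula above, the following Python sketch computes sim(a, b) over the co-rated items. The rating table used on the slides is not reproduced in this text, so the matrix `R` is a hypothetical toy table, and `pearson_sim` is an illustrative name. The sketch also assumes that each user's average is taken over the co-rated items only; averaging over all of a user's ratings is another common convention and gives slightly different values.

```python
import numpy as np

# Hypothetical ratings (rows: Alice, User1..User4; columns: Item1..Item5).
# np.nan marks a missing rating. This table is an assumption for illustration.
R = np.array([
    [5, 3, 4, 4, np.nan],   # Alice
    [3, 1, 2, 3, 3],        # User1
    [4, 3, 4, 3, 5],        # User2
    [3, 3, 1, 5, 4],        # User3
    [1, 5, 5, 2, 1],        # User4
], dtype=float)

def pearson_sim(ra, rb):
    """Pearson correlation of two rating vectors over their co-rated items."""
    both = ~np.isnan(ra) & ~np.isnan(rb)       # P: items rated by both users
    if both.sum() < 2:
        return 0.0                             # not enough overlap to correlate
    da = ra[both] - ra[both].mean()            # deviations from a's average
    db = rb[both] - rb[both].mean()            # deviations from b's average
    denom = np.sqrt((da ** 2).sum()) * np.sqrt((db ** 2).sum())
    return float((da * db).sum() / denom) if denom > 0 else 0.0

for name, row in zip(["User1", "User2", "User3", "User4"], R[1:]):
    print(name, round(pearson_sim(R[0], row), 2))
```

With this assumed table and averaging convention, the sketch gives about 0.85 for User1 and about -0.79 for User4.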

Pearson Correlation

Takes differences in rating behavior into account

Works well in usual domains

Making Predictions

A common prediction function:

\[
\mathrm{pred}(a, p) = \bar{r}_a + \frac{\sum_{b \in N} \mathrm{sim}(a, b) \cdot (r_{b,p} - \bar{r}_b)}{\sum_{b \in N} \mathrm{sim}(a, b)}
\]

Calculate whether the neighbors' ratings for the unseen item p are higher or lower than their average
Combine the rating differences, using the similarity as a weight
Add the neighbors' weighted bias to (or subtract it from) the active user's average and use the result as the prediction
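
The prediction function can be sketched in a few lines of Python. The function name `predict_rating`, the tuple layout, and the neighbor values are hypothetical. The sketch follows the formula as written, so it implicitly assumes that the neighborhood N contains only users with positive similarity; implementations that admit negative similarities often sum absolute values in the denominator instead.

```python
def predict_rating(mean_a, neighbors):
    """Weighted-deviation prediction for one (user, item) pair.

    mean_a:    average rating of the active user a
    neighbors: list of (sim_ab, r_bp, mean_b) tuples, one per neighbor b in N
    """
    # Numerator: similarity-weighted deviations of the neighbors' ratings.
    num = sum(sim * (r_bp - mean_b) for sim, r_bp, mean_b in neighbors)
    # Denominator: sum of the similarities used as weights.
    den = sum(sim for sim, _, _ in neighbors)
    if den == 0:
        return mean_a            # no usable neighbors: fall back to a's average
    return mean_a + num / den

# Hypothetical neighbors: (similarity, rating for item p, neighbor's average).
neighbors = [(0.85, 3, 2.4), (0.70, 5, 3.8)]
prediction = predict_rating(4.0, neighbors)
print(round(prediction, 2))
```

Both hypothetical neighbors rated the item above their own average, so the prediction lands above the active user's average of 4.0.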

Item-based Collaborative Filtering

Basic Idea
Use the similarity between items to make predictions

For Instance
Look for items that are similar to Item5
Take Alice’s ratings for these items to predict the rating for Item5

Similarity Measure

Ratings are seen as vectors in n-dimensional space

Similarity is calculated based on the angle between the vectors

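\[
\mathrm{sim}(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| \, |\vec{b}|}
\]

Adjusted cosine similarity: take average user ratings into account

\[
\mathrm{sim}(a, b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}
\]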
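
To make the difference between the two measures concrete, here is a small Python sketch of both. It assumes that U is the set of users who rated both items and that each user's average is taken over all of that user's ratings; the matrix `R` and the function names are hypothetical.

```python
import numpy as np

def cosine_sim(x, y):
    """Plain cosine similarity between two fully observed rating vectors."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def adjusted_cosine_sim(R, a, b):
    """Adjusted cosine between item columns a and b of a user x item matrix R."""
    user_means = np.nanmean(R, axis=1)               # each user's average rating
    both = ~np.isnan(R[:, a]) & ~np.isnan(R[:, b])   # U: users who rated both items
    da = R[both, a] - user_means[both]               # ratings of a, user-centered
    db = R[both, b] - user_means[both]               # ratings of b, user-centered
    denom = np.sqrt((da ** 2).sum()) * np.sqrt((db ** 2).sum())
    return float((da * db).sum() / denom) if denom > 0 else 0.0

# Hypothetical ratings (rows: Alice, User1..User4; columns: Item1..Item5).
R = np.array([
    [5, 3, 4, 4, np.nan],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
], dtype=float)

# Compare Item5 (column 4) with Item1 (column 0) using User1..User4 only,
# because Alice has not rated Item5.
plain = cosine_sim(R[1:, 4], R[1:, 0])
adjusted = adjusted_cosine_sim(R, 4, 0)
print(round(plain, 2), round(adjusted, 2))
```

On this assumed table the plain cosine is close to 1 while the adjusted cosine is noticeably lower, because subtracting the user averages removes the effect of users who rate everything high or low.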

Pre-processing for Item-based filtering

Calculate all pair-wise item similarities in advance

The neighborhood to be used at run-time is typically rather small,
because only those items are taken into account which the user has rated
Item similarities are supposed to be more stable than user similarities
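
The split between an offline similarity step and a cheap run-time step might look as follows. This is a sketch under assumptions: it uses adjusted cosine as the item similarity, keeps only positively similar items as neighbors, and predicts with a similarity-weighted average of the user's own ratings, which is a common choice but is not spelled out on the slides. The matrix `R` and all function names are hypothetical.

```python
import numpy as np

def item_similarity_matrix(R):
    """Offline step: adjusted-cosine similarity for every pair of items."""
    centered = R - np.nanmean(R, axis=1, keepdims=True)   # subtract user averages
    n_items = R.shape[1]
    S = np.eye(n_items)
    for a in range(n_items):
        for b in range(a + 1, n_items):
            # Only users who rated both items contribute to the pair (a, b).
            both = ~np.isnan(centered[:, a]) & ~np.isnan(centered[:, b])
            da, db = centered[both, a], centered[both, b]
            denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
            S[a, b] = S[b, a] = (da * db).sum() / denom if denom > 0 else 0.0
    return S

def predict_item_based(R, S, user, item):
    """Run-time step: similarity-weighted average of the user's own ratings."""
    rated = ~np.isnan(R[user])
    rated[item] = False                                   # never use the target itself
    # The neighborhood is small: only items this user rated, with positive similarity.
    weights = np.where(rated & (S[item] > 0), S[item], 0.0)
    if weights.sum() == 0:
        return float(np.nanmean(R[user]))                 # fall back to the user's average
    return float((weights * np.nan_to_num(R[user])).sum() / weights.sum())

# Hypothetical ratings (rows: Alice, User1..User4; columns: Item1..Item5).
R = np.array([
    [5, 3, 4, 4, np.nan],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
], dtype=float)

S = item_similarity_matrix(R)                 # computed once, in advance
pred = predict_item_based(R, S, user=0, item=4)
print(round(pred, 2))
```

Only the call to `predict_item_based` has to run when a recommendation is requested; the similarity matrix can be refreshed periodically.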

Pure CF-based systems only rely on the rating matrix

Explicit ratings
Most commonly used (1 to 5, 1 to 10 response scales)
Research topics: what about multi-dimensional ratings?
Challenge: Sparse rating matrices, how to stimulate users to rate more
items?

Implicit ratings
Clicks, page views, time spent on a page, demo downloads, ...
Can be used in addition to explicit ratings, but it is an open question whether they are interpreted correctly

Data sparsity problems

Cold start problems

How to recommend new items? What to recommend to new users?

Straightforward approach
Use another method (e.g., content-based, demographic or simply
non-personalized) in the initial phase

Alternatives
Use better algorithms (beyond nearest-neighbor approaches)
Example: Assume “transitivity” of neighborhoods

Example algorithms for sparse datasets

Recursive CF
Assume there is a very close neighbor n of u who, however, has not yet rated the target item i
Apply the CF method recursively to predict a rating for item i for the neighbor n
Use this predicted rating instead of the rating of a more distant direct neighbor
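
The recursive idea can be sketched in Python as below. This is a simplified illustration, not the published recursive CF algorithm: it assumes Pearson similarity over co-rated items, a neighborhood of the k most similar users with positive similarity, and a fixed recursion depth. The matrix `R` and the names `pearson` and `predict` are hypothetical.

```python
import numpy as np

def pearson(ra, rb):
    """Pearson correlation over the items rated by both users."""
    both = ~np.isnan(ra) & ~np.isnan(rb)
    if both.sum() < 2:
        return 0.0
    da, db = ra[both] - ra[both].mean(), rb[both] - rb[both].mean()
    denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
    return float((da * db).sum() / denom) if denom > 0 else 0.0

def predict(R, u, i, depth=1, k=1, exclude=frozenset()):
    """Predict R[u, i]; a close neighbor without a rating is predicted recursively."""
    mean_u = float(np.nanmean(R[u]))
    sims = [(pearson(R[u], R[n]), n) for n in range(len(R))
            if n != u and n not in exclude]
    neighbors = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    num = den = 0.0
    for sim, n in neighbors:
        if not np.isnan(R[n, i]):
            r_ni = R[n, i]                                   # direct rating
        elif depth > 0:
            # Neighbor n has not rated i: predict it, excluding u to avoid cycles.
            r_ni = predict(R, n, i, depth - 1, k, exclude | {u})
        else:
            continue                                         # recursion budget used up
        num += sim * (r_ni - float(np.nanmean(R[n])))
        den += sim
    return mean_u + num / den if den > 0 else mean_u

# Hypothetical ratings; the target item is the last column.
R = np.array([
    [5, 3, 4, 4, np.nan],   # u: active user
    [4, 2, 3, 3, np.nan],   # n: very close neighbor, has not rated the target
    [3, 1, 2, 3, 3],        # a more distant user who has rated the target
], dtype=float)

with_recursion = predict(R, u=0, i=4, depth=1)
without_recursion = predict(R, u=0, i=4, depth=0)
print(round(with_recursion, 2), round(without_recursion, 2))
```

With `depth=0` the closest neighbor contributes nothing and the sketch falls back to the active user's average; with `depth=1` the neighbor's predicted rating is used instead.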

Example algorithms for sparse datasets

Graph-based methods: Spreading activation

Idea: Use paths of lengths 3 and 5 to recommend items
Length 3: Recommend Item3 to User1
Length 5: Item1 also recommendable
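
The path-length idea can be sketched as a breadth-first expansion over the bipartite user-item graph. The figure from the slide is not available in this text, so the `likes` graph below is a hypothetical one chosen to be consistent with the statements above. The sketch only finds reachable items by path length; it does not model the activation weights of full spreading activation.

```python
def recommend_by_paths(likes, user, max_len=3):
    """Return {item: shortest path length} for new items reachable from `user`.

    likes: dict mapping each user to the set of items that user is linked to.
    Paths alternate user -> item -> user -> item, so lengths are 1, 3, 5, ...
    """
    # Invert the graph so we can walk from items back to users.
    item_users = {}
    for u, items in likes.items():
        for item in items:
            item_users.setdefault(item, set()).add(u)

    found = {}
    seen_users, seen_items = {user}, set()
    frontier = {user}               # users reached at the current depth
    length = 1                      # a single user-item edge has length 1
    while frontier and length <= max_len:
        items = {i for u in frontier for i in likes[u]} - seen_items
        for item in items:
            if item not in likes[user]:
                found[item] = length            # new item for this user
        seen_items |= items
        frontier = {u for i in items for u in item_users[i]} - seen_users
        seen_users |= frontier
        length += 2                 # item -> user -> item adds two edges
    return found

# Hypothetical user-item graph.
likes = {
    "User1": {"Item2", "Item4"},
    "User2": {"Item2", "Item3", "Item4"},
    "User3": {"Item1", "Item3"},
}

print(recommend_by_paths(likes, "User1", max_len=3))
print(recommend_by_paths(likes, "User1", max_len=5))
```

In this assumed graph, Item3 is reached via User1, Item2, User2, Item3 (length 3), and Item1 only becomes reachable when paths of length 5 are allowed, which is how longer paths help on sparse data.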

Matrix Factorization Methods

Are shown to be superior to the classic nearest-neighbor techniques for product recommendations
Allow the incorporation of additional information such as implicit
feedback, temporal effects, and confidence levels

User-oriented neighborhood method

Latent Factor Approach

Matrix Factorization Methods

Basic Idea
Both users and items are characterized by vectors of factors, inferred
from item rating patterns
High correspondence between item and user factors leads to a
recommendation.

Using Singular Value Decomposition

Let M be the matrix of user-item interactions

Use SVD to get a rank-k approximation

\[
M_k = U_k \Sigma_k V_k^T
\]

Prediction:

\[
\hat{r}_{ui} = \bar{r}_u + U_k(u) \, \Sigma_k \, V_k^T(i)
\]
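
A minimal NumPy sketch of this prediction rule follows. Plain SVD needs a complete matrix, so the sketch assumes that M holds ratings centered on each user's average, with missing entries set to 0 after centering; this imputation is one simple option and is an assumption, as are the matrix `R` and the name `svd_predict`.

```python
import numpy as np

def svd_predict(R, k):
    """Predict all ratings from a rank-k SVD of the user-centered rating matrix."""
    user_means = np.nanmean(R, axis=1, keepdims=True)        # average rating per user
    # M: centered ratings, with 0 (i.e. "average") imputed for missing entries.
    M = np.where(np.isnan(R), 0.0, R - user_means)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]              # rank-k approximation
    return user_means + M_k                                  # add the averages back

# Hypothetical ratings (rows: users, columns: items); np.nan marks a missing rating.
R = np.array([
    [5, 3, 4, 4, np.nan],
    [3, 1, 2, 3, 3],
    [4, 3, 4, 3, 5],
    [3, 3, 1, 5, 4],
    [1, 5, 5, 2, 1],
], dtype=float)

pred = svd_predict(R, k=2)
print(round(float(pred[0, 4]), 2))    # predicted rating of user 0 for the last item
```

Smaller k gives a smoother, more generalizing approximation; with k equal to the full rank the known ratings are reproduced exactly and the missing entry simply returns the user's average.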

SVD: Example

Using Singular Value Decomposition