
Module 5, Part B

Examples for Item-based and Model-based Approaches
Application Domains of Recommender Systems

 Which movie should I watch?
 Which digital camera should I buy?
 Which news article will I find interesting?
 Toward which degree should I study?
 Which is the best investment for my retirement money?

Paradigms of Recommender Systems (Recall from Part A)

Personalized recommendations

 Demographic Recommendation
  - Offer Backstreet Boys albums only to girls under 16
  - Offer cameras with an American electricity plug to people from the US

 Contextual Recommendation (location / time of day / time of year)
  - Send a coupon to a mobile user who passes by a shop (Foursquare)
  - Show holiday-related advertisements based on the user's location
Paradigms of Recommender Systems (Recall from Part A)

Collaborative: "Tell me what's popular among my peers"

User–Item Rating Matrix

          Item1   Item2
  Alice     5       ?
  User1     2       1
  User2     4       3
Paradigms of Recommender Systems (Recall from Part A)

Content-based: "Show me more of what I've liked"

Paradigms of Recommender Systems (Recall from Part A)

Hybrid: Combinations of various inputs and/or composition of different mechanisms
When does a Recommender do a good Job?

1. User's Perspective
 Recommend items that I like and did not know about
 Serendipity: the accident of finding something good while not specifically searching for it
 In practice this often means recommending items from the long tail

2. Merchant's Perspective
 Increase the sale of high-revenue items
 Thus real-world recommender systems are not as neutral as the following slides suggest
4. Collaborative Filtering (CF)

 The most prominent approach to generate recommendations


 used by large e-commerce sites
 applicable in many domains (books, movies, DVDs, ...)

 Approach
 use the "wisdom of the crowd" to recommend items

 Basic Assumptions
1. Users give ratings to catalog items (implicitly or explicitly)
2. Customers who had similar tastes in the past,
will have similar tastes in the future

 Input: Matrix of given user–item ratings


 Output types
1. (Numerical) prediction indicating to what degree the current user will
like or dislike a certain item
2. Top-K list of recommended items
User-Based Nearest-Neighbor Collaborative Filtering

 Given an "active user" (Alice) and an item i not yet rated by Alice:
  1. find a set of users (peers/nearest neighbors) who liked the same items as Alice in the past and who have rated item i
  2. use their ratings of item i to predict whether Alice will like item i
  3. do this for all items Alice has not seen and recommend the best-rated ones

 Example: User–Item Rating Matrix (note: >99% of real-world values are NULL)

          Item1   Item2   Item3   Item4   Item5
  Alice     5       3       4       4       ?
  User1     3       1       2       3       3
  User2     4       3       4       3       5
  User3     3       3       1       5       4
  User4     1       5       5       2       1
User-Based Nearest-Neighbor Collaborative Filtering

 Some questions we need to answer


1. How do we measure user similarity?
2. How many neighbors should we consider?
3. How do we generate a prediction from the neighbors' ratings?

          Item1   Item2   Item3   Item4   Item5
  Alice     5       3       4       4       ?
  User1     3       1       2       3       3
  User2     4       3       4       3       5
  User3     3       3       1       5       4
  User4     1       5       5       2       1
4.1 Measuring User Similarity

 A popular similarity measure in user-based CF is the


Pearson Correlation Coefficient
   a, b : users
   r_{a,p} : rating of user a for item p
   P : set of items rated by both a and b

   sim(a, b) = \frac{\sum_{p \in P} (r_{a,p} - \bar{r}_a)(r_{b,p} - \bar{r}_b)}{\sqrt{\sum_{p \in P} (r_{a,p} - \bar{r}_a)^2} \, \sqrt{\sum_{p \in P} (r_{b,p} - \bar{r}_b)^2}}

 Takes different usage of rating scale into account


by comparing individual ratings to the user’s average rating
 Value range [-1,1]
1 : positive correlation
0 : no (linear) correlation
-1 : negative correlation
Example: Pearson Correlation

 A popular similarity measure in user-based CF is the


Pearson Correlation Coefficient

          Item1   Item2   Item3   Item4   Item5
  Alice     5       3       4       4       ?
  User1     3       1       2       3       3     sim = 0.85
  User2     4       3       4       3       5     sim = 0.70
  User3     3       3       1       5       4     sim = 0.00
  User4     1       5       5       2       1     sim = -0.79
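
As an illustration (not part of the original slides), the similarity values above can be reproduced with a few lines of Python. The exact numbers depend on whether each user's mean is taken over the co-rated items only or over all of the user's ratings; the sketch below uses the co-rated items and matches the slide values up to rounding.

```python
# Minimal sketch: Pearson correlation between Alice and each other user,
# using the example rating matrix above (means over co-rated items).
from math import sqrt

ratings = {
    "Alice": {1: 5, 2: 3, 3: 4, 4: 4},
    "User1": {1: 3, 2: 1, 3: 2, 4: 3, 5: 3},
    "User2": {1: 4, 2: 3, 3: 4, 4: 3, 5: 5},
    "User3": {1: 3, 2: 3, 3: 1, 4: 5, 5: 4},
    "User4": {1: 1, 2: 5, 3: 5, 4: 2, 5: 1},
}

def pearson(a, b):
    """Pearson correlation over the items rated by both users a and b."""
    common = set(ratings[a]) & set(ratings[b])
    mean_a = sum(ratings[a][p] for p in common) / len(common)
    mean_b = sum(ratings[b][p] for p in common) / len(common)
    num = sum((ratings[a][p] - mean_a) * (ratings[b][p] - mean_b) for p in common)
    den = sqrt(sum((ratings[a][p] - mean_a) ** 2 for p in common)) * \
          sqrt(sum((ratings[b][p] - mean_b) ** 2 for p in common))
    return num / den if den else 0.0

for user in ("User1", "User2", "User3", "User4"):
    print(user, round(pearson("Alice", user), 2))
# approx. output: User1 0.85, User2 0.71, User3 0.0, User4 -0.79
```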
Pearson Correlation

 Takes differences in rating behavior into account.


 Some people always give higher ratings than others.

[Chart: ratings of Alice, User1, and User4 for Item1–Item4, illustrating that the users use the rating scale differently]

 Empirical studies show that Pearson Correlation often works


better than alternative measures such as cosine similarity
Making Predictions

1. A simple prediction function:

   pred(a, p) = \frac{\sum_{b \in N} sim(a, b) \cdot r_{b,p}}{\sum_{b \in N} sim(a, b)}

 Uses the similarity with a as a weight to combine the neighbors' ratings

2. A prediction function that takes rating behavior into account:

   pred(a, p) = \bar{r}_a + \frac{\sum_{b \in N} sim(a, b) \cdot (r_{b,p} - \bar{r}_b)}{\sum_{b \in N} sim(a, b)}

 Calculates whether the neighbors' ratings for the unseen item p are higher or lower than their average
 Uses the similarity with a as a weight to combine the rating differences
 Adds/subtracts the neighbors' bias to/from the active user's average and uses this as the prediction
In the given example, Alice is most similar to User1 and User2, so their ratings for Item5 are used to predict Alice's rating for Item5 (a worked computation is sketched in the code below).

Given these calculation schemes, we can now compute rating predictions for Alice for all items she has not yet seen and include the ones with the highest prediction values in the recommendation list. In the example, it will most probably be a good choice to include Item5 in such a list.
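
A minimal sketch of the mean-centered prediction (not from the slides), using User1 and User2 as the neighborhood and each neighbor's average over all of their own ratings; with the similarity values from above, the predicted rating for Item5 comes out at roughly 4.87.

```python
# Minimal sketch: mean-centered prediction of Alice's rating for Item5,
# with N = {User1, User2} and the (approximate) Pearson similarities above.
ratings = {
    "Alice": {1: 5, 2: 3, 3: 4, 4: 4},
    "User1": {1: 3, 2: 1, 3: 2, 4: 3, 5: 3},
    "User2": {1: 4, 2: 3, 3: 4, 4: 3, 5: 5},
}
sims = {"User1": 0.85, "User2": 0.70}
target = 5

def mean(user):
    return sum(ratings[user].values()) / len(ratings[user])

num = sum(sims[b] * (ratings[b][target] - mean(b)) for b in sims)
den = sum(sims.values())
prediction = mean("Alice") + num / den
print(round(prediction, 2))  # roughly 4.87
```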
Improving the Metrics / Prediction Function

 Neighborhood Selection
 Use fixed number of neighbors or similarity threshold

 Case Amplification
  - Intuition: give more weight to "very similar" neighbors, i.e., where the similarity value is close to 1
  - Implementation: use sim(a, b)^2 as the weight

 Rating Variance
  - Agreement on commonly liked items is not as informative as agreement on controversial items
  - Possible solution: give more weight to items that have a higher variance
Memory-based and Model-based Approaches

 User-based CF is said to be "memory-based"


 the rating matrix is directly used to find neighbors / make predictions
 does not scale for most real-world scenarios as large e-commerce
sites have tens of millions of customers and 10,000s of items

 Model-based approaches
 employ offline model-learning
 at run-time, the learned model is used to make predictions
 models are updated / re-trained periodically
 A large variety of techniques is used, for example:
1. Item-based Collaborative Filtering
2. Association Rules
3. Probabilistic Methods
4. Matrix Factorization Techniques
Item-based Collaborative Filtering

 Basic idea:
 Use the similarity between items (and not users) to make predictions

 Approach:
1. Look for items that have been rated similarly to Item5
2. Take Alice's ratings for these items to predict the rating for Item5

          Item1   Item2   Item3   Item4   Item5
  Alice     5       3       4       4       ?
  User1     3       1       2       3       3
  User2     4       3       4       3       5
  User3     3       3       1       5       4
  User4     1       5       5       2       1
Calculating Item-to-Item Similarity

 Cosine Similarity
  - Often produces better results than Pearson correlation for calculating the item-to-item similarity

   sim(a, b) = \frac{a \cdot b}{|a| \cdot |b|}

 Adjusted Cosine Similarity


 adjusts ratings by taking the average rating behavior of a user into
account
 U: set of users who have rated both items a and b

   sim(a, b) = \frac{\sum_{u \in U} (r_{u,a} - \bar{r}_u)(r_{u,b} - \bar{r}_u)}{\sqrt{\sum_{u \in U} (r_{u,a} - \bar{r}_u)^2} \, \sqrt{\sum_{u \in U} (r_{u,b} - \bar{r}_u)^2}}
Computing the cosine similarity between Item5 and Item1 over the users who rated both (User1–User4) yields a value of roughly 0.99; the computation is sketched in the code below.
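
A minimal sketch of this computation (not from the slides); the ratings of User1–User4 are taken from the example matrix, and Alice is left out because her Item5 rating is unknown.

```python
# Minimal sketch: plain cosine similarity between the rating vectors of
# Item1 and Item5 over the users who rated both (User1..User4).
from math import sqrt

item1 = [3, 4, 3, 1]  # ratings of User1..User4 for Item1
item5 = [3, 5, 4, 1]  # ratings of User1..User4 for Item5

dot = sum(a * b for a, b in zip(item1, item5))
norm = sqrt(sum(a * a for a in item1)) * sqrt(sum(b * b for b in item5))
print(round(dot / norm, 2))  # roughly 0.99
```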
Making Predictions

 A common prediction function for item-based CF:

   pred(u, p) = \frac{\sum_{i \in ratedItems(u)} sim(i, p) \cdot r_{u,i}}{\sum_{i \in ratedItems(u)} sim(i, p)}

   ratedItems(u) : set of items rated by user u (Alice in the example)
   r_{u,i} : rating of user u for item i
   sim(i, p) : similarity of item i with the target item p
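
A minimal sketch of this prediction function applied to the example (not from the slides); it uses plain cosine similarities as item-to-item weights, so the resulting value is an illustration rather than the slide's own number.

```python
# Minimal sketch: item-based prediction of Alice's rating for Item5, weighting
# her ratings for Items 1-4 by plain cosine similarity to Item5.
from math import sqrt

item_vectors = {  # ratings of User1..User4 for each item
    1: [3, 4, 3, 1],
    2: [1, 3, 3, 5],
    3: [2, 4, 1, 5],
    4: [3, 3, 5, 2],
    5: [3, 5, 4, 1],
}
alice = {1: 5, 2: 3, 3: 4, 4: 4}  # Alice's known ratings
target = 5

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

sims = {i: cosine(item_vectors[i], item_vectors[target]) for i in alice}
pred = sum(sims[i] * alice[i] for i in alice) / sum(sims.values())
print(round(pred, 2))  # roughly 4.08 with these similarity values
```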
Pre-Processing for Item-Based Filtering

 Item-based filtering does not solve the scalability problem by itself, but as there are usually fewer items than users, we can pre-calculate the item similarities and store them in memory.

 Neighborhood size is typically also limited to a specific size
  - An analysis of the MovieLens dataset indicates that a neighborhood size of 20 to 50 items is reasonable (Herlocker et al. 2002)
  - Not all neighbors are taken into account for the prediction, as Alice most likely only rated a small subset of the neighbors

 Memory requirements
  - Up to N^2 pair-wise similarities need to be stored in theory (N = number of items)
  - In practice, the memory requirements are significantly lower as
    - many items have no co-ratings (heavy metal and samba CDs)
    - the neighborhood is often limited to items above a minimum similarity threshold
MODEL BASED APPROACHES
Recap: Association Rule Mining
 Commonly used for shopping basket analysis
 aims at detection of rules such as "If a customer purchases beer
then he also buys diapers in 70% of the cases"

 Association rule mining algorithms


 detect rules of the form X → Y (e.g., beer → diapers)
from a set of sales transactions D = {t1, t2, … tn}
 Two step rule mining process
1. determine frequent item sets
2. derive rules from the frequent item sets
 Measures of rule quality
  - used, e.g., as thresholds to cut off unimportant rules
  - Support count: \sigma(X) = |\{t_i \mid X \subseteq t_i, t_i \in D\}|
  - Support = \frac{\sigma(X \cup Y)}{|D|}
  - Confidence = \frac{\sigma(X \cup Y)}{\sigma(X)}
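
A minimal sketch of these measures (not from the slides); the transaction database is invented for illustration.

```python
# Minimal sketch: support and confidence of a rule X -> Y over a toy
# transaction database (the transactions are made up for illustration).
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "bread"},
    {"milk", "diapers"},
]

def sigma(itemset):
    """Support count: number of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t)

X, Y = {"beer"}, {"diapers"}
support = sigma(X | Y) / len(transactions)   # sigma(X u Y) / |D|
confidence = sigma(X | Y) / sigma(X)         # sigma(X u Y) / sigma(X)
print(support, confidence)                   # 0.5 and roughly 0.67 here
```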
Un-Personalized Recommendation

 Recommend items that co-occur with the current item (e.g., the book being viewed) in frequent item sets, independent of the individual user.
Personalized Recommendation using Association Rules

 Simplest approach
  - transform 5-point ratings into binary ratings (1 = above user average)

          Item1   Item2   Item3   Item4   Item5
  Alice     1       0       0       0       ?
  User1     1       0       1       0       1
  User2     1       0       1       0       1
  User3     0       0       0       1       1
  User4     0       1       1       0       0

 Mine rules such as
  - Item1 → Item5 with support = 2/4 and confidence = 2/2 (computed without Alice)

 Make recommendations for Alice (basic method)


1. determine "relevant" rules based on Alice's transactions/ratings
(the above rule will be relevant as Alice bought/rated Item1)
2. determine items not already bought/rated by Alice
3. sort the items based on the rules' confidence values
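
A minimal sketch (not from the slides) that applies the support/confidence definitions to the binarized matrix above and checks the Item1 → Item5 rule without Alice.

```python
# Minimal sketch: support and confidence of Item1 -> Item5 on the binarized
# ratings of User1..User4 (Alice is excluded, as on the slide).
binary = {
    "User1": {"Item1", "Item3", "Item5"},
    "User2": {"Item1", "Item3", "Item5"},
    "User3": {"Item4", "Item5"},
    "User4": {"Item2", "Item3"},
}

def sigma(itemset):
    return sum(1 for items in binary.values() if itemset <= items)

X, Y = {"Item1"}, {"Item5"}
print(sigma(X | Y) / len(binary))  # support    = 2/4 = 0.5
print(sigma(X | Y) / sigma(X))     # confidence = 2/2 = 1.0
```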
Probabilistic Methods
 Basic idea:
 given the user/item rating matrix
 determine the probability that Alice will give item i a specific rating

 Calculation of rating probabilities based on Bayes Theorem


 Given Alice's previous ratings, how probable is it that she rates Item5
with the rating value 1?
 Corresponds to conditional probability P(Item5=1 | X), where
X = Alice's previous ratings = (Item1 =1, Item2=3, Item3= … )
 Can be estimated using Bayes' theorem and independence assumption
   Bayes' theorem:  P(Y \mid X) = \frac{P(X \mid Y) \times P(Y)}{P(X)}

   With the independence assumption:  P(Y \mid X) = \frac{\prod_{i=1}^{d} P(X_i \mid Y) \times P(Y)}{P(X)}

   where Y = (Item5 = 1), P(Y) is the prior probability of Y without seeing any evidence, and P(X) is the probability of seeing the evidence X. (See: IE500 Data Mining, Chapter 3)
• As P(X) is a constant value, we can omit it in our calculations. P(Y) can be estimated for each rating value based on the ratings database: P(Item5=1) = 2/4 (as two of the four ratings for Item5 have the value 1), P(Item5=2) = 0, and so forth. What remains is the calculation of the class-conditional probabilities P(Xi | Y):
Estimation of the Probabilities

          Item1   Item2   Item3   Item4   Item5
  Alice     1       3       3       2       ?
  User1     2       4       2       2       4
  User2     1       3       3       5       1
  User3     4       5       2       3       3
  User4     1       1       5       2       1

  X = Alice's ratings = (Item1 = 1, Item2 = 3, Item3 = 3, Item4 = 2)

   P(X \mid Item5{=}1) = P(Item1{=}1 \mid Item5{=}1) \times P(Item2{=}3 \mid Item5{=}1) \times P(Item3{=}3 \mid Item5{=}1) \times P(Item4{=}2 \mid Item5{=}1) = \frac{2}{2} \times \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = 0.125

 Based on these calculations, given that P(Item5=1) = 2/4 and omitting the constant factor P(X) in the Bayes classifier, the (unnormalized) posterior probability of a rating value 1 for Item5 is P(Item5 = 1 | X) = 2/4 × 0.125 = 0.0625. In the example ratings database, this value is higher than the corresponding values for all other rating values, which means that the probabilistic rating prediction for Alice will be 1 for Item5.
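
A minimal sketch (not from the slides) that reproduces the calculation above: the class-conditional probabilities, the prior P(Item5 = 1), and the unnormalized posterior.

```python
# Minimal sketch: naive Bayes estimate of P(Item5 = 1 | Alice's ratings),
# reproducing the numbers from the example above.
rows = [            # Item1..Item5 for User1..User4
    [2, 4, 2, 2, 4],
    [1, 3, 3, 5, 1],
    [4, 5, 2, 3, 3],
    [1, 1, 5, 2, 1],
]
alice = [1, 3, 3, 2]   # Alice's ratings for Item1..Item4
target_value = 1       # candidate rating value for Item5

with_target = [r for r in rows if r[4] == target_value]
prior = len(with_target) / len(rows)          # P(Item5 = 1) = 2/4

likelihood = 1.0
for idx, alice_rating in enumerate(alice):    # product of P(Item_i = x | Item5 = 1)
    matches = sum(1 for r in with_target if r[idx] == alice_rating)
    likelihood *= matches / len(with_target)

print(likelihood)           # 0.125
print(prior * likelihood)   # 0.0625 (up to the omitted constant 1/P(X))
```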
More on Ratings: Explicit Ratings
 Explicit ratings are probably the most precise ratings.
 Commonly used response scales:
 1 to 5 Likert scales
 Like (sometimes also Dislike)

 Main problems
 Users often not willing to rate items
- number of ratings likely to be too small
→ poor recommendation quality
 How to stimulate users to rate more items?
- Example: Amazon Betterizer

 Alternative
 Use implicit ratings
(in addition to explicit ones)
More on Ratings: Implicit Ratings
 Events potentially interpretable as positive ratings
 items bought
 clicks, page views
 time spent on some page
 demo downloads …

 Advantage
 implicit ratings can be collected constantly by the web site
or application in which the recommender system is embedded
 collection of ratings does not require additional effort from the user

 Problem
 one cannot be sure whether the user behavior is correctly interpreted
 for example, a user might not like all the books he or she has bought;
the user also might have bought a book for someone else
Collaborative Filtering Discussion
 Pros:
 works well in some domains (e.g., books, movies), but likely not in others (e.g., life insurance)
 requires no explicit item descriptions or demographic user profiles

 Cons:
 requires user community to give enough ratings
(many real-world systems thus employ implicit ratings)
 no exploitation of other sources of recommendation knowledge
(demographic data, item descriptions)
 Cold Start Problem
- How to recommend new items?
- What to recommend to new users?
 Approaches for dealing with the Cold Start Problem
- Ask/force users to rate a set of items
- Use another method or combination of methods (e.g., content-based,
demographic or simply non-personalized) until enough ratings are
collected
4.2 Content-based Recommendation

 While collaborative filtering methods do not use any information about the items themselves, it might be reasonable to exploit such information
  - e.g., recommend fantasy novels to people who liked fantasy novels in the past

 What do we need:
 information about the available items (content)
 some sort of user profile describing what the user likes (user preferences)

 The tasks:
1. learn user preferences from what she has bought/seen before
2. recommend items that are "similar" to the user preferences

"show me more
of the same
what I've liked"
Content and User Profile Representation

 Content Representation (attributes of catalog items)
  - The Night of the Gun: Genre: Memoir; Author: David Carr; Type: Paperback; Price: 29.90; Keywords: press and journalism, drug addiction, personal memoirs, New York
  - The Lace Reader: Genre: Fiction, Mystery; Author: Brunonia Barry; Type: Hardcover; Price: 49.90; Keywords: American contemporary fiction, detective, historical
  - Into the Fire: Genre: Romance, Suspense; Author: Suzanne Brockmann; Type: Hardcover; Price: 45.90; Keywords: American fiction, murder, neo-nazism

 User Profile
  - Titles: …; Genres: Fiction, Mystery; Authors: Brunonia Barry, Ken Follett; Types: Paperback; Avg. Price: 25.65; Keywords: detective, murder, New York

 Use attribute-specific similarity measures and weights to compare items against the user profile (a small sketch follows below)
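
A minimal sketch of such an attribute-wise comparison (not from the slides); the overlap measure and the weights are invented for illustration.

```python
# Minimal sketch: attribute-specific similarities (Jaccard overlap) combined
# with invented weights to score one catalog item against the user profile.
def overlap(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

book = {
    "genres":   {"Fiction", "Mystery"},
    "authors":  {"Brunonia Barry"},
    "keywords": {"American contemporary fiction", "detective", "historical"},
}
profile = {
    "genres":   {"Fiction", "Mystery"},
    "authors":  {"Brunonia Barry", "Ken Follett"},
    "keywords": {"detective", "murder", "New York"},
}
weights = {"genres": 0.5, "authors": 0.3, "keywords": 0.2}  # invented weights

score = sum(w * overlap(book[attr], profile[attr]) for attr, w in weights.items())
print(round(score, 2))  # roughly 0.69 for "The Lace Reader" vs. this profile
```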


Recommending Text Documents

 Content-based recommendation techniques are often applied to recommend text documents, like news articles or blog posts
 Documents and user profiles are represented as term vectors:

  Document Corpus (term counts per document)

               Doc 1   Doc 2   Doc 3
   Antony        157      73       0
   Brutus          4     157       0
   Caesar        232     227       0
   Calpurnia       0      10     123
   Cleopatra      17       0      52
   mercy           1       0      43

  User Profile (term counts in documents the user liked)

               Liked   Liked   Liked
               Doc 1   Doc 2   Doc 3
   Antony         0       1       0
   Brutus         2       2       0
   Caesar         4       3       0
   Calpurnia    233      99      34
   Cleopatra     57      12       0
   mercy         22      23      90
Similarity of Text Documents

 Challenges
  - term vectors are very sparse
  - not every word has the same importance
  - long documents have a higher chance to overlap with the user profile

 Methods for handling these challenges
  - Similarity metric: cosine similarity
  - Preprocessing: remove stop words
  - Vector creation: Term Frequency - Inverse Document Frequency (TF-IDF)
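
A minimal sketch of TF-IDF weighting plus cosine similarity (not from the slides); the term counts are a reduced toy version of the corpus above, and the profile counts are invented.

```python
# Minimal sketch: TF-IDF vectors and cosine similarity between each document
# and a user-profile vector (toy counts, reduced from the table above).
from math import log, sqrt

docs = {
    "doc1": {"antony": 157, "brutus": 4,   "caesar": 232},
    "doc2": {"antony": 73,  "brutus": 157, "caesar": 227},
    "doc3": {"calpurnia": 123, "cleopatra": 52, "mercy": 43},
}
profile = {"caesar": 4, "brutus": 2, "mercy": 1}   # invented liked-term counts

vocab = sorted({t for d in docs.values() for t in d} | set(profile))

def idf(term):
    df = sum(1 for d in docs.values() if term in d)
    return log(len(docs) / df) if df else 0.0

def tfidf(counts):
    return [counts.get(t, 0) * idf(t) for t in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu, nv = sqrt(sum(a * a for a in u)), sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

p = tfidf(profile)
for name, counts in docs.items():
    print(name, round(cosine(tfidf(counts), p), 2))  # recommend the highest
```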
Recommending Documents

 Given a set of documents already rated by the user


 either explicitly via user interface
 or implicitly by monitoring user behavior

1. Find the n nearest neighbors of a not-yet-seen item i in the document corpus D
 - measure the similarity of item i with its neighbors using cosine similarity

2. Use Alice's ratings for the neighbors to predict a rating for item i
 - Find the 5 most similar items to i
 - 4 of these items were liked by Alice → item i will probably also be liked by Alice

 Variations:
 Varying neighborhood size k
 upper similarity threshold to prevent system from recommending too similar
texts (variations of texts the user has already seen)

 Good to model short-term interests / follow-up stories


 Often used in combination with method to model long-term preferences
 E.g. ‘Semantic enrichment’ by assigning interests to each page/product.
SAMPLE SOLVED EXAMPLE OF MATRIX FACTORIZATION

Assume that the factored matrices have only 2 features, F1 and F2.

User Matrix: according to Ryan, if it is a Marvel movie he will give it 3 points, and if he is in the movie he will give it 2 more points (typical Ryan!).

Item Matrix: contains binary values, where the value is 1 if the feature condition mentioned above is satisfied and 0 otherwise.

By performing the dot product of the user matrix and the item matrix, Infinity War gets a 3 and Deadpool gets a 5.
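
A minimal sketch of that dot product (not from the slides' figure), with F1 = "is a Marvel movie" and F2 = "Ryan stars in it".

```python
# Minimal sketch: the Ryan example as a dot product of latent-feature vectors.
ryan = [3, 2]                    # +3 if it is a Marvel movie, +2 if Ryan is in it
items = {
    "Infinity War": [1, 0],      # Marvel movie, Ryan not in it
    "Deadpool":     [1, 1],      # Marvel movie, Ryan in it
}

for title, factors in items.items():
    score = sum(u * v for u, v in zip(ryan, factors))
    print(title, score)          # Infinity War 3, Deadpool 5
```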
Content-based Filtering Discussion

 Pros:
 In contrast to collaborative approaches, content-based techniques do not require a user community in order to work
 No problems with recommending new items

 Cons:
 Requires learning a suitable model of the user's preferences based on explicit or implicit feedback
  - deriving implicit feedback from user behavior can be problematic
  - a ramp-up phase is required (users need to view/rate some items)
  - Web 2.0: using other sources to learn the user preferences might be an option (e.g., share your Facebook profile with the e-shop)
 Overspecialization
- Algorithms tend to propose "more of the same"
- Recommendations might be boring as items are too similar
4.3 Hybrid Recommender Systems

Hybrid: Combinations of various inputs and/or composition of different mechanisms in order to overcome the problems of single methods.

Demographic: “Offer American plugs to people from the US“


Collaborative: "Tell me what's popular among my peers"
Content-based: "Show me more of the same what I've liked"
Parallelized Hybridization Design

 Output of several existing recommenders is combined


 Least invasive design
 Requires some weighting or voting scheme
 weights can be learned using existing ratings as supervision
 dynamic weighting: Adjust weights or switch between different recommenders
as more information about users and items becomes available
- e.g., if too few ratings are available, use content-based recommendation, otherwise use collaborative filtering
Parallelized Hybridization Design: Weighted

 Compute the weighted sum:  rec_{weighted}(u, i) = \sum_{k=1}^{n} \beta_k \cdot rec_k(u, i)

  Recommender 1 (score, rank)      Recommender 2 (score, rank)
   Item1   0.5   1                  Item1   0.8   2
   Item2   0.0                      Item2   0.9   1
   Item3   0.3   2                  Item3   0.4   3
   Item4   0.1   3                  Item4   0.0
   Item5   0.0                      Item5   0.0

  Recommender weighted (0.5 : 0.5)
   Item1   0.65   1
   Item2   0.45   2
   Item3   0.35   3
   Item4   0.05   4
   Item5   0.00
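
A minimal sketch (not from the slides) that reproduces the 0.5 : 0.5 combination above.

```python
# Minimal sketch: weighted combination of two recommenders' scores
# with beta_1 = beta_2 = 0.5.
rec1 = {"Item1": 0.5, "Item2": 0.0, "Item3": 0.3, "Item4": 0.1, "Item5": 0.0}
rec2 = {"Item1": 0.8, "Item2": 0.9, "Item3": 0.4, "Item4": 0.0, "Item5": 0.0}
betas = (0.5, 0.5)

combined = {item: betas[0] * rec1[item] + betas[1] * rec2[item] for item in rec1}
for item, score in sorted(combined.items(), key=lambda x: -x[1]):
    print(item, round(score, 2))
# Item1 0.65, Item2 0.45, Item3 0.35, Item4 0.05, Item5 0.0
```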
Adjustment of Weights

 Use existing ratings to learn individual weights for each user


 Compare prediction of recommenders with actual ratings by user
 For each user adapt weights to minimize Mean Absolute Error (MAE)

Absolute errors and MAE

  Weight1  Weight2  Item   rec1   rec2   error   MAE
   0.1      0.9     Item1  0.5    0.8    0.23    0.61
                    Item4  0.1    0.0    0.99
   0.3      0.7     Item1  0.5    0.8    0.29    0.63
                    Item4  0.1    0.0    0.97
   0.5      0.5     Item1  0.5    0.8    0.35    0.65
                    Item4  0.1    0.0    0.95
   0.7      0.3     Item1  0.5    0.8    0.41    0.67
                    Item4  0.1    0.0    0.93
   0.9      0.1     Item1  0.5    0.8    0.47    0.69
                    Item4  0.1    0.0    0.91

  MAE = \frac{\sum_{r_i \in R} \left| \sum_{k=1}^{n} \beta_k \cdot rec_k(u, i) - r_i \right|}{|R|}

  The MAE improves as rec2 is weighted more strongly.
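
A minimal sketch (not from the slides) that reproduces the MAE column; the true ratings of 1.0 for Item1 and Item4 are not stated on the slide and are inferred from the listed error values.

```python
# Minimal sketch: MAE of the weighted hybrid for different weight pairs.
# The actual ratings of 1.0 are inferred from the error values in the table.
rec1 = {"Item1": 0.5, "Item4": 0.1}
rec2 = {"Item1": 0.8, "Item4": 0.0}
actual = {"Item1": 1.0, "Item4": 1.0}

for w1 in (0.1, 0.3, 0.5, 0.7, 0.9):
    w2 = 1.0 - w1
    errors = [abs(w1 * rec1[i] + w2 * rec2[i] - actual[i]) for i in actual]
    print(w1, round(sum(errors) / len(errors), 2))
# MAE: 0.61, 0.63, 0.65, 0.67, 0.69 -> weighting rec2 more strongly is better here
```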
Monolithic Hybridization Design

 Features/knowledge sources of different paradigms are combined in a


single recommendation component. E.g.:
 Ratings and user demographics
 Ratings and content features: user likes many movies that are comedies

 Example: Content-boosted Collaborative Filtering


 additional ratings are created based on content features
 e.g., Alice likes Items 1 and 3 (unary ratings)
  - Item7 is similar to Items 1 and 3 to a degree of 0.75
  - thus Alice is assumed to like Item7 with a rating of 0.75
 rating matrix becomes less sparse
 see [Prem Melville, et al. 2002]
4.4 Evaluating Recommender Systems

Question: Does a recommender system perform well with respect to specific criteria such as accuracy, serendipity, online conversion, response time, or ramp-up effort?

 So we need to determine the criteria that matter to us


 Popular Measures for Accuracy
 If items are rated on a Likert scale (1 to 5)
- MAE (Mean Absolute Error), RMSE (Root Mean Squared Error)
 If items are classified as good or bad
- Precision / Recall / F1-Score
 If items are presented as ranked Top-K list
- Lift Index, Normalized Discounted Cumulative Gain

 Methodologies for measuring Accuracy


 Split-Validation, Cross-Validation
Evaluation Methodology

 Setting to ensure internal validity:
  - One randomly selected share of the known ratings (training set) is used as input to train the algorithm and build the model
  - The remaining share of withheld ratings (test set) is used as ground truth to evaluate the prediction quality
  - To ensure the reliability of the measurements, the random split, model building, and evaluation steps are repeated several times

  Example (Alice's known ratings):
   Training set: Item1 = 5, Item2 = 1, Item3 = 3, Item4 = 1
   Test set:     Item5 = 4, Item6 = 2

 Split-Validation
  - e.g., 2/3 of the ratings for training, 1/3 for validation

 N-Fold Cross-Validation
  - N disjoint fractions of the known ratings with equal size (1/N) are determined; setting N to 5 or 10 is popular
  - N repetitions of the model building and evaluation steps, where each fraction is used exactly once as the test set while the other fractions are used for training
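
A minimal sketch of an N-fold split (not from the slides); the fold evaluation uses a trivial global-mean predictor as a stand-in for a real recommender.

```python
# Minimal sketch: N-fold cross-validation over known (user, item, rating)
# triples, with a global-mean predictor standing in for a real recommender.
import random

def evaluate_fold(train, test):
    """Placeholder model: predict the global mean rating, return the MAE."""
    mean_rating = sum(r for _, _, r in train) / len(train)
    return sum(abs(mean_rating - r) for _, _, r in test) / len(test)

def cross_validate(known_ratings, n_folds=5, seed=42):
    shuffled = list(known_ratings)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    maes = []
    for k in range(n_folds):
        test = folds[k]
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        maes.append(evaluate_fold(train, test))
    return sum(maes) / n_folds

# toy usage with Alice's six ratings from the example above
data = [("Alice", item, r) for item, r in enumerate([5, 1, 3, 1, 4, 2], start=1)]
print(round(cross_validate(data), 2))
```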
Evaluation of Likert-Scaled Predictions

 Mean Absolute Error (MAE) computes the average deviation between predicted ratings and actual ratings:

   MAE = \frac{1}{n} \sum_{i=1}^{n} |p_i - r_i|

 Root Mean Square Error (RMSE) is similar to MAE, but places more emphasis on larger deviations:

   RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (p_i - r_i)^2}
 Critique
 Not meaningful as inclusion into Top-K list is more important
to the user than overall accuracy of predictions.
 Rather evaluate inclusion into Top-K list as classification
problem (see next slide).
Evaluation of Good/Bad Classifications

 Confusion Matrix

                          Reality
                          Actually Good         Actually Bad
  Prediction   Good       True Positive (tp)    False Positive (fp)
               Bad        False Negative (fn)   True Negative (tn)

 Precision: measure of exactness
  - determines the fraction of relevant items retrieved out of all items retrieved
  - e.g., the fraction of recommended movies that are actually good

 Recall: measure of completeness
  - determines the fraction of relevant items retrieved out of all relevant items
  - e.g., the fraction of all good movies that are recommended

 F1-Measure
  - combines Precision and Recall into a single value for comparison purposes
  - may be used to gain a more balanced view of performance
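
A minimal sketch (not from the slides) computing the three measures from invented confusion-matrix counts.

```python
# Minimal sketch: precision, recall and F1 from confusion-matrix counts
# (the counts are invented for illustration).
tp, fp, fn, tn = 30, 10, 20, 40

precision = tp / (tp + fp)                           # 0.75
recall = tp / (tp + fn)                              # 0.6
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, round(f1, 2))               # 0.75 0.6 0.67
```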
Evaluation of ranked Top-K List

 For a specific user:

  Actually good items:               Item 237, Item 899
  Recommended (predicted as good):   Item 345, Item 237, Item 187
  Hit:                               Item 237

 Rank position also matters!


 Rank metrics extend recall and precision to take the
positions of correct items in a ranked list into account
 Relevant items are more useful when they appear earlier
in the recommendation list
 Particularly important in recommender systems as lower ranked
items may be overlooked by users
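
As an illustration (not from the slides), one common rank-aware metric is the Normalized Discounted Cumulative Gain mentioned earlier; the sketch below applies the standard binary-relevance formulation to the small example above.

```python
# Minimal sketch: NDCG for the example above, where only Item 237 and Item 899
# are actually good and the ranked recommendation list has three entries.
from math import log2

def dcg(gains):
    return sum(g / log2(pos + 1) for pos, g in enumerate(gains, start=1))

recommended = ["Item 345", "Item 237", "Item 187"]
relevant = {"Item 237", "Item 899"}

gains = [1 if item in relevant else 0 for item in recommended]   # [0, 1, 0]
ideal = sorted([1] * len(relevant) + [0] * len(recommended), reverse=True)[:len(recommended)]
ndcg = dcg(gains) / dcg(ideal)
print(round(ndcg, 2))  # roughly 0.39; a hit at rank 1 would score higher
```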
Public Rating Datasets

 MovieLens
  - movie ratings collected via the MovieLens website
  - 1M dataset: 6,000 users, 3,900 movies, 1 million ratings
  - 10M dataset: 71,000 users, 10,600 movies, 10 million ratings

 Netflix
  - provided by the commercial movie rental website for the Netflix competition ($1,000,000 for a 10% better RMSE)
  - 480,000 users rated 18,000 movies, 100 million ratings

 Yahoo Music
  - 600,000 songs, 1 million users, 300 million ratings
  - provided for the KDD Cup 2011

 Web 2.0 Platforms offer plenty of additional rating data


 e.g. LastFM, delicious
