M02 User-Based Collaborative Filtering V02
Lecturer: ZK Abdurahman Baizal
a, b : users
r_{a,p} : rating of user a for item p
P : set of items rated by both a and b
r̄_a, r̄_b : the users' average ratings
sim(a, b) = Σ_{p∈P} (r_{a,p} − r̄_a)(r_{b,p} − r̄_b) / ( √(Σ_{p∈P} (r_{a,p} − r̄_a)²) · √(Σ_{p∈P} (r_{b,p} − r̄_b)²) )
Possible similarity values lie between -1 and 1.
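A minimal Python sketch of this similarity; the dict-of-ratings representation (item → rating) is an assumption for illustration, not from the slides:

```python
from math import sqrt

def pearson_sim(ratings_a, ratings_b):
    """Pearson correlation between two users over the set P of
    items rated by both; each argument maps item -> rating."""
    common = set(ratings_a) & set(ratings_b)  # the set P
    if not common:
        return 0.0
    # r̄_a and r̄_b: each user's average over their own ratings
    avg_a = sum(ratings_a.values()) / len(ratings_a)
    avg_b = sum(ratings_b.values()) / len(ratings_b)
    num = sum((ratings_a[p] - avg_a) * (ratings_b[p] - avg_b) for p in common)
    den = (sqrt(sum((ratings_a[p] - avg_a) ** 2 for p in common))
           * sqrt(sum((ratings_b[p] - avg_b) ** 2 for p in common)))
    return num / den if den else 0.0

# Two users whose ratings differ only by a constant offset correlate perfectly:
alice = {"Item1": 5, "Item2": 3, "Item3": 4, "Item4": 4}
user1 = {"Item1": 4, "Item2": 2, "Item3": 3, "Item4": 3}
print(round(pearson_sim(alice, user1), 4))  # → 1.0
```

Note that the averages are taken over each user's full rating profile, while the sums run only over the co-rated items P.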
Based on these calculations, we observe that User1 and User2 were similar to Alice in their past
rating behavior.
Pearson correlation
• Takes differences in rating behavior into account
[Figure: bar chart comparing the ratings of Alice, User1, and User4 for Item1–Item4]
• Calculate, whether the neighbors' ratings for the unseen item i are
higher or lower than their average
• Combine the rating differences – use the similarity as a weight
• Add/subtract the neighbors' bias from the active user's average and
use this as a prediction
Making predictions
In the example, the prediction for Alice's rating for Item5, based on the ratings of the near
neighbors User1 and User2, is
pred(a, p) = r̄_a + Σ_{b∈N} sim(a, b) · (r_{b,p} − r̄_b) / Σ_{b∈N} sim(a, b)
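The three prediction steps can be sketched as follows; the similarities (0.85, 0.70) and neighbor averages (2.4, 3.8) follow the well-known textbook example and may differ from the exact values on the slides:

```python
def predict(avg_active, neighbors):
    """pred(a, p) = r̄_a + Σ_b sim(a,b)·(r_{b,p} − r̄_b) / Σ_b sim(a,b)
    neighbors: list of (similarity, neighbor_rating_for_p, neighbor_avg)."""
    weighted = sum(sim * (r_bp - avg_b) for sim, r_bp, avg_b in neighbors)
    total_sim = sum(sim for sim, _, _ in neighbors)
    if total_sim == 0:
        return avg_active  # no usable neighbors: fall back to the user's mean
    return avg_active + weighted / total_sim

# Alice (average 4.0) and her two nearest neighbors' ratings for Item5:
neighbors = [(0.85, 3, 2.4), (0.70, 5, 3.8)]
print(round(predict(4.0, neighbors), 2))  # → 4.87
```

Each neighbor contributes the deviation of its Item5 rating from its own mean, weighted by its similarity to Alice; the result is then added to Alice's mean.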
In real-world applications, rating databases are much larger and can comprise thousands or even
millions of users and items, which means that we must think about computational complexity.
In addition, the rating matrix is typically very sparse, meaning that every user will rate only a very
small subset of the available items.
Neighborhood selection
In the example, we intuitively decided not to take all neighbors into account (neighborhood
selection).
For the calculation of the predictions, we included only those that had a positive
correlation with the active user (and, of course, had rated the item for which we are
looking for a prediction).
If we included all users in the neighborhood, this would not only negatively
influence the performance with respect to the required calculation time, but it
would also have an effect on the accuracy of the recommendation, as the ratings of
other users who are not really comparable would be taken into account.
Neighborhood selection
The common techniques for reducing the size of the neighborhood are to define a specific minimum
threshold of user similarity or to limit the size to a fixed number and to take only the k nearest
neighbors into account.
If the similarity threshold is too high, the size of the neighborhood will be very small for many users,
which in turn means that for many items no predictions can be made (reduced coverage).
In contrast, when the threshold is too low, the neighborhood sizes are not significantly reduced.
Neighborhood selection
The value chosen for k – the size of the neighborhood – does not influence
coverage. However, the problem of finding a good value for k still exists:
When the number of neighbors k taken into account is too high, too many
neighbors with limited similarity bring additional “noise” into the predictions.
When k is too small (for example, below 10 in the experiments), the quality of
the predictions may be negatively affected. An analysis of the MovieLens
dataset indicates that “in most real-world situations, a neighborhood of 20 to 50
neighbors seems reasonable”
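Both techniques can be combined in a few lines; the threshold of 0 and the value of k below are illustrative choices, not prescriptions from the slides:

```python
def select_neighbors(sims, k=30, min_sim=0.0):
    """Keep only users whose similarity exceeds min_sim, then
    return the k most similar ones (sims: user -> similarity)."""
    candidates = [(user, s) for user, s in sims.items() if s > min_sim]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:k]

sims = {"User1": 0.85, "User2": 0.70, "User3": -0.79, "User4": 0.00}
print(select_neighbors(sims, k=2))  # → [('User1', 0.85), ('User2', 0.7)]
```

With min_sim = 0, negatively correlated users such as User3 are excluded before the k nearest neighbors are taken.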
Implementation in Python
For homework!!!
Improving the metrics / prediction function
• Not all neighbor ratings might be equally "valuable"
• Agreement on commonly liked items is not as informative as agreement on
controversial items
• Possible solution: Give more weight to items that have a higher variance
• The number of co-rated items also matters
• Use "significance weighting", by e.g., linearly reducing the weight when the number
of co-rated items is low
• Case amplification
• Intuition: Give more weight to "very similar" neighbors, i.e., where the similarity
value is close to 1.
• Neighborhood selection
• Use similarity threshold or fixed number of neighbors
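The first two ideas above can be sketched as separate helpers; gamma = 50 and rho = 2.5 are illustrative parameter choices, not values from the slides:

```python
def significance_weight(sim, n_corated, gamma=50):
    """Significance weighting: linearly damp the similarity when
    fewer than gamma items were co-rated by the two users."""
    return sim * min(n_corated, gamma) / gamma

def case_amplify(sim, rho=2.5):
    """Case amplification: raise |sim| to the power rho (keeping the
    sign) so that similarities close to 1 dominate the prediction."""
    sign = 1.0 if sim >= 0 else -1.0
    return sign * abs(sim) ** rho

print(significance_weight(0.8, 25))        # → 0.4
print(round(case_amplify(0.5, rho=2), 2))  # → 0.25
```

A similarity of 0.8 based on only 25 co-rated items is halved, while a moderate similarity of 0.5 is pushed down by amplification, so near-1 neighbors stand out.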
Memory-based and model-based approaches
• User-based CF is said to be "memory-based"
• the rating matrix is directly used to find neighbors / make predictions
• does not scale for most real-world scenarios
• large e-commerce sites have tens of millions of customers and millions of
items
• Model-based approaches
• based on an offline pre-processing or "model-learning" phase
• at run-time, only the learned model is used to make predictions
• models are updated / re-trained periodically
• large variety of techniques used
• model-building and updating can be computationally expensive
2001: Item-based collaborative filtering recommendation algorithms, B.
Sarwar et al., WWW 2001
• Scalability issues arise with U2U if many more users than items
(m >> n , m = |users|, n = |items|)
• e.g. amazon.com
• Space complexity O(m²) when similarities are pre-computed
• Time complexity for computing Pearson correlations: O(m²·n)
• Cons:
• requires a user community
• sparsity problems
• no integration of other knowledge sources
• no explanation of results