
12 Recsys 1

This document contains the slides for a lecture on Recommender Systems from EE412: Foundation of Big Data Analytics, including course announcements and a recap of SVD. It covers content-based recommendation, collaborative filtering, the long tail phenomenon in online recommendation, user and item profiles, similarity metrics, the pros and cons of each approach, and the Netflix Challenge.


Recommender Systems 1
EE412: Foundation of Big Data Analytics
Fall 2024
Jaemin Yoo

Announcements

• Homeworks:
  • HW2 (due: 11/06)
  • HW3 (will be posted on 11/06; due: 11/27)
• Midterm:
  • Claim: 10/28 (Mon) – 10/29 (Tue), 19:00 – 20:00
  • Location: N1 113
• Classum:
  • Please review the board before asking questions.
  • Aim to ask questions that could benefit other students as well.

Recap

• Singular Value Decomposition (SVD)
• Dimensionality Reduction with SVD
• CUR Decomposition

[Diagram: an m × n matrix M is factored as M = U Σ Vᵀ, where U is m × r, Σ is r × r, and Vᵀ is r × n.]

Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge


Recommender Systems

• Class of applications that predict user responses to options.
• Non-personalized recommendations:
  • Editorial and hand-curated: lists of favorites, essential items, …
  • Simple aggregates: top 10, most popular, recent uploads, …
• Personalized recommendations:
  • Tailored to individual users: Amazon, Netflix, YouTube, …

Personalized Recommender Systems

• Two groups of recommender systems:
  • Content-based: Focus on the profiles (or features) of users/items.
  • Collaborative filtering: Focus on the interactions between users/items.

[Diagram: in the content-based approach, a user who likes an item is recommended a similar item; in collaborative filtering, a user is recommended items liked by users with similar tastes.]

From Scarcity to Abundance

• Physical delivery systems have scarcity of resources.
  • Stores have limited shelf space to show to users.
  • Not possible to tailor the store to individual customers.
• Online stores can make anything available to users.
  • E.g., Amazon offers millions of books.
  • This abundance forces them to focus on personalized recommendation.

The Long Tail

• The distinction is called the long tail phenomenon.
• Online institutions provide the entire range of items:
  • The tail as well as the head (popular) items.

[Figure: popularity plotted against items ranked by popularity. Physical stores carry only the popular head of the distribution; online stores can also recommend items from the long tail, e.g., recommending “Touching the Void” to readers of “Into Thin Air”.]
Utility Matrix

• We consider two classes of entities: users and items.
• The utility matrix shows the preference of users for items.
  • The values come from an ordered set, e.g., 1–5 stars.
  • Assumed to be sparse, i.e., most entries are unknown.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      4              5    1
B      5    5    4
C                     2    4    5
D                3                    3

Gathering Ratings

• Explicit feedback: Ask users to rate items.
  • E.g., YouTube asks for likes/dislikes of watched videos.
  • Users are generally unwilling to provide responses.
  • Biased, as it comes from people willing to provide ratings.
• Implicit feedback: Learn ratings from user actions.
  • If a user watches a movie, the user is said to “like” it.
  • Hard to model low ratings: 0 (no rating) or 1 (like).
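
Since most entries are unknown, a natural in-memory representation stores only the known ratings. A minimal sketch of the matrix above as a nested dictionary (the columns of D’s two ratings are not recoverable from the slide; HP3 and SW3 are assumed here):

```python
# Sparse utility matrix: only known ratings are stored;
# missing entries are simply absent.
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP3": 3, "SW3": 3},  # assumed columns (see note above)
}

def rated_items(user):
    """The set of items a user has rated (the user's row of the matrix)."""
    return set(ratings[user])

print(rated_items("A"))  # {'HP1', 'TW', 'SW1'}
```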

Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

Content-based Recommendation

• Main idea: Use the profiles of items.
  • E.g., for a movie: { genre, director, actors, plot, release year }.
• Recommend items similar to previous highly-rated items.
• Examples:
  • Recommend movies with the same actor(s), director, genre, …
  • Recommend websites or blogs with “similar” content.


Plan of Action

[Diagram: the content-based plan — build item profiles from item features, aggregate the profiles of a user’s highly-rated items into a user profile, and recommend items whose profiles match it.]

Item Profiles

• For each item, create an item profile as a set of features.
• Finding features is not straightforward for some classes of items.
• E.g., find a set of “important” words from a document (sketched in code below):
  1. Eliminate stop words.
  2. Compute the TF-IDF scores of words.
  3. Use the top k words as its features.

[Figure (source: Stanford CS246, 2022): common stop words (“the”, “and”, “because”) are eliminated; terms less frequent in the document (“car”, “drive”) get small TF-IDF scores; terms more frequent in the document (“auto”, “repair”) get higher TF-IDF scores.]
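
A minimal sketch of steps 1–3, assuming one common TF-IDF variant (TF normalized by the document’s most frequent term, IDF as log(N/df)); the stop-word list is a toy placeholder, and the document is assumed to be part of the corpus so that df ≥ 1:

```python
import math
from collections import Counter

STOP_WORDS = {"the", "and", "because", "a", "an", "of", "to"}  # toy list

def top_k_words(doc, corpus, k=3):
    """Steps 1-3: drop stop words, score the rest by TF-IDF, keep the top k."""
    words = [w for w in doc.lower().split() if w not in STOP_WORDS]
    counts = Counter(words)
    max_count = max(counts.values())  # for normalized term frequency
    scores = {}
    for w, c in counts.items():
        tf = c / max_count                                # TF of w in this doc
        df = sum(w in d.lower().split() for d in corpus)  # document frequency
        scores[w] = tf * math.log(len(corpus) / df)       # TF * IDF
    return sorted(scores, key=scores.get, reverse=True)[:k]
```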

Item Profiles as Vectors

• If a feature is a discrete value, use a one-hot vector.
  • Length L is the number of possible discrete values.
  • If it is a set of values, e.g., actors, it can be a multi-hot vector.
• If a feature is numerical, use the exact value.
  • Its scale might have to be adjusted considering other features.
  • E.g., average ratings for movies.

Example: Item Profiles

• We can also combine discrete and numerical features.
  • E.g., set of actors, director, genre, and average rating.

Movie profiles:   Actors             Director   Genre    Avg. rating
M1:               0 1 1 0 1 1 0 1    0 1 0      0 1 0    2.5
M2:               1 1 0 1 0 1 1 0    0 0 1      1 0 0    4.1

(Each actor column corresponds to one actor, e.g., Julia Roberts; each genre column to one genre, e.g., Science Fiction.)
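
A sketch of assembling one such profile vector: multi-hot actors, one-hot director and genre, and a scaled average rating appended at the end. The vocabularies and the scale factor ALPHA are made-up illustrations, not from the slides:

```python
import numpy as np

# Toy vocabularies (assumptions for illustration only).
ACTORS = ["Julia Roberts", "Actor B", "Actor C", "Actor D"]
DIRECTORS = ["Director X", "Director Y", "Director Z"]
GENRES = ["Action", "Science Fiction", "Drama"]
ALPHA = 0.5  # scale balancing the numerical feature against the 0/1 features

def movie_profile(actors, director, genre, avg_rating):
    actor_vec = [1.0 if a in actors else 0.0 for a in ACTORS]          # multi-hot
    director_vec = [1.0 if d == director else 0.0 for d in DIRECTORS]  # one-hot
    genre_vec = [1.0 if g == genre else 0.0 for g in GENRES]           # one-hot
    return np.array(actor_vec + director_vec + genre_vec + [ALPHA * avg_rating])

m1 = movie_profile({"Julia Roberts", "Actor C"}, "Director Y",
                   "Science Fiction", avg_rating=2.5)
```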


User Profiles

• We need a user profile that summarizes the history of a user.
• Take some aggregation of item profiles using the utility matrix.
• If the utility matrix is binary, the simple average is natural.
• Suppose that user U watched movies M1 and M2:

Movie M1’s profile:   0    1    1    0    1    1    0    0
Movie M2’s profile:   1    1    0    1    0    1    1    0
U’s profile:          0.5  1    0.5  0.5  0.5  1    0.5  0

• If the matrix is not binary, weight item profiles by utility values.
• E.g., say user U rates M1 with rating 1 and M2 with rating 4:

Movie M1’s profile:   0    1    1    0    1    1    0    0    (weight = 1)
Movie M2’s profile:   1    1    0    1    0    1    1    0    (weight = 4)
U’s weighted profile: 2    2.5  0.5  2    0.5  2.5  2    0
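
A minimal sketch of both aggregations on the vectors above; note that the slide’s weighted profile divides the weighted sum by the number of rated items (2), which reproduces its numbers exactly:

```python
import numpy as np

m1 = np.array([0, 1, 1, 0, 1, 1, 0, 0], dtype=float)
m2 = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=float)

# Binary utility matrix: simple average of the watched movies' profiles.
u_simple = (m1 + m2) / 2            # [0.5, 1, 0.5, 0.5, 0.5, 1, 0.5, 0]

# Non-binary: weight each profile by the user's rating, then average.
u_weighted = (1 * m1 + 4 * m2) / 2  # [2, 2.5, 0.5, 2, 0.5, 2.5, 2, 0]
```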

Prediction Heuristic

• Compute the cosine distance between user and item profiles.
• For efficiency, one can apply LSH to quickly find candidate items.
  • Recall the random hyperplane approach.

U’s profile:             0.5  1  -1  0.5  -1  1  0.5  -1.5
Movie to recommend:      0    1   0  0     0  1  0     0
Movie not to recommend:  0    0   1  0     1  0  0     0

Pros and Cons: Content-based Approach

• Pros:
  • No need for data on other users.
  • Able to recommend items to users with unique tastes.
  • Able to recommend new or unpopular items.
  • Able to provide explanations (by listing content features).
• Cons:
  • Finding the appropriate features may be difficult.
  • New users may not have a profile.
  • Overspecialization: Cannot recommend beyond the user’s profile.
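
A sketch of the heuristic on the slide’s vectors: rank items by cosine similarity to the user profile (cosine distance is 1 − cosine similarity, so nearest means most similar):

```python
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

u = np.array([0.5, 1, -1, 0.5, -1, 1, 0.5, -1.5])
movie_good = np.array([0, 1, 0, 0, 0, 1, 0, 0])
movie_bad = np.array([0, 0, 1, 0, 1, 0, 0, 0])

print(cosine_sim(u, movie_good))  # positive -> recommend
print(cosine_sim(u, movie_bad))   # negative -> do not recommend
```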


Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

Collaborative Filtering

• Create user and item profiles solely from the utility matrix.
  • E.g., an item profile is the set of users who purchased it,
    i.e., the profile of item i is the i-th column of the matrix.
  • Don’t care about the content of items.
• Comes in two flavors:
  • User-user collaborative filtering
  • Item-item collaborative filtering

[Diagram: a user is recommended an item liked by other users who like the same items.]

User-User Collaborative Filtering

• Given user U, find users whose ratings are similar to U’s ratings.
• Estimate U’s ratings based on those of similar users.

Finding Similar Users

• There are various ways of defining which users are “similar.”
• Jaccard similarity:
  • Treat ratings as sets, ignoring the values (i.e., likes vs. dislikes).
  • For this example, it seems intuitively wrong,
    since Jaccard(A, B) = 1/5 < Jaccard(A, C) = 2/4. Does it make sense?
  • Intuitively, A should be closer to B: A and B both rated HP1 highly,
    while A and C gave nearly opposite ratings to TW and SW1.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      4              5    1
B      5    5    4
C                     2    4    5
D                3                    3

Source: Stanford CS246 (2022)
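
A minimal check of the Jaccard numbers above, treating each user’s ratings as a set of rated items:

```python
def jaccard(s1, s2):
    """|intersection| / |union| of two sets."""
    return len(s1 & s2) / len(s1 | s2)

A = {"HP1", "TW", "SW1"}
B = {"HP1", "HP2", "HP3"}
C = {"TW", "SW1", "SW2"}

print(jaccard(A, B))  # 1/5 = 0.2
print(jaccard(A, C))  # 2/4 = 0.5
```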
Finding Similar Users

• Cosine similarity:
  • Treat ratings as points (or vectors), considering blanks as 0.
  • Questionable, since no rating doesn’t mean dislike.
  • In this example, v_A^T v_B / (‖v_A‖ ‖v_B‖) = 0.380 and v_A^T v_C / (‖v_A‖ ‖v_C‖) = 0.322.
• Pearson’s correlation coefficient:
  • Consider only the ratings on items rated by both users:

    sim(A, B) = \frac{\sum_{s \in S} (r_{A,s} - \bar{r}_A)(r_{B,s} - \bar{r}_B)}{\sqrt{\sum_{s \in S} (r_{A,s} - \bar{r}_A)^2} \sqrt{\sum_{s \in S} (r_{B,s} - \bar{r}_B)^2}}

  • S: The set of items rated by both users A and B.
  • \bar{r}_A and \bar{r}_B: The average ratings of users A and B, respectively.
  • Limitation: The size of S should be large enough.
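
A sketch of both metrics on the `ratings` dictionary from the utility-matrix sketch above. The cosine version reproduces the slide’s 0.380 and 0.322; the Pearson version uses each user’s average over all of their own ratings, as defined above (a real implementation would also guard against an empty S or zero variance, e.g., user D):

```python
import math

ITEMS = ["HP1", "HP2", "HP3", "TW", "SW1", "SW2", "SW3"]

def cosine(u, v):
    """Cosine similarity, treating missing ratings as 0."""
    vu = [ratings[u].get(i, 0.0) for i in ITEMS]
    vv = [ratings[v].get(i, 0.0) for i in ITEMS]
    dot = sum(a * b for a, b in zip(vu, vv))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return dot / (norm(vu) * norm(vv))

def pearson(u, v):
    """Pearson correlation over S, the items rated by both users."""
    S = set(ratings[u]) & set(ratings[v])
    mu_u = sum(ratings[u].values()) / len(ratings[u])
    mu_v = sum(ratings[v].values()) / len(ratings[v])
    num = sum((ratings[u][s] - mu_u) * (ratings[v][s] - mu_v) for s in S)
    den_u = math.sqrt(sum((ratings[u][s] - mu_u) ** 2 for s in S))
    den_v = math.sqrt(sum((ratings[v][s] - mu_v) ** 2 for s in S))
    return num / (den_u * den_v)

print(cosine("A", "B"))   # ~0.380
print(cosine("A", "C"))   # ~0.322
print(pearson("A", "B"))  # ~1.0: |S| = 1 here, illustrating the limitation
```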

Improving Similarity Metrics

• How to improve Jaccard similarity: Round the data.
  • Idea: Distinguish between high and low ratings.
  • Keep only the high ratings (3, 4, 5), replacing them with “1”.
  • Then, Jaccard(A, B) = 1/4 > Jaccard(A, C) = 0.
  • Better than the previous result.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      1              1
B      1    1    1
C                          1    1
D                1                    1

• How to improve cosine similarity: Normalize ratings.
  • Subtract from each rating the average rating of that person.
  • Then, we have sim(A, B) = 0.092 > sim(A, C) = −0.559.
  • Notice that D’s ratings are not worth taking seriously.
    • Does it make sense? What if the two ratings are both 5?

      HP1   HP2   HP3   TW    SW1   SW2   SW3
A      2/3               5/3  -7/3
B      1/3   1/3  -2/3
C                       -5/3   1/3   4/3
D                  0                        0
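
A sketch of the normalization: mean-center each user’s ratings, fill blanks with 0, and take cosine similarity, which reproduces the slide’s 0.092 and −0.559 (row D’s columns are assumed, as before; its centered row is all zeros):

```python
import numpy as np

# Columns: HP1, HP2, HP3, TW, SW1, SW2, SW3; rows: A, B, C, D.
M = np.array([
    [4,      np.nan, np.nan, 5,      1,      np.nan, np.nan],  # A
    [5,      5,      4,      np.nan, np.nan, np.nan, np.nan],  # B
    [np.nan, np.nan, np.nan, 2,      4,      5,      np.nan],  # C
    [np.nan, np.nan, 3,      np.nan, np.nan, np.nan, 3     ],  # D (assumed)
])

mu = np.nanmean(M, axis=1, keepdims=True)  # each user's average rating
N = np.nan_to_num(M - mu)                  # subtract it, then blanks -> 0

def sim(u, v):
    return N[u] @ N[v] / (np.linalg.norm(N[u]) * np.linalg.norm(N[v]))

print(sim(0, 1))  # A vs. B ~  0.092
print(sim(0, 2))  # A vs. C ~ -0.559
```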


Rating Predictions

• From similarity metrics to recommendations:
  • Let r_x be the vector of user x’s ratings.
  • Let N be the set of k users most similar to user x.
  • Let N′ ⊆ N be the subset of those users who rated item i.
• Prediction for item i of user x (see the sketch below):
  • Simple version: r_{xi} = \frac{1}{k'} \sum_{y \in N'} r_{yi}, where k' = |N'|.
  • Complex version: r_{xi} = \frac{\sum_{y \in N'} \mathrm{sim}(x, y) \, r_{yi}}{\sum_{y \in N'} \mathrm{sim}(x, y)}.

Item-Item Collaborative Filtering

• Another view: Item-item collaborative filtering.
  • For item i, find other similar items.
  • Estimate the rating for item i based on the ratings for similar items.
  • Can use the same similarity metrics as in the user-user model:

    r_{xi} = \frac{\sum_{j \in N(i;x)} \mathrm{sim}(i, j) \, r_{xj}}{\sum_{j \in N(i;x)} \mathrm{sim}(i, j)}

  • N(i; x): The set of items similar to item i and rated by user x.
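
A sketch of the “complex version” of user-user prediction (the item-item rule is the same with the roles of users and items swapped). Here `similarity` stands in for any of the metrics above, e.g., the `pearson` function from the earlier sketch, and `ratings` is the same dictionary:

```python
def predict(x, i, similarity, k=2):
    """Weighted-average prediction of user x's rating for item i."""
    # N: the k users most similar to x (excluding x itself).
    others = [y for y in ratings if y != x]
    N = sorted(others, key=lambda y: similarity(x, y), reverse=True)[:k]
    # N': the subset of N that actually rated item i.
    N_prime = [y for y in N if i in ratings[y]]
    num = sum(similarity(x, y) * ratings[y][i] for y in N_prime)
    den = sum(similarity(x, y) for y in N_prime)
    return num / den if den else None  # no similar rater -> no prediction
```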

Item-Item vs. User-User

• Item-item similarity is often more reliable.
  • Intuitively, items are classifiable in simple terms, e.g., one genre.
  • Users may like multiple genres, so it is harder to compute their similarity.
• However, neither has a clear advantage over the other.
  • E.g., user-user is better for relatively new items.

Pros and Cons: Collaborative Filtering

• Pros:
  • Do not have to come up with features (or profiles).
• Cons:
  • Need enough users in the system to find a match.
  • Cannot recommend new or unpopular items that have not been rated.
  • Cannot recommend items to someone with unique taste.
    • I.e., tends to recommend popular items.


Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

The Netflix Challenge (2006)

• Netflix offered $1M to beat its movie recommendation algorithm.
• The prize was awarded after three years of work.

Description of the Challenge

• Training data:
  • 100M 1–5 ratings from 480K users on 18K movies.
  • Collected over 6 years (2000 to 2005).
• Test data:
  • Set of the last few ratings of each user (2.8M in total).
• Evaluation criterion: Root mean square error (RMSE).
  • Netflix’s system RMSE: 0.9514.
• Competition: 2,700+ teams.

Utility Matrix from Netflix

• Goal: Minimize RMSE on the test data.

    \mathrm{RMSE} = \sqrt{\frac{1}{|T|} \sum_{(x,i) \in T} (r_{xi} - \hat{r}_{xi})^2}, where T is the test set.

[Figure: the Netflix utility matrix — 0.5M users by 17K movies — split into a training dataset and a test dataset.]
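
The RMSE criterion as code; here `test_set` is assumed to be a list of (user, item, true rating) triples and `predict` any rating predictor:

```python
import math

def rmse(test_set, predict):
    """Root mean square error over the test set T."""
    errors = [(r - predict(x, i)) ** 2 for x, i, r in test_set]
    return math.sqrt(sum(errors) / len(errors))
```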


Training and Test Data

• Test data: A notable difference from the previous topics.
  • The goal of data mining is often to find patterns in the given data:
    clustering, frequent itemsets, dimensionality reduction.
  • The Netflix challenge is about generalizing to unseen data.
    • Most machine learning problems are of this kind.
  • Focus on performance, rather than scalability or running time.

Key Takeaways

• Not easy to outperform very simple approaches.
  • Naïve approach: Predict the rating r_{ui} as (\mu_u + \mu_i)/2, as sketched below.
    • \mu_u: The average rating of user u.
    • \mu_i: The average rating for item i by all users.
  • CineMatch was only 3% better than the naïve approach.
    • CineMatch: Netflix’s previous solution.
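
A sketch of the naive baseline: precompute per-user and per-item averages from the training triples, then predict their mean (representing `train` as a list of (user, item, rating) triples is an assumption about the data layout):

```python
from collections import defaultdict

def fit_naive(train):
    """Return a predictor r_ui = (mu_u + mu_i) / 2."""
    by_user, by_item = defaultdict(list), defaultdict(list)
    for u, i, r in train:
        by_user[u].append(r)
        by_item[i].append(r)
    mu_u = {u: sum(rs) / len(rs) for u, rs in by_user.items()}  # user averages
    mu_i = {i: sum(rs) / len(rs) for i, rs in by_item.items()}  # item averages
    return lambda u, i: (mu_u[u] + mu_i[i]) / 2
```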

Key Takeaways

• Additional data, beyond the utility matrix, may or may not help.
  • Movie information:
    • Used IMDB to get information on actors, directors, and genres.
    • Not effective, maybe because it didn’t add useful information.
  • Time information:
    • Decreasing vs. increasing reputation over time.
    • Examined the ratings of a movie to see its rating trend over time.

Key Takeaways

• The winning solution was a combination of several algorithms.
  • These algorithms were developed independently.
  • The second team also used a blend of independent algorithms.
• Combining different algorithms is effective in general.
  • Called an ensemble approach (see the sketch below).
  • Cons: Difficult interpretation, extensive computation, etc.
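
One simple form of such a blend, a weighted average of independently developed predictors; the fixed weights are illustrative (in practice they would be fit on held-out data):

```python
def blend(predictors, weights):
    """Combine several rating predictors into one ensemble predictor."""
    total = sum(weights)
    return lambda u, i: sum(w * p(u, i)
                            for p, w in zip(predictors, weights)) / total

# E.g., blend the naive baseline with a collaborative-filtering predictor:
# ensemble = blend([naive_predict, cf_predict], weights=[0.3, 0.7])
```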


Summary

1. Recommender Systems
   • The long tail phenomenon
   • Utility matrix
2. Content-based Recommendation
   • User / item profiles
3. Collaborative Filtering
   • Defining “similarity”
4. The Netflix Challenge
   • Key takeaways
