
12 Recsys 1

This document contains the slides for a lecture on Recommender Systems from EE412: Foundation of Big Data Analytics, including course announcements and a recap of SVD. It covers content-based recommendation, collaborative filtering, the long tail phenomenon in online recommendation, user and item profiles, similarity metrics, the pros and cons of each approach, and the Netflix Challenge.


Recommender Systems 1
EE412: Foundation of Big Data Analytics
Fall 2024
Jaemin Yoo

Announcements

• Homeworks:
  • HW2 (due: 11/06)
  • HW3 (will be posted on 11/06; due: 11/27)
• Midterm:
  • Claim: 10/28 (Mon) – 10/29 (Tue), 19:00 – 20:00
  • Location: N1 113
• Classum:
  • Please review the board before asking questions.
  • Aim to ask questions that could benefit other students as well.

Recap

• Singular Value Decomposition (SVD)
• Dimensionality Reduction with SVD
• CUR Decomposition

[Diagram: an m × n matrix M is factored as M = U Σ Vᵀ, where U is m × r, Σ is r × r, and Vᵀ is r × n.]

Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge


Recommender Systems

• Class of applications that predict user responses to options.
• Non-personalized recommendations:
  • Editorial and hand-curated: lists of favorites, essential items, …
  • Simple aggregates: top 10, most popular, recent uploads, …
• Personalized recommendations:
  • Tailored to individual users: Amazon, Netflix, YouTube, …

Personalized Recommender Systems

• Two groups of recommender systems:
  • Content-based: Focus on the profiles (or features) of users/items.
  • Collaborative filtering: Focus on the interactions between users/items.

[Diagram: in the content-based approach, a user who likes an item is recommended a similar item; in collaborative filtering, a user is recommended items liked by users with similar tastes.]

From Scarcity to Abundance

• Physical delivery systems have scarcity of resources.
  • Stores have limited shelf space to show to users.
  • Not possible to tailor the store to individual customers.
• Online stores can make anything available to users.
  • E.g., Amazon offers millions of books.
  • This abundance forces them to focus on personalized recommendation.

The Long Tail

• The distinction is called the long tail phenomenon.
• Online institutions provide the entire range of items:
  • The tail as well as the head (popular) items.

[Figure: popularity plotted against items ranked by popularity. Physical stores carry only the popular head of the distribution; online stores can also recommend items from the long tail, e.g., recommending “Touching the Void” to readers of “Into Thin Air”.]
Utility Matrix

• We consider two classes of entities: users and items.
• The utility matrix shows the preference of users for items.
  • The values come from an ordered set, e.g., 1–5 stars.
  • Assumed to be sparse, i.e., most entries are unknown.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      4              5    1
B      5    5    4
C                     2    4    5
D                3                    3

Gathering Ratings

• Explicit feedback: Ask users to rate items.
  • E.g., YouTube asks for likes/dislikes of watched videos.
  • Users are generally unwilling to provide responses.
  • Biased, as it comes from people willing to provide ratings.
• Implicit feedback: Learn ratings from user actions.
  • If a user watches a movie, the user is said to “like” it.
  • Hard to model low ratings: 0 (no rating) or 1 (like).
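
Since most entries are unknown, a natural in-memory representation stores only the known ratings. A minimal sketch of the matrix above as a nested dictionary (the columns of D’s two ratings are not recoverable from the slide; HP3 and SW3 are assumed here):

```python
# Sparse utility matrix: only known ratings are stored;
# missing entries are simply absent.
ratings = {
    "A": {"HP1": 4, "TW": 5, "SW1": 1},
    "B": {"HP1": 5, "HP2": 5, "HP3": 4},
    "C": {"TW": 2, "SW1": 4, "SW2": 5},
    "D": {"HP3": 3, "SW3": 3},  # assumed columns (see note above)
}

def rated_items(user):
    """The set of items a user has rated (the user's row of the matrix)."""
    return set(ratings[user])

print(rated_items("A"))  # {'HP1', 'TW', 'SW1'}
```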

Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

Content-based Recommendation

• Main idea: Use the profiles of items.
  • E.g., for a movie: { genre, director, actors, plot, release year }.
• Recommend items similar to previous highly-rated items.
• Examples:
  • Recommend movies with the same actor(s), director, genre, …
  • Recommend websites or blogs with “similar” content.


Plan of Action

[Diagram: the content-based plan — build item profiles from item features, aggregate the profiles of a user’s highly-rated items into a user profile, and recommend items whose profiles match it.]

Item Profiles

• For each item, create an item profile as a set of features.
• Finding features is not straightforward for some classes of items.
• E.g., find a set of “important” words from a document (sketched in code below):
  1. Eliminate stop words.
  2. Compute the TF-IDF scores of words.
  3. Use the top k words as its features.

[Figure (source: Stanford CS246, 2022): common stop words (“the”, “and”, “because”) are eliminated; terms less frequent in the document (“car”, “drive”) get small TF-IDF scores; terms more frequent in the document (“auto”, “repair”) get higher TF-IDF scores.]
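
A minimal sketch of steps 1–3, assuming one common TF-IDF variant (TF normalized by the document’s most frequent term, IDF as log(N/df)); the stop-word list is a toy placeholder, and the document is assumed to be part of the corpus so that df ≥ 1:

```python
import math
from collections import Counter

STOP_WORDS = {"the", "and", "because", "a", "an", "of", "to"}  # toy list

def top_k_words(doc, corpus, k=3):
    """Steps 1-3: drop stop words, score the rest by TF-IDF, keep the top k."""
    words = [w for w in doc.lower().split() if w not in STOP_WORDS]
    counts = Counter(words)
    max_count = max(counts.values())  # for normalized term frequency
    scores = {}
    for w, c in counts.items():
        tf = c / max_count                                # TF of w in this doc
        df = sum(w in d.lower().split() for d in corpus)  # document frequency
        scores[w] = tf * math.log(len(corpus) / df)       # TF * IDF
    return sorted(scores, key=scores.get, reverse=True)[:k]
```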

Item Profiles as Vectors

• If a feature is a discrete value, use a one-hot vector.
  • Length L is the number of possible discrete values.
  • If it is a set of values, e.g., actors, it can be a multi-hot vector.
• If a feature is numerical, use the exact value.
  • Its scale might have to be adjusted considering other features.
  • E.g., average ratings for movies.

Example: Item Profiles

• We can also combine discrete and numerical features.
  • E.g., set of actors, director, genre, and average rating.

Movie profiles:   Actors             Director   Genre    Avg. rating
M1:               0 1 1 0 1 1 0 1    0 1 0      0 1 0    2.5
M2:               1 1 0 1 0 1 1 0    0 0 1      1 0 0    4.1

(Each actor column corresponds to one actor, e.g., Julia Roberts; each genre column to one genre, e.g., Science Fiction.)
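
A sketch of assembling one such profile vector: multi-hot actors, one-hot director and genre, and a scaled average rating appended at the end. The vocabularies and the scale factor ALPHA are made-up illustrations, not from the slides:

```python
import numpy as np

# Toy vocabularies (assumptions for illustration only).
ACTORS = ["Julia Roberts", "Actor B", "Actor C", "Actor D"]
DIRECTORS = ["Director X", "Director Y", "Director Z"]
GENRES = ["Action", "Science Fiction", "Drama"]
ALPHA = 0.5  # scale balancing the numerical feature against the 0/1 features

def movie_profile(actors, director, genre, avg_rating):
    actor_vec = [1.0 if a in actors else 0.0 for a in ACTORS]          # multi-hot
    director_vec = [1.0 if d == director else 0.0 for d in DIRECTORS]  # one-hot
    genre_vec = [1.0 if g == genre else 0.0 for g in GENRES]           # one-hot
    return np.array(actor_vec + director_vec + genre_vec + [ALPHA * avg_rating])

m1 = movie_profile({"Julia Roberts", "Actor C"}, "Director Y",
                   "Science Fiction", avg_rating=2.5)
```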


User Profiles

• We need a user profile that summarizes the history of a user.
• Take some aggregation of item profiles using the utility matrix.
• If the utility matrix is binary, the simple average is natural.
• Suppose that user U watched movies M1 and M2:

Movie M1’s profile:   0    1    1    0    1    1    0    0
Movie M2’s profile:   1    1    0    1    0    1    1    0
U’s profile:          0.5  1    0.5  0.5  0.5  1    0.5  0

• If the matrix is not binary, weight item profiles by utility values.
• E.g., say user U rates M1 with rating 1 and M2 with rating 4:

Movie M1’s profile:   0    1    1    0    1    1    0    0    (weight = 1)
Movie M2’s profile:   1    1    0    1    0    1    1    0    (weight = 4)
U’s weighted profile: 2    2.5  0.5  2    0.5  2.5  2    0
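
A minimal sketch of both aggregations on the vectors above; note that the slide’s weighted profile divides the weighted sum by the number of rated items (2), which reproduces its numbers exactly:

```python
import numpy as np

m1 = np.array([0, 1, 1, 0, 1, 1, 0, 0], dtype=float)
m2 = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=float)

# Binary utility matrix: simple average of the watched movies' profiles.
u_simple = (m1 + m2) / 2            # [0.5, 1, 0.5, 0.5, 0.5, 1, 0.5, 0]

# Non-binary: weight each profile by the user's rating, then average.
u_weighted = (1 * m1 + 4 * m2) / 2  # [2, 2.5, 0.5, 2, 0.5, 2.5, 2, 0]
```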

Prediction Heuristic

• Compute the cosine distance between user and item profiles.
• For efficiency, one can apply LSH to quickly find candidate items.
  • Recall the random hyperplane approach.

U’s profile:             0.5  1  -1  0.5  -1  1  0.5  -1.5
Movie to recommend:      0    1   0  0     0  1  0     0
Movie not to recommend:  0    0   1  0     1  0  0     0

Pros and Cons: Content-based Approach

• Pros:
  • No need for data on other users.
  • Able to recommend items to users with unique tastes.
  • Able to recommend new or unpopular items.
  • Able to provide explanations (by listing content features).
• Cons:
  • Finding the appropriate features may be difficult.
  • New users may not have a profile.
  • Overspecialization: Cannot recommend beyond the user’s profile.
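
A sketch of the heuristic on the slide’s vectors: rank items by cosine similarity to the user profile (cosine distance is 1 − cosine similarity, so nearest means most similar):

```python
import numpy as np

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

u = np.array([0.5, 1, -1, 0.5, -1, 1, 0.5, -1.5])
movie_good = np.array([0, 1, 0, 0, 0, 1, 0, 0])
movie_bad = np.array([0, 0, 1, 0, 1, 0, 0, 0])

print(cosine_sim(u, movie_good))  # positive -> recommend
print(cosine_sim(u, movie_bad))   # negative -> do not recommend
```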


Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

Collaborative Filtering

• Create user and item profiles solely from the utility matrix.
  • E.g., an item profile is the set of users who purchased it,
    i.e., the profile of item i is the i-th column of the matrix.
  • Don’t care about the content of items.
• Comes in two flavors:
  • User-user collaborative filtering
  • Item-item collaborative filtering

[Diagram: a user is recommended an item liked by other users who like the same items.]

User-User Collaborative Filtering

• Given user U, find users whose ratings are similar to U’s ratings.
• Estimate U’s ratings based on those of similar users.

Finding Similar Users

• There are various ways of defining which users are “similar.”
• Jaccard similarity:
  • Treat ratings as sets, ignoring the values (i.e., likes vs. dislikes).
  • For this example, it seems intuitively wrong,
    since Jaccard(A, B) = 1/5 < Jaccard(A, C) = 2/4. Does it make sense?
  • Intuitively, A should be closer to B: A and B both rated HP1 highly,
    while A and C gave nearly opposite ratings to TW and SW1.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      4              5    1
B      5    5    4
C                     2    4    5
D                3                    3

Source: Stanford CS246 (2022)
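
A minimal check of the Jaccard numbers above, treating each user’s ratings as a set of rated items:

```python
def jaccard(s1, s2):
    """|intersection| / |union| of two sets."""
    return len(s1 & s2) / len(s1 | s2)

A = {"HP1", "TW", "SW1"}
B = {"HP1", "HP2", "HP3"}
C = {"TW", "SW1", "SW2"}

print(jaccard(A, B))  # 1/5 = 0.2
print(jaccard(A, C))  # 2/4 = 0.5
```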
Finding Similar Users

• Cosine similarity:
  • Treat ratings as points (or vectors), considering blanks as 0.
  • Questionable, since no rating doesn’t mean dislike.
  • In this example, v_A^T v_B / (‖v_A‖ ‖v_B‖) = 0.380 and v_A^T v_C / (‖v_A‖ ‖v_C‖) = 0.322.
• Pearson’s correlation coefficient:
  • Consider only the ratings on items rated by both users:

    sim(A, B) = \frac{\sum_{s \in S} (r_{A,s} - \bar{r}_A)(r_{B,s} - \bar{r}_B)}{\sqrt{\sum_{s \in S} (r_{A,s} - \bar{r}_A)^2} \sqrt{\sum_{s \in S} (r_{B,s} - \bar{r}_B)^2}}

  • S: The set of items rated by both users A and B.
  • \bar{r}_A and \bar{r}_B: The average ratings of users A and B, respectively.
  • Limitation: The size of S should be large enough.
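
A sketch of both metrics on the `ratings` dictionary from the utility-matrix sketch above. The cosine version reproduces the slide’s 0.380 and 0.322; the Pearson version uses each user’s average over all of their own ratings, as defined above (a real implementation would also guard against an empty S or zero variance, e.g., user D):

```python
import math

ITEMS = ["HP1", "HP2", "HP3", "TW", "SW1", "SW2", "SW3"]

def cosine(u, v):
    """Cosine similarity, treating missing ratings as 0."""
    vu = [ratings[u].get(i, 0.0) for i in ITEMS]
    vv = [ratings[v].get(i, 0.0) for i in ITEMS]
    dot = sum(a * b for a, b in zip(vu, vv))
    norm = lambda x: math.sqrt(sum(a * a for a in x))
    return dot / (norm(vu) * norm(vv))

def pearson(u, v):
    """Pearson correlation over S, the items rated by both users."""
    S = set(ratings[u]) & set(ratings[v])
    mu_u = sum(ratings[u].values()) / len(ratings[u])
    mu_v = sum(ratings[v].values()) / len(ratings[v])
    num = sum((ratings[u][s] - mu_u) * (ratings[v][s] - mu_v) for s in S)
    den_u = math.sqrt(sum((ratings[u][s] - mu_u) ** 2 for s in S))
    den_v = math.sqrt(sum((ratings[v][s] - mu_v) ** 2 for s in S))
    return num / (den_u * den_v)

print(cosine("A", "B"))   # ~0.380
print(cosine("A", "C"))   # ~0.322
print(pearson("A", "B"))  # ~1.0: |S| = 1 here, illustrating the limitation
```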

Improving Similarity Metrics

• How to improve Jaccard similarity: Round the data.
  • Idea: Distinguish between high and low ratings.
  • Keep only the high ratings (3, 4, 5), replacing them with “1”.
  • Then, Jaccard(A, B) = 1/4 > Jaccard(A, C) = 0.
  • Better than the previous result.

      HP1  HP2  HP3  TW   SW1  SW2  SW3
A      1              1
B      1    1    1
C                          1    1
D                1                    1

• How to improve cosine similarity: Normalize ratings.
  • Subtract from each rating the average rating of that person.
  • Then, we have sim(A, B) = 0.092 > sim(A, C) = −0.559.
  • Notice that D’s ratings are not worth taking seriously.
    • Does it make sense? What if the two ratings are both 5?

      HP1   HP2   HP3   TW    SW1   SW2   SW3
A      2/3               5/3  -7/3
B      1/3   1/3  -2/3
C                       -5/3   1/3   4/3
D                  0                        0
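
A sketch of the normalization: mean-center each user’s ratings, fill blanks with 0, and take cosine similarity, which reproduces the slide’s 0.092 and −0.559 (row D’s columns are assumed, as before; its centered row is all zeros):

```python
import numpy as np

# Columns: HP1, HP2, HP3, TW, SW1, SW2, SW3; rows: A, B, C, D.
M = np.array([
    [4,      np.nan, np.nan, 5,      1,      np.nan, np.nan],  # A
    [5,      5,      4,      np.nan, np.nan, np.nan, np.nan],  # B
    [np.nan, np.nan, np.nan, 2,      4,      5,      np.nan],  # C
    [np.nan, np.nan, 3,      np.nan, np.nan, np.nan, 3     ],  # D (assumed)
])

mu = np.nanmean(M, axis=1, keepdims=True)  # each user's average rating
N = np.nan_to_num(M - mu)                  # subtract it, then blanks -> 0

def sim(u, v):
    return N[u] @ N[v] / (np.linalg.norm(N[u]) * np.linalg.norm(N[v]))

print(sim(0, 1))  # A vs. B ~  0.092
print(sim(0, 2))  # A vs. C ~ -0.559
```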


Rating Predictions

• From similarity metrics to recommendations:
  • Let r_x be the vector of user x’s ratings.
  • Let N be the set of k users most similar to user x.
  • Let N′ ⊆ N be the subset of those users who rated item i.
• Prediction for item i of user x (see the sketch below):
  • Simple version: r_{xi} = \frac{1}{k'} \sum_{y \in N'} r_{yi}, where k' = |N'|.
  • Complex version: r_{xi} = \frac{\sum_{y \in N'} \mathrm{sim}(x, y) \, r_{yi}}{\sum_{y \in N'} \mathrm{sim}(x, y)}.

Item-Item Collaborative Filtering

• Another view: Item-item collaborative filtering.
  • For item i, find other similar items.
  • Estimate the rating for item i based on the ratings for similar items.
  • Can use the same similarity metrics as in the user-user model:

    r_{xi} = \frac{\sum_{j \in N(i;x)} \mathrm{sim}(i, j) \, r_{xj}}{\sum_{j \in N(i;x)} \mathrm{sim}(i, j)}

  • N(i; x): The set of items similar to item i and rated by user x.
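
A sketch of the “complex version” of user-user prediction (the item-item rule is the same with the roles of users and items swapped). Here `similarity` stands in for any of the metrics above, e.g., the `pearson` function from the earlier sketch, and `ratings` is the same dictionary:

```python
def predict(x, i, similarity, k=2):
    """Weighted-average prediction of user x's rating for item i."""
    # N: the k users most similar to x (excluding x itself).
    others = [y for y in ratings if y != x]
    N = sorted(others, key=lambda y: similarity(x, y), reverse=True)[:k]
    # N': the subset of N that actually rated item i.
    N_prime = [y for y in N if i in ratings[y]]
    num = sum(similarity(x, y) * ratings[y][i] for y in N_prime)
    den = sum(similarity(x, y) for y in N_prime)
    return num / den if den else None  # no similar rater -> no prediction
```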

Item-Item vs. User-User

• Item-item similarity is often more reliable.
  • Intuitively, items are classifiable in simple terms, e.g., one genre.
  • Users may like multiple genres, so it is harder to compute their similarity.
• However, neither has a clear advantage over the other.
  • E.g., user-user is better for relatively new items.

Pros and Cons: Collaborative Filtering

• Pros:
  • Do not have to come up with features (or profiles).
• Cons:
  • Need enough users in the system to find a match.
  • Cannot recommend new or unpopular items that have not been rated.
  • Cannot recommend items to someone with unique taste.
    • I.e., tends to recommend popular items.


Outline

1. Recommender Systems
2. Content-based Recommendation
3. Collaborative Filtering
4. The Netflix Challenge

The Netflix Challenge (2006)

• Netflix offered $1M to beat its movie recommendation algorithm.
• The prize was awarded after three years of work.

Description of the Challenge

• Training data:
  • 100M 1–5 ratings from 480K users on 18K movies.
  • Collected over 6 years (2000 to 2005).
• Test data:
  • Set of the last few ratings of each user (2.8M in total).
• Evaluation criterion: Root mean square error (RMSE).
  • Netflix’s system RMSE: 0.9514.
• Competition: 2,700+ teams.

Utility Matrix from Netflix

• Goal: Minimize RMSE on the test data.

    \mathrm{RMSE} = \sqrt{\frac{1}{|T|} \sum_{(x,i) \in T} (r_{xi} - \hat{r}_{xi})^2}, where T is the test set.

[Figure: the Netflix utility matrix — 0.5M users by 17K movies — split into a training dataset and a test dataset.]
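
The RMSE criterion as code; here `test_set` is assumed to be a list of (user, item, true rating) triples and `predict` any rating predictor:

```python
import math

def rmse(test_set, predict):
    """Root mean square error over the test set T."""
    errors = [(r - predict(x, i)) ** 2 for x, i, r in test_set]
    return math.sqrt(sum(errors) / len(errors))
```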


Training and Test Data

• Test data: A notable difference from the previous topics.
  • The goal of data mining is often to find patterns in the given data:
    clustering, frequent itemsets, dimensionality reduction.
  • The Netflix challenge is about generalizing to unseen data.
    • Most machine learning problems are of this kind.
  • Focus on performance, rather than scalability or running time.

Key Takeaways

• Not easy to outperform very simple approaches.
  • Naïve approach: Predict the rating r_{ui} as (\mu_u + \mu_i)/2, as sketched below.
    • \mu_u: The average rating of user u.
    • \mu_i: The average rating for item i by all users.
  • CineMatch was only 3% better than the naïve approach.
    • CineMatch: Netflix’s previous solution.
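
A sketch of the naive baseline: precompute per-user and per-item averages from the training triples, then predict their mean (representing `train` as a list of (user, item, rating) triples is an assumption about the data layout):

```python
from collections import defaultdict

def fit_naive(train):
    """Return a predictor r_ui = (mu_u + mu_i) / 2."""
    by_user, by_item = defaultdict(list), defaultdict(list)
    for u, i, r in train:
        by_user[u].append(r)
        by_item[i].append(r)
    mu_u = {u: sum(rs) / len(rs) for u, rs in by_user.items()}  # user averages
    mu_i = {i: sum(rs) / len(rs) for i, rs in by_item.items()}  # item averages
    return lambda u, i: (mu_u[u] + mu_i[i]) / 2
```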

Key Takeaways

• Additional data, beyond the utility matrix, may or may not help.
  • Movie information:
    • Used IMDB to get information on actors, directors, and genres.
    • Not effective, maybe because it didn’t add useful information.
  • Time information:
    • Decreasing vs. increasing reputation over time.
    • Examined the ratings of a movie to see its rating trend over time.

Key Takeaways

• The winning solution was a combination of several algorithms.
  • These algorithms were developed independently.
  • The second team also used a blend of independent algorithms.
• Combining different algorithms is effective in general.
  • Called an ensemble approach (see the sketch below).
  • Cons: Difficult interpretation, extensive computation, etc.
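
One simple form of such a blend, a weighted average of independently developed predictors; the fixed weights are illustrative (in practice they would be fit on held-out data):

```python
def blend(predictors, weights):
    """Combine several rating predictors into one ensemble predictor."""
    total = sum(weights)
    return lambda u, i: sum(w * p(u, i)
                            for p, w in zip(predictors, weights)) / total

# E.g., blend the naive baseline with a collaborative-filtering predictor:
# ensemble = blend([naive_predict, cf_predict], weights=[0.3, 0.7])
```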


Summary

1. Recommender Systems
   • The long tail phenomenon
   • Utility matrix
2. Content-based Recommendation
   • User / item profiles
3. Collaborative Filtering
   • Defining “similarity”
4. The Netflix Challenge
   • Key takeaways
