CS345A Data Mining: Recommendation Systems

This document discusses recommendation systems and some of the key techniques used to build them. It describes how recommendation engines work to filter information and provide personalized recommendations to users. It covers content-based recommendation approaches that match user profiles to item profiles, collaborative filtering techniques that identify similar users, and hybrid methods. It also discusses challenges like data sparsity, evaluating recommendation quality, and efficiently finding similar items and users.

Uploaded by

Devang Thakkar

CS345A

Data Mining
Recommendation Systems
Anand Rajaraman

Recommendations

Items

Products, web sites, blogs, news items, …

From scarcity to abundance

Shelf space is a scarce commodity for
traditional retailers
Also: TV networks, movie theaters, …

The web enables near-zero-cost
dissemination of information about
products
From scarcity to abundance

More choice necessitates better filters

Recommendation engines
How "Into Thin Air" made "Touching the Void" a
bestseller

The Long Tail

Source: Chris Anderson (2004)

Recommendation Types
Editorial
Simple aggregates
Top 10, Most Popular, Recent Uploads

Tailored to individual users

Amazon, Netflix, …

Formal Model
C = set of Customers
S = set of Items
Utility function $u: C \times S \to R$
R = set of ratings
R is a totally ordered set
e.g., 0-5 stars, real number in [0,1]

Utility Matrix
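[Table: rows are the customers Alice, Bob, Carol, David; columns are the items Avatar, LOTR, Matrix, Pirates. Most cells are empty. The known ratings shown are 0.2, 0.5, 0.2, 0.3, 1 and 0.4; their cell positions are not recoverable from this text.]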
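Because most cells are unknown, a utility matrix is usually stored sparsely. The sketch below is one possible representation, not the lecture's own: a dictionary keyed by customer, holding only the known ratings. The cell positions and the helper names `rating` and `density` are illustrative assumptions.

```python
# Sparse utility matrix: customer -> {item: rating}.
# Hypothetical cell positions; only known ratings are stored.
utility = {
    "Alice": {"Matrix": 0.2},
    "Bob": {"LOTR": 0.5},
    "Carol": {"Avatar": 0.2},
    "David": {"Pirates": 0.4},
}


def rating(utility, customer, item):
    """Return the known rating u(customer, item), or None if it is unknown."""
    return utility.get(customer, {}).get(item)


def density(utility, items):
    """Fraction of (customer, item) cells that hold a known rating."""
    known = sum(len(row) for row in utility.values())
    return known / (len(utility) * len(items))
```

An unknown rating comes back as `None` rather than 0, so a missing rating is never mistaken for a low one.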

Key Problems
Gathering known ratings for matrix
Extrapolate unknown ratings from
known ratings
Mainly interested in high unknown ratings

Evaluating extrapolation methods

Gathering Ratings
Explicit
Ask people to rate items
Doesn't work well in practice: people can't
be bothered

Implicit
Learn ratings from user actions
e.g., purchase implies high rating
What about low ratings?

Extrapolating Utilities
Key problem: matrix U is sparse
most people have not rated most items

Three approaches
Content-based
Collaborative
Hybrid

Content-based recommendations
Main idea: recommend items to
customer C similar to previous items
rated highly by C
Movie recommendations
recommend movies with same actor(s),
director, genre, …

Websites, blogs, news

recommend other sites with similar
content

Plan of action
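[Figure: from the items a user likes, build item profiles and a user profile (features in the example: red, circles, triangles); match the user profile against item profiles to recommend new items.]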

Item Profiles
For each item, create an item profile
Profile is a set of features
movies: author, title, actor, director, …
text: set of important words in document

How to pick important words?

Usual heuristic is TF.IDF (Term Frequency
times Inverse Doc Frequency)

TF.IDF
$f_{ij}$ = frequency of term $t_i$ in document $d_j$
$TF_{ij} = f_{ij} / \max_k f_{kj}$
$n_i$ = number of docs that mention term i
$N$ = total number of docs
$IDF_i = \log(N / n_i)$
TF.IDF score $w_{ij} = TF_{ij} \times IDF_i$

Doc profile = set of words with highest
TF.IDF scores, together with their scores
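A minimal sketch of the TF.IDF heuristic in Python. It assumes the common normalization in which term frequency is divided by the count of the most frequent term in the document and IDF is a base-2 logarithm; the function names `tf_idf` and `doc_profile` are illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    # n_i: number of documents that mention term i.
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        max_count = max(counts.values()) if counts else 1
        scores.append({
            # TF (normalized by the most frequent term) times IDF.
            term: (count / max_count) * math.log2(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def doc_profile(score, top_n):
    """Keep the top_n highest-scoring words together with their scores."""
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

A term that appears in every document gets IDF 0 and so never enters a profile, which is the intended effect for very common words.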

User profiles and prediction

User profile possibilities:
Weighted average of rated item profiles
Variation: weight by difference from average
rating for item

Prediction heuristic
Given user profile c and item profile s,
estimate u(c,s) = cos(c,s) = c.s/(|c||s|)
Need efficient method to find items with
high utility: later
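The two steps above can be sketched as follows, assuming profiles are sparse dictionaries from feature name to weight and that the user profile is the rating-weighted average of the rated item profiles. The names `user_profile` and `cosine` are illustrative.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two sparse vectors given as dicts."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_profile(rated):
    """rated: list of (item_profile, rating). Rating-weighted average profile."""
    total = sum(r for _, r in rated)
    if total == 0:
        return {}
    profile = {}
    for item, r in rated:
        for feature, weight in item.items():
            profile[feature] = profile.get(feature, 0.0) + r * weight
    return {f: w / total for f, w in profile.items()}
```

The estimated utility of an unseen item `s` for a user is then `cosine(user_profile(rated), s)`.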

Limitations of content-based approach
Finding the appropriate features
e.g., images, movies, music

Overspecialization
Never recommends items outside user's
content profile
People might have multiple interests

Recommendations for new users

How to build a profile?

Collaborative Filtering
Consider user c
Find set D of other users whose ratings
are similar to c's ratings
Estimate user's ratings based on ratings
of users in D

Similar users
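Let $r_x$ be the vector of user x's ratings
Cosine similarity measure
$sim(x,y) = \cos(r_x, r_y)$

Pearson correlation coefficient
$S_{xy}$ = items rated by both users x and y
$sim(x,y) = \frac{\sum_{s \in S_{xy}} (r_{xs} - \bar{r}_x)(r_{ys} - \bar{r}_y)}{\sqrt{\sum_{s \in S_{xy}} (r_{xs} - \bar{r}_x)^2} \sqrt{\sum_{s \in S_{xy}} (r_{ys} - \bar{r}_y)^2}}$, where $\bar{r}_x$ is the average rating of user x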
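A sketch of the Pearson correlation between two users, assuming each user's ratings are stored as a dictionary from item to rating, that sums run over the co-rated items only, and that each mean is taken over all of that user's ratings. The function name `pearson` is illustrative.

```python
import math


def pearson(rx, ry):
    """Pearson correlation of two users' ratings over their co-rated items."""
    common = set(rx) & set(ry)
    if not common:
        return 0.0
    # Assumption: each mean is over all of that user's ratings.
    mean_x = sum(rx.values()) / len(rx)
    mean_y = sum(ry.values()) / len(ry)
    num = sum((rx[s] - mean_x) * (ry[s] - mean_y) for s in common)
    den_x = math.sqrt(sum((rx[s] - mean_x) ** 2 for s in common))
    den_y = math.sqrt(sum((ry[s] - mean_y) ** 2 for s in common))
    if den_x == 0 or den_y == 0:
        return 0.0
    return num / (den_x * den_y)
```

Subtracting each user's mean corrects for users who rate everything high or everything low, which plain cosine similarity on raw ratings does not do.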

Rating predictions
Let D be the set of k users most similar to c
who have rated item s
Possibilities for prediction function (item s):
$r_{cs} = \frac{1}{k} \sum_{d \in D} r_{ds}$
$r_{cs} = \left( \sum_{d \in D} sim(c,d) \, r_{ds} \right) / \left( \sum_{d \in D} sim(c,d) \right)$
Other options?

Many tricks possible
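Both prediction functions can be sketched in a few lines of Python. The sketch assumes ratings are stored as `{user: {item: rating}}` and that the caller supplies a similarity function `sim` over two users' rating dictionaries; the name `predict` and its parameters are illustrative.

```python
def predict(user, item, ratings, sim, k, weighted=True):
    """Estimate the rating of `item` by `user` from the k most similar raters.

    ratings: {user: {item: rating}}; sim: function of two rating dicts.
    Returns None when no prediction can be made.
    """
    # D: the k users most similar to `user` among those who rated `item`.
    candidates = [
        (sim(ratings[user], ratings[d]), d)
        for d in ratings
        if d != user and item in ratings[d]
    ]
    top = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    if not top:
        return None
    if not weighted:
        # Plain average over the neighbors actually found (at most k).
        return sum(ratings[d][item] for _, d in top) / len(top)
    total = sum(s for s, _ in top)
    if total == 0:
        return None
    # Similarity-weighted average.
    return sum(s * ratings[d][item] for s, d in top) / total
```

The weighted form assumes non-negative similarities; with a measure that can be negative, such as Pearson correlation, neighbors with negative similarity would usually be filtered out first.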

Complexity
Expensive step is finding k most similar
customers
O(|U|)

Too expensive to do at runtime

Could pre-compute

Naive precomputation takes time
O(N|U|)
Stay tuned for how to do it faster!

Can use clustering, partitioning as
alternatives, but quality degrades

Item-Item Collaborative Filtering

So far: User-user collaborative filtering
Another view
For item s, find other similar items
Estimate rating for item based on ratings for
similar items
Can use same similarity metrics and
prediction functions as in user-user model

In practice, it has been observed that
item-item often works better than user-user
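A sketch of the item-item view, assuming ratings are stored as `{user: {item: rating}}`, cosine similarity on raw rating vectors, and a similarity-weighted average as the prediction function. The names `transpose` and `predict_item_item` are illustrative.

```python
import math


def transpose(ratings):
    """Turn {user: {item: rating}} into {item: {user: rating}}."""
    by_item = {}
    for user, items in ratings.items():
        for item, r in items.items():
            by_item.setdefault(item, {})[user] = r
    return by_item


def cosine(a, b):
    """Cosine similarity of two sparse rating vectors given as dicts."""
    dot = sum(r * b.get(u, 0.0) for u, r in a.items())
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_item_item(user, item, ratings, k):
    """Estimate the user's rating of `item` from the k most similar items
    that the user has already rated."""
    by_item = transpose(ratings)
    target = by_item.get(item, {})
    scored = [
        (cosine(target, by_item[other]), r)
        for other, r in ratings[user].items()
        if other != item
    ]
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(s for s, _ in top)
    if total <= 0:
        return None
    return sum(s * r for s, r in top) / total
```

The only change from the user-user model is that similarity is computed between columns of the utility matrix instead of rows.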

Pros and cons of collaborative filtering
Works for any kind of item
No feature selection needed

New user problem

New item problem
Sparsity of rating matrix
Cluster-based smoothing?
Add more data!

Hybrid Methods
Implement two or more different
recommenders and combine predictions
Perhaps using a linear model

Add content-based methods to
collaborative filtering
item profiles for new item problem
demographics to deal with new user
problem
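A sketch of combining recommenders with a linear model. It assumes each recommender returns either a predicted rating or `None` when it cannot predict (for example, collaborative filtering on a new item), and that the weights are given. The weights in the usage are hypothetical; in practice they would be fitted on held-out ratings.

```python
def hybrid_predict(predictions, weights):
    """Weighted average of the recommenders that produced a prediction.

    predictions: list of float or None, one per recommender.
    weights: list of non-negative floats of the same length.
    """
    pairs = [(w, p) for w, p in zip(weights, predictions) if p is not None]
    total = sum(w for w, _ in pairs)
    if total == 0:
        return None
    return sum(w * p for w, p in pairs) / total
```

For example, `hybrid_predict([None, 2.0], [0.75, 0.25])` falls back to the second recommender alone and returns 2.0.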

Evaluating Predictions
Compare predictions with known ratings
Root-mean-square error (RMSE)

Another approach: 0/1 model

Coverage
Number of items/users for which system
can make predictions
Precision
Accuracy of predictions
Receiver operating characteristic (ROC)
Tradeoff curve between false positives and
false negatives
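The measures above can be sketched as follows. The sketch assumes predictions are compared against held-out known ratings, that the 0/1 model treats a rating at or above a chosen threshold as "recommend", and that coverage is reported as a fraction rather than a count. All function names are illustrative.

```python
import math


def rmse(pairs):
    """Root-mean-square error over a list of (predicted, actual) ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def precision(pairs, threshold):
    """0/1 model: among items predicted at or above the threshold, the
    fraction whose actual rating is also at or above it."""
    recommended = [(p, a) for p, a in pairs if p >= threshold]
    if not recommended:
        return None
    hits = sum(1 for _, a in recommended if a >= threshold)
    return hits / len(recommended)


def coverage(predictions):
    """Fraction of requests for which the system produced a prediction."""
    made = sum(1 for p in predictions if p is not None)
    return made / len(predictions)
```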

Problems with Measures

Narrow focus on accuracy sometimes
misses the point
Prediction Diversity
Prediction Context
Order of predictions

In practice, we care only to predict high
ratings
RMSE might penalize a method that does
well for high ratings and badly for others

Finding similar vectors

Common problem that comes up in
many settings
Given a large number N of vectors in
some high-dimensional space (M
dimensions), find pairs of vectors that
have high similarity
e.g., user profiles, item profiles
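As a baseline, the problem can be solved by comparing every pair of vectors. The sketch below assumes dense vectors of equal length M and cosine similarity; it does on the order of N² comparisons of cost M each, which is what makes the brute-force approach impractical for large N. The names `cosine` and `similar_pairs` are illustrative.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similar_pairs(vectors, threshold):
    """vectors: {name: list of floats}. Returns every pair of names whose
    cosine similarity is at least the threshold, by brute force."""
    return [
        (a, b)
        for (a, va), (b, vb) in combinations(vectors.items(), 2)
        if cosine(va, vb) >= threshold
    ]
```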

Perfect set-up for next topic!

Near-neighbor search in high dimensions
