RecSys PyData2016
Systems
Using Python
Aug 12, 2016
Slides: https://goo.gl/ehBnhf
Notebook: https://github.com/dvysardana/RecommenderSystems_PyData_2016
Slides by Divya
Outline
1. Why Recommender Systems?
Is it Scalable?
Solution 1: Classification Model
Use features of both products and users to predict whether a user
will like a product.
User Features (e.g., age, gender)
Is it Scalable?
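One way to picture Solution 1 is a tiny logistic regression over combined user and product features. This is only a sketch: the feature layout (age / 100, is-female flag, is-action-movie flag), the toy rows, and the helper names `train_logistic` and `predict_like` are all hypothetical and not taken from the talk's notebook.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, labels, epochs=2000, lr=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            for j, xi in enumerate(x):
                grad_w[j] += (p - y) * xi
            grad_b += p - y
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict_like(weights, bias, x):
    """Probability that the user likes the product described by x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical toy data: [user age / 100, user is female, product is action movie]
rows = [
    [0.20, 1, 1], [0.25, 0, 1], [0.30, 1, 0], [0.60, 0, 1],
    [0.70, 1, 1], [0.22, 0, 0], [0.35, 0, 1], [0.65, 1, 0],
]
labels = [1, 1, 0, 0, 0, 0, 1, 0]  # 1 = liked, 0 = did not like

weights, bias = train_logistic(rows, labels)
young_action = predict_like(weights, bias, [0.25, 0, 1])
older_action = predict_like(weights, bias, [0.65, 0, 1])
```

Every (user, product) pair needs its own feature row, which is where the scalability question on this slide comes from.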
Solution 2: Nearest neighbor Collaborative Filtering
User-based Collaborative Filtering
Find users who have a similar taste in products to the current user.
Item-based Collaborative Filtering
Recommend items that are similar to the items the user bought.
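A minimal sketch of the user-based variant, assuming purchase history is stored as one set of items per user and that "similar taste" is measured with Jaccard overlap. The function name `recommend_user_based`, the neighbour count `k`, and the toy `purchases` data are assumptions for illustration.

```python
def jaccard(a, b):
    """Overlap of two sets divided by the size of their union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_user_based(purchases, user, k=2):
    """Score unseen items by the similarity of the k nearest users who bought them."""
    target = purchases[user]
    neighbours = sorted(
        (other for other in purchases if other != user),
        key=lambda other: jaccard(target, purchases[other]),
        reverse=True,
    )[:k]
    scores = {}
    for other in neighbours:
        sim = jaccard(target, purchases[other])
        for item in purchases[other] - target:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken alphabetically
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical purchase history
purchases = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "D"},
    "u4": {"E"},
}
recs = recommend_user_based(purchases, "u1")
```

Here `u2` is the closest neighbour of `u1`, so item C ranks ahead of item D.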
[Figure: co-occurrence matrix over items A, B and C; the only surviving cell value is 100,000]
Item-based Collaborative Filtering: Normalize co-occurrence matrix
Normalize by Popularity
Jaccard similarity(i, j) =
    Number of users common for i and j / Number of users for either i or j
[Figure: co-occurrence matrix over items A, B and C with example counts; surviving values: 100,000, 3, 2, 100,002]
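The Jaccard normalization can be sketched in a few lines, assuming each item is stored with the set of users who bought it. The name `item_jaccard` and the toy `users_by_item` data are hypothetical.

```python
from itertools import combinations

def item_jaccard(users_by_item):
    """Jaccard similarity for every pair of items, stored symmetrically."""
    sims = {}
    for i, j in combinations(sorted(users_by_item), 2):
        common = len(users_by_item[i] & users_by_item[j])  # users common for i and j
        either = len(users_by_item[i] | users_by_item[j])  # users for either i or j
        sims[(i, j)] = sims[(j, i)] = common / either if either else 0.0
    return sims

# Hypothetical data: item A is popular, items B and C are niche
users_by_item = {
    "A": {"u1", "u2", "u3", "u4"},
    "B": {"u1", "u2"},
    "C": {"u4", "u5"},
}
sims = item_jaccard(users_by_item)
```

Because the denominator grows with an item's total audience, a very popular item no longer looks similar to everything just because it co-occurs with everything.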
Item-based Collaborative Filtering: Effect of
multiple items
Rows from normalized co-occurrence matrix

          A       B      C     D
A         0       0.33   1     0.5

Weighted sum = (Scores for movie A + Scores for movie D)/2

          A       B      C     D
          0.125   0.29   0.5   0.35

Ranked Recommendations:

          C       D      B     A
          0.5     0.35   0.29  0.125
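The averaging step can be sketched as follows. Row A is taken from the slide; row D did not survive in the text, so the values used here are an assumption, back-computed so that the averages match the slide's weighted sums. The function name `rank_by_average` is hypothetical.

```python
items = ["A", "B", "C", "D"]

rows = {
    "A": [0.0, 0.33, 1.0, 0.5],   # row A as shown on the slide
    "D": [0.25, 0.25, 0.0, 0.2],  # assumption: back-computed from the slide's weighted sums
}

def rank_by_average(rows, bought, items):
    """Average the similarity rows of the bought items, then rank all items by score."""
    n = len(bought)
    scores = {
        item: sum(rows[b][idx] for b in bought) / n
        for idx, item in enumerate(items)
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

ranked = rank_by_average(rows, ["A", "D"], items)
ranked_items = [name for name, _ in ranked]
```

Like the slide, this ranking keeps A and D in the list; a production system would usually filter out items the user already has.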
Quiz
Given a user x item ratings matrix of size
480,189 x 17,770, which model will you apply
given that the matrix is very sparse?
Popularity-based recommender system: Maybe
Training: Use matrix factorization approaches (e.g., Singular Value Decomposition or SVD) to split the
Rating Matrix into a constituent User Matrix and Item Matrix with minimum sum of squared errors (SSE).
Goal: Predict unknown ratings for the remaining set of movies using
the learned User Matrix and Item Matrix.
● Refer to Gower 2014 to read more about Netflix prize and SVD (Gower, Stephen. "Netflix Prize and SVD." (2014): 1-10.)
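A minimal sketch of the training and prediction steps, assuming ratings are stored as a dict keyed by (user, item). Note that this uses stochastic gradient descent on the observed ratings only, rather than a textbook SVD, because most entries of the rating matrix are unknown. The names `factorize`, `predict` and `sse`, the hyperparameters, and the toy ratings are all assumptions.

```python
import random

def predict(user_m, item_m, u, i):
    """Predicted rating = dot product of the user's and the item's factor vectors."""
    return sum(p * q for p, q in zip(user_m[u], item_m[i]))

def sse(ratings, user_m, item_m):
    """Sum of squared errors over the known ratings only."""
    return sum((r - predict(user_m, item_m, u, i)) ** 2
               for (u, i), r in ratings.items())

def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn a User Matrix and an Item Matrix by SGD on the regularized SSE."""
    rng = random.Random(seed)
    user_m = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    item_m = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - predict(user_m, item_m, u, i)
            for f in range(k):
                pu, qi = user_m[u][f], item_m[i][f]
                user_m[u][f] += lr * (err * qi - reg * pu)
                item_m[i][f] += lr * (err * pu - reg * qi)
    return user_m, item_m

# Hypothetical known ratings, keyed by (user index, movie index)
ratings = {
    (0, 0): 5, (0, 1): 3, (0, 3): 1,
    (1, 0): 4, (1, 3): 1,
    (2, 0): 1, (2, 1): 1, (2, 3): 5,
    (3, 0): 1, (3, 3): 4,
    (4, 1): 1, (4, 2): 5, (4, 3): 4,
}

init_users, init_items = factorize(ratings, 5, 4, epochs=0)  # untrained starting point
user_m, item_m = factorize(ratings, 5, 4)
sse_before = sse(ratings, init_users, init_items)
sse_after = sse(ratings, user_m, item_m)
unknown = predict(user_m, item_m, 0, 2)  # user 0 has not rated movie 2
```

The regularization term `reg` is an addition to the plain SSE objective on the slide; it keeps the factors small so that predictions for unseen pairs stay reasonable.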
Performance Metric for Recommendation Systems
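[Figure: Venn diagram of all items in the test set, the relevant items, and the recommendations, including the relevant items that are not recommended]
Precision = # of products relevant & recommended / # of items recommended (Measure of exactness)
Recall = # of products relevant & recommended / # of relevant items (Measure of completeness)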
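Both metrics reduce to set arithmetic. The sketch below assumes the recommendations and the relevant items are given as plain lists of item IDs; the function name `precision_recall` and the example items are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # relevant AND recommended
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 items recommended, 3 items actually relevant
precision, recall = precision_recall(["A", "B", "C", "D"], ["B", "D", "E"])
```

Two of the four recommendations are relevant, and two of the three relevant items were found.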
Performance Metric for Recommendation Systems
Accuracy
ROC curve
Gunawardana, Asela, and Guy Shani. "A survey of accuracy evaluation metrics of recommendation tasks."
Journal of Machine Learning Research 10 (Dec 2009): 2935-2962.
Quiz: Comparison of Recommendation Systems
Which recommender model can handle brand new items
(e.g., a newly released movie)? This is the Cold Start Problem!
Personalized Recommendations
Uses Context (e.g., time of day)
User Features
Item Features
Purchase History
Scalable
Short url:
https://goo.gl/kVnNKf
Resources
1. Book: Recommender Systems: An Introduction, by Dietmar Jannach