
RecSys PyData2016

This document discusses recommender systems and how to build them using Python. It begins by explaining why recommender systems are useful and provides examples like movie, product, and friend recommendations. It then discusses three main approaches to building recommender systems: popularity-based, classification-based, and collaborative filtering. Collaborative filtering is further broken down into user-based, item-based, and model-based using matrix factorization. The document provides examples and quizzes to help explain each approach.

Uploaded by

Chong Hoo Chuah

Recommender Systems Using Python
Aug 12, 2016
Slides: https://goo.gl/ehBnhf
Notebook: https://github.com/dvysardana/RecommenderSystems_PyData_2016
Slides by Divya
Outline
1. Why Recommender Systems?

2. Examples of Recommender Systems

3. How to build a Recommender System?
a. Popularity based
b. Classification based
c. Collaborative Filtering
i. Nearest Neighbor
ii. Matrix Factorization

4. Evaluation of Recommender Systems


1. Why Recommender Systems?
Goal of a Recommender System: Identify products most relevant to the user (e.g., top n offers).

The long tail phenomenon


2. Some Examples?
Movie/TV show
Recommendations
Product Recommendations
Friend Recommendations
Job Recommendations
A Naive understanding of Recommender Systems

Users Matching Items


Quiz
What are the users and the matching items in the following cases?
a.) LinkedIn (Users: members, Items: jobs)
b.) Facebook (Users: members, Items: members)
c.) Amazon (Users: members, Items: products, e.g., books)
d.) Netflix (Users: members, Items: movies, TV shows)
Power of Recommendations: A Success story

“In 1988, a British mountain climber named Joe Simpson wrote a book called Touching the Void, a harrowing account of near death in the Peruvian Andes. It got good reviews but, only a modest success, it was soon forgotten. Then, a decade later, a strange thing happened. Jon Krakauer wrote Into Thin Air, another book about a mountain-climbing tragedy, which became a publishing sensation. Suddenly, Touching the Void started to sell again.” ... The Long Tail by Chris Anderson

(Figure: two book covers, captioned "Published in 1988" and "Published in 1996".)
3. Building a Recommender System
Solution 0: Popularity based Recommender System

Recommend items viewed/purchased by most people


Recommendations: Ranked list of items by their purchase count
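
A minimal sketch of the popularity-based approach, assuming purchases arrive as (user, item) pairs. The data and the function name popularity_recommendations are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical purchase log: (user, item) pairs, not taken from the slides.
purchases = [
    ("ann", "A"), ("bob", "A"), ("cat", "A"),
    ("ann", "B"), ("bob", "B"),
    ("cat", "C"),
]

def popularity_recommendations(purchases, n=2):
    """Return the n most purchased items, most popular first."""
    counts = Counter(item for _user, item in purchases)
    return [item for item, _count in counts.most_common(n)]

top_items = popularity_recommendations(purchases)
print(top_items)
```

Note that the function never looks at who is asking, so every user receives the same ranked list.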
Quiz
Which of the following is true of a popularity
based recommender system?
Can generate Personalized Recommendations?

Can use Context (e.g., time of day)?

Can use User Features?

Can use Item Features?

Can use Purchase History?

Is it Scalable?
Solution 1: Classification Model
Use features of both products and users to predict whether a user will like a product or not.

Inputs to the classifier:
User Features (e.g., age, gender)
Product Features (e.g., cost, quality)
Purchase History

Output of the classifier: Like/Not like

Limitations:
1. It is difficult to collect high quality information about products and users.
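
The classification idea can be sketched with a small logistic regression written in plain Python. Everything here is an assumption made for illustration: the slides do not name a classifier, the training rows are invented, and only one user feature (age) and one product feature (cost) are used.

```python
import math

# Hypothetical training rows: (user age, product cost, liked?).
rows = [
    (25, 10, 1), (30, 15, 1), (45, 20, 1),
    (22, 80, 0), (35, 90, 0), (50, 70, 0),
]

def features(age, cost):
    # Scale both features to roughly [0, 1] and add a constant bias term.
    return [1.0, age / 100.0, cost / 100.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, lr=1.0, epochs=5000):
    """Fit logistic regression weights with batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for age, cost, liked in rows:
            x = features(age, cost)
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j in range(3):
                grad[j] += (p - liked) * x[j]
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def will_like(w, age, cost):
    """Predict Like (True) or Not like (False) for one user/product pair."""
    x = features(age, cost)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= 0.5

weights = train(rows)
print(will_like(weights, 30, 12), will_like(weights, 35, 95))
```

Unlike the popularity model, the prediction depends on the user's own features, so two users can receive different answers for the same product.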
Quiz
Which of the following is true of a
Classification model based recommender
system?
Can generate Personalized Recommendations?

Can use Context (e.g., time of day)?

Can use User Features?

Can use Item Features?

Can use Purchase History?

Is it Scalable?
Solution 2: Nearest neighbor Collaborative Filtering

User-based Collaborative Filtering
Find users who have a similar taste in products to the current user.
Similarity is based upon similarity in users’ purchasing behaviour.
“User x is similar to user y because both purchased items A, B and C.”

Item-based Collaborative Filtering
Recommend items that are similar to the items the user bought.
Similarity is based upon co-occurrence of purchases.
“Items A and B were purchased by both users x and y, so they are similar.”

Fig. Source: http://www.salemmarafi.com/code/collaborative-filtering-with-python/
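
A rough sketch of the user-based variant, assuming each user is represented by the set of items they purchased. The slides only say that similarity comes from purchasing behaviour; using set overlap (Jaccard) as the similarity measure is an assumption, and the histories are hypothetical.

```python
# Hypothetical purchase histories, not taken from the slides.
histories = {
    "x": {"A", "B", "C"},
    "y": {"A", "B", "C", "D"},
    "z": {"E"},
}

def user_similarity(a, b):
    """Jaccard overlap of two users' purchased item sets."""
    union = histories[a] | histories[b]
    return len(histories[a] & histories[b]) / len(union) if union else 0.0

def recommend_from_neighbor(user):
    """Suggest items bought by the most similar other user."""
    others = [u for u in histories if u != user]
    nearest = max(others, key=lambda u: user_similarity(user, u))
    return sorted(histories[nearest] - histories[user])

print(recommend_from_neighbor("x"))
```

Here user x is closest to user y because both purchased A, B and C, so x is offered the one item y bought that x has not.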
Item-based Collaborative Filtering: An Example
(People who bought this also bought)

(Figure: users' purchase histories over items A, B and C, and the History Matrix of item co-occurrences built from them.)

Bob’s Recommendations = [C, B]

Example source: https://www.mapr.com/blog/inside-look-at-components-of-recommendation-engine
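
One way to sketch the "people who bought this also bought" idea is to count how often item pairs appear in the same purchase history. The histories below are hypothetical (they are not the data behind the figure) and were chosen so that Bob's ranking comes out as [C, B].

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical histories chosen for illustration; they are not the
# exact data behind the figure.
histories = {
    "u1": ["A", "C"],
    "u2": ["A", "B", "C"],
    "u3": ["A", "C"],
    "bob": ["A"],
}

def cooccurrence(histories):
    """Count, for each ordered item pair, the users who bought both."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for i, j in permutations(set(items), 2):
            counts[i][j] += 1
    return counts

def recommend(user, histories, counts):
    """Rank unseen items by total co-occurrence with the user's items."""
    owned = set(histories[user])
    scores = defaultdict(int)
    for i in owned:
        for j, c in counts[i].items():
            if j not in owned:
                scores[j] += c
    return sorted(scores, key=lambda j: (-scores[j], j))

counts = cooccurrence(histories)
print(recommend("bob", histories, counts))
```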
Item-based Collaborative Filtering: Effect of popular items

(Figure: the history matrix over items A, B and C when one item is very popular, with a count of 100,000.)
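Item-based Collaborative Filtering: Normalize co-occurrence matrix

Normalize by Popularity
Jaccard similarity = (Number of users common for i and j) / (Number of users for either i or j)

(Figure: the co-occurrence matrix over items A, B and C after normalization; the counts involving the popular item are divided by totals such as 100,002.)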
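
The Jaccard normalization can be sketched as follows, assuming each item is stored with the set of users who bought it. The buyer sets are hypothetical: item A stands in for a very popular item, while B and C are niche.

```python
# Hypothetical data: item A is very popular, B and C are niche.
buyers = {
    "A": set(range(1000)),  # users 0..999
    "B": {0, 1, 2},
    "C": {1, 2, 5000},
}

def jaccard(i, j):
    """|users who bought both i and j| / |users who bought i or j|."""
    union = buyers[i] | buyers[j]
    return len(buyers[i] & buyers[j]) / len(union) if union else 0.0

print(jaccard("A", "B"))  # raw co-occurrence 3, but a large union
print(jaccard("B", "C"))  # raw co-occurrence 2, small union
```

By raw counts A looks like the best match for B (3 shared buyers versus 2), but after normalization C scores far higher, which is the correction for popular items that this slide is after.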
Item-based Collaborative Filtering: Effect of multiple items

User history: [A, D]

Rows from normalized co-occurrence matrix:

                A       B       C       D
A               0       0.33    1       0.5
D               0.25    0.25    0       0.2
Weighted sum    0.125   0.29    0.5     0.35

Weighted sum = (Scores for movie A + Scores for movie D)/2

Ranked Recommendations:
C       D       B       A
0.5     0.35    0.29    0.125
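
The weighted sum can be checked with a few lines of Python. The similarity values are the two rows used in this example; the function name weighted_scores is hypothetical.

```python
# The two rows of the normalized co-occurrence matrix from this example.
similar = {
    "A": {"A": 0.0, "B": 0.33, "C": 1.0, "D": 0.5},
    "D": {"A": 0.25, "B": 0.25, "C": 0.0, "D": 0.2},
}
history = ["A", "D"]  # items in the user's history

def weighted_scores(history, similar):
    """Average the similarity rows of every item in the history."""
    candidates = similar[history[0]].keys()
    return {
        c: sum(similar[h][c] for h in history) / len(history)
        for c in candidates
    }

scores = weighted_scores(history, similar)
ranked = sorted(scores, key=scores.get, reverse=True)
print([(item, round(scores[item], 3)) for item in ranked])
```

Like the slide, this sketch keeps A and D in the ranking; a production system would usually filter out items the user already has.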
Quiz
Given a user x item ratings matrix of size 480,189 x 17,770, which model would you apply, given that the matrix is very sparse?

Popularity based recommender system (maybe)

Classification model based recommender system

Item similarity based recommender system

User similarity based recommender system

All of the above

None of the above


This is the Million Dollar Matrix!

(Figure: the ratings matrix, 17,770 items wide.)

~100 million ratings

Only 100 million out of a possible 8.5 billion ratings are non zero.

Very sparse matrix!
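
The sparsity figure follows directly from the matrix dimensions given in the quiz. This is a short arithmetic check, using the slide's approximate count of 100 million known ratings.

```python
users, items = 480_189, 17_770
known_ratings = 100_000_000  # the slide's approximate count

possible = users * items
density = known_ratings / possible
print(f"{possible:,} possible ratings")
print(f"{density:.2%} filled, {1 - density:.2%} empty")
```

So only a little over 1 percent of the cells hold a rating, which is why approaches that depend on dense overlap between users or items struggle here.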


Solution 3:
Model based Collaborative Filtering (Matrix Factorization)
Identify latent (hidden) features from the input user x item ratings matrix to represent users and items as vectors in N dimensional space.

Example latent dimensions: Serious vs. Escapist; Geared towards Males vs. Females.

User Vector (u) = [1.3 2.8]
Item Vector (v) = [2.5 -1.9]

New user (Known ratings): [4 5 ….3]

(Figure: Netflix Prize diagram, Koren et al., 2009)
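
In matrix factorization models the affinity between a user and an item is commonly taken to be the dot product of their latent vectors. The slide does not state this explicitly, so treat it as an assumption; the vectors are the ones given above and the function name affinity is hypothetical.

```python
user_vector = [1.3, 2.8]   # u from the slide
item_vector = [2.5, -1.9]  # v from the slide

def affinity(u, v):
    """Dot product of a user vector and an item vector."""
    return sum(a * b for a, b in zip(u, v))

score = affinity(user_vector, item_vector)
print(round(score, 2))
```

The negative result says the two vectors point in largely opposite directions along the second latent dimension, so this item would rank low for this user.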


Solution 3:
Model based Collaborative Filtering (Matrix Factorization)

Training: Use matrix factorization approaches (e.g., Singular Value Decomposition or SVD) to split the Rating Matrix into a constituent User Matrix and Item Matrix with minimum sum of squared error (SSE).

SVD: $A_{n \times p} = U_{n \times n} S_{n \times p} V^T_{p \times p}$

"The winning entry for the famed Netflix Prize had a number of SVD models including SVD++ blended with Restricted Boltzmann Machines. Using these methods they achieved a 10 percent increase in accuracy over Netflix’s existing algorithm." --Gower 2014

Goal: Predict unknown ratings for the remaining set of movies using the learned User Matrix and Item Matrix.
● Refer to Gower 2014 to read more about Netflix prize and SVD (Gower, Stephen. "Netflix Prize and SVD." (2014): 1-10.)
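
The training step can be sketched in plain Python. Note the swap: this is not an exact SVD but a stochastic gradient descent factorization that minimizes squared error on the known ratings only, a common substitute when the matrix has missing entries. The ratings, hyperparameters and function names are all hypothetical.

```python
import random

# Hypothetical ratings (rows: users, columns: items); None = unknown.
R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]

def factorize(R, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Learn user and item factors by SGD on the known ratings only."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(k)] for _ in R]     # User Matrix
    Q = [[rng.random() for _ in range(k)] for _ in R[0]]  # Item Matrix
    known = [(u, i, r) for u, row in enumerate(R)
             for i, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        for u, i, r in known:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q, known

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

P, Q, known = factorize(R)
sse = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in known)
rmse = (sse / len(known)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
print(f"predicted rating for user 0, item 2: {predict(P, Q, 0, 2):.2f}")
```

The last line fills in a cell that was unknown in R, which is the goal stated above: predict ratings for items the user has not rated yet.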
Performance Metric for Recommendation Systems

(Figure: Venn diagram of two overlapping sets, "All Recommendations (made on training dataset)" and "All Relevant Items (all items in the test set)". The overlap holds relevant items that are also recommended; the rest of the first set holds irrelevant items that are recommended; the rest of the second set holds relevant items that are not recommended.)

Precision = # of products relevant & recommended / # of items in the recommendations (Measure of exactness)

Recall = # of products relevant & recommended / # of relevant items (Measure of completeness)
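
Both formulas translate directly into code. The recommended and relevant lists below are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 recommendations, 3 relevant items in the test set.
recommended = ["A", "B", "C", "D"]
relevant = ["B", "D", "E"]
precision, recall = precision_recall(recommended, relevant)
print(precision, recall)
```

Two of the four recommendations are relevant, and two of the three relevant items were found.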
Performance Metric for Recommendation Systems

Precision Recall Curve: Evaluation of top n recommendations
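
The points of a precision recall curve can be produced by cutting a ranked list at every n and recomputing both metrics. This is a sketch with a hypothetical ranked list and hypothetical relevant items.

```python
def precision_recall_at_n(ranked, relevant):
    """Return (n, precision, recall) for every cutoff n of a ranked list."""
    relevant = set(relevant)
    if not relevant:
        return []
    points, hits = [], 0
    for n, item in enumerate(ranked, start=1):
        hits += item in relevant
        points.append((n, hits / n, hits / len(relevant)))
    return points

# Hypothetical ranked recommendations and test-set relevant items.
curve = precision_recall_at_n(["C", "D", "B", "A"], ["C", "B"])
for n, p, r in curve:
    print(n, round(p, 2), round(r, 2))
```

Recall can only grow as n increases, while precision typically falls, which gives the curve its usual downward slope.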
Performance Metric for Recommendation Systems
Some other
metrics
Mean Absolute Error

Accuracy

ROC curve

Gunawardana, Asela, and Guy Shani. "A survey of accuracy evaluation metrics of recommendation tasks." Journal of Machine Learning Research 10 (Dec 2009): 2935-2962.
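
Of the other metrics listed, Mean Absolute Error is the simplest to compute when the model predicts ratings. A minimal sketch with hypothetical held-out ratings and predictions:

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical held-out ratings and model predictions.
actual = [4, 5, 3, 1]
predicted = [3.5, 4.0, 3.0, 2.5]
mae = mean_absolute_error(actual, predicted)
print(mae)
```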
Quiz: Comparison of Recommendation Systems
Which recommender model can handle brand new items (e.g., a newly released movie)? This is the Cold Start Problem!

Models compared (columns):
Popularity Based
Classification Based
Nearest Neighbor based CF
Matrix Factorization based CF

Criteria (rows):
Personalized Recommendations
Uses Context (e.g., time of day)
User Features
Item Features
Purchase History
Scalable
Can handle brand new items?
Music Recommendation (Python notebook)
Notebook: https://github.com/dvysardana/RecommenderSystems_PyData_2016

Short url: https://goo.gl/kVnNKf
Resources
1. Book: Recommender Systems: An Introduction by Dietmar Jannach

2. Book: Mining Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman (www.mmds.org)

3. Coursera course on Recommender Systems, by University of Washington

4. Coursera course on Recommender Systems, by University of Minnesota

Do you have
any questions?
