Fast Maximum Margin Matrix Factorization

This document summarizes collaborative filtering methods for predicting user preferences in recommender systems. It describes low-rank matrix factorization approaches that represent users and items as vectors of factors inferred from existing ratings. A maximum margin matrix factorization (MMMF) method is proposed to directly optimize the hinge loss between predicted and observed ratings. MMMF finds a local minimum of the non-convex objective function using gradient descent on factor matrices U and V. An experimental study on movie rating datasets shows MMMF outperforms other factorization models in terms of normalized mean absolute error.


Jason D. M. Rennie (MIT)
Nathan Srebro (Univ. of Toronto)

Presented by:
RAHUL BATRA (MT13049)
SHILPA GARG (MT12049)
COLLABORATIVE PREDICTION

LOW RANK MATRIX FACTORIZATION

MAXIMUM MARGIN MATRIX FACTORIZATION (MMMF)

OPTIMIZATION METHOD

EXPERIMENTAL STUDY OF VARIOUS FACTOR MODELS ON THE
1 MILLION MOVIELENS DATASET
Collaborative Prediction
Based on a partially observed matrix:
Predict the unobserved entries: "Will user i like movie j?"
[Figure: partially observed users x movies rating matrix; "?" marks the unobserved entries to be predicted]
Matrix Factorization

[Figure: ordinal regression view of matrix factorization. X = U V’ approximates the rating matrix Y; each movie has a feature vector (v1, ..., v10 shown), each user a preference-weight vector (w1 shown), and the resulting preference scores are mapped to ratings 1-5 through thresholds q.]
Low Rank Matrix Factorization
[Figure: the rating matrix Y is approximated by X = U V’ with U, V of rank k]

• Sum-squared loss, fully observed Y: the SVD finds the global optimum (see the sketch below)
• Classification-error loss, or partially observed Y: non-convex, no explicit solution
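To make the fully observed case concrete, here is a minimal NumPy sketch (function and variable names are illustrative) of the truncated-SVD solution, which is the global minimizer of the sum-squared loss over rank-k matrices when every entry of Y is observed:

```python
import numpy as np

def best_rank_k_approximation(Y, k):
    """Truncated SVD: the global minimizer of ||Y - X||_F^2 over rank-k X
    when Y is fully observed (Eckart-Young theorem)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Toy fully observed rating matrix (users x movies)
Y = np.array([[5., 4., 1., 1.],
              [4., 5., 1., 2.],
              [1., 1., 5., 4.],
              [2., 1., 4., 5.]])
X = best_rank_k_approximation(Y, k=2)
print(np.round(X, 2))
```

With missing entries this shortcut is no longer available, which motivates the discussion below.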
Problems with other factor models and low-rank approximations:
• In a collaborative prediction setting, only some of the entries of Y are observed, and the low-rank matrix X minimizing the sum-squared distance to the observed entries can no longer be computed via a singular value decomposition.
• Finding a low-rank approximation to a partially observed matrix is a difficult non-convex optimization problem with many local minima.
• Low-rank approximations constrain the dimensionality of the factorization X = UV’, i.e. the number of allowed factors.
Factor Model MMMF

Example user preference weights over movie features:
  +1.5 x v1 (comic value)
  +2.4 x v2 (dramatic value)
  -1.9 x v3 (violence)

Each movie's feature values combine with these weights to give a preference score, which is then thresholded into a rating (a small sketch of this step follows below):

Preference scores (movies): 1.4  -5.9  2.2  -0.8  -3.7  4.6  -1.8  3.5
Ratings (movies):             3     1    4     3     2    5     2    4
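As a small illustration of the thresholding step, the sketch below maps the slide's preference scores to its ratings; the threshold values used here are hypothetical, chosen only so that the mapping reproduces the numbers above.

```python
import numpy as np

# Preference scores and ratings taken from the slide; thresholds are hypothetical.
scores = np.array([1.4, -5.9, 2.2, -0.8, -3.7, 4.6, -1.8, 3.5])
thetas = np.array([-4.5, -1.0, 2.0, 4.0])   # R-1 = 4 thresholds for R = 5 ratings

def score_to_rating(score, thetas):
    """Rating r = number of thresholds the score exceeds, plus one."""
    return int(np.sum(score > thetas)) + 1

ratings = [score_to_rating(s, thetas) for s in scores]
print(ratings)   # [3, 1, 4, 3, 2, 5, 2, 4]
```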
Matrix Factorization

[Figure: full example with 15 users. Each user i has a 10-dimensional preference-weight vector wi and each movie a feature vector vj; the preference scores wi · vj are converted into the observed 1-5 ratings through the thresholds q. Stacking the wi as rows of U and the vj as rows of V gives X = U V’.]
Norm Constrained Factorization

Instead of bounding the rank, bound the norm of a low-norm factorization X = U V’:

  ||X||tr = min{U,V : X = UV’} (||U||Fro² + ||V||Fro²) / 2,   where ||U||Fro² = ∑i,j Uij²
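A minimal NumPy sketch (all names illustrative) of this identity: the trace norm of X equals the sum of its singular values, and the balanced factorization taken from the SVD attains (||U||Fro² + ||V||Fro²)/2 = ||X||tr.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4)) @ rng.standard_normal((4, 5))  # a low-rank matrix

Usvd, s, Vt = np.linalg.svd(X, full_matrices=False)
trace_norm = s.sum()                      # trace norm = sum of singular values

# Balanced factorization X = U V' with U = Usvd*sqrt(s), V = Vt'*sqrt(s)
U = Usvd * np.sqrt(s)
V = Vt.T * np.sqrt(s)
frob_bound = 0.5 * (np.linalg.norm(U, 'fro')**2 + np.linalg.norm(V, 'fro')**2)

print(np.allclose(X, U @ V.T))             # True: it is a valid factorization
print(np.isclose(trace_norm, frob_bound))  # True: the bound is attained
```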
MMMF Objective

Original objective (with the all-thresholds loss):
  minX  ||X||tr + c · loss(X, Y)

Factorized objective:
  minU,V  (||U||Fro² + ||V||Fro²)/2 + c · loss(U V’, Y)
LOW NORM FACTORIZATION

First we consider binary labels Y ∈ {±1}^(n×m).

We seek a minimum trace norm X that matches the observed labels with a margin of one.

Using the hinge loss h(z) = max(0, 1 - z), the optimization problem can be written as

  minX  ||X||tr + c ∑(i,j)∈S h(Yij · Xij)

where S is the set of observed entries.
OPTIMIZATION PROBLEM

To handle ratings rather than binary labels, we relate the real-valued Xij to the discrete Yij ∈ {1, ..., R} using R - 1 thresholds Θ1, Θ2, ..., ΘR-1.

Substituting X = U V’, we then minimize the factorized all-thresholds objective

  J(U, V, Θ) = (||U||Fro² + ||V||Fro²)/2 + c ∑(i,j)∈S ∑r=1..R-1 h( Tij^r (Θir - Ui Vj’) )

where Tij^r = +1 for r ≥ Yij and -1 for r < Yij.
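The equations on this slide were figures in the original; the sketch below simply evaluates the all-thresholds objective under the formulation written above. The per-user thresholds Theta[i, r] and all function names are assumptions for illustration, not the authors' code.

```python
import numpy as np

def hinge(z):
    """Standard hinge h(z) = max(0, 1 - z)."""
    return max(0.0, 1.0 - z)

def all_thresholds_objective(U, V, Theta, Y, observed, c):
    """Evaluate J(U, V, Theta): trace-norm surrogate plus all-thresholds hinge loss.
    Y holds ratings 1..R, `observed` is a boolean mask, Theta is an
    (n_users, R-1) array of per-user thresholds (an assumption for illustration)."""
    X = U @ V.T
    reg = 0.5 * (np.sum(U**2) + np.sum(V**2))
    R = Theta.shape[1] + 1
    loss = 0.0
    for i, j in zip(*np.nonzero(observed)):
        for r in range(1, R):                       # thresholds r = 1 .. R-1
            T = 1.0 if r >= Y[i, j] else -1.0       # +1 at/above the rating, -1 below
            loss += hinge(T * (Theta[i, r - 1] - X[i, j]))
    return reg + c * loss

# Tiny usage example with made-up shapes
rng = np.random.default_rng(0)
U, V = rng.normal(size=(4, 2)), rng.normal(size=(3, 2))
Theta = np.tile(np.array([-1.0, 0.0, 1.0, 2.0]), (4, 1))   # R = 5 ratings
Y = rng.integers(1, 6, size=(4, 3))
observed = rng.random((4, 3)) < 0.5
print(all_thresholds_objective(U, V, Theta, Y, observed, c=1.0))
```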


OPTIMIZATION PROBLEM

Gradient descent on U, V and Θ is used to locally optimize J(U, V, Θ).

The hinge h(z) is not differentiable at z = 1, so we replace it with a smoothed version, the smooth hinge, which gives well-defined gradients everywhere.

The remaining difficulty is that J(U, V, Θ) is not a convex function of U and V, even though the original trace-norm problem was convex in X and Θ; we therefore settle for a local minimum found by gradient descent.
Smooth Hinge

[Figure: loss values (left) and gradients (right) for the Hinge and the Smooth Hinge. The gradients are identical outside the region z ∈ (0, 1).]
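Below is a short NumPy sketch of a smooth hinge and its gradient; the exact piecewise-quadratic form is an assumption consistent with the caption (the gradients match the hinge outside (0, 1)), not taken verbatim from the slide.

```python
import numpy as np

def smooth_hinge(z):
    """Smooth hinge: quadratically interpolates the hinge on (0, 1) so the
    loss is differentiable everywhere (assumed piecewise-quadratic form)."""
    z = np.asarray(z, dtype=float)
    return np.where(z <= 0, 0.5 - z,
           np.where(z >= 1, 0.0, 0.5 * (1.0 - z)**2))

def smooth_hinge_grad(z):
    """Derivative: -1 for z <= 0, z - 1 on (0, 1), 0 for z >= 1 --
    identical to the hinge gradient outside (0, 1)."""
    z = np.asarray(z, dtype=float)
    return np.where(z <= 0, -1.0,
           np.where(z >= 1, 0.0, z - 1.0))

print(smooth_hinge([-1.0, 0.5, 2.0]))       # [1.5   0.125  0.]
print(smooth_hinge_grad([-1.0, 0.5, 2.0]))  # [-1.  -0.5    0.]
```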
Local Minima

Factorized objective:
  minU,V  (||U||Fro² + ||V||Fro²)/2 + c · loss(U V’, Y)

• Optimize U, V with gradient descent (may stop at a local minimum; see the sketch below)
• Optimize X with an SDP (global optimum of the convex problem)

[Figure: the two optimizers compared on a 100 x 100 MovieLens submatrix, 65% sparse]
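A minimal sketch of the gradient-descent strategy on the factorized objective. For brevity it uses a squared loss on the observed entries in place of the paper's smooth-hinge all-thresholds loss, and the problem sizes, learning rate, and step count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, k, c, lr = 30, 20, 5, 1.0, 0.01

Y = rng.integers(1, 6, size=(n, m)).astype(float)   # toy 1-5 ratings
observed = rng.random((n, m)) < 0.35                 # ~35% of entries observed

U = 0.1 * rng.standard_normal((n, k))
V = 0.1 * rng.standard_normal((m, k))

for step in range(500):
    E = (U @ V.T - Y) * observed    # residuals on observed entries only
    grad_U = U + c * (E @ V)        # gradient of regularizer + squared loss
    grad_V = V + c * (E.T @ U)
    U -= lr * grad_U
    V -= lr * grad_V

E = (U @ V.T - Y) * observed
obj = 0.5 * (np.sum(U**2) + np.sum(V**2)) + 0.5 * c * np.sum(E**2)
print(f"objective after 500 steps: {obj:.2f}")
```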


Collaborative Prediction Results

Dataset (size, sparsity):   EachMovie (36656 x 1648, 96%)    MovieLens (6040 x 3952, 96%)

Algorithm     Weak NMAE   Strong NMAE     Weak NMAE   Strong NMAE
URP           .4422       .4557           .4341       .4444
Attitude      .4520       .4550           .4320       .4375
MMMF          .4397       .4341           .4156       .4203

(NMAE: normalized mean absolute error; lower is better.)
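For reference, NMAE divides the mean absolute error by the expected error of random guessing; for 1-5 ratings that constant is commonly taken to be 1.6, which is an assumption about the normalization behind the table above rather than something stated on this slide.

```python
import numpy as np

def nmae(y_true, y_pred, norm=1.6):
    """Normalized MAE. The constant 1.6 is the expected MAE when both true
    ratings and predictions are uniform over {1,...,5} (assumed normalization)."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))) / norm

print(round(nmae([5, 3, 1, 4], [4, 3, 2, 5]), 4))  # 0.4688
```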
Summary
• We scaled MMMF to large problems by optimizing the
Factorized Objective
• Empirical tests indicate that local minima issues are rare or
absent
• We compare against results obtained by Marlin (2004) and
find that MMMF substantially outperforms all nine methods
he tested.
THANK YOU
