
Factorization Machines

- Introduction
Bartłomiej Twardowski
18.10.2016
Warsaw Data Science Meetup
Polish English?
• Support Vector Machines => “maszyna wektorów nośnych”

• Matrix Factorization => “faktoryzacja macierzy”

• Factorization Machines => “maszyna faktoryzująca”?

• LMGTFY :-) Let's stick to the English name then!
Motivation
• one of the most successful models, with a great deal of expressiveness

• great for getting started with context-aware recommendations

• considered a base toolbox for advertisers/Kagglers

• an FFM presentation was given at RecSys 2016 (still, almost nothing new in it :-( )

• considered a fun and original subject for the meetup

2015.10.6 - meetup about recommender systems
Not motivated enough?
Success stories.
1. competitions

2. appears more and more often in DS job offers
Factorization Machines
• S. Rendle 2010 [1]

• combines the advantages of Support Vector Machines (SVM) with factorization models

• generic (real-valued features)

• incredibly good for sparse data

• model expressiveness
MF - quick recap
Simplest problem formulation [3]:

• U - user set, I - item set

• the matrix $R \in \mathbb{R}^{|U| \times |I|}$ contains the user ratings

• find the best representation in a k-dimensional latent space for users, P (|U| × k), and items, Q (|I| × k), so that the matrix $\hat{R}$ is defined as:

$\hat{R} = P Q^T$

• to predict a rating:

$\hat{r}_{ui} = p_u^T q_i$
MF - quick recap

with regularization [4]:

$\min_{P, Q} \sum_{(u,i) \in S} \left( r_{ui} - p_u^T q_i \right)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 \right)$
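A minimal sketch of minimizing this objective with SGD, on hypothetical toy data; the learning rate, regularization strength, and k below are illustrative values, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 4, 3
P = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
S = [(0, 1, 5.0), (0, 2, 3.0), (1, 1, 4.0)]   # observed (user, item, rating)

lr, lam = 0.01, 0.05                          # illustrative hyperparameters
for epoch in range(50):
    for u, i, r in S:
        e = r - P[u] @ Q[i]                   # prediction error
        pu = P[u].copy()                      # keep old p_u for Q's update
        P[u] += lr * (e * Q[i] - lam * P[u])  # gradient step with L2 penalty
        Q[i] += lr * (e * pu - lam * Q[i])
```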
Linear & Poly2 models
simple linear regression model:
n
X
ŷ(x) = w0 + w i xi
i=1

adding two-way interactions:


n
X n
X n
X
ŷ(x) = w0 + w i xi + vi,j xi xj
i=1 i=1 j=i+1
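As a sketch (dense numpy for clarity, names illustrative), the two predictors look like this; note that Poly2 needs one independent weight per feature pair, i.e. O(n²) parameters, and $v_{i,j}$ can only be learned for pairs that actually co-occur in the training data - exactly what fails under sparsity:

```python
import numpy as np

def linear_predict(x, w0, w):
    """Linear model: w0 + sum_i w_i x_i."""
    return w0 + w @ x

def poly2_predict(x, w0, w, V2):
    """Poly2 model: adds an independent weight V2[i, j] per feature pair."""
    y = w0 + w @ x
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            y += V2[i, j] * x[i] * x[j]
    return y
```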
FM Model
for two-way interactions:

$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j$

model parameters:

$w_0 \in \mathbb{R}, \quad \mathbf{w} \in \mathbb{R}^n, \quad V \in \mathbb{R}^{n \times k}$

For each $x_i$ we have a dedicated vector $v_i$ with k features. Then, instead of a weight $w_{ij}$ for each feature interaction, we have the dot product:

$\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} \, v_{j,f}$

Wait, it's O(kn²)! Not linear!
Making it O(kn)

$\begin{bmatrix} x_{11} & x_{12} & x_{13} & \cdots & x_{1n} \\ x_{21} & x_{22} & x_{23} & \cdots & x_{2n} \\ x_{31} & x_{32} & x_{33} & \cdots & x_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{d1} & x_{d2} & x_{d3} & \cdots & x_{dn} \end{bmatrix}$
Simplified version
from the k = 1, n = 2 perspective, let:

$v_1 x_1 = a, \quad v_2 x_2 = b$

then:

$(a + b)^2 = a^2 + 2ab + b^2$

$ab = \frac{1}{2} \left[ (a + b)^2 - a^2 - b^2 \right]$

And now it looks very familiar :-)
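In the general case the same rearrangement yields (Lemma 3.1 in [1]):

$\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle \, x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} \, x_i \right)^2 - \sum_{i=1}^{n} v_{i,f}^2 \, x_i^2 \right]$

which costs O(kn). A minimal numpy sketch of the resulting prediction (dense for clarity; with sparse x only the non-zero entries need to be touched):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """x: (n,) features, w0: bias, w: (n,) weights, V: (n, k) factors."""
    s = V.T @ x                   # (k,) per-factor weighted sums
    s2 = (V ** 2).T @ (x ** 2)    # (k,) per-factor sums of squares
    return w0 + w @ x + 0.5 * np.sum(s ** 2 - s2)
```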


FM vs SVM
• FM combines the advantages of SVMs and factorization models

• general prediction working on real values (like SVM)

• estimates interactions well under huge sparsity, where SVMs fail (e.g. recommender systems)

• the model equation of FMs can be calculated in linear time

• comparable to a polynomial kernel in SVM, but works for very sparse data and is fast
Use case: Context-Aware Recommender Systems
• U = {Alice (A), Bob (B), Charlie (C), . . .}

• I = {Titanic (TI), Notting Hill (NH), Star Wars (SW), Star Trek (ST), . . .}

• S = {(A, TI, 2010-1, 5), (A, NH, 2010-2, 3), (A, SW, 2010-4, 1), (B, SW, 2009-5, 4), (B, ST, 2009-8, 5), (C, TI, 2009-9, 1), (C, SW, 2009-12, 5)}

• Example from [1]


Example of input data preparation
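A hedged sketch of how the transactions in S could be turned into the sparse feature vectors an FM consumes: a one-hot block for the user, a one-hot block for the item, plus real-valued context such as time. The index layout and time normalization here are illustrative choices, not the exact encoding from [1]:

```python
users = ["A", "B", "C"]
items = ["TI", "NH", "SW", "ST"]

def encode(user, item, months_since_start, n_months=60):
    """Build one feature vector x for a (user, item, time) event."""
    x = [0.0] * (len(users) + len(items) + 1)
    x[users.index(user)] = 1.0                # one-hot user block
    x[len(users) + items.index(item)] = 1.0   # one-hot item block
    x[-1] = months_since_start / n_months     # real-valued time feature
    return x

x = encode("A", "TI", 1)   # (Alice, Titanic, 2010-1) from the example above
```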
Why use FM for this?
The drawback of tensor factorization models, and even more so of specialized factorization models, is that [1]:

(1) they are not applicable to standard prediction data (e.g. a real-valued feature vector)

(2) specialized models are usually derived individually for a specific task, requiring effort in modeling and in the design of a learning algorithm.
How about ranking?

http://www.tongji.edu.cn/~qiliu/lor_vs.html

Go for the pairwise approach!
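One common pairwise formulation is a BPR-style logistic loss over the difference of model scores for a preferred and a non-preferred example; a minimal sketch under that assumption (the `score` argument could be, e.g., the `fm_predict` sketch above):

```python
import numpy as np

def pairwise_loss(score, x_pos, x_neg):
    """-log sigmoid(score(x_pos) - score(x_neg)); minimizing it pushes
    preferred examples above non-preferred ones - only the order matters."""
    diff = score(x_pos) - score(x_neg)
    return np.logaddexp(0.0, -diff)   # = -log(sigmoid(diff)), computed stably
```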


Model expressiveness
FM ~ MF
given a feature vector x with only two non-zero entries, indicators for the active user u and item i (i.e. $x_u = 1$, $x_i = 1$, all other $x_j = 0$):

the model will then mimic a biased MF:

$\hat{y}(x) = w_0 + w_u + w_i + \langle v_u, v_i \rangle$
FM ~ PITF
given user × item × tag interactions encoded as three one-hot indicators (u, i, t), and pairwise ranking optimization over tags (the terms that do not depend on the tag drop out), FM will mimic a pairwise interaction tensor factorization model (PITF) [7]:

$\hat{y}(u, i, t) = \langle v_u, v_t \rangle + \langle v_i, v_t \rangle$
And others

(e.g. factorized NN, KNN++, SVD++, …)

presented in [2].
Field-aware FM
• Has been used to win two CTR competitions [5].

• Introduces grouped features - fields, e.g. user, color, time.

• Learns a different set of latent factors for every pair of fields:

$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_{i,f(j)}, v_{j,f(i)} \rangle \, x_i x_j$

where f(i) is the field of feature i.
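A minimal sketch of the interaction term above (shapes and names are illustrative, not the libffm API): each feature i keeps one latent vector per field, and an interaction between i and j looks up i's vector for j's field and vice versa:

```python
import numpy as np

def ffm_predict(x, field, w0, w, V):
    """x: (n,) features, field: (n,) field index of each feature,
    V: (n, n_fields, k) latent vectors, one per (feature, field) pair."""
    y = w0 + w @ x
    nz = np.nonzero(x)[0]                 # only non-zero features interact
    for a, i in enumerate(nz):
        for j in nz[a + 1:]:
            y += (V[i, field[j]] @ V[j, field[i]]) * x[i] * x[j]
    return y
```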


Available implementations
• libfm (http://www.libfm.org/), SGD/ALS/MCMC

• FM for Julia (https://github.com/btwardow/FactorizationMachines.jl)

• fastFM (https://github.com/ibayer/fastFM)

• DiFacto (https://github.com/dmlc/difacto)

• lightfm

• spark-libFM, libffm
My experiments with FM on GPU

The same implementation moved from numpy to Theano was ~7x faster, without using any special GPU tricks!
Going for click prediction?

• feature engineering (counting features, like historical CTR)

• hashing trick (see the sketch below)

• L1, FTRL, using e.g. vw

• making new features - e.g. decision tree encoding
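A sketch of the hashing trick mentioned above: hash raw "field=value" strings into a fixed-size index space, so no feature dictionary has to be kept and unseen values still map to a column; collisions are simply accepted as noise. The 2^20 buckets and md5 are illustrative choices (vw uses murmurhash):

```python
import hashlib

D = 2 ** 20   # fixed number of feature buckets (illustrative)

def hash_feature(field, value):
    """Map an arbitrary categorical value to a stable column index in [0, D)."""
    h = hashlib.md5(f"{field}={value}".encode()).digest()
    return int.from_bytes(h[:8], "little") % D

idx = hash_feature("site_id", "abc123")   # column index in a sparse vector
```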


How about now? :-)
References
[1] Rendle, Steffen. "Factorization machines." 2010 IEEE International Conference on Data Mining. IEEE, 2010.

[2] Rendle, Steffen. "Factorization machines with libFM." ACM Transactions on Intelligent Systems and Technology (TIST) 3.3 (2012): 57.

[3] Takács, Gábor, et al. "Matrix factorization and neighbor based algorithms for the Netflix prize problem." Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, 2008.

[4] Paterek, Arkadiusz. "Improving regularized singular value decomposition for collaborative filtering." Proceedings of KDD Cup and Workshop. Vol. 2007. 2007.
References
[5] http://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pdf

[6] Srebro, Nathan, Jason D. M. Rennie, and Tommi S. Jaakkola. "Maximum-margin matrix factorization." Advances in Neural Information Processing Systems 17. MIT Press, 2005: 1329–1336.

[7] Rendle, Steffen, and Lars Schmidt-Thieme. "Pairwise interaction tensor factorization for personalized tag recommendation." Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM '10). ACM, New York, NY, 2010: 81–90.
Q&A
@btwardow, Bartłomiej Twardowski
[email protected]
