Online Learning for Matrix Factorization and Sparse Coding

Mairal, Julien; Bach, Francis; Ponce, Jean; Sapiro, Guillermo

arXiv:0908.0050 (stat)

[Submitted on 1 Aug 2009 (v1), last revised 11 Feb 2010 (this version, v2)]

Authors: Julien Mairal (INRIA Rocquencourt), Francis Bach (INRIA Rocquencourt), Jean Ponce (INRIA Rocquencourt, LIENS), Guillermo Sapiro

Abstract: Sparse coding--that is, modelling data vectors as sparse linear combinations of basis elements--is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the large-scale matrix factorization problem that consists of learning the basis set, adapting it to specific data. Variations of this problem include dictionary learning in signal processing, non-negative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data demonstrating that it leads to state-of-the-art performance in terms of speed and optimization for both small and large datasets.
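The abstract does not spell out the update rules, so the following is only a minimal sketch of one online scheme of the kind it describes. It assumes an l1-penalized least-squares objective, ISTA for the sparse-coding step, and a block-coordinate dictionary update over running statistics A and B. All function names, the penalty weight `lam`, and the iteration counts are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def sparse_code(x, D, lam, n_iter=100):
    """ISTA for min_a 0.5 * ||x - sum_j a[j] * D[j]||^2 + lam * ||a||_1.

    D is a list of k atoms, each a list of m floats (assumed layout).
    """
    k = len(D)
    G = [[dot(D[i], D[j]) for j in range(k)] for i in range(k)]  # Gram matrix
    b = [dot(d, x) for d in D]
    # The max absolute row sum upper-bounds the largest eigenvalue of G,
    # so 1 / L is a safe gradient step size.
    L = max(sum(abs(g) for g in row) for row in G) or 1.0
    a = [0.0] * k
    for _ in range(n_iter):
        grad = [dot(G[i], a) - b[i] for i in range(k)]
        a = [soft_threshold(a[i] - grad[i] / L, lam / L) for i in range(k)]
    return a


def update_dictionary(D, A, B):
    """One block-coordinate pass over the atoms, using the running statistics."""
    k, m = len(D), len(D[0])
    for j in range(k):
        if A[j][j] < 1e-12:
            continue  # atom j has not been used by any sample yet
        DAj = [sum(A[i][j] * D[i][r] for i in range(k)) for r in range(m)]
        u = [D[j][r] + (B[j][r] - DAj[r]) / A[j][j] for r in range(m)]
        scale = max(math.sqrt(dot(u, u)), 1.0)  # project onto the unit ball
        D[j] = [v / scale for v in u]


def learn_dictionary(samples, k, lam=0.1, seed=0):
    """Process samples one at a time: code, accumulate statistics, update D."""
    rng = random.Random(seed)
    m = len(samples[0])
    D = []
    for _ in range(k):
        d = [rng.gauss(0.0, 1.0) for _ in range(m)]
        norm = math.sqrt(dot(d, d))
        D.append([v / norm for v in d])
    A = [[0.0] * k for _ in range(k)]  # running sum of a a^T
    B = [[0.0] * m for _ in range(k)]  # running sum of a[j] * x, per atom j
    for x in samples:
        a = sparse_code(x, D, lam)
        for i in range(k):
            for j in range(k):
                A[i][j] += a[i] * a[j]
            for r in range(m):
                B[i][r] += a[i] * x[r]
        update_dictionary(D, A, B)
    return D
```

Calling `learn_dictionary(samples, k=3)` on a list of equal-length float vectors returns three atoms, each with Euclidean norm at most 1. Because each sample is visited once and only the k-by-k and k-by-m statistics are kept, memory use in this sketch does not grow with the number of samples.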

Comments:	revised version
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:0908.0050 [stat.ML]
	(or arXiv:0908.0050v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.0908.0050
Journal reference:	Journal of Machine Learning Research 11 (2010) 19--60

Submission history

From: Julien Mairal (via CCSD proxy)
[v1] Sat, 1 Aug 2009 06:09:18 UTC (2,278 KB)
[v2] Thu, 11 Feb 2010 07:33:02 UTC (2,452 KB)
