
IFT 6760A

Matrix and Tensor Factorization for


Machine Learning

Winter 2019

Instructor: Guillaume Rabusseau


email: [email protected]
Today

I What is this class about? Linear and multilinear algebra for ML...
I Class goals: refresh linear algebra, discover and have fun with
tensors and do research!
I Class logistics.
I Content overview.
I In-class quiz (not graded): for me to assess your background and
adjust material accordingly.
I Questions.
Material and discussion

I All material will be posted on my website


I We will use a Google discussion group for announcements and
discussions; don't forget to sign up (link on the course webpage).
I We will mainly use Studium for you to upload project proposals /
reports...
Class high-level overview

I Objectives
I End goal: for you to know what tensors and tensor decomposition
techniques are and how they can be used in ML.
I Along the way: learn useful theoretical tools from linear algebra and
matrix decomposition techniques.
I We will start back from linear algebra and build up from there:
1. Linear algebra refresher with focus on matrix decomposition
2. Applications of Linear algebra in ML
3. Introduction to tensors and tensor decomposition techniques
4. Tensors in ML
5. Seminar part of the class → your turn! (more on that later)
(rough timeline: (1,2) in Jan., (3,4) in Feb, (5) in Mar.)
I I am very open to feedback and suggestions; if there is a topic
you would like to see covered in class, come see me or send me an email!
About this class

I Not a math course but still a lot of maths


I I will assume a reasonable background in
I probability / statistics
I machine learning (methodology, bias-variance, regularization, etc.)
I basic theoretical CS (big O notation, what is a graph, etc.)
I linear algebra (even though we will review main results)

I We will do a lot of proofs...


→ getting comfortable with writing / reading formal proofs can be
seen as a side-objective of the class.
Who is this class for?

I Advanced grad students


I If you’re a first-year MSc student, the class may or may not be a good
fit depending on your background: come discuss with me or send me an
email with your background and the list of relevant courses you have taken.
I You should have taken at least one ML course (e.g.
IFT3395/6390, COMP-551, COMP-652).
Class goals

I Get a good grasp on linear algebra and matrix decomposition


techniques in the context of ML
I Learn theoretical tools potentially relevant to many aspects of ML
I Learn about tensors and how they can be used in ML

I Read research papers, get familiar with the literature


I Engage in research, connect the tools we’ll learn with your own
research
I Work on your presentation and writing skills
I Work on a semester research project
Class logistics
I Language: everything will be English (many foreign students,
papers are in English...), come see me or send me a mail if this is a
concern.
I Grading:
I participation (10%): questions/comments during class, project
updates, feedback, etc.
I scribing (10%): most lectures will be on the board; each lecture, a
group of students is responsible for taking notes and writing them up
(in LaTeX) for the following week.
Each student has to scribe at least once (likely twice) during the
semester.
I paper presentation (30%)...
I research project (50%)...
→ no midterm, but a project proposal is due in the middle of the semester.
→ no assignments per se, but readings and some proofs left as
exercises.
Paper presentation

I A few classes (likely starting late Feb./early March) will be devoted
to paper presentations by students.
I The goal is for you to work on your presentation skills and potentially
start shaping your project.
I read a research paper from the literature (I will post references, but
you can come up with your own suggestion, to be validated);
I present the work in class in a talk, either with slides or on the
blackboard;
I graded on the quality of the presentation.
I Specifics to be set up later (size of groups, length of presentations,
etc.).
Research project

I Groups of 2-3 students


I Proposal due middle of semester
I One lecture will be devoted to project presentations / progress
reports
I Project final presentation (either talks or poster session, TBD) at
the end of the semester (date TBD)
I Project final written report due end of semester (date TBD)
Research project

I Topic chosen based on


I your own research interests (aligned with the class)
I a bibliography I will make available as we progress in the class.
I Each project should be grounded on at least one research paper.
I Only requirement: the project should be related to the class
content.
Research project

I Types of projects:
I Literature review: choose a topic/problem and present the existing
approaches to handle it, comparing them and analyzing their
drawbacks / advantages / limits, perform some experimental
comparison.
I Application: apply ideas/algorithms seen in class to a new problem
you find interesting or related to your research.
I Theory: focus on theoretical analysis of an algorithm or a problem
(e.g. robustness analysis, complexity analysis), propose a new
method to tackle a problem (existing or a new problem), ideally still
perform experiments (e.g. on synthetic data).
I Best case scenario: project ⇒ research paper
Times and dates

I Class on Tuesdays 12:30-14:15 and Thursdays 11:30-13:15 here at


the Mila auditorium.
I Schedule conflicts (COMP-551/767)?
I Office hours: Tuesdays after class in my office (D15 @ 6666) or
one of the meeting rooms at Mila
I Reading week: March 3-8
I If I understood correctly:
I Deadline to switch/drop the course without fees: January 22nd
I Deadline to drop the course with fees: March 15th
Auditing the class

I Everyone’s welcome (as long as there is enough room, but this
should be ok!)
I Participating in research projects may be doable for auditing
students as well; come see me.
I Sign-up sheet
Questions?
Class high-level overview

I Overall objective: for you to know what tensors and tensor


decomposition techniques are and how they can be used in ML.
I We will start back from linear algebra and build up from there:
1. Linear algebra refresher with focus on matrix decomposition
2. Linear algebra and ML / linear algebra and discrete maths
3. Introduction to tensors and tensor decomposition techniques
4. Tensors in ML
5. Seminar part of the class → your turn! (more on that later)
(rough timeline: (1,2) in Jan., (3,4) in Feb, (5) in Mar.)
I I am very open to feedback and suggestions; if there is a topic you
would like to see covered in class, come see me or send me an email!
Linear Algebra road map
I Start by briefly recalling the basics: span, linear independence,
dimension, rank of a matrix, rank nullity theorem, orthogonality,
subspaces, projection, etc.
I Give some special care to eigenvalues/eigenvectors,
diagonalizability, etc., and present common factorizations (low-rank
factorization, QR, eigendecomposition, SVD) and a bit of optimization
(mainly the power method).
I Present a few fundamental theorems and tools (and prove some of
them):
I Spectral theorem
I Jordan canonical form
I Eckart-Young-Mirsky theorem
I Min-max theorem (Courant-Fischer-Weyl theorem)
I Generalized eigenvalue problems
I Perturbation bounds on eigenvectors/eigenvalues
I Illustrate with ML applications.
Linear Algebra and ML: Teasers

Let’s look at a few examples of connections between algebra and


ML / CS...
Linear Algebra and ML: Linear Regression

I We want to find a linear f : R^d → R (i.e. f : x ↦ w^T x) minimizing
the squared error loss on the training data L = Σ_{i=1}^N (f(x_i) − y_i)^2.
I Solution: w* = (X^T X)^{-1} X^T y where X ∈ R^{N×d} and y ∈ R^N.
I The vector of predictions is given by ŷ = Xw* = X(X^T X)^{-1} X^T y,
where X(X^T X)^{-1} X^T is an orthogonal projection.

ŷ is the orthogonal projection of y onto the subspace of R^N
spanned by the columns of X.
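A minimal numpy sketch of the closed-form solution (synthetic random data; in practice np.linalg.lstsq or a pseudo-inverse is preferred to forming (X^T X)^{-1} explicitly):

import numpy as np

rng = np.random.default_rng(0)
N, d = 100, 5
X = rng.normal(size=(N, d))                # design matrix (N samples, d features)
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=N)  # noisy linear targets

# closed-form least-squares solution: w* = (X^T X)^{-1} X^T y
w_star = np.linalg.solve(X.T @ X, X.T @ y)

# predictions are the orthogonal projection of y onto the column space of X
y_hat = X @ w_star
residual = y - y_hat
print(abs(residual @ y_hat))               # ~0: the residual is orthogonal to the predictions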
Linear Algebra and ML: Principal Component Analysis

I Given a set of points x_1, ..., x_N ∈ R^d (assumed centered), we want
to find the k-dimensional subspace of R^d such that the projections
of the points onto this subspace
I have maximal variance,
I stay as close as possible to the original points (in ℓ2 distance).

The solution is given by the subspace spanned by the top k
eigenvectors of the covariance matrix X^T X ∈ R^{d×d}.
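A minimal numpy sketch of PCA via the eigendecomposition of the covariance matrix (synthetic data, centered as above):

import numpy as np

rng = np.random.default_rng(0)
N, d, k = 500, 10, 3
X = rng.normal(size=(N, d)) @ rng.normal(size=(d, d))  # correlated synthetic data
X = X - X.mean(axis=0)                                 # center the points

cov = X.T @ X
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
U = eigvecs[:, -k:]                      # top-k eigenvectors span the PCA subspace

Z = X @ U          # k-dimensional representation
X_proj = Z @ U.T   # projection of the points back into R^d
print(np.linalg.norm(X - X_proj))  # reconstruction error, minimal over all k-dim subspaces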
Linear Algebra and ML: Spectral Graph Clustering

I The Laplacian of a graph is the difference between its degree
matrix and its adjacency matrix: L = D − A

[Example graph on vertices v1, v2, v3, v4 with edges v1–v2, v1–v3, v2–v3, v3–v4]

        [2 0 0 0]   [0 1 1 0]
    L = [0 2 0 0] − [1 0 1 0]
        [0 0 3 0]   [1 1 0 1]
        [0 0 0 1]   [0 0 1 0]

Zero is an eigenvalue of the Laplacian, and its multiplicity is
equal to the number of connected components of G.
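A quick numerical check of this claim on the example graph above (numpy only):

import numpy as np

# adjacency matrix of the example graph on v1..v4 (edges v1-v2, v1-v3, v2-v3, v3-v4)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))       # degree matrix
L = D - A                        # graph Laplacian

eigvals = np.linalg.eigvalsh(L)  # L is symmetric positive semi-definite
print(eigvals)                            # the smallest eigenvalue is 0
print(np.sum(np.isclose(eigvals, 0.0)))   # multiplicity of 0 = number of connected components (here 1)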
Linear Algebra and ML: Spectral Learning of HMMs/WFAs

I Let Σ be a finite alphabet (e.g. Σ = {a, b}).
I Let Σ* be the set of all finite sequences of symbols in Σ (e.g.
a, b, ab, aab, bbba, ...).
I Given a real-valued function f : Σ* → R, its Hankel matrix
H ∈ R^{Σ* × Σ*} is a bi-infinite matrix whose entries are given by
H_{u,v} = f(uv).

The rank of H is finite if and only if f can be computed by
a weighted automaton.

(if you don’t know what a weighted automaton is, think of some kind
of RNN with linear activation functions)
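A small illustration (the 2-state automaton and the prefix/suffix sets below are arbitrary choices for the example): build a finite sub-block of the Hankel matrix of a function computed by a weighted automaton and check that its rank does not exceed the number of states.

import numpy as np

# toy weighted automaton with 2 states over Sigma = {a, b}:
# f(x1...xn) = alpha^T A_{x1} ... A_{xn} omega
alpha = np.array([1.0, 0.0])
omega = np.array([1.0, 1.0])
A = {"a": np.array([[0.5, 0.2], [0.0, 0.3]]),
     "b": np.array([[0.1, 0.4], [0.2, 0.1]])}

def f(word):
    v = alpha
    for symbol in word:
        v = v @ A[symbol]
    return v @ omega

# finite sub-block of the Hankel matrix, H[u, v] = f(uv)
words = ["", "a", "b", "aa", "ab", "ba", "bb"]
H = np.array([[f(u + v) for v in words] for u in words])
print(np.linalg.matrix_rank(H))   # at most 2 = number of states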
Linear Algebra and ML: Method of Moments

I Consider a Gaussian mixture model with k components, where the
ith Gaussian has mean µ_i ∈ R^d and all Gaussians have the same
diagonal covariance σ^2 I, i.e. the pdf of x is

    f(x) = Σ_{i=1}^k p_i N(µ_i, σ^2 I)

The rank of the (modified) second-order moment

    M = E[xx^T] − σ^2 I

is at most k.

(we actually have M = Σ_i p_i µ_i µ_i^T)
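A quick numerical check of the rank claim (arbitrary weights and means, numpy only; here M is formed exactly rather than estimated from samples):

import numpy as np

rng = np.random.default_rng(0)
d, k = 10, 3
mus = rng.normal(size=(k, d))    # k means in R^d (generically linearly independent)
p = np.array([0.2, 0.3, 0.5])    # mixing weights

# M = sum_i p_i mu_i mu_i^T, i.e. the modified second-order moment E[xx^T] - sigma^2 I
M = sum(p[i] * np.outer(mus[i], mus[i]) for i in range(k))
print(np.linalg.matrix_rank(M))  # at most k (here exactly 3)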
Spectral Methods (high-level view)

I Spectral methods usually achieve learning by extracting structure


from observable quantities through eigen-decompositions/tensor
decompositions.
I Spectral methods often constitute an alternative to EM to learn
latent variable models (e.g. HMMs, single-topic/Gaussian mixtures
models).
I Advantages of spectral methods:
I computationally efficient,
I consistent,
I no local optima.
Tensors

What about tensors?


Tensors vs Matrices

M ∈ R^{d1×d2}:        M_{ij} ∈ R for i ∈ [d1], j ∈ [d2]
T ∈ R^{d1×d2×d3}:     T_{ijk} ∈ R for i ∈ [d1], j ∈ [d2], k ∈ [d3]
Tensors and Machine Learning

(i) Data has a tensor structure: color image, video, multivariate time
series...

(ii) Tensors as parameters of a model: neural networks, polynomial
regression, weighted tree automata... e.g. (see the sketch after this list)

    f(x) = Σ_{i,j,k} T_{ijk} x_i x_j x_k

(iii) Tensors as tools: tensor method of moments, systems of polynomial
equations, layer compression in neural networks...
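For item (ii), a minimal sketch of evaluating f(x) = Σ_{i,j,k} T_{ijk} x_i x_j x_k with numpy (random T and x, only to show the contraction):

import numpy as np

rng = np.random.default_rng(0)
d = 4
T = rng.normal(size=(d, d, d))   # parameter tensor of a degree-3 polynomial model
x = rng.normal(size=d)

# f(x) = sum_{i,j,k} T_{ijk} x_i x_j x_k
f_x = np.einsum("ijk,i,j,k->", T, x, x, x)
print(f_x)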
Tensors

M ∈ R^{d1×d2}:        M_{ij} ∈ R for i ∈ [d1], j ∈ [d2]
T ∈ R^{d1×d2×d3}:     T_{ijk} ∈ R for i ∈ [d1], j ∈ [d2], k ∈ [d3]

I Outer product. If u ∈ R^{d1}, v ∈ R^{d2}, w ∈ R^{d3}:

    u ⊗ v = uv^T ∈ R^{d1×d2},             (u ⊗ v)_{i,j} = u_i v_j
    u ⊗ v ⊗ w ∈ R^{d1×d2×d3},             (u ⊗ v ⊗ w)_{i,j,k} = u_i v_j w_k
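A sketch of these outer products in numpy (arbitrary dimensions):

import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.normal(size=3), rng.normal(size=4), rng.normal(size=5)

uv = np.outer(u, v)                     # u ⊗ v, shape (3, 4)
uvw = np.einsum("i,j,k->ijk", u, v, w)  # u ⊗ v ⊗ w, shape (3, 4, 5)

print(np.allclose(uv[1, 2], u[1] * v[2]))              # (u ⊗ v)_{ij} = u_i v_j
print(np.allclose(uvw[1, 2, 3], u[1] * v[2] * w[3]))   # (u ⊗ v ⊗ w)_{ijk} = u_i v_j w_k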


Tensors: mode-n fibers

I Matrices have rows and columns, tensors have fibers:

[Figure: fibers of a 3rd-order tensor — (a) mode-1 (column) fibers x_{:jk},
(b) mode-2 (row) fibers x_{i:k}, (c) mode-3 (tube) fibers x_{ij:};
from Kolda and Bader, Tensor decompositions and applications, 2009.]
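In numpy, fibers are simply slices along one mode (a small sketch with an arbitrary 2×3×4 tensor):

import numpy as np

T = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # a 2 x 3 x 4 tensor

col_fiber  = T[:, 1, 2]   # mode-1 (column) fiber x_{:jk}
row_fiber  = T[0, :, 2]   # mode-2 (row) fiber    x_{i:k}
tube_fiber = T[0, 1, :]   # mode-3 (tube) fiber   x_{ij:}
print(col_fiber.shape, row_fiber.shape, tube_fiber.shape)   # (2,) (3,) (4,)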
Tensors: Matricizations

I T ∈ R^{d1×d2×d3} can be reshaped into a matrix as

    T_(1) ∈ R^{d1 × d2·d3}
    T_(2) ∈ R^{d2 × d1·d3}
    T_(3) ∈ R^{d3 × d1·d2}

[Figure: a tensor T and its mode-1 matricization T_(1)]
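A sketch of the mode-n matricizations with numpy (this uses one particular ordering of the remaining modes; other column orderings appear in the literature, and the rank is the same either way):

import numpy as np

d1, d2, d3 = 2, 3, 4
T = np.arange(d1 * d2 * d3).reshape(d1, d2, d3).astype(float)

def matricize(T, mode):
    # move the chosen mode to the front, then flatten the remaining modes
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

T1 = matricize(T, 0)   # shape (d1, d2*d3)
T2 = matricize(T, 1)   # shape (d2, d1*d3)
T3 = matricize(T, 2)   # shape (d3, d1*d2)
print(T1.shape, T2.shape, T3.shape)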
Tensors: Multiplication with Matrices

    Matrix case: AMB^T ∈ R^{m1×m2}        Tensor case: T ×_1 A ×_2 B ×_3 C ∈ R^{m1×m2×m3}

ex: If T ∈ R^{d1×d2×d3} and B ∈ R^{m2×d2}, then T ×_2 B ∈ R^{d1×m2×d3} and

    (T ×_2 B)_{i1,i2,i3} = Σ_{k=1}^{d2} T_{i1,k,i3} B_{i2,k}   for all i1 ∈ [d1], i2 ∈ [m2], i3 ∈ [d3].
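A sketch of the mode-2 product with numpy's einsum (arbitrary shapes):

import numpy as np

rng = np.random.default_rng(0)
d1, d2, d3, m2 = 2, 3, 4, 5
T = rng.normal(size=(d1, d2, d3))
B = rng.normal(size=(m2, d2))

# (T x_2 B)_{i1,i2,i3} = sum_k T_{i1,k,i3} B_{i2,k}
T_x2_B = np.einsum("ikl,jk->ijl", T, B)
print(T_x2_B.shape)   # (d1, m2, d3) = (2, 5, 4)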
Tensors are not easy...

MOST TENSOR PROBLEMS ARE NP-HARD

CHRISTOPHER J. HILLAR AND LEK-HENG LIM

Abstract. The idea that one might extend numerical linear algebra, the collection of matrix
computational methods that form the workhorse of scientific and engineering computing, to
numerical multilinear algebra, an analogous collection of tools involving hypermatrices/tensors,
appears very promising and has attracted a lot of attention recently. We examine here the
computational tractability of some core problems in numerical multilinear algebra. We show that
tensor analogues of several standard problems that are readily computable in the matrix (i.e.
2-tensor) case are NP-hard. Our list here includes: determining the feasibility of a system of
bilinear equations, determining an eigenvalue, a singular value, or the spectral norm of a 3-tensor,
determining a best rank-1 approximation to a 3-tensor, determining the rank of a 3-tensor over
R or C. Hence making tensor computations feasible is likely to be a challenge.

[Hillar and Lim, Most tensor problems are NP-hard, Journal of the ACM, 2013.]

... but training a neural network with 3 nodes is also NP-hard
[Blum and Rivest, NIPS ’89]
Tensors vs. Matrices: Rank

I The rank of a matrix M is:
I the number of linearly independent columns of M
I the number of linearly independent rows of M
I the smallest integer R such that M can be written as a sum of R
rank-one matrices:

    M = Σ_{i=1}^R u_i v_i^T.

I The multilinear rank of a tensor T is a tuple of integers
(R1, R2, R3) where R_n is the number of linearly independent
mode-n fibers of T:

    R_n = rank(T_(n))

I The CP rank of T is the smallest integer R such that T can be
written as a sum of R rank-one tensors:

    T = Σ_{i=1}^R u_i ⊗ v_i ⊗ w_i.
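A sketch contrasting the two notions numerically: build a tensor of CP rank at most R from random factors and look at the ranks of its matricizations, i.e. its multilinear rank (factor dimensions below are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
d1, d2, d3, R = 6, 7, 8, 3
U = rng.normal(size=(d1, R))
V = rng.normal(size=(d2, R))
W = rng.normal(size=(d3, R))

# T = sum_{i=1}^R u_i ⊗ v_i ⊗ w_i, hence CP rank at most R
T = np.einsum("ir,jr,kr->ijk", U, V, W)

def matricize(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# multilinear rank = ranks of the three matricizations
print([np.linalg.matrix_rank(matricize(T, n)) for n in range(3)])   # generically (3, 3, 3)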
CP and Tucker decomposition

I CP decomposition: [figure omitted]

I Tucker decomposition: [figure omitted]

(figures from [Kolda and Bader, Tensor decompositions and applications, 2009])
Hardness results

I These problems are all NP-hard for tensors of order ≥ 3 in general:


I Compute the CP rank of a given tensor
I Find the best approximation with CP rank R of a given tensor
I Find the best approximation with multilinear rank (R1 , · · · , Rp ) of a
given tensor (*)
I ...
I On the positive side:
I Computing the multilinear rank is easy and efficient algorithms exist
for (*).
I Under mild conditions, the CP decomposition is unique (modulo
scaling and permutations).
⇒ Very relevant for model identifiability...
Back to the Method of Moments

I Consider a Gaussian mixture model with k components, where the
ith Gaussian has mean µ_i ∈ R^d and all Gaussians have the same
diagonal covariance σ^2 I.

The (modified) second-order moment M = E[xx^T] − σ^2 I is
such that

    M = Σ_{i=1}^k p_i µ_i µ_i^T

I Can we recover the mixing weights p_i and centers µ_i from M?
I No, except if the µ_i are orthonormal, in which case they are the
eigenvectors of M and the p_i are the corresponding eigenvalues.
I But we will see that if we know both the matrix M and the 3rd-order
tensor T = Σ_{i=1}^k p_i µ_i ⊗ µ_i ⊗ µ_i, then we can recover the
weights and centers if the µ_i are linearly independent.
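A sketch of the orthonormal case: if the µ_i are orthonormal, an eigendecomposition of M recovers the weights as eigenvalues and the centers as eigenvectors, up to sign and ordering (the weights and dimensions below are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
d, k = 6, 3
Q, _ = np.linalg.qr(rng.normal(size=(d, k)))   # orthonormal centers mu_1..mu_k as columns
p = np.array([0.2, 0.3, 0.5])                  # mixing weights

M = Q @ np.diag(p) @ Q.T                       # M = sum_i p_i mu_i mu_i^T

eigvals, eigvecs = np.linalg.eigh(M)           # eigenvalues in ascending order
print(eigvals[-k:])                            # top-k eigenvalues ≈ the mixing weights (sorted)
print(np.abs(eigvecs[:, -k:].T @ Q))           # ≈ a permutation matrix: eigenvectors match the mu_i up to sign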
Quiz

Quiz Time
