How to Calculate the SVD from Scratch with Python
Matrix decomposition, also known as matrix factorization, involves describing a given matrix using its constituent elements.
Perhaps the best-known and most widely used matrix decomposition method is the Singular-Value Decomposition, or SVD. All matrices have an SVD, which
makes it more stable than other methods, such as the eigendecomposition. As such, it is often used in a wide array of applications including
compression, denoising, and data reduction.
In this tutorial, you will discover the Singular-Value Decomposition method for decomposing a matrix into its constituent elements.
https://machinelearningmastery.com/singular-value-decomposition-for-machine-learning/ 1/31
7/23/2021 How to Calculate the SVD from Scratch with Python
Kick-start your project with my new book Linear Algebra for Machine Learning, including step-by-step tutorials and the Python source code files for all
examples.
Update Mar/2018: Fixed typo in reconstruction. Changed V in code to VT for clarity. Fixed typo in the pseudoinverse equation.
Update Apr/2019: Fixed a small typo re array sizes in the explanation of the SVD reconstruction example.
Tutorial Overview
1. Singular-Value Decomposition
2. Calculate Singular-Value Decomposition
3. Reconstruct Matrix from SVD
4. SVD for Pseudoinverse
5. SVD for Dimensionality Reduction
Singular-Value Decomposition
The Singular-Value Decomposition, or SVD for short, is a matrix decomposition method for reducing a matrix to its constituent parts in order to make
certain subsequent matrix calculations simpler.
For the sake of simplicity we will focus on the SVD for real-valued matrices and ignore the case for complex numbers.
A = U . Sigma . V^T
Where A is the real m x n matrix that we wish to decompose, U is an m x m matrix, Sigma (often represented by the uppercase Greek letter Sigma) is an
m x n diagonal matrix, and V^T is the transpose of an n x n matrix where T is a superscript.
The diagonal values in the Sigma matrix are known as the singular values of the original matrix A. The columns of the U matrix are called the left-singular
vectors of A, and the columns of V are called the right-singular vectors of A.
The SVD is calculated via iterative numerical methods. We will not go into the details of these methods. Every rectangular matrix has a singular value
decomposition, although for complex-valued matrices the resulting matrices may contain complex numbers, and the limitations of floating point arithmetic
may cause some matrices to fail to decompose neatly.
The singular value decomposition (SVD) provides another way to factorize a matrix, into singular vectors and singular values. The SVD allows
us to discover some of the same kind of information as the eigendecomposition. However, the SVD is more generally applicable.
- Pages 44-45, Deep Learning, 2016.
The SVD is used widely both in the calculation of other matrix operations, such as the matrix inverse, and as a data reduction method in machine
learning. SVD can also be used in least squares linear regression, image compression, and denoising data.
The singular value decomposition (SVD) has numerous applications in statistics, machine learning, and computer science. Applying the SVD
to a matrix is like looking inside it with X-ray vision…
Calculate Singular-Value Decomposition
The svd() function takes a matrix and returns the U, Sigma and V^T elements. The Sigma diagonal matrix is returned as a vector of singular values. The V
matrix is returned in a transposed form, e.g. V.T.
The example below defines a 3×2 matrix and calculates the Singular-value decomposition.
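A minimal sketch of such an example is given here. It assumes the svd() function from scipy.linalg, the same import used in the reconstruction listing later in this tutorial; the variable names are illustrative.

```python
# Singular-value decomposition of a 3x2 matrix (illustrative sketch)
from numpy import array
from scipy.linalg import svd

# define a 3x2 matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)

# factorize: U is 3x3, s holds the 2 singular values, VT is 2x2
U, s, VT = svd(A)
print(U)
print(s)
print(VT)
```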
Running the example first prints the defined 3×2 matrix, then the 3×3 U matrix, 2 element Sigma vector, and 2×2 V^T matrix elements calculated from the
decomposition.
[[1 2]
 [3 4]
 [5 6]]

[[-0.2298477 0.88346102 0.40824829]
 [-0.52474482 0.24078249 -0.81649658]
 [-0.81964194 -0.40189603 0.40824829]]

[ 9.52551809 0.51430058]

[[-0.61962948 -0.78489445]
 [-0.78489445 0.61962948]]
Reconstruct Matrix from SVD
The U, s, and V^T elements returned from svd() cannot be multiplied directly to reconstruct the original matrix.
The s vector must be converted into a diagonal matrix using the diag() function. By default, this function will create a square matrix that is n x n, relative to
our original matrix. This causes a problem as the sizes of the matrices do not fit the rules of matrix multiplication, where the number of columns in a matrix
must match the number of rows in the subsequent matrix.
After creating the square Sigma diagonal matrix with diag(), Sigma is n x n. For the multiplication to work, the sizes of the matrices must instead be as
follows, relative to the original m x n matrix that we are decomposing:
U (m x m) . Sigma (m x n) . V^T (n x n)
We can achieve this by creating a new Sigma matrix of all zero values that is m x n (e.g. more rows) and populate the first n x n part of the matrix with the
square diagonal matrix calculated via diag().
# Reconstruct SVD
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define a matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# Singular-value decomposition
U, s, VT = svd(A)
# create m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with n x n diagonal matrix
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(VT))
print(B)
Running the example first prints the original matrix, then the matrix reconstructed from the SVD elements.
[[1 2]
 [3 4]
 [5 6]]

[[ 1. 2.]
 [ 3. 4.]
 [ 5. 6.]]
The above complication with the Sigma diagonal only arises when m and n are not equal. The diagonal matrix can be used directly when
reconstructing a square matrix, as follows.
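A minimal sketch of the square case, assuming the same scipy.linalg svd() and NumPy helpers as the listing above. Because m equals n, the matrix returned by diag() already has the required size.

```python
# Reconstruct a square matrix directly from its SVD (illustrative sketch)
from numpy import array
from numpy import diag
from scipy.linalg import svd

# define a 3x3 matrix
A = array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(A)

# Singular-value decomposition
U, s, VT = svd(A)

# square case: diag(s) is already 3x3, no zero padding needed
Sigma = diag(s)

# reconstruct matrix
B = U.dot(Sigma.dot(VT))
print(B)
```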
Running the example prints the original 3×3 matrix and the version reconstructed directly from the SVD elements.
[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[ 1. 2. 3.]
 [ 4. 5. 6.]
 [ 7. 8. 9.]]
SVD for Pseudoinverse
The pseudoinverse is also called the Moore-Penrose Inverse, after two independent discoverers of the method, or the Generalized Inverse.
Matrix inversion is not defined for matrices that are not square. […] When A has more columns than rows, then solving a linear equation using
the pseudoinverse provides one of the many possible solutions.
A^+ = V . D^+ . U^T
Where A^+ is the pseudoinverse, D^+ is the pseudoinverse of the diagonal matrix Sigma and U^T is the transpose of U.
We can get U and V from the SVD operation.
A = U . Sigma . V^T
The D^+ can be calculated by creating a diagonal matrix from Sigma, calculating the reciprocal of each non-zero element in Sigma, and taking the
transpose if the original matrix was rectangular.
        ( s11,   0,   0 )
Sigma = (   0, s22,   0 )
        (   0,   0, s33 )

        ( 1/s11,     0,     0 )
D^+ =   (     0, 1/s22,     0 )
        (     0,     0, 1/s33 )
The pseudoinverse provides one way of solving the linear regression equation, specifically when there are more rows than there are columns, which is
often the case.
NumPy provides the function pinv() for calculating the pseudoinverse of a rectangular matrix.
The example below defines a 4×2 matrix and calculates the pseudoinverse.
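A minimal sketch of such an example, assuming pinv() is imported from numpy.linalg; the matrix values are taken from the printed output that follows.

```python
# Pseudoinverse of a 4x2 matrix with NumPy (illustrative sketch)
from numpy import array
from numpy.linalg import pinv

# define a 4x2 matrix
A = array([
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8]])
print(A)

# calculate the pseudoinverse, which is 2x4
B = pinv(A)
print(B)
```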
Running the example first prints the defined matrix, and then the calculated pseudoinverse.
[[ 0.1 0.2]
 [ 0.3 0.4]
 [ 0.5 0.6]
 [ 0.7 0.8]]

[[ -1.00000000e+01 -5.00000000e+00 9.04289323e-15 5.00000000e+00]
 [ 8.50000000e+00 4.50000000e+00 5.00000000e-01 -3.50000000e+00]]
We can calculate the pseudoinverse manually via the SVD and compare the results to the pinv() function.
First we must calculate the SVD. Next we must calculate the reciprocal of each value in the s array. Then the s array can be transformed into a diagonal
matrix with added rows of zeros to make it rectangular. Finally, we can calculate the pseudoinverse from the elements.
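The steps above can be sketched as follows. This is one possible implementation, assuming scipy.linalg svd() and the 4×2 matrix from the pinv() example. It takes the reciprocal of every singular value, which is only safe because none of them is zero for this matrix.

```python
# Pseudoinverse via the SVD, compared with pinv() (illustrative sketch)
from numpy import array
from numpy import diag
from numpy import zeros
from numpy.linalg import pinv
from scipy.linalg import svd

# define a 4x2 matrix
A = array([
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
    [0.7, 0.8]])
print(A)

# calculate the SVD
U, s, VT = svd(A)

# reciprocals of s (assumes no singular value is zero)
d = 1.0 / s

# create an m x n D matrix and populate its top n x n block
D = zeros(A.shape)
D[:A.shape[1], :A.shape[1]] = diag(d)

# A^+ = V . D^+ . U^T, where D^+ is the transpose of D
B = VT.T.dot(D.T).dot(U.T)
print(B)

# reference result from NumPy for comparison
print(pinv(A))
```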
[[ 0.1 0.2]
 [ 0.3 0.4]
 [ 0.5 0.6]
 [ 0.7 0.8]]

[[ -1.00000000e+01 -5.00000000e+00 9.04831765e-15 5.00000000e+00]
 [ 8.50000000e+00 4.50000000e+00 5.00000000e-01 -3.50000000e+00]]
SVD for Dimensionality Reduction
Data with a large number of features, such as more features (columns) than observations (rows), may be reduced to a smaller subset of features that are
most relevant to the prediction problem.
The result is a matrix with a lower rank that is said to approximate the original matrix.
To do this we can perform an SVD operation on the original data and select the top k largest singular values in Sigma. These columns can be selected
from Sigma and the rows selected from V^T.
An approximation B of the original matrix A can then be reconstructed.
B = U . Sigmak . V^Tk
In natural language processing, this approach can be used on matrices of word occurrences or word frequencies in documents and is called Latent
Semantic Analysis or Latent Semantic Indexing.
In practice, we can retain and work with a descriptive subset of the data called T. This is a dense summary of the matrix or a projection.
T = U . Sigmak
Further, this transform can be calculated and applied to the original matrix A as well as other similar matrices.
T = A . (V^Tk)^T
Here (V^Tk)^T is the transpose of V^Tk, the first k rows of V^T.
First a 3×10 matrix is defined, with more columns than rows. The SVD is calculated and only the first two features are selected. The elements are
recombined to give an accurate reproduction of the original matrix. Finally the transform is calculated two different ways.
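A minimal sketch of this procedure, assuming scipy.linalg svd() and the 3×10 matrix shown in the TruncatedSVD output later in this section. The name n_elements is illustrative. The reproduction is accurate here because the rows of this particular matrix are linearly dependent, so its rank is 2.

```python
# Data reduction with the SVD (illustrative sketch)
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd

# define a 3x10 matrix
A = array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])
print(A)

# Singular-value decomposition: U is 3x3, s has 3 values, VT is 10x10
U, s, VT = svd(A)

# create m x n Sigma matrix and populate its left m x m block
Sigma = zeros((A.shape[0], A.shape[1]))
Sigma[:A.shape[0], :A.shape[0]] = diag(s)

# select the top k components
n_elements = 2
Sigma = Sigma[:, :n_elements]
VT = VT[:n_elements, :]

# reconstruct the approximation B = U . Sigmak . V^Tk
B = U.dot(Sigma.dot(VT))
print(B)

# transform, calculated two ways
T = U.dot(Sigma)
print(T)
T = A.dot(VT.T)
print(T)
```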
Running the example first prints the defined matrix then the reconstructed approximation, followed by two equivalent transforms of the original matrix.
scikit-learn provides a TruncatedSVD class that implements this capability directly.
When creating the TruncatedSVD class, you must specify the number of desired features or components to select, e.g. 2. Once created, you
can fit the transform (e.g. calculate V^Tk) by calling the fit() function, then apply it to the original matrix by calling the transform() function. The result is the
transform of A called T above.
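A minimal sketch of this usage, assuming scikit-learn is installed and using the same 3×10 matrix. The variable name svd_model is illustrative, and the signs of the output columns may differ between runs or library versions.

```python
# Data reduction with scikit-learn's TruncatedSVD (illustrative sketch)
from numpy import array
from sklearn.decomposition import TruncatedSVD

# define a 3x10 matrix
A = array([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]])
print(A)

# create the transform, keeping 2 components
svd_model = TruncatedSVD(n_components=2)

# fit the transform on A
svd_model.fit(A)

# apply the transform to A, giving a 3x2 result
result = svd_model.transform(A)
print(result)
```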
Running the example first prints the defined matrix, followed by the transformed version of the matrix.
We can see that the values match those calculated manually above, except for the sign on some values. We can expect there to be some instability when
it comes to the sign given the nature of the calculations involved and the differences in the underlying libraries and methods used. This instability of sign
should not be a problem in practice as long as the transform is trained for reuse.
[[ 1 2 3 4 5 6 7 8 9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]

[[ 18.52157747 6.47697214]
 [ 49.81310011 1.91182038]
 [ 81.10462276 -2.65333138]]
Extensions
This section lists some ideas for extending the tutorial that you may wish to explore.
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Books
Chapter 12, Singular-Value and Jordan Decompositions, Linear Algebra and Matrix Analysis for Statistics, 2014.
Chapter 4, The Singular Value Decomposition and Chapter 5, More on the SVD, Numerical Linear Algebra, 1997.
Section 2.4 The Singular Value Decomposition, Matrix Computations, 2012.
Chapter 7 The Singular Value Decomposition (SVD), Introduction to Linear Algebra, Fifth Edition, 2016.
Section 2.8 Singular Value Decomposition, Deep Learning, 2016.
Section 7.D Polar Decomposition and Singular Value Decomposition, Linear Algebra Done Right, Third Edition, 2015.
Lecture 3 The Singular Value Decomposition, Numerical Linear Algebra, 1997.
Section 2.6 Singular Value Decomposition, Numerical Recipes: The Art of Scientific Computing, Third Edition, 2007.
Section 2.9 The Moore-Penrose Pseudoinverse, Deep Learning, 2016.
API
numpy.linalg.svd() API
numpy.matrix.H API
numpy.diag() API
numpy.linalg.pinv() API
sklearn.decomposition.TruncatedSVD API
Articles
Matrix decomposition on Wikipedia
Singular-value decomposition on Wikipedia
Singular value on Wikipedia
Moore-Penrose inverse on Wikipedia
Latent semantic analysis on Wikipedia
Summary
In this tutorial, you discovered the Singular-value decomposition method for decomposing a matrix into its constituent elements.