
SINGULAR VALUE DECOMPOSITION

(SVD) IN DIMENSIONALITY
REDUCTION

Usama Abdul Matin

170113869

CS4755 Mathematics for AI

Dr Mohammed Hadi & Dr Liam Escott

Aston University

Table of Contents

1 Introduction
2 Overview of SVD
3 Mathematical Foundation of SVD
  3.1 Orthogonal Diagonalisation
  3.2 Singular Values
  3.3 Low-Rank Approximations
4 Numerical Example
  4.1 Dataset Preparation
  4.2 Singular Value Decomposition (SVD)
  4.3 Choosing Principal Components
  4.4 Dimensionality Reduction
5 Discussion
  5.1 Reflection on Reduced Representation
  5.2 Significance of Singular Values and Vectors
  5.3 Analysis of SVD for Dimensionality Reduction
6 Conclusion
7 References
8 Appendix

1 Introduction

Dimensionality reduction is a crucial technique in artificial intelligence, allowing for the simplification
and abstraction of complex datasets via feature selection and feature extraction [1]. Among the various
methods available, Singular Value Decomposition (SVD) offers significant value in capturing the
underlying structure of data due to its versatility and effectiveness. Essentially, SVD decomposes a matrix
into its constituent singular vectors and values, enabling the identification of essential features or
components. This report will explore the mathematical foundations of SVD and its application in
dimensionality reduction, focusing on its role in techniques such as Principal Component Analysis (PCA)
and Latent Semantic Analysis (LSA). By understanding its principles and practical implications, valuable
insights can be gained into how SVD shapes the landscape of artificial intelligence and data analysis. A
comprehensive understanding of SVD's significance in dimensionality reduction is thus provided through
this detailed exploration of its capabilities and limitations.

2 Overview of SVD

Singular Value Decomposition is one of the most powerful matrix factorisation techniques, used
predominantly in information retrieval as well as in fields such as image processing, natural language
processing and recommendation systems, where accuracy, sparsity and scalability matter [2]. It
decomposes a matrix into three constituent matrices, U, Σ and V^T, capturing the essential structure of
the original data [3]. This factorisation enables SVD to identify underlying patterns and relationships
within the data, facilitating tasks such as noise reduction (for example, by de-noising an intermediate
step of an algorithm [4]), dimensionality reduction and feature extraction [5]. SVD also has attractive
algebraic properties: understanding it provides both geometric and theoretical insight into linear
transformations [6], particularly within the context of dimensionality reduction methods such as Principal
Component Analysis (PCA) and Latent Semantic Analysis (LSA) [7]. By decomposing the data into its
fundamental components, SVD reveals the inherent structure of complex datasets, thereby enabling more
efficient analysis and interpretation across many applications of linear algebra [6].

3 Mathematical Foundation of SVD

3.1 Orthogonal Diagonalisation

The mathematical foundation of SVD revolves around several key concepts, a central one being
orthogonal diagonalisation, which expresses a matrix as a product of three matrices built around a
diagonal matrix D. This process is foundational to SVD because it allows the original data to be
represented in terms of orthogonal directions, or axes. Matrix diagonalisation is the process of converting
a square matrix B of size (d×d) into a diagonal matrix D of size (d×d), as shown below [8]:

B = QDQ^-1 (1)

where Q is a (d×d) matrix whose columns are eigenvectors of B, and the diagonal elements of D contain
the corresponding eigenvalues. Since the columns of Q are linearly independent, Q is invertible. If B is
symmetric, the decomposition can be written as:

B = CDC^-1 (2)

in which the columns of C are orthonormal. Rearranging for D gives:

D = C^-1BC (3)

Since C is an orthogonal matrix, its inverse equals its transpose, so this can be re-written as:

D = C^TBC (4)

The process of orthogonal diagonalisation is summarised below:

Figure 1: The orthogonal diagonalisation process

Through orthogonal diagonalisation, SVD provides a powerful tool for transforming data into a new basis
of orthogonal directions, in which the structure of the data becomes easier to analyse.
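
To make the idea concrete, the following short Python/numpy sketch (an illustration added here, not part of the cited material) orthogonally diagonalises a small symmetric matrix and checks equations (2) and (4):

import numpy as np

# Illustrative symmetric matrix (any symmetric B would do).
B = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# eigh returns eigenvalues and an orthogonal matrix C whose columns are eigenvectors.
eigenvalues, C = np.linalg.eigh(B)

D = C.T @ B @ C                        # Eq. (4): D = C^T B C
print(np.round(D, 10))                 # numerically diagonal; entries are the eigenvalues
print(np.allclose(B, C @ np.diag(eigenvalues) @ C.T))  # Eq. (2) with C^-1 = C^T: True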

3.2 Singular Values

Another crucial aspect of SVD is the notion of singular values. These values, denoted σi, quantify the
importance of each principal component extracted by SVD. For example, let A be an m×n matrix with
m ≥ n. One (reduced) form of the SVD of A is:

A = UΣV^T

where U is m×rank(A), V is n×rank(A), both have orthonormal columns (U^TU = V^TV = Irank(A)), and

Σ = diag(σ1, σ2, ..., σrank(A))

is a rank(A)×rank(A) diagonal matrix with σ1 ≥ σ2 ≥ ... ≥ σrank(A) > 0. The σi are the singular values of A,
and their number equals the rank of A. A large singular value corresponds to a direction in the data space
that explains a greater amount of variance. This property is pivotal in showing how SVD gives linear
algebra a geometric interpretation, with applications ranging from solving least-squares problems to data
compression [9].
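
As a quick numerical check (an illustrative sketch with an arbitrary example matrix, not taken from the references), the reduced SVD and the ordering of the singular values can be verified with numpy:

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                         # an arbitrary m x n matrix with m >= n

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # reduced SVD: A = U diag(s) V^T
print(s)                                           # singular values, in decreasing order
print(np.allclose(A, U @ np.diag(s) @ Vt))         # True: the factorisation reconstructs A
print(np.allclose(U.T @ U, np.eye(2)))             # columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(2)))           # columns of V are orthonormal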

3.3 Low-Rank Approximations

Furthermore, SVD enables low-rank approximations: it reveals the true rank of a matrix and provides its
best low-rank approximation [10-12]. For example, let the rank of A be r ≤ min(m, n); then the SVD can
be written as:

A = UΣV^T = σ1u1v1^T + σ2u2v2^T + ... + σrurvr^T

where u1, u2, ..., ur are the columns of Um×r and v1, v2, ..., vr are the columns of Vn×r. As can be seen,
A is the sum of r rank-one outer products of vectors.

The rank-k approximation (also known as the truncated or partial SVD) of A, denoted Ak with k < r, is
obtained by zeroing out the r - k trailing singular values of A. Here:

Σk = diag(σ1, ..., σk, 0, ..., 0)

which ultimately leads to:

Ak = UΣkV^T = σ1u1v1^T + ... + σkukvk^T

where Ak is the projection of A onto the space spanned by the top k singular vectors of A.

By truncating the SVD to only include the top k singular values and their corresponding vectors, it is
possible to approximate the original data matrix with reduced dimensions. This process not only reduces
computational complexity but also allows for efficient storage and analysis of large datasets [13].
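
The truncated SVD described above can be sketched in a few lines of numpy (the matrix and the choice of k here are illustrative only):

import numpy as np

def rank_k_approximation(A, k):
    """Best rank-k approximation of A, obtained by keeping the top k singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = np.random.default_rng(0).normal(size=(6, 4))   # illustrative data matrix
A2 = rank_k_approximation(A, 2)

print(np.linalg.matrix_rank(A2))                   # 2
print(np.linalg.norm(A - A2, 2))                   # spectral-norm error equals the third singular value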

4 Numerical Example

To gain a practical understanding of how Singular Value Decomposition operates in the context of
dimensionality reduction, a numerical example using a small dataset has been provided below. A dataset
of five individuals with their height and weight was first recorded:

Individual    Height (cm)    Weight (kg)
1             160            55
2             170            70
3             155            45
4             180            80
5             165            60

4.1 Dataset Preparation

In order to perform an SVD, the data must first be preprocessed. The data is centred by subtracting the
mean of each feature:

● Mean height: (160 + 170 + 155 + 180 + 165) / 5 = 166 cm

● Mean weight: (55 + 70 + 45 + 80 + 60) / 5 = 62 kg

These means are then subtracted from each data point, giving the centred matrix:

Xcentred =
[  -6    -7 ]
[   4     8 ]
[ -11   -17 ]
[  14    18 ]
[  -1    -2 ]

The data is then (optionally) scaled by dividing each feature by its standard deviation so that the features
are on similar scales. For simplicity, in this example the standard deviations of height and weight were
taken to be 7 and 8, respectively.
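
A minimal numpy sketch of this preparation step (using the fixed standard deviations of 7 and 8 assumed above) is given below:

import numpy as np

# Height (cm) and weight (kg) for the five individuals in the table above.
X = np.array([[160.0, 55.0],
              [170.0, 70.0],
              [155.0, 45.0],
              [180.0, 80.0],
              [165.0, 60.0]])

X_centred = X - X.mean(axis=0)                  # subtract the feature means (166, 62)
X_scaled = X_centred / np.array([7.0, 8.0])     # divide by the assumed standard deviations
print(X_scaled)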

4.2 Singular Value Decomposition (SVD)

An SVD was then performed on the centred and scaled data matrix:

Xscaled = UΣV^T

where Xscaled is the centred and scaled data matrix, U is the matrix of left singular vectors, Σ is the
diagonal matrix of singular values and V^T is the matrix of right singular vectors. After the SVD
computation was performed, the following results were obtained:
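
The decomposition can be reproduced with numpy's SVD routine; the sketch below continues from the preparation step (the exact values obtained depend on the scaling used):

import numpy as np

# Centred data divided by the assumed standard deviations (7 for height, 8 for weight).
X_scaled = (np.array([[160, 55], [170, 70], [155, 45], [180, 80], [165, 60]], dtype=float)
            - np.array([166.0, 62.0])) / np.array([7.0, 8.0])

U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)   # X_scaled = U diag(s) V^T
print(s)        # the two singular values, largest first
print(Vt.T)     # columns of V: the principal directions in (height, weight) space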

4.3 Choosing Principal Components

The principal components were then selected; since the first of the two singular values is much larger
than the second, only one principal component was kept. The first column of V defines this principal
component, i.e. the direction in the (height, weight) space onto which the data is projected. (The
computations used to calculate U, Σ and V^T are provided in the appendix.)

4.4 Dimensionality Reduction

To reduce the dataset to one dimension, the original data was projected onto this principal component:

Z = Xscaled v1

where v1 is the first column of V (the first right singular vector). The calculation for this projection was:

The reduced dataset Z represents the original data in a lower-dimensional space with only one feature.
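
A sketch of this projection step in numpy (continuing the example above) is:

import numpy as np

X_scaled = (np.array([[160, 55], [170, 70], [155, 45], [180, 80], [165, 60]], dtype=float)
            - np.array([166.0, 62.0])) / np.array([7.0, 8.0])
U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)

v1 = Vt.T[:, [0]]      # first right singular vector, shape (2, 1)
Z = X_scaled @ v1      # shape (5, 1): each individual's coordinate along the first component
print(Z)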

5 Discussion

5.1 Reflection on Reduced Representation

This reduced dataset Z captures the most significant variation in the original dataset. Its single column
corresponds to the chosen principal component, which is a linear combination of height and weight, and
each entry gives an individual's position along that component.

By reducing the dataset to one dimension, the primary source of variation in the original data was
effectively captured. This simplified representation can be valuable for visualisation, analysis and
modelling purposes. Because the features have different scales, the SVD was performed on the scaled
matrix, which ensures that all features contribute equally to the decomposition.

5.2 Significance of Singular Values and Vectors

As discussed previously, singular values and vectors play crucial roles in understanding the structure of
the data through SVD. The singular values in the example were 3.17 and 0.47. The first singular value
being much larger suggests that the first principal component captures a substantial share of the total
variance. The columns of V, the right singular vectors, represent the directions of maximum variance in
the original feature space, while the columns of U give the coordinates of each sample along those
directions. These vectors are orthogonal to each other and form the basis for the principal components
[13]. In the example provided, the first column of V corresponds to the first principal component, the
direction that captures the most significant variance in the data.
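
For centred data, the proportion of total variance captured by each component is its squared singular value divided by the sum of all squared singular values. Taking the values quoted above (3.17 and 0.47) as given, a quick sketch of the calculation is:

import numpy as np

s = np.array([3.17, 0.47])         # singular values reported in the example
explained = s**2 / np.sum(s**2)    # proportion of total variance captured by each component
print(explained)                   # approximately [0.978, 0.022]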

5.3 Analysis of SVD for Dimensionality Reduction

There are several strengths and weaknesses of SVD for dimensionality reduction, with the strengths
being:
● Capture of variance: SVD is effective at capturing the most significant sources of variance in the
data as can be seen in the example. This is crucial for reducing high dimensional data into a lower
dimensional space [13].
● Robustness to noise: SVD can be robust to noise in the data [13]. By focusing on the singular
values and vectors with the largest values, SVD reduces the noise that may be present in less
significant components.
● Interpretability: The resulting principal components from SVD are linear combinations of the
original features, making them interpretable. In the example, the first principal component was a
combination of height and weight, making intuitive sense.

However, there are also weaknesses such as:


● Loss of Information: While SVD retains the most significant variation, it also discards the less
significant variation. This can be a disadvantage if the discarded variation is important for the
problem at hand.

● Interpretation Complexity: While the resulting principal components may be interpretable, they
may not always align with the original features, especially in high dimensional data [13]. This can
make it challenging to interpret the reduced dimensions.
● Computational Complexity: For large datasets, computing the full SVD can be computationally
expensive. This can limit its applicability to very large datasets.

6 Conclusion

In conclusion, Singular Value Decomposition (SVD) plays a pivotal role in AI and data analysis, offering
a powerful method for dimensionality reduction. This report has highlighted SVD’s significance in
capturing essential features, demonstrated through a numerical example and discussed its mathematical
foundations. SVD’s strengths, including variance capture and noise reduction, were contrasted with its
weaknesses, such as information loss. Within AI, SVD's importance lies in efficient data representation,
feature extraction and its role in foundational techniques like Principal Component Analysis (PCA).
Suggestions for future research include exploring more advanced dimensionality reduction techniques and
optimising SVD for real-time applications. As AI evolves, continued exploration of SVD will shape
data-driven decision-making across industries, from image processing to healthcare.

7 References

[1] van der Maaten, L., Postma, E. and van den Herik, J., 2009. Dimensionality reduction: a comparative
review. Journal of Machine Learning Research, 10, pp.66-71.

[2] Zhou, X. and Wu, S., 2016. Rating LDA model for collaborative filtering. Knowledge-Based Systems,
110, pp.135-143.

[3] Brunton, S.L. and Kutz, J.N., 2022. Data-driven science and engineering: machine learning, dynamical
systems, and control. Cambridge University Press.

[4] Ke, Z.T. and Wang, M., 2024. Using SVD for topic modeling. Journal of the American Statistical
Association, 119(545), pp.434-449.

[5] Albright, R., 2004. Taming text with the SVD. SAS Institute Inc.

[6] Kalman, D., 1996. A singularly valuable decomposition: the SVD of a matrix. The College
Mathematics Journal, 27(1), pp.2-23.

[7] Ljungberg, B.F., 2019. Dimensionality reduction for bag-of-words models: PCA vs LSA.
SemanticScholar.org.

[8] Grossman, S.I., 1994. Elementary linear algebra.

[9] Akritas, A.G. and Malaschonok, G.I., 2004. Applications of singular-value decomposition (SVD).
Mathematics and Computers in Simulation, 67(1-2), pp.15-31.

[10] Datta, B.N., 2010. Numerical linear algebra and applications. Society for Industrial and Applied
Mathematics.

[11] Golub, G.H. and Van Loan, C.F., 2013. Matrix computations. JHU Press.

[12] Trefethen, L.N. and Bau, D., 2022. Numerical linear algebra. Society for Industrial and Applied
Mathematics.

[13] Kishore Kumar, N. and Schneider, J., 2017. Literature survey on low rank approximation of matrices.
Linear and Multilinear Algebra, 65(11), pp.2212-2244.

8 Appendix
Covariance matrix, C, was computed as:

Eigenvectors and Eigenvalues of C:

The singular values in Σ are the square roots of the eigenvalues of C:

Ultimately leading to the singular values quoted in Section 5.2, namely 3.17 and 0.47.

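As a sketch of the route taken in this appendix (the exact numbers depend on the normalisation chosen for the covariance matrix, so none are asserted here), the eigenvalue-based calculation can be written as:

import numpy as np

# Centred and scaled data matrix from Section 4.1.
X_scaled = (np.array([[160, 55], [170, 70], [155, 45], [180, 80], [165, 60]], dtype=float)
            - np.array([166.0, 62.0])) / np.array([7.0, 8.0])

C = X_scaled.T @ X_scaled                        # unnormalised covariance (Gram) matrix of the features
eigenvalues, eigenvectors = np.linalg.eigh(C)    # eigenvalues returned in ascending order
singular_values = np.sqrt(eigenvalues[::-1])     # square roots of the eigenvalues, largest first

print(singular_values)
print(np.linalg.svd(X_scaled, compute_uv=False)) # matches the singular values of X_scaled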