Unit 4 (PCA)
PCA works by considering the variance of each attribute, because an attribute with high variance shows a good split between the classes, and hence it helps reduce the dimensionality. Some real-world applications of PCA are image processing, movie recommendation systems, and optimizing the power allocation in various communication channels. It is a feature extraction technique, so it keeps the important variables and drops the least important ones.
The PCA algorithm is based on some mathematical concepts such as:
o Variance and Covariance
o Eigenvalues and Eigenvectors
Some common terms used in the PCA algorithm:
o Dimensionality: It is the number of features or variables present in the
given dataset. More simply, it is the number of columns present in the
dataset.
o Correlation: It signifies how strongly two variables are related to each
other, i.e., if one changes, the other also changes. The correlation value
ranges from -1 to +1. Here, -1 occurs if the variables are inversely
proportional to each other, and +1 indicates that the variables are
directly proportional to each other.
o Orthogonal: It means that the variables are not correlated with each other,
and hence the correlation between a pair of variables is zero.
o Eigenvectors: Given a square matrix M and a non-zero vector v, v is an
eigenvector of M if Mv is a scalar multiple of v (that scalar is the
corresponding eigenvalue).
o Covariance Matrix: A matrix containing the covariance between each pair of
variables is called the covariance matrix. (A short sketch illustrating these
last two terms follows this list.)
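As a quick illustration of the last two terms, the NumPy sketch below builds the covariance matrix of a small two-variable dataset and checks the eigenvector definition M v = lambda v. The array X is made up purely for illustration.

import numpy as np

# Illustrative two-variable dataset (5 samples x 2 features)
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

# Covariance matrix: covariance between every pair of variables
M = np.cov(X, rowvar=False)                      # shape (2, 2)

# Eigenvalues and eigenvectors of the (symmetric) covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(M)

# Check the definition: M v equals lambda * v for each eigenvector v
v = eigenvectors[:, 0]
print(np.allclose(M @ v, eigenvalues[0] * v))    # True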
Principal Components in PCA
As described above, the transformed new features, i.e., the output of PCA, are the
Principal Components. The number of these PCs is less than or equal to the number
of original features present in the dataset. Some properties of these principal
components are given below:
o The principal component must be the linear combination of the original
features.
o These components are orthogonal, i.e., the correlation between a pair of
variables is zero.
o The importance of each component decreases when going from 1 to n, which
means the 1st PC has the most importance and the nth PC has the least
importance.
Steps for PCA algorithm
1. Getting the dataset
Firstly, we need to take the input dataset and divide it into two subparts X
and Y, where X is the training set, and Y is the validation set.
2. Representing the data in a structure
Now we will represent our dataset in a structure, namely a two-dimensional
matrix of the independent variables X. Here each row corresponds to a data
item, and each column corresponds to a feature. The number of columns is the
dimensionality of the dataset.
3. Standardizing the data
In this step, we will standardize our dataset. Without standardization, the
features with high variance dominate the features with lower variance.
If the importance of the features is independent of their variance, we divide
each data item in a column by the standard deviation of that column. The
resulting matrix is named Z.
4. Calculating the Covariance of Z
To calculate the covariance of Z, we take the matrix Z and transpose it. After
transposing, we multiply it by Z (conventionally dividing by the number of
samples minus one). The output matrix is the covariance matrix of Z.
5. Calculating the Eigenvalues and Eigenvectors
Now we need to calculate the eigenvalues and eigenvectors of the resulting
covariance matrix of Z. The eigenvectors of the covariance matrix are the
directions of the axes with the most information (variance), and the
corresponding eigenvalues give the amount of variance along each of those
directions.
6. Sorting the Eigenvectors
In this step, we take all the eigenvalues and sort them in decreasing order,
i.e., from largest to smallest, and simultaneously sort the corresponding
eigenvectors accordingly in the matrix P of eigenvectors. The resulting matrix
is named P*.
7. Calculating the new features or Principal Components
Here we calculate the new features. To do this, we multiply the matrix Z by
P*. In the resulting matrix Z*, each observation is a linear combination of
the original features, and the columns of Z* are independent of each other.
8. Removing less important features from the new dataset
Now that the new feature set has been obtained, we decide what to keep and
what to remove. That is, we keep only the relevant or important features in
the new dataset, and the unimportant features are removed. (A short code
sketch of these steps follows below.)
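The steps above can be combined into a minimal NumPy sketch. This is only an illustration under simple assumptions, not a reference implementation: the random array X stands in for the structured data of step 2, and the names Z, P_star and Z_star mirror the matrices named in steps 3 to 7.

import numpy as np

# Step 2: represent the data as a matrix (rows = data items, columns = features)
X = np.random.rand(100, 5)                       # illustrative data only

# Step 3: standardize each column (zero mean, unit standard deviation)
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 4: covariance of Z (transpose of Z multiplied by Z, scaled by n - 1)
cov_Z = (Z.T @ Z) / (Z.shape[0] - 1)

# Step 5: eigenvalues and eigenvectors of the covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov_Z)

# Step 6: sort the eigenvectors by decreasing eigenvalue -> matrix P*
order = np.argsort(eigenvalues)[::-1]
P_star = eigenvectors[:, order]

# Step 7: new features (principal components): Z* = Z . P*
Z_star = Z @ P_star

# Step 8: keep only the most important components (here, the first two)
Z_reduced = Z_star[:, :2]
print(Z_reduced.shape)                           # (100, 2)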
Applications of Principal Component Analysis
o PCA is mainly used as a dimensionality reduction technique in various AI
applications such as computer vision, image compression, etc.
o It can also be used for finding hidden patterns when the data has high
dimensionality. Some fields where PCA is used are finance, data mining,
psychology, etc.
Learning Vector Quantization (LVQ)
Let's suppose that we have input data of size (m, n), with m being the number of
training samples and n the number of features in each instance, together with a
label vector of size (1, m). The weights are initialized with size (n, c) from the
first training samples that have different labels, and these samples are then
removed from the training set; here c denotes the number of classes. Then, we
iterate over the remaining input data, and for each training example we update the
winner vector (the weight vector with the closest distance, e.g., Euclidean
distance, to that training example):
if correctly_classified:
    wij(new) = wij(old) + alpha(t) * (xik - wij(old))
else:
    wij(new) = wij(old) - alpha(t) * (xik - wij(old))
Algorithm
The steps involved include:
o Weight initialization
o For epochs 1 to N
o Select a training example
o Find the winning vector
o Update the winning vector
o Repeat steps 3, 4, and 5 for every training example
o Classify test samples (a minimal sketch of these steps follows this list)
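Below is a minimal Python sketch of the training loop described above. The tiny arrays X and y, the choice of one prototype per class taken from the first sample of each class, and the decaying learning rate are all assumptions made only for illustration.

import numpy as np

# Illustrative training data: m samples with n features, labels in {0, 1}
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]])
y = np.array([0, 0, 1, 1])

# Weight initialization: one prototype per class, taken from the training set;
# those samples are then removed from the remaining training data
classes = np.unique(y)
first_idx = [np.where(y == c)[0][0] for c in classes]
W = X[first_idx].astype(float)                   # shape (c, n)
mask = np.ones(len(X), dtype=bool)
mask[first_idx] = False
X_rest, y_rest = X[mask], y[mask]

alpha = 0.3                                      # initial learning rate
for epoch in range(10):                          # epochs 1 to N
    for x, label in zip(X_rest, y_rest):
        # Find the winning vector (closest prototype by Euclidean distance)
        j = np.argmin(np.linalg.norm(W - x, axis=1))
        # Update the winner: move toward x if correctly classified, away otherwise
        if classes[j] == label:
            W[j] += alpha * (x - W[j])
        else:
            W[j] -= alpha * (x - W[j])
    alpha *= 0.9                                 # decay the learning rate alpha(t)

# Classify a test sample by the label of the nearest prototype
x_test = np.array([0.95, 0.9])
print(classes[np.argmin(np.linalg.norm(W - x_test, axis=1))])   # expected: 1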
Question bank
1. What is PCA (Principal Component Analysis)?
2. Describe the relationship between the Self-Organizing Map algorithm and the Learning
Vector Quantization algorithm.
ANS- In order to use Learning Vector Quantization (LVQ), a set of approximate
reconstruction vectors is first found using the unsupervised SOM algorithm. The
supervised LVQ algorithm is then used to fine-tune the vectors found using SOM.
3. Explain the Hebbian-based method using an AND gate.
4. What is vector quantization?
5. Explain self-organizing maps.
6. What is dimensionality reduction?
7. Explain the dimensionality reduction using PCA.
8. What is SOM?