ML Mod-4

The document discusses key concepts related to Kernel Principal Component Analysis (KPCA) and its advantages over traditional PCA, particularly in capturing non-linear relationships. It also defines the unnormalized graph Laplacian, principal components, and the role of kernels in KPCA, along with the importance of maximizing variance in PCA for effective dimensionality reduction. Additionally, it touches on the concept of data dimension and its significance in feature extraction.


Module-4

1 Mark Questions:
Q1. State one advantage and one disadvantage of Kernel PCA (KPCA) over standard PCA.

Ans.
Advantage:
Non-Linearity: One of the significant advantages of KPCA over PCA is its ability to capture
non-linear relationships in the data. PCA is a linear technique and may not effectively
represent complex, non-linear patterns. KPCA, via the kernel trick, allows for non-linear
mappings, making it more suitable for datasets with intricate structure.
Disadvantage:
Computational Complexity: KPCA tends to be more computationally demanding than PCA.
The kernel trick requires evaluating the kernel for every pair of data points, producing an
n x n kernel matrix that must then be eigendecomposed, so the cost grows quickly with the
number of samples n. This can make KPCA less efficient and more resource-intensive than
PCA, which only needs the eigendecomposition of the (typically much smaller) d x d
covariance matrix.
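As a rough numeric illustration of the cost argument (a minimal sketch; the dataset sizes
are arbitrary, not from the text), KPCA has to build an n x n kernel matrix while PCA works
with a d x d covariance matrix:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

n, d = 5000, 20                      # n samples, d features (arbitrary sizes)
X = np.random.randn(n, d)

K = rbf_kernel(X)                    # KPCA operates on this n x n kernel matrix
C = np.cov(X, rowvar=False)          # PCA operates on this d x d covariance matrix
print(K.shape, C.shape)              # (5000, 5000) vs (20, 20)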
Q2. Define the unnormalized graph Laplacian.
Ans.
The unnormalized graph Laplacian is a matrix used in spectral graph theory to analyze the
properties of a graph. Given an undirected graph G with n nodes, the unnormalized graph
Laplacian matrix is defined as follows:
Let W be the adjacency matrix of the graph, where wij is the weight of the edge between
node i and node j. The degree matrix D is a diagonal matrix where dii is the degree (sum of
edge weights) of node i.
The unnormalized graph Laplacian matrix L is then given by:
L = D - W
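A minimal numeric sketch (the 3-node example graph is made up for illustration):

import numpy as np

# Weighted adjacency matrix W of a small undirected graph (w_ij = w_ji)
W = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 0.5],
              [2.0, 0.5, 0.0]])

D = np.diag(W.sum(axis=1))   # degree matrix: d_ii = sum of edge weights at node i
L = D - W                    # unnormalized graph Laplacian
print(L)                     # L is symmetric and each row sums to zero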
Q3. Define principal component of a feature space. The principal components in KPCA are
uncorrelated. Justify.

Ans.
Principal Component in Feature Space:
In the context of Principal Component Analysis (PCA), the principal components are the
linear combinations of the original features in a dataset that capture the maximum variance.
These components are orthogonal to each other and ordered by the amount of variance
they explain.
Uncorrelated Principal Components in Kernel PCA (KPCA):
Kernel PCA extends PCA to non-linear relationships by implicitly mapping the data into a
higher-dimensional feature space using a kernel function. In that feature space, the
principal components are obtained just as in PCA, as projections onto the eigenvectors of
the (centered) kernel matrix, so each component is a function of the data rather than a
simple linear combination of the original features. Because the centered kernel matrix is
symmetric, its eigenvectors are mutually orthogonal; the sample covariance between any two
distinct projected components is therefore zero, which justifies the claim that the
principal components in KPCA are uncorrelated.
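A quick numerical check of this claim (a minimal sketch using scikit-learn; the dataset,
the RBF kernel, and the gamma value are illustrative choices, not from the text):

import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

X, _ = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)
Z = KernelPCA(n_components=4, kernel="rbf", gamma=10).fit_transform(X)

# The covariance matrix of the projected data is (numerically) diagonal,
# i.e. distinct kernel principal components are uncorrelated.
C = np.cov(Z, rowvar=False)
print(np.round(C, 6))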
Q4. Why is maximizing the variance of the projected data required in PCA?
Ans.
Maximizing the variance of the projected data in PCA is fundamental for capturing the most
relevant information, reducing dimensionality, and ensuring that the resulting principal
components are both informative and orthogonal. It allows for a concise representation of
the data while retaining its essential characteristics.
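A small sketch of the idea (synthetic 2-D data; the first principal direction is the top
eigenvector of the covariance matrix, and projecting onto it yields the largest possible
variance):

import numpy as np

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[3.0, 1.5], [1.5, 1.0]], size=1000)

Xc = X - X.mean(axis=0)            # center the data
C = np.cov(Xc, rowvar=False)       # 2 x 2 sample covariance matrix
vals, vecs = np.linalg.eigh(C)     # eigenvalues in ascending order

w = vecs[:, -1]                    # direction of maximum variance
print(np.var(Xc @ w), vals[-1])    # projected variance ~ largest eigenvalue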

3 Marks Questions:
Q1.

Ans.
Q2. Define a kernel. Which kernels are suitable for KPCA algorithm? Briefly explain the role
of a kernel in KPCA.
Ans.
A kernel is a function that computes the inner product (a similarity measure) between
pairs of data points in an implicit higher-dimensional feature space. Kernels are used in
algorithms such as Support Vector Machines (SVM) and Kernel Principal Component Analysis
(KPCA) to work in that feature space without ever explicitly computing the transformed
feature vectors (the "kernel trick").
Commonly used kernels for KPCA include:
1. Linear Kernel
2. Polynomial Kernel
3. Radial Basis Function (RBF) or Gaussian Kernel
4. Sigmoid Kernel
The role of a kernel in KPCA is to implicitly map the data into a higher-dimensional space,
making it possible to capture non-linear relationships among data points. Kernels achieve
this by computing the similarity or inner product between data points in the higher-
dimensional space. In the transformed space, KPCA then identifies the principal
components, which are linear combinations of the implicitly mapped features, allowing for
non-linear dimensionality reduction.
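As a sketch of the core computation (the RBF kernel is written out by hand for clarity;
the bandwidth gamma and the random data are illustrative choices):

import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    # K[i, j] = exp(-gamma * ||x_i - x_j||^2) for all pairs of points
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

X = np.random.randn(50, 3)
K = rbf_kernel_matrix(X, gamma=0.5)

# Center the kernel matrix -- equivalent to centering the mapped features
n = K.shape[0]
one_n = np.ones((n, n)) / n
Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# Eigenvectors of the centered kernel matrix define the kernel principal components
eigvals, eigvecs = np.linalg.eigh(Kc)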

Q3. Define dimension of data. What is its role during feature extraction? Briefly explain
how PCA can help in this process.

Ans.
The term "dimension" refers to the number of attributes or features that describe each data
point.
During feature extraction, the goal is often to reduce the dimensionality of the data.
Processing high-dimensional data is computationally expensive, and such data may suffer
from the curse of dimensionality: as the number of dimensions grows, the data becomes
sparse and learning algorithms become less effective. Feature extraction methods therefore
aim to capture the most important information in the data while reducing its
dimensionality.

Principal Component Analysis (PCA) is a popular technique for dimensionality reduction. It
works by identifying the principal components, which are linear combinations of the original
features that capture the maximum variance in the data. These principal components are
ranked by their importance, and by selecting a subset of them, one can effectively reduce
the dimensionality of the data while retaining as much of the original information as
possible.
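A minimal sketch with scikit-learn (the random dataset and the 95% variance threshold are
illustrative choices, not from the text):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 10))  # correlated features

pca = PCA().fit(X)
cum = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cum, 0.95)) + 1        # smallest k explaining >= 95% of variance
X_reduced = PCA(n_components=k).fit_transform(X)
print(k, X_reduced.shape)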
Q4. What are the properties of K-Means algorithm? Mention its limitations (if any).
Ans.
Q5. Write down the expression for the covariance matrix used in PCA. What is its order?

Ans.
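For reference, the standard expression (assuming N data points x_1, ..., x_N in R^D with
sample mean x-bar) is:

$$
S = \frac{1}{N}\sum_{n=1}^{N}\left(\mathbf{x}_n - \bar{\mathbf{x}}\right)\left(\mathbf{x}_n - \bar{\mathbf{x}}\right)^{\top},
\qquad
\bar{\mathbf{x}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{x}_n
$$

Its order is D x D, where D is the number of features (some texts use 1/(N-1) instead of
1/N for the unbiased estimate).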
7 Marks Questions:
Q1.

Ans.

Q2.

Ans.
Q3.

Ans.
Q4.

Ans.
