STATfinal

The presentation covers three statistical techniques: Principal Component Analysis (PCA), Factor Analysis, and Multidimensional Scaling (MDS). PCA is used for data simplification and dimensionality reduction, while Factor Analysis identifies common patterns among variables, and MDS visualizes similarities or dissimilarities in high-dimensional data. Each technique has its advantages, limitations, and specific applications across various fields such as psychology, marketing, and social sciences.

Uploaded by

Adarsh Singh

PRESENTATION

On
Principal Component Analysis, Factor Analysis and
Multidimensional Scaling

Submitted by:
Sundaram Tiwari (3103)
Khyati Singh (3104)
Sachin Yadav (3105)
Vidhu Bhooshan Yadav (3106)
Nancy Sharma (3107)
Abhishek Chaudhari (3108)
Adarsh (3109)
Vikas Singh (3110)

Submitted to:
Dr. Gaurav Shukla
Assistant Professor

Principal Component Analysis (PCA)
The oldest and best-known technique of multivariate data analysis.
It was first introduced by Karl Pearson in 1901.
Definition
PCA is a way of identifying patterns in data and expressing the data so as
to highlight their similarities and differences. Because patterns are hard
to find in high-dimensional data, where graphical representation is not
available, PCA is a powerful tool for analyzing such data.
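
The idea can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the presenters' method: the ten two-variable observations are made-up toy values, and the helper names (`center`, `covariance_matrix`, `first_component`) are assumptions introduced here. The sketch finds the first principal component by power iteration on the sample covariance matrix; a numerical library would normally do this step.

```python
import math

def center(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (divisor n - 1) of mean-centred rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov, found by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cv))
    return v, eigenvalue

# Toy data (made up for illustration): two strongly correlated variables.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]

centered = center(data)
cov = covariance_matrix(centered)
direction, variance = first_component(cov)

# Share of the total variance captured by the first component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = variance / total_variance

# One-dimensional representation: project each observation on the component.
scores = [sum(x * d for x, d in zip(row, direction)) for row in centered]

print("direction:", [round(d, 3) for d in direction])
print("explained variance ratio:", round(explained, 3))
```

Because the two toy variables move together, a single component captures most of the variance, so the two columns can be replaced by one column of scores with little loss.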
Why use PCA?

➢ Simplify data analysis and visualization.
➢ Reduce computational cost.
➢ Improve model performance by addressing multicollinearity.
➢ Extract important features.
➢ Compress data.
Goals of PCA:
➢ Extract the most important information.
➢ Compress data by reducing dimensionality.
➢ Simplify data description.
➢ Analyze the structure of observations and variables.
➢ Reduce noise.
Advantages of PCA
➢ Dimensionality Reduction: Simplifies data and reduces computational cost.
➢ Data Visualization: Enables visualization of high-dimensional data in 2D or
3D.
➢ Multicollinearity Handling: Addresses issues with correlated variables in
regression.
➢ Data Compression: Reduces storage requirements.

Disadvantages of PCA
➢ Interpretation: Principal components can be difficult to interpret in terms of the
original variables.
➢ Non-linear Relationships: PCA captures only linear relationships between
variables, so non-linear structure may be missed.
➢ Overfitting: Retaining too many principal components keeps noise in the data,
which can lead to overfitting.
➢ Information Loss: Reducing dimensions can lead to some information loss.
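
The trade-off between keeping too many components and losing information is often handled by looking at the cumulative share of variance retained. The sketch below assumes the eigenvalues of the covariance matrix are already known; the values shown are hypothetical, and `components_needed` and its `threshold` parameter are illustrative names, not a standard API.

```python
def components_needed(eigenvalues, threshold=0.90):
    """Smallest number of components whose cumulative variance share
    reaches the threshold, together with the share actually retained."""
    total = sum(eigenvalues)
    cumulative = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k, cumulative / total
    return len(ordered), 1.0

# Hypothetical eigenvalues for a five-variable data set.
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]

k, retained = components_needed(eigenvalues, threshold=0.90)
lost = 1.0 - retained  # share of variance discarded (information loss)
print(k, retained, lost)
```

The threshold is a judgment call: a higher value retains more information but keeps more components, and with them more noise.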
FACTOR ANALYSIS

Factor analysis is a statistical technique that reduces a set of variables by
extracting their commonalities into a smaller number of factors. It is
therefore a form of data reduction.
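
As a rough illustration of how a few factors can account for what the variables have in common, the sketch below assumes standardized variables and uncorrelated (orthogonal) common factors. Under that assumption, the share of a variable's variance explained by the factors (its communality) is the sum of its squared loadings, and the correlation implied between two variables is the sum of the products of their loadings. The loading values are hypothetical.

```python
def communalities(loadings):
    """Variance of each standardized variable explained by the common factors."""
    return [sum(l * l for l in row) for row in loadings]

def implied_correlations(loadings):
    """Correlation matrix implied by orthogonal common factors."""
    p = len(loadings)
    return [[1.0 if i == j
             else sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(p)]
            for i in range(p)]

# Hypothetical loadings of four variables on two factors (rows = variables).
loadings = [[0.8, 0.1],
            [0.7, 0.2],
            [0.1, 0.9],
            [0.2, 0.6]]

h2 = communalities(loadings)
uniqueness = [1.0 - h for h in h2]       # variance left to variable-specific factors
implied = implied_correlations(loadings)

for row in implied:
    print([round(r, 2) for r in row])
```

Variables that load on the same factor (the first two, and the last two) end up with a high implied correlation, which is the "common pattern" the factors summarize.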
Factor analysis makes several assumptions:

➢ Linear relationships among the variables

➢ Absence of multicollinearity

➢ Relevance of the variables

➢ The existence of a true correlation between factors and variables


Features of Factor Analysis

➢ Identifies common patterns in variables
➢ Reduces complexity
➢ Used in behavioral and social science
Types of Factor Analysis

1. Exploratory Factor Analysis (EFA): Identifies patterns without prior
assumptions. In this method, any variable can be related to any factor, which
helps identify complex relationships among variables and group them based on
common factors.
2. Confirmatory Factor Analysis (CFA): Confirms expected relationships based
on theory. It assesses the fit of the hypothesized model to the actual data,
examining how well the observed variables align with the proposed factor
structure.
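
One simple way to picture "fit" in CFA is to compare the observed correlations with the correlations a hypothesized model implies. The sketch below is a simplified illustration, not a full CFA: it assumes a one-factor model with standardized variables, takes the loadings as given rather than estimating them, and summarizes the misfit as a root mean square residual. The observed correlations and loadings are hypothetical.

```python
import math

def implied_one_factor(loadings):
    """Correlations implied by a single common factor (standardized variables)."""
    p = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(p)]
            for i in range(p)]

def root_mean_square_residual(observed, implied):
    """RMS of the differences between observed and implied correlations,
    taken over the off-diagonal (lower-triangle) entries."""
    p = len(observed)
    residuals = [observed[i][j] - implied[i][j]
                 for i in range(p) for j in range(i)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical observed correlation matrix for three variables.
observed = [[1.00, 0.55, 0.50],
            [0.55, 1.00, 0.40],
            [0.50, 0.40, 1.00]]

# Hypothesized loadings (assumed here, not estimated from data).
hypothesized = [0.8, 0.7, 0.6]

rmsr = root_mean_square_residual(observed, implied_one_factor(hypothesized))
print(round(rmsr, 4))
```

A small residual suggests the hypothesized structure reproduces the observed correlations well; dedicated CFA software estimates the loadings and reports several fit indices rather than relying on one number.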
Assumptions of factor analysis

➢ The data contain no outliers.
➢ The sample size is greater than the number of factors.
➢ Because factor analysis is an interdependence technique, there is no perfect
multicollinearity between any of the variables.
➢ Homoscedasticity, where all variables in a sequence of random variables have
the same finite variance, is not required, because factor analysis works as a
linear function of the variables.
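
The multicollinearity assumption can be screened for before running the analysis. The sketch below flags pairs of variables whose Pearson correlation is exactly plus or minus one (within a small tolerance). It only catches pairwise cases; a variable that is an exact combination of several others would need a different check. The column names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def perfectly_collinear_pairs(columns, tol=1e-9):
    """Pairs of variables whose absolute correlation is 1 within tol."""
    names = list(columns)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) > 1.0 - tol:
                flagged.append((names[i], names[j]))
    return flagged

# Hypothetical survey items; q2 is exactly twice q1.
columns = {
    "q1": [1, 2, 3, 4, 5],
    "q2": [2, 4, 6, 8, 10],
    "q3": [5, 3, 4, 1, 2],
}

flagged = perfectly_collinear_pairs(columns)
print(flagged)
```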
Multidimensional Scaling (MDS)
➢ Multidimensional Scaling (MDS) is a statistical technique for visualizing the
similarity or dissimilarity of a group of objects or entities by converting
high-dimensional data into a more understandable two- or three-dimensional space.
➢ MDS is employed across a wide range of fields and is especially beneficial in
domains such as psychology, sociology, marketing, geography, and biology, where
understanding complex patterns is critical for making decisions and developing
strategies.
Conducting multidimensional scaling
Types of multidimensional scaling

Classical multidimensional scaling: takes an input matrix of dissimilarities
between pairs of items and produces a coordinate matrix whose configuration
minimizes a loss function called strain.
Metric multidimensional scaling: generalizes the optimization procedure to a
variety of loss functions and input matrices.
Non-metric multidimensional scaling: finds a non-parametric monotonic
relationship between the dissimilarities and the Euclidean distances between
items.
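
The classical variant can be sketched with the standard library alone. The version below follows the usual textbook recipe (square the dissimilarities, double-centre them, and take the leading eigenvectors scaled by the square roots of their eigenvalues); the eigenvectors are found by power iteration with deflation, which is a simplification chosen here to avoid external libraries. The four corner points and all function names are illustrative assumptions.

```python
import math
import random

def double_center(dist):
    """B = -1/2 * J * D^2 * J, where J is the centring matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]   # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]

def top_eigenpairs(matrix, k, iterations=500, seed=0):
    """Largest k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    n = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _component in range(k):
        v = [rng.random() - 0.5 for _ in range(n)]
        for _step in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        wv = [sum(work[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(a * b for a, b in zip(v, wv))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before searching for the next.
        for i in range(n):
            for j in range(n):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return pairs

def classical_mds(dist, dims=2):
    """Coordinates whose Euclidean distances approximate the dissimilarities."""
    pairs = top_eigenpairs(double_center(dist), dims)
    n = len(dist)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in pairs]
            for i in range(n)]

# Toy input: distances between the corners of a 3-by-4 rectangle.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
dist = [[math.dist(p, q) for q in points] for p in points]

coords = classical_mds(dist, dims=2)
for c in coords:
    print([round(x, 3) for x in c])
```

The recovered coordinates match the original points only up to rotation, reflection, and translation, since the input carries information about distances and not about orientation.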
Applications of multidimensional scaling

➢ Psychology and Cognitive Science
➢ Market Research and Marketing
➢ Geography and Cartography
➢ Biology and Bioinformatics
➢ Social Sciences and Sociology
Advantages of Multidimensional Scaling
➢ Reduces the dimensionality of the relationships among objects while
retaining essential information, making the objects easier to understand
without sacrificing critical data.
➢ Its flexible structure suits many fields and data types, so it can be
integrated into almost any area of study.
➢ It helps uncover hidden structures in the data, revealing underlying
patterns and relationships that may not be readily apparent.
➢ It supports hypothesis testing and cluster analysis, and so underpins
data-driven decision-making.
Limitations of Multidimensional Scaling

➢ Sensitivity to outliers: MDS results may be skewed by outliers,
distorting the visualization or interpretation of the relationships.
➢ Computational complexity: MDS can demand substantial computational
resources and time, especially for large datasets.
➢ Subjectivity in interpretation: Interpreting the meaning of the spatial
arrangement involves subjective judgment, which can introduce bias.
➢ Challenges in identifying the correct dimensionality: Determining the
appropriate number of dimensions for the reduced space can be complex
and may require experimentation.
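
One common way to run that experimentation is to compute a badness-of-fit measure for candidate configurations of different dimensionality and see where adding a dimension stops helping. The sketch below uses a stress measure in the style of Kruskal's stress-1, under the simplifying assumption that the dissimilarities themselves serve as the target distances. The two candidate configurations are hand-made for illustration, not the output of an MDS run.

```python
import math

def stress(dissimilarities, coords):
    """Square root of the squared mismatch between target dissimilarities and
    configuration distances, normalized by the squared configuration distances."""
    n = len(coords)
    mismatch = 0.0
    scale = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            mismatch += (dissimilarities[i][j] - d) ** 2
            scale += d ** 2
    return math.sqrt(mismatch / scale)

# Target dissimilarities: corners of a 3-by-4 rectangle (toy example).
corners = [(0, 0), (3, 0), (3, 4), (0, 4)]
dissimilarities = [[math.dist(p, q) for q in corners] for p in corners]

# Hypothetical candidate configurations in one and two dimensions.
config_1d = [[0.0], [0.0], [4.0], [4.0]]   # keeps only the longer side
config_2d = [[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]]

stress_1d = stress(dissimilarities, config_1d)
stress_2d = stress(dissimilarities, config_2d)
print(round(stress_1d, 3), round(stress_2d, 3))
```

Here the stress drops to zero at two dimensions, so a third would add nothing; with real data the drop is rarely this clean, which is why the choice takes judgment.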
