Factor Analysis


UNIVERSITY OF BALAMAND

Faculty of Arts & Sciences


Department of Biology

FACTOR ANALYSIS

Prepared by: Khoder Ghassa


Presented to: Dr. Mira Sabat and colleagues
Contents
Abstract
1. What Is Factor Analysis
1.1 Definition of Factor Analysis
1.2 Purpose of Factor Analysis
1.3 Historical Context
2. Types of Factor Analysis
2.1 Exploratory Factor Analysis (EFA)
2.1.1 Objective:
2.1.2 Process:
2.1.3 Applications:
2.2 Confirmatory Factor Analysis (CFA)
2.2.1 Objective:
2.2.2 Process:
2.2.3 Applications:
3. Determining the Number of Factors
3.1 Scree Plot Analysis
3.1.1 Explanation:
3.1.2 Interpretation:
3.2 Kaiser's Rule
3.2.1 Explanation:
3.2.2 Limitations:
3.3 Parallel Analysis
3.3.1 Explanation:
3.3.2 Advantages:
5. Factor Analysis vs. Principal Component Analysis (PCA)
5.1 Distinctions
5.1.1 Purpose:
5.1.2 Interpretability:
5.2 Use Cases
5.2.1 FA and Model Structure:
5.2.2 Sensitivity to Scale:
5.3 Pros and Cons
5.3.1 Advantages of FA:
5.3.2 Advantages of PCA:
5.3.3 Limitations of FA:
5.3.4 Limitations of PCA:
6. Examples and R Code
6.1 Example Dataset
6.2 R Code for Factor Analysis
6.3 Interpretation
7. Conclusion
7.1 Key Concepts Explored
7.2 Practical Implementation
7.3 Significance and Applications
References

Abstract
In the world of biostatistics, factor analysis is like a trusted guide helping us uncover hidden
structures within complex datasets. It's our go-to tool for peeling back layers and understanding
the factors that influence the things we observe. Think of it as a special pair of glasses that lets us
see beneath the surface. We use two main approaches: Exploratory Factor Analysis (EFA) and
Confirmatory Factor Analysis (CFA). EFA is like an explorer, helping us discover factors without
preconceived notions. On the other hand, CFA is like a validator, testing a predefined model to
see if it fits our data. Choosing the right number of factors can be tricky, but we have tools like
scree plots, Kaiser's rule, and parallel analysis to help us navigate this decision-making process.
Factor rotation is like adjusting the lens on a camera, making our view clearer. It helps us better
understand which factors really matter. Comparing factor analysis with principal component
analysis (PCA) is like understanding two different tools in a toolbox. They might seem similar,
but they serve distinct purposes, and factor analysis tends to be more about understanding
underlying influences. In a real-world example, such as genetic research, factor analysis is
what allows biostatisticians to sift through complex genetic data, revealing hidden
structures and turning them into meaningful insights. It works like a codebreaker, helping us
make sense of the biological mysteries that lie within the data.

1. What Is Factor Analysis


1.1 Definition of Factor Analysis
Factor analysis is a technique used to reduce a large number of observed variables to a smaller
number of factors. It extracts the maximum common variance shared by the variables and
summarizes it in a common score. It is used in fields such as psychology, economics, and biology.

1.2 Purpose of Factor Analysis


In biostatistics, the application of factor analysis extends to understanding the inherent patterns
and interdependencies among a multitude of biological variables. Whether exploring the shared
genetic components of diseases or deciphering the underlying factors influencing patient
responses to treatments, factor analysis provides a powerful tool for extracting meaningful
information from complex datasets.

1.3 Historical Context


The roots of factor analysis can be traced back to the early 20th century, with pioneers like
Charles Spearman and Louis Thurstone laying the groundwork. Over time, factor analysis has
evolved, finding its place in the arsenal of statistical techniques used by biostatisticians to
unravel the intricacies of biological phenomena.
2. Types of Factor Analysis
2.1 Exploratory Factor Analysis (EFA)
2.1.1 Objective:
Exploratory Factor Analysis (EFA) is employed when the researcher aims to uncover latent
factors within a dataset without preconceived notions about the number of factors or their
relationships.

2.1.2 Process:
EFA is an iterative process in which the algorithm estimates the factors that account for most
of the shared variance among the observed variables. The researcher then examines factor
loadings, communalities, and eigenvalues to interpret the underlying structure.
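
The process above can be sketched in base R. This is a minimal illustration, not a prescribed workflow: the data are simulated (six hypothetical variables x1 to x6 driven by two hypothetical latent factors f1 and f2), and factanal from the stats package stands in for whichever EFA routine is actually used.

```r
# Simulate six observed variables driven by two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Maximum-likelihood EFA with two factors and varimax rotation.
efa <- factanal(X, factors = 2, rotation = "varimax")

# Loadings below 0.3 are hidden to make the pattern easier to read.
print(efa$loadings, cutoff = 0.3)
```

In the printed loadings, variables that share a latent factor should load mainly on the same column, which is the pattern a researcher would then try to name and interpret.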

2.1.3 Applications:
EFA is commonly used in biostatistics for discovering hidden patterns in biological data, such as
identifying common genetic factors contributing to disease susceptibility across a population.

2.2 Confirmatory Factor Analysis (CFA)


2.2.1 Objective:
Confirmatory Factor Analysis (CFA) is utilized when the researcher has a predefined hypothesis
about the number of factors and their relationships. CFA tests the validity of a pre-established
theoretical framework.

2.2.2 Process:
In CFA, the researcher specifies a model based on prior knowledge or theory. The algorithm then
assesses how well the observed data fit the hypothesized model, allowing for the validation or
refinement of the initial assumptions.
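
As a hedged sketch of this process, the example below specifies a two-factor model and checks its fit. It assumes the lavaan package is installed (the original text does not name a CFA package), and it uses simulated data with hypothetical variable names x1 to x6 and factor names f1 and f2.

```r
library(lavaan)  # assumption: lavaan is used here for CFA; the source names no package

# Simulated data: x1-x3 depend on one latent factor, x4-x6 on another.
set.seed(42)
n  <- 500
l1 <- rnorm(n)
l2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * l1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * l1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * l1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * l2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * l2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * l2 + rnorm(n, sd = 0.8)
)

# The hypothesized structure is written down before looking at the data.
model <- '
  f1 =~ x1 + x2 + x3
  f2 =~ x4 + x5 + x6
'

fit <- cfa(model, data = X)

# Common fit indices: chi-square test, CFI, and RMSEA.
fit_stats <- fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "rmsea"))
print(round(fit_stats, 3))
```

Unlike EFA, every loading that is not listed in the model string is fixed to zero, so a poor fit is evidence against the hypothesized structure rather than a prompt to re-rotate.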

2.2.3 Applications:
CFA is applied in biostatistics when researchers want to confirm or reject a theoretical model of
factor structure, such as validating a measurement instrument for assessing patient-reported
outcomes.
These two types of factor analysis suit different research scenarios: EFA explores unknown
structures, while CFA confirms existing theories. The following sections cover the methods for
determining the number of factors, a critical step in the factor analysis process.

3. Determining the Number of Factors


3.1 Scree Plot Analysis

Factor loadings tell you how strongly each observed variable is associated with each latent factor.
Communalities indicate the proportion of variance in an observed variable explained by the
latent factors.
Eigenvalues measure the amount of variance explained by each factor, which helps to determine
how much each factor contributes to explaining the observed data.
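
To make these three quantities concrete, the sketch below computes them in base R on simulated data (hypothetical variables x1 to x6). It assumes an orthogonal factor solution from factanal, in which a variable's communality is the sum of its squared loadings.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

fit <- factanal(X, factors = 2, rotation = "varimax")

# Factor loadings: one row per variable, one column per factor.
loadings_matrix <- unclass(fit$loadings)

# Communality: sum of squared loadings for each variable.
communality <- rowSums(loadings_matrix^2)

# Uniqueness is the remaining share of variance, so the two add up to about 1.
print(round(cbind(communality, uniqueness = fit$uniquenesses), 3))

# Eigenvalues of the correlation matrix; they sum to the number of variables.
eigenvalues <- eigen(cor(X))$values
print(round(eigenvalues, 3))
```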

3.1.1 Explanation:
A scree plot is a graphical representation of eigenvalues, in decreasing order, plotted against
the factor number. Eigenvalues represent the amount of variance explained by each factor. In a
scree plot, the "knee" (or elbow) is the point after which the eigenvalues level off; the factors
before that point are the ones to retain.

3.1.2 Interpretation:
Researchers should inspect the scree plot for the point where the eigenvalues stop decreasing
sharply and begin to flatten. The number of factors before that point is taken to adequately
represent the variance in the data. This visual inspection helps avoid over-extraction or
under-extraction of factors, although locating the knee can be subjective.
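
A scree plot can be drawn with a few lines of base R. The sketch below is illustrative: the data are simulated (hypothetical variables x1 to x6 with two underlying factors), and the eigenvalues are taken from the correlation matrix.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# eigen() returns eigenvalues of a symmetric matrix in decreasing order.
eigenvalues <- eigen(cor(X))$values

# Plot eigenvalue against factor number; look for where the curve flattens.
plot(seq_along(eigenvalues), eigenvalues, type = "b",
     xlab = "Factor number", ylab = "Eigenvalue", main = "Scree plot")
```

With data built from two factors, the drop between the second and third eigenvalue should be much larger than the drops that follow, which is the knee the text describes.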

3.2 Kaiser's Rule


3.2.1 Explanation:
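Kaiser's rule suggests retaining factors with eigenvalues greater than 1. When the analysis is
based on the correlation matrix, each standardized variable contributes one unit of variance, so
a factor with an eigenvalue below 1 explains less variance than a single variable and is
considered less informative.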

3.2.2 Limitations:
While Kaiser's rule is a widely used guideline, it has limitations and may overestimate or
underestimate the number of factors. Researchers should consider other criteria alongside
Kaiser's rule for a more robust determination.
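
Kaiser's rule reduces to a one-line count once the eigenvalues are available. The sketch below applies it to simulated data (hypothetical variables x1 to x6) and assumes the eigenvalues come from the correlation matrix, which is the setting in which the threshold of 1 is meaningful.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

eigenvalues <- eigen(cor(X))$values

# Kaiser's rule: keep the factors whose eigenvalue exceeds 1.
n_factors_kaiser <- sum(eigenvalues > 1)
print(n_factors_kaiser)
```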

3.3 Parallel Analysis


3.3.1 Explanation:
Parallel analysis involves generating random, uncorrelated datasets with the same number of
variables and observations as the original dataset. By comparing the eigenvalues from the actual
data to those from the random datasets, researchers retain only the factors whose eigenvalues
exceed what would be expected by chance.
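
The comparison described above can be sketched in base R as follows. This is a simplified version under stated assumptions: it uses eigenvalues of the full correlation matrix, draws the random data from a normal distribution, and compares against the 95th percentile of the simulated eigenvalues. The function name parallel_analysis and its arguments are hypothetical, and dedicated implementations may differ in these choices.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

parallel_analysis <- function(data, n_sim = 200, quantile_level = 0.95) {
  n_obs <- nrow(data)
  n_var <- ncol(data)
  observed <- eigen(cor(data))$values

  # Each column holds the eigenvalues of one random dataset of the same size.
  simulated <- replicate(n_sim, {
    random_data <- matrix(rnorm(n_obs * n_var), nrow = n_obs, ncol = n_var)
    eigen(cor(random_data))$values
  })

  # Chance threshold for the 1st, 2nd, ... eigenvalue.
  threshold <- apply(simulated, 1, quantile, probs = quantile_level)

  # Retain leading factors until an observed eigenvalue fails to beat chance.
  exceeds <- observed > threshold
  n_factors <- if (all(exceeds)) n_var else which(!exceeds)[1] - 1

  list(observed = observed, threshold = threshold, n_factors = n_factors)
}

result <- parallel_analysis(X)
print(round(cbind(observed = result$observed, threshold = result$threshold), 3))
print(result$n_factors)
```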

3.3.2 Advantages:
Parallel analysis is considered more accurate than Kaiser's rule and is particularly useful when
dealing with complex datasets. It provides a statistical foundation for determining the appropriate
number of factors.
In practice, a combination of these methods is often recommended to ensure a robust
determination of the number of factors. Once the number of factors is chosen, factor rotation
helps refine the interpretation of the identified factors.

4. Factor Rotation
Factor rotation is a technique used in factor analysis to simplify the interpretation of the factor
solution. After the initial extraction of factors, rotation is applied to adjust the factor loadings and
make the factors more interpretable. The goal is to achieve a simpler and more meaningful factor
structure. Orthogonal rotations such as varimax keep the factors uncorrelated, while oblique
rotations such as promax allow them to correlate.
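
The effect of rotation can be seen by extracting unrotated loadings and rotating them afterwards. The sketch below uses base R's varimax and promax functions on simulated data (hypothetical variables x1 to x6); it is an illustration of the idea rather than a recommended pipeline.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Step 1: extract two factors without any rotation.
unrotated <- factanal(X, factors = 2, rotation = "none")
loadings_unrotated <- unclass(unrotated$loadings)

# Step 2: rotate the same loadings in two different ways.
rotated_varimax <- varimax(loadings_unrotated)  # orthogonal rotation
rotated_promax  <- promax(loadings_unrotated)   # oblique rotation

print(round(loadings_unrotated, 2))
print(round(unclass(rotated_varimax$loadings), 2))
print(round(unclass(rotated_promax$loadings), 2))

# An orthogonal rotation redistributes variance between factors
# but leaves each variable's communality unchanged.
communality_before <- rowSums(loadings_unrotated^2)
communality_after  <- rowSums(unclass(rotated_varimax$loadings)^2)
```

Rotation changes how the explained variance is shared between factors, not how much of each variable is explained, which is why it helps interpretation without changing the fit.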

5. Factor Analysis vs. Principal Component Analysis (PCA)

5.1 Distinctions
5.1.1 Purpose:
• Factor Analysis (FA): Focuses on identifying latent factors that contribute to observed
variables. Assumes that each observed variable is influenced by common factors and by a
unique factor (variable-specific variance plus error).
• Principal Component Analysis (PCA): Aims to capture as much total variance as possible
without distinguishing between common and unique variance. It transforms the data into
uncorrelated principal components.
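
The distinction can be written compactly. The notation below is an assumption introduced for illustration (it does not appear in the original text): x is the vector of observed variables with mean mu, f the vector of common factors, Lambda the loading matrix, epsilon the unique factors with diagonal covariance Psi, and W the matrix whose columns are eigenvectors of the covariance matrix of x. The factor model shown is the orthogonal one, with uncorrelated unit-variance factors.

```latex
% Factor analysis: a model with common and unique parts
\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{f} + \boldsymbol{\varepsilon},
\qquad
\operatorname{Cov}(\mathbf{x}) = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}

% PCA: a transformation of the data, with no error term
\mathbf{y} = \mathbf{W}^{\top}(\mathbf{x} - \boldsymbol{\mu}),
\qquad
\operatorname{Cov}(\mathbf{y}) = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)
```

In the factor model the diagonal matrix Psi absorbs the variance that is unique to each variable, whereas PCA has no such term and spreads all of the variance across the components.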

5.1.2 Interpretability:
• FA: Factors are interpreted as latent constructs influencing the observed variables. It
provides insight into the underlying structure of the data.
• PCA: Principal components are linear combinations of the original variables. While they
capture maximum variance, they can be difficult to interpret.
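
The statement that principal components are linear combinations of the original variables can be checked directly. The sketch below uses base R's prcomp on simulated data (hypothetical variables x1 to x6) and rebuilds the component scores by hand from the standardized data and the weight matrix.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# PCA on standardized variables.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Each column of pca$rotation holds the weights of one component.
print(round(pca$rotation, 2))

# Scores are the standardized data multiplied by those weights.
scores_manual <- scale(X) %*% pca$rotation
max_difference <- max(abs(scores_manual - pca$x))

# The resulting components are uncorrelated with each other.
score_correlations <- cor(pca$x)
```

Because every variable usually receives a non-zero weight in every component, a component often mixes several influences, which is one reason it can be harder to name than a rotated factor.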

5.2 Use Cases


5.2.1 FA and Model Structure:
• FA: Assumes a specific model structure where observed variables are influenced by
common and unique factors. It's suitable when there's interest in understanding the latent
constructs driving the data.
• PCA: Focuses on maximizing variance and is often used for dimensionality reduction
without concern for the underlying structure.

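5.2.2 Sensitivity to Scale:

• FA: Usually fitted to the correlation matrix, so rescaling a variable has little or no effect
on the solution. The loadings do depend on the choice of rotation method.
• PCA: Sensitive to the scale of measurement when computed from the covariance matrix,
because variables with large variances dominate the components. Standardizing the
variables (equivalently, using the correlation matrix) removes this dependence.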
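
How each method reacts to a change of units can be checked with a short experiment. The sketch below is illustrative only: it multiplies one simulated variable by 100 (as if its unit had changed), then compares covariance-based PCA from prcomp with maximum-likelihood factor analysis from factanal, which works from the correlation matrix.

```r
# Simulated data with two latent factors (illustrative only).
set.seed(42)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- data.frame(
  x1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  x2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  x3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  x4 = 0.8 * f2 + rnorm(n, sd = 0.6),
  x5 = 0.7 * f2 + rnorm(n, sd = 0.7),
  x6 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Same data, but x1 is expressed in a unit 100 times smaller.
X_rescaled <- X
X_rescaled$x1 <- X_rescaled$x1 * 100

# Covariance-based PCA: share of total variance carried by the first component.
variance_share_pc1 <- function(data) {
  pca <- prcomp(data, center = TRUE, scale. = FALSE)
  pca$sdev[1]^2 / sum(pca$sdev^2)
}
share_original <- variance_share_pc1(X)
share_rescaled <- variance_share_pc1(X_rescaled)
print(round(c(original = share_original, rescaled = share_rescaled), 3))

# Factor analysis: uniquenesses before and after rescaling.
fa_original <- factanal(X, factors = 2)
fa_rescaled <- factanal(X_rescaled, factors = 2)
uniqueness_change <- max(abs(fa_original$uniquenesses - fa_rescaled$uniquenesses))
print(uniqueness_change)
```

After rescaling, the first principal component is driven almost entirely by x1, while the factor solution is essentially unchanged. Running prcomp with scale. = TRUE would make the PCA result insensitive to the unit change as well.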

5.3 Pros and Cons


5.3.1 Advantages of FA:
• Offers insight into latent factors influencing observed variables.
• Allows for a more nuanced understanding of complex relationships in the data.

5.3.2 Advantages of PCA:
• Simple and computationally efficient.
• Effective for dimensionality reduction when the underlying structure is not a primary
concern.

5.3.3 Limitations of FA:
• Requires assumptions about the underlying model structure.
• Sensitive to the choice of rotation method.

5.3.4 Limitations of PCA:
• Ignores the distinction between common and unique factors.
• Principal components may lack clear interpretability.

6. Examples and R Code


6.1 Example Dataset
Imagine you conducted a survey among your colleagues to understand their preferences for
different aspects of coffee. You asked them about variables like sweetness, bitterness, aroma, and
temperature. Now, you want to figure out if there are underlying factors that influence these
preferences.
6.2 R Code for Factor Analysis
The analysis is run in RStudio on a synthetic dataset using the fa function from the psych
package, with the result stored in coffee_fa. The argument nfactors = 2 specifies that we want
to extract two factors, and rotate = "varimax" requests varimax rotation for better
interpretability.
print(coffee_fa$loadings): Displays the factor loadings, indicating how much each variable is
influenced by the extracted factors.
print(coffee_fa$communality): Shows the communalities, indicating the proportion of variance
in each variable explained by the factors.
print(coffee_fa$Vaccounted): Displays the variance accounted for by each factor.
scree(coffee_data): Generates a scree plot from the survey data (called coffee_data here),
allowing visual inspection of the eigenvalues to determine the optimal number of factors. Base
R's screeplot function expects a PCA object from prcomp or princomp, so the scree function
from psych is used instead.
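
The calls described above can be assembled into the following sketch. It is a reconstruction under several assumptions, not the original script: the data are simulated, the names coffee_data, flavour, and serving are hypothetical, and two extra hypothetical items (Acidity and Strength) are added because a two-factor model is not identified with only four variables. The psych package must be installed.

```r
library(psych)

# Simulated survey: two hypothetical latent preferences drive six ratings.
set.seed(123)
n <- 300
flavour <- rnorm(n)  # hypothetical latent factor
serving <- rnorm(n)  # hypothetical latent factor

coffee_data <- data.frame(
  Sweetness   =  0.8 * flavour + rnorm(n, sd = 0.6),
  Bitterness  = -0.7 * flavour + rnorm(n, sd = 0.7),
  Acidity     =  0.6 * flavour + rnorm(n, sd = 0.8),  # hypothetical extra item
  Aroma       =  0.8 * serving + rnorm(n, sd = 0.6),
  Temperature =  0.7 * serving + rnorm(n, sd = 0.7),
  Strength    =  0.6 * serving + rnorm(n, sd = 0.8)   # hypothetical extra item
)

# Exploratory factor analysis: two factors, varimax rotation.
coffee_fa <- fa(coffee_data, nfactors = 2, rotate = "varimax")

print(coffee_fa$loadings)     # factor loadings
print(coffee_fa$communality)  # variance of each item explained by the factors
print(coffee_fa$Vaccounted)   # variance accounted for by each factor

# Scree plot of the eigenvalues, drawn from the data rather than the fit.
scree(coffee_data)
```

With real survey responses, coffee_data would simply be the data frame of ratings, and the number of factors would be chosen with the methods from Section 3 rather than fixed in advance.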

6.3 Interpretation
In this example, the factor loadings will indicate how much each aspect (Sweetness, Bitterness,
Aroma, Temperature) is influenced by the hidden factors. The communalities will tell you how
much of each variable is explained by these factors. The scree plot will help you visually inspect
the eigenvalues.

7. Conclusion
In this chapter, we embarked on a journey through the intricacies of factor analysis in
biostatistics, uncovering its fundamental concepts, methodologies, and practical applications.
Let's recap the key takeaways:

7.1 Key Concepts Explored

• Factor Analysis Definition: Factor analysis is a statistical technique employed to
identify latent factors contributing to the observed variability in a set of correlated
variables within the realm of biostatistics.
• Types of Factor Analysis:
  ◦ Exploratory Factor Analysis (EFA): Unveils latent factors without preconceived
  notions.
  ◦ Confirmatory Factor Analysis (CFA): Tests predefined hypotheses about factor
  structure.
• Determining the Number of Factors:
  ◦ Scree Plot Analysis: Visual inspection for the "knee" point.
  ◦ Kaiser's Rule: Retaining factors with eigenvalues greater than 1.
  ◦ Parallel Analysis: Statistical comparison of observed eigenvalues with those
  from random datasets.
• Factor Rotation:
  ◦ Definition: A process refining the interpretation of factors.
  ◦ Types: Orthogonal rotations (e.g., varimax) and oblique rotations (e.g., promax).
• Factor Analysis vs. Principal Component Analysis (PCA):
  ◦ Purpose Distinctions: FA identifies latent factors; PCA focuses on variance.
  ◦ Use Cases: FA for understanding structure; PCA for dimensionality reduction.
  ◦ Pros and Cons: FA provides insight but depends on model assumptions and the
  rotation method; PCA is simple but less interpretable.

7.2 Practical Implementation

• Example Dataset: Considered a hypothetical survey of coffee preferences (sweetness,
bitterness, aroma, and temperature).
• R Code for Factor Analysis: Described how to perform EFA in R with the psych package,
including factor loadings, communalities, variance accounted for, and a scree plot.

7.3 Significance and Applications


Factor analysis is like a detective for biostatisticians, helping us unlock hidden secrets within
complex biological data. Imagine it as a powerful tool that unveils patterns, such as hidden
genetic factors affecting diseases, or that confirms the reliability of tools measuring patient
outcomes. It's like putting on a pair of special glasses that allow us to see beyond the surface,
revealing meaningful insights in the vast world of biological information.

But, just like any detective work, using factor analysis involves making careful choices. Think of
it as deciding which clues to follow, determining how many factors to investigate, and then
piecing together the puzzle of results. It's a bit like choosing the right path in a dense forest:
there are different routes, and each decision matters.

References
• https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/factor-analysis/#:~:text=Factor%20analysis%20is%20a%20technique,them%20into%20a%20common%20score.
• https://www.analytixlabs.co.in/blog/factor-analysis-vs-pca/#What_is_Factor_Analysis
• https://docs.tibco.com/pub/stat/14.0.0/doc/html/UsersGuide/GUID-6087F7F5-F407-45FA-B6DA-24674BCDC07C.html
• https://www.youtube.com/watch?v=TeIx7dRedkg
• https://www.youtube.com/watch?v=WV_jcaDBZ2I&t=332s
