
CHAPTER 13

Multivariate Analysis Techniques

Reporter: Stella M. Rivera


"Multivariate" refers to the presence of multiple random
variables.

Variables are any characteristics that can take on different values,


such as height, age, temperature, or test scores.
-Independent variable is the cause. Its value is independent of
other variables in your study.
-Dependent variable is the effect. Its value depends on changes
in the independent variable.
Multivariate data are data that are thought of us the
realizations of several random variables.

Multivariate analysis techniques can be defined broadly as an
inquiry into the structure of interrelationships among multiple
random variables.
- All statistical techniques which simultaneously analyse more than
two variables on a sample of observations.
Growth of Multivariate Techniques
• Emerged as a powerful tool to analyse data represented in terms
of many variables.
• A series of univariate analyses carried out separately for each
variable may, at times, lead to incorrect interpretation of the
result. This is so because univariate analysis does not consider
the correlation or inter-dependence among the variables.
• As a result, during the last fifty years, a number of
statisticians have contributed to the development of several
multivariate techniques.
• Today, these techniques are being applied in many fields such as
economics, sociology, psychology, agriculture, anthropology,
biology and medicine.
• Applications of multivariate techniques in practice have been
accelerated in modern times because of the advent of high-speed
electronic computers.
CHARACTERISTICS AND APPLICATIONS
Multivariate techniques
• are largely empirical and deal with reality; they possess the
ability to analyse complex data.
• also help in various types of decision-making.
• have the basic objective of representing a collection of massive
data in a simplified way.
• involve several complex mathematical computations and as such can
be utilized largely where computer facilities are available.
VARIABLES IN MULTIVARIATE ANALYSIS

(i) Explanatory variable and criterion variable:
If X may be considered to be the cause of Y, then X is described as
the explanatory variable (also termed causal or independent
variable) and Y is described as the criterion variable (also termed
resultant or dependent variable).

(ii) Observable variables and latent variables:
- Explanatory variables as described above are observable variables.
- Unobservable variables are termed latent variables.

(iii) Discrete variable and continuous variable:
- A discrete variable is a variable which, when measured, may take
only integer values.
- A continuous variable is one which, when measured, can assume any
real value (even in decimal points).

(iv) Dummy variable (or pseudo variable):
This term is used in a technical sense and is useful in algebraic
manipulations in the context of multivariate analysis. We call
Xi (i = 1, ..., m) dummy variables if exactly one of the Xi is 1
and the others are all zero.
CLASSIFICATION OF MULTIVARIATE TECHNIQUES

Classified into two broad categories:
• Dependence methods - used when one or some of the variables are
dependent on others.
• Interdependence methods - make no such distinction but treat all
variables equally in a search for underlying relationships.
IMPORTANT MULTIVARIATE TECHNIQUES

(i) Multiple regression*:
- has a single, metric criterion variable which is supposed to be a
function of other explanatory variables.
- the main objective in using this technique is to predict the
variability of the dependent variable based on its covariance with
all the independent variables (a minimal sketch follows the note
below).
*Note:
Metric data is further classified into interval and ratio data.
Non-metric data is classified into nominal and ordinal data.
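The text gives no code; as a minimal illustration, the sketch below
fits a multiple regression in Python with scikit-learn on synthetic
data (the variables and values are purely hypothetical):

```python
# Minimal multiple-regression sketch: predict a single metric criterion
# variable y from several explanatory variables X (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # three explanatory variables
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=50)

model = LinearRegression().fit(X, y)
print("coefficients:", model.coef_)           # weight of each explanatory variable
print("R^2:", model.score(X, y))              # proportion of variance explained
```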
(ii) Multiple discriminant analysis:
- has a single, non-metric criterion variable which is supposed to
be a function of other explanatory (metric or non-metric) variables.
Examples of non-metric dependent variables:
- Gender: male vs. female
- Member vs. non-member
- Good, average or poor
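A hedged sketch of the idea, assuming Python with scikit-learn's
LinearDiscriminantAnalysis and made-up member/non-member data (not
the book's example):

```python
# Two-group discriminant analysis sketch: a non-metric criterion
# (member vs. non-member) as a function of two metric explanatory
# variables. All data are synthetic.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, size=(30, 2)),   # group 0: non-members
               rng.normal(2, 1, size=(30, 2))])  # group 1: members
y = np.array([0] * 30 + [1] * 30)                # non-metric criterion

lda = LinearDiscriminantAnalysis().fit(X, y)
print("weights:", lda.coef_)                     # discriminant function weights
print("prediction:", lda.predict([[1.0, 1.0]]))  # classify a new observation
```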


(iii) Multivariate analysis of variance (MANOVA):
- extends analysis of variance to situations where several
dependent variables are considered simultaneously.

(iv) Canonical correlation analysis:
- This technique was first developed by Hotelling, wherein an
effort is made to simultaneously predict a set of criterion
variables from their joint covariance with a set of explanatory
variables.
- Both metric and non-metric data can be used in the context of
this multivariate technique.
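A minimal canonical-correlation sketch, assuming Python with
scikit-learn's CCA and synthetic data that share one latent signal
(illustration only, not the author's example):

```python
# Canonical correlation sketch: find paired linear combinations of an
# explanatory set X and a criterion set Y with maximal correlation.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(2)
latent = rng.normal(size=(100, 1))
X = np.hstack([latent + rng.normal(scale=0.5, size=(100, 1)) for _ in range(3)])
Y = np.hstack([latent + rng.normal(scale=0.5, size=(100, 1)) for _ in range(2)])

cca = CCA(n_components=1).fit(X, Y)
Xc, Yc = cca.transform(X, Y)
# the first canonical correlation: correlation of the paired variates
print(np.corrcoef(Xc[:, 0], Yc[:, 0])[0, 1])
```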
(v) Factor Analysis
- seeks to resolve a large set of measured variables in terms of
relatively few categories, known as factors
(e.g., ratings of a restaurant might resolve into factors such as
"Service" and "Quality of food").
The mathematical basis of factor analysis concerns a data matrix
(also termed a score matrix), symbolized as S.
Basic terms relating to factor analysis

(i) Factor: a factor is an underlying dimension that accounts for
several observed variables.

(ii) Factor loadings:
- those values which explain how closely the variables are related
to each one of the factors discovered.
- also known as factor-variable correlations.
(iii) Communality (h2):
- symbolized as h2.
- shows how much of each variable is accounted for by the
underlying factors taken together.
A high value of communality means that not much of the variable is
left over after whatever the factors represent is taken into
consideration. It is worked out for each variable as under:

h2 of the ith variable = (ith factor loading on factor A)^2
+ (ith factor loading on factor B)^2 + ...
(iv) Eigenvalue (or latent root):
- the sum of squared factor loadings relating to one factor.
- indicates the relative importance of each factor in accounting
for the particular set of variables being analysed.

(v) Total sum of squares:
- the total of the eigenvalues of all factors.
- this value, when divided by the number of variables involved in a
study, gives an index of how well the particular factor solution
accounts for what all the variables taken together represent.
(vi) Rotation:
- the process of transforming a factor pattern to make it easier to
interpret.

(vii) Factor scores:
- the score of each respondent (row) on each factor (column); these
scores can be used as an index in place of the original variables
and carried into further analysis.
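The arithmetic behind these terms is simple; here is a numeric
sketch in Python with a made-up loading matrix (rows are variables,
columns are factors A and B):

```python
# Communality, eigenvalue, and total sum of squares from loadings.
import numpy as np

loadings = np.array([[0.70, 0.40],   # variable 1
                     [0.60, 0.50],   # variable 2
                     [0.65, 0.30]])  # variable 3 (illustrative values)

h2 = (loadings ** 2).sum(axis=1)           # communality: row sums of squares
eigenvalues = (loadings ** 2).sum(axis=0)  # eigenvalue: column sums of squares
total_ss = eigenvalues.sum()               # total sum of squares

print("h2 per variable:", h2)
print("eigenvalues:", eigenvalues)
print("index of the solution:", total_ss / loadings.shape[0])
```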
IMPORTANT METHODS OF FACTOR ANALYSIS

(i) the centroid method;
(ii) the principal components method;
(iii) the maximum likelihood method.
(A) Centroid Method of Factor Analysis
- developed by L.L. Thurstone; was quite frequently used until
about 1950, before the advent of large-capacity, high-speed
computers.
- tends to maximize the sum of loadings, disregarding signs; it is
the method which extracts the largest sum of absolute loadings for
each factor in turn.
Note: Positive manifold is the idea that all the variables are
positively correlated.
Ratings of five respondents on eight variables:

Respondent   X1  X2  X3  X4  X5  X6  X7  X8
1             3   6   5   4   7   8   4   7
2             5   7   7   6   5   6   5   8
3             6   9   5   3   4   2   3   9
4             8   6   5   2   6   7   7   6
5             6   7   4   5   2   1   3   8

X1 = Waiting Time, X2 = Cleanliness, X3 = Taste of Food,
X4 = Staff Behavior, X5 = Food Freshness, X6 = Self Service,
X7 = Food Temperature, X8 = Unlimited Foods
Using the centroid method of factor analysis, work out the first
and second centroid factors from the information given above.

We first obtain the first matrix of factor cross products, Q1. To
obtain the second centroid factor B, we then develop the matrix of
residual coefficients (R1) by subtracting Q1 from R, as in the
sketch below.
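A minimal sketch of these steps in Python, assuming unities in the
diagonal of R; the 3x3 correlation matrix is a toy example, not the
book's data:

```python
# First centroid factor, then the residual matrix for factor B.
import numpy as np

R = np.array([[1.00, 0.60, 0.50],
              [0.60, 1.00, 0.40],
              [0.50, 0.40, 1.00]])            # illustrative correlations

loadings_A = R.sum(axis=0) / np.sqrt(R.sum())  # column sums / sqrt(grand total)

Q1 = np.outer(loadings_A, loadings_A)   # first matrix of factor cross products
R1 = R - Q1                             # residual coefficients for factor B
print("factor A loadings:", loadings_A)
print("residual matrix R1:\n", R1)
```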
Illustration 2

Variables    Centroid Factor A2   Centroid Factor B2   Communality h2
1            .480                 .317                 .797
2            .382                 .333                 .715
3            .412                 .291                 .703
4            .411                 .362                 .773
5            .396                 .311                 .707
6            .482                 .397                 .879
7            .461                 .268                 .729
8            .466                 .352                 .818
Eigenvalue   3.490                2.631                6.121


(See also: https://youtu.be/3i6e_m-cyqo)

Note: Common variance is the amount of variance that is shared
among a set of items.
(B) Principal-components Method of Factor Analysis
- developed by H. Hotelling.
- seeks to maximize the sum of squared loadings of each factor
extracted in turn.

Illustration 3
Take the correlation matrix, R, for the eight variables of
Illustration 1 of this chapter and then compute:
(i) the first two principal component factors;
(ii) the communality for each variable on the basis of said two
component factors;
(iii) the proportion of total variance as well as the proportion of
common variance explained by each of the two component factors.
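A sketch of these steps in Python via numpy's eigendecomposition,
using a toy 3x3 correlation matrix (the book's 8x8 matrix R is not
reproduced here):

```python
# Principal-components factor extraction from a correlation matrix.
import numpy as np

R = np.array([[1.00, 0.55, 0.45],
              [0.55, 1.00, 0.35],
              [0.45, 0.35, 1.00]])

eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

loadings = eigvecs[:, :2] * np.sqrt(eigvals[:2])  # first two component factors
h2 = (loadings ** 2).sum(axis=1)                  # communalities
print("proportion of total variance:", eigvals[:2] / eigvals.sum())
print("proportion of common variance:", eigvals[:2] / eigvals[:2].sum())
print("h2:", h2)
```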
All these values can be interpreted in the same manner as stated
earlier.
(C) Maximum Likelihood (ML) Method of Factor Analysis
- estimates the parameters of an assumed probability distribution,
given some observed data.
- this is achieved by maximizing a likelihood function so that,
under the assumed statistical model, the observed data are most
probable.
- probability corresponds to finding the chance of something given
a sample distribution of the data.
- likelihood refers to finding the best distribution of the data
given a particular value of some feature or some situation in the
data.
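One way to try this in code (an assumption, not the author's
procedure): scikit-learn's FactorAnalysis estimates the loading
matrix by maximizing the likelihood under a Gaussian latent-factor
model. Synthetic data below, for illustration only:

```python
# ML factor-analysis sketch: six observed variables, two latent factors.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
F = rng.normal(size=(200, 2))                      # latent factor scores
W = rng.normal(size=(2, 6))                        # true loadings
X = F @ W + rng.normal(scale=0.3, size=(200, 6))   # observed variables

fa = FactorAnalysis(n_components=2).fit(X)
print(fa.components_)        # estimated loadings (factors x variables)
print(fa.noise_variance_)    # unique (error) variance of each variable
```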
ROTATION IN FACTOR ANALYSIS
- minimizes the number of factors needed to explain each variable.
- simplifies the interpretation of the observed variables.

Rotation methods fall into two broad families (a varimax sketch
follows below):

Orthogonal rotations
- produce factors that are uncorrelated; the correlation between
the factors is zero.
- maintain a 90° angle between axes.
- e.g. varimax, quartimax, equimax.

Oblique rotations
- allow the factors to correlate.
- allow the X and Y axes to assume an angle other than 90°.
- e.g. oblimin, promax.
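A compact varimax (orthogonal) rotation sketch in numpy, following
the standard iterative SVD formulation; the loading matrix Phi holds
toy values, and the function is an illustrative implementation
rather than anything from the text:

```python
# Varimax rotation: rotate loadings toward "simple structure".
import numpy as np

def varimax(Phi, gamma=1.0, max_iter=100, tol=1e-6):
    p, k = Phi.shape
    Rot = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = Phi @ Rot
        u, s, vt = np.linalg.svd(
            Phi.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0))))
        Rot = u @ vt                      # best orthogonal rotation so far
        d_old, d = d, s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break                         # criterion stopped improving
    return Phi @ Rot

loadings = np.array([[0.7, 0.3], [0.6, 0.4], [0.2, 0.8], [0.1, 0.7]])
print(varimax(loadings))                  # rotated loadings
```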


R-type and Q-type Factor Analysis

R-type factor analysis: when factors are calculated from the
correlation matrix of variables, it is called R-type factor
analysis.

Q-type factor analysis: when factors are calculated from the
correlations among individual respondents, it is said to be Q-type
factor analysis.
(vi) Cluster Analysis
- consists of methods of classifying variables (or objects) into
clusters: grouping a set of objects in such a way that objects in
the same group are similar to one another. Technically, a cluster
consists of variables that correlate highly with one another and
have comparatively low correlations with variables in other
clusters.
- the basic objective of cluster analysis is to ensure that the
observations are as similar as possible within a group and that the
groups themselves stand apart from one another (see the k-means
sketch below).
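A minimal clustering sketch with k-means in scikit-learn; k-means is
one common clustering method among many, and the two-group data are
synthetic:

```python
# K-means: group observations so each group is internally similar.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.5, size=(25, 2)),
               rng.normal(3, 0.5, size=(25, 2))])  # two well-separated groups

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("memberships:", km.labels_[:5], "...", km.labels_[-5:])
print("group centroids:\n", km.cluster_centers_)
```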
(vii) Multidimensional Scaling (MDS)
- measures an item in more than one dimension at a time.
- the basic assumption is that people perceive a set of objects as
being more or less similar to one another on a number of dimensions
(usually uncorrelated with one another) instead of only one.
Note:
- Principal component analysis (PCA) builds its representation from
correlations among the sample.
- MDS builds its representation from distances among the sample
(see the sketch below).
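A metric-MDS sketch with scikit-learn: recover a low-dimensional map
of objects from their pairwise distances. The "true" positions and
the resulting distance matrix are synthetic:

```python
# MDS from a precomputed dissimilarity matrix.
import numpy as np
from sklearn.manifold import MDS
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(5)
points = rng.normal(size=(10, 4))     # hidden 4-D positions of 10 objects
D = pairwise_distances(points)        # observed dissimilarity matrix

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)         # 2-D configuration of the objects
print(coords.shape, "stress:", round(mds.stress_, 3))
```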
(viii) Latent Structure Analysis
- aims to extract latent factors, express the relationship of
observed (manifest) variables with these factors as their
indicators, and classify a population of respondents into pure
types.
- it explains the correlations among observed variables by making
assumptions about the hidden ('latent') causes of those variables.
- this type of analysis is appropriate when the variables involved
in a study do not possess dependency relationships and happen to be
non-metric.
PATH ANALYSIS
The term 'path analysis' was first introduced by the biologist
Sewall Wright in 1934 in connection with decomposing the total
correlation between any two variables in a causal system.
- used to describe the directed dependencies among a set of
variables.
- the use of the path analysis technique requires the assumption
that there are linear, additive, asymmetric relationships among a
set of variables which can be measured at least on a quasi-interval
scale.
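A small sketch of the idea for a simple causal chain X -> M -> Y:
path coefficients are standardized regression weights, and the
total correlation decomposes along the paths. The chain and data
are made up for illustration:

```python
# Path coefficients as standardized regression weights.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=200)
M = 0.6 * X + rng.normal(scale=0.8, size=200)   # X assumed to cause M
Y = 0.5 * M + rng.normal(scale=0.8, size=200)   # M assumed to cause Y

z = lambda v: (v - v.mean()) / v.std()          # standardize
p_xm = np.linalg.lstsq(z(X).reshape(-1, 1), z(M), rcond=None)[0][0]
p_my = np.linalg.lstsq(z(M).reshape(-1, 1), z(Y), rcond=None)[0][0]

# the total correlation r(X, Y) should approximate the indirect path
print("p(X->M):", round(p_xm, 3), "p(M->Y):", round(p_my, 3))
print("r(X,Y):", round(np.corrcoef(X, Y)[0, 1], 3),
      "product of paths:", round(p_xm * p_my, 3))
```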
CONCLUSION
From the brief account of multivariate techniques
presented above, we may conclude that such
techniques are important for they make it possible to
encompass all the data from an investigation in
one analysis.
They in fact result in a clearer and better account of the
research effort than do the
piecemeal analyses of portions of data. These techniques
yield more realistic probability statements in hypothesis
testing and interval estimation studies.
Multivariate analysis (and consequently the use of multivariate
techniques) is especially important in the behavioural sciences and
applied research, for most such studies involve problems in which
several response variables are observed simultaneously.
The common source of each individual observation generally results
in dependence or correlation among the dimensions, and it is this
feature that distinguishes multivariate data and techniques from
their univariate prototypes.
In spite of all this, multivariate techniques are expensive and
involve laborious computations. As such, their applications in the
context of research studies have been accelerated only with the
advent of high-speed electronic computers since the 1950s.
THANK YOU!❤
