
1 By Gulfam Murtaza & Basharat Javed

Workshop Data Analysis


Contents
SPSS, SEM

1. Missing Data (Why, When, How)

2. EFA (SPSS)

3. CFA (SEM)

4. One Way ANOVA

5. Correlation

6. Regression (Mediation analysis In SEM)

7. Regression (Moderation In SEM )

8. Mode Graph

BY:
Gulfam Murtaza (PHD Scholar) Basharat Javed (PHD Scholar)

CAPITAL UNIVERSITY OF SCIENCE & TECHNOLOGY


MISSING DATA

Capital University of Science & Technology



Why?

If a large share of your data is missing, it can cause several problems:

• Means and other estimates computed from the remaining data may not be accurate.

• SEM (AMOS) cannot run several of its tests on data with missing values.

When?

Less than 10% missing values for a variable or respondent is typically not problematic (I prefer less than 5%).

How?

Option 1: Use only valid data. No imputation, just use valid cases or variables

Option 2: Use known replacement values. Match the missing value with the value of a similar case.

Option 3: Use calculated replacement values. The mean or the median is the usual choice.

Checking Missing Values


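1. Analyze -> Descriptive Statistics -> Frequencies
2. Move all the variables into the box
3. Uncheck all options
4. Click OK; the Statistics table in the output lists the number of valid and missing values for each variable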

Treatment
1. Transform -> Replace Missing Values (a box opens)

2. Move a variable containing missing values into the box

3. Give the variable its original name in the “Name” box

4. In the “Method” box, select the median for categorical variables and the mean for continuous variables
Note: the “Values” and “Measure” settings of all variables or items that contained missing data should be re-adjusted in Variable View.
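Outside SPSS, the same check and treatment can be sketched in a few lines of Python. This is only a minimal illustration, assuming missing answers are stored as `None`; the function names `missing_share` and `impute` and the sample values are made up for this example and are not part of the workshop data.

```python
from statistics import mean, median


def missing_share(values):
    """Return the fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def impute(values, method="mean"):
    """Replace None with the mean or the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if method == "mean" else median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical responses; None marks a missing answer.
income = [40.0, None, 55.0, 61.0, 44.0]       # continuous -> mean
rating = [3, 4, None, 5, 4, 2, 4, 3, 5, 4]     # categorical (ordinal) -> median

print("share missing (income):", missing_share(income))
print("share missing (rating):", missing_share(rating))
print(impute(income, "mean"))
print(impute(rating, "median"))
```

Comparing `missing_share` with the 10% (or 5%) rule of thumb above tells you whether replacement is reasonable for that variable at all.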
Best Method – Prevention!

• Shorter surveys (pre-testing critical!)


• Easy to understand and to answer survey items (pre-testing critical)

FACTOR ANALYSIS
1. Exploratory Factor Analysis

2. Confirmatory Factor Analysis

Exploratory Factor Analysis (EFA)


• EFA identifies the structure (factors or dimensions) of a variable.

• In a model, factor analysis identifies the items loading on their respective variables (convergent and discriminant validity).

• Factor analysis confirms the existing factors of a variable.

• Factor analysis explores the inter-relationships among variables to discover whether those variables can be grouped into a smaller set of underlying factors.

Method:

1. Analyze > Dimension Reduction > Factor (a box opens)

2. Move all items of all the variables into the “Variables” box (demographics are excluded)

3. Under the “Descriptives” button, check the “Initial solution”, “Reproduced”, and “KMO” options

4. Under the “Extraction” button, select “Maximum likelihood” and check “Scree plot”

Note: to force a fixed number of factors, select the “Fixed number of factors” option.

5. Click the “Rotation” button and check “Promax”

6. Under “Options”, check “Suppress small coefficients” and set the value to .3

7. Continue > OK

Adequacy

1. KMO 0.5 to 0.9 or above


2. Bartlett’s Test of Sphericity: p < 0.05 (significant)

3. Cumulative % of variance > 50 (ideally > 60)

4. Goodness of fit: Chi-square/df < 3 (p > 0.05 if possible)

5. Non-redundant residuals ≤ 5%

6. Correlations (in the factor correlation matrix) between 0.3 and 0.7

Two statistics in the SPSS output let you check some of the basic assumptions.

• Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy

It indicates whether the variables can be grouped into a smaller set of underlying factors.

• Bartlett’s Test of Sphericity

It checks the inter-correlation among the variables (items); the test should be significant (p < 0.05).
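To make clear what Bartlett’s test measures, the sketch below computes its chi-square statistic from a correlation matrix R using the standard formula chi-square = -[(n - 1) - (2p + 5)/6] ln|R| with df = p(p - 1)/2, where n is the sample size and p the number of items. It is only an illustration: the correlation matrix and the sample size are hypothetical, and the p-value that SPSS reports (which needs the chi-square distribution) is not computed here.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi_square, df) for Bartlett's test of sphericity."""
    p = len(corr)
    chi_square = -((n_obs - 1) - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_square, df


# Hypothetical correlation matrix of three items and a hypothetical sample size.
R = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
chi_square, df = bartlett_sphericity(R, n_obs=200)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

The more strongly the items correlate, the smaller the determinant of R and the larger the statistic; for completely uncorrelated items (an identity matrix) the statistic is zero.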


AMOS
1. Open AMOS from SPSS

Analyze > IBM SPSS AMOS

When AMOS is opened from SPSS, the data file (“combine 3” in our example) is automatically linked with AMOS. If AMOS is opened from its shortcut instead, the data file has to be linked manually, as follows:

i. Start AMOS from its shortcut

ii. Click the “Select data file” button in the toolbox > select your data file

CONFIRMATORY FACTOR ANALYSIS (CFA)


CFA is used to check discriminant validity, i.e., whether respondents perceived the modeled constructs (variables) as distinct from one another. For this purpose, a series of CFAs is conducted to gauge model fit. In our example, the model has four variables, as shown in the figure below:
[Figure: the four variables of the model: Abusive Supervision (AS), Psychological Distress (PD), Psychological Safety Climate (PS), and Safety Outcomes (SO)]

Drawing 4 Factor Model

1. Abusive supervision

1. Select the Draw tool from the toolbox and draw an oval for the latent variable; add its observed variables by clicking inside the oval with the same tool (5 times in our case).

2. Take the Select tool from the toolbox and double-click the oval to name the variable; a box opens, type the name (Abusive supervision).


3. To name the error terms, click Plugins > Name Unobserved Variables.

4. To load items into the observed variables, take the List Variables tool from the toolbox; it shows the list of all items. Select AS1 to AS5 and drop them one by one into the rectangles.

5. To move a complete variable, select two tools from the toolbox, the Move Objects tool and the Preserve Symmetries tool; now you can move the variable on the canvas.

6. To rotate the variable, use the Rotate tool; this helps when arranging many variables on the canvas.

7. To change the direction of the indicators, use the Reflect the Indicators tool.

8. Draw all the variables, place them in sequence, and connect them with the Covariance tool as shown in the figure.

9. Save the file with the name “cfa initial”.

10. For the analysis, click the Analysis Properties tool in the toolbox; it shows a dialogue box. Do the following:

i. Select the Output tab

ii. Check Standardized estimates

iii. Check Squared multiple correlations

iv. Check Modification indices



FIG-1


v. Close the box

11. Click the Calculate Estimates button in the toolbox.

12. To see the output, click the View Text button.

In the output file, two types of estimates are important:

o Model Fit Indices

o Modification Indices

The model fit indices are listed below; their cutoff values appear in the table that follows the modification indices.
• Goodness of Fit Index (GFI)
• Adjusted Goodness of Fit Index (AGFI)
• Normed Fit Index (NFI)
• Comparative Fit Index (CFI)
• Incremental Fit Index (IFI)
• Tucker Lewis Index (TLI)
• Root Mean Square Error of Approximation (RMSEA)
• Root Mean Square Residual (RMR)

The initial solution is close to the cutoff values (FIG-1). Model fit can be improved by covarying the error terms that show the largest modification indices (FIG-2). The final solution is given below:


MODIFICATION INDICES

                      M.I.      Par Change
e23 <--> SaftOut      4.368     -0.047
e23 <--> PsyDistt     9.618     0.062
e23 <--> e24          38.354    0.139
e22 <--> e24          11.486    -0.058
e21 <--> e23          23.196    -0.084
e21 <--> e22          11.689    0.045
e20 <--> AbuSup       9.874     0.1
e19 <--> PsySafty     7.683     -0.093
e19 <--> e23          4.038     0.05
e18 <--> SaftOut      6.255     -0.055

Measurement models          CMIN/DF   CFI     GFI     AGFI    RMR     NFI     IFI     TLI     RMSEA
Threshold values            <3        >0.95   >.95    >0.8    <0.09   >0.9    >0.9    >0.9    .05-0.1
Initial 4 factor solution   3.081     0.90    0.82    0.78    0.11    0.91    0.95    0.94    0.08
Final 4 factor solution     2.05      0.95    0.88    0.85    0.10    0.92    0.95    0.95    0.06

Most of the estimates lie within the threshold values, so model fit is good, but it still has to be compared with alternative models, for example 3 factor, 2 factor, and 1 factor models.
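Checking a row of fit indices against the thresholds can also be done mechanically. The sketch below copies the threshold row and the two 4 factor rows from the table above. Two choices are assumptions of this example rather than rules from the workshop: the bounds are treated as inclusive (so a CFI of exactly 0.95 passes), and the RMSEA range “.05-0.1” is read as an upper limit of 0.10.

```python
# Cut-offs from the "Threshold values" row. Inclusive bounds and the
# RMSEA upper limit of 0.10 are assumptions made for this sketch.
THRESHOLDS = {
    "CMIN/DF": lambda v: v <= 3,
    "CFI": lambda v: v >= 0.95,
    "GFI": lambda v: v >= 0.95,
    "AGFI": lambda v: v >= 0.80,
    "RMR": lambda v: v <= 0.09,
    "NFI": lambda v: v >= 0.90,
    "IFI": lambda v: v >= 0.90,
    "TLI": lambda v: v >= 0.90,
    "RMSEA": lambda v: v <= 0.10,
}

# Values copied from the measurement-model table.
initial = {"CMIN/DF": 3.081, "CFI": 0.90, "GFI": 0.82, "AGFI": 0.78,
           "RMR": 0.11, "NFI": 0.91, "IFI": 0.95, "TLI": 0.94, "RMSEA": 0.08}
final = {"CMIN/DF": 2.05, "CFI": 0.95, "GFI": 0.88, "AGFI": 0.85,
         "RMR": 0.10, "NFI": 0.92, "IFI": 0.95, "TLI": 0.95, "RMSEA": 0.06}


def failing_indices(fit):
    """Return the names of the indices that miss their threshold."""
    return [name for name, ok in THRESHOLDS.items() if not ok(fit[name])]


print("initial:", failing_indices(initial))
print("final:  ", failing_indices(final))
```

Under these assumptions the initial solution misses five thresholds and the final solution misses only GFI and RMR, which matches the statement that most estimates lie within the threshold values.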

There are many ways to combine the variables into different sets. The models to compare are as follows:

4 FACTOR MODEL (as presented above)

3 FACTOR MODELS

2 FACTOR MODELS

1 FACTOR MODEL

FIG-2

4 factor model: 1. Abusive Supervision (AS), 2. Psychological Distress (PD), 3. Psychological Safety Climate (PS), 4. Safety Outcomes (SO)


3 Factor Model

There are many ways to combine the variables into different sets. The combinations are as follows:

NOTE: always assign a regression weight to the newly merged variables; otherwise SEM will not calculate the estimates.

3 factor models

1. (i) Abusive Supervision (AS) + Psychological Distress (PD) (ii) Psychological Safety Climate (PS) (iii) Safety Outcomes (SO), i.e., (i) AS-PD (ii) PS (iii) SO

2. (i) AS-PS (ii) PD (iii) SO

3. (i) AS-SO (ii) PD (iii) PS

4. (i) PD-PS (ii) AS (iii) SO

5. (i) PD-SO (ii) AS (iii) PS

6. (i) PS-SO (ii) AS (iii) PD

2 factor models

1. (i) AS-PD (ii) PS-SO

2. (i) AS-PS (ii) PD-SO

3. (i) AS-SO (ii) PD-PS

1 factor model

All the items of all variables load on a single factor.
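The lists of alternative models above follow a simple pattern: every 3 factor model merges one pair of constructs, and every 2 factor model listed merges two pairs. A short sketch with `itertools.combinations` reproduces them, which is handy for making sure no combination is forgotten when there are more constructs. Only the four construct abbreviations come from the workshop; the variable names are chosen for this example.

```python
from itertools import combinations

constructs = ["AS", "PD", "PS", "SO"]

# 3 factor models: merge one pair, keep the other two constructs separate.
three_factor = []
for pair in combinations(constructs, 2):
    rest = [c for c in constructs if c not in pair]
    three_factor.append(["-".join(pair)] + rest)

# 2 factor models made of two merged pairs: fixing "AS" in the first
# pair avoids listing the same split twice.
two_factor = []
for partner in constructs[1:]:
    rest = [c for c in constructs if c not in ("AS", partner)]
    two_factor.append(["AS-" + partner, "-".join(rest)])

# 1 factor model: everything on a single factor.
one_factor = ["-".join(constructs)]

for model in three_factor + two_factor + [one_factor]:
    print(len(model), "factor:", ", ".join(model))
```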

The results of these different models are summarized in a table with a few estimates (in the table below we present almost all of them). If our proposed model (here the 4 factor model) fits better than the other models, this strengthens the argument for the fit of our model.


Measurement models          CMIN/DF   CFI     GFI     AGFI    NFI     IFI     TLI     RMSEA
Threshold values            <3        >0.95   >.95    >0.8    >0.9    >0.9    >0.9    .05-0.1
Initial 4 factor solution   3.08      0.90    0.82    0.78    0.91    0.95    0.94    0.08
Final 4 factor solution     2.05      0.95    0.88    0.85    0.92    0.95    0.95    0.06
3Factor AS-PD               4.47      .83     .71     .65     .79     .83     .81     .10
3Factor AS-PS               4.80      .81     .71     .65     .78     .82     .79     .11
3Factor AS-SO               7.63      .67     .57     .48     .65     .68     .64     .15
3Factor PD-PS               8.48      .63     .42     .38     .60     .63     .59     .16
3Factor PS-SO               5.0       .80     .67     .60     .76     .80     .78     .11
2Factor AS-PD, PS-SO        8.91      .60     .53     .44     .58     .61     .57     .16
1Factor model               13.39     .39     .38     .26     .37     .39     .33     .20

Fig. 3. A few examples of combined-factor models

From the above table, the model fit indices clearly show that the 4 factor model fits better than the other models.


After CFA, a clear picture of the variables has emerged. It is now the right time to run the reliability test.

Reliability Test (SPSS)


Cronbach’s alpha is used for the reliability test, and it is computed in SPSS.

Method:

1. Analyze > Scale > Reliability Analysis (a box opens)

2. Move the items of a particular variable into the “Items” box
3. Under the “Statistics” button, check “Scale if item deleted”
4. In “Scale label”, give the name of the variable
5. Click OK (repeat the process for all variables)

Note: Cronbach’s alpha is reported in the “Reliability Statistics” table of the output. If the alpha of any variable is smaller than 0.7, the last column of the “Item-Total Statistics” table shows which item could be deleted to improve reliability and what alpha would become after the deletion. Deleting an item to improve reliability is at the researcher’s discretion.

Sr#   Scale                          Cronbach’s alpha   No. of Items
1     Abusive Supervision            0.793              5
2     Psychological Distress         0.947              7
3     Psychological Safety Climate   0.941              7
4     Safety Outcomes                0.912              5
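For readers who want to see what SPSS computes here, Cronbach’s alpha for k items is alpha = k/(k - 1) × (1 - sum of the item variances / variance of the total score). The sketch below implements that formula together with the “alpha if item deleted” column. The item scores are hypothetical and are not the workshop data; the function names are chosen for this example.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of item columns, each a list of respondent scores."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn (needs 3+ items)."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]


# Hypothetical scores of six respondents on four items of one scale.
scale = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 2, 5, 3],
    [5, 3, 4, 1, 4, 2],
    [2, 5, 3, 4, 1, 3],   # an item that runs against the others
]

print("alpha:", round(cronbach_alpha(scale), 3))
for number, value in enumerate(alpha_if_item_deleted(scale), start=1):
    print(f"alpha if item {number} deleted: {value:.3f}")
```

An item whose removal raises alpha noticeably is the kind of item the “Item-Total Statistics” table points to.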

Composite Variables (Mean Calculation) (SPSS)

Reliability is good, so our data is ready for regression analysis. Before regression, however, we have to calculate the composite (mean) variables. In SPSS, the Transform > Compute Variable command is used for this. The main point to remember is that only those items that were retained during EFA and CFA are included.


Centering the Variables for Interaction Term (SPSS)

First calculate the means of the variables needed for the interaction term. In our case, Psychological Safety Climate (PS) is the moderator and Abusive Supervision (AS) is the independent variable, so the means of these two variables are calculated for centering. Method for the mean calculation:

1. Analyze > Descriptive Statistics > Frequencies

2. Statistics > Mean

3. OK

Statistics

AS PS

Mean 2.7190 3.4490

For centering
1. Transform > Compute Variable
2. Move the composite variable (e.g., AS) into the box on the right
3. Name the new variable (e.g., ASC)
4. Type a minus sign followed by the calculated mean (e.g., AS - 2.7190) and click OK (repeat for PS)
For Interaction Term
1. Transform > Compute Variable
2. Name the variable (e.g., ASCxPSC)
3. Move ASC into the box, type the * sign, then move PSC
4. Click OK
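The three Compute Variable steps (composite mean, centering, interaction term) amount to simple arithmetic, sketched below in Python. The scores are hypothetical, so the means differ from the 2.7190 and 3.4490 reported above; only the variable names AS, PS, ASC, PSC and ASCxPSC follow the workshop.

```python
from statistics import mean


def composite(item_columns):
    """Row-wise mean of the items that make up one scale."""
    return [mean(row) for row in zip(*item_columns)]


def center(values):
    """Subtract the sample mean from every value."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical composite scores of four respondents.
AS = [2.0, 3.0, 4.0, 3.0]
PS = [3.5, 2.5, 4.5, 3.5]

ASC = center(AS)                              # centered independent variable
PSC = center(PS)                              # centered moderator
ASCxPSC = [a * p for a, p in zip(ASC, PSC)]   # interaction term

print("ASC:", ASC)
print("PSC:", PSC)
print("ASCxPSC:", ASCxPSC)
```

After centering, each variable has a mean of zero, which is why the product term is built from ASC and PSC rather than from the raw composites.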

One way ANOVA test (SPSS)

This test checks whether the dependent variable differs significantly across the categories of each demographic variable. If the result for a demographic variable is significant, that variable has to be carried into the subsequent correlation and regression analyses.


Method

1. Analyze > Compare Means > One-Way ANOVA

2. Move the dependent variable into the “Dependent List” box and the categorical/demographic variables into the “Factor” box one at a time, noting the result after each run

3. OK

Note: For the categorical variable Gender, the SPSS result is shown in the first table below. Perform the test for all demographic variables. The results for all variables are summarized in the second table.

ANOVA
SO

                  Sum of Squares   df    Mean Square   F      Sig.
Between Groups    .527             1     .527          .764   .383
Within Groups     201.444          292   .690
Total             201.971          293

Sno.   DEMOGRAPHICS     Mean square   F       Sig.
1      GENDER           0.527         0.764   0.383
2      AGE              3.818         5.910   0.000
3      QUALIFICATION    1.073         1.568   0.183
4      EXPERIENCE       1.279         1.833   0.097

The second table shows that only AGE is significant in One Way ANOVA. So we shall carry
only age in our correlation and regression analysis.
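The F value in the ANOVA table is the between-groups mean square divided by the within-groups mean square. The sketch below shows that calculation from raw groups and then repeats the arithmetic with the Gender figures from the first table. The raw scores are hypothetical, and the significance value is not computed because it needs the F distribution.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (ss_between, df_between, ss_within, df_within, F)."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, df_between, ss_within, df_within, f_stat


# Hypothetical SO scores for two groups.
print(one_way_anova([[1, 2, 3], [2, 3, 4]]))

# Arithmetic check with the Gender table:
# F = (0.527 / 1) / (201.444 / 292)
f_gender = (0.527 / 1) / (201.444 / 292)
print("F for Gender:", round(f_gender, 3))
```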

Correlation Analysis (SPSS)

This analysis examines the covariation between two variables. It shows whether two variables are positively or negatively associated with one another. A positive correlation indicates that both variables move in the same direction, i.e., if one variable increases, the other also increases. A negative correlation means that if one variable increases, the other decreases; it is indicated by a negative sign on the coefficient. It is important to note that a correlation test only describes the association between variables, not a causal effect.

As for magnitude, the coefficient should lie between 0.3 and 0.7 in absolute value. A smaller value shows low correlation between the variables, and a larger value indicates a multicollinearity issue among the IVs.

Method

1. Analyze > Correlate > Bivariate

2. Move the composite variables into the box along with the demographic variables (in our case only AGE)

3. Select Pearson and hit OK

       AGE        ASC       PD         PSC    SO
AGE    1
ASC    -0.38***   1
PD     -0.29***   0.21***   1
PSC    -0.06      0.16***   -0.19***   1
SO     -0.25***   0.07      0.62***    0.05   1

* p < .05, ** p < .01, *** p < .001

Reasonable Correlation is witnessed here. The next step is Regression.
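The Pearson coefficient that SPSS reports is the covariation of two variables divided by the product of their spreads. A minimal sketch follows, with a small helper that applies the 0.3 to 0.7 rule of thumb from this section. The scores are hypothetical, and the helper name and its labels are chosen for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def magnitude_label(r, low=0.3, high=0.7):
    """Classify |r| with the 0.3 to 0.7 rule of thumb used in these notes."""
    size = abs(r)
    if size < low:
        return "low"
    if size > high:
        return "possible multicollinearity"
    return "reasonable"


# Hypothetical composite scores of six respondents.
PD = [2.0, 3.5, 4.0, 1.5, 3.0, 4.5]
SO = [2.5, 3.0, 4.5, 2.0, 2.5, 4.0]

r = pearson_r(PD, SO)
print(round(r, 2), magnitude_label(r))
```

Applied to the matrix above, 0.62 (PD with SO) falls in the reasonable band, while 0.07 (ASC with SO) counts as low.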

REGRESSION (SEM)

1. Open AMOS from Analyze > IBM SPSS AMOS

2. Link the AMOS file with the SPSS data file (procedure explained above)

3. Change the page orientation from View > Interface Properties > Landscape > Apply

4. Click the List Variables button in the toolbox; it shows the list of variables. Drag and drop the composite variables onto the sheet.

5. Drop the demographics that were significant in the ANOVA (e.g., AGE).


6. Position the variables on the AMOS canvas as shown in the figure below (use the Move tool).

7. Resize all variables (Plugins > Resize Observed Variables).

8. For regression, add error terms to all DVs and mediators (select the Add a Unique Variable tool and click on PD and SO).

9. Join the IVs and the interaction term with the Covariance tool.

10. Join the IVs with the mediator and the DV using the Draw Path tool.

11. Join the demographic variable (AGE) with the DV (SO) using the Draw Path tool.

12. Name the error terms (Plugins > Name Unobserved Variables).

13. Save the file.

14. To set the output, click the Analysis Properties tool >

a. Estimation > Maximum likelihood

b. Output > Standardized estimates > 1. Modification indices, 2. Indirect, direct and total effects

c. Bootstrap > 1. Perform bootstrap (2000), 2. Percentile confidence intervals (95) OR Bias-corrected confidence intervals (95), preferred


15. Click the Calculate Estimates tool in the toolbox > Proceed with the analysis; AMOS calculates the estimates.

16. Click View Text in the toolbox; it shows the output file.

Path        Without mediator     With mediator   Bootstrap
AS-PD-SO    -0.02 (0.73) (NS)    -0.11 (0.01)    Sig (Indirect)

17. For the bootstrap result, in the output click Estimates > Matrices > Standardized Indirect Effects, and then Estimates/Bootstrap > Bootstrap Confidence > two-tailed significance.

The output only shows whether the indirect effect is significant. In our case the value is 0.004, which is significant, i.e., mediation exists. Without the mediator there is no effect of AS on SO; the only effect operates through the mediator.
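The idea behind the bootstrap test of the indirect effect can be sketched without AMOS. The indirect effect is the product a × b, where a is the slope of the mediator on the independent variable and b is the coefficient of the mediator when the dependent variable is regressed on both. Resampling respondents with replacement and recomputing a × b many times gives a confidence interval; if it excludes zero, the indirect effect is significant. This sketch uses the simpler percentile interval rather than the bias-corrected one preferred above, works on simulated data (not the workshop data), and all function names are chosen for this example.

```python
import random


def _cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def indirect_effect(x, m, y):
    """a*b: a from M ~ X, b from Y ~ X + M (ordinary least squares)."""
    sxx, smm = _cross(x, x), _cross(m, m)
    sxm, sxy, smy = _cross(x, m), _cross(x, y), _cross(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Simulated data in which X affects Y only through M.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(100)]
m = [0.8 * xi + rng.gauss(0, 0.5) for xi in x]
y = [0.7 * mi + rng.gauss(0, 0.5) for mi in m]

low, high = bootstrap_ci(x, m, y)
print("indirect effect:", round(indirect_effect(x, m, y), 3))
print("95% percentile CI:", round(low, 3), "to", round(high, 3))
```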

Grouping
Groups can be created for moderation or mediation analysis, e.g., when we want separate results for males and females or for different age groups.

• In the pane on the left side, click on “Group name”.

• In the box, type the name of the group, e.g., “male”; then click the New button, type “female”, and close the box.

• Click the “Select data file” button in the toolbox and select the data file for both male and female through the “File Name” button.

• To attach the grouping variable to both groups, click on each group and then use the “Grouping Variable” button to attach the grouping variable (e.g., gender) to both male and female.

• To assign a value to each group, click on “male” and double-click “1” in the box, then “2” for female. Hit OK.
