Structural Equation Modeling and Natural Systems
JAMES B. GRACE
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press has no responsibility for the persistence or accuracy of URLs
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
To my wife Peggy,
for her joyous spirit, wisdom, and laughter.
Preface
Acknowledgments
PART I  A BEGINNING
1 Introduction
2 An example model with observed variables
This book is about an approach to scientific research that seeks to look at the
system instead of the individual processes. In this book I share with the reader
my perspective on the study of complex relationships. The methodological
framework I use in this enterprise is structural equation modeling. For many
readers, this will be new and unfamiliar. Some of the new ideas relate to statistical methodology and some relate to research philosophy. Readers already familiar with the topic will find in this volume some new examples, and even some new approaches they may find useful. In my own experience, the approaches and methods described in this book have been very valuable to me as a scientist. It is my assessment that they have allowed me
to develop deeper insights into the relationships between ecological pattern
and process. Most importantly, they have given me a framework for studying
ecological systems that helps me to avoid getting lost in the detail, without
requiring me to ignore the very real complexities. It is my opinion, after some
years of careful consideration, that potentially they represent the means to a
revolutionary change in scientific inquiry; one that allows us to ask questions
of interacting systems that we have not been able to ask before. These methods
provide many new opportunities for science, I believe, and it is my hope that
others will see their value as well.
It is important for the reader to keep in mind throughout this book the distinc-
tion between statistical procedures and the scientific enterprise. The application
of structural equation modeling (SEM) to research questions embodies both ele-
ments, but the priorities of one do not necessarily match with those of the other.
My approach to this book is from the perspective of a researcher, not a statis-
tician. My treatment is not designed to satisfy the requirements of statisticians
nor those interested in the mathematics. Rather, I strive to keep the focus on
developing models that match the questions being addressed. Many treatments
of statistical methods are prescriptive and based on protocols that have been
worked out on the basis of statistical requirements. While I could simply present
SEM protocols for use by the natural scientist, I am of the opinion that pro-
tocols are commonly an impediment to the best use of statistical methods for
research purposes (see also Abelson 1995). For this reason, my emphasis is on
fundamental issues that provide the reader with the material to make their own
decisions about how to apply statistical modeling to their particular research
problems.
The study of complex relationships in general, and SEM in particular, is an arena where subject matter and statistical analysis intertwine to a greater degree than is customary for researchers or statisticians. What is distinctively
different about the study of complex, multivariate relationships compared with
univariate hypothesis testing is the degree to which the analyst has to know both
the subtleties of the methods and the particulars of the system being studied.
The goal of this book is to show why it can be worth the effort to develop and
evaluate multivariate models, not just for statistical reasons, but because of the
added scientific insights that can be gained. Those who apply these methods
to their own data may find, as I have, that it is quite enjoyable. Hopefully the
reasons for excitement will be evident as the reader explores the chapters ahead.
Acknowledgments
I have a great many people to thank for helping me along the way in this major
venture. I thank Alan Crowden, Dominic Lewis, Emma Pearce, and Graham
Bliss of Cambridge University Press for their support with this project. If this
book is at all readable, it is because of the help of a great many individuals who
provided numerous comments. I especially thank Glenn Guntenspergen, who
not only read the whole volume in both first and last drafts, but who provided
sage advice from beginning to end. To him, I owe the greatest debt of gratitude.
I also wish to express special thanks to Sam Scheiner for many insightful
suggestions on both content and presentation, as well as for advising me on
the best way to present an illustration of SEM practice in the Appendix. I am
appreciative of the USGS National Wetlands Research Center for their history
of supporting the application of SEM to natural systems. Several examples in
this book came from their studies.
Two of my comrades in the quest to introduce SEM to the natural sciences,
Bill Shipley and Bruce Pugesek, both provided very helpful comments. It was
Bill who convinced me that the first draft of this work was far too condensed to
be useful to the student. The readers owe him a great debt of gratitude for the
final structure of this book, which attempts to lead one gradually through the
fundamentals of SEM in the beginning, in order to establish a base from which
to jump into more advanced issues later. Bruce is especially to be thanked for
introducing me to SEM and for working patiently with me through the initial
learning process.
I am also thankful for the training and advice I received from some of the leg-
endary figures in the development of structural equation modeling. My biggest
debt of gratitude is to Ken Bollen, who saved me from several fundamental
errors in the early development of many of my ideas. Ken possesses the rare
combination of being brilliant, patient, and kind, which has been enormously
helpful to me as I have struggled to make the statistical methodology fulfill my
research ambitions. I am also grateful to Karl Jöreskog and Dag Sörbom for their
week-long workshop on SEM early in my learning and for answering my many
pesky questions about all those complications about which instructors hope to
avoid being asked. Bengt and Linda Muthén likewise shared lifetimes of experi-
ence with me in another week-long SEM workshop, again with tolerance for my
questions about fringe methods and thorny problems. Others who helped me
greatly through my involvement in their training classes include Bill Black
(LSU), Adrian Tomer (Shippensburg University), Alex von Eye (Michigan
State University), and most recently David Draper (University of California –
Santa Cruz).
There are many other people who provided helpful comments on all or part
of the book manuscript, including Jon Keeley, Diane Larson, Bruce McCune,
Randy Mitchell, Craig Loehle, Dan Laughlin, Brett Gilbert, Kris Metzger, Gary
Ervin, Evan Weiher, Tim Wootton, Janene Lichtenberg, Michael Johnson, Laura
Gough, Wylie Barrow, Valerie Kelly, Chris Clark, Elsa Cleland, and Ed Rigdon.
I apologize to any whom I have left off the list; the process has gone on long enough to make it hard to keep track. To all who helped, thank you!
Last and certainly not least are the people who have provided the encourage-
ment and support in all those more personal ways that are essential to a great
and productive life. My deepest gratitude to my loving wife Peggy, who has
enhanced my quality of life in every way and who led me into a better life. To
my Mother and my Sister Diane, I am unspeakably grateful for all the years
of love and support. To Jeremy, Kris, Zach, Abi, Erica, Madison, Sophie, and
Luke, your acceptance and love mean more to me than you know. To Bob
Wetzel, I am grateful for his encouragement over all these many years.
PART I
A beginning

1 Introduction
social sciences where it has gained a widespread application, may or may not
suit our needs. Further, ways of connecting structural equation models with the
broader scientific process are needed if we are to gain the maximum impact
from our models and analyses. All these issues need to be addressed if SEM is
to be applied properly and is to have utility for advancing the study of natural
systems.
Before I can discuss fully the potential contributions of SEM to the study of
natural systems, we must first have a fairly clear understanding of the principles
and practice of SEM. What is its history? What are the underlying statistical
principles? How are results from SEM applications to be interpreted? What
are the choices to be made and steps to be performed? After such questions
have been addressed and the nature of SEM is clear, I will consider its broader
significance for the study of natural systems.
To start us on common ground, I begin this chapter with a brief discussion
of material that should be familiar to the majority of readers, classic univariate
null hypothesis testing. This material will provide a point of comparison for
explaining SEM. In this chapter, I will present only a brief and simplistic char-
acterization of SEM, primarily from an historic perspective, while in Chapter
2, I will present an example from an application of SEM to give the reader a
tangible illustration.
In Chapter 3, I begin to present some of the fundamental principles of struc-
tural equation models, emphasizing their reliance on the fundamental princi-
ples of regression. This coverage of basic topics continues through Chapters 4
(latent variables) and 5 (estimation and model evaluation). In Chapter 6, I spend
some time presenting a more advanced topic, composite variables, for the dual
purposes of illustrating this important capability and also to help clarify the role
of latent variables. Chapter 7 provides a very superficial overview of some of
the more advanced capabilities of SEM.
Chapters 8 to 11 will emphasize examples of ecological applications to
give the reader more of a sense of how the principles can be applied to nat-
ural systems. Throughout this section of material, I will contrast the kinds of
results obtained from SEM with those obtained from the conventional scien-
tific methods that have guided (and limited) the natural sciences up to this
point. Experience suggests that such comparisons are often the most effective
means of conveying the potential that SEM has to transform the study of natural
systems. This section of chapters will include an illustration of the sustained
application of SEM to an ecological problem, the understanding of patterns of
plant diversity (in Chapter 10). In Chapter 11, I provide a summary of some
cautions as well as a set of recommendations relating to the application of SEM
so as to provide all of this practical advice in one place.
In the final section of the book (Chapters 12 and 13), it will be my purpose
to give an overall view of the implications of applying SEM to the natural
sciences. I will discuss from a philosophical point of view some of the things
that I believe have limited the advance of ecological science, and how SEM
can lead to a greater maturation of our theories and investigations. Included in
this last section will be a discussion of how to integrate SEM into the broader
scientific enterprise. Finally, an Appendix provides example applications that
illustrate some of the mechanics and logic associated with SEM. The reader
will be directed to these examples at appropriate places throughout the book.
y1 = α1 + γ11 x1 + ζ1 (1.1)
the point that SEM practitioners long ago rejected the logical priority of null
hypotheses, though the use of p-values continues to be one of the tools used in
model evaluation.
It is perhaps useful to note that null hypothesis testing has recently been under
attack from several quarters. Biologists have begun to argue more vigorously for
a departure from reliance on null hypothesis testing (Anderson et al. 2000). The
lack of utility of null hypothesis testing has led to recommendations for the use
of model selection procedures as an alternative basis for developing inferences.
A tenacious effort to expose ecologists to this approach (e.g., Burnham and
Anderson 2002) has begun to bring familiarity with these issues to many. At
present, these efforts remain focused on univariate models and have not yet
tapped into the substantial experiential base of SEM practitioners.
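The model-selection alternative mentioned above can be made concrete. The following is a minimal sketch (with simulated, hypothetical data, not an example from this book) of comparing candidate regression models by AIC, which rewards fit but charges a penalty of two per estimated parameter:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data: y truly depends on x1 only; x2 is an irrelevant candidate
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def aic_ols(X, y):
    """AIC for an OLS model with Gaussian errors (k = coefficients + error variance)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)  # maximum-likelihood error variance
    loglik = -0.5 * len(y) * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return 2 * (X.shape[1] + 1) - 2 * loglik

ones = np.ones(n)
aic_m0 = aic_ols(ones[:, None], y)                    # intercept only
aic_m1 = aic_ols(np.column_stack([ones, x1]), y)      # y ~ x1
aic_m2 = aic_ols(np.column_stack([ones, x1, x2]), y)  # y ~ x1 + x2
print(aic_m0, aic_m1, aic_m2)
```

The intercept-only model is heavily penalized by its poor fit, while the model carrying the irrelevant x2 pays a small complexity penalty relative to the simpler model; inference rests on the comparison rather than on a null hypothesis.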
Bayesian methods for estimating parameters and probabilities (Congdon
2001, Gelman et al. 2004) also suggest alternatives to null hypothesis testing.
While there are a number of different variants of the Bayesian procedure, the basic concept is that probability, viewed from a personal standpoint, is based on the knowledge available to the investigator. In this
framework, empirical evidence is used to update prior probabilities so as to
generate posterior probability estimates. This form of probability assessment is
preferred by many because it corresponds more directly to the intuitive meaning
of probability as a measure of confidence in a result. As will be discussed in
the final chapter, Bayesian approaches are now being considered for use in
estimation and the evaluation of structural equation models.
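The prior-to-posterior updating described here can be illustrated with the simplest conjugate case, a Beta prior updated by binomial data. This sketch is generic (it is not an SEM computation) and the numbers are hypothetical:

```python
# Conjugate Beta-Binomial updating: a prior belief about a probability p is
# expressed as Beta(a, b); observing k successes in n trials yields the
# posterior Beta(a + k, b + n - k).

def update_beta(a, b, successes, trials):
    """Return posterior Beta parameters after observing binomial data."""
    return a + successes, b + (trials - successes)

# Vague prior Beta(1, 1) (uniform on [0, 1]); then observe 7 successes in 10 trials.
a_post, b_post = update_beta(1.0, 1.0, 7, 10)
posterior_mean = a_post / (a_post + b_post)
print(posterior_mean)  # 8 / 12 ≈ 0.667
```

The posterior mean (about 0.667) sits between the prior mean (0.5) and the sample proportion (0.7), showing how the evidence updates, rather than replaces, the investigator's prior knowledge.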
A difficulty for some may arise from the fact that our definition of structural
includes the word causal. The average person who is neither a scientist nor
a philosopher may be rather surprised to find that scientists and philosophers
have historically had some difficulty with the concept of causation. Because of
the unease some have with discussing causality, the relationships embodied in
structural equations are sometimes referred to as dependencies instead of causes
(e.g., “the values of y depend on the values of x”). Thus, an alternative definition
that is sometimes seen for structural equations is that they represent statistical
dependencies (or statistical associations) that are subject to causal interpreta-
tion. What is ultimately most important to realize is that, while the results of
structural equation analyses are meant to be reflective of causal dependencies,
it is not the statistical results per se that demonstrate causation. Rather, the case
for making a causal interpretation depends primarily on prior experience and
substantive knowledge.
There has existed over the years an ongoing discussion on the nature of
causality and its relationship to structural equations. Perhaps one of the reasons
structural equation modeling has been slow to be adopted by biologists has
been the priority placed by Fisher on the adage that “correlation does not imply
causation”. His emphasis was on manipulative experiments that sought to isolate
causes and this approach remains a strong emphasis in biological research.
One can see this ongoing debate as a recurring process in which from time
to time some feel it wise to caution against overzealous inference of causes.
These cautionary periods are typically followed by a general defense of the
reasonableness of causal thinking. Fisher’s emphasis was on the development
of rigorous experimental protocols designed to isolate individual causes. The
emphasis of those who developed structural equation modeling has not been
on isolating causes, but instead, on studying simultaneous influences. Both of
these scientific goals have merit, and we can think of them as representing the
study of individual processes versus the study of system responses.
Some clearly articulated insights into the nature of causality and the relation-
ship to structural equations can be found in Wright (1921) and Bollen (1989),
and some of these are described below. Recently, Pearl (2000, Causality) has
addressed the issue of causation and structural equations at the level of funda-
mental logic. A distillation of some of these ideas as they relate to biology can
be found in Shipley (2000).
There are a number of arguments that have been made about the tendency
for some scientists and philosophers to steer away from using causal language.
As Pearl (2000) notes, one reason that some mathematicians have steered away
from discussing causation may be the fact that the language of mathematical
equations relies most commonly on the symbol “=” to describe the relation-
ships between y and x. As a mathematical operator, the “=” symbol simply
Figure 1.1. Graphical representation of a structural equation model (Eqs. (1.2)–(1.4)) involving one independent (x1) variable and three dependent (y) variables. (The diagram shows paths γ11, γ21, and γ31 from x1 to y1, y2, and y3; path β21 from y1 to y2; path β32 from y2 to y3; and error terms ζ1–ζ3 on y1–y3.)
y1 = α1 + γ11 x1 + ζ1 (1.2)
y2 = α2 + β21 y1 + γ21 x1 + ζ2 (1.3)
y3 = α3 + β32 y2 + γ31 x1 + ζ3 (1.4)
Using graphical representation (and omitting the intercepts), Figure 1.1 illus-
trates this model. A quick comparison of the equational and graphical forms
of the model (Eqs. (1.2)–(1.4) versus Figure 1.1) illustrates the appeal of the
graphical representation, which has been in use since structural equation mod-
eling (in the form of path analysis) was first introduced. In this illustration,
boxes represent observed variables, arrows between boxes represent the direc-
tional relationships represented by equality signs in the equations, gammas (γ )
represent effects of x variables on y variables while betas (β) represent effects
of ys on other ys, and the zetas (ζ ) represent error terms for response variables.
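The system in Eqs. (1.2)–(1.4) can also be sketched numerically. Because this model is recursive with uncorrelated errors, each structural equation can be estimated consistently by ordinary least squares; dedicated SEM software would instead estimate all parameters jointly (e.g., by maximum likelihood). The parameter values below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical true parameters for Eqs (1.2)-(1.4), intercepts set to zero
g11, g21, g31 = 0.8, 0.5, 0.3  # gammas: effects of x1 on y1, y2, y3
b21, b32 = 0.6, 0.7            # betas: y1 -> y2 and y2 -> y3

x1 = rng.normal(size=n)
y1 = g11 * x1 + rng.normal(scale=0.5, size=n)             # Eq (1.2)
y2 = b21 * y1 + g21 * x1 + rng.normal(scale=0.5, size=n)  # Eq (1.3)
y3 = b32 * y2 + g31 * x1 + rng.normal(scale=0.5, size=n)  # Eq (1.4)

def ols(y, *preds):
    """Least-squares slopes of y on the given predictors (intercept included)."""
    X = np.column_stack([np.ones(len(y)), *preds])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

g11_hat = ols(y1, x1)[0]          # estimate Eq (1.2)
b21_hat, g21_hat = ols(y2, y1, x1)  # estimate Eq (1.3)
b32_hat, g31_hat = ols(y3, y2, x1)  # estimate Eq (1.4)
print(g11_hat, b21_hat, g21_hat, b32_hat, g31_hat)
```

With a large simulated sample, the recovered coefficients land close to the generating values, making tangible the correspondence between the equations, the path diagram, and the estimated parameters.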
The idea of combining a series of equations into a multivariate model, which
should be credited to Wright (1921), has proven to be a major step forward in the
advancement of the scientific method. With a series of structural equations, we
can now specify models representative of systems and address many questions
that cannot be answered using univariate models. In Chapter 3, we will describe
the anatomy of structural equation models and illustrate in more detail the vari-
ous ways structural equations can be combined and how they are interpreted. In
later chapters we will include additional elements, such as unobserved (latent)
and composite variables. We will also see that there are a large number of prob-
lems that can be tackled using SEM, including nonlinear modeling, reciprocal
influences, and many more.
Of prime importance is that structural equation models allow us to address
scientific questions about systems that cannot be addressed using univariate
approaches. As we shall see, these multivariate questions are often precisely
the ones of greatest importance to the scientist. An introduction to SEM brings
into focus the simple fact that up to this point, we have been studying systems
primarily using a methodology (specifically the univariate model) that is not
designed for examining system responses. Rather, the univariate model, which
has come to represent the scientific method over the past 50+ years, is only
designed for the study of individual processes or net effects, not interacting
systems. To make a crude analogy, if the development of the univariate model
is akin to the development of the wheel, the advance represented by combining
structural equations into complex models is akin to the attachment of four
wheels to a frame to make a cart. The change is not just one of degree, it is a
change of a revolutionary sort.
Figure 1.2. Representation of the elements of modern SEM (a stacked composition from 0% to 100%, with components: regression, factor analysis, stat modeling, model evaluation, and software). Stat modeling refers to statistical modeling.
likelihood methods for estimation. It was this advance that permitted (1) a gen-
eralized solution procedure for factor and path models, (2) a proper approach to
nonrecursive models, and (3) an assessment of discrepancies between observed
and expected which permits hypothesis evaluation at the level of the over-
all model. Simultaneous to the development of the LISREL model, Jöreskog
developed computer software (largely because of the iterative requirements of
maximum likelihood methods), and since that time, SEM has come to be defined
in part by the capabilities of the software packages used to implement analyses.
Over the years, numerous capabilities have been added to SEM, both through
theoretical advances and through additions to the various software packages
that have been developed. Such things as multigroup analyses, the inclusion
of means modeling, numerous indices of model fit, modification indices, capa-
bilities for handling categorical variables, and most recently, procedures for
hierarchical models have been added. All of these have expanded the elements
that make up SEM.
At present, a number of new developments are taking place which have the
potential to strongly influence the future growth and capabilities of SEM. Novel
approaches to the analysis of network relationships are emerging from fields
as diverse as robotics and philosophy. Their goals range from pattern recog-
nition to data mining, from computer learning to decision support. Some of
these developments are fairly unrelated to the enterprise of structural equation
modeling because of their descriptive nature. However, others have substantial
relevance for the estimation and evaluation of structural equation models and
are already starting to be incorporated into the SEM literature. Collectively,
these new methods, along with SEM itself, fall under the umbrella concept of
graphical models (Pearl 2000, Borgelt and Kruse 2002). Of particular impor-
tance are those graphical models that incorporate Bayesian methods, such as
Bayesian networks (e.g., Neapolitan 2004), which permit what is now being
called Bayesian structural equation modeling (Scheines et al. 1999, Congdon
2003).
Attributes of SEM compared with discriminant analysis (DA), regression trees (RT), principal components analysis (PCA), and multiple regression (MR):
- Include measures of absolute model fit ✓
- User can specify majority of relationships ✓
- Capable of including latent variables ✓ ✓
- Able to address measurement error ✓
- Allows evaluation of alternative models ✓ ✓
- Examines networks of relationships ✓
- Can be used for model building ✓ ✓ ✓ ✓ ✓
Philosophy of presentation
As stated previously, the first goal of this book is to introduce the reader to
structural equation modeling. There are many different ways in which this can be
approached. One is to emphasize the fundamental mathematical and statistical
reasoning upon which modern SEM is based. An excellent example of this
approach can be found in Bollen (1989). Another approach is to emphasize the
basic concepts and procedures associated with SEM. A fair number of textbooks
offer this kind of introduction to the subject (e.g., Hair et al. 1995, Schumacker
& Lomax 1996, Loehlin 1998, Maruyama 1998, Raykov & Marcoulides 2000,
Kline 2005). The presentations in these works are meant to apply to any scientific
discipline that chooses to use SEM; in other words, they are descriptions of the
SEM toolbox, along with general instructions. A third kind of presentation
which is especially popular for those who have a background in the basics is
one that emphasizes how individual software packages can be used to apply
SEM (Hayduk 1987, Byrne 1994, Jöreskog and Sörbom 1996, Byrne 1998,
Kelloway 1998, Byrne 2001).
One other approach to the material would be to examine how modern SEM,
which has been largely developed in other fields, might be applied to the study of
natural systems. This is a focus of the first part of this book. Shipley (2000) was
the first to devote an entire book to present methods associated with modern
SEM specifically to biologists (although classic methods were presented in
Wright 1934 and Li 1975). In this effort, he covered a number of basic and
more advanced topics, with an emphasis on the issue of inferring causation from
correlation. More recently, Pugesek et al. (2003) have presented an overview
of the conventions of modern SEM practice, along with a selection of example
applications to ecological and evolutionary problems. A brief introduction to
the application of SEM to the analysis of ecological communities can also be
found in McCune and Grace (2002, chapter 30).
The need for a presentation of SEM that specifically relates to natural systems
is justified by the way in which disciplinary particulars influence application.
The flexible capabilities of SEM offer a very wide array of possible applications
to research problems. It is clear that the characteristics of the problems them-
selves (e.g., nature of the data, nature of the questions, whether manipulations
are involved, sample size, etc.) have a major influence on the way the SEM
tools are applied. Because many nuances of SEM application depend on the
characteristics of the problem being studied, its application is not so well suited
to a standardized prescription. This is why various presentations of SEM often
differ significantly in their emphasis. For example, researchers in the social sciences, where SEM has been most intensively used thus far, often apply it to survey data, which tend to have large sample sizes, suites of related items, and a major concern with measurement (e.g., how do we estimate a person's verbal aptitude?). When these same methods are applied to experimental studies
where a small number of subjects, for example, forest tracts or elephants, can
be studied intensively, the issues that are most important shift.
I have found that natural systems scientists often find it difficult to grasp
immediately the meaning and value of certain aspects of SEM. If this were not
the case, the books that already exist would be sufficient and biologists would
be applying these methods routinely to their full potential. While there have
been a significant number of applications of SEM to natural systems, it is safe to say that the full capabilities and appropriate use of these procedures have yet to be realized. A decade of effort in applying SEM to ecological problems (as of
this writing) has convinced me that there are substantial challenges associated
with the translation and adaptation of these methods.
In conversations with biologists and ecologists I have consistently heard
requests for examples. Those in the natural sciences often do not see the kinds
of analyses obtained through structural equation modeling in the social sciences
as important or relevant to their problems. Very commonly, biologists still see
the univariate procedures that are applied to their data as sufficient. Why, they
ask, should one go to all the trouble of learning and applying SEM? For these
2 An example model with observed variables
Figure 2.1. Euphorbia esula L. From USDA, NRCS. 2005. The PLANTS
database, version 3.5. (http://plants.usda.gov).
(Figure: annual values plotted for 1999, 2000, and 2001; y-axis 0–50.)
Figure 2.3. Regression of change in spurge between 2000 and 2001 against the
log of the density of Aphthona nigriscutis and Aphthona lacertosa.
was a negative relationship between change in spurge density and flea beetle
density for both species. These univariate results would seem to further support
the interpretation that the biocontrol agents are contributing to the observed
decline in spurge.
Finally, examination of the bivariate correlations among variables (Table 2.1)
provides additional insights into some of the apparent relationships among the
species. For the sake of simplicity, I will lump bivariate results (i.e., correlations)
with regression coefficients under the general heading of “univariate” results.
Only the correlations for the period 2000 to 2001 are presented here. A more
complete exposition can be found in Larson and Grace (2004). As can be seen
in Table 2.1, 10 of the 15 bivariate correlations are statistically significant at
the p < 0.05 level. Several points are worth noting about these individual
correlations:
(1) The change in spurge density between years is not only negatively correlated
with A. nigriscutis and A. lacertosa in 2001 (as shown in Figure 2.3), but
also with their densities in 2000.
(2) The change in spurge density is additionally correlated (negatively) with
spurge density in 2000.
(3) The spurge density in 2000 correlates with A. nigriscutis density in both
2000 and 2001, while it only correlates with A. lacertosa in 2001.
(4) A. nigriscutis and A. lacertosa each correlate only with themselves across times; no between-species correlations are apparent.
These results indicate that the full story may be more complicated than described
thus far, primarily because the change in spurge was not only related to flea
beetle densities, but also strongly related to initial spurge density.
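A screening of bivariate correlations like that in Table 2.1 can be sketched as follows. The variables here are simulated stand-ins, not the Larson and Grace data, and the critical-|r| formula simply converts the usual two-tailed t threshold for p < 0.05 into a correlation threshold:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60  # hypothetical number of plots

# Hypothetical stand-ins: spurge stems in 2000, flea beetles in 2000, and
# change in stems, generated with plausible signs of association
spurge00 = rng.normal(50.0, 10.0, size=n)
beetle00 = 0.5 * spurge00 + rng.normal(0.0, 10.0, size=n)
change = -0.4 * spurge00 - 0.3 * beetle00 + rng.normal(0.0, 8.0, size=n)

names = ["spurge00", "beetle00", "change"]
R = np.corrcoef([spurge00, beetle00, change])

# Critical |r| for p < 0.05 (two-tailed): r_crit = t / sqrt(t^2 + df),
# using t ~ 2.0 for df = n - 2 (the exact value at df = 58 is about 2.002)
df = n - 2
r_crit = 2.0 / np.sqrt(2.0**2 + df)

for i in range(len(names)):
    for j in range(i + 1, len(names)):
        flag = "*" if abs(R[i, j]) > r_crit else ""
        print(f"{names[i]} vs {names[j]}: r = {R[i, j]:+.2f}{flag}")
```

Such a screen flags which pairwise associations clear the significance threshold, which is the raw material the chapter goes on to reinterpret through a structural equation model.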
(Path diagram: A. nigriscutis 2000, A. lacertosa 2000, A. nigriscutis 2001, A. lacertosa 2001, number of stems 2000, and change in stems 2000–2001, connected by paths labeled a–m.)
Figure 2.4. Initially hypothesized model used to evaluate factors affecting leafy
spurge changes in density over time (from Larson and Grace 2004, by permission
of Elsevier Publishers).
self-regulation, change in spurge over time might relate both to flea beetles
to initial spurge density. These issues were dealt with by (1) separating changes
in stem density over time from instantaneous values of stem density, and (2) by
modeling the interaction with flea beetles over time.
The structure of the initial model (Figure 2.4) was based on a number of
premises, which were derived from general knowledge about these species and
similar systems. These premises are as follows:
(1) In any given year (denoted by year a), the densities of flea beetles can be
expected to covary with the density of spurge because these insects depend
on this species for food. These Aphthona species are presumed to be obligate
feeders on Euphorbia esula, so it would be reasonable to expect that spatial
variations in spurge densities will cause spatial variations in flea beetle
densities (paths a and b).
(2) The densities of insects in a plot would be expected to be related over time.
This premise is based on a number of biological features of the system
relating to dispersal and food requirements. However, it is known that in
some cases high population densities one year can lead to lower ones the
next (e.g., because of interactions with a parasite), so it would not be a
surprise if correlation over time were weak or variable. Thus, here we are
assessing site fidelity for the insects (paths c and d).
(3) There may be time-lag effects in the dependencies of flea beetles on spurge
density. This premise is supported by the fact that the flea beetles feed on
the plants for some period as larvae before emerging to feed as adults (paths
e and f).
(4) An important question is whether the species of Aphthona may interfere
with one another or otherwise be negatively associated. No a-priori infor-
mation was available upon which to base an expectation, therefore, we
included the pathways (paths g and h) to evaluate this possibility.
(5) The fundamental premise of the biocontrol program is that the flea beetles
will have a negative effect on the change in spurge over time (paths i and j).
Again, because feeding by the insects on plants begins in their larval stage
(note that larvae were not directly observed in this study), it is reasonable to
expect that some of the variation in spurge density changes will be related
to the density of insects in the previous year (paths k and l).
(6) The change in spurge density may be expected to correlate with initial
density due to density dependence. Most likely this would be manifested
as a negative influence of initial density due to autofeedback (a reduced
increase in stems at high densities) or even self thinning (a greater rate of
decline at high densities).
The procedures by which we determine whether the data support our initial
model will be presented later in this book. What the reader needs to know at
this time is that there are processes whereby expectations about the data are
derived from model structure, and the actual data are then compared to these
expectations. This process not only provides estimates for all the parameters
(including the numbered paths, the variances of the variables, and the distur-
bances or error terms), but also provides an assessment of overall model fit.
When the relationships in the data are found to inadequately correspond to the
initial model, the combination of a-priori knowledge and initial results is often
used to develop a revised model, which is then used to obtain final parameter
estimates. The results from revised models are not viewed with as much confi-
dence as results from an initial model that fits well. Rather, they are seen as in
need of further study if confidence is to be improved. That said, it is generally
more appropriate to interpret parameters from a revised model that fits well
than from an initial model that does not.
It is important to recognize that statistical tests are used in a different way for structural equation models than for null hypotheses. In null
hypothesis testing, priority is given to the hypothesis of no relationship. This is
the case no matter what our a-priori knowledge is about the processes involved.
In contrast, when evaluating overall fit of data to a structural equation model,
priority is given to the model, and test results are used to indicate whether there
are important deviations between model and data. Thus, in SEM, the a-priori
information used to develop the initial model is used as a basis for interpretation.
Stated in other terms, we presume that our other knowledge (aside from the data
at hand) is important and that our data are meant to improve our knowledge.
Based on modification indices (which are provided by some SEM software
programs to indicate more specifically where lack of fit occurs), a relationship
not initially anticipated needed to be added before an adequate match between
model and data could be obtained. This missing relationship indicated by the
data was a negative correlation between A. nigriscutis and A. lacertosa in 2000
(the meaning of which is discussed below).
Now, the procedures in SEM are not completely devoid of null hypothesis
tests. Associated with the estimation of model parameters is the calculation
of standard errors for all path coefficients, and these can be used to estimate
associated p-values. Unlike usual null hypothesis testing procedures, though,
these paths are not always removed from the model simply because we fail to
reject the null hypothesis. Rather, the decision to remove a path is based on the
consequences for overall model fit, the individual p-values, and our substantive
knowledge. In our example, the paths from flea beetles in 2000 to change in stem
counts in year 2001 (paths k and l) were found to be indistinguishable from zero
based on t-tests. Because we were initially uncertain about whether these paths
should be in the model, they were removed. Also, the probability associated with
path h, which represented an effect of A. nigriscutis on A. lacertosa, suggested
little chance that such an effect occurred in this case. Therefore, this path was
also removed. Model chi-square changed little with the removal of these paths,
confirming that these relationships did not represent characteristics of the data.
The path from spurge density to A. lacertosa in 2000 was also found to be
weak, with little evidence that it differs from zero. However, this path
was retained in the model because of the potential that such an effect does occur
some of the time. In fact, other data have revealed that the relationship between
spurge density and A. lacertosa density does sometimes show a nonzero value.
One final change to the model was deemed necessary. Results indicated that
path i was nonsignificant (the effect of A. nigriscutis in 2001 on change in spurge
stems). When this path was removed, it was found that there is actually a residual
positive correlation between A. nigriscutis in 2001 and change in spurge stems.
This relationship was represented in the revised model as a correlated error
30 Structural equation modeling and natural systems
Figure 2.5. Results for revised model. Path coefficients shown are all standardized
values (from Larson and Grace 2004, by permission of Elsevier Publishers).
term, which represents the effects of a joint, unmeasured causal factor (this
topic is explored in more detail in Chapter 3). Following these modifications
of the initial model, a reanalysis was performed.
areas and have not fully dispersed yet. The third is that there is a neg-
ative interaction between the two species. There is currently insufficient
information to resolve between these possible causes.
(3) The densities of both A. nigriscutis and A. lacertosa show a fair degree of
consistency over time (path coefficients of 0.62 and 0.67).
(4) Both species of flea beetle in 2001 showed a lag relationship to spurge
density in 2000 (path coefficients of 0.23 and 0.21).
(5) There is evidence for a modest negative effect of A. lacertosa in 2000 on
A. nigriscutis in 2001 (path of −0.14). There is no indication that A.
nigriscutis has a similar effect on A. lacertosa.
(6) Changes in spurge stem densities from 2000 to 2001 were found to be
negatively related to A. lacertosa (path coefficient of −0.22). The most
reasonable interpretation for this relationship is that it represents the effects
of feeding by the insect on the plant. It does not appear that A. nigriscutis is
having a negative effect on spurge, however. Thus, it would seem that the
two biocontrol agents are not equally effective in reducing spurge.
(7) The biggest factor affecting changes in spurge density is negative density
dependence.
(1) A. nigriscutis tracks spurge strongly while A. lacertosa does not – this was
also found in the bivariate correlations.
(2) A. nigriscutis and A. lacertosa had an initial negative association – this was
not found from examination of bivariate correlations.
(3) Densities of A. nigriscutis and A. lacertosa were fairly consistent over time –
this was found from examination of correlations; however, bivariate results
overestimate the fidelity by a fair bit.
(4) A. nigriscutis and A. lacertosa both show a lag relationship to spurge density
in 2000 – this relationship could not be examined using univariate analyses.
32 Structural equation modeling and natural systems
Conclusions
It is apparent, I hope, that a structural equation model provides a framework for
interpreting relationships that is substantially superior to the piecemeal inspec-
tion of univariate results. When we look at Table 2.1, we see many correlations
that are significant and some that are not. When we look at Figure 2.4, we
see a whole system of interactions with complete accounting of direct and in-
direct interactions and the relative strengths of pathways (in the next chapter
we will discuss the unstandardized path coefficients, which give the absolute
strengths of pathways). Without a substantive model to guide our interpretation,
it would be unclear what relationships are specified by the various correlations in
Table 2.1. As it turns out, some bivariate correlations correspond to path coeffi-
cients and others do not. The reasons for this will also be presented in the next
chapter.
It has also been shown that path coefficients can differ quite a bit from correla-
tion coefficients. The most conspicuous example of that is the apparent negative
relationship between A. nigriscutis and changes in spurge density (Figure 2.2),
which turns out to be a spurious correlation. It comes about because both of
these variables are influenced by a common factor, spurge density in 2000. The
density of A. nigriscutis in 2001 is positively affected by spurge density in 2000
while the change in stems is negatively affected. This structure automatically
creates a negative correlation between A. nigriscutis and the change in spurge
density. It is through the use of our structural equation model that we are able
to ascertain that the correlation is completely spurious.
Finally, the reader should be made aware that obtaining a correct interpreta-
tion depends on having the correct model. We have gone to some effort to ensure
a match between model and data. The fact that they do match is no guarantee
that the model (and our interpretations) are correct. We will have quite a bit
An example model with observed variables 33
more to say about this issue in later chapters. For now, let us simply say that
SEM, when properly applied, has the potential to provide much greater insight
into interacting systems than do traditional univariate models.
Summary
The results from this example are both ecologically interesting and also very
instructive as to the value of partitioning relationships in a multivariate model.
Bivariate correlations suggest that both flea beetles are contributing to the
observed decline in spurge density over time (Figure 2.3). However, it appears
that in actuality, during the time interval of this study, the biggest impact on
changes in spurge density was from self thinning, with A. lacertosa having a
modest effect on spurge and A. nigriscutis having no impact at all. This illus-
trates how careful one must be in interpreting bivariate correlations between
variables under common influence (in this case, by initial stem densities). Stem
densities appear to be a major driver in this system, both in terms of regulating
flea beetle populations and in terms of self regulation of spurge densities. It
also appears that there may be some negative associations between the two flea
beetles, which could have important implications for biocontrol efforts. Further
work, including carefully targeted experimental manipulations, can help us to
continue to refine our knowledge and our models of this system through an
accumulation of evidence. I believe that this process of confronting systems
using structural equation models will enhance both our understanding of the
system and our ability to predict its behavior.
PART II
variables have been mentioned in the SEM literature for quite some time, it
is only recently that they have begun to be considered more seriously for rou-
tine inclusion in structural equation models. Finally, error variables (which are
unenclosed, but nevertheless true variables) represent unknown or unspecified
effects on observed variables and latent response variables. Error variables are
usually presumed to represent a collection of effects, including both random
measurement error and unspecified causal influences (i.e., variables that were
not measured). Often, error variables are treated more like summary measures of
the other parameters (e.g., the sum of unexplained variance) than like variables
themselves.
It is not my intention that the model represented in Figure 3.1 will be com-
pletely understood by the reader at this time. Rather, my objective here is simply
to hint at the great versatility embodied in modeling with structural equations
and to point to some of the things that will come later. In the rest of this chapter
The anatomy of models I: observed variable models 39
we will focus on models that only include observed variables. All of the princi-
ples we discuss related to observed variable models will also apply to the more
complex models that are considered later, when latent and composite variables
are introduced more fully.
Figure 3.2. Examples of observed variable models. The observed variables are
included in boxes and may be connected by various types of arrows. The unenclosed
symbol ζ represents residual error variables.
The independent (x) variables are referred to as exogenous, while the dependent (y) variables are called endoge-
nous. Endogenous variables can, in turn, have influences on other endogenous
variables. Commonly, causal order will flow from left to right across a model,
although this will not always be the case.
The ways in which variables in a model are connected are important. Model
A in Figure 3.2 is a simple example where x1 and y2 are only related indirectly.
In other words, the relationship between x1 and y2 can be explained by their
relationships to y1 . Thus, we would say that x1 has an indirect effect on y2 (within
the context of the model). In contrast, model B contains both indirect and direct
paths between x1 and y2 . Direct paths are interpreted as relationships that cannot
be explained through any other relationships in the model. In the case of model
B, the correlation between x1 and y2 is such that it can only be explained by a
direct effect of x1 on y2 , in addition to the indirect effect mediated through y1 .
A further property of model B that should be noted is that it is saturated,
which means that all the possible interconnections are specified. Model A is
unsaturated, which means that some variables are not directly connected (i.e.,
some paths are omitted). Model C contains two exogenous variables, x1 and x2 ,
which are correlated in this case. Such a correlation is generally presumed to be
caused by some common influence that was not measured or is not considered
in this model. Model C also has the property of being unsaturated (the path
between x2 and y2 is omitted). Finally, model D contains reciprocal interactions
between y1 and y2 . Models with reciprocal interactions of this sort are called
nonrecursive. Thus, models without reciprocal interactions (e.g., model C) are
described as being recursive (the term recursive refers to the mathematical
property that each item in a series is directly determined by the preceding
item). In this chapter we will consider models of the first three types, A–C.
Models possessing reciprocal interactions require special care to interpret and
will not be addressed here.
One final property that an observed variable model may possess, and that is
not shown in Figure 3.2, is a correlated error. Such a condition would be rep-
resented by having a two-headed arrow between ζ 1 and ζ 2 , the two error terms,
in any of the models. Correlated errors typically represent some unspecified
cause for association. We will illustrate and discuss the meaning of correlated
errors later in this chapter.
Path coefficients
The path coefficients specify values for the parameters associated with path-
ways between variables. In essence, the path coefficients represent much of the
quantitative meaning of the model (although, as we shall see, there are many
other parameters of importance as well). There are a number of things we will
need to understand in order to properly interpret path coefficients. Some of the
basic concepts will be introduced first. Later we will relate this information to
our example in Chapter 2.
For this example, the variances of x and y, plus the covariance between x
and y can be calculated using Equations 3.1–3.3 as follows:
VARx = 504.40/9 = 56.04

VARy = 53 492.2/9 = 5 943.58

COVxy = 4 036.20/9 = 448.47
This yields the following variance–covariance matrix
x y
x 56.04
y 448.47 5943.58
Here, we can see that the variance of y is larger than that of x. The mean of y
(147.3) is also larger than the mean of x (14.6). Because the scales for x and
y are so different, it is difficult to draw intuitive meaning from the covariance
between x and y. In order to put things in terms where we can easily assess
the degree of covariation between x and y, we might like to put x and y on a
common scale. Typically the z transformation is used to adjust the means of
variables to zero and their variances to 1.0. The formula for calculating z scores
for variable x is
zi = (xi − x̄)/SDx     (3.4)
Obs.    x     y      zx       zy       zx × zy
1       2     70    −1.68    −1.00     1.68
2       6     55    −1.15    −1.20     1.38
3       9     50    −0.75    −1.26     0.94
4       12    156   −0.35     0.11    −0.04
5       15    115    0.05    −0.42    −0.02
6       15    200    0.05     0.68     0.03
7       19    155    0.59     0.10     0.06
8       20    202    0.72     0.71     0.51
9       22    295    0.99     1.92     1.90
10      26    175    1.52     0.36     0.55
Sum     146   1473  −0.01     0.00     6.99
SDx , the standard deviation of x, is obtained by taking the square root of VARx .
The end result is that the z scores are expressed in terms of the standard deviation
for that variable.
Using the information in Table 3.2, we can calculate the Pearson product
moment correlation coefficient, rxy , using the formula
rxy = Σ(zx × zy)/(n − 1)     (3.5)
In this case, the formula yields
rxy = 6.99/9 = 0.7767
Given that we have the calculated variances and covariance for the data in Table
3.1, we can also calculate the Pearson coefficient directly from the covariance
and standard deviations using the formula
rxy = COVxy/(SDx × SDy)     (3.6)
which gives the same result,
rxy = 448.47/(7.49 × 77.09) = 0.7767
As illustrated by this example, data can be standardized either through the use
of z scores, or simply by dividing the covariances by the product of the standard
deviations.
ŷ = bx + a (3.7)
byx = rxy × (SDy/SDx)     (3.8)

It is a simple matter to see from Eq. (3.8) that when variables have been
z-transformed, SDy = SDx = 1, and byx = rxy. Thus, the distinction between
correlation and regression is only noticeable for the unstandardized coeffi-
cients. When dealing with unstandardized coefficients, the relationship between
COVx y , which measures association, and b yx is given by
byx = COVxy/VARx     (3.9)
It can be seen from Eq. (3.9) that in regression the covariance is standardized
against the variance of the predictor (x), rather than the cross product of SDx
and SDy as in Eq. (3.6).
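For readers who wish to verify these calculations, the quantities above can be reproduced with a minimal Python sketch using the ten (x, y) observations from Table 3.2 (the sketch and its variable names are mine, not part of the original presentation):

```python
import math

# The ten observations from Table 3.2
x = [2, 6, 9, 12, 15, 15, 19, 20, 22, 26]
y = [70, 55, 50, 156, 115, 200, 155, 202, 295, 175]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n          # 14.6 and 147.3

# Eqs. (3.1)-(3.3): sample variances and covariance (divisor n - 1)
var_x = sum((xi - xbar) ** 2 for xi in x) / (n - 1)                  # ~56.04
var_y = sum((yi - ybar) ** 2 for yi in y) / (n - 1)                  # ~5943.6
cov_xy = sum((xi - xbar) * (yi - ybar)
             for xi, yi in zip(x, y)) / (n - 1)                      # ~448.47

sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

# Eq. (3.5): correlation from z scores
zx = [(xi - xbar) / sd_x for xi in x]
zy = [(yi - ybar) / sd_y for yi in y]
r_from_z = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Eq. (3.6): correlation from the covariance and standard deviations
r_from_cov = cov_xy / (sd_x * sd_y)

# Eq. (3.9): unstandardized regression slope of y on x
b_yx = cov_xy / var_x

print(round(r_from_z, 3), round(r_from_cov, 3), round(b_yx, 2))
```

Both routes give the same correlation, approximately 0.777 (the 0.7767 in the text reflects rounding of the intermediate values to two decimal places).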
x1 y1 y2
----------------------------------------
x1 1.0
y1 0.50 1.0
y2 0.30 0.60 1.0
x1 y2
g11 b21
z2
y1
z1
Figure 3.3. Simple directed path model.
Also, for simplicity the discussion of path relations here will be based on stan-
dardized values (correlations). Following the second rule of path coefficients,
the coefficient for the path from x1 to y1 can be represented by the correlation
between x1 and y1 (0.50). Likewise, the coefficient for the path from y1 to y2
can also be represented by the correlation between y1 and y2 (0.60). As with the
first rule of path coefficients, when the second rule of path coefficients applies
and we are only concerned with standardized coefficients, we can simply pick
the coefficient directly out of the table of correlations.
Continuing with our example from Figure 3.3, we can ask, What is the
relationship between x1 and y2 ? This leads us to our third rule of path coef-
ficients, which states that the mathematical product of path coefficients along
a compound path (one that includes multiple links) yields the strength of that
compound path. So, in our example, the compound path from x1 to y2 is 0.50
times 0.60, which equals 0.30.
We can see from the table in Figure 3.3 that in this case, the observed
correlation between x1 and y2 is equal to 0.30, the strength of the compound
path from x1 to y2 . We can illustrate another major principle related to path
coefficients by asking what would it mean if the bivariate correlation between
x1 and y2 was not equal to the product of the paths from x1 to y1 and y1 to y2 ? An
        x1         x2        y
----------------------------------------
x1      52 900
x2      21.16      0.0529
y       3 967.5    2.645     529
x1 y1 y2
----------------------------------------
x1 1.0
y1 0.55 1.0
y2 0.50 0.60 1.0
Figure 3.4. Model including dual pathways between x1 and y2 .
example of such a case is presented in Figure 3.4. Here we see that the indirect
path between x1 and y2 again equals 0.50 times 0.60, or 0.30. However, the
observed correlation between x1 and y2 is 0.50. This means that the bivariate
correlation between x1 and y2 cannot be explained by the indirect path through
y1 . Rather, a second connection between x1 and y2 (a direct path from x1 to y2 )
is required to explain their overall bivariate correlation.
By having a direct path between x1 and y2 , as shown in Figure 3.4, we now
have a new situation, one where y1 and y2 are connected by two pathways, one
direct from y1 to y2 and another that is indirect through the joint effects of x1
on y1 and y2 . This leads to the fourth rule of path coefficients, which states
that when two variables are connected by more than one causal pathway, partial
regression coefficients are required to estimate the paths. We must be
clear here what is meant by a “causal pathway”. Certainly a directed path from
one variable to another, such as from y1 to y2 , is an obvious causal path. How
can it be that y1 and y2 are connected by two causal pathways? The answer is
that y1 and y2 are also connected through a joint causal influence exerted by
x1 . This indirect pathway, which can be traced from y1 to x1 and then to y2 (or,
alternatively, from y2 to x1 and then to y1 ) represents a second causal pathway,
one of association by joint causation. Thus, it is important that we should be
able to recognize when two variables are connected by more than one causal
pathway, being aware that some connections (discussed further below) will be
deemed noncausal. To understand this important issue further, we now need to
discuss partial regression coefficients.
Figure 3.5. Model from Figure 3.4 showing path coefficients.
which gives us
β21 = (0.60 − (0.55 × 0.50))/(1 − 0.55²) = 0.325/0.6975 = 0.47
Using the path coefficients derived from the correlations in Figure 3.4, we
can now represent the quantitative characteristics of the model as shown in
Figure 3.5. It is sometimes the convention that the width of the arrows is made
proportional to the magnitude of the coefficients to make the representation of
overall relationships more apparent. We should always keep in mind, however,
that the interpretation of the magnitude of a path coefficient requires care, as
will be made clear later in this chapter. As the reader will note, we have two
additional paths in Figure 3.5 that have not been discussed; those from the error
variables to y1 and y2 . More needs to be stated about these.
For our example in Figure 3.5, the value of ζ1 equals 0.84. Sometimes the path
coefficients for the error terms are confusing for those first encountering them.
Since the value for R²y1 = 0.55² = 0.30, we might think the path coefficient
would be the unexplained variance, which is 0.70. However, the path coefficient
from ζ 1 is typically expressed in the same form as for other path coefficients,
the non-squared form (the square root of 0.70 = 0.84).
In a similar fashion, R²y2 refers to the variance in y2 explained by the combined
effects of y1 and x1, which in this case is 0.40 (the means of calculating this by
hand are presented below in Eq. (3.16)). Given the explained variance of 0.40, we
can calculate the path coefficient as
ζy2 = √(1 − R²y2)     (3.13)
which equals 0.77. Equations (3.12) and (3.13) represent the fifth rule of path
coefficients, that the coefficients associated with paths from error variables are
correlations or covariances relating the effects of error variables. In this case,
the ζ values represent the errors of prediction, though without specifying the
exact causes for that error.
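The complete set of coefficients for the model of Figures 3.4 and 3.5 can be recovered from the three bivariate correlations alone. The short Python sketch below (my own illustration, using the partial regression formula shown above for β21) reproduces the values reported in the text:

```python
import math

# Bivariate correlations from the table in Figure 3.4
r_x1y1, r_x1y2, r_y1y2 = 0.55, 0.50, 0.60

# Path from x1 to y1: the simple correlation (second rule)
g11 = r_x1y1                                         # 0.55

# Paths into y2: partial regression coefficients (same form as for beta21)
denom = 1 - r_x1y1 ** 2
b21 = (r_y1y2 - r_x1y1 * r_x1y2) / denom             # ~0.47
g21 = (r_x1y2 - r_x1y1 * r_y1y2) / denom             # ~0.24

# Explained variances for the two endogenous variables
r2_y1 = g11 ** 2                                     # ~0.30
r2_y2 = g21 * r_x1y2 + b21 * r_y1y2                  # ~0.40

# Error paths: square roots of the unexplained variances (Eqs. 3.12-3.13)
zeta1 = math.sqrt(1 - r2_y1)                         # ~0.84
zeta2 = math.sqrt(1 - r2_y2)                         # ~0.77

print(round(b21, 2), round(g21, 2), round(zeta1, 2), round(zeta2, 2))
# prints: 0.47 0.24 0.84 0.77
```

These match the coefficients shown in Figure 3.5.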
Figure 3.6. Alternative representation of effects of error variables (compare to
Figure 3.5).
The path coefficients for the error variables in Figure 3.5, while logical, still require some mental arithmetic to transform the
information into variance explained. Instead of presenting the path coefficients
for error variables, it is possible to present the standardized values of the error
itself, as shown in Figure 3.6. Here, we can directly obtain R2 as 1 minus the
standardized error. Note that in Figure 3.6, the numeric value is located at the
origin end of the arrow, rather than alongside the arrow (the convention for
path coefficients, compare to Figure 3.5). Also, for some audiences, simply
presenting the R2 directly may be preferred over the representation of error
variables.
Figure 3.7. Model illustrating correlations between endogenous variables (y1 and
y2 in this case), which are manifested as correlations between the errors.
where ry1y2·x1 refers to the partial correlation between y1 and y2 taking into
account their correlations with x1 . The reader should note that technically, a
model with correlated errors between endogenous variables is nonrecursive,
in that there is something akin to a reciprocal interaction between endogenous
variables. In LISREL terminology the correlated error is referred to as ψ 12 (psi
one–two).
Now, for a model such as the one in Figure 3.7, but with no partial correlation
between y1 and y2 , the intercorrelation between the two response variables
would simply be the product of the correlation between x1 and y1 and the
correlation between x1 and y2 . This would be represented by the formula
r y1 y2 = r x1 y1 × r x1 y2 (3.15)
If Eq. (3.15) were substituted into Eq. (3.14), the numerator of Eq. (3.14) would be
zero. Therefore, it is a general rule that under the conditions where Eq. (3.15)
holds, the partial correlation between y1 and y2 will be zero, and y1 and y2 are
said to be conditionally independent. If the relationship in Eq. (3.15) fails to
hold, there will then exist a direct path between y1 and y2 , and its path coefficient
will be the value of the partial correlation given by Eq. (3.14).
Let us consider again the example data in Figure 3.4. If y1 and y2 were
conditionally independent, according to Eq. (3.15), their bivariate correlation
should be 0.55 × 0.50 = 0.275. However, their observed correlation is actually
0.60; this seems to be a rather large difference. According to Eq. (3.14), we can
calculate the partial correlation between y1 and y2 in the presence of x1 as
ry1y2·x1 = (0.60 − (0.55 × 0.50))/√((1 − 0.55²)(1 − 0.50²)) = 0.45
which represents the value for ψ 12 . So, we can see that there is a rather large
correlation between y1 and y2 that is not explained by their joint dependence on
x1 . As we stated earlier, this correlation between dependent variables suggests
x1 x2 y1 y2
---------------------------------------------------
x1 1.0
x2 0.80 1.0
y1 0.55 0.40 1.0
y2 0.30 0.23 0.35 1.0
Figure 3.8. Model and correlations illustrating concepts of total effect and total
correlation. The matrix represents the bivariate correlations among variables, while
the numbers on the diagram are the path coefficients.
the influence of some other factors that have not been explicitly included in the
model.
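The conditional-independence check and the partial correlation just described are simple enough to write out directly. The following Python sketch (mine, not part of the original text) applies Eqs. (3.14) and (3.15) to the correlations of Figure 3.4:

```python
import math

# Correlations from the table in Figure 3.4
r_x1y1, r_x1y2, r_y1y2 = 0.55, 0.50, 0.60

# Eq. (3.15): correlation expected if y1 and y2 are conditionally independent
expected = r_x1y1 * r_x1y2                           # 0.275

# Eq. (3.14): partial correlation between y1 and y2, controlling for x1
psi12 = (r_y1y2 - r_x1y1 * r_x1y2) / math.sqrt(
    (1 - r_x1y1 ** 2) * (1 - r_x1y2 ** 2))           # ~0.45

print(round(expected, 3), round(psi12, 2))
# prints: 0.275 0.45
```

The gap between the expected 0.275 and the observed 0.60 is what the correlated error of ψ12 = 0.45 absorbs.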
The model in Figure 3.8 contains multiple directed paths to and among endogenous variables (y1 and y2). The first
thing we should notice when comparing the path coefficients to the bivariate
correlations is that all directed paths in this case involve partial regression
coefficients. Only the coefficient for the undirected path between x1 and x2 can
be found in the table of correlations. The reader should be able to calculate all
of the other path coefficients shown using Eq. (3.10).
As for total effects, these can be obtained from the path coefficients. Our
seventh rule of path coefficients allows us to determine that the total effect of
x1 on y1 is simply 0.64. This means that if we were to increase the value of x1
by one standard deviation while holding the value for x2 constant at its mean
value, the value of y1 would increase by 0.64 times its standard deviation. As
we can see, there is only one directed path connecting x1 with y1 . However,
the case for the total effect of x1 on y2 is different. Here, there are two directed
pathways, one simple path from x1 to y2 and one compound path through y1 .
Therefore, the total effect of x1 on y2 is 0.15 + (0.64 × 0.27), or 0.32. Again,
if we were to increase x1 one standard deviation while holding x2 constant, y2
would increase by 0.32 times its standard deviation. In this case, however, y1
would covary in the process because the total effect of x1 on y2 would involve
both paths from x1 to y2 . Finally, the total effect of x2 on y2 can be seen to be
−0.11 × 0.27 = −0.03.
For more complex models, the business of determining which of the paths
connecting two variables are causal can be slightly more tedious than in the
simple example given here. Wright (1960) proposed several simple tracing
rules that can help to identify causal paths. These rules are as follows:
(1) A path can go backwards as many times as necessary before going forward.
(2) Once a path goes forward, it cannot go backwards.
(3) A path cannot go through a variable twice.
(4) A path can include a curved arrow, but only one curved arrow per path.
Reflecting on these tracing rules should reveal that they are perfectly compatible
with our discussion of what constitutes a causal connection. Perhaps the least
transparent of these rules is number 4. To understand rule 4, we should keep in
mind that a curved arrow represents an unresolved shared cause. Thus, a curved
arrow between x1 and x2 actually represents an undescribed variable with arrows
pointing at x1 and x2 . So, when a path proceeds backwards (upstream) through a
curved arrow, it should be viewed as going upstream to the undescribed variable
and then forward (downstream) to the variable on the other end of the curved
arrow. If we were to consider a path that included two curved arrows, we would
be violating rule number 2.
As for the total correlations, the calculation of these involves the eighth rule
of path coefficients. Thus, total correlations involve both the total effects plus
the undirected relations between variables. In our example in Figure 3.8, the
total correlation between x1 and y1 is the sum of the directed and undirected
pathways linking them. In this case, that is 0.64 + (0.80 × −0.11) = 0.55. Note
that this reconstitutes the bivariate correlation between x1 and y1 as seen in
Figure 3.8. Likewise, the total correlation between x2 and y2 can be seen to be
(−0.11 × 0.27) + (0.80 × 0.15) + (0.80 × 0.64 × 0.27) = 0.23.
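The total-effect and total-correlation arithmetic for Figure 3.8 can be verified with a few lines of Python. The coefficients below are the standardized path values given in the text and figure; the sketch itself is my own illustration of the seventh and eighth rules:

```python
# Standardized path coefficients for the model of Figure 3.8
p_y1_x1 = 0.64    # x1 -> y1
p_y1_x2 = -0.11   # x2 -> y1
p_y2_x1 = 0.15    # x1 -> y2 (direct)
p_y2_y1 = 0.27    # y1 -> y2
r_x1x2 = 0.80     # undirected correlation between x1 and x2

# Seventh rule: total effect = sum of all directed (simple + compound) paths
te_x1_y2 = p_y2_x1 + p_y1_x1 * p_y2_y1               # 0.15 + 0.64*0.27 ~ 0.32
te_x2_y2 = p_y1_x2 * p_y2_y1                         # -0.11*0.27 ~ -0.03

# Eighth rule: total correlation adds the undirected (tracing-rule) paths
tc_x1_y1 = p_y1_x1 + r_x1x2 * p_y1_x2                # ~0.55
tc_x2_y2 = (te_x2_y2 + r_x1x2 * p_y2_x1
            + r_x1x2 * p_y1_x1 * p_y2_y1)            # ~0.23

print(round(te_x1_y2, 2), round(te_x2_y2, 2),
      round(tc_x1_y1, 2), round(tc_x2_y2, 2))
# prints: 0.32 -0.03 0.55 0.23
```

Note that the reconstructed total correlations (0.55 and 0.23) match the bivariate correlations in the table of Figure 3.8.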
As a final point of discussion relating to Figure 3.8, the reader may be
surprised to find that the correlation between x2 and y1 , which is 0.40, is sub-
stantially different from the path coefficient between these two variables, which
is −0.11. When such a case is observed, it is often referred to as suppression.
Suppression refers to the fact that the strong intercorrelation between x1 and x2 ,
in combination with the relatively strong effect of x1 on y1 , causes the effect of
x2 on y1 to be fairly unrelated to their net intercorrelation. Keep in mind that the
path coefficient of −0.11 does actually reflect the causal effect that x2 has on
y1 . However, this effect is rather weak, and as a result, the correlation between
x2 and y1 is dominated by the influence of x1 .
A basic issue to consider is the different ways in which these two types of
variable are interpreted and the kind of questions each addresses. Typically,
researchers are fond of standardized parameters because the scale is the same
(standard deviation units) for different relationships. Thus, the primary function
of standardized variables is to allow direct comparisons between relationships
that are measured on different scales. For example, the unstandardized param-
eter relating an animal’s fat deposits to its caloric intake might be measured
as percent body fat per kilocalories. Meanwhile the unstandardized parameter
relating body fat to activity level (e.g., kilometers traveled during migration)
would be on a different scale. In this case, standardized coefficients allow for
a common basis of comparison. As we shall see, however, it is easy to rely too
heavily on standardized path coefficients and an appreciation for unstandardized
coefficients needs to be developed by the user of SEM.
The process of partitioning path coefficients was originally developed using
Pearson product moment correlations, which are standardized parameters. Fur-
ther, many authors choose to introduce the basic concepts about path coeffi-
cients using standardized variables (as I have done above). Sewell Wright, in
his initial work on path analysis, relied exclusively on standardized data and
standardized coefficients to partition direct and indirect relationships among
variables. However, he later came to calculate and report “concrete” (unstan-
dardized) coefficients also, and discussed their use in interpreting path models
(Wright 1984). Often, applications of path analysis in the natural sciences have
relied strictly on the analysis of correlations instead of covariances. In modern
path analysis, as well as in most other forms of structural equation modeling,
we typically perform analyses based on covariances rather than correlations.
Analysis of covariances permits calculation of both unstandardized and standardized
results, and it also permits the analysis of differences between means.
For this reason, it is recommended that SEM be performed using unstandardized
data (covariances). It is then possible to calculate both unstandardized and
standardized parameter estimates, and both can be used in drawing
interpretations.
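The conversion between the two metrics follows from Eq. (3.8), discussed later in the chapter: the standardized coefficient equals the unstandardized coefficient times the ratio of the standard deviations. A minimal Python sketch, with hypothetical values and function names of my own choosing:

```python
# Converting between unstandardized and standardized coefficients
# (illustrative values; relation as in Eq. (3.8) of the text).
def standardize(b_unstd, sd_x, sd_y):
    """Standardized coefficient = unstandardized * (SD of x / SD of y)."""
    return b_unstd * sd_x / sd_y

def unstandardize(b_std, sd_x, sd_y):
    """Inverse conversion back to the raw units of x and y."""
    return b_std * sd_y / sd_x

b = 0.065               # hypothetical slope of y on x, in raw units
sd_x, sd_y = 230.0, 23.0
b_std = standardize(b, sd_x, sd_y)
print(round(b_std, 2))                              # 0.65
print(round(unstandardize(b_std, sd_x, sd_y), 3))   # 0.065
```

Note that the units cancel in the conversion, which is why standardized coefficients are unitless.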
Figure 3.9. Illustration of two regressions having the same intercept and slope,
but different degrees of scatter around the line. While unstandardized parameters
(intercept and slope) did not differ significantly between the two data sets, the
degree of correlation did, with correlation coefficients of 0.861 in A and 0.495 in B.
When the data are standardized, the slopes of the two relationships become the correlation
coefficients given in Figure 3.9. Thus, we have lost any knowledge of the
absolute slope of the relationships using standardized data. The same can be
said when we rely exclusively on standardized coefficients. Yet a knowledge
of the absolute slope of the relationship is of critical importance in drawing
certain kinds of inferences, such as predictions, comparisons across samples,
and generalizations at large.
Table 3.4. Unstandardized parameters (the x1–x2 entry is their covariance)

        x1        x2        y
x1      –––––
x2      21.16     –––––
y       0.065     23.81     –––––
The proportion of variance explained by x differs between the two cases. In A, x is able to explain 74% of the variation
in y in the sample. In B, x is able to explain only 25% of the variation in y.
In multiple regression or a path model, relating the variance explained to
different predictors is more complicated. To consider how multiple predictors
relate to variance explanation in y, we need the formula for calculating total
variance explained in a multiple regression. When there are two correlated
predictors, x1 and x2, influencing y, the total variance explained in y, R²y, can be
calculated as

R²y = γ11 × r x1y + γ12 × r x2y    (3.16)
In this context, γ11 refers to the standardized effect of x1 on y and γ12 refers
to the standardized effect of x2 on y. Both standardized path coefficients are
partial regression coefficients in this case. Equation (3.16) may give the impression
that we can cleanly apportion the variance explained in y among the xs. This is
not the case.
Another hypothetical case can be used to illustrate the application of Eq.
(3.16). Imagine a situation where y is influenced by two correlated predictors,
x1 and x2 , and the scale of y is in terms of percent change and ranges from 1 to
100. In this example, the scale of x1 is in raw units and ranges from 1 to 1000,
while the scale of x2 is as a proportion and ranges from 0 to 1.0. For simplicity
we let the standard deviation of each be a constant proportion of the range for
each variable. In this example, the covariance matrix has the values given in
Table 3.3.
If we analyze this variance/covariance matrix and present unstandardized
path coefficients, we get the values represented in Table 3.4.
Thus, our equation is
y = 0.065x1 + 23.81x2 + ζ
It is unclear from this equation what the relative importance of x1 and x2 might
be as explanations for the observed variation in y. Now, if we look at the data
in standardized form (Table 3.5),
62 Structural equation modeling and natural systems
        x1      x2      y
x1      1.00
x2      0.40    1.00
y       0.75    0.50    1.00
STD     230     0.23    23
we see the correlations among variables, along with the standard deviations
(note, Table 3.3 can be exactly reconstructed from Table 3.5). Looking at the
standardized path coefficients (Table 3.6),
        x1      x2      y
x1      –––––
x2      0.4     –––––
y       0.655   0.238   –––––
y = 0.655x1 + 0.238x2 + ζ
R² = (0.655)(0.75) + (0.238)(0.5) = 0.61
The square of the semipartial correlation in Eq. (3.17) gives a measure of the
proportion of the variance in y that is uniquely explained by x1 . The meaning
of a semipartial correlation or regression can be understood to be the additional
variance explanation gained by adding x1 to the collection of predictors (akin
to a stepwise process). Using Eq. (3.17) and the data from Table 3.5, we can
calculate for our example that
r yx1(x2) = (r yx1 − r x1x2 × r yx2) / √(1 − r x1x2²) = (0.75 − (0.40 × 0.50)) / √(1 − 0.40²) = 0.55/0.92 = 0.60
Thus, the proportion of variance in y uniquely explained by x1 = 0.602 = 0.36.
Because the maximum variance in y that could be explained by x1 if it was the
only predictor is the square of the bivariate correlation, 0.752 = 0.56, and the
variance in y uniquely explained by x1 is 0.36, the shared variance explanation
must be 0.56 − 0.36 = 0.20.
We can further determine using the same procedure (an equation of the
same sort as Eq. (3.17)) that the proportion of variance in y uniquely explained
by x2 = 0.222 = 0.05. As before, the maximum variance in y that could be
explained by x2 if it were the only predictor is 0.502 = 0.25. Since x2 ’s unique
explanatory power is 0.05, we once again arrive at a shared variance explanation
of 0.20. These results are summarized in Table 3.7.
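The whole partition can be verified with a short Python sketch, using the correlations from Table 3.5 and the formulas given in the text:

```python
import math

# Correlations from the hypothetical example (Table 3.5).
r12, ry1, ry2 = 0.40, 0.75, 0.50

# Standardized partial regression coefficients (Eqs. 3.1.9 / 3.1.10).
g11 = (ry1 - r12 * ry2) / (1 - r12 ** 2)
g12 = (ry2 - r12 * ry1) / (1 - r12 ** 2)

# Total variance explained (Eq. 3.16).
r2_total = g11 * ry1 + g12 * ry2

# Squared semipartial correlations give the unique contributions (Eq. 3.17).
unique_x1 = ((ry1 - r12 * ry2) / math.sqrt(1 - r12 ** 2)) ** 2
unique_x2 = ((ry2 - r12 * ry1) / math.sqrt(1 - r12 ** 2)) ** 2
shared = r2_total - unique_x1 - unique_x2

print(round(r2_total, 2), round(unique_x1, 2),
      round(unique_x2, 2), round(shared, 2))  # 0.61 0.36 0.05 0.2
```

The printed values reproduce the entries of Table 3.7.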
Having gone through the process of showing how variance explanation can
be partitioned, I must again emphasize that this is only relevant when we are
interested in associations in our sample. If we wish to extrapolate to another
situation or compare groups, standardized coefficients will not serve us well
unless the variances are comparable across samples. The reason is very simple.
The standardized coefficients are scaled by the ratios of the standard deviations
and are thus sensitive to the sample variances. Reference back to Eq. (3.8)
reminds us that the standardized effect of x on y can easily be calculated as the
unstandardized effect multiplied by the ratio of the standard deviations (when
this is done, the units cancel out, leaving the coefficients unitless, though we
interpret them as being in standard deviation units). Since we know that further
Source of variance explanation in y       Proportion of variance explained
Unique influence of x1                    0.36
Unique influence of x2                    0.05
Shared influences of x1 and x2            0.20
Total variance explained                  0.61
samples may not always have the same variances, the standardized coefficients
will not automatically extrapolate. To generalize beyond our sample, there-
fore, we should rely on unstandardized coefficients. Pedhazur (1997, page 319)
describes the situation this way, “The size of a [standardized coefficient] reflects
not only the presumed effect of the variable with which it is associated but also
the variances and the covariances of the variables in the model (including the
dependent variable), as well as the variances of the variables not in the model
and subsumed under the error term. In contrast, [the unstandardized coefficient]
remains fairly stable despite differences in the variances and the covariances of
the variables in different settings or populations.” Pedhazur goes on to recom-
mend that authors present both unstandardized and standardized coefficients,
and to be particularly careful to respect the assumptions involved in standard-
ized coefficients.
Guidelines
As a generalization, there are four kinds of conditions in which unstandardized
coefficients are likely to be essential to drawing proper inferences: (1) when
the units of x and y are meaningful and there is interest in describing the abso-
lute effect of x on y, (2) when separate path models for different groups are
being compared (so-called multigroup analysis) and those groups have differ-
ent variances, (3) in repeated measures situations where items are followed
over time, and population variances differ substantially between times, and
(4) when results are compared among datasets or used for prediction and we are
interested in average changes. While some authors will state that under such
circumstances only the unstandardized coefficients can be used, I think that
this is not strictly true. The focus of the research question is what determines
when and how the coefficients should be interpreted. The most general rule
of all about standardized and unstandardized coefficients is that researchers
should use the coefficient type that is appropriate to the inferences they
make.
The rules of path coefficients (note, these are all stated in terms
of standardized coefficients, though they also apply to
unstandardized coefficients)
(1) The path coefficients for unanalyzed relationships between
exogenous variables are the bivariate correlations.
(2) When two variables are only connected through a single directed path,
the coefficient for that path corresponds to the bivariate regression
coefficient.
(3) The strength of a compound path (one that involves multiple arrows)
is the product of the coefficients along that path.
(4) When two variables are connected by more than one causal pathway,
the calculation of partial regression coefficients is required.
(5) Coefficients associated with paths from error variables are
correlations representing unexplained effects.
(6) Unanalyzed correlations between endogenous variables are
represented by partial correlations.
(7) The total effect one variable has on another equals the sum of its
direct and indirect effects through directed (causal) pathways.
(8) The sum of all pathways connecting two variables, including both
causal and noncausal paths, adds up to the value of the bivariate or
total correlation between those two variables.
[Figure 3.10 diagram; R² values shown for the endogenous variables: 0.26, 0.61, ns, and 0.53.]
Figure 3.10. Results for leafy spurge–flea beetle model showing both standard-
ized coefficients (not enclosed by parentheses) and unstandardized coefficients
(enclosed by parentheses).
Figure 3.10 shows the standardized path coefficients, which were given in Figure 2.5, and the unstandardized path coefficients (in
parentheses). First, we will re-examine the standardized coefficients in this
model while reviewing the rules of path coefficients. Second, we will consider
the interpretation of the unstandardized coefficients.
(1) According to the first rule of path coefficients, the path coefficient for an
unspecified relationship (two-headed arrow) between exogenous variables
will take on the value of their bivariate correlation. In this case, there is
only one exogenous variable, the number of stems in 2000; therefore, this
rule does not apply to any of the coefficients in Figure 3.10.
(2) According to the second rule of path coefficients, when two variables are
only connected through a single causal pathway, the value of the path
coefficient between them will be the bivariate regression coefficient. For
(7) The seventh rule of path coefficients allows us to calculate total effects. In
this case, there are several and they are illustrated in Table 3.9. The reader
should be able to calculate these total effects from the information in Figure
3.10.
(8) In this situation, the eighth rule of path coefficients simply allows us to
see the relationships between bivariate correlations and path coefficients.
One of the more interesting cases is that for A. nigriscutis in 2000 and A.
lacertosa in 2000. The two paths connecting these two variables include
the shared causal effect of stems in 2000 (0.51 × 0.15) plus their direct
correlation (−0.20). Together, these constitute the bivariate correlation,
which is −0.12.
[Four frequency histograms, one per panel: Cover (%), Stand Age (yrs), Fire Severity Index (mm), and Elevation (m).]
Figure 3.11. Frequency distributions for four variables from a study of wildfire
effects on plant recovery, used to discuss the concept of relevant ranges.
Six times the standard deviation is commonly assumed to span approximately
99% of the range of values. As discussed earlier, this may seem reasonable if
(1) we have a large enough sample to estimate a consistent sample variance,
(2) our variables are normally distributed, and (3) variances are equal across any
samples we wish to compare. The reason why many methodologists oppose standardized
coefficients is that these three necessary conditions generally are not likely
to hold. Of equal importance, rarely are these requirements explicitly
considered in research publications, so we usually do not know how great the
violations of these requirements might be.
Figure 3.11 presents frequency distributions for four variables from an SEM
study by Grace and Keeley (2006). In the absence of further sampling, the
repeatability of the sample variance estimate is unknown. This contributes to
some uncertainty about the interpretability of coefficients standardized on the
standard deviations. As for approximating a normal distribution, three of the
four variables are truncated at the lower end of values. Cover can never be less
than 0%, elevation likewise has a lower limit of expression relevant to terrestrial
communities in this landscape, and stand age is also limited to a minimum
value of between 0 and 1 year. None of these deviations is substantial enough to
cause major problems with distributional assumptions (i.e., these variables are
not wildly nonnormal); however, the deviations from idealized normality may
very well affect the relationships between standard deviations and ranges.
The observed range for cover was from 5% to 153%, while 6 times the standard
deviation yields an estimated range of 190%. The observed range for elevation
was from 60 to 1225 m, while 6 times the standard deviation equals 1550. Stand
age ranged from 3 to 60 years old, with 6 times the standard deviation equaling
75 years. Finally, fire severity index values ranged from 1.2 to 8.2 mm, while 6
times the standard deviation equals 9.9. Thus, observed ranges are consistently
less than would be estimated based on standard deviations, and the degree to
which this is the case is slightly inconsistent (ratios of observed to predicted
ranges for cover, elevation, age, and severity equal 0.78, 0.75, 0.76, 0.71).
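These range comparisons can be reproduced directly; a small Python sketch using the values reported above:

```python
# Observed ranges and 6-SD spans reported for the Grace and Keeley (2006)
# wildfire variables (values as given in the text; structure is mine).
variables = {
    #  name:       (min, max, 6 * standard deviation)
    "cover":      (5.0, 153.0, 190.0),
    "elevation":  (60.0, 1225.0, 1550.0),
    "stand_age":  (3.0, 60.0, 75.0),
    "severity":   (1.2, 8.2, 9.9),
}

# Ratio of observed range to the range predicted from standard deviations.
ratios = {name: round((hi - lo) / six_sd, 2)
          for name, (lo, hi, six_sd) in variables.items()}

print(ratios)
# {'cover': 0.78, 'elevation': 0.75, 'stand_age': 0.76, 'severity': 0.71}
```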
It is possible that in some cases information about the range of values likely
to be encountered or of conceptual interest can provide a more meaningful basis
for standardizing coefficients than can the sample standard deviations. We can
refer to such a range as the relevant range (Grace and Bollen 2005). For example,
if we have a variable whose values are constrained to fall between 0 and 100,
it would not seem reasonable for the relevant range chosen by the researcher
to exceed this value, regardless of what 6 times the standard deviation equals.
On the other hand, it may be that the researcher has no basis other than the
observed data for selecting a relevant range. Even in such a case, we can choose
to standardize samples that we wish to compare by some common range so as to
clarify meaning across those samples. Whatever the basis for standardization,
researchers should report both the unstandardized coefficients and the metrics
used for standardization.
In this case, we might choose to specify the relevant range for cover to be
from 0 to 270%. Obviously values cannot fall below 0%, but why choose an
upper limit of 270%? Examination of cover values for all plots across the five
years of the study shows that values this high were observed in years 2 and 4 of
the study. By using a relevant range of from 0 to 270, we permit comparisons
across years standardized on a common basis. Of course, this implies that the
slopes measured will extrapolate to that full range, which is an assumption that
should be evaluated closely. For elevation, we might choose the relevant range to
be the observed range, from 60 to 1225 m. This span of 1165 m might be chosen
because we do not wish to extrapolate to lower or higher elevations, in case
relationships to other variables are not robust at those elevations. For stand age,
we could specify the relevant range to be 60 years for basically the same reason.
Finally, the fire index range chosen might also be the observed range, which was
7.0 mm. It is clear that values could be obtained beyond this range in another
fire. It is not known, however, whether the relationship between remaining twig
diameter and herbaceous cover would remain linear outside the observed range.
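Assuming that range standardization simply rescales the unstandardized slope by the ratio of the relevant ranges (analogous to Eq. (3.8), with ranges in place of standard deviations), the severity-to-cover coefficient can be checked in a couple of lines; the function name is mine:

```python
def range_standardize(b_unstd, range_x, range_y):
    """Scale an unstandardized slope by the ratio of relevant ranges,
    analogous to standardizing on standard deviations (Eq. 3.8)."""
    return b_unstd * range_x / range_y

# Fire severity -> cover: unstandardized slope -7.32 (Figure 3.12);
# relevant ranges chosen in the text: severity 7.0 mm, cover 270%.
b_rr = range_standardize(-7.32, 7.0, 270.0)
print(round(b_rr, 2))  # -0.19
```

This matches the statement that cover declines by about 19% of its relevant range as severity traverses its own relevant range.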
Based on these determinations, we can generate path coefficients standard-
ized on the relevant ranges for an example model involving these variables.
[Figure 3.12 diagram: paths among elevation, stand age, fire severity, and plant cover, each labeled with three coefficients; for example, the fire severity → plant cover path carries −7.32 (unstandardized), −0.386 (standardized on standard deviations), and −0.190 (standardized on relevant ranges).]
Figure 3.12. Example model comparing the unstandardized path coefficients
(upper), coefficients standardized based on their standard deviations (middle), and
coefficients standardized based on their relevant ranges (lower).
Figure 3.12 shows the three types of coefficient. The biggest
numeric difference among the values is that, when standardized by the relevant
ranges, the coefficients leading to cover are lower, because of the
large relevant range selected for this variable. The coefficient for the effect of
age on severity is slightly higher, while that for the effect of elevation on age is
unchanged. Using these coefficients now allows us to describe the importance
of variables using their relevant ranges as the explicit context. These interpreta-
tions are only valid for relative comparisons within the parameter space defined
by the relevant ranges. So, we can say that as fire severity increases across its
relevant range, cover would be expected to decline by 19% of its relevant range.
As elevation increases across its relevant range, the total change in cover from
both direct and indirect causes would be an increase of 21.9% (the total effect).
We can now conclude from this analysis that the sensitivities of cover to fire severity
and to elevation (19% versus 21.9%) are roughly equivalent in this study, though
of opposing sign. It is possible to test whether these two estimates differ
reliably; in this case, they do not.
Conclusions
The sophisticated procedures of modern structural equation modeling start with
the basic principles of regression and the rules of path coefficients. When these
rules and principles are combined with an understanding of model structure, the
anatomy and interpretation of structural equation models should become clear
for a wide variety of model types. For those who wish to understand SEM,
fundamental distinctions between standardized and unstandardized variables
are important. Understanding these issues is especially important for those in
the natural sciences where there has been much use of path analysis based on
correlations. In the next chapter we will continue to build on our knowledge of
the capability of SEM by considering latent variables and issues related to their
use and interpretation.
r x1y1 = γ11 + r x1x2 × γ12    (3.1.1)

and

r x2y1 = γ12 + r x1x2 × γ11    (3.1.2)

In this case, our goal is to express the partial path coefficients (which are our
unknown parameters) in terms of the known parameters (the correlations). So,
we must rearrange our equations, yielding

γ11 = r x1y1 − r x1x2 × γ12    (3.1.3)

and

γ12 = r x2y1 − r x1x2 × γ11    (3.1.4)
Figure 3.1.1. Simple multiple regression model.
Substitution of Eq. (3.1.4) into Eq. (3.1.3) allows us to derive the equation for
γ 11 as follows:
γ11 = r x1y1 − r x1x2 × (r x2y1 − r x1x2 × γ11)    (3.1.5)
γ11 = r x1y1 − r x1x2 × r x2y1 + r x1x2² × γ11    (3.1.6)
γ11 − r x1x2² × γ11 = r x1y1 − r x1x2 × r x2y1    (3.1.7)
γ11 × (1 − r x1x2²) = r x1y1 − r x1x2 × r x2y1    (3.1.8)
γ11 = (r x1y1 − r x1x2 × r x2y1) / (1 − r x1x2²)    (3.1.9)
Through similar means we can arrive at our equation for γ 12 ,
γ12 = (r x2y1 − r x1x2 × r x1y1) / (1 − r x1x2²).    (3.1.10)
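As a numerical check, the correlations below are inferred to be consistent with the coefficients shown in Figure 3.1.3 (Table 3.1.1 itself is not reproduced here, so treat these inputs as illustrative); applying Eqs. (3.1.9) and (3.1.10):

```python
# Partial regression coefficients from correlations (Eqs. 3.1.9, 3.1.10).
# Correlations inferred to be consistent with Figure 3.1.3; illustrative.
r_x1x2, r_x1y1, r_x2y1 = 0.40, 0.50, 0.60

g11 = (r_x1y1 - r_x1x2 * r_x2y1) / (1 - r_x1x2 ** 2)
g12 = (r_x2y1 - r_x1x2 * r_x1y1) / (1 - r_x1x2 ** 2)

print(round(g11, 2), round(g12, 2))  # 0.31 0.48
```

These are the path values shown in both panels of Figure 3.1.3.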
While we have illustrated the formulae for partial regression coefficients
using a multiple regression model (Figure 3.1.1), it is important to realize how
this extends to a fully directed path model of the form shown in Figure 3.1.2
(similar to Figure 3.4). Let us see why this is the case.
In Figure 3.1.1, the first rule of path coefficients tells us that the relationship
between x1 and x2 can be represented by their bivariate correlation coefficient.
While the model shown in Figure 3.1.2 may be interpreted differently, it is
similar to the previous model in that the standardized path coefficient from x1
to y1 is also represented by their bivariate correlation coefficient, though this
is based on the second rule of path coefficients (which applies to variables
connected by only one causal path).
Figure 3.1.2. Directed path model.
[Model A (multiple regression): x1 → y1 = 0.31, x2 → y1 = 0.48, with r(x1, x2) = 0.40. Model B (directed path model): x1 → y2 = 0.31, x1 → y1 = 0.40, y1 → y2 = 0.48.]
Figure 3.1.3. Illustration of similarity of path coefficients for multiple regression
model (model A) and directed path model (model B).
To illustrate the similarity between the models in Figure 3.1.1 and 3.1.2, let
us consider data for three variables with the correlations given in Table 3.1.1.
Imagine that in the first case, we use these data to evaluate a multiple regression
model like the one in Figure 3.1.1 and let variable 1 be x1 , variable 2 be x2 , and
variable 3 be y1 . The reader should be able to use the rules of path coefficients
and Eqs. (3.1.9) and (3.1.10) to find the standardized path coefficients for that
model (Figure 3.1.3, model A).
Let us now consider applying our data to a different model. Perhaps we decide
that variable 2 actually depends on variable 1 and that variable 3 depends on
both 1 and 2. Such a decision would have to be based on theoretical reasoning,
as there is nothing about the data that will tell us the causal structure of the
model. This decision would lead us to formulate our model as in Figure 3.1.3.
So, now variable 1 is x1 , variable 2 is y1 , and variable 3 is y2 . We can again use
our rules and equations to arrive at the standardized path coefficients. We now
find that the coefficients are the same for both models. The fundamental reason
for this is that in both cases, variables 1 and 3 are connected by two causal
pathways, thus, the equations for partial regression are invoked in both cases.
However, interpretations of the models are somewhat different. In particular, in
model B y1 is interpreted as being influenced by x1 and has an attending error
variable, while the relationship between x1 and x2 in model A is considered
to be the result of an unanalyzed joint cause (a correlation). A further
difference between the two models is that the total effects of x1 and x2 on y1 in
model A are simply 0.31 and 0.48, whereas in model B the total effect of x1
on y2 is 0.31 plus 0.40 times 0.48, which equals 0.50.
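The contrast in total effects can be expressed in a couple of lines of Python, using the values from Figure 3.1.3:

```python
# Model A (multiple regression): the total effect of x1 on y1 is just its
# direct path, since there is no directed route through x2.
total_A = 0.31

# Model B (directed path model): x1 affects y2 directly (0.31) and
# indirectly through y1 (0.40 * 0.48); rule 7 sums the two.
total_B = 0.31 + 0.40 * 0.48

print(round(total_B, 2))  # 0.5
```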
4

The anatomy of models II: latent variables
Figure 4.1. Symbolic representation of a latent variable, ξ , a single observed
indicator variable, x, and the error variable δ x associated with x.
variable. Most importantly, we presume that the values of the indicator variable
are reasonably well correlated with the true latent variable values. Through this
correlation, our ability to measure x provides us with information about ξ .
A second thing to observe about Figure 4.1 is that the direction of causality
is represented by an arrow from the latent variable to the indicator. Initially
this can be confusing since the values of the observed variables are used to
estimate the values of the latent variable. However, it is important to realize
that conceptually, the latent variable is the “true” variable of interest, and the
indicator is a manifestation of the effects of that true entity. A classic example
of latent variables is the components of human intelligence. It is presumed that
people have abilities that cannot be directly measured. Rather, performance on
tests of various sorts can be used to provide indirect measures of those abilities.
In a similar fashion, we can presume that many concepts represent properties
that we cannot measure directly but that cause observed manifestations. This is
true even when the concepts are not highly abstract.
A third thing to notice about Figure 4.1 is that there exists an error associated
with the observed indicator. In terms of equations, we can represent this situation
with the formula
x = λξ + δx (4.1)
This equation makes it clear that we are stating that the values of x are the result
of the influence of the latent variable ξ , proportional to λ, its effect on x, plus
error, δ x .
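A simple simulation can make Eq. (4.1) concrete. The sketch below (all settings and names are illustrative assumptions, not from the text) generates a standardized latent variable and one indicator, then confirms that the indicator–latent correlation approximates λ:

```python
import math
import random

# Simulate Eq. (4.1): x = lambda * xi + delta. With a standardized latent
# variable and the error scaled so VAR(x) is near 1, the correlation of x
# with xi approximates lambda. Illustrative sketch only.
random.seed(1)
lam = 0.8
n = 20000
xi = [random.gauss(0.0, 1.0) for _ in range(n)]
x = [lam * v + random.gauss(0.0, math.sqrt(1.0 - lam ** 2)) for v in xi]

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / m
    sa = math.sqrt(sum((u - ma) ** 2 for u in a) / m)
    sb = math.sqrt(sum((v - mb) ** 2 for v in b) / m)
    return cov / (sa * sb)

r = corr(x, xi)
# r should be close to lam = 0.8, so the reliability r**2 is near 0.64
print(round(r, 2))
```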
A tangible example of a latent variable and its indicator is represented in
Figure 4.2. Here we declare that what is of interest is a concept referred to as
body size and that we have a single indicator, the mass of individual animals
measured at a single point in time. It is presumed in Figure 4.2 that there is some
measurement error associated with the values of animal mass. By measurement
error, we refer to the degree to which our indicator variable does not correlate
perfectly with our latent variable. We may or may not know something about the
magnitude of that measurement error, but making a distinction between latent
Figure 4.2. Example of a latent variable, Body Size, with a single indicator, ani-
mal mass. The term δ x represents the error associated with our values of animal
mass.
and observed variables is a central feature of the latent variable
model. It allows the model to have explicitly linked theoretical and empirical
content. In the natural sciences, we have traditionally related our theories to
the empirical results without explicitly considering how our measurements map onto our concepts. We theorize about the
effects of body size on survival and then proceed to use some individual and
specific measures of body size and survival to evaluate these theories. We may
spend some time wrestling with the fact that we can measure several aspects
of body size, and they do not all have exactly the same degree of relationship
with our measure(s) of survival. In the end, we are most likely to pick the one
measure of body size that correlates most highly with our measure of survival
in our data set. The resulting statistical model is often one that feels very specific
because we have picked one predictor from several candidates, and also less
like a proper evaluation of a hypothesis than we would like, because we did not
have an initial theory about how our measurements would relate to our con-
cepts. Latent variable modeling seeks to improve upon this situation. It not only
makes explicit the concepts of interest and how we have estimated them, it also
leads to theories that are empirically meaningful. Ecologists have a long history
of theories with vague and nonoperational concepts. Explicitly linking theory
and observation in the same model has the consequence of promoting theory
with greater empirical validity. Thus, making a distinction between observed
and latent variables would seem to have the potential to lead to better science.
The proportion of the variance of an indicator that is in common with the true
underlying variable is referred to as its reliability; the proportion of variance
that is not in common is called the proportional error variance (which is 1.0
minus the reliability). As with all other parameters, reliability and error variance
can be expressed in either standardized or unstandardized metric.
Returning to our example in Figure 4.2, assuming we are willing to accept
body mass as a valid single indicator of body size, the reliability of our indi-
cator is determined by its repeatability. If repeated sets of measurements of
body mass are highly correlated with each other, we are said to have a reliable
measure. From a computational perspective, the reliability of our indicator is
equal to the average correlation between repeated measurements of body mass.
Thus, imagine that we obtain the data represented in Table 4.1. Relevant to
our example, we might imagine a calibration process where at the beginning
of our study we took repeated measurements of body mass to determine the
reliability of our measurement process. Data such as those in Table 4.1 can be
used to calculate the correlations between trials, and then we can also deter-
mine the average correlation. Say, for our example, that the average correlation
between measurement trials is 0.75; we would find that the reliability of our
measurement of body mass is 0.75.
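A hypothetical calibration of the kind just described can be simulated; every number below is an illustrative assumption:

```python
import math
import random

# Hypothetical calibration data: three repeated body-mass trials on the
# same animals; reliability is estimated as the average correlation among
# trials. Simulated assumptions only, not data from the text.
random.seed(7)
n_animals = 500
true_mass = [random.gauss(50.0, 10.0) for _ in range(n_animals)]
trials = [[m + random.gauss(0.0, 5.0) for m in true_mass] for _ in range(3)]

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    m = len(a)
    ma, mb = sum(a) / m, sum(b) / m
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / m
    sa = math.sqrt(sum((u - ma) ** 2 for u in a) / m)
    sb = math.sqrt(sum((v - mb) ** 2 for v in b) / m)
    return cov / (sa * sb)

pairs = [(0, 1), (0, 2), (1, 2)]
avg_r = sum(corr(trials[i], trials[j]) for i, j in pairs) / len(pairs)
# With true-score variance 100 and error variance 25, avg_r should be
# near 100 / 125 = 0.80
print(round(avg_r, 2))
```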
Once we have an estimate of reliability for our indicator, there are several
things we can do with that information. If we have multiple measurements of x
for all dates in our study we may simply prefer to average across the multiple
measurements and use those average values. On the other hand, if we have
multiple measures for only the first time period, but single measurements of
body mass were taken for several additional time intervals, we may wish to
use our estimated reliability to specify the relationship between indicator and
latent variable throughout our model. In general terms, the advantage of such an
approach is that our knowledge about indicator reliability is incorporated into
our model. For single indicator latent variables, the path coefficient between
latent and indicator is defined by the ninth rule of path coefficients. This rule
states that the standardized path coefficient between a latent variable and its
indicator is the square root of the specified reliability of the indicator. The
specification of that reliability takes place outside the model, either through the
default assumption of a perfect correlation (in which case the coefficient has
a value of 1.0), or through the inclusion of some estimated reliability of our
indicator (which is some value less than 1.0).
Now, as stated earlier, there is a direct relationship between reliability and
measurement error. While reliability represents the proportion of variance in
common between indicator and latent variables, the measurement error rep-
resents the quantity of variance in the indicator that is estimated to be error.
Absolute error variance in x can be related to its reliability using the formula:
VAR(δx) = (1 − λx²) × VARx    (4.2)
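The ninth rule and Eq. (4.2) can be combined in a short sketch, using the reliability of 0.75 from the body-mass example:

```python
import math

# Ninth rule of path coefficients: for a single-indicator latent variable,
# the standardized loading is the square root of the specified reliability.
reliability = 0.75          # e.g., average correlation among repeated trials
loading = math.sqrt(reliability)

# Eq. (4.2): error variance = (1 - loading**2) * VAR(x).
var_x = 1.0                 # standardized indicator, so VAR(x) = 1
error_variance = (1 - loading ** 2) * var_x

print(round(loading, 3))         # 0.866
print(round(error_variance, 2))  # 0.25
```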
Figure 4.4. Illustration of three types of model. The upper figure illustrates a
simple conceptual model, the middle figure illustrates a related model involving
observed variables, while the lower figure illustrates a latent variable model com-
bining the conceptual and observed variables.
differences in priorities. In the rest of this chapter, I will first elaborate on path-
focused applications and then discuss factor-focused applications. I will end by
extending a factor-focused application into a hybrid model analysis.
Path-focused applications
A survey of the published applications of SEM (including historic path analysis)
in the natural sciences demonstrates that the great majority have been focused
on path relationships. This holds true for models containing latent variables as
well as for the widespread use of observed variable models, which by design
are exclusively path-focused. Only a handful of SEM applications with a strong
emphasis on understanding factors have been published in the natural sciences,
although this may be expected to change as more researchers become familiar
with the technique.
We continue our discussion of latent variables by considering how they can
contribute to path-focused applications. Let us now imagine the situation where
our interest in animal body size is with reference to its role in influencing an
animal’s territory size (Figure 4.4). Our conceptual model of the system at hand,
whether explicitly stated or not, defines the context of our model by representing
the question being addressed. In this case, the question is, how does the size of
The anatomy of models II: latent variables 85
[Figure 4.5 appears here. Panel A: scatterplot of y against x. Panel B: observed variable model x → y with a standardized coefficient of 0.60 (unstandardized, 4.84) and a standardized prediction error of 0.64. Panel C: latent variable version with a residual error of 0.52.]
Figure 4.5. Figures used to illustrate important points about measurement error. Path coefficients in parentheses are unstandardized values. Other values are standardized. A. Representation of a raw regression of y on x. B. Representation of the regression in A in diagrammatic form, showing the standardized and unstandardized slopes as well as the standardized error of prediction in y, which is 0.64. C. Latent variable regression model in which 25% of the variance in x is assumed to be measurement error.
an individual animal’s territory relate to the size of its body? Also in Figure 4.4,
is the observed variable model which might represent this question if we simply
used our available observed variables, and made no explicit statement about
the concepts they represent. Comparing our observed variable model to our
conceptual model helps to make the points that (1) observed variable models
are typically more specific and less theoretical than conceptual models, and (2)
there is often an unstated presumption that the observed variables used represent
the concepts of interest, although this presumption is not addressed explicitly
in this simple case. In latent variable modeling, we combine the conceptual and
observable into a model that includes both. As a result, our latent variable model
represents a more sophisticated model than either our conceptual or observed
variable models by themselves.
Imagine the case where two variables, x and y, are measured with random
sampling and we believe that x has a causal influence on y. Let us further imagine
that there is some error in the measurement of x, referred to as δx, such that values
of x obtained are not exactly the same as the true values of x. Such a situation
would seem to be very common in the natural sciences, and thus may be viewed
as a rather normal occurrence. Figure 4.5A illustrates our usual visualization
of the raw relationship between x and y. Simple linear regression reveals a
standardized path coefficient between x and y of 0.60 using least squares, which
is shown diagrammatically as an observed variable model in Figure 4.5B. Now,
let us further imagine that the reliability of our measure of x is equal to 0.75 (as
in the example in Figure 4.3). One implication of measurement error in x is that
some of the error of prediction of y, represented as the residual error variance,
is actually not prediction error, but instead, measurement error. In other words,
measurement error in x gets lumped in with the prediction error of y, leading to
an underestimation of the strength of the true relationship between x and y and
a downward bias of the path coefficient.
Given that we are aware of a less than perfect reliability for x, we can
represent our regression model using latent variables (Figure 4.5C). Here, for
simplicity, we assume no measurement error for y. In the latent variable model,
ξx represents the true values of x, which are assumed to be error free by definition, while ηy represents the true values of y. The path from ξx to x is specified by the square root of the reliability, 0.87, and the path from ηy to y is set to 1.0. In this fashion, the path from ξx to ηy represents the true effect of x on y. Since
we specify a reliability for x of 0.75, the error variance for x can be calculated
using Eq. (4.2), yielding a value of 0.055 (given VARx = 0.22). In Figure 4.5,
the standardized error variance is shown, which is 0.25 (1 – reliability). What
difference does all this make to our estimation of the relationship between x
and y? The change in the estimated effect of x on y in standardized terms can
be anticipated by the formula
b = γ × λ (4.3)
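To make the attenuation formula concrete, the observed standardized slope of 0.60 and reliability of 0.75 from this example can be plugged in; dividing by the loading recovers the disattenuated effect. A small illustrative Python sketch (the function name is my own):

```python
import math

def disattenuated_effect(observed_b, reliability_x):
    """Invert Eq. (4.3), b = gamma * lambda: given the observed
    standardized slope and the reliability of x, recover the true
    standardized effect gamma."""
    lam = math.sqrt(reliability_x)  # the loading, by the ninth rule
    return observed_b / lam

print(round(disattenuated_effect(0.60, 0.75), 2))  # 0.69
```

The corrected value, about 0.69, is larger than the observed 0.60, reflecting the downward bias that measurement error in x imposes on the path coefficient.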
involving a single predictor and a single response variable belie the complex
effects that measurement error can have on structural equation models (or other
kinds of statistical model for that matter). In more complex models, a wide
variety of outcomes is possible when measurement error is substantial. The
varieties of consequences are such that no single example can provide a repre-
sentative illustration of the kinds of effects that can be seen. Certain generaliza-
tions do apply, however. First, when there is measurement error in exogenous
variables, there is generally a downward bias in path coefficients from exoge-
nous to endogenous variables. Exceptions to this pattern can be found in models
with multiple, correlated endogenous variables. Here, the effects of measure-
ment error are quite unpredictable, with a wide variety of consequences being
documented. Secondly, when endogenous variables contain measurement error,
the explanatory power of the model is underestimated and standardized coef-
ficients are biased. Thirdly, when both exogenous and endogenous variables
possess measurement error, a wide variety of effects are possible.
Aside from the general consequences of measurement error on model param-
eters, there can be important effects on model fit and the significance of paths.
For a given data set, ignoring or addressing measurement error can shift infer-
ence from one model to another. The greater the amount of measurement error,
the greater the differences between models. Bollen (1989, chapter 5) provides
a detailed consideration of some of the effects measurement error can have in
structural equation models. There can be little doubt that measurement error has
important consequences for all statistical models. Structural equation modeling
and the use of latent variables provide a means of correcting for measurement
error, at least in a fashion and to a degree. This capability provides our second
reason for using latent variables, to account for measurement error so as to
reduce its influence on parameter estimates.
elaborate on the use of multiple indicators in latent variable models for cases
where this is possible and feasible.
I offer three pieces of advice for those specifying reliability and measurement
error for single indicators. First, we must keep in mind that reliability is a
sample-specific property. In other words, since reliability is estimated as a cor-
relation between replicate samples, the value of that correlation will be strongly
related to the range of conditions over which calibration has taken place. If, for
example, we are interested in the reliability of body mass values, the range of
animals included in the calibration sample should match the range that applies
to the data being modeled. Taking a limited or nonrepresentative calibration
sample will not provide an appropriate estimate of reliability for our model.
Secondly, there will certainly be plenty of cases in the biological sciences where
we are working with reliable indicators. When the researcher is reasonably con-
fident that repeated measurements would yield an extremely high correlation
between trials, they are fully justified in specifying perfect reliability or ignoring
the need to specify measurement error. Thirdly, we should not be complacent
about specifying high levels of measurement error in our models. The reality
is that when we provide a single indicator and specify that it is not very reli-
able, we will often find that stable solutions for model parameters are difficult
to achieve (for reasons that will be discussed in more detail in Chapter 5).
To suggest some more specific guidance on this matter, Kline (2005, page 59)
offers the following: “. . . reliability coefficients around 0.90 are considered
‘excellent’, values around 0.80 are ‘very good’, and values around 0.70 are
‘adequate’.” Those below 0.50 indicate that at least one-half of the observed
variance may be due to random error, and such unreliable measures should be
avoided.
When reliabilities are low, it would be far better for us to use that information
to try and improve the reliability of our estimate rather than to take the easy
route of “solving” the problem through error specification. Despite the cautions
offered here, we cannot deny that ignoring measurement error is difficult to
justify. Single-indicator latent variable models allow one to explicitly address
the problem of measurement error in path-focused applications.
Illustration
To further build our understanding of latent variables, let us imagine that we
have in our possession two indicators of body size for individuals of a bird
species, body mass and wing length. What happens if we decide to use these
two measures as multiple indicators of the latent factor, body size? Figure 4.7
represents this situation.
We can represent Figure 4.7 using two equations
x1 = λ1ξ + δx1 (4.4)
x2 = λ2ξ + δx2 (4.5)
[Figure 4.7 appears here: latent variable Body Size with indicators body mass (loading λ1, error δx1) and wing length (loading λ2, error δx2).]
Figure 4.7. Representation of a two-indicator model of Body Size.
only have to estimate the λs. We must still make a simplifying assumption to
accomplish this. Resorting to standardized variables for ease of discussion once
again, we have the situation where we have one known piece of information,
r12, the correlation between x1 and x2, and two unknowns, λ1 and λ2. Given that we
have only two indicators for our latent variable, we generally have no reason to
suspect that one indicator is better than the other in the sense that it correlates
more closely with the latent than does the other indicator. Thus, it is logical to
set the standardized values of λ1 and λ2 to be equal. Now, we have reduced the
estimation problem to one known, r12, and one unknown, λ. All that is left is
to derive λ from r12 .
In Chapter 3 we encountered the statistical principles that relate to two
non-interacting variables under joint causal control. Here we have that same
situation, the only difference being that the causal control is being exerted
by a variable that we have not been able to measure directly, ξ . The model
represented by Eqs. (4.4) and (4.5) (and exemplified in Figure 4.7) is one in
which x1 and x2 are correlated solely because of the joint influences of ξ . In
this model there is an explicit assumption that x1 and x2 do not interact directly.
Therefore, we should be able to calculate the loadings that would result in a
correlation between x1 and x2 of r12 . As we may recall from Chapter 3, the
correlation between two variables is equal to the sum of all paths connecting
them, including both causal and noncausal paths (this is our eighth rule of path
coefficients). In this case, the only path connecting x1 and x2 is the indirect one
through ξ. The strength of that path is the product of the two segments, λ × λ = λ², which helps us to understand why the correlation between multiple indicators equals the reliability. Since r12 equals λ², it holds that
λ = √r12 (4.6)
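The derivation above can be sketched as a few lines of Python (illustrative only; the guard clause reflects the requirement, noted later in this chapter, that two indicators be scaled to correlate positively before their loadings are set equal):

```python
import math

def equal_loading(r12):
    """Eq. (4.6): with two indicators constrained to equal standardized
    loadings, each loading is the square root of their correlation."""
    if r12 <= 0:
        raise ValueError("indicators must be scaled to correlate positively")
    return math.sqrt(r12)

# Two indicators correlated at 0.75 imply loadings of about 0.87
print(round(equal_loading(0.75), 2))  # 0.87
```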
[Figure 4.8 appears here. In both panels the path from Body Size to Terr. Size is 0.69, with a residual error of 0.52 for Terr. Size. In A, body mass is the single indicator of Body Size; in B, body mass and wing length each load at 0.86 with error variances of 0.25, and singing range loads at 1.0.]
Figure 4.8. A. Illustration of model having a single indicator of body size with a reliability of 0.75. B. Model containing two indicators of body size with reliabilities of 0.75. These models represent two equivalent means of representing measurement error, either by specification or by the use of multiple indicators. Terr. Size refers to the latent variable territory size and singing range is its single indicator.
ρxixi = λj² / (λj² + εj) (4.7)
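Eq. (4.7) can be checked against the standardized values in Figure 4.8B, where each body size indicator loads at 0.86 with an error variance of 0.25. An illustrative computation (the function name is my own):

```python
def indicator_reliability(lam, err_var):
    """Eq. (4.7): reliability of an indicator equals its squared loading
    divided by the squared loading plus its error variance."""
    return lam ** 2 / (lam ** 2 + err_var)

# Standardized values from Figure 4.8B
print(round(indicator_reliability(0.86, 0.25), 2))  # 0.75
```

The result, about 0.75, recovers the reliability specified for those indicators.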
[Figure 4.9 appears here: a third indicator is included, with body mass loading at 0.86 (error variance 0.25) and beak length at 0.78 (error variance 0.39).]
Figure 4.9. Illustration of model of body size influences on territory size having three indicators of body size.
our simple tracing rules from Chapter 3. Here I illustrate this process using a
model having three indicators.
[Figure 4.10 appears here; its panels include multi-method and multi-property indicator arrangements.]
different attributes. It may be worth pointing out a few other ways multiple
indicators can be combined. Figure 4.10 shows four different examples. For
simplicity, I only show two or three indicators for each. As described below,
when feasible, more indicators are preferred. As for the examples, all these
represent ways in which the generality of our latent variable is enhanced. We
often select a particular method or technique of measurement, whether it is
for soil analysis, water chemistry, or vegetation assessment, for economy and
simplicity. When we use a single method, however, our results are method
dependent. A solution to this problem is to employ more than one method to
measure the attribute. Here I show two different procedures for measuring soil
organic content, soil carbon determination (e.g., with a CHN analyzer), and
loss on ignition.
We have already considered one example of a multi-property (or multi-
attribute) model when dealing with body size. Multi-property (also known as
multi-trait) models are often of interest when we wish to test general theories.
The key question we have to address is whether our traits represent a unidimen-
sional or multidimensional concept. Simply put, if they correlate consistently
and strongly, then they behave as two indicators of a single trait.
Frequently, we wish to generalize over time. When we take a single measure-
ment in time, we are often hoping, or presuming, that it generalizes over time
or correlates with conditions at the critical time in which consequences ensued.
Our earlier discussion of territory size implied that it is a static thing. It is rea-
sonable to assume that for many animals, territory size is rather dynamic. One
way to capture this is simply to average the values found in censuses, another is
to use the repeated samplings as multiple indicators (note, sometimes repeated
measures data require the inclusion of correlated error variables because his-
torical conditions cause errors to be nonindependent). The main advantage of
a multi-indicator approach compared to simply averaging the values, is that
it affords us the ability to generalize to other studies using individual census
survey data. If we were to use an average of, say three censuses, our results
would only generalize to other cases where we again average three censuses.
One other example of the use for multi-indicator latent variables is for multi-
sample data that is not explicitly repeated measures, but instead, designed to
permit generalization of a different sort. Perhaps we have the case where two
observer planes fly different survey lines searching for herds of caribou. Aside
from the differences in areas sampled, there are different teams of observers
as well. Because there are different teams of observers, we may not want to
simply add the numbers together, but rather, we can use the data as multiple
indicators. Many variations of such approaches can be used when employing
latent variables with multiple indicators in path-focused models.
sight of the fact that the observed variables are response variables and only the
latent variables represent true causes. Ecologists and other natural scientists
frequently propose general hypotheses, such as “body size influences territory
size”. Rarely do we subject these to rigorous evaluation. Latent variable models
hold the promise of permitting such evaluations of general ideas.
Along with the promise held by latent variable models is the need for a
healthy dose of caution, as well as a fairly good fundamental understanding of
the underlying theory and calculations. We have walked through some of the
most basic concepts and will move to another perspective in the next section.
Before moving on to a more explicit consideration of latent variables, there are
a few recommendations and cautions that apply in the context of path-focused
models. The reason I include these thoughts about latent variable models here
rather than later is because, at the moment, the great majority of ecological
applications are path-focused.
Recommendations
It is recommended that when possible, three or more indicators should be used
to represent a latent variable. There are a few reasons why this piece of advice
is frequently given in the SEM literature. First, there are sometimes problems
with achieving unique parameter estimates when latent variables have only
two indicators. This topic was addressed superficially earlier in the chapter,
and will be addressed more thoroughly in Chapter 5. For now, let us just say
that there are some difficulties that can arise when solving models, and having
three indicators provides more information with which to work. I should also
add, however, that when such problems arise, there are simplifications, such as
specifying equal loadings for two-indicator latent variables, which can make
model identification possible. One additional point that can be valuable is to
make sure that when you have two indicators of a latent variable they are scaled
so as to be positively correlated, otherwise it is not possible to set the loadings
as equal.
A second reason to include three or more indicators is to determine more
objectively the degree to which the indicators exhibit commonality. When only
two indicators are used, as long as the correlation between them is decent
(greater than 0.7), they are likely to fit and they will probably contribute equally
to the latent variable. When there are three or more indicators, we can assess not
only the degree of correlation among indicators, but the equality of correlations.
When two of three indicators are highly correlated with each other, but much
less well correlated with the third, that is a warning sign that they may not form
a stable latent variable. The consideration of a number of indicators affords
Cautions
Individual latent variables are said to behave as single causes. From a philo-
sophical standpoint, we must recognize that all single causes can be broken
down into parts, ultimately to the quantum level. For this reason, we should
understand that a latent variable is, to a certain degree, a convenience. Con-
cepts (and latent variables) emphasize the common attributes of a particular set
of items, while de-emphasizing their unique attributes. Ultimately, the value
of this convenience is judged by its utility. Perhaps it is Bollen (1989, page
180) who summed it up best: “Do concepts really exist? Concepts have the
same reality or lack of reality as other ideas. They are created by people who
believe that some phenomena have something in common. The concept iden-
tifies that thing or things held in common. Latent variables are frequently used
to represent unmeasured concepts of interest in measurement models.”
Figure 4.11 attempts to further clarify the unidimensional nature of a latent
variable and how this influences its role in path relations. Simply put, ξ only
represents the information in common among x1 –x3 , it does not represent their
unique information. Stated in another way, the model in Figure 4.11 does not
represent the effects of x1 –x3 on y. It represents the effects of the unmea-
sured factor that makes x1 –x3 correlated with y. Furthermore, even if ξ is the
unmeasured cause of joint correlation in x1 –x3 , if x1 –x3 are not well correlated
(i.e., unless ξ exerts strong influence on the values of x1 –x3 ), ξ will not behave
as a single latent cause, and we may obtain complex model relationships of the
sort discussed later (Figure 4.19). Generally, correlations among a set of indi-
cators need to exceed about 0.7 and be roughly equal, for them to adequately
estimate a latent variable using covariance procedures.
One additional caution to consider when dealing with latent variables has to
do with the names selected. The naming fallacy refers to the fact that putting a
name on a latent variable does not necessarily guarantee that we have measured
the concept embodied by the name. If we claim we are studying effects of body
[Figure 4.11 appears here: indicators x1–x3 (with errors δ1–δ3) loading on ξ, which predicts y with residual ζ.]
Figure 4.11. Venn diagrammatic representation of shared variance among indica-
tors. Each indicator, x1 –x3 , shares some variance with the others (represented by
vertical hatching), and the degree of overlap between them represents their corre-
lations. Only the variance shared by all three variables is linked to the unmeasured
latent variable. Error variances for each variable can be thought of as that part of
the variance not in the hatched portion. This unshared variance includes both the
true error and the unique information contained in the indicator.
size, but we only have a single measure, we have done little to bolster our claim.
The solution to this potential problem is a dedicated effort to demonstrate the
generality of the factor being estimated, which can be accomplished by studying
the relationships among a large set of related indicators. So, if we show that
a latent variable composed of numerous indicators reasonably related to size
does indeed evidence unidimensionality, we have greatly strengthened our case.
It is also worthwhile to be aware that in path-focused models, the predictive
context can subtly influence the meaning of a latent variable. When we have
a model that relates body size to territory size, such as in Figure 4.9, there are
two different ways we can describe what that model is about. First, we might
interpret this model as addressing the question, what is the effect of “body size”
on “territory size”? This is literally what the model states. This statement also
implies that the concept “body size” is independent of the other latent variables
in the model. However, there is a tendency for the researcher to select particular
indicators of body size that most strongly correlate with indicators of territory
size. When we do this, our latent variable “body size” is better described as
“the attributes of body size that influence territory size”. Since a single word or
phrase is always a shorthand statement for a more complex idea, it behooves
us to describe carefully the meaning of our latent variable and the context in
which it is to be interpreted. This is not a problem unique to SEM applications.
Rather, the unique thing about SEM is that we give an explicit and operational
statement of the meaning of our latent variable by the indicators we use to
represent it. As stated earlier, this is one of the many strengths of SEM.
When we dig into it, we find that there is much to be learned from a dedicated
study of correlated indicators. This aspect of quantitative science has been
somewhat ignored in the natural sciences. There is much to be gained by an
explicit consideration of latent factors and their relationships to indicators. This
is where the topic of factor-focused analyses comes in, and is the subject of our
next major section.
Factor-focused applications
Some concepts related to factor-focused applications
Factor-focused applications have as their background a considerable literature
dealing with factor analysis. In factor analysis there is interest in identifying
some smaller set of forces (unmeasured factors) that can explain the correla-
tions among a set of observed indicators. Through history there have been a
number of approaches proposed for studying factors, including both empirical
methods (e.g., principal components analysis, also known as component fac-
tor analysis), and exploratory methods (such as exploratory factor analysis). In
SEM we consider factor-focused applications that fall within the tradition of a
confirmatory (i.e., theory-driven), common-factor approach.
Our presentation skips over quite a bit of background that may be of interest
to the beginning practitioner of SEM. For those who wish to test hypotheses
involving models with multiple indicators, a working knowledge of exploratory
factor analysis (as well as principal components analysis) is valuable, though
beyond the scope of our presentation. A good reference providing a relevant
and comprehensive treatment of this subject for those working in the natural
sciences is Reyment and Jöreskog (1996).
Figure 4.12 provides us with an overall comparison of principal components
analysis (PCA), exploratory factor analysis (EFA), and confirmatory factor
analysis (CFA) (as is incorporated in SEM). I need to point out that historically
there has been some confusion over the term “factor analysis”, which in the
past included both principal components analysis (also known as component
factor analysis), and what I am referring to here as exploratory factor analysis
(also known as common factor analysis).
The reader is perhaps already familiar with PCA, since it is commonly used
in the natural sciences. In PCA it is assumed that all observed variables are
measured without error. Thus, in our representation of a PCA model, no errors
are shown for the xs. The goal of PCA is to create predictors that represent the
correlations among measured variables, using a reduced set of variables. This
[Figure 4.12 appears here: the same six indicators, x1–x6, are shown grouped under components C1–C2 (PCA), factors F1–F2 (EFA), and latent variables LV1–LV2 (CFA).]
Figure 4.12. Graphical comparison of principal components analysis (PCA), exploratory factor analysis (EFA), and confirmatory factor analysis (CFA).
xi = λijξj + δi (4.8)
where λij represents the loadings and δi the errors. We should be aware that the error term δi is actually composed of two parts. This can be illustrated by decomposing the variance of xi as follows:

VARxi = σc² + σδ², with σδ² = σs² + σe²
[Figure 4.13 appears here: indicators x1–x3 load on ξ1 (loadings λ11, λ21, λ31) and x4–x6 load on ξ2 (loadings λ42, λ52, λ62), with errors δ1–δ6 and factor covariance Φ12.]
Figure 4.13. Diagrammatic representation of a factor analysis model. The term δi represents error for the indicator variables (xi), λij represents the loading of factor ξj on indicator xi, and Φ12 represents the covariance between the factors.
where
σc² is the common variance or communality, which is the variance shared among indicators of a factor ξ,
σδ² is the residual variance or uniqueness, which is the variance of x unaccounted for by the factor,
σs² is the specific variance or specificity, which represents the variance specific to a variable that is not true error, and
σe² is the true error variance, which is solely due to measurement error and which can be estimated using repeated trials.
In practice, specific variance and true error variance are not distinguished and
these two components are combined in the error variance variables, δi.
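The decomposition just described can be sketched with hypothetical standardized numbers (the 0.60/0.30/0.10 split below is invented purely for illustration):

```python
def decompose_variance(common, specific, true_error):
    """Split an indicator's variance into communality and uniqueness.
    Specific variance and true error are combined into the uniqueness
    (the delta term), as is done in practice."""
    uniqueness = specific + true_error
    return {"communality": common,
            "uniqueness": uniqueness,
            "total": common + uniqueness}

parts = decompose_variance(0.60, 0.30, 0.10)
print(parts["uniqueness"])  # 0.4
```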
discussion on latent variables. Here I use data sets dealing with intercorrelated
soil properties to demonstrate how correlations can be understood in terms of
latent soil properties. I should emphasize that the results presented in these
analyses do not represent a thorough treatment of the subject, but serve as an
example.
[Figure 4.14 appears here: soil indicators elev, Ca, Mg, Mn, Zn, K, P, pH, C, and N loading on three latent factors, ELEV, MINRL, and HYDR.]
Figure 4.14. Measurement model relating prairie soil variables to latent factors (modified from McCune and Grace 2002). ELEV refers to elevational influences, MINRL to soil mineral influences, and HYDR to hydric influences. All coefficients presented are standardized values.
[Figure 4.15 appears here: indicators Ca, Mg, Zn, K, pH, and org loading on the correlated latent factors MINRL and HYDR.]
Figure 4.15. Measurement model of prairie soil conditions (derived from Weiher et al. 2004). MINRL refers to mineral influences while HYDR refers to hydric influences.
Table 4.3. Correlations and standard deviations for prairie soil characteristics
(from Grace et al. 2000). N = 105. Correlations shown in bold are significant
at the 0.05 level
Ca Mg Zn K pH C Biomass
Ca 1.0
Mg 0.790 1.0
Zn 0.468 0.450 1.0
K 0.400 0.350 0.687 1.0
pH 0.284 0.309 −0.054 −0.224 1.0
C 0.039 0.145 0.491 0.550 −0.410 1.0
Biomass 0.288 0.137 0.279 0.163 −0.127 0.120 1.0
Std. dev. 1.58 0.447 0.746 1.437 0.271 2.162 1.350
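As a rough, back-of-envelope check on Table 4.3: if Ca and Mg alone were treated as two indicators of a single mineral factor, the two-indicator simplification of Eq. (4.6) would give each a loading of about 0.89. This is only a hand calculation, not the fitted multi-indicator result shown in Figure 4.16:

```python
import math

r_ca_mg = 0.790  # correlation between Ca and Mg from Table 4.3

# Equal-loading, two-indicator simplification (Eq. 4.6)
print(round(math.sqrt(r_ca_mg), 2))  # 0.89
```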
[Figure 4.16 appears here: indicators Ca, Mg, Zn, K, pH, and C loading on the correlated latent factors MINRL and HYDR.]
Figure 4.16. Measurement model of coastal prairie soil from analysis of data in Table 4.3. All values are standardized coefficients.
because several of the variables were not expressed on a common basis, or were
not measured using the same technique, making unstandardized coefficients
incomparable and a rigorous evaluation not possible. Thus, our comparison
between Figures 4.15 and 4.16 can only be rather crude in this case, because of
differences in the data between the studies.
Several of the specifics differ between models (Figure 4.15 versus 4.16).
Based on a superficial examination, (1) the correlation between factors is weaker
in the Mississippi data, (2) the correlated errors differ between models, and (3)
some of the path coefficients are rather different. One’s satisfaction with the
similarities between the two sets of results could easily vary depending on
the objectives of the investigator and the state of their knowledge. This is the
first such comparative analysis of soil intercorrelations that I have seen, and
the similarity between the two sets of results was substantially greater than
expected given the differences between the two ecosystems and the differences
in sampling. The general similarity in results is encouraging for those wishing
to compare the effects of soil conditions on plant communities across systems.
Substantially further work will be required, however, if we are to determine
whether general factor models exist for a range of soils.
Hybrid models
Modern SEM practice and most treatments of the subject are based on the
premise that analyses are fully focused – as much about the measurement model
as about the path relations. This is partly due to the fact that in the social sciences,
where the dedication to SEM has been the strongest, it has long been recognized
that there is a compelling case for latent variables. In addition, there is also an
abundance of data, such as survey data, that are well suited to factor-focused
investigation. Another reason that the tradition in SEM is for fully focused
analyses using hybrid models is that (1) experience has taught us that latent
variables are valuable, and (2) a thorough understanding of latent variables
requires factor analytic studies.
The situation in the natural sciences is different. The application of SEM to
natural systems is in its infancy. The work has not yet been done that results
in a fundamental merger of path and factor traditions. This is one reason that
the presentation of ideas in this chapter is structured as it is. Most textbooks
on SEM first discuss factor analysis and only then bring latent variables into
path models. Here I have built from strongly path-focused models, starting with
single-indicator latents, to models containing multiple indicators.
The ultimate goal of SEM is to work with models that match both the
particular and general features of the situation. In this section, I will tran-
sition from the results from our example of a soil factor model to a model
that seeks to integrate both path and factor foci. I will then follow that up
with a more general discussion of the complexities that may arise in hybrid
models.
[Figure: path diagram with observed indicators Ca, Mg, Zn, K, pH, and C, latent variables MINRL and HYDR, and the response BIOM.]
Figure 4.17. Hypothesized hybrid model relating soil latent factors to plant com-
munity biomass.
Results of fitting this model indicate that the model has gone from adequate to inadequate when the response
variable biomass is included. Thus, our data do not match the expectations that
can be derived from the model and we must conclude that the model is not the
correct one for the situation. How can this be when the measurement model has
been shown to be adequate in separate analyses and all possible paths among
latent variables are included?
The answer to our dilemma can be found by looking for specific effects of
individual indicators on biomass. Such specific effects are often overlooked,
in part because most SEM software packages are not configured to alert the
researcher to specific effects. In fact, for most software packages (excluding
Mplus) we must specify a different kind of model to even include specific
effects from observed indicators on biomass. In this case, when our model is
appropriately configured, we are able to determine that there is a specific effect
of soil calcium on biomass that is in addition to the general effects of our latent
soil variables. With this reformulated model, we can approximate a satisfactory
solution, which is shown in Figure 4.18.
What is conspicuous from the results of this analysis is that calcium, by
itself, provides an adequate explanation for mineral influences on community
biomass. Thus, while we were successful in estimating latent variables that
explain the correlation structure among indicators, these latent variables do not
necessarily represent the effects of soil conditions on community biomass. I
cannot overemphasize the importance of this result to those wishing to employ
latent variables to understand path relationships. Just because there is evi-
dence for latent factors causing a suite of variables to correlate, that does not
[Figure 4.18: fitted hybrid model with standardized path coefficients, including a direct path from Ca to BIOM (R² = 0.124).]
necessarily mean that these latent factors are the causal agents affecting other
variables in the model.
[Figure: panels A and B contrasting hybrid model structures; panel A shows exogenous indicators x1–x3 loading on a latent ξ, which predicts a latent η with indicators y1–y3; panel B shows a more elaborate structure involving multiple latent variables.]
Conclusions
Addressing measurement issues can be valuable, both for increasing the accu-
racy of our analyses of relationships among concepts, and in the discovery or
validation of general factors that explain correlated effects. Structural equation
modeling provides us with an opportunity to correct for sampling and other
forms of nonreliability in our measurements. Specification of reliabilities is
generally easy to accomplish and valuable, even if our models only have sin-
gle indicators for estimation of our concepts. The use of multiple indicators to
specify general latent variables helps to add generality to our models. At the
same time, detailed factor-focused analyses enhance our ability to deepen our
understanding of multi-indicator latent variables. Often ecologists have appro-
priate data for the development of latent variable models, but have lacked an
understanding of measurement theory and the characteristics of latent variables.
When we say that an animal’s size influences the size of its territory, what do
we mean? Most of us have assumed all along that general concepts must be
quantified with specific attributes, unaware that a general concept with robust
statistical properties and theoretical components might be estimated and made
tangible.
For those of us in the biological and ecological sciences, there is a need
to consider carefully the value and procedures involved with using latent vari-
ables. Typically our theoretical knowledge is concentrated in the inner model
and largely absent from our measurement model. Thus, the adoption of latent
variables in our models requires that we become familiar with the implications
of relating latent concepts to one another. Ultimately latent variables represent
a potentially evolutionary step in the maturation of our theories. For this rea-
son, latent variables inspire us to seek more general models that match general
theories.
5
Principles of estimation and model assessment
Introduction
In previous chapters, I have presented structural equation model parameters,
such as path coefficients, and we have considered their interpretation. Regres-
sion equations have been presented to help the reader understand the meaning
of parameters and how they are commonly expressed. However, we have not
yet considered the important question of how parameter values are estimated.
Since we are dealing with the solution of complex multi-equation systems,
this is not a minor matter.
Historically, path models were solved primarily by the use of the familiar
technique of least squares regression (also referred to as ordinary least squares,
OLS). Today, most applications of SEM rely on model fitting programs that
offer a variety of options for estimation methods, as well as many other sup-
porting features. In this chapter our emphasis will be on maximum likelihood
estimation, both because of its central role in the synthetic development of mod-
ern SEM, and because it provides a means of solving nonrecursive and latent
variable models.
Another important issue that is related to the topic of estimation has to do
with the assessment of model fit. One of the most powerful features of SEM
is that techniques exist for comparing the observed relations in data to those
expected based on the structure of the model and the estimated parameters. The
degree to which the data match the model-derived expectations provides us
with the capability of evaluating the overall suitability of a model. This means
we have the capacity to reject models as inadequate and to compare models in
direct tests of alternative hypotheses. While not always a simple matter, model
evaluation is fundamental to the process of SEM and integral to arriving at
suitable parameter estimates. In this chapter I will consider basic principles
associated with estimation and model evaluation. In Chapter 8, illustrations of
model evaluation will be given, along with additional information on some of the
indices of model fit. However, before we can consider the topics of estimation
method and assessment of model fit, we must first contend with a basic issue
that underlies the estimation process: identification.
Identification
We say that a model is identified if it is possible to derive a unique estimate of
each of its parameters. Similarly, we may ask whether an individual parameter
is identifiable in a model, as this specific attribute relates to the overall question
of model identification. The problem in practice is that for some types of models
or for certain data sets, it is not possible to achieve this property of identification
and, as a result, the model cannot be solved for its parameters. Before going
too far into the complexities of this topic, let us begin with what should be a
familiar example, the solution of multi-equation systems using algebra.
Consider that we have two algebraic relations,
a+b = 8
2a + b = 14
In this case, we can readily find the solution by solving the top equation for b,
(b = 8 − a), and substituting that expression into the bottom equation, which
leads us to the conclusion that a = 6. Substituting the solution for a from the sec-
ond equation back into the top equation allows us to derive a solution of b = 2.
By extension, it should be clear that when we only know one of the pieces of
information, say a + b = 8, we have a problem with one known value (8) and
two unknowns (a and b). There are any number of solutions that will satisfy this
single equation, such as 1 + 7, or 2 + 6, or 3 + 5, etc. Thus, we cannot achieve
unique solutions and our multi-equation system is not identified. This case is
formally described as an underidentified model.
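The algebra above can be verified numerically. The following sketch (variable names are mine, not from the text) solves the identified two-equation system with linear algebra; the underidentified case corresponds to a coefficient matrix with fewer equations than unknowns, for which no unique inverse exists.

```python
import numpy as np

# The identified system: a + b = 8 and 2a + b = 14, in matrix form.
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
y = np.array([8.0, 14.0])
a, b = np.linalg.solve(A, y)
print(a, b)  # a = 6, b = 2, the unique solution

# With only a + b = 8 we would have one equation and two unknowns;
# no unique solution exists, mirroring an underidentified model.
```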
The t-rule
From this example we are able to recognize the most basic requirement for
solving structural equation models, which is also known as the t-rule, where
t, the number of parameters to be estimated, must be less than or equal to
the number of known values. When considering the analysis of covariance
structures, as we are in SEM, we can visualize our known information as the
matrix of variances and covariances. For example, if we have observed data
for the four variables x1, x2, y1, and y2, our known information can be
represented as the following variance/covariance matrix:

        x1      x2      y1      y2
  x1   3.14
  x2   0.72    0.26
  y1   3.45    0.72   12.55
  y2   1.65    0.36    3.85    9.67
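The count of known values generalizes to any number of observed variables: a p × p covariance matrix has p variances and p(p − 1)/2 unique covariances. A minimal sketch (the helper name is mine):

```python
def n_knowns(p):
    """Unique entries in a p x p variance/covariance matrix:
    p variances plus p*(p - 1)/2 covariances."""
    return p * (p + 1) // 2

print(n_knowns(4))  # 10 known values for four observed variables
```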
[Figure: path diagram with correlated exogenous variables x1 and x2 (r12), paths γ11, γ12, γ21, and γ22 to y1 and y2, path β21 from y1 to y2, and errors ζ1 and ζ2.]
Figure 5.1. Example of saturated model involving four observed variables.
[Figure: two path diagrams, A and B, of the four-variable model; model B adds a correlation between the errors of y1 and y2.]
Figure 5.2. Illustration of two alternative models involving four observed vari-
ables. In model A, the model is unsaturated and only involves the estimation of 9
parameters; thus, this model is overidentified since our variance/covariance matrix
contains 10 known values. Model B includes a correlation between the errors for
y1 and y2. In this model there are 11 parameters to be estimated. As a result, the
model is underidentified.
Now consider a second pair of algebraic relations,

a + b = 8
3a + 3b = 24
When we rearrange the top expression (b = 8 − a) and substitute into the bot-
tom one, we arrive at 24 = 24. While this is a true statement, it is not helpful.
The reason that a drops out of the picture when we rearrange the equations
is because the bottom equation is simply 3 times the top equation. Thus, the
second equation is redundant since it contains no unique information about
the relationship between a and b. So, while we have two equations, they do
not represent two unique pieces of information, but only one. The same thing
can happen in solving certain types of structural equation models. The most
common case is when two predictors (say x1 and x2 ) are very highly correlated
(e.g., r12 = 0.999). In this case, the two predictors actually represent only one
unique piece of information, because they are completely interchangeable. In
the estimation process, it is highly likely (though not guaranteed) that unique
solutions will not be obtained for the parameters. Most SEM programs (as well
as other statistical programs) will report that the matrix is nonpositive defi-
nite, which indicates that there is an apparent mathematical dependency among
elements of the variance/covariance matrix. In cases where it is the characteris-
tics of the data, rather than the structure of the model, that prevent the determi-
nation of unique estimates, this is referred to as empirical underidentification.
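The numerical symptom of empirical underidentification can be illustrated directly. In the sketch below (the numbers are illustrative), a correlation of 0.999 produces a nearly singular matrix, and a correlation of exactly 1.0 causes a Cholesky factorization, which requires positive definiteness, to fail in the way SEM software reports as "nonpositive definite":

```python
import numpy as np

# Correlation matrix for two predictors with r12 = 0.999.
R = np.array([[1.0, 0.999],
              [0.999, 1.0]])

# The matrix is technically invertible, but barely: its condition
# number (largest eigenvalue / smallest) is roughly 2000, so any
# estimate that depends on its inverse is numerically unstable.
print(np.linalg.cond(R))

# At r12 = 1.0 the matrix is singular and factorization fails.
R_singular = np.array([[1.0, 1.0],
                       [1.0, 1.0]])
try:
    np.linalg.cholesky(R_singular)
except np.linalg.LinAlgError:
    print("matrix is not positive definite")
```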
[Figure: four latent variable diagrams. A: one latent with two indicators (x1, x2). B: two correlated latents, each with two indicators (x1–x4). C: one latent with three indicators (x1–x3). D: a two-indicator latent predicting an observed response y1, with error ζ1.]
Figure 5.3. Examples of latent variable models used to illustrate certain principles
of model identification.
When latent variables are introduced into structural equation models, they bring with them additional
parameters to consider, such as their variances. They can also act to increase or
decrease the number of paths in a model. As a practical matter, evaluating the
identification of a structural equation model that includes both latent variables
and directed paths among the latent variables (so-called hybrid models) can be
addressed by separate evaluations of the measurement and structural models.
If the measurement model is identified and the structural model is identified, then
the full model will be identified. While these conditions are sufficient (except in
the case of empirical underidentification), they are not always necessary. Also,
it is important for the potential practitioner to note that there are a variety of
model parameter constraints that can be used to achieve identification, even
when the original model is not identified. Let us look at a few examples to
illustrate some of the possibilities (keeping in mind that a complete treatment
of this topic is beyond our scope).
Our first example in Figure 5.3 (model A) revisits an issue considered in
Chapter 4. Here we see the case of a latent variable associated with two observed
indicators. Since our data matrix only has one covariance (between the two indi-
cators), we lack sufficient information with which to estimate the two loadings
from the latent to each indicator. The way to solve this problem in practice is
usually to specify that the two loadings are equal, thereby reducing the number
of parameters to estimate. Note that for an equality specification to be reason-
able, the variances of the two indicators should be made equal as well (through
variable coding).
Model B in Figure 5.3 represents a case where, unlike model A, we are
able to satisfy the t-rule. Since there are four observed variables, we have
(4 × 5)/2 = 10 knowns. Including the variances of the xs, we have 9 parameters
to estimate. Therefore, our model is structurally overidentified. However, there
can be a problem associated with empirical underidentification for this model.
If the correlation between the latent variables is indistinguishable from zero,
that implies that the two constructs are independent two-indicator models. In this
case, we have the situation where we cannot identify both indicator loadings
for the latent variables, even though we satisfy the t-rule. The solution is the
same as for model A, equality constraints for loadings on each latent variable
will be required in order to arrive at unique solutions. Model C in Figure 5.3
represents a case where, assuming we specify the scale for our latent variable,
we satisfy the t-rule and otherwise should have an identified model.
Finally, model D in Figure 5.3 represents the case where a two-indicator
latent variable model is identified. Since there are three observed variables, we
have six knowns with which to estimate the six unknowns (3 variances and 3
path coefficients). Therefore, involving a two-indicator latent in regression is
another way in which we can estimate models containing latent variables with
less than three indicators.
[Figure: schematic showing the observed covariance matrix S, maximum likelihood estimation yielding parameter estimates, the model-implied covariance matrix Σ (with elements σ11, σ12, σ22, σ13, σ23, σ33), and a comparison step producing model fit evaluations.]
Figure 5.4. Simplified overview of the processes associated with parameter esti-
mation and model fit assessment.
and the body of methodology continues to grow and evolve. While we can
(and do) consider the many separate elements that constitute modern SEM, it is
really the synthetic system that evolved from Jöreskog's work and the influence of the
associated LISREL software (developed with Dag Sörbom) that has become
the core of modern practice. A detailed historic perspective on this can be found
in Cudeck et al. (2001) “Structural Equation Modeling: Present and Future –
A Festschrift in honor of Karl Jöreskog”.
A simplistic representation of the overall process of parameter estimation
and assessment of model fit is presented in Figure 5.4. The elements of the
process represented in this figure are that (1) the hypothesized model implies
general expectations for the observed covariance matrix, (2) when the observed
matrix is combined with the model structure, we can obtain parameter esti-
mates, (3) the parameter estimates in combination with the hypothesized model
yield a model-implied (predicted) covariance matrix, and (4) comparison of the
implied and observed matrices allows for assessments of overall fit of data to
model.
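The four-step loop just described can be sketched for the simplest possible path model, y1 = γ·x1 + ζ1. The numbers below are illustrative, not taken from the text; because this tiny model is saturated, the implied and observed matrices match exactly.

```python
import numpy as np

def implied_cov(gamma, phi, psi):
    """Model-implied covariance matrix for y1 = gamma*x1 + zeta1,
    with VAR(x1) = phi and VAR(zeta1) = psi."""
    return np.array([[phi,         gamma * phi],
                     [gamma * phi, gamma**2 * phi + psi]])

# Hypothetical observed matrix S for x1 and y1.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Step 2: obtain parameter estimates from S (moment matching).
phi_hat = S[0, 0]
gamma_hat = S[0, 1] / phi_hat
psi_hat = S[1, 1] - gamma_hat**2 * phi_hat

# Steps 3-4: build the implied matrix, then compare with S.
Sigma_hat = implied_cov(gamma_hat, phi_hat, psi_hat)
print(S - Sigma_hat)  # residuals are zero: this model is saturated
```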
There are several features that characterize modern SEM as represented in
Figure 5.4. First, SEM is based on the analysis of covariances instead of indi-
vidual observations. This is a strong departure from traditional statistical anal-
yses. At present, the continued evolution of SEM is incorporating both covari-
ances and raw observations, although this development is beyond the scope
of the present discussion. Another aspect of modern SEM is the reliance on a
comparison between observed and model-implied covariances. While methods
other than maximum likelihood can be used to perform such comparisons, the
maximum likelihood approach compares the likelihood for a given model to the
likelihood of a model with perfect fit. This
fitting function is commonly expressed as
FML = log |Σ̂| + tr(SΣ̂⁻¹) − log |S| − (p + q)    (5.2)
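The fitting function FML can be sketched directly from its definition; the matrix S below is illustrative. Note that FML is zero when the implied matrix equals S (perfect fit) and positive otherwise.

```python
import numpy as np

def f_ml(S, Sigma):
    """ML fitting function of Eq. 5.2: log|Sigma| + tr(S Sigma^-1)
    - log|S| - (p + q), where p + q is the number of observed
    variables."""
    k = S.shape[0]
    return (np.log(np.linalg.det(Sigma))
            + np.trace(S @ np.linalg.inv(Sigma))
            - np.log(np.linalg.det(S))
            - k)

S = np.array([[2.0, 0.8],
              [0.8, 1.5]])

print(f_ml(S, S))          # essentially 0: perfect fit
print(f_ml(S, np.eye(2)))  # positive: any discrepancy raises F
# The model chi-square is then (n - 1) * F_ML, per Eq. 5.4 below.
```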
A third point to make is that issues of data distribution (as long as data
are continuous) do not generally impact the parameter estimates themselves,
only their standard errors and our inferences about the probabilities associated
with those parameters. Thus, while we may be concerned with sample-related
problems such as outliers and influential points, we will find that the absolute
values of our parameter estimates are uninfluenced by methods designed to
compensate for data nonnormality.
A fourth point is that in recent years there have evolved a number of methods
for evaluating and compensating for certain deviations from normality that are
widely available in modern SEM software packages. Among these, one that
should generally be avoided (but which still remains because of the historic
role it has played) is the asymptotic distribution-free (ADF) approach. While
the “distribution-free” part sounds appealing, the problem with the use of ADF
methods is the “asymptotic” part, which alludes to the very large sample size
requirements for the validity of such methods. Use of the ADF method has
largely ceased with the development of so-called “robust” methods (Satorra and
Bentler 1988, 1994). In a nutshell, such methods estimate the kurtosis in the data
for the response variables, and provide specific and appropriate adjustments to
the standard errors. Practitioners will find robust methods highly useful for
analyses. Further information on the subject can be found in Kaplan (2000,
chapter 5).
A fifth point is that resampling methods, such as Monte Carlo bootstrapping,
provide another means of dealing with nonparametric data distributions. In
bootstrapping, data are randomly sampled with replacement in order to arrive at
estimates of standard errors that are empirically associated with the distribution
of the data in the sample. The main advantage of this approach is that probability
assessments are not based on an assumption that the data match a particular
theoretical distribution.
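The bootstrap logic can be sketched for a single statistic. The data below are simulated for illustration; the procedure, resampling rows with replacement and taking the standard deviation of the resampled statistics as the empirical standard error, is what matters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sample (n = 100) of two related variables.
n = 100
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
data = np.column_stack([x, y])

# Bootstrap: resample rows with replacement and recompute the
# statistic (here the x-y covariance) for each resample.
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, size=n)
    boot.append(np.cov(data[idx], rowvar=False)[0, 1])

# The spread of the resampled statistics is the empirical SE;
# no theoretical distribution is assumed.
se_boot = np.std(boot, ddof=1)
print(se_boot)
```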
“Researchers using structural equation modeling aspire to learn about the world
by seeking models whose causal specification match the causal forces extant
in the world.” Evaluating models by assessing the fit between expectations and
data, or the comparative fit between alternative models represents the method of
“seeking models”. Characteristic of a theory-based approach, when evaluating
structural equation models we consider substantive information when deciding
the adequacy of a model or when choosing a “best” model. While p-values are
used in evaluating both measures of model fit and individual parameters, a strict
reliance on arbitrary cutoffs such as 0.05 is not adhered to. Ultimately, the goal
is to estimate absolute and relative effect sizes, and to draw inferences about the
processes operating in a system, rather than to reject null hypotheses. Structural
equation modeling philosophy is, thus, fairly compatible with recent criticisms
of null hypothesis testing by biologists (Anderson et al. 2000), although our
approach to model evaluation and selection is somewhat different. In this sec-
tion, I describe some of the elements of model evaluation, with an emphasis on
model fit assessment.
[Figure: hypothesized three-variable model (x1, y1, y2) with a residual covariance of 0.15 between observed and model-implied values.]
Figure 5.5. Illustration of the related concepts of model-implied covariance and
residual covariance. For simplicity, covariances are standardized in this example,
although the principles generalize to unstandardized (true) covariances.
discrepancy between theory (our model) and the real world (the population
being sampled). This issue will be dealt with in the section that follows. For
now, it is important to realize that the very simple example shown in Figure 5.5
generalizes to the universe of structural equation models, including those with
latent variables. Appendix 5.1 provides a more technical and comprehensive
presentation of how structural equation models can be rewritten in terms of
the expected variances and covariances, and how these are represented in the
LISREL system.
such tests (the model χ² test) benefits from the fact that the maximum likelihood
fitting function FML follows a χ² (chi-square) distribution such that

χ² = (n − 1)FML    (5.4)
Here, n refers to the sample size. In a model that includes paths between all
pairs of variables, no degrees of freedom are available for tests of overall model
fit. Rather, we say that the model is saturated and has perfect fit. However,
for models that have degrees of freedom, the χ 2 test is the most unambiguous
measure of overall model fit. Further, when comparing two models that differ in
a single constraint, it is possible to apply single-degree-of-freedom χ 2 tests to
assess the consequences of making or relaxing a constraint. Based on parametric
probability theory, a change in model χ 2 that is found to be greater than 3.84
indicates a significant difference between models.
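The 3.84 threshold is simply the 0.05 critical value of the chi-square distribution with one degree of freedom, which can be confirmed with SciPy:

```python
from scipy.stats import chi2

# The 0.05 critical value of a chi-square distribution with one
# degree of freedom: the source of the 3.84 threshold for
# single-degree-of-freedom model comparisons.
critical = chi2.ppf(0.95, df=1)
print(round(critical, 2))  # 3.84
```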
When evaluating a model, the overall χ 2 should always be examined as a first
order of business. The χ 2 is of primary interest because of its fundamental rela-
tionship to the fitting function used in the estimation process (described above).
When χ 2 values have associated p-values greater than 0.05, authors often deem
a model to be acceptable. In my experience, in most situations when the χ 2
p-value is larger than 0.05, there are no pathways that can be added that will
have significant path coefficients based on individual t-tests.1 This condition is
not true 100% of the time, however, so it is still worthwhile examining the individual residuals to see if any of them are pronounced (note that the χ² evaluation
is based on the sum of all deviations; therefore, many small deviations may yield
a χ² equal to that produced by one moderately large deviation). In addition,
models with p-values indicating adequate fit can frequently contain paths with
coefficients that are not significantly different from zero. In other words, the
model χ 2 test is not symmetric in its sensitivity to missing paths versus surplus
paths. Normally missing paths will be detected; surplus paths will not. For this
reason, the p-values for individual path coefficients are typically examined as
well. As will be discussed below, paths with p-values greater than 0.05 are not
always eliminated from the model, despite the fact that a p-value of greater than
0.05 for an individual path indicates that the path may be uncalled for. Again, the
philosophy of SEM is not to be a slave to arbitrary p-value cutoffs, but instead,
to consider both data and theory when evaluating the adequacy of models.
As with the estimation of standard errors for parameters, deviations from
multivariate normality can influence the estimated probabilities associated with
1 Note that when parameter estimates are based on maximum likelihood, the ratio of a parameter
to its standard error is actually a z-value instead of a t-value. t-values and z-values are virtually
identical except when sample sizes are very small.
the χ 2 . When data appear to have failed to meet the assumption of multivariate
normality by a wide margin, robust standard errors (Satorra and Bentler 1988,
1994) provide an alternative that seems to hold up well to substantial distributional deviations. Resampling (e.g., bootstrapping) methods can be used as
well; in this case it is also possible to correct the evaluation of the overall model
χ² for distributional violations (Bollen and Stine 1992).
As just described, a nonsignificant χ 2 value (with probability >0.05) is
generally taken to mean that there is reasonable fit between the model and
the data. However, in practice the p-value associated with the model χ 2 test
is not always a workable gauge of model fit. One reason for problems is that
the p-values associated with a particular χ 2 are affected by the sample size.
Models with large sample sizes will show statistically significant deviations
even when the absolute deviations are small. This unfortunate property of the
χ 2 test in SEM has led to substantial efforts to develop measures of model
fit that are independent of sample size. Probably no other aspect of SEM has
received as much attention as the consideration and development of measures
of overall model fit. At present, two things are generally agreed to: (1) if the
model χ 2 has an associated p-value that is nonsignificant, the model can be
accepted (assuming the model is theoretically justifiable in the first place, and
the data meet distributional assumptions), and (2) if the χ 2 has an associated
p-value that is significant (usually, less than 0.05) and the sample size is large
(say, greater than 150), other fit measures may indicate that the model is never-
theless acceptable. It should be made clear that regardless of sample size, a sig-
nificant chi-square indicates deviations between model and data and this should
never be ignored, although one may decide that the deviations detected are not
of sufficient magnitude to reject the model entirely. The reader should be aware
that the issue of adequacy of fit continues to be debated among SEM specialists.
The literature on this subject is deep and not all of it leads to useful advice.
Therefore, I will not take the time here to discuss in detail the great variety
of alternative measures of model adequacy. The interested reader can consult
Bollen and Long (1993), Marsh et al. (1996), Kelloway (1998), or Schermelleh-
Engel et al. (2003). Consensus opinion at the moment suggests that multiple
measures should be considered when evaluating model fit, particularly when
sample size is large (Marsh et al. 1996, Kelloway 1998).
A substantial number of size-corrected indices have been developed as alter-
natives to χ 2 . One such index is the root mean square error of approximation
(RMSEA) proposed by Steiger (1990). This index is certainly not infallible and
there is some degree of controversy about its suitability. Nevertheless, it does
have certain desirable properties including (1) it is one of the few alternative
measures of fit that presents associated 95% confidence intervals and probability
values, (2) it is adjusted for sample size, and (3) it strikes a balance in sensi-
tivity with deviations in the structural model versus the measurement model.
Other indices include measures of goodness of model fit (GFI and AGFI), and
other summations of deviations between observed and predicted covariances
(e.g., the root mean residual, RMR). For measures other than χ 2 and RMSEA,
only rules of thumb are available, and no associated probabilities or error can
be calculated. In Chapter 8, the use of these and other measures of model
fit, including information-theoretic methods (such as the Akaike information
criterion described below), will be illustrated.
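The RMSEA point estimate itself is straightforward to compute. A minimal sketch, using the common formula sqrt((χ² − df)/(df(n − 1))) with negative values truncated to zero (the example numbers are hypothetical):

```python
import numpy as np

def rmsea(chi2_stat, df, n):
    """Point estimate of the root mean square error of
    approximation from the model chi-square, its degrees of
    freedom, and the sample size n."""
    return float(np.sqrt(max((chi2_stat - df) / (df * (n - 1)), 0.0)))

# A hypothetical model: chi-square = 12.0 on 8 df with n = 150.
print(round(rmsea(12.0, 8, 150), 3))  # 0.058
```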
was to correct for the fact that model fit is generally improved by including
additional parameters. To counter this problem and create an index that could
be applied to models of different structure (e.g., with different numbers of
variables included), Akaike sought to take model parsimony as well as model
fit into account. One expression of Akaike’s index is
AIC = c − 2d
where c is the model chi-square and d is the model degrees of freedom. Recall
from the presentation earlier in this chapter that the parameter c is one that
satisfies the criterion
c = sF[S, Σ(θ̂)]

where s is the number of samples minus one and the function F is one that minimizes differences between the observed covariance matrix S and the expected
covariance matrix based on the model, Σ(θ̂). The reader should be aware that
there are actually two different formulae for AIC in the literature, although they
produce equivalent results (see Kline 2005, page 142 for a brief discussion).
Bozdogan (1987) extended the work of Akaike by further correcting for
the effects of sample size on c (which are well documented), arriving at the
following
CAIC = c − (1 + ln S)d
where CAIC refers to the consistent AIC and S is the total number of samples.
Brown and Cudeck (1989), following a somewhat different line of reason-
ing, arrived at another alternative model comparison index, the expected cross
validation index
ECVI = (c/n) + 2(t/n)
where c is the model chi-square as before, n is S − 1, and t equals the number of estimated parameters. The ECVI
was proposed as a way of assessing for a single sample the likelihood that the
selected model will apply across replicate samples of the same size from a
common population. While devised from a different starting point, the ECVI
shares with the other indices a capacity to compare models of similar or differing
structure.
Finally, indices of Bayesian information criteria (BIC) have been developed
to aid in model selection among structural equation models (Raftery 1993). Such
indices ask the questions: which model predicts the data better, or alternatively,
under which model are the data more likely to be true? More will be said about
Bayesian approaches in the final chapter. For now, we can understand BIC to
be another information theoretic index, similar in principle to AIC but based
on Bayes factors.
The AIC, CAIC, ECVI, BIC, and sample size adjusted BIC are all employed
in the same general fashion to compare alternative models. It is common practice
to select the model with the lowest index value as the “best” model. For AIC, the
model thus selected is the one with the combined lowest chi-square and highest
degrees of freedom. For the CAIC and ECVI, there is an additional weighting
of models to account for differences in sample size. A further discussion of the
principles of model selection based on AIC and CAIC can be found in Burnham
and Anderson (2002).
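The comparison procedure can be sketched directly from the formulas above (here N denotes the total number of samples, written S in the text; the two candidate models are hypothetical):

```python
import numpy as np

def aic(c, d):
    """AIC = c - 2d, with c the model chi-square, d the model df."""
    return c - 2 * d

def caic(c, d, N):
    """Consistent AIC = c - (1 + ln N)d."""
    return c - (1 + np.log(N)) * d

def ecvi(c, t, N):
    """Expected cross-validation index = c/n + 2t/n, where n = N - 1
    and t is the number of estimated parameters."""
    n = N - 1
    return c / n + 2 * t / n

# Two hypothetical competing models fit to the same data (N = 100).
m1 = dict(c=18.0, d=10, t=11)   # more parsimonious
m2 = dict(c=15.0, d=8,  t=13)   # better raw fit, more parameters

for name, m in (("model 1", m1), ("model 2", m2)):
    print(name, aic(m["c"], m["d"]),
          caic(m["c"], m["d"], 100),
          round(ecvi(m["c"], m["t"], 100), 3))
# the model with the lowest index values would be selected as "best"
```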
In spite of the fact that SEM can be used in an exploratory mode, results
thus obtained must be considered tentative. It is particularly important that
substantive interpretations be given to any model modifications. Likewise, it
is not generally recommended that paths judged to be nonsignificant using an
arbitrary p-value of 0.05 be automatically set to zero and the model re-estimated, unless
one is willing to propose that the deleted path represents a process that one has
now decided is not expected to be important. It is also desirable when making
modifications that only the required minimum changes should be made, in order
to avoid overfitting (defined as “the ill-advised act of changing a model based
on results that are actually due to chance relationships in the data at hand, rather
than based on a revised mechanistic assumption”).
A second element required before the results from a modified model are
considered definitive is subsequent evaluation with a new data set. This require-
ment has been ignored by many. Yet, the requirement for further evaluation of
a modified model in SEM practice is the basis for the philosophy of viewing
repeatability as a primary criterion for theory acceptance. Hopefully, in future
applications there will be an increased awareness of the value of independent
confirmation of SEM model results.
Summary
In this chapter I have discussed the topics of parameter identification, parameter
estimation, and at least to a limited extent, some of the associated assumptions
and alternative procedures. The simplistic consideration of parameter identi-
fication covered in this chapter belies the potential complexity of this topic.
Model-implied covariances
As stated in Chapter 5, the following equation summarizes the formal problem
of expressing the relationship between data and model:
Σ = Σ(θ) (5.1.1)
136 Structural equation modeling and natural systems
and the covariance between x1 and x2 is defined by the variance of the common
latent factor φ. Now, Eq. (5.1.1) becomes
    | VAR(x1)              |   | φ + VAR(δ1)              |
Σ = |                      | = |                          |   (5.1.10)
    | COV(x1, x2)  VAR(x2) |   | φ           φ + VAR(δ2)  |
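Equation (5.1.10) is easy to verify by simulation. The sketch below assumes unit loadings (each indicator equals the factor plus its error) and purely hypothetical variance values:

```python
# Sketch: simulate two indicators x1, x2 sharing one latent factor with
# unit loadings (x_i = xi + delta_i), then check Eq. (5.1.10):
#   VAR(x_i) = phi + VAR(delta_i),  COV(x1, x2) = phi.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
phi, var_d1, var_d2 = 2.0, 0.5, 1.0           # factor and error variances

xi = rng.normal(0, np.sqrt(phi), n)           # latent factor scores
x1 = xi + rng.normal(0, np.sqrt(var_d1), n)   # indicator 1
x2 = xi + rng.normal(0, np.sqrt(var_d2), n)   # indicator 2

S = np.cov(np.vstack([x1, x2]))               # sample covariance matrix
sigma = np.array([[phi + var_d1, phi],        # model-implied matrix,
                  [phi,          phi + var_d2]])  # per Eq. (5.1.10)
print(np.round(S, 2))
assert np.allclose(S, sigma, atol=0.05)       # sample matches implication
```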
η = Bη + Γξ + ζ (5.1.11)
This equation is composed of (1) three vectors of variables, (2) two matrices of
coefficients, and (3) two matrices of covariances. The three vectors of variables
are
η, an m × 1 vector of latent endogenous variables,
ξ , an n × 1 vector of latent exogenous variables, and
ζ , an m × 1 vector of latent errors for the latent endogenous variables,
where m is the number of latent endogenous variables and n is the number
of latent exogenous variables. Note that endogenous variables are dependent
latent variables (i.e., those with an arrow pointing towards them from another
latent variable). Exogenous variables are independent latent variables.
The two matrices of coefficients in Eq. (5.1.11) are
B, an m × m matrix of structural coefficients describing the relationships
among endogenous latent variables, and
Γ, an m × n matrix of structural coefficients describing the effects of exoge-
nous latent variables on endogenous ones.
x = Λx ξ + δ (5.1.12)
y = Λy η + ε (5.1.13)
expectations for a full structural equation model. Equation (5.1.1) now takes
the form
    | covariances          covariances        |
    | among ys             between ys and xs  |
Σ = | ---------------------------------------- | .  (5.1.14)
    | covariances          covariances        |
    | between xs and ys    among xs           |
There is insufficient space in this short chapter to give the derivations of the
four submatrix elements. The interested reader is referred to Hayduk (1987,
chapter 4, section 4) for a lucid account of the process. It is helpful, nonetheless,
to present the equations for the four submatrices individually. First, the equation
for the lower right submatrix (the covariances among the xs) is

Σxx = Λx ΦΛx′ + Θδ (5.1.15)

Second, the upper left submatrix (the covariances among the ys) is

Σyy = Λy A(ΓΦΓ′ + Ψ)A′Λy′ + Θε (5.1.16)

where A = (I − B)−1, the inverse of the beta matrix subtracted from the identity
matrix, and A′ is the transpose of A. The third submatrix equation to define
(lower left quadrant) is

Σxy = Λx ΦΓ′A′Λy′ (5.1.17)

and the fourth (upper right quadrant) is its transpose,

Σyx = Λy AΓΦΛx′ (5.1.18)

Plugging Eqs. (5.1.15) through (5.1.18) into (5.1.14) gives the LISREL equiv-
alent of Eq. (5.1.1) in the form given by Jöreskog
    | Λy A(ΓΦΓ′ + Ψ)A′Λy′ + Θε     Λy AΓΦΛx′     |
Σ = |                                             |   (5.1.20)
    | Λx ΦΓ′A′Λy′                  Λx ΦΛx′ + Θδ  |
This is not an equation that the reader needs to memorize. Rather, it is an
expression of the LISREL notation that describes how matrices can be used to
express, in a rather comprehensive way, the covariance expectations from a full
structural equation model that includes both latent variables and measurement
models.
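For readers who find such matrix expressions easier to digest numerically, the following sketch evaluates Eq. (5.1.20) with small, purely hypothetical parameter matrices (one ξ, one η, one y, and two xs) and assembles the implied covariance matrix:

```python
# Sketch: assembling the model-implied covariance matrix of Eq. (5.1.20)
# from hypothetical LISREL parameter matrices (all values illustrative).
import numpy as np

Lx  = np.array([[1.0], [0.8]])   # Lambda-x: loadings of 2 xs on 1 xi
Ly  = np.array([[1.0]])          # Lambda-y: loading of 1 y on 1 eta
B   = np.array([[0.0]])          # no eta -> eta paths
G   = np.array([[0.5]])          # Gamma: xi -> eta coefficient
Phi = np.array([[2.0]])          # VAR(xi)
Psi = np.array([[0.3]])          # VAR(zeta)
Te  = np.diag([0.2])             # Theta-epsilon (y error variance)
Td  = np.diag([0.4, 0.5])        # Theta-delta (x error variances)

A = np.linalg.inv(np.eye(B.shape[0]) - B)   # A = (I - B)^-1
Syy = Ly @ A @ (G @ Phi @ G.T + Psi) @ A.T @ Ly.T + Te
Syx = Ly @ A @ G @ Phi @ Lx.T
Sxx = Lx @ Phi @ Lx.T + Td
Sigma = np.block([[Syy, Syx],                # partitioned as in (5.1.14)
                  [Syx.T, Sxx]])
print(np.round(Sigma, 3))
```

The resulting 3 × 3 matrix is symmetric, as any covariance matrix must be, which provides a quick sanity check on the algebra.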
PA RT I I I
Advanced topics
6
Composite variables and their uses
Introduction
It would seem that structural equation modeling holds the promise of providing
scientists the capacity to evaluate a wide range of complex questions about
systems. The incorporation of both conceptual and observed variables can be
particularly advantageous by allowing data to interface directly with theory.
Up to the present time, the emphasis in SEM has been on latent variables as
the means of conveying theoretical concepts. It is my view that this is quite
limiting. As we saw in the final section of Chapter 4, causal relationships in
a model may deviate quite a lot from the stereotypic “hybrid” model. In the
current chapter, I discuss the use of an additional variable type, the composite, in
structural equation models. In simple terms, composite variables represent the
influences of collections of other variables. As such, they can be helpful for (1)
representing complex, multifaceted concepts, (2) managing model complexity,
and (3) facilitating our ability to generalize. In my experience, these are all
highly desirable capabilities when representing ecological systems and, as a
result, I frequently find myself including composites in models.
While long recognized as a potentially important element of SEM, composite
variables have received very limited use, in part because of a lack of theoretical
consideration, but also because of difficulties that arise in parameter estimation
when using conventional solution procedures. In this chapter I tackle both the
theoretical and practical issues associated with composites. To accomplish this,
it will be necessary to introduce additional terms, as well as a framework for
deciding when a composite is appropriate to include in a model.
The question of what is appropriate in a structural equation model must be
judged relative to the theoretical concepts, as well as the variables measured,
and the nature of the data. In this chapter I introduce the idea of the “construct
model”, which is the theoretical precursor to the development of a structural
143
equation model. I will consider the various possibilities for how observed vari-
ables relate to constructs and the resulting model architectures. This transition
from theory to model represents a phase of SEM that is largely absent from
other general treatments. I believe it is the “missing link” in most modeling
exercises, and should be studied carefully by those interested in developing
their own structural equation models.
History
As early as 1964, Blalock pointed out the need to represent some constructs
using composites. In particular, when the indicator variables associated with
a construct are viewed as causes of that construct, rather than effects, a com-
posite variable can be used in a model to represent that association. Heise
(1972) extended the discussion by providing explicit demonstrations of several
examples where composites would be appropriate. In addition, Heise discussed
some of the problems with estimating models containing composites, and pro-
vided a limited set of solutions within a least squares framework. Periodically,
practitioners of SEM have continued to raise the need for careful considera-
tion of the best way to represent constructs given the available data. Bollen
(1984) cautioned that a set of indicators for a latent variable should possess
“internal consistency” (i.e., they should represent a single entity and, therefore,
would be expected to be highly intercorrelated and jointly responsive). He went
on to point out that we should differentiate between the case where the con-
cept of interest is the cause of some set of observed responses (i.e., is a latent
factor), and the opposing case where the observed variables have causal influ-
ences on the concept (i.e., composites). A number of authors have continued
to discuss the issue (Bollen and Lennox 1991, MacCallum and Browne 1993),
with an increased interest in the subject emerging in recent years (Bollen and
Ting 2000, Diamantopoulous and Winklhofer 2001, Edwards 2001, Jarvis et al.
2003, Williams et al. 2003). This literature has provided a number of useful
insights, diagnostic procedures, and approaches to the question of how to treat
concepts with only causal indicators. Overall, however, a clear, comprehensive,
and satisfying framework for considering composites has only recently begun
to emerge (Grace and Bollen, unpublished). This chapter presents an overview
of that framework, along with examples, diagnostics, and procedures.
Terminology
A consideration of the subject of composites requires that we have sufficient
terminology to express clearly the relevant relationships. To begin with, I
Composite variables and their uses 145
[Figure 6.1 diagrams: the L➙M, M➙L, and M➙C block types, each relating
manifest variables x1–x3 to a latent (η1, with disturbance ζ1) or composite
(zero disturbance) variable; diagrams not reproduced.]
“effect indicators” (as in the L➙M block in Figure 6.1) and “causal indicators”
(as in the M➙L block in Figure 6.1), and I also use these terms occasionally.
The M➙L block shown in Figure 6.1 represents the situation where mani-
fest variables have causal influences on a latent variable that possesses no effect
indicators (no outward path to a manifest variable). By referring to the concep-
tual variable as a latent variable in this case, we infer that a latent factor does
indeed exist, independent from our data despite the fact that we do not have
effect indicators by which it can be measured. Graphically, this property of
independent existence for the latent variable is represented by the presence of
an error variance ζ 1 , which implies that while the xs have causal effects on η1 ,
they do not completely define it.
In contrast to the M➙L block is the M➙C block, in which the conceptual
variable represented is a composite. Since the error variance is specified to be 0
for a composite, this signifies that it is completely defined by its causes. There
actually exist two kinds of composite – one is the fixed composite. In this type,
the loadings from causes are specified a priori to have particular values by way
of definition. An ecologically relevant example would be the importance value,
which is defined as the sum of the relative density, relative abundance, and
relative frequency (usually for a species within a community). A second type
of composite is the statistical composite, which is the type considered in this
discussion. A statistical composite represents the collective effects of a set of
causes on some response variable(s). Such a composite does not exist separately
from the associated response variables. In simplistic terms, the statistical com-
posite is akin to a multiple regression predictor, some weighted combination of
causal influences that maximizes variance explanation in one or more response
variables. This predictor can have theoretical meaning, and can be quite useful
in developing structural equation models that involve multifaceted concepts.
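The regression analogy can be made concrete with simulated data (the variable names below are illustrative only, not the study data). Ordinary least squares supplies exactly the kind of weighted combination described: the composite score is the linear predictor that maximizes variance explained in the response.

```python
# Sketch: a statistical composite as the weighted sum of causal
# indicators that best predicts a response, i.e., the OLS predictor.
# All names and coefficients here are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
age = rng.normal(size=n)                       # causal indicator 1
dist = rng.normal(size=n)                      # causal indicator 2
response = 0.7 * age - 0.4 * dist + rng.normal(scale=0.5, size=n)

X = np.column_stack([age, dist])
w, *_ = np.linalg.lstsq(X, response, rcond=None)  # composite weights
composite = X @ w                              # the composite score
r = np.corrcoef(composite, response)[0, 1]
print("weights:", np.round(w, 2), " R^2 via composite:", round(r**2, 2))
```

The recovered weights approximate the generating coefficients, and the composite carries all of the predictable variance in the response, which is what the "weighted combination of causal influences" amounts to in practice.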
Composites can represent collections of influences from both latent and
manifest variables. Figure 6.2 presents examples of blocks where relationships
between conceptual variables are considered. The L➙L block represents effects
of latent causes on a latent response, where all latent variables have effect
indicators. This is the standard block type found in the hybrid model discussed in
most introductory SEM treatments. What is represented by this block structure
is that mechanisms of causal interaction are actually among latent factors, and
we can understand the covariances among a set of x and y variables based on
that mechanism. Also implied is that the effect indicators associated with each
latent variable serve as multiple measures of the latent factors. Because the flow
of causation is from the latent factors to the indicators, error terms are used to
represent how well the indicators are predicted by the model; while the latent
factors are presumed to be free from measurement error.
[Figure 6.2 diagrams: the L➙L block, the L➙L(no M) block, and the L➙C
block; diagrams not reproduced.]
Figure 6.2. Representation of blocks specifying relationships between latent
causes and latent or composite conceptual variables. Block terminology is
described in the caption for Figure 6.1.
The L➙L(no M) block in Figure 6.2 represents causal influence from latent
variables that have effect indicators, to a latent variable for which there is
no effect indicator. This block type resembles the M➙L block in Figure 6.1,
except for the fact that latent causes replace the manifest ones. Similarly, the
L➙C block in Figure 6.2 is analogous to the M➙C block in Figure 6.1. The
relevant situations for these representations will be illustrated in the example
applications below.
Examples
Rather than continuing to elaborate on constructs, blocks, and composites in
the abstract, the remaining development of ideas, as well as the development
of formal notation, will be conducted in the context of specific examples. The
reader should keep in mind that while not representing an exhaustive treatment
of the subject, the examples chosen and procedures discussed in conjunction
with those examples are broadly applicable to the general issues associated with
incorporating conceptual variables into structural equation models.
[Figure 6.3 diagram: Soil Conditions and Landscape Properties linked to
Competitor Abundance and Colonization Success; diagram not reproduced.]
Figure 6.3. Construct model reflecting the presumed mechanisms causing herb
colonization success to relate to the characteristics of forest stands.
In this case, our starting point is the initial construct model summarizing the
core theoretical questions posed (Figure 6.3). We can start by considering the
question, what are the various ways we might model the relationships shown
in Figure 6.3? To address this question, the available indicators must be con-
sidered and then the linkages between indicators and concepts. The authors
measured eight variables in the process of developing indicators for the four
constructs of interest. Two indicators of landscape conditions were measured,
the estimated age of each forest stand and the distance from each reforested
target stand to the nearest source patch. Three indicators of soil conditions were
measured, including soil texture, soil moisture, and soil pH. Two indicators of
competitor abundance were measured, the cover of herbaceous competitors and
the abundance of understory shrubs. For the herb species whose colonization
will be considered in this example application (Lamium galeobdolon), shrub
abundance was not significantly related to colonization success (although herba-
ceous cover was). So, to simplify the example, I only use a single indicator for
competitor abundance, which will be herbaceous cover. The proportion of sam-
pling points where a species was found in a forest stand serves as the single
indicator of colonization success.
Figure 6.4 shows the presumed linkages between measured indicators and
conceptual variables based on the theoretical reasoning of the authors. In this
initial representation, no attempt is made to express the structure of the blocks,
therefore, concepts are related to observed indicators with nondirectional lines
rather than directional arrows. Disturbance terms are specified for Competitors
and Colonization, while the diagram is ambiguous as to whether Soil and Land-
scape possess disturbances or not, because the directions of influences are not
given.
[Figure 6.4 diagram: presumed linkages between the measured indicators and
the four conceptual variables; diagram not reproduced.]
[Figure 6.5 diagram not reproduced.]
Figure 6.5. Model A, all observed (manifest) variables (in boxes) are represented
as “effect indicators” associated with latent variables (in circles). Thus, all blocks
are of the L➙M type.
[Figure 6.6 diagram not reproduced.]
Figure 6.6. Model B, soil conditions and landscape properties are represented
as components of M➙L blocks, with the associated manifest variables (x1 –x5 )
representing cause indicators.
A second possibility is that the multiple indicators associated with Soil and
Landscape are causal indicators. This type of structure is illustrated in model B
(Figure 6.6). Here, the arrows pointing from pH, moisture, and texture to Soil
represent the presumption that the measured soil properties jointly determine
the soil influences, rather than vice versa. A similar logic applies to distance
and stand age, which are presumed to contribute to Landscape in this model. In
this case, the multi-indicator blocks are of the M➙L type.
If we assume that the causal indicators in Model B are themselves imper-
fect measures, which is very reasonable in this case, then we might represent
the situation as shown in Figure 6.7 (model C). Here, Soil represents a multi-
dimensional construct related at least in part to True pH, True Moisture, and
True Texture (the use of the term “True” here does not mean that we have esti-
mated the true value of variables using our model, instead, it means that we
recognize that the true value of pH is latent or unmeasured). Similarly, Land-
scape is a multidimensional construct related to True Distance and True Age.
As the names of these latent variables indicate, this model explicitly recognizes
that our manifest variables are not perfect measures of the “true” parameter
values.
Discussion of the range of possible ways to represent the relations in Fig-
ure 6.4 would be incomplete without considering another alternative, which is
referred to as model D (Figure 6.8). In this model, the two multidimensional
constructs, soil conditions and landscape properties, are represented simply as
groups of variables. In this case, the only two block types are L➙M and L➙L.
Formal representations
To discuss the relationships embodied in our models more formally, we can
begin with the three characteristic equations of the LISREL model (the reader
[Figure 6.7 diagram not reproduced.]
Figure 6.7. Model C, which represents Soil and Landscape and their associated
causes as multidimensional constructs, allowing for measurement error in all man-
ifest variables. In this case, Soil and Landscape are components of L➙L (no M)
blocks. The line with arrows connecting the latent variables represents all possible
intercorrelations.
[Figure 6.8 diagram not reproduced.]
Figure 6.8. Model D, partially reduced form of model C omitting explicit con-
sideration of collective soil and landscape effects.
may wish to refer to Appendix 5.1, where the LISREL notation is summarized,
to better appreciate the following presentation). This notation, which applies to
hybrid models such as model A in Figure 6.5, is
x = Λx ξ + δ (6.1)
y = Λy η + ε (6.2)
and
η = Bη + Γξ + ζ (6.3)
where x and y are vectors of observed indicators of the exogenous and endoge-
nous latent variables, ξ and η are vectors containing the individual exogenous
and endogenous latent variables, Λx and Λy are vectors of coefficients relat-
ing indicators to latent variables, B and Γ are coefficient matrices for effects
of endogenous and exogenous latent variables on endogenous latent variables,
δ and ε are vectors of measurement errors for x and y, and ζ is a vector of
disturbances for the η variables. For exogenous latent variables, ξ, their vari-
ances are represented by the diagonal elements of the matrix Φ, while the
off-diagonal elements of the matrix are the covariances. The disturbance terms
for the endogenous latent variables (ζ) are contained within the diagonal ele-
ments of the Ψ matrix, while the off-diagonal elements of that matrix represent
any covariances among disturbances (normally assumed to be zero). Error vari-
ables (in δ and ε) are expected to be uncorrelated with ξ and η. However, it is
possible to model correlations among errors, typically represented by nonzero
off-diagonal elements in the matrices representing cross products among mea-
surement errors, θ δ and θ ε .
In the case of a model such as model B, which contains causal indicators,
we recognize the need for an additional equation to represent the M➙L block
η c = Λc x + ζ c (6.4)
Further, for ηe variables in general,
η e = Bc η c + Be η e + Γξ + ζ e (6.5)
Here, for clarity, we distinguish two different types of endogenous latent vari-
able. Those with causal indicators, either directly as in the M➙L block, or
indirectly as in the L➙L (no M) block, are represented by the vector η c , while
those with effect indicators are designated by η e . We also distinguish two differ-
ent matrices of structural coefficients, Bc and Be , as well as separate vectors of
disturbance terms, ζ c and ζ e , for the two types of eta variable. In the particular
case of model B, there are no variables in the ξ vector. For model C, Eqs. (6.1)
and (6.2) apply as they did for model A. However, we now have the case where
we need to expand Eq. (6.4) to be more general, so that
η c = Λc x + Γξ + ζ c (6.6)
In the herb colonization example, Soil and Landscape (collectively represented
by the vector η c ) are affected by the exogenous latent variables in the ξ vector.
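Equation (6.6) can be illustrated with a small numerical sketch (all weights and scores below are hypothetical). The key property is that composite scores are an exact linear function of their causes, with the disturbance ζc fixed at zero:

```python
# Sketch of Eq. (6.6): an endogenous composite eta_c formed from causal
# indicators x plus effects of exogenous latents xi. For a composite the
# disturbance zeta_c is zero, so eta_c is fully determined by its causes.
import numpy as np

Lc = np.array([[0.6, 0.3, -0.2]])   # Lambda-c: weights on 3 causal indicators
G  = np.array([[0.5]])              # Gamma: effect of 1 exogenous latent

rng = np.random.default_rng(4)
x  = rng.normal(size=(3, 100))      # causal indicator scores (3 vars x 100 cases)
xi = rng.normal(size=(1, 100))      # exogenous latent scores

eta_c = Lc @ x + G @ xi             # composite scores, zero disturbance
print(eta_c.shape)
```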
A summary of the equations that would apply to models A–D is presented in
Table 6.1. As we can see, models A and D can be described unambiguously using
the standard notation of Eqs. (6.1)–(6.3).
Composites
So far in our discussion of formal notation I have treated all conceptual variables
as true latent variables – unmeasured variables that exist independent from our
model and data, though we have no direct measures of them. We must now
consider the very practical matter that, for blocks such as M➙L and L➙L (no
M), there is usually insufficient information to estimate the true variance for
latent variables that do not have at least one effect indicator. In the absence of an
ability to estimate the true variance, latent variables without effect indicators can
only represent composites of their causes. For this reason, in the practical case
of estimating models that have latent variables without effect indicators, and
where there are no means to estimate their variances, we can replace M➙L with
M➙C and L➙L (no M) with L➙C. Here, C refers to a composite variable. In the
case of M➙C, the letter C describes a “composite of manifest variables”, while
in the case of L➙C, the letter C describes a “composite of latent variables”.
Composites are, in all cases that I describe in this chapter, considered to be
endogenous (η) variables. However, since they are, in effect, perfect representations
of their causes, they are variables with zero error variances. While
composites are imperfect representations of any construct they might repre-
sent, they can still be useful approximations of those constructs. Regarding the
example of forest herb colonization, if stand age and distance from source con-
stitute the predominant landscape features of importance for the model, then a
composite formed from their effects will have a very general meaning. If stand
age and distance from source are just two of many unique landscape properties
of importance to herb colonization, our composite will simply represent their
combined influence rather than the broader construct.
For the sake of this discussion, it is recognized that model B is a special case
of model C where the x variables are considered to be without measurement
error. For this reason, in the immediate discussion I will contrast models A and
B, with the understanding that models B and C are equivalent with regard to
this evaluation.
For the first question, we must begin by asking whether the construct “Soil”
has causal influence on pH, moisture, and texture. Stated in a different way,
we are asking whether the variation among stands in soil conditions is such
that it causes common responses in pH, moisture, and texture; in which case the
three indicators share a common antecedent, which can be modeled as a latent
variable (of course,
if that latent factor is not the cause affecting other parts of the model, then a
completely different model structure would be implied). On the other hand,
in a model with an M➙C block structure, the indicators do not have common
antecedents, but rather, unique ones. In this example, it does not seem likely
that patch age and distance from source share a common antecedent.
The case is less clear for soil properties. Soil formation processes are such that
general soil conditions are capable of having common antecedents, such as
hydrologic and mineral influences. Thus, a potential basis for modeling Soil as
an L➙M block does exist based on this criterion.
Collectively, it would seem that the expectations of the authors, as reflected
by the four questions, would lean towards a specification of the multi-indicator
blocks as M➙C for both Soil and Landscape. This does not guarantee that this
is the correct causal structure, nor does it override the possibility that empirical
characteristics of the data might imply a different block structure. What is most
important to realize is that there are logical tests for developing, justifying, and
interpreting block structure, and that automatically presuming a conventional
hybrid model structure (L➙M) is not justified. Neither is it advisable to rely
simply on data properties to make the determination as to whether a construct
is best represented by a particular structure.
this assumption is stronger for the Landscape construct and less definitive for
the Soil construct. In this section, I consider some of the empirical properties
to see if either block has properties consistent with the L➙M form.
Table 6.2 presents covariance relations for colonization frequency and other
manifest variables (represented by the correlations and standard deviations) for
the species Lamium galeobdolon. While SEM is based on an analysis of covari-
ance relations (as discussed in Chapter 3), an inspection of correlations can be
instructive. As Bollen and Ting (2000) have described, there is an expectation
that a set of effect indicators associated with a single latent variable in an L➙M
block will be intercorrelated, as implied by Eq. (6.1). For such a set of indica-
tors, their degree of intercorrelation will depend on the strength of the common
influence of the latent cause, relative to their total variances. So, for model A,
we would expect conspicuous correlations among soil pH, moisture, and tex-
ture, because of the joint influence of Soil on those specific factors. We would
have similar expectations for a strong correlation between distance and patch
age. Again, the degree of correlation expected would depend on the relative
importance of the errors, although for a reliable set of indicators, correlations
should at least be moderately high. In contrast, for a set of causal indicators
associated with a single latent variable in a single M➙C block, there is no basis
for expecting any particular correlation pattern. None of the equations that apply
to model B imply common causal influence on sets of causal indicators. So, a
set of causal indicators may or may not intercorrelate in such a case, since our
equations do not describe their interrelationships, except that they are classified
as being of common interest.
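This contrast is straightforward to verify by simulation. In the sketch below (hypothetical data), three effect indicators driven by a common factor are substantially intercorrelated, while three independently generated causal indicators are not:

```python
# Sketch: effect indicators of one latent factor are necessarily
# intercorrelated, while causal indicators need not be.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
factor = rng.normal(size=n)
# Effect indicators: each driven by the common factor plus unique error,
# so every pair correlates at about 0.8 * 0.8 = 0.64
effects = np.vstack([0.8 * factor + 0.6 * rng.normal(size=n)
                     for _ in range(3)])
# Causal indicators: independent causes that could jointly form a composite
causes = rng.normal(size=(3, n))

print(np.round(np.corrcoef(effects), 2))   # conspicuous off-diagonals
print(np.round(np.corrcoef(causes), 2))    # off-diagonals near zero
```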
Inspection of Table 6.2 reveals a correlation between age and distance of
−0.5934. Thus, a correlation of moderate magnitude is observed for this pair
of indicators. In addition, the correlations between these variables and the other
variables in the model are, very approximately, of the same magnitude. Thus,
based on a crude inspection of correlations in the matrix, we are unable to rule
out the possibility that either an L➙M or M➙C block structure would be con-
sistent with the empirical characteristics of the indicators related to Landscape.
The correlations between pH, moisture, and texture are 0.0265, 0.1324, and
0.5767. The low magnitude of correlations between pH and the other indicators
in the block suggests that these three soil properties would not be likely to
represent redundant measures of the sort normally found in L➙M blocks. A
method for formally evaluating this has been proposed by Bollen and Ting
(2000) based on vanishing tetrads. Correlations/covariances among a set of
truly redundant indicators in an L➙M block should possess the mathematical
property of vanishing tetrads, with a tetrad being the difference between the
products of pairs of covariances among four random variables. It can be said
in this case that the pattern of correlations among pH, moisture, and texture do
not appear to be consistent with such a block structure.
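The tetrad computation itself is simple. The sketch below uses simulated indicators of a single factor (four are required to form tetrads; all values are hypothetical) and shows that the three tetrad differences are near zero, as the L➙M structure implies:

```python
# Sketch of the vanishing-tetrad idea (Bollen and Ting 2000): for four
# effect indicators of one factor, e.g. s12*s34 - s13*s24 should be ~0.
import numpy as np

def tetrads(S):
    """The three tetrad differences among four variables (indices 0..3)."""
    return (S[0, 1] * S[2, 3] - S[0, 2] * S[1, 3],
            S[0, 1] * S[2, 3] - S[0, 3] * S[1, 2],
            S[0, 2] * S[1, 3] - S[0, 3] * S[1, 2])

rng = np.random.default_rng(3)
n = 200_000
factor = rng.normal(size=n)
loadings = [0.9, 0.8, 0.7, 0.6]          # hypothetical loadings
x = np.vstack([l * factor + rng.normal(size=n) for l in loadings])
S = np.cov(x)                            # each cov is a product of loadings
print([round(t, 3) for t in tetrads(S)]) # all near zero
```

In the population every covariance equals the product of two loadings times the factor variance, so each tetrad cancels exactly; sample tetrads differ from zero only by sampling error, which is what the formal test evaluates.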
[Figure 6.9 diagram not reproduced.]
Figure 6.9. Standardized parameter estimates for model A. Chi-squared for model
fit was 45.20 with 10 degrees of freedom (sample size = 180) and a p-value
< 0.00005, indicating very poor fit of data to model expectations.
[Figure 6.10 diagram not reproduced.]
Figure 6.10. Select standardized parameter estimates for model C. Here Soil
and Landscape are composites with zero disturbance terms. Model chi-square was
6.005 with 3 degrees of freedom (p = 0.1107).
composite variables introduce four new paths that represent the effects of the
composites (the ηc s) on the ηe variables. In this particular case, the relationships
between the exogenous latent variables (ξ s) and the ηc variables are reduced to
single effects from each ξ . So, for our example, the ten potential paths from the
individual ξ variables to the ηe variables (e.g., in model D) are replaced with
five paths from the ξ to the ηc variables, plus four paths from the ηc variables
to the ηe variables. In spite of the net gain of a degree of freedom from this
substitution, problems remain with parameter identification. Ultimately, there is
a general problem that arises when attempting to identify all paths leading to, as
well as flowing out from, a composite. This problem is similar to the routinely
encountered problem associated with latent variables with effect indicators,
where the scale of the latent needs to be established. Both of these problems
can be solved in the same fashion by specifying a single incoming relationship,
which establishes the scale of measurement.
In spite of the fact that parameter identification issues can be resolved by
specification of select parameters, an issue still remains as to the significance of
the specified parameters. In model C (Figure 6.10), we set the unstandardized
coefficients for the paths from True pH to Soil and from True Distance to
Landscape to 1.0, to establish the scale of measurement for the composites (note,
this is not shown in Figure 6.10 because only the standardized parameters are
presented). This procedure ignores the question of whether there are significant
relationships associated with these paths. More specifically, does True pH have
a significant (nonzero) contribution to Soil, and similarly, does True Distance
[Figure 6.11 diagram not reproduced.]
Figure 6.11. Standardized parameter estimates for model D. Model D lacks com-
posites and represents a partially reduced form of model C.
shared variance explanation (described in Chapter 3). In this case, if we drop the
variables associated with landscape properties (True Distance and True Age,
as well as their indicators) from the model, the variance explained in Competi-
tors and Colonization is reduced to 19.5% and 22.1% respectively. Also, if we
keep the variables associated with landscape properties in the model, but drop
those associated with soil conditions, the variance explained in Competitors
and Colonization is found to be 21.1% and 57.0% respectively. A summary
of these results can be seen in Table 6.3, along with derivations of unique and
shared variance explanation. Here we can see that soil conditions and land-
scape properties had roughly equal unique variance explanation contributions
for Competitors (9.0% and 10.6%) with 10% shared between them. For Col-
onization, unique variance explanation was very dissimilar (0.4% and 35.3%)
for soil conditions and landscape properties, while 21.7% was shared between
the two groups of predictors. In this case, the path from Competitors to Colo-
nization was nonsignificant, therefore, all effects from predictors are direct and
no indirect effects on Colonization are described.
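The partitioning just described can be reproduced with a few lines of arithmetic. A minimal sketch in Python (the full-model R² values of roughly 0.301 and 0.574 are my back-calculations from the unique contributions quoted above, not figures taken directly from Table 6.3):

```python
def partition_r2(r2_full, r2_a_only, r2_b_only):
    """Partition explained variance into unique and shared components.

    r2_full   : R-squared with both predictor sets in the model
    r2_a_only : R-squared with only predictor set A
    r2_b_only : R-squared with only predictor set B
    """
    unique_a = r2_full - r2_b_only   # what A explains beyond B
    unique_b = r2_full - r2_a_only   # what B explains beyond A
    shared = r2_full - unique_a - unique_b
    return unique_a, unique_b, shared

# Competitors: soil-only R2 = 0.195, landscape-only R2 = 0.211
u_soil, u_land, shared = partition_r2(0.301, 0.195, 0.211)
# -> roughly 0.090 unique to soil, 0.106 unique to landscape, 0.105 shared

# Colonization: soil-only R2 = 0.221, landscape-only R2 = 0.570
u_soil2, u_land2, shared2 = partition_r2(0.574, 0.221, 0.570)
# -> roughly 0.004 unique to soil, 0.353 unique to landscape, 0.217 shared
```

The same three-way decomposition applies to any pair of predictor groups, provided the reduced models are fit to the same data.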
[Figure 6.12 appears here: a path diagram in which four composites (Soil→Com, Soil→Col, Land→Com, Land→Col) convey the effects of the soil and landscape latent variables on Competitors and Colonization.]
Figure 6.12. Select standardized parameter estimates for model E. Here four
composites are included, one for each response variable.
our best model for the situation. On the other hand, if Eq. (6.9) does not hold
because regression relations are not similar, our model will underestimate the
variance explanation for Competitors and Colonization. It can easily be seen that when the gammas in Eqs. (6.7) and (6.8) are very different, the gammas in Eq. (6.9) will be too inaccurate to predict the ηe values. To
evaluate this possibility, we consider one additional model, which we refer to
as model E. In this model, separate composites are derived to estimate effects
on Competitors and Colonization.
Results for model E are shown in Figure 6.12. As with model D, when
all paths are included there are no degrees of freedom for testing model fit.
However, deletion of nonsignificant paths allows for model testing and, again,
no unspecified paths are indicated. As before, results presented in the figure are
the standardized parameter estimates for the model, with all paths included.
[Figure 6.13 appears here: a construct diagram in which Macrohabitat Type and Microhabitat Conditions influence Anuran Diversity.]
Figure 6.13. Construct model relating macrohabitat type and microhabitat con-
ditions to anuran diversity.
features at the site, including the area of open water and the mean and maxi-
mum water depths.
In conjunction with our presentation of Example 1, we provided a detailed
consideration of how constructs may be represented and both theoretical and
empirical criteria for arriving at decisions about block structure. In this second
example, our emphasis is more on the question of how to model a situation where
an endogenous variable (microhabitat conditions) has multiple indicators and
may involve composites. This question was not addressed in Example 1, where
multiple indicators existed only for the exogenous constructs.
In the current example, we begin our analysis with the construct labeled
Macrohabitat Type. Since our measure is nominal and multi-level (whether a site
is classified as lake, impoundment, swale, or riverine), it immediately suggests
the need to model this construct using a set of dummy variables representing the
possible macrohabitat types. We can assume for the sake of simplicity that
the classification of individual sites as to habitat type was correct. Therefore,
the presumption is that the construct Macrohabitat Type can be modeled using
either an M➙L or M➙C block type. The deciding factors for choosing between
these two block types are whether we believe we have a complete sampling of
all possible macrohabitat types, and whether we can derive an estimate of the
error of prediction for Macrohabitat Type. Since neither of these criteria hold
in this case, the M➙C block structure seems more appropriate.
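Constructing the dummy variables is mechanical; a small sketch (function and variable names are mine, purely illustrative), with riverine omitted so that it serves as the baseline:

```python
def dummy_code(sites, levels, baseline):
    """Convert a nominal variable to 0/1 dummy columns, omitting the baseline level."""
    kept = [lev for lev in levels if lev != baseline]
    return {lev: [1 if s == lev else 0 for s in sites] for lev in kept}

sites = ["lake", "swale", "riverine", "impoundment", "lake"]
dummies = dummy_code(sites,
                     ["lake", "impoundment", "swale", "riverine"],
                     baseline="riverine")
# A riverine site scores 0 on every dummy column, so it serves as the
# reference condition against which the other macrohabitats are compared.
```

Coefficients on the retained dummies are then interpreted as deviations from the omitted (riverine) condition.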
The construct labeled Microhabitat Conditions is one for which the specific details of how the measured variables would interrelate were not known a priori. For this reason, Lichtenberg et al. performed an exploratory factor analysis to see whether the correlations among the many measured variables might suggest the operation of a smaller number of latent factors. I will not go into the details of that analysis here, except to say that the authors recognized two factors of importance to anuran diversity: the abundance of herbaceous vegetation and the abundance of leaf litter. Based on the conceptualization of Microhabitat Conditions by Lichtenberg et al., it is clear that the indicators
could represent a collection of factors that affect anurans, based on theoretical
grounds. Thus, we begin with the expectation of a block structure of L➙C,
with two latent variables, Herbaceous Vegetation and Litter, contributing to the
construct. A total of seven indicators of the two latent variables were included
in their final model (see below).
Lichtenberg et al. discuss certain issues of measurement regarding the con-
struct Anuran Diversity. It is widely held that there are several causes of
measurement error for wildlife populations and communities. In addition to
the usual matter of sampling, varying detectability can contribute to error.
Lichtenberg et al. addressed the issue of detectability to some degree by using
[Figure 6.14 appears here: a path diagram in which macrohabitat dummy variables (lake, impound, swale) form a composite influencing Microhabitat, which is informed by the latent variables Herbaceous (vhit1, vhit2, herbl, herbc) and Litter (wlitr, litrd, litrc), with Diversity indicated by richness (rich, y8).]
Figure 6.14. Model F, which shows one of the possible ways that indicators
could be related to the construct model shown in Figure 6.13. Refer to Table
6.3 for definitions of the observed variables. Note that by omission, the riverine
macrohabitat condition represents the baseline against which other macrohabitats
are compared.
the total number of species recorded across samplings, instead of the mean.
Nevertheless, error in assessing the true number of species at each site is likely
to be significant and, while no estimate of this error exists, we again use an
arbitrary estimate of 10% of the total variance.
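Fixing an error variance from an assumed unreliability is a one-line calculation; a sketch using the standard deviation of 2.170 reported for rich in Table 6.4 and the arbitrary 10% error figure assumed in the text:

```python
def fixed_error_variance(sd, error_fraction):
    """Error variance to fix in the model, given an indicator's standard
    deviation and an assumed fraction of its variance that is measurement error."""
    return error_fraction * sd ** 2

# 10% of the variance of the richness indicator (sd = 2.170 in Table 6.4)
err_var = fixed_error_variance(2.170, 0.10)  # about 0.471
```

The resulting value is what one would fix the indicator's error variance to in the model specification.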
[Figure 6.15 appears here: the same indicators as in Figure 6.14, but with separate paths from Macrohabitat to the Herbaceous and Litter latent variables.]
Figure 6.15. Model G, which differs from model F by allowing separate coeffi-
cients (represented by separate paths) to convey the effects of Macrohabitat on the
Herbaceous and Litter dimensions of Microhabitat.
with their declarations as composites, the error variances for Macrohabitat and
Microhabitat are set to zero.
Of the models we consider here, model F is the most abstract. At the same
time, model F is based on the greatest number of assumptions. It is presumed
in this model that the influences of Macrohabitat (ηc1) on Microhabitat (ηc2) can be summarized by a single coefficient, βc2,c1, even though ηc2 is actually a predictor of the Herbaceous and Litter effects on Diversity rather than of the microhabitat conditions themselves. Stated in other terms, in model F the
covariances between x1 –x3 and y1 –y7 must be resolved by their joint relations
to a linear predictor that depends on ηe3 . These are fairly critical assumptions in
that their failure is likely to mean an underestimate of Macrohabitat effects on
Microhabitat, and possibly unresolved covariances among manifest variables.
An alternative formulation that is perhaps more biologically meaningful,
model G (Figure 6.15), represents the modeling implications of relaxing the
assumptions just discussed for model F. Here we are allowed to consider effects
of Macrohabitat on the two dimensions of Microhabitat, Herbaceous and Lit-
ter. Since these two dimensions are independently estimated in L➙M blocks,
the interpretative meaning of the paths from Macrohabitat to each dimension
is clear, and also independent of other relationships in the model. In contrast,
since the composite Microhabitat is defined as a linear predictor of Diversity,
it depends on the covariances between three latent variables, ηe1 , ηe2 , and ηe3 .
If any of the covariances among these variables changes, the meaning of ηc2
changes, and thus, the meaning of a direct path from Macrohabitat to Micro-
habitat (as in model F). In model G, the effect of Macrohabitat on Microhabitat
can be summarized by calculating the total effect of ηc1 on ηc2 , which can be
summarized by the equation
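In general, the total effect of one variable on another in a recursive model can be obtained from the matrix of direct effects via the standard identity T = (I − B)⁻¹ − I. A sketch with made-up coefficients (not those of model G):

```python
import numpy as np

def total_effects(B):
    """Total effects among variables from a matrix of direct effects.

    B[i, j] is the direct effect of variable j on variable i; for a
    recursive model the total-effect matrix is (I - B)^-1 - I, which
    sums every compound path as well as the direct one."""
    I = np.eye(B.shape[0])
    return np.linalg.inv(I - B) - I

# Illustration: x1 -> x2 (0.5), x1 -> x3 (0.2), x2 -> x3 (0.4)
B = np.array([[0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0],
              [0.2, 0.4, 0.0]])
T = total_effects(B)
# T[2, 0] = direct (0.2) + indirect (0.5 * 0.4) = 0.4
```

The same identity underlies the decomposition of effects into direct and indirect components discussed elsewhere in this book.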
[Figure 6.16 appears here: a path diagram for model H, with composites Macro→Div, Macro→Lit, and Macro→Herb conveying macrohabitat effects, plus a Microhabitat composite.]
Figure 6.16. Model H, which differs from model G by using separate composites
(ηc1 –ηc3 ) to convey the effects of individual Macrohabitat types on ηe1 , ηe2 , and
ηe3 .
[Figure 6.17 appears here: a partially reduced path diagram for model I, with the same indicators as the preceding models but without composites.]
Figure 6.17. Model I. Partially reduced form model representing effects of Macro-
habitat and Microhabitat nominally through grouping variables.
Table 6.4. Correlations among variables related to anuran richness and their standard deviations. “rich” refers to the number of anuran species at a site, “imp” refers to impoundments, “vhit2” and “vhit1” are measures of vegetation density, “herbl” and “herbc”
are measures of dead and live herbaceous vegetation, “wlitr” is woody litter, “litrd” is litter depth, and “litrc” is litter cover
rich lake imp swale vhit2 vhit1 herbl herbc wlitr litrd litrc
rich 1.0
lake 0.696 1.0
imp −0.167 −0.355 1.0
swale −0.431 −0.659 −0.253 1.0
vhit2 0.372 0.167 0.111 −0.099 1.0
vhit1 0.222 −0.156 0.552 −0.118 0.653 1.0
herbl 0.060 −0.252 0.562 −0.009 0.581 0.825 1.0
herbc 0.091 −0.087 0.419 −0.132 0.437 0.745 0.756 1.0
wlitr 0.509 0.430 −0.284 −0.099 −0.051 −0.290 −0.395 −0.396 1.0
litrd 0.238 0.146 −0.433 0.383 0.027 −0.097 −0.180 −0.281 0.419 1.0
litrc 0.219 0.194 −0.442 0.273 −0.118 −0.414 −0.509 −0.580 0.568 0.762 1.0
sd 2.170 0.510 0.332 0.476 0.512 1.482 0.173 0.122 0.100 0.122 0.148
Composite variables and their uses 175
[Figure 6.18 appears here: path diagram of model I results with standardized coefficients.]
Figure 6.18. Results obtained for model I, showing standardized values for path
coefficients and the error variances of latent endogenous variables. Correlations
among xs and errors for ys are not shown, for simplicity. Nonsignificant effects
of macrohabitat types were dropped from the final model, which possessed a chi-
square of 50.42 with 39 degrees of freedom and a p of 0.104.
Results for model I. The results for model I are given in Figure 6.18. For the
purposes of this evaluation (and in contrast to our practice in Example 1),
nonsignificant effects of macrohabitat types on endogenous variables were
dropped from the final model. In spite of the small sample size, results were
stable and the fit between model expectations and data was acceptable. As these
results show, impoundments had significantly higher levels of herbaceous veg-
etation than did other macrohabitat types. Litter accumulation, in contrast, was
substantially higher in lakes and swales than in impoundments and riverine
habitats (recall, the riverine variable was omitted and, therefore, serves as the
baseline condition). Anuran diversity was found to be higher in lakes than in
all other habitat types.
Results for model H. Model H provides a single path from each composite to replace the multiple paths that would otherwise connect one construct with another. Aside from that, the models are very similar. Results from the
estimation of model H are presented in Figure 6.19. A comparison of results
from models I and H shows numerous similarities, and a few differences. Model
fit parameters are identical for the two models. Also, variance explanation for
[Figure 6.19 appears here: path diagram of model H results with standardized coefficients, including paths from Herbaceous (1.18) and Litter (0.87) to the Microhabitat composite and from Microhabitat to Diversity (0.42).]
Figure 6.19. Results obtained for model H, showing standardized values for path
coefficients and the error variances of latent endogenous variables. Correlations
among xs and errors for ys are not shown, for simplicity. Nonsignificant effects
of macrohabitat types were dropped from the final model, which possessed a chi-
square of 50.42 with 39 degrees of freedom and a p of 0.104; precisely as for
model I.
endogenous latent variables is the same, with R² values of 0.33, 0.39, and 0.78 for ηe1–ηe3, respectively. Loadings in L➙M blocks are the same for both models, as
are outward paths from composites possessing single causes (ηc1 to ηe3 and ηc3
to ηe1 ), in comparison to the equivalent effects in model I (x1 to ηe3 and x2 to ηe1 ).
Composites with multiple causes in model H yielded parameters not found in
model I, such as those associated with ηc4 to ηe3 (Microhabitat to Diversity)
and ηc2 to ηe2 (Macro➙Lit to Litter). The path coefficients associated with
these paths represent standardized collective effects of the composites’ causes
on the response variables involved. Heise (1972) referred to these coefficients
as “sheath” coefficients to designate the fact that they represent a collection of
causes.
The most conspicuous differences between models I and H reside with
the paths from causes to composites. For example, in model I, the effects of
Herbaceous and Litter on Diversity are represented by two path coefficients
(0.50 and 0.37), while in model H, the same effects are represented by three
paths, two from Herbaceous and Litter to Microhabitat (1.18 and 0.87), and
one from Microhabitat to Diversity (0.42). Upon first examination, the paths
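The correspondence between the two parameterizations is simple arithmetic: each compound path through the composite is the product of its two segments, and it recovers the single-path coefficient of model I within rounding:

```python
# Model H: Herbaceous -> Microhabitat (1.18), Litter -> Microhabitat (0.87),
# Microhabitat -> Diversity (0.42). Model I reports direct paths of 0.50 and 0.37.
herb_effect = 1.18 * 0.42    # approximately 0.50
litter_effect = 0.87 * 0.42  # approximately 0.37
```

This is why collapsing multiple paths into a composite changes the presentation of the results without changing the underlying effects.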
[Figure 6.20 appears here: path diagram of model G results with standardized coefficients.]
Figure 6.20. Results obtained for model G, showing standardized values for path
coefficients and the error variances of latent endogenous variables. Correlations
among xs and errors for ys are not shown, for simplicity. Chi-square for this model
was 30.03 with 24 degrees of freedom and a p of 0.1837.
that are any more complex than necessary. When a simple modeling approach
will suffice, it is to be preferred. Also, when one is starting out with SEM, it is
advisable to begin with simple models to understand the fundamentals involved.
As alluded to in Chapters 4 and 6, there is a long history of misuse of latent variables; although they appear in most basic presentations of SEM, using them properly is an advanced skill.
In this chapter, I present brief introductions to a number of more advanced topics. Nearly all of these topics have one or more books, or many articles, written about them, and I will not be able to cover any of them in the depth they deserve in this introductory text. Some
beginning users of SEM may wish to skim this chapter and save the mastery
of these topics for another day. Others may find that their applications require
them to confront some of the topics covered in this chapter from the beginning.
In addition to providing a brief introduction and a limited number of exam-
ples, I will provide references to in-depth treatments of the topics. It can be
argued that SEM is akin to a construction process. As with modern building
construction, tools, techniques, and materials can be used in a variety of cre-
ative ways to solve problems. Here I illustrate a few more of the tools that are
available.
Multigroup analyses
One of the early elaborations of standard SEM practice was the development
of the multigroup analysis. It is not uncommon that one wishes to address
questions about groups. For example, samples may include male and female
members of a population and one may wish to ask whether the multivariate
relations of interest are the same for both groups. Or, we may be interested in
complex interactions involving the responses of different groups. Multigroup
analyses can range from simple (e.g., comparing regressions between groups)
to complex (e.g., evaluating the effects of treatment applications on models
containing conceptual variables). As a fundamental tool in the SEM toolbox,
the multigroup analysis is quite a powerful capability.
The essence of the multigroup analysis is that elements in a data set can
be assigned to one or another group membership, and comparative models that
apply to each group can be developed. Of course, separate models for each group
of samples can always be developed and evaluated separately. However, such
an approach misses the opportunity to arrive at a general answer. What differs
in multigroup analysis is that it is possible to evaluate for all model param-
eters whether they differ between groups and, therefore, whether the groups
Additional techniques for complex situations 183
[Figure 7.1 appears here: a LISREL-style diagram with an exogenous latent ξ1 (indicators x1–x3, loadings λ1–λ3, errors δ1–δ3), endogenous latents η1 (indicators y1–y3) and η2 (indicators y4–y6), path coefficients γ11, γ21, and β21, and disturbances ζ1 and ζ2.]
Figure 7.1. Example model used to discuss multigroup analysis.
Procedures
In multigroup analysis, there are quite a few parameters to compare between
groups (all path coefficients, variances, error terms, and means). Figure 7.1
provides us with an example model to aid in the description of the process.
Additionally, the reader can refer to Appendix 5.1 where a representation of
the LISREL model is given, which may be helpful if one wishes to better
understand the mathematical notation. Essentially, the lambdas (λ) describe
the loadings between latent and observed variables, deltas (δ) are errors for
indicators of exogenous latent variables, while epsilons (ε) are errors for indi-
cators of endogenous latent variables. Gammas (γ ) describe path coefficients
relating endogenous to exogenous latents, while betas (β) describe path coeffi-
cients relating endogenous to other endogenous latent variables. Zetas (ζ ) are
disturbance terms for endogenous latents.
Because of the large number of comparisons being made, care must be taken
to arrive at a stable solution. Two general strategies have been proposed for
comparing parameters in multigroup analysis. In the first of these, one begins
by performing separate analyses for each group allowing all parameters to
differ between groups. Equality constraints are progressively added and the
appropriateness of such constraints evaluated. Bollen (1989, pages 355–365)
has suggested the following progression as a reasonable way to proceed:
1. Hform : the same model form applies to all groups, with no cross-group equality constraints;
2. HΛ : the factor loadings (Λ) are equal across groups;
3. HΛΘ : the loadings and the measurement error variances (Θ) are equal;
4. HΛΘBΓ : the loadings, measurement errors, and structural coefficients (B and Γ) are equal;
5. HΛΘBΓΨΦ : all parameters, including Ψ and Φ, are equal across groups;
where Θ is a matrix containing the error variances and covariances for the observed variables, B is a matrix of path coefficients relating endogenous latent variables to one another, Γ is a matrix of path coefficients relating endogenous to exogenous latent variables, Ψ specifies the covariances among the disturbances of the endogenous latent variables, and Φ specifies the covariances among the exogenous latent variables.
It is always recommended that the first step in a multigroup analysis should be
to consider whether the appropriate model for both groups is of the same form,
for if that analysis fails, it makes little sense to evaluate all the individual param-
eters. The addition of constraints can deviate from the above sequence, how-
ever, depending on the comparisons of greatest theoretical importance. In this
sequence, parameters associated with the measurement model are considered
first, followed by an assessment of similarity in the structural model. Correla-
tions among errors and disturbances are often of least interest, and considered
last. In cases where the questions of most interest have to do with differences
between groups in the structural model (the betas and gammas), they may be
evaluated first. Model evaluation typically employs single-degree-of-freedom χ² tests as well as comparative indices such as AIC.
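Such single-degree-of-freedom tests compare a constrained model against a freer nested one by the difference in their chi-squares. A sketch using scipy (the model chi-squares here are invented for illustration, not taken from an analysis in this book):

```python
from scipy.stats import chi2

def chisq_difference_test(chisq_constrained, df_constrained, chisq_free, df_free):
    """P-value for the chi-square difference between nested models.

    The constrained model (with equality constraints added) has the larger
    chi-square and more degrees of freedom; a small p-value indicates the
    constraint significantly worsens fit, i.e., the groups differ."""
    delta_chisq = chisq_constrained - chisq_free
    delta_df = df_constrained - df_free
    return chi2.sf(delta_chisq, delta_df)

# Hypothetical: constraining one path to be equal across groups raises
# chi-square from 50.4 (df = 39) to 54.3 (df = 40)
p = chisq_difference_test(54.3, 40, 50.4, 39)  # p just under 0.05
```

AIC comparisons follow the same logic but penalize each freed parameter, and do not require the models to be nested.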
An illustration
In 1993 and 1994, Heli Jutila conducted a study of vegetation in coastal areas
of Finland. Samples were taken from paired grazed and ungrazed meadows
across several geographic areas. The emphasis in the SEM analysis of these
data (Grace and Jutila 1999) was to understand patterns of species richness and
how they might be influenced by grazing. The effects of geographic region, soil
conditions, and elevation/flooding on community biomass and richness were
[Figure 7.2 appears here: paired path diagrams for ungrazed (upper) and grazed (lower) meadows, with composites SITE, SOIL, FLOD, and BIOM predicting RICH; R² for richness is 0.56 in ungrazed and 0.47 in grazed meadows.]
Figure 7.2. Multigroup results for study of grazing effects. Paths marked with
asterisks in the lower figure are statistically different among groups. From Grace
and Jutila (1999), by permission of Oikos.
believed to apply to both grazed and ungrazed sites, forming the basis for a
multigroup analysis. The form of the model and results for the two groups
are shown in Figure 7.2. The effects of geographic region were modeled as
categorical effects using dummy (yes or no) variables. It is necessary when
modeling with dummy variables to omit at least one variable (in the case of
SITE, sites 1 and 4 were equivalent and were omitted). The interpretation of
the SITE effect is the influence of being at sites 2 and 3 relative to 1 and 4. The
overall influence of sites was modeled as a composite, as described in Chapter
5. SOIL was likewise a composite of the effects of different parent materials
(par 1–6), soil types (sol 1–5), and the depth of the litter layer (dol), with all but
dol being dummy variables. Both flooding (FLOD) and community biomass
(BIOM) had nonlinear relationships with richness, and this was also modeled
using composite effects (this procedure is discussed in more detail later in the
chapter). One set of parameters evaluated in the multigroup analysis but not
shown in Figure 7.2 is the variances, which in this case were not found to differ
significantly. Means were not evaluated in this case as part of the multigroup
analysis.
We can see from the multigroup results (Figure 7.2) that relationships
between SOIL and SITE, as well as SOIL and FLOD, differed between groups.
These differences support the case for suspecting that the soil differences
between grazed and ungrazed meadows might have resulted from grazing itself.
The results further indicate that in the grazed meadows, species richness was
unrelated to site or soil effects, while these relationships were both modestly
important in ungrazed meadows. There is a suggestion from these differences
between groups that grazing effects on richness may be overwhelming site and
soil influences. The other main difference between groups in this study is the
relationship between biomass and richness. We can see that there is evidence
for a strong, predominantly negative effect of biomass on richness in ungrazed
meadows (path coefficient = −0.30). However, in grazed meadows there is no
significant relationship between biomass and richness.
As illustrated in this presentation, multigroup analyses allow for a detailed
inspection of how groups differ. One of the greatest strengths of this type of
analysis is the ability to explicitly consider interactions between the group vari-
able and all within-group relationships. The ability to consider differences in
variances and means between groups also allows for an analysis framework
that is superior to traditional analysis of variance approaches in many regards.
As SEM becomes used more in manipulative experiments, we can expect that
multigroup analyses will become very popular because of the power and flexi-
bility they provide.
The third step in the probit method involves analysis of the polychoric cor-
relation matrix in place of the observed correlation matrix. We should be aware
that since y is categorical, the mean and variance are not identified parameters.
Therefore, the mean is set to 0 and the variance to 1. As a result, this is one
of the few cases in SEM where we must evaluate the correlation matrix instead
of the covariance matrix.
The remaining problem to address when dealing with categorical response
variables is nonnormality. The earliest approach to addressing this problem
was the asymptotic distribution-free method. Studies have subsequently shown
that extremely large sample sizes are required for this method to yield reliable
results. As an alternative approach, Muthén (1984) has proposed a weighted
least squares method that has been found to behave reasonably well in providing
approximately correct standard errors.
An illustration
When making observations of wildlife species, individuals are often detected only occasionally; as a result, animal count data typically contain many zeros. In a study of the responses of grassland birds
to fire history, Baldwin (2005) found that individual species were encountered
infrequently such that it was best to simply represent the data as either observed
or unobserved. Given the data in this form, the questions posed were about how
different habitat factors relate to the probability of observing a bird species.
In this case, the categorical observations represent y, while the probability of
seeing a bird represents y*. Thus, we envisage the problem as one in which the
probability of seeing a bird varies continuously with habitat conditions, while
some threshold determines whether a bird will be observed or not.
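The threshold formulation can be made concrete with a small simulation (entirely synthetic; not Baldwin's data or model): a latent propensity y* varies continuously with habitat, and a bird is recorded only when y* exceeds a threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100_000
habitat = rng.normal(size=n)          # standardized habitat score
beta = 0.8                            # assumed effect of habitat on the latent propensity
y_star = beta * habitat + rng.normal(size=n)   # latent, continuous propensity
threshold = 0.0
observed = (y_star > threshold).astype(int)    # what the field data actually record

# With a zero threshold and symmetric noise, about half of all visits record
# a bird, but the probability rises smoothly with habitat quality.
p_overall = observed.mean()
p_good = observed[habitat > 1].mean()   # sites with high habitat scores
p_poor = observed[habitat < -1].mean()  # sites with low habitat scores
```

Probit-based SEM works backwards from the observed 0/1 data to the relationships governing y*.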
In this study, the primary driving variable of interest was the number of years
since a grassland unit was burned (typically by prescribed burning). Fifteen
management units were selected for sampling, with five having been burned
within one year, five within one to two years, and five within two to three
years. Sampling along transects within the management units produced 173
observations at a scale where bird occurrence could be related to vegetation
characteristics. The vegetation characteristics of presumed importance included
vegetation type, density of herbaceous vegetation, and the density and types of
shrubs. The hierarchical nature of the data was accommodated using cluster
techniques, which are discussed later in the chapter.
Figure 7.3 illustrates the hypothesis evaluated for each of the four most
common species. The two primary questions of interest were (1) whether the
probability of finding a bird in a habitat was affected by the time since burning,
[Figure 7.3 appears here: the hypothesized model relating vegetation type (Andro., Spart.), shrub density (shrublow, shrubmed), burn year (brnyr1, brnyr2), and herbaceous density (herblow, herbmed) to bird occurrence (bird?).]
and (2) whether any relationship between time of fire and the probability of
birds could be explained by associated vegetation characteristics. One may ask why shrub density and herbaceous density are represented as categorical variables. In theory, these variables could be represented as continuous influences.
However, examination of the data revealed that the relationships between birds
and vegetation density were nonlinear. The use of ordered categories of veg-
etation density proved effective in this case as a means of dealing with those
nonlinearities.
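Converting a continuous density into ordered categories is straightforward; a sketch with numpy (the cut points are arbitrary illustrations, not values from the study):

```python
import numpy as np

def to_ordered_categories(values, cut_points):
    """Bin a continuous variable into ordered categories (0, 1, 2, ...).

    Using ordered categories instead of the raw variable removes any
    assumption that the response is linear in the original scale."""
    return np.digitize(values, bins=cut_points)

density = np.array([0.2, 1.5, 3.8, 0.9, 6.0])
categories = to_ordered_categories(density, cut_points=[1.0, 4.0])
# 0 = low (< 1.0), 1 = medium (1.0 to < 4.0), 2 = high (>= 4.0)
```

The ordered categories can then be handled with the same polychoric machinery described earlier for categorical responses.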
The results from this analysis for one of the bird species, Le Conte's sparrow,
are given in Figure 7.4. For this species, the probability of observing birds was
least in fields burned three years previously, thus, there was a positive associa-
tion between this species and year 1 fields and year 2 fields. These associations
could be explained as indirect effects, since the associated vegetation conditions
(particularly low shrub density) provide an adequate explanation for the asso-
ciation with burn year. It can be seen that there was an additional influence of
medium levels of herbaceous vegetation on Le Conte's, as well as a preference
for vegetation types that include Andropogon (upland prairie dominated by lit-
tle bluestem). Altogether, these results indicate that this species is most highly
associated with habitat that has been burned between 1 and 2 years previously,
that has a medium level of herbaceous cover, and that is upland prairie. Not
[Figure 7.4 appears here: path diagram of results with standardized coefficients, including effects of vegetation type (Andro.) and medium herbaceous density.]
Figure 7.4. Results for LaConte’s sparrows (from Baldwin 2005). A composite
variable, comm, was included to represent the collective effect of vegetation type.
only does the use of categorical modeling accommodate the nature of the data
in this case, the nature of the results would seem to be in a form quite usable
for making decisions about habitat management.
Nonlinear relationships
It has been my experience that nonlinear relationships are very common in eco-
logical problems. There are actually several different ways nonlinear relations
can be addressed, some quite simple, and some more involved. The simplest
means of addressing nonlinear relations is through transformations. In many
cases, transformed variables can be used in place of explicitly modeled non-
linear relations. For monotonic nonlinear relations, simple transformations can
often be quite effective. For unimodal or more complex relations, however, care must be taken that linearizing one relationship does not create a new nonlinear relation with some other variable. When this can be avoided, transformations work well. An additional way to deal with nonlinear relations was demonstrated in the previous section dealing with bird
populations. By converting vegetation density into an ordered categorical vari-
able, any assumption of a linear relationship involving vegetation was removed.
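The transformation route can be checked directly; a sketch showing how a log–log transform linearizes a monotonic power relationship (synthetic data):

```python
import numpy as np

x = np.linspace(1.0, 10.0, 50)
y = 2.0 * x ** 1.8            # monotonic, nonlinear power relationship

r_raw = np.corrcoef(x, y)[0, 1]
r_log = np.corrcoef(np.log(x), np.log(y))[0, 1]
# The log-log relationship is exactly linear (log y = log 2 + 1.8 log x),
# so r_log is 1 to machine precision, while the raw correlation
# understates the strength of the association.
```

In practice one would inspect scatterplots of the transformed variables against the other variables in the model to confirm that no new nonlinearity has been introduced.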
Multigroup analysis, as discussed in the first section of this chapter, can be used
An illustration
Here I give some results from Grace and Keeley (2006) who examined the
response of vegetation to wildfire in California chaparral. Some nonlinearities
could be addressed using transformations. However, the relationship between
species richness and plant cover was one where an explicit modeling of the
nonlinearity was of interest. The results of this analysis can be seen in Figure
7.5, and afford us a means of seeing how this problem was addressed. First,
visual inspection of the relationship between cover and richness revealed a
simple unimodal curve. For this reason, we felt that a polynomial regression
approach could be taken using cover and cover squared. This type of approach
to modeling nonlinearities can often be effective, and is particularly appropriate
where some optimum condition for a response exists. In this case, it appears that
100% cover is optimal for richness, and that at higher levels, richness declines.
The introduction of a higher order term, such as Plant Cover Squared, creates
some potential problems. First, variables and their squares are highly correlated
and this must be modeled. In Figure 7.5, the magnitude of the correlation
between Plant Cover and Plant Cover Squared is not shown, as it is not of
substantive interest. Instead, we followed the recommendation of Bollen (1998)
and represented the relationship using a zigzag arrow that depicts that Plant
Cover Squared is derived from Plant Cover. In some cases, the introduction of
a higher order term results in a host of additional covariances in the model that
must be included, but which are not of substantive interest because the higher
order term is merely included for the purpose of representing the nonlinearity.
To solve some of the potential problems associated with the inclusion of poly-
nomial terms, the higher order terms should be derived by first zero-centering
194 Structural equation modeling and natural systems
Figure 7.5. Relationship between herbaceous richness and various factors fol-
lowing fire in southern California chaparral (from Grace and Keeley 2006). This
model includes an explicit nonlinearity between Plant Cover and Richness, along
with a composite variable summarizing the relationship called Optimum Cover.
the original variable before raising to a higher power. This reduces the correla-
tion between first and second order terms. Second, one should always specify a
correlation between polynomial terms to represent their intercorrelations. High
correlations between terms may still exist. However, the use of maximum like-
lihood estimation reduces problems potentially created by high multicollinear-
ity, such as variance inflation, nonpositive definite matrices (ones that contain
apparent contradictions), and empirical underidentification.
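The centering recommendation can be illustrated numerically. The cover values and the quadratic response below are simulated, not the Grace and Keeley data; the sketch shows how zero-centering shrinks the correlation between a variable and its square, and how the fitted quadratic recovers an optimum:

```python
import numpy as np

rng = np.random.default_rng(1)
cover = rng.uniform(20.0, 160.0, 400)   # hypothetical percent plant cover

raw_sq = cover ** 2
centered = cover - cover.mean()
centered_sq = centered ** 2

r_raw = np.corrcoef(cover, raw_sq)[0, 1]               # near 1 for positive data
r_centered = np.corrcoef(centered, centered_sq)[0, 1]  # near 0 after centering

# simulated unimodal richness response peaking at 100% cover
richness = 50.0 - 0.004 * (cover - 100.0) ** 2 + rng.normal(0, 2, 400)

# least-squares fit of richness = b0 + b1*c + b2*c^2 on the centered terms
X = np.column_stack([np.ones_like(centered), centered, centered_sq])
b0, b1, b2 = np.linalg.lstsq(X, richness, rcond=None)[0]

# the vertex of the fitted parabola, back in original cover units
optimum = cover.mean() - b1 / (2 * b2)
```

The vertex calculation also shows why the two coefficients only have meaning jointly: the optimum depends on both b1 and b2 together, not on either path alone.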
One further problem associated with polynomial regression is the interpre-
tation of path coefficients. Typically, we would interpret a path coefficient, say
from Plant Cover to Richness, as the expected change in Richness if we were
to vary Plant Cover while holding the other variables in the model constant.
However, it is not reasonable to expect to be able to hold Plant Cover Squared
constant, since it is directly calculated from Plant Cover. Thus, the independent
coefficients from Plant Cover and Plant Cover Squared do not have meaning,
but instead, it is their collective effects that are of interest. To address this, we summarized the joint influence of the two terms using a composite variable, labeled Optimum Cover in Figure 7.5.
An illustration
In the previous section, the illustration involved a study of vegetation responses
following wildfire (Grace and Keeley 2006). Nonlinear relations were illus-
trated as part of a static model representing spatial variations in diversity across
a heterogeneous landscape. In the same study, the authors were also interested
in certain hypotheses regarding changes in diversity over time. Previous studies
suggested that following fire in these systems, there is an initial flush in herba-
ceous plant diversity associated with fire-induced germination. Over time, it is
expected that diversity will decline as time since disturbance increases. Exam-
ination of the temporal dynamics of richness (Figure 7.6A) shows that there
was only a hint of a declining trend. The authors were aware, however, of sub-
stantial year to year variations in rainfall. Furthermore, there is strong evidence
that rainfall in this system is a substantial limiting factor for plant richness.
Patterns of year to year rainfall (Figure 7.6B) suggest the high levels of rich-
ness observed in 1995 and 1998 can be explained by the very high amounts
of rainfall in those years. To evaluate this hypothesis, a latent growth model
was constructed, with annual rainfall included as a time-varying covariate. The
results are shown in Figure 7.7.
Figure 7.6. A. Changes in herbaceous richness over a five-year period following
fire (from Grace and Keeley 2006). B. Variations in annual precipitation accom-
panying changes in richness.
The results from the latent growth analysis of richness showed that the data
are consistent with the hypothesis of a general decline over time. When the
influence of rainfall was controlled for in the model, an average decline of 4.35
species per year was estimated for the five-year period, although the decline
function was not strictly linear. For the overall model, a robust chi-square of
14.2 was found with 9 degrees of freedom and an associated p-value of 0.115.
The variance in richness explained by the model for each of the five years was 34, 62,
68, 76, and 60%. Overall, these results support the contention that behind the
somewhat noisy observed dynamics are two major influences, post-disturbance
decline and responsiveness to precipitation.
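The logic of controlling for a time-varying covariate can be sketched with ordinary least squares on simulated data. This is a simplified analogue of the latent growth analysis, not the authors' actual model, and every number below is invented; the point is that a rainfall-driven signal can mask a decline until rainfall is held constant:

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.tile(np.arange(5), 40)   # 40 hypothetical plots observed for 5 years
rain = np.tile(np.array([1.0, 2.6, 1.2, 0.9, 2.9]), 40)  # invented rainfall index

# true process: decline of 4 species/yr plus a positive rainfall effect
richness = 40.0 - 4.0 * years + 6.0 * rain + rng.normal(0, 3, years.size)

# naive trend ignores rainfall; adjusted trend controls for it
X_naive = np.column_stack([np.ones_like(years, dtype=float), years])
X_adj = np.column_stack([np.ones_like(years, dtype=float), years, rain])
slope_naive = np.linalg.lstsq(X_naive, richness, rcond=None)[0][1]
slope_adj = np.linalg.lstsq(X_adj, richness, rcond=None)[0][1]
```

Because the wet years fall late in the series, the naive slope understates the decline, while the rainfall-adjusted slope recovers it, mirroring the "hint of a trend" versus the controlled estimate in the study.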
typically true for sampling efforts involving mobile animals. When the hier-
archical structure of data is ignored, it can lead to downward biased standard
errors; when hierarchical structure is incorporated into the analysis, it can lead
to an increased understanding of the system.
There are several different ways that hierarchical structure can be modeled,
ranging from the simplistic to the sophisticated. We have examined a couple of
the possibilities in previous sections. Multigroup analysis is an explicit means
of dealing with hierarchical structure. This approach is most useful when there
are primary hypotheses of interest about the highest level of structure. If one
is interested in comparing survival of males and females in a population, a
multigroup analysis is a good way to look at main effects and interactions.
Sampling structure can also be dealt with using exogenous categorical variables
representing that structure. In Figure 7.2, we can see that the influences of three
geographic areas (sites 1–3) along the Finnish coast were accommodated using
two dummy variables. Such an approach permits the hierarchical structure to be
included in the model, and also allows for general evaluation of the differences
attributable to each site. Using variables to represent sites, as with certain other
approaches, does not, however, allow for interactions between the site and the
other processes to be considered.
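The dummy-coding scheme described above (two variables encoding three sites) can be sketched as follows; the site labels are invented for illustration:

```python
import numpy as np

sites = np.array([1, 1, 2, 3, 2, 3, 1])  # hypothetical site labels for 3 areas

# two dummy variables encode the three sites; site 1 is the reference level
d2 = (sites == 2).astype(float)
d3 = (sites == 3).astype(float)

# design matrix: intercept plus the two site dummies, ready for use as
# exogenous predictors in a structural equation model
X = np.column_stack([np.ones(sites.size), d2, d3])
```

Each site's mean response is then the intercept plus (for sites 2 and 3) the coefficient on its dummy, which is how site differences enter the model as main effects.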
When the units of hierarchical structure are not of interest (for example, when one uses cluster sampling and is not particularly interested in the differences among clusters), the data can be adjusted to account for the clustering. This kind
of correction can be done by hand or is implemented in some software packages.
The basic approach is to estimate standard errors and chi-square tests of model
fit, taking into account nonindependence of the observations.
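The basic approach can be sketched with a cluster-robust (sandwich, CR1-style) standard error for ordinary least squares; the data are simulated with a strong cluster effect so that the naive standard error is visibly too small. This is a generic illustration, not the specific correction implemented in any particular SEM package:

```python
import numpy as np

def ols_with_cluster_se(X, y, cluster):
    """OLS estimates with conventional and CR1 cluster-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    n, k = X.shape
    # conventional covariance, assuming independent observations
    v_naive = XtX_inv * (u @ u) / (n - k)
    # sandwich "meat": sum of score outer products within each cluster
    meat = np.zeros((k, k))
    for g in np.unique(cluster):
        s = X[cluster == g].T @ u[cluster == g]
        meat += np.outer(s, s)
    G = np.unique(cluster).size
    adj = G / (G - 1) * (n - 1) / (n - k)  # CR1 small-sample correction
    v_cluster = adj * XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(v_naive)), np.sqrt(np.diag(v_cluster))

rng = np.random.default_rng(3)
G, m = 30, 10
cluster = np.repeat(np.arange(G), m)
x = np.repeat(rng.normal(size=G), m)   # predictor constant within each cluster
y = 1.0 + 2.0 * x + np.repeat(rng.normal(size=G), m) + 0.5 * rng.normal(size=G * m)
X = np.column_stack([np.ones(G * m), x])
beta, se_naive, se_cluster = ols_with_cluster_se(X, y, cluster)
```

With the cluster effect present, the robust standard error for the slope is several times the conventional one, which is exactly the downward bias the text warns about.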
A somewhat recent development in the field of SEM is the availability of software packages
that permit multi-level modeling (also known as hierarchical linear modeling).
Multi-level modeling recognizes that there are relationships of interest within
and between sample levels. For example, if one samples a large number of
populations collecting data on individuals in each population, it is of interest to
ask questions about the processes operating within and between populations.
Multi-level modeling permits the development and evaluation of population and
individual models simultaneously. Given the interest in the effects of spatial
scale in many ecological problems, multi-level models would seem to hold promise for addressing a new set of questions. The interested reader is referred
to Little et al. (2000) and Hox (2002) for a detailed discussion of this topic
and to Shipley (2000, chapter 7) for an illustration of its application to natural
systems.
Illustration
There have been very few applications of nonrecursive models to ecological
problems. One exception is the analysis of limnological interactions by Johnson
et al. (1991), who examined interactions between various components of aquatic
foodwebs, including reciprocal interactions between phytoplankton and zoo-
plankton. Their work is discussed further in Chapter 9, as an example of the
use of SEM with experimental studies. To illustrate the potential utility of non-
recursive models in the current, brief discussion, I will rely on a hypothetical
example. In this example, we imagine two potentially competing species that
occur together across some range of environmental conditions. Let us further
Figure 7.9. Nonrecursive model with two reciprocally interacting entities (y2 and
y3 ) and two variables having joint control over them (x1 and y1 ).
Summary
Reciprocal effects and feedback loops can be important components of systems.
Representing these temporal dynamics in static models using nonrecursive rela-
tionships has its limitations. There are a number of things that can be done to
arrive at estimable models. However, the restrictions are rather severe and many
models will not be estimable. For this reason, it is perhaps more important in
this case than in any other for the researcher to consider carefully potential
identification problems when developing their initial models, and when selecting parameters to measure. It is also very important to consider whether the assumption of equilibrium is approximated for the system; otherwise, estimates of reciprocal effects will be biased. For all these reasons, one should be
careful when incorporating feedback in structural equation models, and should
treat such models as complex cases requiring advanced techniques.
PART IV
the process whereby multivariate models are evaluated and conclusions drawn,
including (1) the evaluation of adequacy for individual models, (2) comparisons
among alternative models, and (3) model selection.
Figure 8.1. Illustration of a simple model that might be subjected to a strictly
confirmatory analysis.
saturated model, thus, the fit of data to model will be assessed as being perfect,
and no degrees of freedom exist to provide a chi-square test of model fit. In
this case, even if the results show that one of the path coefficients cannot be
distinguished from a value of zero, we should still retain all paths in the model.
It would be a mistake in such a situation to constrain the nonsignificant path to
be zero and re-estimate the model. Rather, we conclude that all pathways are
valid, though in this particular sample, one of the paths cannot be distinguished
from a zero value (perhaps a larger or more extensive sample would have led to a
path coefficient deemed to be statistically significant at p = 0.05, for example).
It is anticipated, thus, that future studies of this system will, at times, find all
the paths in Figure 8.1 to be of significant importance. Commonly in such a
case, we will be interested in the strength of the paths to inform us about the
influence of different processes controlling the behavior of y2 (assuming we
continue to feel we have justification for a causal interpretation).
Figure 8.2. Nested set of models (A–H) with common causal structure. Solid
arrows represent significant paths and dashed arrows indicate nonsignificant paths.
In this case, models with correlated errors are omitted for simplicity.
a competing models approach are because (1) a more specific evaluation is being
performed, and (2) our comparison-wise error rate is substantially reduced. For
unlimited model comparisons, we should be aware that even simple models
have a substantial number of alternatives (as suggested in Figure 8.2). Models
of moderate complexity can have hundreds of alternatives (especially if we
consider the possibility of correlations among errors).
In the evaluation of any given model, we can simultaneously evaluate
(1) overall model fit, (2) the significance level of included pathways, (3) the
consequences for model fit of including additional paths, and (4) comparative
model fit. Such evaluations are accomplished through assessments of model fit (e.g., the chi-square test), through the examination of residuals or modification indices, and through information-theoretic measures such as AIC or BIC
(see Chapter 5). Residuals between predicted and observed covariances and any
associated modification indices can indicate that a missing path would signifi-
cantly improve model fit. If such a path is subsequently added, further evaluation
can confirm that the path coefficient is indeed significant. Thus, model evalua-
tion typically involves a comparison among nested models as an inherent part
of the assessment.
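Information-theoretic comparison can be sketched as follows. Conventions for AIC differ among SEM programs; the version below (model chi-square minus twice its degrees of freedom) is one common choice, and the chi-square values are hypothetical:

```python
def aic_sem(chisq, df):
    """One common SEM convention for AIC: chi-square minus twice the model's
    degrees of freedom. Lower values favor the model."""
    return chisq - 2 * df

# hypothetical nested pair: two extra paths buy only a 1.6-point chi-square drop
aic_restricted = aic_sem(28.9, 26)   # more parsimonious model
aic_full = aic_sem(27.3, 24)         # model with the added paths
```

Here the restricted model earns the lower AIC: the small improvement in fit does not pay for the two lost degrees of freedom, the same verdict a chi-square difference test would deliver.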
It is important to keep in mind that each data set only permits a single
untainted assessment of a model or limited set of competing models. Frequently,
analyses will suggest either the addition or deletion of paths based on the fit to
data. The fundamental principle of hypothesis evaluation in SEM is that only
the minimum number of changes to a model should be made, and the results
based on a modified model must be considered provisional. Thus, it is not
proper to present results of an analysis based on a modified model and claim
that the resulting model has been adequately evaluated. When a single data set
is utilized, only the initial model is subjected to a confirmatory assessment, the
modified model is actually “fitted” to the data. Where sufficient samples are
available, multiple evaluations can be conducted by splitting a data set, with
each half being examined separately. Even greater confidence can be achieved
if there exist independent data sets for model evaluation.
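The split-half evaluation just described can be sketched as follows; the sample size is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200                          # hypothetical total sample size
idx = rng.permutation(n)
calibration = idx[: n // 2]      # half used to specify (and modify) the model
validation = idx[n // 2:]        # untouched half used for the confirmatory test
```

The model is developed and, if necessary, modified on the calibration half; the final model is then evaluated exactly once against the validation half, preserving an untainted assessment.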
Every sample is likely to have some chance properties due to random sam-
pling events. Adding excess paths to a model in order to achieve perfect fit
is referred to as overfitting. Overfitting can be thought of as fitting a model
exactly to a set of data, representing both the general features of the data and
its idiosyncrasies. When overfitting takes place, the model no longer represents
an approximation to the underlying population parameters.
As stated above, additional power and interest can be achieved through the
use of a limited set of a-priori competing models. When this is possible based
on theoretical grounds, there is generally a greater opportunity for substantively
based decisions about the appropriate model and a reduced likelihood of laps-
ing into empirically driven changes. We should not, of course, be unwilling
to use empirical findings to suggest a modified model. This kind of empirical
feedback to our models is precisely what we hope to gain from SEM. How-
ever, it is very important to always remember that any change to a model,
including the addition of correlated errors, must be theoretically justified and
interpreted. This sort of explicit process has not been the norm in the natural
sciences up to this point. Next I present an illustration that is an exception to that
pattern.
Table 8.1. Correlation matrix and standard deviations for plant traits and pol-
linator behavior from Mitchell (1992) based on data from Campbell (1991).
Minimum sample size was 82
Background
In this example, the plant involved was Ipomopsis aggregata, scarlet gilia,
which was pollinated primarily by two species of hummingbird, broad-tailed
and rufous (Selasphorus platycercus and Selasphorus rufus). The goal of this
study was to estimate the effects of plant floral traits on hummingbird visitation
(both in the form of approaches and probes), and the combined effects of plant
traits and hummingbird visitation on fruit set, measured as the proportion of
marked flowers that developed fruits. The plant floral traits examined included
the average length and width of the corolla for flowers open at that time, esti-
mated floral nectar production, and the number of open flowers. Hummingbird
approaches were measured as the number per hour, as were the number of probes
by a hummingbird per flower. Some of the variables were transformed prior to
analysis, including nectar production (as the square root), number of flowers
(as the natural log), and fruit set (as the arcsine square root of the proportion).
The goal of the analysis by Mitchell was to evaluate the two competing
models shown in Figure 8.3. For model A, it is presumed that the floral traits
(corolla length, corolla width, nectar production, and number of flowers) can
affect both approaches to plants and the number of probes per flower. This is the
more general hypothesis of the two as it allows for more biological processes,
such as repeated probing of flowers possessing high nectar content. This more
liberal hypothesis was compared to a more restrictive one, model B. In the
second model, it was reasoned a-priori that hummingbirds might base their
choice of plants primarily on visual cues. This second model also presumed
that fruit set would be affected by probes, not simply by approaches.
Results
Mitchell found that both models failed to fit the data adequately. For
model A, a chi-square of 23.49 was obtained with 4 degrees of freedom
(p < 0.001). For model B, a chi-square of 26.46 with 9 degrees of freedom
(p = 0.003) was found. Perhaps of equal or even greater importance in this case
was that only 6% of the variance in fruit set was explained by either model.
In a strong statement about proper procedure when using SEM, Mitchell concluded that his two models were not adequate, and he did not continue on to explore further configurations that had no theoretical support.
Overview
If one performs an analysis of model A, one finds that substantial residuals
exist for relationships between plant traits and fruit set. If paths from corolla
length and number of flowers to fruit set are allowed, the chi-square drops to
6.08 with 2 degrees of freedom (p = 0.05), and the R2 for fruit set increases
Model evaluation in practice 215
from 6% to 23%. Mitchell was aware of this possibility, but chose not to take
the analysis in this direction. Instead, he reasoned, “Although these paths [from
floral traits to fruit set] have no clear biological meaning themselves, this may
indicate a common factor (such as energy reserves or photosynthetic rate) that
influences both fruit set and the floral characters.” I applaud Mitchell for taking
this opportunity to make an important point about SEM methodology – that
sometimes the results indicate that key system properties were not measured
and an adequate model cannot be constructed. I also applaud the reviewers and
editor of his manuscript for allowing this “nonresult” to be published, providing
us with a clear demonstration of this kind of SEM application.
Columns (in order): Open flowers (ln#), Corolla length (mm), Corolla width (mm), Nectar prod. (µL, square-root transformed), Height (ln cm), Dry mass (ln g), Total flowers (ln#), Approaches (#/hr), Probes (#/fl/hr), Fruit set (prop.), Total fruits (ln#)
Almont
Open flowers 1.0
Corolla length 0.084 1.0
Corolla width 0.190 0.093 1.0
Nectar 0.156 0.350 0.441 1.0
Height 0.368 0.091 0.093 0.107 1.0
Biomass 0.809 0.148 0.206 0.164 0.523 1.0
Total flowers 0.855 0.096 0.195 0.167 0.477 0.935 1.0
Approaches 0.248 0.174 0.157 0.271 0.155 0.226 0.277 1.0
Probes/flower −0.145 0.099 −0.028 0.141 −0.008 −0.115 −0.073 0.676 1.0
Fruit set −0.102 0.127 −0.037 0.213 0.308 −0.012 −0.072 0.182 0.247 1.0
Total fruits 0.748 0.069 0.156 0.152 0.501 0.833 0.871 0.236 −0.050 0.296 1.0
Mean 2.56 27.8 3.22 1.80 3.73 1.12 4.88 0.151 0.056 0.99 4.32
Std. dev. 0.755 2.473 0.399 0.492 0.251 0.717 0.718 0.171 0.069 0.177 0.859
Avery
Open flowers 1.0
Corolla length 0.166 1.0
Corolla width 0.002 0.248 1.0
Nectar 0.154 0.062 0.149 1.0
Height 0.325 0.305 0.291 0.015 1.0
Biomass 0.650 0.340 0.261 0.001 0.402 1.0
Total flowers 0.754 0.178 0.093 −0.036 0.390 0.748 1.0
Approaches 0.233 −0.093 −0.003 0.203 0.080 0.240 0.220 1.0
Probes/flower 0.060 −0.199 −0.063 0.199 0.019 0.167 0.123 0.841 1.0
Fruit set −0.006 −0.106 0.114 0.068 0.254 0.002 −0.108 0.180 0.166 1.0
Total fruits 0.683 0.110 0.027 −0.073 0.417 0.637 0.881 0.197 0.119 0.165 1.0
Mean 2.18 26.8 3.92 1.85 3.62 0.92 4.35 0.156 0.081 0.94 3.52
Std. dev. 0.550 2.483 0.325 0.365 0.285 0.558 0.572 0.208 0.114 0.142 0.643
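Published correlations and standard deviations such as those above can be converted back into the covariance matrix that many SEM programs accept as input, via Σ = DRD, where D is diagonal and holds the standard deviations. A minimal sketch using three of the Almont variables (approaches, probes/flower, and fruit set):

```python
import numpy as np

# correlations among approaches, probes/flower, and fruit set (Almont site)
R = np.array([
    [1.0,   0.676, 0.182],
    [0.676, 1.0,   0.247],
    [0.182, 0.247, 1.0  ],
])
sd = np.array([0.171, 0.069, 0.177])   # standard deviations from the table

# Sigma = D R D, computed elementwise as sd_i * sd_j * r_ij
cov = np.outer(sd, sd) * R
```

The resulting matrix has the squared standard deviations on its diagonal and can be supplied directly for covariance-based estimation.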
Because of this, there will be additional effects of dry mass on fruit set
and total fruits, indicated by the paths labeled B.
Model C – This model evaluates the possibility that fruit set and production
are not limited by plant resources; thus, all paths from height and total
flowers to fruit set and total fruits are omitted.
Model D – This model evaluates the possibility that pollinator behavior has
no effect on fruit set and total fruit production.
Model E – This model evaluates the possibility that nectar production and
visible plant characters have no effect on pollinator behavior.
Model F – This model evaluates the possibility that plant mass is the under-
lying cause of any correlations among corolla length, corolla width, and
nectar production. This model is not causally nested within the general
framework shown in Figure 8.4, because it involves directional arrows
from dry mass to corolla length, corolla width, and nectar production.
Model G – This model considers the possibility that dry mass is merely
correlated with plant height and total flower production, instead of their
causal determinant. This model also involves a change in the basic causal
order.
The statistical fit of these 7 models was evaluated in two ways. First, chi-
squares permitted a consideration of the degree to which the data deviated from
the models. This considers the absolute fit of the model. Second, chi-square
difference tests were used to evaluate whether some models possessed better fit
when paths were either added or removed. As described in Chapter 5, this latter
analysis is based on the fact that differences between chi-squares, themselves,
follow a chi-square distribution.
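The critical values used in these difference tests follow directly from the chi-square distribution; for 1 and 2 degrees of freedom the survival function has a closed form, so the calculation can be sketched with the standard library alone:

```python
import math

def chi2_sf_2df(x):
    """Exact survival function of a chi-square variate with 2 df."""
    return math.exp(-x / 2.0)

def chi2_sf_1df(x):
    """Survival function with 1 df, via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))

# the familiar critical values recover p = 0.05
p_crit1 = chi2_sf_1df(3.841)
p_crit2 = chi2_sf_2df(5.991)

# for example, a chi-square drop of 1.59 with 2 df is far from significant
p_diff = chi2_sf_2df(1.59)
```

A chi-square difference smaller than the relevant critical value, as in the example, means the less constrained model fits no better than chance would suggest.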
The results of the evaluation of competing models are given in Table 8.3.
For the Almont site, model A had an adequate absolute fit to the data based on
a chi-square of 28.9 with 26 degrees of freedom (p = 0.315). Adding the paths
from dry mass to fruit set and total fruits (model B) resulted in a decrease in
chi-square of 1.59 with two degrees of freedom. A single degree of freedom chi-
square test requires a change in value of 3.841, and a two-degree-of-freedom
value of 5.991 is required before a significant difference between models is
achieved. Thus, model B was not found to be a significantly better fitting model
than model A. Given the logical priority given to model A in this study, its
greater parsimony, and the fact that paths from dry mass to fruit set and total
fruits are not significant at the 0.05 level, model B was rejected in favor of model
A. Deletion of paths associated with models C–E led to significant increases in
chi-square, also indicating that these models were inferior to model A. Since
model A fitted the data well for the Almont site and because models F and G
Table 8.3. Measures of fit for the models evaluated in Mitchell (1994)
both fitted the data less well than model A, Mitchell deemed model A to be the
best representation of the data of any of the models.
For the Avery site, model A did not have an adequate absolute fit to the data
based on a chi-square of 59.01 with 26 degrees of freedom (p < 0.001). While
not reported by Mitchell, other indices of model fit agree with this assessment.
The root mean square error of approximation (RMSEA) gives a value of 0.090, with a 90% confidence interval from 0.054 to 0.125. In order for the RMSEA to indicate a nonsignificant difference between model and data, its confidence interval would need to include a value of 0.0, which in this case it did not. Models B–F did not have adequate absolute fits
to the data either. Only model G, which represents the case where all plant
traits freely intercorrelate, showed a model fit that was close to adequate. Based
on this information, Mitchell concluded that the model best fitting the data differed between the two sites, and that the best model for the Avery site was not fully determined.
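The RMSEA point estimate is computed from the chi-square, its degrees of freedom, and the sample size. The sample size below is hypothetical, since n for this analysis is not restated in this passage, so the sketch illustrates the formula rather than reproducing Mitchell's 0.090 exactly:

```python
import math

def rmsea(chisq, df, n):
    """RMSEA point estimate: sqrt(max((chisq - df) / (df * (n - 1)), 0))."""
    return math.sqrt(max((chisq - df) / (df * (n - 1)), 0.0))

# a model whose chi-square equals its df has RMSEA 0 (perfect close fit)
baseline = rmsea(26.0, 26, 150)

# Avery chi-square and df with a hypothetical n of 150
val = rmsea(59.01, 26, 150)
```

The max(..., 0) guard reflects the convention that chi-square values below the degrees of freedom are treated as perfect approximation rather than negative error.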
There are a fair number of interpretations to be drawn from the results from
this study. For brevity, I will focus only on the results for the Almont site
(Figure 8.5). Mitchell’s original model, which he deemed most biologically
Figure 8.5. Results of analysis for Almont site. Directional paths with dashed lines
were nonsignificant. From Mitchell (1994) by permission of Blackwell Publishers.
reasonable from the beginning, matched the covariance structure of the data.
This does not prove that the model represents the true forces controlling the
system. Instead, the adequacy of model fit serves to “fail to disprove” the model
structure. This is no small achievement for a model of this complexity.
The degree to which variation in the endogenous variables was explained differed substantially across the model. Looking first at plant traits, plant
height was only modestly related to dry mass. However, both total flowers and
open flowers were related quite strongly to dry mass. Considering pollinator
behavior, approaches by hummingbirds to plants were significantly related to
both nectar production and the number of open flowers. However, only 12% of
the variation in approaches between plants was explained by these two factors.
Probes per flower were strongly determined by the number of approaches and
the number of open flowers (R 2 = 0.57). As for reproductive success, the pro-
portion of fruit set was significantly related to height, which is a function of
dry mass. Proportion of fruit set was also found to be negatively related to the
total number of flowers, suggesting that the plant was indeed resource limited in
filling fruits. Variation among plants in total fruits per plant was well predicted
by the total number of flowers (R 2 = 0.90).
Looking at overall path relations, we see that total fruit set was largely
determined by the total number of flowers produced, and thus, ultimately deter-
mined by dry mass of the plant. Hummingbird behavior did not explain any
of the observed variation in total fruits in this model, even though approaches
to plants and fruit set were significantly correlated (r = 0.236). Based on this
result, we must conclude that the correlation between approaches and total
fruits was spurious and not causal. This does NOT mean that approaches do
not affect fruit production in some broader range of circumstances. Rather, it
simply means that the variation in fruit production observed in this data set was
related almost entirely to plant size and not to variation in pollination.
This example does an excellent job of illustrating how critical the model is
to interpretation. Mitchell was painstaking in his evaluation of model adequacy.
This gives substantially more credence to the interpretations that derive from
the results than if he had simply searched for the model that best fitted the
data with little regard to a-priori theoretical justification. It is easy to see from
the results of this analysis, that Mitchell was justified in being cautious about
interpreting the results from the earlier study (Mitchell 1992), where plant size
was not measured.
Background
A number of previous studies, including SEM studies using both experimental
and nonexperimental data (Grace and Pugesek 1997, Gough and Grace 1999),
provided a significant amount of experience upon which to base our initial mod-
els and their interpretations. The question addressed in this study was whether
landscape position could explain unique variation in species richness that could
not be explained by contemporary environmental conditions. Thus, our ques-
tion was of the step-wise sort. To achieve this objective we compared the two
models shown in Figure 8.6. These models differ in that model B includes two
Figure 8.6. Competing models used to evaluate whether landscape position vari-
ables, distance from the sea (DIST SEA) and distance from the river’s edge
(DIST RIVR) contributed to the explanation of unique variation in species richness
(RICH). From Grace and Guntenspergen (1999) by permission of Ecoscience.
paths not found in model A, from DIST SEA (distance from the sea) to RICH
(plant species richness), and from DIST RIVR (distance from river’s edge) to
RICH. The results from this analysis are mentioned again in Chapter 10, where
they are put into the context of a systematic use of SEM to understand plant
diversity patterns.
Trait definitions:
Specific leaf area (SLA): leaf surface area per gram leaf tissue mass
Leaf nitrogen concentration ([N]): mg nitrogen per gram tissue mass
Net leaf photosynthetic rate (A): nmol CO2 uptake per gram leaf tissue per second
Stomatal conductance (G): mmol water loss per gram leaf tissue per second
Table 8.5. Test statistics (with p-values) for models A–E across data sets

                 Model A      Model B      Model C      Model D      Model E
df               6            6            4            4            6
trt 1            3.7 (0.72)   12.6 (**)    2.8 (0.60)   7.0 (0.13)   6.4 (0.38)
trt 2            7.4 (0.29)   7.0 (0.32)   5.0 (0.28)   4.3 (0.37)   7.4 (0.28)
trt 3            12.0 (0.06)  10.2 (0.12)  9.9 (**)     4.6 (0.33)   12.9 (**)
trt 4            11.8 (0.07)  13.7 (**)    0.5 (0.97)   3.2 (0.52)   2.6 (0.86)
all trts (df=8)  14.1 (0.08)  19.4 (**)    10.0 (0.27)  9.5 (0.30)   11.0 (0.20)
data set 2       29.8 (**)    10.7 (0.10)  22.8 (**)    1.4 (0.85)   29.6 (**)
data set 3       18.9 (**)    13.2 (**)    11.7 (**)    2.1 (0.72)   27.0 (**)
data set 4       74.5 (**)    133.0 (**)   13.0 (**)    93.3 (**)    54.3 (**)
Note that treatments 1–4 were modifications of light and nutrient supply conducted in
this study. Data set 2 was from all plants studied in Shipley and Lechowicz (2000).
Data set 3 was for only the C3-type plants in Shipley and Lechowicz (2000). Data set
4 was from Reich et al. (1999).
Figure 8.8. Final, general model accepted by Meziane and Shipley (2001).
Reprinted by permission of Oxford University Press.
the sample size was low (n = 22), an alternative test statistic was employed that
is exact for small samples and relatively robust to distributional violations (see
Shipley 2000, chapter 3). This test statistic was evaluated using chi-square tests
and results are shown in Table 8.5.
None of the models was found to fit all data sets, although model D did fit all
except those from the comparative field study by Reich et al. (1999). Because
this study sought to provide a unified solution that would hold across all data
sets, the authors fitted all data to a sixth model, which is shown in Figure 8.8.
Chi-square tests indicate that model F fits adequately for all data sets, and thus
represents a general model for the system studied.
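The p-values in Table 8.5 are upper-tail chi-square probabilities, so they can be reproduced from the reported statistics and degrees of freedom. A minimal sketch, using the closed-form survival function (exact for even degrees of freedom, which covers every df in the table):

```python
import math

def chi2_sf_even_df(stat, df):
    """Upper-tail chi-square probability for even degrees of freedom.

    For even df the survival function has a closed form:
    P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    assert df % 2 == 0, "closed form shown here requires even df"
    half = stat / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k)
                                 for k in range(df // 2))

# Model A, treatment 1: statistic 3.7 on 6 df -> p = 0.72, as reported
print(round(chi2_sf_even_df(3.7, 6), 2))
```

Large p-values indicate that the data do not deviate significantly from the model-implied covariances, i.e., acceptable fit.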
226 Structural equation modeling and natural systems
[Figure 8.9 (path diagram): Resource Availability, Density, Extrinsic Mortality, and Habitat Stability (with indicators including Canopy, Density, Predation, Width, Flow, and Gradient), plus Density Effects and Unknown Effects, predicting Life History (Size at Maturity Males, Size at Maturity Females, Reproductive Allotment, Number of Offspring, Offspring Size); hypothesized paths numbered 1–8.]
Figure 8.9. Global model evaluated by Johnson 2002 for the effects of envi-
ronmental conditions on life history characteristics of the fish B. rhabdophora.
Hypotheses tested relative to this diagram and their results are given in Table 8.6.
Reproduced by permission of Oikos.
Equivalent models
A careful consideration of alternative models often demonstrates that there are
multiple models that are indistinguishable (Figure 8.10). This may be
the result of statistical equivalency (because there is insufficient power to dis-
criminate among alternatives) or because of pure equivalency (the existence
of alternative causal structures with identical statistical expectations). Further-
more, for saturated models (those specifying all possible interactions among
variables), all variations of the causal order of the model are also saturated and,
therefore, equivalent. There has been a significant discussion of this issue for
over a decade among SEM practitioners (Glymour et al. 1987; Hayduk 1996,
chapters 3 and 4; Raykov and Penev 1999; Spirtes et al. 2000). Recently,
Shipley (2000) has discussed this important topic in detail using ecologically
relevant examples.
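The problem can be made concrete with a three-variable causal chain: for standardized variables, the chain X → Y → Z and its causal reversal imply identical correlation matrices, so no goodness-of-fit test can distinguish them. A small sketch with illustrative (not empirical) path coefficients:

```python
import numpy as np

a, b = 0.7, 0.5   # illustrative standardized path coefficients

def chain_corr(p1, p2):
    """Correlation matrix implied by a standardized causal chain 1 -> 2 -> 3."""
    return np.array([[1.0, p1, p1 * p2],
                     [p1, 1.0, p2],
                     [p1 * p2, p2, 1.0]])

forward = chain_corr(a, b)                   # X -> Y -> Z
# the reversed chain Z -> Y -> X, re-expressed in (X, Y, Z) order
reversed_chain = chain_corr(b, a)[::-1, ::-1]

print(np.allclose(forward, reversed_chain))  # True: the two models are equivalent
```

Because both structures reproduce the same covariances exactly, discriminating between them must rest on theory, temporal ordering, or experimental intervention rather than on fit.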
A variety of rules for identifying equivalent models have been developed,
the most important historically being the TETRAD
Table 8.6.

Selective agents   Paths in model   CAIC 1996 (n = 21)   CAIC 1997 (n = 14)   CAIC 1998 (n = 16)   CAIC 1999 (n = 19)
R                  1,8              20.2                 18.5                 18.9                 19.9
D                  2,8              20.4                 18.4                 19.1                 19.7
M                  3,8              20.9                 18.3                 18.9                 22.0
H                  4,8              20.7                 18.2                 19.6                 20.2
[Figure 8.10 (path diagrams): four alternative causal orderings among variables A, B, C, D, and E, with error terms eb–ee.]
Figure 8.10. Example of a few equivalent models that possess identical covariance
expectations.
easily be dismissed and robustness established. For systems with a high degree
of integration or feedback, equivalent models can challenge one's assumptions
and interpretations. To some degree, considering equivalent models provides
a test of where our understanding lies relative to this problem.
[Figure 8.11, upper model: DIST → BIOM → LIGHT (indicators dist, massm2, lighthi, lightlo); RMSEA = 0.085, 90% CI = 0.059–0.112, p-value = 0.016, df = 28. Lower model, with an added DIST → LIGHT path: RMSEA = 0.065, 90% CI = 0.034–0.094, p-value = 0.188, df = 2.]
Figure 8.11. Illustration of lack of model fit (upper figure) resolved by addition
of a pathway (lower figure). This figure only shows part of the full model (refer to
Chapter 10, Figure 10.10, for the full model). DIST refers to disturbance, BIOM
refers to above-ground biomass (living and dead), LIGHT refers to percent of
full sunlight reaching the ground surface. RMSEA is the root mean square error of
approximation (refer to Chapter 5). Results indicate that data deviated significantly
from the upper model (p = 0.016), but fit to the lower model was substantially
better (p = 0.188).
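RMSEA values such as those in Figure 8.11 are functions of the model chi-square; one common formulation (assumed here) is RMSEA = sqrt(max(χ²/df − 1, 0)/(n − 1)), where n is the sample size. A sketch with hypothetical values:

```python
import math

def rmsea(chi_sq, df, n):
    """Root mean square error of approximation from a model chi-square.

    Uses the common formulation sqrt(max(chi2/df - 1, 0) / (n - 1));
    values near 0.05 or below are conventionally taken as close fit.
    """
    return math.sqrt(max(chi_sq / df - 1.0, 0.0) / (n - 1))

# a model whose chi-square equals its df (expected value under perfect fit)
print(rmsea(28.0, 28, 190))   # 0.0
```

Note that RMSEA rewards parsimony: for a fixed chi-square, more degrees of freedom (a more restrictive model) lower the index.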
In the study by Grace and Pugesek (1997), we proposed initially that dis-
turbance affects plant biomass, which in turn affects light penetration (Figure
8.11). For simplicity, only part of the whole model is shown here, although it
is important to realize that the results are only valid for the full model (Chapter
10, Figure 10.10).
The finding that there was unexplained residual variance between DIST
and LIGHT was initially a surprise. How is it that disturbance, which in this
system was caused by grazers (nutria, rabbits, wild hogs) and by wrack deposits
or scouring by water, could affect light independently of its effects on biomass?
More detailed examination of the data suggested an interpretation. It seems
that vegetation that has recently been disturbed has a more upright architecture
than less disturbed vegetation. The result is that disturbed vegetation allows
more light to penetrate through the canopy per gram biomass than undisturbed
vegetation (Figure 8.12).
This finding led to a line of thought that, I believe, may
be of quite general importance. These and other subsequent results suggest
that (1) light penetration is a key predictor of the dynamics of diversity in plant
communities (Grace 1999), and (2) plant morphology and litter dynamics should
be far more important than total biomass in explaining community diversity
dynamics (Grace 2001). Empirical studies to evaluate this proposition are lead-
ing to a much more mechanistic understanding of richness regulation that may
permit us to predict, among other things, the effects of dominant (including
invasive) species on native diversity patterns (Jutila and Grace 2002).
Summary
A serious analysis of model fit is important both for the evaluation of scientific
hypotheses and because the results and their interpretation depend on
the model. When theory is well understood, samples are adequately large and
representative, and measurements are reliable, then evaluations of model fit
provide clear indications of the match between hypotheses and interpretations.
One finds that most evaluations of model fit in those fields with a substantial
history of using SEM (e.g., the social and economic sciences) are of the nested
type, where it is generally recognized that there is a limit to the ability of
statistical testing to resolve issues of causal structure. The lesson for ecologists
is that care must be exercised not to draw conclusions too firmly from a single
data set, regardless of how well the data fit a model.
Biologists and other scientists in many different fields are just beginning to
explore the utility of SEM. This means that modeling efforts are often quite
exploratory, with many more unknowns than knowns. The example by Mitchell
presented early in the chapter represents an outstanding illustration of how
exploratory analyses can proceed with care, and lead to convincing conclusions
when the conventions of SEM model evaluation are applied. Some questions,
of course, are more difficult to address than others. This is notably the case for
interacting components of individuals or populations, where the high degree of
integration and feedback makes causation sometimes difficult to disentangle.
Here, we will require both persistence and patience to make progress, since the
observations needed are often novel and without precedent in ecological
study.
I believe the process of evaluating multivariate models can be a pivotal
step forward in the maturation of the ecological sciences. This process can be
unambiguously strong in its rejection of pre-existing theory. However, through
the examination of residual effects, one can often discover new relationships
that are of importance, which motivates one to modify one’s theory so that a
better match with nature is obtained. The net results of such model evaluations
will be a reduction in entrenched theory, a higher empirical content in new
theories, and a greater capacity to understand and predict
system behavior.
9
Multivariate experiments
Basic issues
The gold standard for studying causal relations is experimentation. As Fisher
(1956) labored so hard to demonstrate, experimental manipulations have the
ability to disentangle factors in a way that is usually not possible with non-
experimental data. By creating independence among causes, experimentation
can lead to a great reduction in ambiguity about effects. There is little doubt
for most scientists that well designed and properly analyzed experiments pro-
vide the most powerful way of assessing the importance of processes, when
appropriate and relevant experiments are possible.
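The independence created by randomization can be illustrated with a toy simulation (all values hypothetical): a randomly assigned treatment is uncorrelated with a pre-existing covariate, whereas an observational "exposure" that tracks the covariate is not:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000
confound = rng.standard_normal(n)      # pre-existing site condition

# observational "exposure" tracks the confounding variable
observational = (confound + rng.standard_normal(n) > 0).astype(float)
# experimental treatment assigned by coin flip, independent of everything
randomized = (rng.random(n) < 0.5).astype(float)

r_obs = np.corrcoef(observational, confound)[0, 1]    # substantial correlation
r_rand = np.corrcoef(randomized, confound)[0, 1]      # near zero
```

Because the randomized treatment shares no common cause with any covariate, its association with the response can be read causally; the observational exposure cannot, without modeling the confound.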
In this chapter I address a topic that generally receives little attention in dis-
cussions of SEM: its applicability to experimental studies. I hope to deal with
two common misconceptions: (1) that multivariate analysis is
only for use on nonexperimental data, and (2) that when experiments are possi-
ble, there is no need for SEM. In fact, I would go one step further and say that
the value of studying systems using SEM applies equally well to experimental
and nonexperimental investigations.
There are several reasons why one might want to combine the techniques
of SEM with experimentation. First, using experiments to evaluate multivariate
relationships provides inherently more information about the responses of a
system to manipulation. It is often difficult and sometimes impossible to exert
independent control over all the variables of interest in a system. Examination of
how the various pathways among variables respond to experimental treatment
can yield important insights into system function and regulation. It can also
isolate effects of interest in the presence of covarying factors.
A second reason to combine SEM with experimental studies is the fact
that “replicates” are often quite dissimilar in important ways. One clear case
of such a situation is with manipulations of ecosystems. Often experiments
involve units of some size in order to incorporate all the processes of interest,
and to achieve some degree of realism. The various “replicates” being studied
may differ amongst themselves in characteristics that cause their individual
responses to vary widely. The use of SEM automatically leads the researcher to
consider the broader suite of variables that may influence results. These covari-
ate variables can often be of overwhelming importance, and understanding their
role in the operation of the system can greatly improve a researcher’s chances
of obtaining general and coherent findings.
Yet a third reason to use SEM when designing and analyzing an experiment
is a desire to accommodate cases where ecosystem manipulations are not simple
or precise. For example, in the case of prescribed burning in grasslands, a fire
started in the morning may have quite different characteristics from a fire ignited
the afternoon of that same day. Even within an individual fire, spatial variations
in vegetation, topography, and fire behavior may cause substantial variation in
fire temperatures, residency times, and soil heating. Recognizing that treatments
such as burning may not be precise, automatically encourages the researcher
to measure covariates that might impact the responses and to incorporate these
covariates into the model to be evaluated.
As a fourth motivation, experimental multivariate studies can help evaluate
assumptions and predictions that arise from nonexperimental observations. By
combining intensive multivariate experiments with extensive nonexperimental
studies, the strengths of both approaches can be integrated into a broader under-
standing of system behavior.
At present, SEM is less commonly combined with experimental studies than
it might be. At the risk of being repetitive, it is difficult to study systems effec-
tively using methods that are designed for the study of individual factors and
effects. Methods such as ANOVA, MANOVA, and ANCOVA are not really
designed to look at system behavior, but instead, to summarize net effects. As
the following examples illustrate, such net effects usually hide a rich comple-
ment of individual processes that are only revealed when multiple pathways
are considered. Furthermore, a multivariate mindset automatically motivates
the researcher to incorporate a broader range of variables that control system
behavior, allowing for the effects of manipulated variables to be put into system
context.
I should emphasize here that I do not mean that we should abandon the use of
conventional univariate analysis when performing experiments. Rather, what I
recommend is that we include both kinds of analyses as complementary sources
of information. Most of the examples that follow are selected from studies in
which conventional analysis of variance methods were also applied, and it is
important to keep this in mind.
Adjustment of correlations/covariances
In “random-effect” studies, inference is drawn to a larger universe of possi-
bilities using a limited subset of treatment levels. For example, one may wish
to consider the effect of nutrient loading on a lake ecosystem. The experi-
mental treatment levels are thus viewed as finite samples from an underlying
continuum. As discussed in Chapter 7, when standard correlations are used
to represent an underlying continuous variable, correlations are attenuated. In
practice, this means that estimated correlations fall in a narrower range
than the true values. Through a somewhat involved set of procedures
that define thresholds and make assumptions about the relationship between
categories and the underlying continuum, it is possible to adjust correlations
involving categorical variables. In the case of a categorical treatment variable
and continuous response variable, a polyserial correlation can be substituted
for the Pearson product-moment correlation and we can obtain an estimate of
the correlation between the response variable and the underlying continuous
treatment variable. The discrepancy between Pearson and polyserial correla-
tions is greatest when a continuous variable is measured as a dichotomous one.
Multigroup analyses
It is always possible to subdivide data into groups (say, treatment types) and
perform separate analyses on each group. Within SEM, this can be performed
in a comprehensive way using multigroup analysis as described in Chapter 7.
Multigroup analysis allows the investigator to determine whether model parameters
are constant across groups or whether they vary. This applies not only
to path coefficients, but to all model parameters (e.g., factor loadings, latent
variable variances), and can be extended to a comparison of parameter means
across groups.
Among the many advantages of multigroup analysis is the investigation of
interactions among variables. When one wishes to study the interaction between
a treatment variable and a covariate, for example, one approach would be to
create an interaction variable and include it in the model. This can be cum-
bersome and sometimes difficult to implement. Within a multigroup analysis,
slopes of relationships are allowed to vary across groups, permitting a detailed
examination of interactions. An example of a multigroup analysis can be found
in Chapter 7.
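The logic of testing whether a slope is constant across groups can be sketched outside SEM software as a constrained-versus-free model comparison on simulated data (all values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
group = np.repeat([0.0, 1.0], n // 2)       # two treatment groups
x = rng.standard_normal(n)                  # continuous covariate
slope = np.where(group == 0, 0.5, 1.5)      # true slope differs by group
y = slope * x + rng.standard_normal(n)

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

ones = np.ones(n)
# constrained model: common slope, group-specific intercepts
rss_equal = rss(np.column_stack([ones, group, x]), y)
# free model: slope allowed to differ between groups (x-by-group interaction)
rss_free = rss(np.column_stack([ones, group, x, x * group]), y)

# likelihood-ratio statistic for the single equality constraint (1 df)
lr = n * np.log(rss_equal / rss_free)
```

A large statistic relative to a chi-square with 1 df indicates that the equality constraint fails, i.e., the treatment and the covariate interact. Multigroup SEM applies the same constrained-versus-free comparison simultaneously to any set of model parameters.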
Results
The analyses of these data using SEM were accompanied by a detailed discus-
sion of the process of model development and refinement. The reader is encour-
aged to read the original paper (Johnson et al. 1991) to see the steps the authors
took in reaching the final model. Here I present a figure showing the final
results (Figure 9.1). One major focus of this study was to determine the effects
of atrazine on aquatic vegetation, phytoplankton and zooplankton. As can be
seen by the number of significant relationships among the variables examined in
this system, there was a complex interplay of parts that influenced how response
variables were affected by atrazine. These interactions included the following:
(1) Grass carp played an important role in these mesocosms by reducing the
amount of aquatic vegetation. Through this mechanism, grass carp had an
indirect positive effect on both phytoplankton and zooplankton, since
aquatic vegetation had a very strong negative effect on phytoplankton.
(2) The depth of the pond also had important effects, both on aquatic vegetation
and on phytoplankton. In general, aquatic vegetation and phytoplankton
were reduced in deeper ponds. Since aquatic vegetation had a negative effect
on phytoplankton, we can see that the effect of pond depth on phytoplankton
[Figure 9.1 (path diagram): latent variables ATRAZINE, GRASS CARP, and POND DEPTH (indicators: atrazine concent., grass carp presence, pond depth), aquatic vegetation, phytoplankton, and zooplankton abundance (zooplank. abund.); path coefficients include −0.21, −0.71, −0.82, −0.38, 0.88, and −0.50.]
Figure 9.1. Final model from analyses by Johnson et al. (1991) on the effects
of atrazine on aquatic mesocosms. Loadings from latent variables (in circles) to
measured variables (in boxes) are influenced by the effects of standardization
on units using a prior version of LISREL software which did not scale latent
and indicator variables equally. Thus, loadings cannot be interpreted as if they
are standardized values, although path coefficients (directional arrows) can be.
Reproduced by permission of the Ecological Society of America.
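The indirect effects described here follow the standard path rule: an indirect effect is the product of the standardized coefficients along a compound path, and a total effect is the sum of the direct and indirect components. A sketch with hypothetical coefficients (not the values published by Johnson et al. 1991):

```python
# hypothetical standardized path coefficients (illustrative only)
carp_to_veg = -0.8      # grass carp reduce aquatic vegetation
veg_to_phyto = -0.7     # aquatic vegetation suppresses phytoplankton
depth_to_veg = -0.4     # deeper ponds support less vegetation
depth_to_phyto = -0.5   # direct effect of depth on phytoplankton

# an indirect effect is the product of coefficients along the path
carp_to_phyto = carp_to_veg * veg_to_phyto                  # two negatives compound
# a total effect sums the direct path and all indirect paths
depth_total = depth_to_phyto + depth_to_veg * veg_to_phyto  # partially offsetting
```

With these illustrative numbers, the carp effect on phytoplankton is positive (0.56) despite carp touching phytoplankton only through vegetation, and the two depth pathways partially cancel, which is why bivariate correlations can badly misrepresent such systems.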
Conclusions
This example illustrates that there is substantial value in studying a system's
response to manipulation in order to get a more realistic and predictive under-
standing of the potential effects of contaminants. Imagine, for a moment, that
the web of relationships studied in this case was ignored and that only the
response of phytoplankton to atrazine was examined. Depending on the sample
size and statistical power, we would find for a case like the one studied by
Johnson et al. either a nonsignificant or weak positive effect (despite the fact
that the herbicide atrazine is known to have a negative effect on algae in isola-
tion!). How would we interpret such a result? We might also imagine that when
the effects of atrazine are studied in ponds, mesocosms, or whole lakes, the
responses of phytoplankton vary from strong negative effects to clear positive
effects, depending on the abundance and role of macrophytes in these stud-
ies. How would our chances of publication and convincing interpretation be
affected by a widely varying response, as might be expected for this case? The
example by Johnson et al. (1991) clearly illustrates many of the advantages of
using SEM in the study of ecosystem responses.
Results
Repeated measures analysis of covariance was used to examine mean responses
of richness and biomass to experimental treatments (Figure 9.2). For richness,
results showed significant effects of time, row (as a spatial covariate), and the
fertilizer treatment. For biomass, significant effects of time and its interactions
with burning and fertilizer treatments were found.
[Figure 9.2 (data plots): mean richness (upper panel) and biomass (lower panel, 0–2000) by year (2000–2002) for the six treatments U–U, U–F, A–U, A–F, O–U, and O–F.]
Year
Figure 9.2. Responses of means for richness and biomass to experimental treat-
ments over three years of study. U–U = unburned + unfertilized, U–F = unburned
+ fertilized, A–U = annually burned + unfertilized, A–F = annually burned
and fertilized, O–U = once burned + unfertilized, O–F = once burned + fertil-
ized. Asterisks represent means that differ significantly from the U–U treatment at
p < 0.05, while the “+” symbol indicates differences at the p < 0.10 level.
In order to better understand how richness and biomass are jointly regulated
in this system, a multivariate analysis was conducted to evaluate the general
model shown in Figure 9.3. This model supposes that richness and biomass
might influence each other, and these interactions could contribute to responses
of both to treatments. Reciprocal arrows between richness and biomass were
meant to include both negative effects of biomass on richness through competi-
tive effects (which has been documented for this site by Jutila and Grace 2002),
as well as postulated positive effects of richness on biomass.
The first step in evaluating the model in Figure 9.3 was to determine the
shape of the relationship between biomass and richness. A positive relationship
between these two variables would indicate the possibility of an effect from
richness to biomass; a negative relationship would indicate the possibility of a
[Figure 9.3 (path diagram): treatment (trt: UU, UF, AU, AF, OU, OF) predicting rich and biom, with reciprocal paths between rich and biom.]
[Figure 9.4 (scatter plot): richness/plot (0–50) versus biomass (0–2500 g/m²); r = −0.47.]
Figure 9.4. Observed relationship between biomass and richness in study of
coastal prairie. Sample size in this study is 72 and includes values pooled across
treatments and times.
Figure 9.5. Results from analysis of initial model (Figure 9.3). Path from rich to
biom was not theoretically justified in this analysis, because the observed correla-
tion between these two variables (Figure 9.4) was negative and linear.
(1) The correlation between biomass and richness observed in Figure 9.4 does
not represent a causal interaction between these two variables, but instead,
joint control by the other variables.
(2) Row effects on richness can be explained entirely by variations in wetness
across rows.
(3) Year effects were of three sorts, one being an increase in biomass over time,
a second being an increase in community wetness during the experiment,
and a third being a decrease in richness independent of wetness or biomass.
(4) Upon considering the year effects, the authors concluded that the path
from year to rich most likely represented a system-wide decline in richness
resulting from a cessation of mowing during the course of the study. This
effect was further deemed to be a form of “experimental drift” that was an
unintended consequence of the experiment.
[Figure 9.6 (path diagram): row, wet (R² = 0.55), trt (UU–OF), and year (Y1–Y3) predicting rich (R² = 0.77) and biom (R² = 0.63).]
Figure 9.6. Model and results when the effects of year, spatial variation (the row
variable), and vegetation wetness tolerances were incorporated.
Conclusions
Conventional univariate analyses, such as the repeated-measures ANCOVA, are
often important in a complete evaluation of experimental results. Such analy-
ses permit a detailed examination of net treatment effects on mean responses.
However, such univariate analyses are unable to evaluate the multivariate rela-
tionships among variables. Thus, a combination of univariate and multivariate
techniques can be most effective.
[Figure 9.7 (path diagram): same structure as Figure 9.6 but with the year → rich path removed; wet R² = 0.55, rich R² = 0.43, biom R² = 0.63.]
Figure 9.7. Model with effects of “experimental drift” (the path from year to rich)
removed.
There has been, and remains, great interest in the relationship between
biomass and richness. It is instructive in this case to see how a multivariate
analysis sheds light on the interpretation of this relationship. The observed
correlation between richness and biomass found in this study (Figure 9.4) is
suggestive of a competitive effect on richness. This correlation is consistent with
an interpretation in which treatment, row, and year effects might influence rich-
ness through their impacts on biomass. However, when a multivariate analysis
is performed, we can see that the correlation between richness and biomass,
though it is fairly strong, results from common control rather than biological
interactions. Both richness and biomass are under strong control from com-
mon causes and as emphasized in the previous chapter, relationships between
variables possessing common causes cannot be accurately evaluated using a
univariate or bivariate approach.
This example also illustrates another important capability of a multivariate
experiment. Univariate results showed that vegetation wetness increased over
time while biomass increased and richness decreased. On the face of it, it might
seem that the decrease in richness over time resulted from increased rainfall and
increased competitive exclusion. Multivariate analysis isolated a direct effect of
year on rich that was interpreted as an undesired side effect of the experimental
treatments due to the cessation of mowing during the study. It was then possible
to remove this side effect (dubbed by the authors “experimental drift”) from the
data while retaining the other components of the year effect.
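Removing such a side effect amounts to subtracting only the estimated direct path from year to richness while leaving the indirect (wetness- and biomass-mediated) components in place. A sketch on simulated data (all coefficients hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 72
year = np.repeat([0.0, 1.0, 2.0], 24)              # three study years
wet = 0.5 * year + rng.standard_normal(n)          # wetness rises over time
biom = 0.6 * year + rng.standard_normal(n)         # biomass rises over time
drift = -0.8                                       # direct year effect ("drift")
rich = -0.4 * wet - 0.5 * biom + drift * year + rng.standard_normal(n)

# estimate all paths into richness, then subtract only the direct year term
X = np.column_stack([wet, biom, year, np.ones(n)])
b, *_ = np.linalg.lstsq(X, rich, rcond=None)
rich_adj = rich - b[2] * year     # drift removed; wet/biom pathways retained
```

After the adjustment, richness still declines with year, but only through the wetness and biomass pathways, which is the behavior the experiment was meant to capture.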
Table 9.2. Correlations and standard deviations from initial bird exclusion
experiment conducted by Wootton (1994). GooseB refers to goose barnacles.
AcornB refers to acorn barnacles
Results
The results from the initial bird exclusion experiment are presented in Table
9.2, and the model selected to best represent the data is shown in Figure 9.8.
Examination of the correlations indicates that exposure to bird foraging leads to
a strong reduction in goose barnacles and increases in acorn barnacles, snails,
and mussels. These interpretations are consistent
with results from the analysis of mean responses. The correlation matrix
further indicates that both goose barnacles and mussels are negatively associated
with all community members, while acorn barnacles and snails are positively
associated with each other.
Comparing the correlation matrix to the path model results indicates the
following:
(1) The effects of birds on snails, acorn barnacles, and mussels are all indi-
rect, being mediated by direct effects on goose barnacles. Thus, birds are
primarily feeding on goose barnacles.
(2) Goose barnacles have strong negative effects on acorn barnacles that are
only partially offset by indirect beneficial effects caused by reductions in
mussels.
[Figure 9.8 (path model, only partially recoverable): Birds, path coefficient −0.96; R² = 0.91.]
(3) Acorn barnacles have a strong positive effect on snails, and there is no
indication that there is a reciprocal effect of snails on acorn barnacles.
(4) Goose barnacles have a complex effect on snails. This effect has positive
direct and indirect components that are offset by a negative effect on acorn
barnacles, which stimulate snails.
(5) Tide height plays an important part in this system. Its actual influence is
poorly represented by the correlation results because of the importance of
the indirect effects that tide height has on acorn barnacles and snails.
Conclusions
It can be seen from this example, as with previous examples, that a multivariate
examination of the data yields a very different interpretation of the underlying
mechanisms. In this case, the author possessed several plausible alternative
interpretations about underlying causal mechanisms. However, two of the three
models evaluated were found not to fit the covariance patterns in the data. Thus,
SEM has the ability to provide guidance as to which proposed mechanisms
do not appear consistent with the results. In this study, Wootton went on to
make predictions from this initial experiment that were subsequently tested
with further experiments. Such experimental tests are considered in the final
chapter relating SEM to ecological forecasting.
[Figure 9.9 (path diagram): season burned, study area, elev, soil, fuel, init_ht, damage, fate, and ht_chg.]
Figure 9.9. Initial model of how height and other factors might influence the
effects of burning on the Chinese tallow tree.
percentage of a tree damaged, (9) its fate (whether it died, was topkilled, or
survived intact), and (10) the height change of the tree during the study. Height
change basically quantified the degree to which resprouting might have allowed
the tree to recover from the burn. The logic of this initial model was that some
factors of importance, such as soil conditions and fuel, would potentially depend
on study area and elevation gradients within sites. It was also hypothesized that
the amount of fuel beneath trees would be affected by tree size.
The amount of fuel was expected to influence both the completeness of burn,
as well as the amount of damage that would be done to trees. However, fires are
also affected by immediate weather and other influences, such as soil moisture,
that vary from site to site and time to time. In this model, such generalized
effects were confounded with study area, which was postulated to include such
effects.
Finally, the fate of a tree and ultimately its change in height during the study
were expected to depend on how complete the burn was around a tree, and how
much damage it sustained. In woodland burns, heterogeneity in fire behavior
can be substantial at the level of the individual tree, and obviously only trees that
actually sustained significant fire damage are expected to respond to burning.
Results
Initial model results found that both soil conditions and elevation influenced
fuel, but did not affect plant responses in any major way. For this reason, it
was possible to simplify the model by removing these two factors. Correlations
among the remaining factors are shown in Table 9.3 and the final model is
shown in Figure 9.10.
[Figure 9.10 (path diagram): season, study area, init_ht, fuel (R² = 0.11), burned (R² = 0.51), and damage (R² = 0.82) predicting fate (R² = 0.55) and ht_chg (R² = 0.54).]
Figure 9.10. Final model for fire effects on Chinese tallow tree.
(1) Initial tree height was well correlated with the responses of trees to burning.
While a number of processes appear to be involved, the biggest fraction of
this relationship was through a direct effect on the fate of the tree that
was unrelated to fuel suppression, or to the fact that larger trees suffered
proportionally less damage.
(2) The study areas also had a substantial impact on the results. This effect
could result from a number of factors relating to the nature of the fires,
the overall fuel conditions, or the conditions affecting tree growth. The
authors concluded that the study area effect was most likely to be related
to differences in fuel conditions (both general and at the time of the burn),
which had strong effects on the completeness and intensity of the burns. It
would seem that in this study, the fuel load immediately around the tree did
not represent the general effects of fuel on fire effects, although there were
some modest effects detected.
(3) The season of burn was related to several pathways in the model. Some of
these appear to result from a confounding of factors caused by the small
number of burn units. Effects of season on initial fuel conditions and the
percentage of plots that burned were deemed coincidental. The main effect
of season that was important in this study was the direct effect on height
change. This effect represents the growth responses of resprouts to season.
It was apparent that trees burned during the dormant season produced
resprouts that fared much better than those produced by trees burned in
summer. There was a similar but weaker effect of season on tree fate.
Conclusions
This example represents a case where an analysis of multivariate relationships
among trees seeks to overcome the limits of field experimentation. Since the
ultimate objective of the research was to develop a predictive model that can be
tested against repeated trials, the limits of the results will ultimately be judged
by the success of prediction over a range of conditions. Because of the goal of
predicting effects in other locations, this experiment contributes to a program
of study that seeks to be generally consistent over space and time, rather than
intensively precise for a particular space and time.
Table 9.4. Correlations. Those in bold are significantly different from zero
at the p < 0.05 level for a sample size of 254
Background
In part because of a desire to test experimentally assumptions about the relation-
ships among environmental and community properties, a study was undertaken
to manipulate as many of the factors thought to be important in controlling
species richness as feasible. This study was conducted at the Pearl River (see
Chapter 10 for a more complete description of this system) in oligohaline and
mesohaline marshes. The manipulations included (1) the addition of NPK fer-
tilizer to alter community biomass, (2) changing the levels of flooding stress by
raising or lowering sods of plants, (3) exposing sods to different salinity regimes
by transplanting them between oligohaline and mesohaline locations, and (4)
the use of fences to exclude herbivores and thereby reduce disturbance levels.
Treatments were applied in a full factorial study. While species composition
was measured in this study, the focus of the model was on species richness
as the primary response variable of interest. Eight replicates of each of the 32
treatment combinations were used in this study, which included a total of 256
experimental units. Problems resulted in the loss of data for two experimental
units, leaving 254 samples in the analysis.
Results
Table 9.4 gives correlations for the variables analyzed by Gough and Grace
(1999). Prior to analysis using LISREL, compositing techniques were used to
provide estimates of total abiotic effects, disturbance, and biomass for subse-
quent analysis. In the preanalysis, first- and second-order polynomial effects
Figure 9.11. SEM results for experimental studies of the factors affecting species
richness in coastal wetland plant communities (from Gough and Grace 1999).
Reproduced by permission of the Ecological Society of America.
of salinity, flooding, and soil organic variations were formative indicators for
the statistical composite variable “abiotic”. First- and second-order polynomial
terms were also incorporated into both “disturbance” and “biomass” in order
to achieve linear relations among variables. In the process of performing the
preanalysis, all variables were z-transformed to means of zero and variances
of 1.0.
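For readers who want to reproduce this kind of preanalysis, the steps just described can be sketched as follows. The variable names and values are hypothetical stand-ins, not the data of Gough and Grace (1999):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for three abiotic measurements on 254 units;
# names and distributions are illustrative, not the original data.
data = {
    "salinity": rng.uniform(0, 12, 254),
    "flooding": rng.uniform(-10, 30, 254),
    "soil_org": rng.uniform(5, 60, 254),
}

def ztrans(x):
    """z-transform to mean 0, variance 1 (sample variance, ddof=1)."""
    return (x - x.mean()) / x.std(ddof=1)

z = {}
for name, x in data.items():
    z[name] = ztrans(x)
    # second-order polynomial term, itself z-transformed, so that
    # curvilinear (e.g., unimodal) responses can enter as linear paths
    z[name + "_sq"] = ztrans(ztrans(x) ** 2)

print(sorted(z))  # six preanalysis variables ready for the SEM stage
```

Because each polynomial term is itself z-transformed, curvilinear responses can enter the structural model as ordinary linear paths.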
Comparison of the SEM results to those apparent from only the raw cor-
relations is revealing. To reduce complexity, only the effects of fencing and
fertilization on richness will be emphasized here. Additional interpretations
can be found in Gough and Grace (1999).
Variations in species richness in this study were unrelated to whether or not
the plots were fenced (r = 0.108), even though fencing reduced disturbance
levels and disturbance, in turn, was related to richness. As the SEM results
show (Figure 9.11), the experimental manipulation of fencing actually had
three significant effects.
Conclusions
The value of this multivariate experimental approach and analysis should be
clear. A univariate experiment would only have revealed that fencing had no net
effect on richness. What would have been hidden from view were the effects
of three significant processes, two positive and one negative, which offset one
another.
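The arithmetic of offsetting pathways can be sketched with a toy simulation. The path strengths below are invented for illustration (they are not the published estimates); a positive indirect route through disturbance and biomass is set to cancel a negative direct effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# binary treatment: fenced vs. unfenced plots
fence = rng.integers(0, 2, n).astype(float)

# fencing lowers disturbance; disturbance lowers biomass
dist = -0.6 * fence + rng.normal(0, 1, n)
biom = -0.5 * dist + rng.normal(0, 1, n)

# richness: a positive indirect route (fence -> dist -> biom -> rich,
# 0.6 * -0.5 * -0.6 = +0.18) set to cancel a direct effect of -0.18
rich = 0.6 * biom - 0.18 * fence + rng.normal(0, 1, n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

print(round(corr(fence, rich), 3))  # near-zero net correlation
print(round(corr(fence, dist), 3))  # yet fencing clearly moves disturbance
print(round(corr(dist, rich), 3))   # and disturbance relates to richness
```

A univariate comparison of fenced versus unfenced plots would see only the first, near-zero number; the path decomposition recovers the processes behind it.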
As for the effects of fertilization, behind the net negative effect on richness
there appear to lie four significant processes.
(1) Two rather weak effects operated through a reduction in disturbance asso-
ciated with fertilization. We hypothesize that, in this case, the more rapid
growth rate of plants in fertilized plots showed a more rapid recovery from
disturbance and, thus, evidence of disturbance was slightly lowered by fertil-
ization. As described above for the effects of fencing, reducing disturbance
can promote richness through two somewhat different paths.
(2) The third indirect effect of fertilization was through an enhancement of
biomass and an associated enhancement of richness, presumably primarily
in plots recovering from disturbance.
(3) The fourth process, implied by Figure 9.11, is a direct negative effect of
fertilization on richness. This strong negative effect occurred independent
of increases in biomass, as was also seen for the fencing treatment. Analysis
of the timecourse of events in this study would indicate that fertilization led
to an early increase in biomass that ultimately inhibited later plant growth.
Ultimately, much of the loss of species associated with fertilization was not
accompanied by an increased biomass at the end of the study.
The story for fertilization would seem to be similar to that for fencing in that
the net effect only tells a small part of the total story. The actual direct negative
effect, independent of the other variables, is actually quite a bit stronger (path
coefficient = −0.50) than the net negative effect (r = −0.27) representing all
the processes. It is possible that our interpretations of all the paths are not
complete. However, the multivariate analysis clearly indicates that fertilization
is influencing richness through a variety of different processes, and it gives an
estimate of the relative importance of direct and indirect influences. It is worth
pointing out that the rather large number of total experimental units in this study
was important in revealing many of the indirect effects that were observed.
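In an observed-variable model such as this, a direct effect is a standardized partial regression coefficient, so the contrast between net and direct effects can be sketched with ordinary least squares. The coefficients below are invented and merely mimic the pattern of a strong direct effect hidden behind a weaker net correlation:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

fert = rng.normal(0, 1, n)
biom = 0.5 * fert + rng.normal(0, 1, n)                 # fertilization raises biomass
rich = -0.5 * fert + 0.3 * biom + rng.normal(0, 1, n)   # direct loss, indirect gain

def standardize(x):
    return (x - x.mean()) / x.std()

X = np.column_stack([standardize(fert), standardize(biom)])
y = standardize(rich)

# with standardized variables, OLS slopes are the standardized partial
# regression coefficients (path coefficients of this simple model)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

net = np.corrcoef(fert, rich)[0, 1]
print(round(beta[0], 2))  # direct effect, roughly -0.45
print(round(net, 2))      # net correlation, noticeably weaker
```
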
Conclusions
It is hopefully clear through the examples presented in this chapter that exper-
imental studies of systems can and often should be conducted using SEM. The
basic test investigators can apply is to ask themselves the following question: am
I interested in how the various parts of the system interact to produce the
results? If the answer is yes, conventional univariate methods alone will not be
adequate, although they can complement the analyses.
Once an investigator decides that they wish to use SEM in an experimental
situation, they are automatically encouraged to include the important
unmanipulated factors in their models. This moves the experiment from being simply
an investigation of a single process, to the study of how their system is regu-
lated. I expect that often ecologists will find that the process of initial interest
to them is not really the one most important in regulating their system, at least
that has been my experience.
Society asks us to provide answers that are relevant to real world concerns
and that can be applied to conservation solutions. Understanding the effects
of individual processes will not, in my opinion, generate the information we
need. The conduct of multivariate experiments is a largely neglected topic
that I believe will become the standard for understanding ecological systems.
There exist great opportunities for developing methods and applications for
experimentally evaluating multivariate hypotheses using SEM.
10
The systematic use of SEM: an example
Figure 10.1. Example of the relationship between biomass (measured as the
maximum standing crop including litter), and species richness (SD, or species
density), presented by Al-Mufti et al. in 1977. Solid circles represent woodland
herbs, open circles represent grasslands, and triangles represent tall herbs of open
areas. Reproduced by permission of Blackwell Publishers.
In conclusion, in this study we found that biomass was not an adequate predictor of
species richness. One reason for this inadequacy appears to be that while stresses
such as flooding and salinity may greatly reduce the pool of potential species that
Table 10.1. Multiple regression model (from Gough et al. 1994)

Predictor variables   Coefficient   Std error   Cumulative R-square   p<
Constant              -3.90         1.084                             0.001
Biomass               -0.0011       0.0003      0.02                  0.001
Elevation             3.10          0.377       0.57                  0.001
Salinity              0.51          0.137       0.69                  0.001
Soil organic          0.052         0.011       0.82                  0.001
Figure 10.2. Relationship between total above-ground community biomass
(live + dead) per m2 and number of species per m2 , found by Gough et al. (1994)
in coastal marsh systems. Reproduced by permission of Oikos.
can occur at a site, those species that have evolved adaptations to these factors may
not have substantially reduced biomass. Thus, we recommend that models
developed to predict species richness should incorporate direct measurements of
environmental factors as well as community attributes such as biomass in order to
increase their applicability.
Figure 10.3. Fit of data to the multiple regression model in Table 10.1 (from
Gough et al. 1994). Reproduced by permission of Oikos.
Figure 10.4. Hypothesized conceptual model of factors controlling species rich-
ness in plant communities.
on the shore of the Gulf of Mexico (Figure 10.5). This landscape contains a
number of conspicuous environmental gradients, including gradients in salinity
(salt marsh to fresh marsh), microelevation (from plants in deep water to those
growing on raised levees), and soil organic content (from sandy soil deposits
to organic muck sediments). We also knew from previous studies that natural
disturbances were common and resulted from the activities of wild mammal
populations, as well as from flooding and storm effects. Thus, this system
contained a variety of stresses and disturbances of the sort commonly invoked
in theories about diversity regulation.
Figure 10.5. Aerial photograph of the Pearl River marsh complex, which was the
site of these investigations.
The next step in the process was to convert the conceptual model
(Figure 10.4) into a construct model, as represented in Figure 10.6. The pro-
cess of specifying a construct model is an important step forward in theory
maturation. The conceptual model presents important ideas; however, it is very
general, and the meanings of its terms are somewhat vague. This is not to say
that the construct model in Figure 10.6 is without ambiguity. Models always
have a context, some set of physical circumstances under which the model
makes sense. I have argued previously (Grace 1991) that the context for mod-
els is often not clearly specified in ecological theories, leading to irresolvable
debate. The application of SEM seeks to make concepts and context tangible in
stages, which preserves both general and specific perspectives on the problem.
This topic will be discussed in more detail in Chapter 12 as it relates to the
concept of theory maturation.
One thing that happens when we specify our construct model is that there is
an immediate expectation that the concepts represented will have to be made
operational. This means that it is soon going to be necessary to specify the
meaning of the concepts by describing exactly how they will be measured.
This reality suggested a distinction we wanted to make that was not
specified in the conceptual model: that there are two distinctly different kinds
Figure 10.7. Initial structural equation model with latent variables representing
concepts and observed variables representing indicators. Reproduced by permis-
sion of The University of Chicago Press.
BIOM, what is meant is that the disturbance regime has an effect on community
biomass. When a measure of recent disturbance is used as an indicator of
DIST, we are proposing that recent indications of disturbance correlate with
the disturbance regime over a period of time, reflected by the vegetation.
Once the initial structural equation model shown in Figure 10.7 was formu-
lated, we designed a sampling scheme and schedule. Data were then collected
over a two-year period to test this model. Only the results from the first year
were used in Grace and Pugesek (1997), with the second year’s data saved for a
subsequent test of whether the model results would hold up over time, and how
the system changes between years. Data collected included sediment salinity,
site microelevation, soil organic and mineral content (components of the abiotic
indices), recent disturbance (dist), measured as percentage of the plot disturbed,
above-ground biomass per m2 (massm2 ), percentage of full sunlight reaching
the ground surface in the densest part of the plot (lightlo), percentage of full
sunlight reaching the ground surface in the sparsest part of the plot (lighthi),
and the number of species in a plot (rich).
As is often the case when evaluating multivariate models, the fit between our
data and the initial model indicated that it was not adequate. Simply put, our data
were not consistent with the expectations implied by the initial model. The part
of the hypothesis in Figure 10.7 that failed was the proposition that biomass and
light readings can both be used to represent a single conceptual variable. The
symptoms of this failure were that lighthi and lightlo were highly correlated,
Figure 10.8. Modified full model defining BIOM and LIGHT as separate con-
ceptual variables.
but that massm2 was not well correlated with either light variable. Because of
this, we reformulated our model, as shown in Figure 10.8. An important lesson
was learned here: regardless of your conceptualization of the problem, if two
variables are not consistently and equally well correlated, they will not function
as multiple indicators of a single conceptual variable. Of course, we could have
used the heterogeneous indicators to represent a composite. However, that did
not fit with our objectives in this analysis.
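The diagnostic behind this lesson, checking whether candidate indicators are consistently and equally well correlated, is simple to run before any model fitting. The data below are simulated stand-ins that mimic the pattern we observed (two light readings sharing a common factor, biomass only weakly related):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 190  # a sample size of the same order as the field study

# one underlying light-availability factor drives both light readings;
# biomass is only weakly tied to it (all values simulated)
light = rng.normal(0, 1, n)
lighthi = 0.95 * light + rng.normal(0, 0.3, n)
lightlo = 0.95 * light + rng.normal(0, 0.3, n)
massm2 = -0.3 * light + rng.normal(0, 1, n)

R = np.corrcoef([massm2, lighthi, lightlo])
labels = ["massm2", "lighthi", "lightlo"]
for i in range(3):
    for j in range(i + 1, 3):
        print(f"r({labels[i]}, {labels[j]}) = {R[i, j]:.2f}")
```

A high correlation between the two light readings, coupled with weak correlations of each with biomass, is exactly the symptom that argues for two separate latent variables rather than one.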
Once the model was reformulated and reanalyzed, another inconsistency
between model and data emerged. A large residual correlation between DIST
and LIGHT was found to exist. As was discussed in Chapter 8 (see Figure 8.12),
this particular residual represents an effect of disturbance on plant morphology
that moderates the effect of disturbance on light. In order to proceed further,
it was necessary to include a pathway from DIST to LIGHT. Only then was
there a consistent match between the relationships in the model and those in the
data.
The results shown in Figure 10.9 represent the partitioning of covariances
in the data as specified by the relationships in the model. Since these results
were obtained using a maximum likelihood statistical procedure, they satisfy
the criterion of being a simultaneous solution for all relationships. It is not my
purpose here to describe all the ecological interpretations of these results. The
interested reader can consult the paper by Grace and Pugesek (1997). What I
would like to point out, however, is that while this model fits the data, we must
conclude that our originally formulated model (Figure 10.7) did not. Thus, further
Figure 10.9. Results for accepted model. Standardized partial regression coeffi-
cients are given. The (+/−) next to the path from LIGHT to RICH signifies that
this path was unimodal. Reproduced by permission of The University of Chicago
Press.
evaluation is still needed using an independent data set, before we can conclude
that our accepted model is valid for the system sampled. As this rather weak
conclusion reveals, the demands that SEM places on empirical validation are
quite stringent. Stated in another way, the model evaluation philosophy pushes
the scientist rather hard to demonstrate that their results have consistent validity
(i.e., consistent applicability in different places and over time), not just local
application.
Figure 10.10. Results for a more specific version of the model. Reproduced by
permission of The University of Chicago Press.
Figure 10.11. Comparisons between predicted and observed species richness
taken from Gough and Grace (1999). A. All plots. B. Excluding fertilized and
fenced plots. Reproduced by permission of the Ecological Society of America.
data collected by Grace and Pugesek. Secondly, the results were represented as
a structural equation model of their own. This model was presented in Chapter 9
and can be seen in Figure 9.11.
Figure 10.11 presents two graphs showing how observed values of species
richness in Gough’s experiments compared to those predicted from non-
experimental data in the broader landscape. When all plots were included
(Figure 10.11A), there was considerable scatter and only 35% of the variance
was explained. Further analyses showed that this was due to the fact that fencing
and fertilizing caused effects that were not quantitatively predicted. With plots
subjected to either of these treatments removed (Figure 10.11B), the remaining
treatments, which included those subjected to changes in salinity and flooding
as well as the controls, demonstrated a stronger correlation between predicted
and observed (R2 = 0.63).
Details of the interpretations of the experimental study can be found in
Gough and Grace (1999). What should be pointed out here, however, is that
many aspects of the model of this system based on nonexperimental data were
supported by the results of the experimental treatments. It was our interpretation
that where the model based on nonexperimental data failed to predict accurately
was for conditions that were experimentally created, but that did not exist
naturally in the field. In other words, it appears that the nonexperimental data
can be used to reveal how factors relate in the unmanipulated community, while
experimental treatments permit one to ask the question, what will happen if we
change the conditions in the system? Thus, when it comes to comparing the
position. This time, we were joined in the search by an eager group of graduate
students at the University of Louisiana, who were involved in a multicampus
course in Biocomplexity offered by the National Center for Ecological Analysis
and Synthesis. Together, we re-examined the earlier data Guntenspergen and I
collected to see if the grids of plots at each site along the river held additional
clues. This time, the question we wished to address was whether small-scale
historical effects might show up as positive correlations in richness among
adjacent plots that were not related to known environmental gradients (Mancera
et al. 2005).
Starting with an examination of the data, we determined that there was
spatial autocorrelation among plots. In other words, we found that plots that
were spatially close were similar to one another in richness, more often than
would be expected by chance. Such spatial autocorrelation has been reported
before (Legendre 1993) and is probably very common. It is entirely possi-
ble, of course, that such spatial autocorrelation in richness simply reflects
spatial autocorrelation in controlling environmental conditions. As seen in
Figure 10.12, the relationship between spatial patterns in richness and spa-
tial patterns in environmental variables represents an important problem for
interpretation. Do the spatial patterns represent a tight mapping to spatial vari-
ations in environmental conditions, or do they represent historic effects such
as dispersal? To address this problem we first factored out the variation in
species richness that could be ascribed to known environmental factors, and
then tested for whether residual richness still showed spatial autocorrelation.
This sequential hypothesis testing is represented in Figure 10.13, with the
test of neighbor richness represented by a “ghost” variable, indicating that
its effect was determined after the effects of the other variables had been
considered.
The analyses showed that once environmental factors were considered, spa-
tial autocorrelation in species richness disappeared. This means that we were
unable to find any evidence of small-scale historical effects, or other unmea-
sured causes of spatially controlled variations in richness in this system.
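The sequential test just described can be sketched as: regress richness on the environmental predictors, then compute Moran's I for the residuals. The grid matches the 5 × 7 sampling design, but the data, the rook-adjacency weights, and the single linear environmental gradient are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
rows, cols = 5, 7
n = rows * cols

# rook-adjacency (shared edge) spatial weights for the grid
W = np.zeros((n, n))
for r in range(rows):
    for c in range(cols):
        i = r * cols + c
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                W[i, rr * cols + cc] = 1.0

def morans_i(x, W):
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

# environment varies smoothly with distance from the river (row index);
# richness tracks it, so raw richness is spatially autocorrelated
env = np.repeat(np.arange(rows, dtype=float), cols)
rich = 0.8 * env + rng.normal(0, 0.5, n)

# factor out the environmental gradient, then retest the residuals
beta = np.polyfit(env, rich, 1)
resid = rich - np.polyval(beta, env)

print(round(morans_i(rich, W), 2))   # positive: richness tracks the gradient
print(round(morans_i(resid, W), 2))  # drops once the gradient is removed
```

In the actual analysis the residual autocorrelation vanished, which is the outcome this kind of two-step test is designed to detect.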
Figure 10.12. Topographic plots of spatial variation in species richness and other
variables at one of the five sample sites along the Pearl River, based on sampling
in a 5 × 7 grid of 1 m2 plots. In these figures, rows are at different distances from
the river’s edge (in meters) and columns are at different locations along the river
at a site (also in meters). From Mancera et al. 2005. Reproduced by permission of
Springer Publishers.
General support for such a construct model was also found by Grace et al.
(2000) in coastal tallgrass prairie, although the influence of recent disturbances
was minor in this system. In a study of woodlands in Mississippi, Weiher
et al. (2004) found that the presence of trees in a prairie grassland moderates
soil and biomass effects on herbaceous richness, requiring an alteration of the
general construct model for such situations. Investigations of diversity regu-
lation in California chaparral (Grace and Keeley 2006) revealed similar
construct relations, along with the importance of landscape features and the
increasing importance of spatial heterogeneity at larger plot sizes. In
other studies, the overwhelming importance of abiotic constraints on diver-
sity patterns of serpentine endemics (Harrison et al. 2006) suggests a dif-
ferent balance of forces at play for plants occupying extreme environmental
Figure 10.13. Model used to evaluate relationship between species richness and
contemporary variables at each of the five grid sample sites at the Pearl River.
PABUN represents plant abundance. The “ghost” variable, Neighbor Richness,
was evaluated for a relationship to RICH after the effects of all other variables
were removed, representing a sequential variance explanation test.
conditions. Altogether, these studies suggest support for certain general fea-
tures across systems, and numerous specific factors of importance in particular
situations or contexts. We will return to the question of generality in our final
chapter, where we consider how SEM methods may evolve to permit the evalu-
ation of very general models that can apply across systems diverging in specific
properties.
Summary
This chapter has sought to give the reader an insight into an example of the
ecological perspective that can be created through a committed approach to
multivariate model development, evaluation, and refinement. The point of pre-
senting such an extensive example is not to imply that all of these steps are
required. Instead, what I hope to have accomplished is to show the reader how
the adoption of a multivariate perspective opens up a new way of learning
about ecological systems. It has been my personal experience that the pursuit
of a multivariate understanding has greatly enhanced the ecological insights I
have gained.
11
Cautions and recommendations
In this chapter I consider some of the pitfalls that can be encountered, and present
a few recommendations that may help to avoid problems in implementing and
interpreting results from SEM. Once the reader is finished with this chapter,
they may wish to visit Appendix I to see how some of the recommendations
from this chapter are applied to example SEM applications.
some comfort that this “new” method has substantial theoretical and practical
backing.
It would be a mistake, however, to think that the proper application of SEM
methods is straightforward, or that all the major issues with multivariate mod-
eling have been worked out. Many of the models we might wish to solve are
currently not solvable. Sometimes the solutions offered by authors are not appro-
priate. A few of these issues have been discussed in previous chapters. Aside
from these important technical issues, there is a significant need to rely on
fundamental SEM principles rather than prepackaged protocols, if we are to
apply the methodologies to environmental problems where the context of data,
theory, and research objectives may be different. There are many opportuni-
ties for misapplication of SEM methods, some with minor consequences and
some with major. In the first two sections of this book, I tried to be rather care-
ful to emphasize proper applications and interpretations. In the third section
(Chapters 8, 9, and 10), however, I have been quite liberal in my use of pub-
lished applications of path analysis and SEM in the ecological sciences. This
was done deliberately so that I could emphasize the merits of using multivariate
models. It must be disclosed, however, that many of the examples presented do
not provide good illustrations of the best way to conduct analyses. Therefore,
in this chapter I provide a brief description of some of the problems faced in
the application of SEM. This is followed by a brief set of recommendations to
provide additional guidance for the beginning SEM practitioner.
Cautions
Kline (2005) offers a list of 44 common errors made using SEM. Here I borrow
heavily from his description of such problems, with the caveat that I qualify
some of his comments. These problems fall under the four headings of errors
of model specification, problematic data, errors of analysis, and errors of inter-
pretation. Table 11.1 contains a paraphrased list of the problems as he presents
them, and I provide some limited discussion of them by group afterwards. In
this discussion I do not address all of the listed issues: some are not discussed
because they are self-evident, and some because they are explicitly covered
elsewhere in the book and cannot be encapsulated briefly.
Specification errors
Specification errors can indeed cause some of the greatest problems with SEM
results. Basing analyses on an inappropriate model can lead to results that
are not close to the true values, because the model structure does not match the
Table 11.1. List of commonly made errors (adapted from Kline 2005)
Errors of specification
1. Failure to specify model(s) prior to data collection.
2. Omission of important factors from model.
3. Failure to have sufficient number of indicators for latent variables.
4. Using unsound indicators for latent variables.
5. Failure to carefully consider directionality of arrows.
6. Specification of feedback effects to mask uncertainty about relations.
7. Overfitting of model.
8. Including error correlations without sufficient theoretical reasons.
9. Cross-loading indicators on multiple factors without justification.
Improper treatment of data
10. Failure to root out errors in the data.
11. Ignoring whether pattern of missing data is systematic (versus random).
12. Failure to examine data distributional characteristics.
13. Failure to screen for outliers.
14. Assuming relations are linear without checking.
15. Ignoring lack of independence among observations.
Errors of analysis
16. Failure to rely on theory to guide decisions of model acceptance.
17. Failure to check accuracy of computer syntax.
18. Failure to check for admissibility of solutions.
19. Reporting only the standardized parameter estimates.
20. Analyzing a correlation matrix when inappropriate.
21. Improperly performing covariance analysis using correlation matrix.
22. Failure to check for constraint interactions.
23. Failure to properly address collinear relations.
24. Estimation of complex model using small sample size.
25. Setting inappropriate scales for latent variables.
26. Ignoring problems resulting from improper starting values.
27. Failure to check model identification when solving models.
28. Failure to recognize empirical underidentification.
29. Failure to separately analyze measurement and structural models.
30. Failure in multigroup analyses to establish common measurement model.
31. Analysis of categorical items as continuous indicators.
Errors of interpretation
32. Relying solely on indices of overall model fit.
33. Interpreting good model fit as proving validity of model.
34. Interpreting good fit as suggesting model is good at explaining variance.
35. Relying solely on statistical criteria for model evaluation.
36. Relying too heavily on p-values.
37. Inappropriate interpretation of standardized parameters.
38. Failure to consider equivalent models.
39. Failure to consider alternative nonequivalent models.
40. Reification of latent variables.
41. Equating a latent variable with its label.
42. Believing SEM can compensate for either poor data or weak theory.
43. Failure to report sufficient information to permit proper scrutiny.
44. Interpreting significant results as proof of causality.
situation. Kline begins this set of errors by stating that the failure to specify your
model prior to collecting data is an error to avoid. While it is certainly true that
it is desirable to have your model prior to collecting your data, I do not believe
(1) it is reasonable to expect beginning practitioners to be sufficiently experi-
enced to always specify a correct or near-correct model prior to data collection,
(2) forming a model based on pre-existing data always leads to problems, or
(3) forming a model prior to collecting data necessarily leads to a correct model.
We must first recognize that initial applications of SEM to a problem take place
in the face of insufficient information. Often we do not have sufficient experi-
ence to anticipate the appropriateness of the data and models used. While I do
recommend specification of models before data collection, this still does not
solve the problem of insufficient experience. With experience in modeling a
system, initial specification of models provides important opportunities.
Omitting important factors from models can be a major limitation to our
success. When omitted factors are uncorrelated with other predictors, their
omission leads to reduced predictive power. When omitted factors are correlated
strongly with included factors, their omission can bias coefficients. Only the
relentless pursuit of a problem can uncover either the adequacy of a model, or
the missing factors and their degree of correlation with included factors.
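The bias produced by omitting a factor that is correlated with an included one can be demonstrated in a few lines of ordinary least squares; all coefficients here are invented:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

# two correlated causes of y, each with a true effect of 1.0
x1 = rng.normal(0, 1, n)
x2 = 0.7 * x1 + rng.normal(0, 1, n)   # x2 is correlated with x1
y = 1.0 * x1 + 1.0 * x2 + rng.normal(0, 1, n)

def ols(X, y):
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]  # drop the intercept

full = ols(np.column_stack([x1, x2]), y)   # both causes included
reduced = ols(x1.reshape(-1, 1), y)        # x2 omitted

print(np.round(full, 2))     # close to the true values [1.0, 1.0]
print(np.round(reduced, 2))  # inflated: x1 absorbs part of x2's effect
```

With x2 omitted, x1's coefficient converges on 1.0 + 1.0 × cov(x1, x2)/var(x1) = 1.7, which is exactly the kind of distortion that makes omitted correlated factors more dangerous than omitted independent ones.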
In the social sciences general recommendations exist for the number of indi-
cators to include when using a latent variable model. The “magic number” often
recommended is 3 indicators per latent. Such prescriptions must be understood
within context, however. In Chapter 4, I describe the fundamental principles
associated with latent variables and their measurement. I show that the speci-
fication of single-indicator latents can be quite valuable. I also argue that the
criterion of indicator replaceability should guide decisions about the adequacy
of a set of indicators. Furthermore, whether the objective of an analysis is
inward- or outward-focused strongly influences questions about the number
of indicators per latent. Finally, the nature of the data relative to the concept
embodied by the latent variable is particularly important. Keep in mind that
a latent variable is simply an unmeasured variable. It may represent a narrow
concept, such as an individual’s body weight, or it may represent something
more abstract, such as the concept of body size. When latents represent narrow
concepts, it may be that only one or two indicators are needed to assess that
concept adequately. The more general the concept, the more important it will
be to assess a wide array of properties that relate indicators to its meaning.
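The gain from additional indicators can also be seen numerically. In the sketch below (an illustration, with an arbitrary loading of 0.7), the correlation between a composite of k indicators and the latent variable it measures rises as k grows:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
factor = rng.normal(0, 1, n)  # the latent (unmeasured) variable

def composite(k):
    """Average of k indicators, each loading 0.7 on the latent."""
    inds = [0.7 * factor + np.sqrt(0.51) * rng.normal(0, 1, n)
            for _ in range(k)]
    return np.mean(inds, axis=0)

for k in (1, 3, 6):
    r = np.corrcoef(composite(k), factor)[0, 1]
    print(f"{k} indicator(s): r(composite, latent) = {r:.2f}")
```

This is the familiar Spearman-Brown pattern: each added indicator averages away more measurement error, but the gains diminish quickly, which is one reason rigid prescriptions about indicator counts should give way to the considerations discussed above.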
Errors 4–6 in Table 11.1 all relate to the general problem of having an
inappropriate model. Sometimes this results from insufficient experience with
the subject being modeled, and sometimes it results from carelessness by the
researcher or analyst. The point here is to urge the practitioner to avoid carelessness by developing an appreciation of the possible alternative model structures.
Cautions and recommendations 279
Errors of analysis
Most of the errors of analysis listed in Table 11.1 are technical problems. Basi-
cally, we find that the results achieved from SEM analyses are dependent on
properly coded models that achieve admissible solutions providing unique esti-
mates based on the proper algorithms for the data used and model specified.
280 Structural equation modeling and natural systems
There are also certain requirements for proper results that involve recognizing
when certain conditions are not met (e.g., common measurement model across
groups), or being aware that the test of overall model fit integrates lack of fit in
both the measurement model and structural (inner) model.
Only a few of the errors listed require additional, specific mention. It is a
mistake to rely exclusively on statistical criteria when judging model accep-
tance. The good news is that for observed variable models, judging adequacy
of fit is usually fairly clear cut. Latent variable models with lots of indicators,
on the other hand, pose the greatest difficulties. As we discussed in Chapters 4
and 6, such models tend to accumulate a lack of fit load, and when combined
with large sample sizes, discrepancies between model and data become very
noticeable. The point is, even well-fitting models can be way off the mark if
theory is either insufficient or ignored. At the same time, lack of fit may rep-
resent an isolated misspecification in a model that does not apply to the model
generally, although it must still be addressed.
The problem of sample size adequacy is one that has no easy answers. It is true
that certain sample sizes are so low that they do not provide much opportunity
to detect relationships. Bootstrapping methods, as well as other approaches
(Bayesian estimation), can allow us to avoid the large-sample assumptions of
maximum likelihood. On the other hand, some models are not easily resolvable
even with large samples, and others can provide stable, reasonable solutions
even with very small samples. It is true that model complexity plays a role.
However, the clarity of patterns in the data also plays an overwhelming role in
sample size adequacy. Thus, it is not possible to make reliable pronouncements
in the absence of knowing something about the data. More discussion of this
topic can be found in Shipley (2000, chapter 6).
The management of collinear relations in models is indeed one that requires
care. When two predictors are very highly correlated (r > 0.85), they begin to
become somewhat redundant. If appropriate, modeling them as multiple indica-
tors of a common latent factor is an effective solution that removes problems of
variance inflation and parameter bias. When it is not appropriate to model them
as multiple indicators, some other means of reducing the model to something
equivalent to a single predictor may be needed. Options range from dropping
a variable, to the use of composites, as is needed in the challenging case of
nonlinear polynomial relations, where first- and second-order terms are highly
correlated (Chapter 7).
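As a rough numerical illustration (hypothetical, outside any SEM package), the variance inflation implied by a given inter-predictor correlation can be computed directly, and the polynomial case can be seen in miniature: an uncentered first-order term is nearly collinear with its square, while centering before squaring largely removes the correlation.

```python
import numpy as np

def vif(r):
    """Variance inflation factor implied by a correlation r between
    two predictors: VIF = 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r**2)

rng = np.random.default_rng(1)
n = 10_000

# First- and second-order terms of a strictly positive variable are
# almost perfectly collinear...
x = rng.uniform(1, 3, size=n)
r_raw = float(np.corrcoef(x, x**2)[0, 1])

# ...while centering x before squaring largely removes that correlation.
xc = x - x.mean()
r_centered = float(np.corrcoef(xc, xc**2)[0, 1])
```

The r > 0.85 rule of thumb corresponds to a VIF of about 3.6. Centering is shown here only as one simple way to reduce collinearity between polynomial terms; composites, as discussed in Chapter 7, are the more general SEM-oriented solution.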
Errors of interpretation
A few of the errors of interpretation listed in Table 11.1 result from a failure to understand technical issues. Those who rely solely on indices of overall model fit fail to recognize that lack of fit can either be spread uniformly throughout a
model, or be concentrated in one spot. When all but a few expected covariances
match observed ones exactly, the average discrepancy may not be too high,
even though one or more of the specific discrepancies may indicate a problem.
Also, interpreting measures of fit as indicating a model is “adequate” for all
purposes is a mistake. While we judge a model first by adequacy of fit, poor
predictive power (indicated by low explained variances) can indicate that the
utility of the model may be low. This could be an inherent limitation of the
phenomenon being studied, or it could be a data or model problem. Further,
equating the magnitude of p-values with effect sizes represents a major mis-
understanding of what p-values mean. It is the parameter value that is ultimately
important. Finally, inappropriate use of results, such as an incorrect usage
of standardized path coefficients, usually results from a misunderstanding of
subtle technical issues influencing these parameters. This issue is discussed in
detail at the end of Chapter 3. The improper use of standardized parameters is a
very common problem that needs attention, so care does need to be paid to this
issue.
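The distinction between p-values and effect sizes is easy to demonstrate with simulated regressions (an illustrative sketch using a normal-approximation z-test; the sample sizes and slopes below are arbitrary assumptions): a trivial effect estimated from a huge sample can yield a far smaller p-value than a dominant effect estimated from a small one.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(7)

def slope_and_p(x, y):
    """OLS slope and a two-sided p-value from a normal approximation."""
    xc = x - x.mean()
    b = float(np.dot(xc, y) / np.dot(xc, xc))
    resid = y - y.mean() - b * xc
    se = float(np.sqrt(np.dot(resid, resid) / (len(x) - 2) / np.dot(xc, xc)))
    z = abs(b / se)
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))
    return b, p

# Trivial effect, huge sample: the p-value is minute anyway.
x1 = rng.normal(size=200_000)
y1 = 0.02 * x1 + rng.normal(size=200_000)
b_small, p_small = slope_and_p(x1, y1)

# Strong effect, small sample: the p-value is far less extreme.
x2 = rng.normal(size=30)
y2 = 0.8 * x2 + rng.normal(size=30)
b_large, p_large = slope_and_p(x2, y2)
```

The parameter value, not the p-value, carries the substantive information: here the smaller p-value belongs to the far weaker effect.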
Most problems in this category, however, represent a misunderstanding or
misuse of statistical analysis. The most egregious error is to assume or imply
that adequacy of fit equates to a demonstration of causation. Many pieces of
evidence are needed, including both an absence of equivalent models with equal
theoretical support, and the existence of independent information supporting
a causal interpretation. The consideration of alternative (either equivalent or nonequivalent) models (Chapter 8) provides a very good reality check. If
you can convince a skeptic that the other possibilities are not supportable, you
will have achieved a strong “fail to reject” result for your model. Similarly, even
the most sophisticated SEM application cannot change the underlying data and
availability of theory. More likely, SEM will uncover the inadequacies. It is my
experience that adding an SEM analysis to a manuscript does not change the
journal for which it is appropriate. That is determined by the data. Structural
equation modeling can, however, greatly improve our extraction of information
from that data, assuming analyses are correct and meaningful.
Summary recommendations
Given that there are so many particulars that need to be considered in an SEM
application, as implied from the above list of possible errors, my advice to the
beginning practitioner is rather general. In many ways, the following account is redundant with previous advice, though I believe some readers will appreciate having it presented in one place in simple form.
Getting started
The transition to SEM is often a bit of a bumpy ride for a number of reasons. It is
hoped that the other chapters of this book will be of some help to the beginning
SEM user by providing a basic understanding of concepts and procedures.
The examples presented (as well as the analyses presented in Appendix I) also
provide opportunities for the reader to become familiar with some of the variety
of possible applications. The suggestions presented here are not meant to be a
rigid protocol, but instead, suggest one approach to take so as to proceed in an
orderly fashion.
The starting point for an SEM analysis will often be the selection of a
system property, or set of system properties, whose variations one wishes to
understand. Note that this is a different emphasis from univariate analyses,
which typically place the focus on a relationship between two variables (e.g.,
the effect of fertilizer addition on plant growth), or perhaps a pair of relation-
ships (e.g., the interactive effects of fertilizer and herbivore exclusion on plant
growth). Certainly a key relationship between two variables can be central to
a multivariate model. However, for a multivariate analysis to be successful,
we need to include the main variables that control the variations in our focal
system properties. Since our objective is to understand some response vari-
able within a system context, we need to approach the problem with a system
perspective.
It is often useful to select or develop a conceptual or other model before
developing the structural equation model. A conceptual model used as a starting
point should include the processes and properties that are known or suspected
to control the focal system properties, regardless of whether they will all be
measured in this study or not. This can help with both model development and
the interpretation of results.
Translation of a conceptual or other nonstatistical model into either a con-
struct model or directly into a structural equation model is the next step. Devel-
oping the initial structural equation model can begin either with a consideration
of the inner model among concepts, or with a measurement model that seeks to
understand factor structure. Regardless of where model development begins, a
model that considers the conceptual variables to be studied as well as the way
they will be measured is the objective at this stage. Sometimes, the develop-
ment of the structural equation model will occur without the development of an
associated conceptual or other model. Ultimately, this depends on the amount
of prior knowledge.
What is often unappreciated is the variety of ways an analysis can be formulated. The flexibility of SEM permits an array of possibilities for how to proceed.
Sampling considerations
Once an initial model has been developed, sampling proceeds and the decisions
made at this step can have a substantial influence on subsequent analyses,
results, and interpretations. It should be clear that the degree to which results
can be said to represent something beyond the sample itself depends on how well the sample represents the sphere of generalization.
Sample size
A first point of consideration is sample size. Again, earlier chapters showed that
even very small samples can be analyzed by resorting to small-sample methods.
However, we are rarely able to have a great deal of confidence in the generality
of studies based on small samples. In many cases in ecological studies, the
ecological units of interest may be large and replication may be challenging. It
is realistic to recognize that not all ecological units will make ideal study subjects
for multivariate models, where many parameters are to be estimated. This is
not to say that multivariate models cannot be evaluated for, say, replicate large
ecosystems; only that it cannot be done very precisely using a limited number
of such systems.
Some guidelines are available regarding sample size requirements. Monte
Carlo simulation studies can be implemented in several of the current SEM
software packages, and this is the most efficient way to determine the sample size
needed to obtain particular levels of power. In the absence of such information,
many authors make recommendations as to the number of samples needed per parameter estimated, a common number being 10. Other authors offer more general guidelines, such as 200 samples being satisfactory, 100 minimal, and 50 a bare minimum. When working with
smaller sample sizes, bootstrapping procedures are especially recommended
because they do not rely on large-sample assumptions. Additional discussion
of this subject can be found in Hoyle (1999).
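The logic of such a Monte Carlo study can be sketched outside any SEM package with ordinary regression: simulate many data sets at a candidate sample size, fit the model to each, and record how often a true effect of the expected strength is detected. (The standardized effect size of 0.3, the simulation count, and the z-test used below are all illustrative assumptions.)

```python
import numpy as np

def power_for_n(n, beta=0.3, n_sims=500, seed=2):
    """Fraction of simulated data sets in which a true standardized path
    of strength `beta` is detected at roughly the 0.05 level by a z-test
    on the OLS slope."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        x = rng.normal(size=n)
        y = beta * x + np.sqrt(1 - beta**2) * rng.normal(size=n)
        xc = x - x.mean()
        b = np.dot(xc, y) / np.dot(xc, xc)
        resid = y - y.mean() - b * xc
        se = np.sqrt(np.dot(resid, resid) / (n - 2) / np.dot(xc, xc))
        if abs(b / se) > 1.96:
            hits += 1
    return hits / n_sims

power_50 = power_for_n(50)    # modest power for a moderate effect
power_200 = power_for_n(200)  # near-certain detection
```

For a single moderate path, 50 samples yields only middling power while 200 detects the effect almost every time, which matches the flavor of the rules of thumb above; a real SEM power study would simulate from the full hypothesized model.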
Sample distribution
Multigroup analysis has long permitted analysis of samples that fall into dis-
tinct categories of interest. However, problems arise when data are clustered or
subsampled, and that subsampling is ignored in the analysis. Typically, ignoring
subsampling leads to an increase in the chi-square of the model, because there
is a source of variation in the data that is not reflected in the model. Also, R-
squares and standard errors are overestimated when data structuring is ignored.
A simple and common procedure for handling data hierarchy is the use of
so-called background variables. Several approaches to this problem have been
discussed in both Chapters 7 and 9. However, until fairly recently, more sophis-
ticated methods explicitly designed for handling data structure in multivariate
modeling were not available in SEM software packages. Now, both LISREL
and Mplus, as well as a program called HLM (Hierarchical Linear Modeling,
www.ssisoftware.com) have capabilities for handling hierarchical data. These
more complete methods for hierarchical data handling are based on analysis of
raw data and cannot be accomplished through the analysis of covariances. As a
consequence, results akin to those obtained by repeated measures ANOVA can
be obtained, allowing individuals in the population to possess different response
patterns. Recent advances in Mplus permit an amazing array of analyses to be
performed at different levels in hierarchical data.
One other issue related to sample distribution has to do with distribution in
parameter space. The span of variation in variables captured by sampling, as
well as the distribution of those samples, can have a substantial effect on results
and interpretations. If the goal of a study is to obtain a representative random
sample, random or stratified random sampling is required. Often, however, sam-
pling is stratified or restricted, for example, sampling across gradients. While
various sampling approaches should arrive at similar estimates for the unstan-
dardized coefficients (slope of relationship), they will differ in the standardized
coefficients, as well as the R-squares. This is basically a signal to noise prob-
lem. Sampling at two ends of a gradient will be expected to generate a strong
regression relationship. This can actually be helpful if the goal is to estimate
a linear slope and to ensure that endpoints are well measured. Samples spread
evenly along a gradient will provide an even better representation of a gradi-
ent, although there will be more unexplained variance. Systematic sampling,
regardless of its merits for efficiency, does not provide an accurate measure of
variance explained for the population at large. Traditionally, most hypothesis
testing enterprises are based on a presumption of random sampling. Sequential
methodologies, such as SEM or Bayesian learning, seek repeatability and can
thus be less restrictive about sampling so as to adequately estimate population
variances.
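This signal-to-noise point is easy to demonstrate with simulated data (an illustrative sketch; the gradient, true slope, and noise level are arbitrary assumptions): both sampling schemes recover the unstandardized slope, but endpoint sampling produces a noticeably higher R-square than even sampling along the same gradient.

```python
import numpy as np

rng = np.random.default_rng(3)

def fit(x, y):
    """Return (unstandardized slope, R-square) from a simple regression."""
    xc = x - x.mean()
    b = float(np.dot(xc, y) / np.dot(xc, xc))
    resid = y - y.mean() - b * xc
    sst = float(np.dot(y - y.mean(), y - y.mean()))
    return b, 1.0 - float(np.dot(resid, resid)) / sst

def response(x):
    # True relationship along the gradient: y = 2*x plus noise (sd = 5)
    return 2.0 * x + rng.normal(scale=5.0, size=x.size)

# Endpoint sampling: all observations at the two ends of the gradient
x_ends = np.concatenate([np.zeros(200), np.full(200, 10.0)])
b_ends, r2_ends = fit(x_ends, response(x_ends))

# Even sampling along the same gradient
x_even = np.linspace(0.0, 10.0, 400)
b_even, r2_even = fit(x_even, response(x_even))
```

Both designs estimate a slope near 2.0, yet the endpoint design inflates variance explained because it maximizes the spread of the predictor relative to the noise.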
Analysis
Initial evaluation
When working with pre-existing data, one should be aware that the strength of
inference that can be achieved is dependent on the degree to which analyses
are exploratory versus confirmatory. Beginning users often have the impression
that confirmatory evaluation of models requires that they should not look at
the data prior to fitting the proposed model. This is both a practical and a
theoretical mistake. The appropriate criterion for a confirmatory analysis is
that the researcher should not have used exploratory SEM analyses of the data
to arrive at the final model, unless it is to be acknowledged that the results are
exploratory and provisional. This applies to both the analysis of pre-existing data
as well as cases where initial models were developed prior to data collection.
The place to start with an analysis is a thorough screening and examination
of the data, point by point in most cases. Paying inadequate attention to this step
is a common mistake made by new analysts. Results of all sorts, particularly
regression results (and therefore, SEM results as well) are extremely sensitive
to outliers. These outliers cannot always be detected in univariate space or even
in bivariate space (an outlier in bivariate space is a combination of x and y that is
very different from any other combination). The researcher needs to know that
all the points are representative. Often, apparent outliers appear in random samples simply because random samples do not capture the edges of parameter space easily unless the sample is very large; such points are not true outliers. Nevertheless, a decision must be made whether to remove an outlying
sample value that is useful in defining a slope in order to avoid overestimating
variance explained by a relationship. Again, the choice may depend on whether
the goal of the study is to estimate population values, or to develop general
equations.
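One standard way to screen for outliers in multivariate rather than univariate space is the Mahalanobis distance, which measures how unusual each combination of values is relative to the correlation structure of the data. The sketch below (hypothetical data; not an example from the text) plants a point whose x and y are each unremarkable alone but jointly contradict a strong positive correlation:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500

# Strongly correlated x and y (r ~ 0.9)
x = rng.normal(size=n)
y = 0.9 * x + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
data = np.column_stack([x, y])

# Plant a bivariate outlier: x = 2 and y = -2 are each within an
# ordinary range, but the combination contradicts the correlation.
data = np.vstack([data, [2.0, -2.0]])

def mahalanobis_sq(data):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    centered = data - data.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

d2 = mahalanobis_sq(data)
worst = int(np.argmax(d2))  # the planted point stands out clearly
```

A univariate screen would pass this point; the multivariate distance flags it immediately.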
When residual distributions depart from normality, methods that do not rely on the normality assumption must be used. Fortunately for the modern user of SEM, such methods are available.
For large samples, the Satorra–Bentler Robust method provides standard errors
that are accurate for a variety of residual distributions. For small or large sam-
ples, bootstrapping methods can accomplish this. It is important to point out
that parameter estimates, such as path coefficients, are unaffected by resid-
ual distributions, because their estimation does not depend on the normality
assumption.
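A minimal nonparametric bootstrap of a single path coefficient might look like the following (an illustrative sketch with skewed residuals and a small sample; the model, sample size, and number of resamples are assumptions, not an example from the text):

```python
import numpy as np

rng = np.random.default_rng(5)

def ols_slope(x, y):
    xc = x - x.mean()
    return float(np.dot(xc, y) / np.dot(xc, xc))

# Small sample (n = 40) with skewed, non-normal residuals
n = 40
x = rng.normal(size=n)
y = 0.5 * x + rng.exponential(scale=1.0, size=n) - 1.0

def bootstrap_se(x, y, n_boot=2000):
    """Case-resampling bootstrap: the standard error of the slope is the
    standard deviation of slopes across resampled data sets."""
    slopes = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))
        slopes[i] = ols_slope(x[idx], y[idx])
    return float(slopes.std(ddof=1))

b_hat = ols_slope(x, y)
se_boot = bootstrap_se(x, y)
```

The point estimate itself is unchanged; the bootstrap supplies a standard error that does not depend on normally distributed residuals.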
While normality problems are easily addressed, extreme skew can affect
covariance, and therefore, path coefficients. For this reason, transformations of
the individual variables may be required. Categorical variable modeling options
in programs like Mplus provide a variety of ways to address the analysis of
extremely skewed or otherwise nonnormal variables.
Nonlinearities can lead to serious problems in analyses. Generally it is possi-
ble to detect nonlinearities through the examination of bivariate plots. The sim-
plest approach to correcting for nonlinear relations is through transformations
of the raw variables. As illustrated in Chapter 7, nonlinear modeling for cases
where multi-indicator latent variables are not involved can be accomplished.
Nonlinear relationships (including interactions) among latent variables have
generally been quite difficult to implement due to the complexity of specifica-
tion. This is no longer true for the latest version (version 3) of Mplus, which
now automates the process.
A number of strategies can be used for missing data. While listwise deletion
(deletion of any cases where values for a variable are missing) is often the
simplest approach, there are other options. Imputation of missing values has
received considerable attention and can often be used successfully if there are
not very many missing values for a variable, and they are considered to be
missing at random. Some programs have rather elaborate and quite clever ways
of dealing with missing values, particularly if values are missing for a reason
(for example, through mortality of individuals). In such situations, deletion of
cases with missing values can lead to bias in the sample, and is to be avoided.
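The contrast between values missing completely at random and values missing for a reason can be simulated directly (hypothetical data; the mortality threshold below is an arbitrary assumption): listwise deletion of randomly missing cases leaves the slope essentially unbiased, while deleting cases that are missing because the response was low attenuates the slope badly.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5_000

def slope(x, y):
    xc = x - x.mean()
    return float(np.dot(xc, y) / np.dot(xc, xc))

x = rng.normal(size=n)
y = x + rng.normal(size=n)   # true slope = 1

# Missing completely at random: listwise deletion costs sample size
# but leaves the slope estimate essentially unbiased.
keep = rng.random(n) > 0.3
b_mcar = slope(x[keep], y[keep])

# Missing for a reason: e.g., mortality removes the lowest responses.
# Deleting those cases truncates y and biases the slope downward.
alive = y > -0.5
b_biased = slope(x[alive], y[alive])
```

This is the sense in which deletion of cases that are missing for a reason leads to a biased sample, whereas deletion under random missingness merely reduces power.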
Evaluation of models
Once data have been prepared for analysis, it may be necessary to modify the
initially hypothesized model(s). The processes of data collection, data examina-
tion, and data screening can often lead to the conclusion that the resulting data
set differs in important ways from what was initially intended. One should make
as few changes as possible to theoretically based models. However, it should
be clear that in the initial stages of evaluating models relating to a system, there
will be substantial learning associated with all parts of the process.
Interpretation of findings
Virtually everything we have covered so far in this book can come into play
when we draw interpretations. Results are certainly highly dependent on the
correctness of the model structure. They are also dependent on the way data are
sampled. For these reasons, in manuscripts it is best to describe SEM results
in terms that make the fewest assumptions, saving causal inferences for the
discussion section of the paper. Few things raise reviewers’ concerns as read-
ily as what sounds like pre-interpreted results. Structural equation modeling
lingo contains plenty of terms that can encourage us to state results as if their
interpretations are pre-judged, such as discussions of direct and indirect effects.
Sensitivity to this issue can save the author from harsh reactions.
Often, results and associated modification indices will indicate that models
should be changed. When deciding on both model modification and the inter-
pretation of results the first priority should be given to reliance on substantive
(theoretical) logic. An additional caveat is to avoid overfitting to the data and
thereby capitalizing on chance characteristics of the data set. It is important to
always report how the analysis was conducted and never present a final result
that was formulated through a very exploratory process, as if it was the initial
model and the study was confirmatory. When data sets are large, consider split-
ting the data set into an exploratory sample and a confirmatory sample. Finally,
plan to do followup studies in which modified models receive further testing
with new data sets. Ultimately, the two ingredients for success are a sustained,
multifaceted approach to a problem (Chapter 10 may provide the reader with
some ideas on this subject) combined with a strong reliance on theory. It is
unlikely that we will determine the general validity of our solutions from a
single study, but with repeated efforts conducted using comparable methods,
strong inference can develop.
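The exploratory/confirmatory split suggested above is mechanically simple (a sketch with placeholder data; the data set and split proportion are assumptions): randomly permute the cases, modify models freely on one half, then fit the final model once to the untouched half.

```python
import numpy as np

rng = np.random.default_rng(8)
n_cases = 2_000

# Placeholder data set: rows are cases, columns are observed variables
data = rng.normal(size=(n_cases, 6))

# Random split: one half for exploratory model modification,
# the other reserved for a single confirmatory fit of the final model.
order = rng.permutation(n_cases)
explore = data[order[: n_cases // 2]]
confirm = data[order[n_cases // 2 :]]
```

Because the confirmatory half played no role in the modifications, a good fit there is not an artifact of capitalizing on chance features of the exploratory sample.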
Conclusions
While the evaluation of multivariate models holds great promise for ecological
studies, the current level of rigor for ecological applications is less than it needs
to be. Proper application of methods such as SEM requires an appreciation of the need to collect appropriate samples, handle data properly, formulate models correctly for the sample being analyzed, perform analyses carefully, and interpret results appropriately. Because of the complexity of the issues involved, a greater commitment of time and effort is required than for many of the univariate procedures ecologists are accustomed to learning. I
believe the effort is well worth it and hope that the material in this book will
make the process easier for others.
PART V
the search for simple general explanations for complex ecological problems.
Simple generalizations will continue to fascinate ecologists. They will not, I
contend, provide adequate answers to environmental problems. We have been
doing the best we can with the resources and tools available to us. This book
presents new tools and approaches for the biologist to consider, ones that are
specifically designed to confront complexity. There is a special emphasis in this
book on ways to analyze complex data and achieve answers that are specific to
a particular system, and yet have generality across systems as well.
Methodologies that have been developed for the study of multivariate rela-
tions provide us with a means to advance ecological understanding in a number
of new ways. Their capacity to sort out simultaneous effects can allow us to
move beyond dichotomous debates of this factor versus that factor. They can
also provide us with the means to accommodate both the specific and general
features of data. Perhaps of greatest importance, they push us to collect data in
a different way, one that will, I believe, clarify our ideas and refine our under-
standing. Ultimately, the premise of this book is that the study of multivariate
relationships using methods like SEM can advance our understanding of natural
systems in ways not achieved through other methods.
Figure 12.1. Representation of the progress of the scientific method, cycling through description and comparison, hypothesis formation, prediction, and hypothesis evaluation.
In this chapter, I will set the stage for a critical look at current ecological
theory by first discussing concepts relating to the development and maturation
of scientific theories. I contend that a univariate mindset has caused ecologists
and others studying natural systems to develop and cling to simplistic theories
that fail to mature. There are many conspicuous properties of immature theories,
and it is readily apparent that such properties are commonly associated with the
most discussed ecological theories of the day. I go on to discuss properties of
SEM that can contribute to theory maturation, and within that discussion point
out topics in our field where current limitations are apparent. In the last part
of the section, I discuss and illustrate an associated set of scientific behaviors,
such as theory tenacity, that contribute to the accumulation of a large set of
immature theories that contradict each other but that do not evolve over time
and for which there is no resolution of merit. My thesis is that the ability to
develop and evaluate multivariate theories has the capacity to greatly enhance
the maturation of theory and lead to scientific progress.
[Figure: theory maturation, from a novel idea or bold proposition toward a clear, convincing, mechanistic, and predictive explanation, illustrated by one-dimensional models (r-selected vs. K-selected), a two-dimensional model (C, R, S), and a multidimensional model (C1, C2, R1, R2, S1, S2, S3).]
Figure 12.4. Engraved title page in Sir Francis Bacon’s book, Novum Organum,
published in 1620 (engraving by Simon Pass). The Latin motto written beneath the ship translates as “Many shall pass through and learning shall be increased”.
Image reproduced by permission of Richard Fadem.
that fall outside of the mainstream of application. Often this process involves
tradeoffs and compromises that give up one model attribute (e.g., estimation of
measurement error) for another (e.g., across-system generality). Examples of
these efforts are found throughout this book, though particularly in Chapters 6,
7, 10, and 13. Two of my several objectives in this work have been to demonstrate
the ability of SEM to be flexible, and also to strive for continued developments
that will permit us to address a broader range of models. There remain, at the
present time, a number of important difficulties in dealing with certain kinds of
problems. Given the methodological advances that are being made, I am hopeful
that further progress can permit SEM to achieve the degree of flexibility needed
for use by researchers studying ecological systems.
argue that this debate has been complicated by ambiguous definitions, unclear
predictions, equivocal tests, and considerable controversy – all signs that the
hypothesis being addressed needs clarification and refinement. I believe that
adoption of a multivariate approach to theory construction would facilitate the
theory maturation process in this case as well.
We have already seen that certain refinements in both definitions and exper-
imental methods have had to be made in order to facilitate progress. First,
the consequences of drawing a limited number of species from a larger pool of
species are not really so simple (Huston 1997, Wardle 1999, Mikola et al. 2002).
Separating the number of species selected from the productivities of the particu-
lar species that wind up in a sample has turned out to be more difficult than most
investigators initially expected. The recognition of this “sampling effect” has
forced researchers to make a distinction between the effects of having a variety
of species in a sample, and the effects of winding up with a particularly produc-
tive species in a sample just because many were selected. As the importance of
this distinction has started to sink in, it has been necessary to recognize that a
key process that should be isolated is “overyielding”, the capacity for a mixture
of species to be more productive than any of the individual species in the mix-
ture. Thus, we now see that evaluating the hypothesis that diversity enhances
productivity requires us to partition at least two processes: the sampling effect
and the overyielding effect.
A second area in which definitions have had to be refined deals with the
term diversity. Initially it was assumed that species diversity automatically led
to functional diversity. However, it has now been found quite useful to estimate
the effect of growth form or functional group diversity on productivity because,
on average, species of different functional groups are more reliably different
than species of the same functional group. Again, here it seems that separating
the effects of the diversity of specific plant attributes, such as the ability to fix
nitrogen, from other types of plant variety may help us to further refine our
hypotheses and experiments.
Many of the experiments that have been conducted to address the potential
effects of diversity on productivity have been extremely impressive from the
standpoint of the effort involved. In spite of this, there has been much con-
troversy about the interpretation of the results. Some problems have resulted
from the fact that the above-described distinctions had not been made at the
time the experiments were initiated. Other problems arose because of the diffi-
culties associated with developing an operational test of an abstract hypoth-
esis. The particular set of species considered, the growing conditions, and
the duration of the study can all have specific effects that might confound
any attempt to understand the general effect of diversity on productivity from
an individual study. The process of theory maturation requires that we use the
lessons learned from trying to implement experimental tests in order to further
refine hypotheses. One such refinement that is often valuable is to “bound” the
hypothesis by describing the conditions under which it is more or less likely to
occur.
Failure to reconcile the hypothesis that diversity enhances productivity with
other ideas about how these two variables relate represents another area in which
refinements need to be made. Some of the initial resistance to the hypothesis
that diversity enhances productivity comes from evidence that high produc-
tivity inhibits diversity for herbaceous plants. So, in experiments involving
herbaceous plants that have examined the effect of diversity on productivity
(the majority of studies thus far), how is it that the negative feedback from
enhanced productivity on diversity is ignored? Reconciliation of the opposing
processes relating diversity and productivity will ultimately need to be consid-
ered if we are to draw conclusions about the effects of diversity on productivity
that are meaningful in the real world.
Another area where further refinement is needed deals with the spatial
scale at which the relationship between diversity and productivity is examined.
We may imagine (and even predict) that for a natural landscape, having a large
pool of species in that landscape will enhance the ability of plants to colonize
disturbed sites rapidly, occupy stressful areas more completely, and respond
to environmental changes more quickly. However, this is not the question that
has been addressed up to this point, at least not at the scale of experimental
evaluation (Symstad et al. 2003). Whenever there is a disconnect between the
idea being tested, that the number of species existing in a small plot affects
the productivity of that plot, and the broad appeal of the idea that diversity is
important in maintaining a productive landscape, further theory maturation is
needed.
Finally, another factor that is playing a role in the debate over the relation-
ship between diversity and ecosystem function is the relevance of this idea to
conservation efforts (Grime 2002). There has been a great deal of emphasis
placed by some ecologists on the prospect that an enhancement of productivity
by diversity represents a justification for the preservation of diversity (Naeem
2002). Many ecologists, myself included, think we should be careful in this
arena. Does this mean that if a species is rare and has no measurable effect on
ecosystem function, as certainly will be true for many rare species, that it has no
value? Traditionally ecologists have argued that diversity is important because
it represents successful evolutionary experiments, and the loss of diversity is a
loss not only of variety itself, but also of genetic information. It is important
How can SEM contribute to scientific advancement? 307
that we should not allow a desire to justify the importance of diversity to society
to be the supporting basis for keeping a scientific hypothesis in an immature
form.
Overall, the current controversy about how species diversity influences habi-
tat productivity represents one stage in the development of an important ques-
tion. Compared to most previous ecological debates, I am encouraged by the
degree to which ideas, definitions, and experimental tests have been refined.
Much of this has been caused by the intense scrutiny the work on this topic has
faced. However, there is much to do. Ecological questions of this magnitude
require a long-term sustained effort to evaluate. Furthermore, I believe that an
explicitly multivariate approach could help in a number of ways. There is still
little effort aimed at integrating any positive effects of diversity on productiv-
ity with models that explain how productivity and other factors can control
diversity.
310 Structural equation modeling and natural systems
we have seen how the introduction of a statistical concept such as a latent vari-
able can imply new elements of our research paradigm, leading to new lines
of inquiry. A significant number of statistical inventions have been described
in this book and their role in permitting different types of models to be esti-
mated has been illustrated. In Chapter 7, a number of more advanced statistical
techniques were referenced, including multigroup analysis, categorical variable
modeling, the treatment of nonlinear relationships, latent growth models, hier-
archical methods such as multi-level modeling, and the modeling of reciprocal
interactions. All of these open up new questions that can be addressed by the
researcher.
Currently, progress is being made on many fronts related to conventional
SEM. Recent compilations of advancements can be found in Cudeck et al.
(2001), Marcoulides and Schumacker (2001), and Pugesek et al. (2003). These
address a wide range of topics from nonlinear modeling to alternative estima-
tion techniques, to categorical variable modeling. The reader should be aware,
however, of parallel methodological advances taking place outside of the tradi-
tional domain of SEM that have the potential to greatly alter the way we work
with multivariate hypotheses. The broad enterprise of modeling multivariate
relations is currently undergoing a staggering rate of growth, with a wide vari-
ety of approaches being developed and promoted under the umbrella concept
of “graphical models” (e.g., Borgelt and Kruse 2002). These efforts have been
driven by a wide variety of objectives, from the desire to develop artificial
intelligence to the wish to mine vast quantities of data. These methods are also
highly dependent upon the capacity for high-speed computers as well as new
algorithms for solving multiequational systems and searching iteratively for
solutions. Borgelt and Kruse (2002) provide the following list of approaches to
the search for relationships in multivariate data:
• classical statistics
• decision/classification and regression trees
• naive Bayes classifiers
• probabilistic networks (including Bayesian and Markov networks)
• artificial neural networks
• neuro-fuzzy rule induction
• k-nearest neighbor/case-based reasoning
• inductive logic programming
• association rules
• hierarchical and probabilistic cluster analysis
• fuzzy cluster analysis
• conceptual clustering
Frontiers in the application of SEM 311
Consideration of many of these approaches is beyond the scope of this book. The
reader can find out more about these and other methods in Dasarathy (1990),
Bezdek and Pal (1992), Langley et al. (1992), Muggleton (1992), Pearl (1992),
Agrawal et al. (1993), and Anderson (1995). Shipley (2000) also provides an
overview of some of these methods in chapter 8 of his book.
It would seem that one important external influence impinging on classical,
maximum likelihood SEM is Bayesian reasoning. Bayesian thinking is lead-
ing to distinctively different methods for estimating and interpreting statistical
models. It is also leading to novel approaches to evaluating networks of
relationships. Emerging from the growth in interest in, and applications of,
Bayesian statistics is the recognition of a new future for SEM: Bayesian
structural equation modeling (Jedidi and Ansari 2001). Thus, it is valuable, I
think, to describe
briefly what these new ideas might mean for SEM.
probability (or joint probability) distributions. Having the precise shape of the
probability distribution yields inherently superior information, at least in theory.
In practice, there are a number of things that can interfere with reaping the
potential benefits of Bayesian estimation, especially problems in estimation.
Nevertheless, available information based on comparisons and considerations
of Bayesian versus frequentist (typically maximum likelihood) estimates yield
the following points (all of which should be considered preliminary at the
present time).
(1) With large samples that conform well to distributional assumptions and
in the absence of informative priors, Bayesian and frequentist (maximum
likelihood) estimates produce similar results (Scheines et al. 1999).
(2) Bayesian estimates of centrality and precision measures are substantially
superior to maximum likelihood estimates when the data deviate a great
deal from the associated distributional assumptions.
(3) Bayesian estimation is more suited to small sample data because maximum
likelihood depends on asymptotic behavior.
(4) The choice of priors has a greater impact on small samples in Bayesian
estimation.
(5) Bayesian methods provide more intuitively useful statements about prob-
abilities, particularly in the context of decision making.
(6) Bayesian estimation, when good priors exist, can overcome some problems
of model underidentification, enhancing the range of models that can be
solved.
(7) Bayesian modeling holds promise for facilitating the modeling of non-
linear and complex model structures.
(8) Bayesian estimation may be inherently better suited for modeling cate-
gorical outcomes, because it is based on conditional probabilities.
(9) Bayesian estimation for structural equation models with multi-indicator
latent variables is potentially problematic, because the lack of a simultane-
ous solution procedure can yield inconsistent estimates (Congdon 2003,
chapter 8).
(10) It is unclear whether methods for evaluating model fit under Bayesian
estimation are as reliable as, or superior to, those achieved under maximum
likelihood.
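Points (1) and (4) can be illustrated with the simplest conjugate case, estimating a normal mean with known noise variance. This is a minimal sketch in Python; the numbers are arbitrary and the example is not tied to any SEM software.

```python
import numpy as np

rng = np.random.default_rng(0)

def posterior_mean(x, prior_mean, prior_var, noise_var):
    """Conjugate normal-normal update for an unknown mean with known noise variance."""
    precision = 1.0 / prior_var + len(x) / noise_var
    return (prior_mean / prior_var + x.sum() / noise_var) / precision

true_mean, noise_var = 5.0, 4.0
prior_mean, prior_var = 0.0, 1.0   # an informative prior centered away from the truth

small = rng.normal(true_mean, np.sqrt(noise_var), size=5)
large = rng.normal(true_mean, np.sqrt(noise_var), size=5000)

# The ML estimate is simply the sample mean; the Bayesian estimate is shrunk
# toward the prior, strongly for n = 5 and negligibly for n = 5000.
for x in (small, large):
    print(len(x), round(x.mean(), 3),
          round(posterior_mean(x, prior_mean, prior_var, noise_var), 3))
```

With a large sample the posterior mean nearly coincides with the maximum likelihood estimate (point 1); with a small sample the choice of prior visibly pulls the estimate (point 4).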
Overall, it would seem that Bayesian approaches hold promise for general
application to SEM (Rupp et al. 2004). That said, maximum likelihood may be
comparable for a significant range of applications. Current SEM software has
now automated a very wide range of model types, including latent class anal-
ysis, mixture growth modeling, incorporation of interactions in latent variable
models, and hierarchical models. Finally, the potential pitfalls for Bayesian
estimation in complex models have not been adequately researched as of yet.
Initial indications are that particular problems arise in latent variable models
involving multiple indicators per latent variable when Bayesian estimation is
used. Since this is a particular strength of ML SEM procedures and represents a
very common model type, further work is needed to ascertain whether Bayesian
methods will be applicable to the full range of model types.
Bayesian networks
What we might refer to as “Bayesian SEM” goes beyond considerations of esti-
mation and can be viewed as falling under the umbrella of “Bayesian networks”
(Pearl 2000, Jensen 2001, Neapolitan 2004). This is a topic currently experi-
encing explosive growth and interest. Thus, any attempt to generalize about it is
subject to significant misrepresentation. That said, it would seem that the field
of Bayesian networks includes interest in both causal as well as descriptive and
predictive modeling, and is thus not synonymous with Bayesian SEM. Never-
theless, there is a great deal of applicability of Bayesian network (BN) methods
to SEM problems. This applicability relates not just to Bayesian estimation,
as described above, but also to a Bayesian perspective on conditional proba-
bility that can be used to evaluate network structure. I am reluctant to delve
too deeply into the specifics of BN methods in this book, because the termi-
nology associated with networks is both completely different from that used in
conventional SEM, and somewhat involved. Bayesian network methods make
reference to “directed acyclic graphs” (DAGs) instead of “recursive models”,
“nodes” instead of “observed variables”, “edges” instead of “paths”, and so on.
Therefore, I refer the reader to Pearl (2000), Shipley (2000), and Neapolitan
(2004) for a description of this terminology, as well as the topic of “d-separation”
which deals with how models can be evaluated for conditional independence.
some degree, each type of modeling has a particular kind of utility. Conceptual
models are capable of great generality and flexibility. They can, in some cases,
imply a great deal of content without requiring specification of explicit details.
They may also serve as logical starting points for exploring topics or organiz-
ing our knowledge. Analytical mathematical models, in contrast, seek to make
mechanistic assumptions and their consequences explicit. They are often used
to explore the interactions between a small number of variables. Such models
can have very general implications, although the business of determining their
relevance to real ecosystems is often left to the field biologist. System simulation
models have come to be the most common type of mechanistic model. Motives
for their development and usage range from organizational to exploratory to
predictive. Their size and complexity also vary greatly. System simulations can
grade from those exploring the dynamics of a single process to those involving
hundreds of equations. Traditional statistical models also have a long history of
application in ecology. Experimental designs and analyses, sampling programs,
and hypothesis evaluations of all sorts rely on statistical models. Such models
can be, and often are, used for static or dynamic predictive modeling.
[Figure 13.1 (diagram): conceptual model relating habitat heterogeneity, the species pool, recruitment, and extinction to species richness, and environmental conditions, disturbance, growth, and biomass loss to plant abundance.]
models only include system variables that can be measured or estimated from
data, and (3) the measurement portion of structural equation models explicitly
defines how the concepts will be quantified, and therefore, their explicit mean-
ing. For these reasons, it is expected that structural equation models will come to
play a role in the conveyance of ecological theory that extends beyond their role
as a specific statistical hypothesis. The construct model, which represents the
statistical expectations in a more general way, without concern for how
the constructs are to be measured, can play a special role in transitioning from
the general to the specific. An illustration of this transitioning was presented in
Chapter 10 (compare Figures 10.4, 10.6, and 10.7).
that is consistent with field and experimental data. According to this conceptual
model, species richness is controlled by the size of the species pool, species
recruitment into a site, and local extinction, all of which can be affected by
habitat heterogeneity. At the same time, community biomass is controlled by
growth and loss processes and, in turn, can suppress recruitment and increase
extinction at elevated levels. Finally, this model presumes that environmental
conditions can have differential effects on species pools and plant growth. Thus,
an environmental condition can be evolutionarily stressful (i.e., few species can
live there) and yet not appear to be ecologically stressful (i.e., the species that
can tolerate those conditions are quite productive). Collectively, this conceptual
model differs from any of the pre-existing models of this subject by being both
multivariate and relatable to the statistical expectations of field data. I believe
it also has many of the desirable properties described in Chapter 12, such as
being able to accommodate new information as it is found, operational, reason-
ably general, and predictive. Current work is seeking to incorporate additional
processes that are now believed to be important, and to include more explicit
information about the relative importance of different pathways.
Statistical forecasting
The linkage of statistical models to forecasting is simple and direct. Regres-
sion relationships have historically been used to estimate coefficients that can be
applied to forecast future outcomes (Pankratz 1991). Here I use the term forecast
to emphasize that projected outcomes are conditional on a number of assump-
tions that may or may not hold. It is reasonable to expect that estimation of
predictive relationships can often be improved upon using modern multivariate
methods. There are two basic characteristics of structural equation and related
models that make this so. First, multivariate models partition net effects into
separate pathways, allowing for greater accuracy in conditional predictions.
Secondly, the use of latent variables allows for the removal of measurement
error and the production of more accurate path coefficients.
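The partitioning of net effects can be sketched numerically: for a model with paths x → y1 → y2 plus a direct path x → y2, the total effect of x on y2 is the direct coefficient plus the product of the coefficients along the indirect route. The coefficients below are hypothetical, not drawn from any analysis in this book.

```python
# Hypothetical standardized coefficients; none of these come from a real analysis.
p_x_y1 = 0.5    # x -> y1
p_y1_y2 = 0.6   # y1 -> y2
p_x_y2 = 0.2    # direct path x -> y2

indirect = p_x_y1 * p_y1_y2       # effect transmitted through y1
total = p_x_y2 + indirect         # net (total) effect of x on y2
print(round(indirect, 3), round(total, 3))   # 0.3 0.5
```

Because the pathways are separated, a forecast can condition on the mediator: if y1 is held fixed, only the direct component (0.2) applies, whereas the net regression of y2 on x alone would blend the two.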
The value of isolating effects using SEM can be illustrated by referring
back to the example of atrazine effects on ponds by Johnson et al. (1991;
Figure 9.1). When atrazine, a phytotoxin, was added to replicate mesocosm
ponds, the responses of algal populations were variable and did not consistently
differ from control ponds overall. Use of this net outcome as the basis for pre-
dicting risk would be of limited generality or utility, as it was found that other
factors, including aquatic vegetation, grass carp, pond depth, and zooplank-
ton densities all affected algal populations. Once the effects of other factors
were considered using SEM, the direct negative effect of atrazine on algae was
found to be quite strong. Furthermore, the dissection of effects using a model
[Figure 13.2 (diagram): structure of the system simulation model, with submodels for richness (species pool, recruitment, germination, exclusion, and species loss rates), live mass (growth, biomass, and biomass loss), litter (addition and decomposition rates), and environmental factors (including seasons and maximum rates such as gromax, recruitmax, splossmax, and biomassmax).]
System simulation
I believe there can be an important complementarity between system simula-
tion and SEM. System simulation has tremendous flexibility and the capacity
to incorporate a nearly unlimited amount of theoretical detail. A considerable
variety of kinds of mechanism can be included, and these can readily be parti-
tioned to incorporate spatial heterogeneity and scalability. The primary limita-
tion with system simulation models is that they generally project or extrapolate
from what is already known. Thus, like other theoretical mathematical models,
the potential for informative feedback is limited. Discrepancies between pre-
dicted and observed can certainly be detected and analyzed. However, for many
models, the number of estimated parameters that might be involved can mask
[Figure 13.3 (graphs): panels A and B, each plotting richness (0–50) against biomass (0–1200 g m−2).]
Figure 13.3. A. Relationship between total biomass and richness expected along
a resource gradient based on simulation results (Grace 2001). B. Relationships
expected along a nonresource gradient. Field results are generally more consistent
with case B than with case A, and emphasize the importance of nonresource stresses
on communities.
the reasons for observed discrepancies. Because of their capacity for directed
discovery, structural equation models can not only reveal that data do not fit a
model overall, but can also show precisely where discrepancies occur. Through
this capability, SEM has the ability to assist in the construction and evaluation
of simulation models.
On the other side of the process, it is possible to use the results of system
simulations to generate multivariate theoretical expectations that can be evalu-
ated using subsequent SEM analyses. Such an approach was used to explore in
more detail the explicit interworkings of the various processes in Figure 13.1
(Grace 2001). To accomplish this, I created a system simulation model (shown
in Figure 13.2) designed to capture some of the processes implied by the
[Diagram: two path models (panels A and B) relating ENV, BIOM, and RICH.]
Conclusions
Structural equation modeling will not always be appropriate or useful in every
study. Furthermore, it will not solve all our problems and can yield results no
better than the data upon which analyses are performed. Yet, it would seem that
it has the ability to allow us to tune in to nature’s symphony to a greater degree
than ever before. Only time and experience will tell whether these methods will
revolutionize science. It would seem from the current vantage point in history
that they have great potential. It is my hope that biologists will be open to the
process of exploring their utility in helping us to understand natural systems
and their dynamics.
Appendix I
Example analyses
Purpose
The purpose of this appendix is to give the reader a little experience with the process
of structural equation modeling. I should make the disclaimer that my treatment of the
illustrations in this appendix is admittedly superficial. I expect that most readers will
forgive this transgression, although some methodological experts may wish that I went
into far more detail here rather than assuming that the reader understood all the more
subtle nuances from the main chapters of the book. It is beyond the scope of an appendix
such as this to give a complete exposition on model evaluation. It is also beyond our scope
to give even a superficial overview of the variety of models that can be analyzed using
SEM. Our task is further complicated by the significant number of competing software
packages, their various features, and the significant issue of how they arrive at the results
they yield. Somehow, in spite of all this, it seems useful to provide some illustrations
of example analyses, allowing the reader to see some of the steps in the process, some
sample output, some of the pitfalls and how they may be avoided or resolved.
It seems that the wisest thing to do in this appendix is to first give an overview of
some of the various resources that are available to help the reader towards becoming a
competent SEM analyst. I will assume, rightly or wrongly, that the material and advice
given in the text will be sufficient for the individual to understand at least the basic
analyses. Chapter 11 provides a brief summary of some of the main things to avoid and
to attempt to accomplish when performing SEM. So, if we assume that one knows what
one wants to do, this appendix serves to provide some examples of analyses, including
how to translate questions into models and how to interpret output and create revised
models. Additional supporting material can be found at www.jbgrace.com.
There are a number of commercial software programs currently available for per-
forming SEM. The first widely used program of this sort was LISREL, which as of this
writing is in its eighth generation. Over the years, the authors (Karl Jöreskog and Dag
Sörbom) have added a number of modules and features to LISREL, as they continue to
do. The original programming language for LISREL involved a rather cumbersome spec-
ification of matrix elements. Since that time, a more intuitive programming language,
SIMPLIS, has been added and most recently, a graphical programming capability has
been introduced. A number of other programs are also available for covariance analyses,
including EQS, AMOS, CALIS (a module of SAS), SEPATH (a module of Statistica),
and Mplus. New users of SEM inevitably wish to know which of these programs is best.
One point in this regard is that most of these programs are updated fairly frequently,
making any description I might give of the limitations of a program potentially out of
date by the time this material is read. For this reason, I suggest readers perform internet
searches for the programs that interest them and go to the software developer’s web
site to find out the latest features and associated costs. Also, free student editions of the
software are available for download for many of these programs and many of the exam-
ples that follow can be run using the free versions of the software. Finally, published
reviews can often be found in the journal Structural Equation Modeling, as well as in
other outlets.
Given the above caveats, I will go on to say that as of this writing, AMOS, EQS, and
LISREL all possess graphical programming capabilities. It is generally true that most
new users would prefer to begin with a graphical programming environment, so these
all make good choices. On the other hand, I think when one is aware of how simple the
command-based programming is for programs such as Mplus, CALIS, and the SIMPLIS
module of LISREL, there is no need to automatically pick a graphical-based package.
More important to the selection process in my opinion is the availability of features.
When it comes to features, most of the main programs being used today possess some
unique features, which frustrates our ability to choose one program that is “the best”
at all things. Again, this can change at any time, so the reader may wish to compare
features at the time they read this. At the moment, I will note my enthusiasm for Mplus,
which is the program I use most often, for having the most features. Other programs
such as LISREL and EQS are also feature rich, while the program AMOS has some very
friendly bootstrap features. Again, let the buyer beware: all these comments will
soon be out of date. Also, some programs are considerably more expensive than others,
which can influence choices.
Example applications
In this appendix I offer just a few simple examples to illustrate some of the kinds
of models that can be run and the type of output produced. First I set the stage and
then present a graphical representation of the model or models of interest. One or
more hypothetical data sets are presented for the purpose of seeing if the data fit the
expectations of any of the models. Note that the data will typically be summarized as the
correlations plus the standard deviations. This information is sufficient for calculating
the covariances, which will be the raw material for our analyses in most cases. Next,
output from the analyses is presented, along with a brief discussion. Those interested can
download the student version of Mplus from their web site to run these and other models
[Figure A.1 (diagram): exogenous variables x1 and x2, with paths to responses y1 and y2 (errors e1 and e2); candidate direct paths to be tested are marked with a question mark.]
Figure A.1. Simple path model representing a set of possible relations between
exogenous and response variables.
if they like. The Mplus program code that accompanies the example analyses presented
in this appendix can be downloaded from www.JamesBGrace.com.
One additional comment about the examples to follow is warranted. In these examples
I tend to discuss the estimation of the “effects” of one variable on another. It is important
to always keep in mind that the SEM analysis does not establish causality, but that the
ability to interpret relationships as effects or influences depends on outside information.
Of course, when such outside information is lacking, the whole enterprise of SEM is
somewhat compromised. For that reason, I make the assumption that there is reason
to justify the structure of the models presented. For those working with real examples,
sometimes thought experiments can help to establish or justify model structure. For
example, for an x → y relationship, we can ask, if we were to manipulate x, would we
expect y to respond? We may also ask, if we were to manipulate y would x respond? If
the reasonable answers to those two questions are yes and no, respectively, a basis for
establishing directional influence is established, although the degree of confidence in
that conclusion is influenced by many things.
Table A.1. Correlations and standard deviations

            x1        x2       y1       y2
x1          1.0
x2          0.200     1.0
y1          0.387     0.576    1.0
y2          0.679     0.356    0.766    1.0
Std. dev.   1121.6    1.79     278.9    0.0856
Table A.2. Variances and covariances

       x1           x2        y1        y2
x1     1 257 986
x2     401.53       3.204
y1     121 059      287.56    77 785
y2     65.19        0.0546    18.287    0.00733
in the plot. In a situation such as this, one might naturally ask whether variations in soil
conditions relate to the frequency with which plants flower, and if so, whether this can
be explained by the fact that plants are larger under certain soil conditions and larger
plants are more likely to flower.
Another example that would fit this situation is where we are interested in variations
in population densities of an herbivorous insect as our y2 variable, and we wish to relate
population densities to the abundance of its preferred food plant, y1 , when there are two
habitat factors x1 and x2 thought to influence plant growing conditions. We might wonder
if there is something about either of the habitat factors that might have some additional
influences on insect populations, independent of influences mediated by influences on
its food plant. These additional influences would show up as significant direct paths
from the xs to y2 .
Data
Illustration 1.1
Here I wish to convey a simple but critical lesson about running models. To accomplish
this, we will first convert the data in Table A.1 into its raw form, a variance and covariance
matrix. Typically when one analyzes data using any of the SEM software packages, the
raw data will be examined and for many (though not all) analyses, a variance–covariance
matrix will be used to perform the analyses. Let us see what our raw data will look like.
We can make the calculations directly. The variances that will go into the diagonal will
simply be the squares of the standard deviations from Table A.1. The covariances can
be calculated by multiplying the correlations by the product of the standard deviations
of the two variables involved. The logic for this can be seen by revising Eq. (6) in
Chapter 3. The results are presented in Table A.2.
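As a check on the hand calculation, the conversion can be scripted. A minimal sketch in Python (numpy) using the Table A.1 values: the covariance matrix is S = D R D, where R is the correlation matrix and D is the diagonal matrix of standard deviations.

```python
import numpy as np

# Correlations and standard deviations from Table A.1
R = np.array([
    [1.0,   0.200, 0.387, 0.679],
    [0.200, 1.0,   0.576, 0.356],
    [0.387, 0.576, 1.0,   0.766],
    [0.679, 0.356, 0.766, 1.0],
])
sd = np.array([1121.6, 1.79, 278.9, 0.0856])

# cov(i, j) = r(i, j) * sd(i) * sd(j); in matrix form, S = D R D with D = diag(sd)
D = np.diag(sd)
S = D @ R @ D
print(np.round(S, 2))
```

The diagonal entries reproduce the squared standard deviations (e.g., 1121.6² ≈ 1 257 986 for x1), and the off-diagonal entries match the covariances in Table A.2.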
Table A.3. Correlations and recoded standard deviations

            x1        x2      y1      y2
x1          1.0
x2          0.200     1.0
y1          0.387     0.576   1.0
y2          0.679     0.356   0.766   1.0
Std. dev.   1.1216    1.79    2.789   0.856
Our goal now will be to use the data in Table A.2 to evaluate the model in Figure A.1
that omits the paths with question marks. The output from that analysis is as follows:
Illustration 1.2
The goal now is to use our recoded data to evaluate the model in Figure A.1. We again
start with the “indirect effects” model, which omits the paths with question marks. The
results obtained from that analysis are presented below. Note that the form of the results,
including the types of indices presented and the form of the output vary depending
on software package. The Mplus package presents a fairly abbreviated form of output
compared with some others, such as LISREL.
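For readers without SEM software at hand, the unstandardized coefficients for the y1 equation (y1 regressed on x1 and x2) can be recovered directly from the Table A.3 covariances via the normal equations. This is an illustrative shortcut in Python (numpy), not a substitute for the full ML analysis, since it provides no fit statistics.

```python
import numpy as np

# Correlations and standard deviations from Table A.3
R = np.array([
    [1.0,   0.200, 0.387, 0.679],
    [0.200, 1.0,   0.576, 0.356],
    [0.387, 0.576, 1.0,   0.766],
    [0.679, 0.356, 0.766, 1.0],
])
sd = np.array([1.1216, 1.79, 2.789, 0.856])
S = np.diag(sd) @ R @ np.diag(sd)   # variance-covariance matrix

# Unstandardized coefficients for the y1 equation (y1 ~ x1 + x2) from the
# normal equations, using only the covariance matrix: b = Sxx^{-1} Sxy
Sxx = S[:2, :2]      # covariances among x1 and x2
Sxy = S[:2, 2]       # covariances of x1 and x2 with y1
b = np.linalg.solve(Sxx, Sxy)
print(np.round(b, 3))   # [0.704 0.809]
```

These values should closely match the unstandardized estimates for the y1 equation reported for the accepted model in Figure A.2 (small differences in residual variances can arise from ML versus least-squares scaling).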
Illustration 1.3
Our next move will be to run an alternative model, one that includes a path from x1 to
y2 (refer back to Figure A.1).
Observed (sample) covariances

       Y1      Y2      X1      X2
Y1     7.779
Y2     1.829   0.733
X1     1.211   0.652   1.258
X2     2.876   0.545   0.402   3.204

Model-implied (estimated) covariances

       Y1      Y2      X1      X2
Y1     7.701
Y2     1.810   0.725
X1     1.198   0.645   1.245
X2     2.847   0.654   0.398   3.172

Residuals (observed minus model-implied)

       Y1      Y2      X1      X2
Y1     0.078
Y2     0.019   0.008
X1     0.013   0.007   0.013
X2     0.029   −0.109  0.004   0.032
Standardized Residuals
Y1 Y2 X1 X2
Y1 0.000
Y2 0.000 0.000
X1 0.000 0.000 0.000
X2 0.000 −0.076 0.000 0.000
MODIFICATION INDEX FOR Y2 ON X2 = 3.486
[Figure A.2 (diagram): unstandardized estimates: x1 → y2 = 0.34; x1 → y1 = 0.70; x2 → y1 = 0.81; y1 → y2 = 0.18; cov(x1, x2) = 0.40; residual variances: y1 = 4.553, y2 = 0.175.]
Figure A.2. Unstandardized parameter estimates for accepted model.
Residual Variances

       Estimate   S.E.    Est./S.E.   Std.
Y1     4.553      0.644   7.071       0.591
Y2     0.175      0.025   7.071       0.241
[Figure A.3 (diagram): standardized estimates: x1 → y2 = 0.45; x1 → y1 = 0.28; x2 → y1 = 0.52; y1 → y2 = 0.59; r(x1, x2) = 0.20; standardized residual variances: y1 = 0.59, y2 = 0.24.]
Figure A.3. Standardized parameter estimates for accepted model.
coefficients. This can be remedied by selecting relevant ranges for the variables and
using those to standardize the raw parameters, providing a means of standardization
that does not suffer from reliance on sample variances. Again, see the latter part of
Chapter 3 for a discussion of this option.
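Relevant-range standardization amounts to multiplying the raw coefficient by the ratio of the chosen ranges. A one-line sketch; the coefficient and ranges below are assumed values for illustration, not taken from the example.

```python
# Hypothetical relevant-range standardization (see Chapter 3): scale an
# unstandardized coefficient by the ratio of the chosen variable ranges
# rather than by sample standard deviations.
b_raw = 0.18                       # y1 -> y2, unstandardized (illustrative)
range_y1, range_y2 = 12.0, 4.0     # assumed "relevant ranges", not from the text

b_range_std = b_raw * range_y1 / range_y2
print(round(b_range_std, 2))   # 0.54
```

Because the ranges are fixed by the researcher rather than estimated from each sample, such coefficients remain comparable across groups whose sample variances differ.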
The ultimate interpretations the researcher makes concerning the results depend
on many things, including the system being studied, the associated theoretical knowl-
edge and prior experience, the questions being asked, and characteristics of the sample.
Hopefully the main chapters in this book, along with illustrative publications using these
methods, provide sufficient instruction on how to draw careful and meaningful interpre-
tations of results such as these. Basically, my advice is to rely on fundamental statistical
principles rather than someone else’s protocol.
[Figure A.4 (diagram): the path model of Figure A.1 replicated in two groups: Sample A (x1a, x2a → y1a, y2a; errors e1a, e2a) and Sample B (x1b, x2b → y1b, y2b; errors e1b, e2b).]
Figure A.4. Multigroup model of the form obtained in Example 1.
plants of any given size. All these questions can be addressed using the results from a
multigroup analysis.
It is not necessary for our comparison to involve an experimental treatment, of course.
We might again consider a case where we are interested in how herbivorous insects
associate with a particular food plant in the face of covarying habitat factors. This time
we ask whether the species initially studied has the same responses and relationships
to habitat as a second, closely related species also found in the same sample. Here, our
two groups would be independent estimates of each insect species and their relations
to plant density and habitat conditions. In such a situation, our questions may have to
do with differential efficiency of converting food into individuals, differential habitat
preferences, or mortality rates for the two species.
Data
Let us imagine that we have two samples now, our original sample (from Table A.3)
and a second sample. These are shown together in Table A.4. Keep in mind that, while
I show the correlations, the models will be assessed using the covariances.
Table A.4. Correlations, means, and standard deviations for two groups

Group 1
            x1        x2      y1        y2
x1          1.0
x2          0.200     1.0
y1          0.387     0.576   1.0
y2          0.679     0.356   0.766     1.0
Mean        6.22      0.756   100.75    10.25
Std. dev.   1.1216    1.79    2.789     0.856

Group 2
            x1        x2      y1        y2
x1          1.0
x2          0.212     1.0
y1          0.501     0.550   1.0
y2          0.399     0.285   0.389     1.0
Mean        6.01      0.998   95.56     21.55
Std. dev.   1.32      1.66    2.505     1.701
the variances of the variables, and the means for the variables. There is a particular
sequence of evaluations usually recommended for determining whether parameters are
equal across groups. Stated briefly, our strategy is to begin by allowing all parameters to
be uniquely estimated for each group. It is a basic requirement that the models be of the
same form, which is something that is tested by allowing all parameters to be unique,
and then determining whether model fit is adequate for each model separately. In this
case, our data do fit a common model.
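The logic of these equality tests is a chi-square difference test: constrain a parameter to be equal across groups and ask whether model chi-square rises significantly. A sketch of the arithmetic in Python; the chi-square and df values are hypothetical, not those underlying Table A.5, and the survival function is coded only for the one-degree-of-freedom case.

```python
import math

def chi2_sf_1df(x):
    """P(X > x) for a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical fit statistics for a single cross-group equality constraint
chisq_free = 4.20          # all parameters unique in each group (df = 2)
chisq_constrained = 4.95   # one path constrained equal across groups (df = 3)

delta = chisq_constrained - chisq_free   # chi-square difference, 1 df
p = chi2_sf_1df(delta)
print(round(delta, 2), round(p, 3))
```

Here the difference (0.75, p ≈ 0.39) is nonsignificant, so the parameter would be judged indistinguishable across groups; a significant increase would indicate it differs.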
Table A.5. Summary of tests for equality of parameters across groups. The base
model is one where all parameters are unique for each group. As parameters
are constrained to be equal across groups, we determine whether model chi-
square increases significantly; if so, the parameter is not equal across groups.
If chi-square does not increase significantly, parameters are judged to be
indistinguishable across groups.
deviation of y2 in group 2 is 1.701. Finally, two of the means were found to be equal
across groups, those for x1 and x2 . When the means for y1 and y2 were constrained to
be equal across groups, model chi-square was greatly inflated and overall model fit was
poor.
[Figure A.5 (diagram): path models for Sample A and Sample B with coefficient pairs (unstandardized/standardized). Sample A: 0.34/0.48, 0.18/0.23, 0.74/0.34, 0.42/0.20, 0.18/0.55, 0.76/0.50. Sample B: 0.34/0.25, 2.27/0.80, 0.74/0.34, 0.42/0.20, 0.18/0.28, 0.76/0.50.]
Figure A.5. Results from multigroup analysis. Coefficients presented are the
unstandardized values followed by the standardized values. The parameters that
were found to be significantly different among groups are summarized in Table
A.5 and only included the variance of y2 and the means of y1 and y2 .
that all comparisons are automatically incorrect. It is only when variances differ among
groups that we will have a problem. In fact, since the declaration of significant differences
among groups is based on tests of the unstandardized parameters, we know that the
standardized path coefficients in Figure A.5, even those that appear to be different, do
not represent significant differences. If those paths did differ significantly and there were
also differences in variances among groups, then the standardized coefficients could lead
to some incorrect inferences. Again, the solution mentioned at the end of Chapter 3 to
standardize based on relevant ranges would solve the problem, and if applied to this
example, we would see that all such standardized path coefficients would be equal
across groups.
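The relevant-range standardization just mentioned can be sketched as follows. The slope and standard deviations are taken from this example; the relevant ranges are hypothetical, chosen only to show that when the same range is applied to every group, equal unstandardized slopes remain equal after scaling:

```python
def sd_standardize(b, sd_x, sd_y):
    # conventional standardized coefficient: depends on group variances
    return b * sd_x / sd_y

def range_standardize(b, range_x, range_y):
    # relevant-range standardization: uses substantively chosen ranges
    # common to all groups, so equal slopes stay equal after scaling
    return b * range_x / range_y

b = 0.76                                     # same slope in both groups
sd1, sd2 = (2.789, 0.856), (2.505, 1.701)    # (sd_x, sd_y) per group
print(sd_standardize(b, *sd1), sd_standardize(b, *sd2))   # unequal

relevant_range = (12.0, 8.0)                 # hypothetical shared ranges
print(range_standardize(b, *relevant_range))  # identical in both groups
```

The SD-based values diverge only because the groups differ in spread, not because the process differs.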
One other point made by this example illustrates why statisticians are not
enamored with judging a model by its R2 values. The R2 for y2 in group 1 is 0.769,
while the R2 for y2 in group 2 is only 0.204. Would we judge the model for group 2 to
be inferior to the one for group 1? We should not, because the path coefficient estimates
are the same for both. It may be that we actually obtained the correct path coefficients in
group 2 in the face of more noise, which could be construed as a major success. There
is a natural attraction for researchers to focus on explained variance in the response
variables. As we see here, it is not one that should be accepted too readily, as it focuses
on the wrong parameters.
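The R2 point is easy to demonstrate by simulation: generate two groups with the identical slope but different amounts of noise. The slope estimates agree, while R2 diverges sharply (values below are illustrative, not the book's data):

```python
import numpy as np

gen = np.random.default_rng(0)
n, b = 5000, 0.5

x = gen.normal(size=n)
y_quiet = b * x + gen.normal(scale=0.3, size=n)   # low-noise "group 1"
y_noisy = b * x + gen.normal(scale=1.5, size=n)   # high-noise "group 2"

def slope_and_r2(x, y):
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, r2

s1, r2_1 = slope_and_r2(x, y_quiet)
s2, r2_2 = slope_and_r2(x, y_noisy)
# both slopes are near 0.5, yet R2 is far lower in the noisy group,
# even though the structural parameter is recovered equally well
```

Judging the noisy group's model to be "worse" would confuse unexplained variance with misestimated structure.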
338 Structural equation modeling and natural systems
c x1 x2 y1 y2
c 1.0
x1 0.25 1.0
x2 0.41 0.200 1.0
y1 0.33 0.387 0.576 1.0
y2 0.37 0.679 0.356 0.766 1.0
Mean 0.5 6.22 0.756 100.75 10.25
Std. dev. 0.20 1.1216 1.79 2.789 0.856
Data
For this example, we continue using the data from Example 1, which was utilized as
the data for group 1 in Table A.4. In this case, we imagine that these data were actually
taken from two disjunct sampling locations and combined, and we now wish to control
for this cluster sampling in the analyses. Our general expectation is that if the
clusters differed consistently in some fashion, ignoring them would bias the
parameter estimates observed in Figure A.2.
Results
Here we will not worry about the steps involved in the process of arriving at the final
results, as the reader should now be familiar with the process. Instead, we focus on
Example analyses 339
(Figure A.6 appears here: two-panel path diagram, panels A and B.)
Figure A.6. Analysis results either A. ignoring clustering in the data, or B. adjust-
ing for clustering.
the results obtained. For comparison, we will refer to the results in Figure A.2, which
represent what one would obtain here if we ignored the clustering. When our model was
modified to include the cluster variable, the first thing we found was that model
fit was substantially reduced. Instead of a chi-square of 3.548 with 1 degree of freedom
(p = 0.0596), we now have a chi-square of 7.803 with 2 degrees of freedom (p = 0.0197).
Inclusion of a path from x2 to y2 was indicated by these deviations, which could lead us
to accept a different model from these same data once clustering is taken into account.
The model with the path from x2 to y2 included is found to have a chi-square of 1.226
with 1 degree of freedom (p = 0.2681), indicating good fit.
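The reported p-values can be checked directly against the chi-square distribution (a scipy sketch; small rounding differences from the program output are expected):

```python
from scipy.stats import chi2

# model ignoring clustering: chi-square = 3.548, df = 1
p_ignore = chi2.sf(3.548, df=1)    # ~0.06, marginally acceptable fit

# cluster variable added, same structure: chi-square = 7.803, df = 2
p_cluster = chi2.sf(7.803, df=2)   # ~0.02, fit now rejected

# with the x2 -> y2 path added: chi-square = 1.226, df = 1
p_final = chi2.sf(1.226, df=1)     # ~0.27, good fit
```

The middle result is what signals that ignoring the clustering had concealed a misspecification.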
Results from the analysis including a cluster variable can be seen in Figure A.6.
For simplicity, the paths involving the cluster variable are omitted from the presenta-
tion. The principal change in results compared to when the effects of clustering were
ignored (Figure A.2) is the inclusion of the path from x2 to y2 . Since these are the
unstandardized results, which seem to be the appropriate metric for comparing
non-nested models as we are here, it is a little more challenging to judge the magnitude
of the differences in parameters. We can improve our perspective a little by express-
ing the differences in path coefficients as percentage changes, and they range from
0% up to a maximum of 17% (for the path from x2 to y1 ). The take home message
here is that it is both possible and desirable to control for nonindependencies in the
data.
(Figure A.7 appears here: ξ1 with indicators x1–x3, η1 with indicators y1–y3, and η2 with indicators y4–y6.)
Figure A.7. Latent variable model with multiple indicators. Error terms for indi-
cators are omitted for simplicity of presentation.
Table A.7. Original correlations and standard deviations for the latent variable model. N = 100
x1 x2 x3 y1 y2 y3 y4 y5 y6
x1 1.0
x2 −0.86 1.0
x3 −0.64 0.70 1.0
y1 −0.23 0.25 0.24 1.0
y2 −0.19 0.20 0.19 0.64 1.0
y3 −0.22 0.24 0.23 0.77 0.62 1.0
y4 −0.19 0.20 0.20 0.38 0.31 0.71 1.0
y5 −0.16 0.18 0.17 0.33 0.27 0.62 0.72 1.0
y6 −0.17 0.19 0.18 0.35 0.28 0.66 0.76 0.66 1.0
s.d. 2.9 1.1 0.06 0.9 1.0 2.3 1.6 0.8 1.9
Data
I begin by throwing another curve at the reader, one that represents an easy programming
mistake to make, with fatal consequences. We will then correct the data and proceed
through the model analyses.
Estimates
XI1 BY
X1 1.000
X2 ********
X3 8697.837
These results describe the initial estimates of the loadings for the three indicators on
the latent variable ξ 1 (“XI-one”). The loading for x1 is set at 1.0 automatically by the
program in order to set the scale for the latent variable (the scale for latent variables must
always be specified in some fashion). The results presented indicate that the program
was unable to estimate a loading for the path from ξ 1 to x2 , and that the loading for x3
was very unrealistic. To see what a more normal pattern would look like, here we see
the interim results for η1 .
ETA1 BY
Y1 1.000
Y2 0.894
Y3 2.474
There are a number of things we might try at this point to get started. Since SEM
programs use iterative algorithms, we might specify starting values. We can also increase
the number of iterations attempted. None of these changes will help in this case, however.
Another approach is to set the scale for the latent variable using a different indicator.
When we do this, we achieve the following:
THE MODEL ESTIMATION TERMINATED NORMALLY
If we now examine the estimates for loadings for ξ 1 , we find
XI1 BY
X1 −2.418
X2 1.000
X3 0.041
So, we obtained seemingly reasonable estimates. However, we see that one has a negative
loading (X1) and the others are positive. Programs vary in how well they are able to
handle this situation where indicators are not coded to correlate positively. In general,
this is to be avoided. So, to eliminate this as a potential source of problems, we recode our
original data in Table A.7 so that the indicators X1, X2, and X3 are positively correlated.
Our new matrix of correlations is given in Table A.8.
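As an aside, switching the indicator that sets the latent scale only rescales the loading vector by a constant; nothing substantive changes. For instance, the η1 loadings shown earlier can be re-expressed with Y3 as the reference:

```python
import numpy as np

# interim loadings for eta1 with Y1 as the reference indicator
loadings = np.array([1.000, 0.894, 2.474])   # Y1, Y2, Y3

# fixing Y3 at 1.0 instead just divides through by Y3's loading;
# the ratios among indicators are preserved
rescaled = loadings / loadings[2]
print(np.round(rescaled, 3))
```

This is why changing the reference indicator can rescue estimation without altering the meaning of the measurement model.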
This may be a good time to comment on another problem that frequently arises in
analyses of latent variable models, the so-called Heywood case. A Heywood case refers
to a solution that is inadmissible, in that some parameter estimates have illogical values.
The most common such situation is when one obtains negative error variance estimates.
Sometimes this can be solved by recoding the data or by changing the indicator used to fix
the scale of the latent variable. Another approach is to constrain the estimate to be greater
than or equal to zero (using programming constraints). However, in recalcitrant situations
where this results in unacceptable solutions, one further alternative is to specify a small,
nonzero, positive value for the error variance that keeps coming up with a negative
estimate. This approach invalidates our estimates of loadings and residual variances
for the indicators on that latent variable. However, it at least allows us to have an
admissible solution with which to proceed. It is hoped that the researcher will eventually
identify the true source of the problem and resolve it through a reformulation of the
model.
New data
Further results and discussion – evaluation of the measurement model
As stated above, our evaluation of a hybrid model is usually performed as a two-stage
process, with the first stage being a test of the measurement model. This is
accomplished by modifying the model in Figure A.7 so that the latent variables
intercorrelate freely, allowing us to focus on the adequacy of our measurement
model in a separate analysis. When we run such a model using the data in Table A.8,
we get the following message:
Table A.8. Data were recoded from Table A.7 so that indicators for all latent
variables are positively correlated. N = 100
x1 x2 x3 y1 y2 y3 y4 y5 y6
x1 1.0
x2 0.86 1.0
x3 0.64 0.70 1.0
y1 0.23 0.25 0.24 1.0
y2 0.19 0.20 0.19 0.64 1.0
y3 0.22 0.24 0.23 0.77 0.62 1.0
y4 0.19 0.20 0.20 0.38 0.31 0.71 1.0
y5 0.16 0.18 0.17 0.33 0.27 0.62 0.72 1.0
y6 0.17 0.19 0.18 0.35 0.28 0.66 0.76 0.66 1.0
s.d. 2.9 1.1 0.06 0.9 1.0 2.3 1.6 0.8 1.9
(Figure A.8 appears here: measurement model diagram.)
Figure A.8. Modified measurement model evaluated in stage one of the analysis.
In measurement model evaluations, all latent variables are allowed to intercorre-
late freely, so that the measurement relations are the focus of the analysis. The
modification made was to specify an effect of y3 on η2 .
(Figure A.9 appears here: path diagram with standardized coefficients.)
Figure A.9. Results obtained for modified latent variable model. Errors for indi-
cators are omitted to simplify presentation. Coefficients shown are standardized
values.
to use a two-stage analysis with the composite omitted from the model in the first stage.
This process will be illustrated below.
x1 x2 x3 x4 y1
x1 1.0
x2 0.25 1.0
x3 0.02 0.26 1.0
x4 0.12 0.31 0.25 1.0
y1 0.55 0.47 0.42 0.41 1.0
Std. dev. 1.23 0.66 2.30 1.56 2.09
(Figure A.10 appears here: composite model diagram.)
Figure A.10. Model in which a composite variable is used to represent the col-
lective effects of x1 –x3 on y1 .
Data
Results and discussion of first stage of analysis of composite model
As stated above, the first stage in the analysis is to omit the composite from the model
and evaluate the component processes whose effects will be composited. In this par-
ticular case, we simply have a saturated model equivalent to a multiple regression
(Figure A.11). Since the model is saturated, the fit is perfect, thus, our evaluation is
quantitative – determining the magnitudes of the path coefficients, their significance,
and the variance explained by the model.
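Since the stage-1 model is a saturated multiple regression, the standardized coefficients can be reproduced by solving the normal equations directly from the correlation table above:

```python
import numpy as np

# correlations among x1-x4 and with y1, from the data table above
Rxx = np.array([
    [1.00, 0.25, 0.02, 0.12],
    [0.25, 1.00, 0.26, 0.31],
    [0.02, 0.26, 1.00, 0.25],
    [0.12, 0.31, 0.25, 1.00],
])
rxy = np.array([0.55, 0.47, 0.42, 0.41])

beta = np.linalg.solve(Rxx, rxy)   # standardized partial coefficients
r2 = beta @ rxy                    # variance explained in y1
print(np.round(beta, 2), round(1 - r2, 2))
```

The solved betas match the path coefficients reported for Figure A.11, and 1 − R2 gives the error variance shown for y1.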
(Figure A.11 appears here: regression diagram with standardized coefficients for x1–x4 predicting y1.)
Figure A.11. Results from first stage of analysis of composite model.
(Figure A.12 appears here: composite model results diagram.)
Figure A.12. Results from second stage of analysis of composite model. Coeffi-
cients shown are standardized values.
presentation because we wish to focus on the net effect. One can see that aside from
compositing the effects of x1 –x3 , the other results are the same as those from stage 1 of
the analysis. Now we can consider the collective effect of x1 –x3 , giving us flexibility to
provide results that match with our constructs of theoretical interest.
Conclusions
A facile use of SEM comes from a reliance on fundamental statistical and scientific
principles, not from blindly following conventions. One way (actually the best way) to
develop confidence in one’s analysis is to generate simulated data where you control
what relationships are built into the data. Then you can analyze those data in various
ways to see what you get out. It is extremely instructive if you use conventional univariate
analyses in comparison with multivariate ones to gain a respect for how far off the mark
univariate results can be. You will also find out that ignoring certain features of the
situation in multivariate models (e.g., data clustering) can have either subtle or not so
subtle effects. Several of the SEM programs have the capability to generate simulated
data. It is also possible to simulate data sets using any number of other means. You
will gain both an appreciation for the assumptions that lie behind the analyses and an
appreciation for SEM. Try it, you will be glad you did.