
F-Test Interpretation

How to Interpret the F-test of Overall Significance in Regression Analysis

The F-test of overall significance indicates whether your linear regression model provides a better fit to the data than a model that contains no independent variables. In this post, I look at how the F-test of overall significance fits in with other regression statistics, such as R-squared. R-squared tells you how well your model fits the data, and the F-test is related to it.

An F-test is a type of statistical test that is very flexible. You can use it in a wide variety of settings. F-tests can evaluate multiple model terms simultaneously, which allows them to compare the fits of different linear models. In contrast, t-tests can evaluate just one term at a time.

To calculate the F-test of overall significance, your statistical software just needs to include the proper terms in the two models that it compares. The overall F-test compares the model that you specify to the model with no independent variables. This type of model is also known as an intercept-only model.
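To make this comparison concrete, here is a minimal sketch in Python (the simulated data, variable names, and use of NumPy/SciPy are illustrative assumptions, not part of the original post) that computes the overall F-statistic by contrasting the residual sum of squares of the specified model with that of the intercept-only model.

import numpy as np
from scipy import stats

# Illustrative data: n observations, k predictors (assumed example)
rng = np.random.default_rng(0)
n, k = 50, 2
X = rng.normal(size=(n, k))
y = 1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)

# Full model: intercept plus k predictors, fitted by ordinary least squares
X_full = np.column_stack([np.ones(n), X])
beta, _, _, _ = np.linalg.lstsq(X_full, y, rcond=None)
rss_full = np.sum((y - X_full @ beta) ** 2)

# Intercept-only model: its fitted value is simply the mean of y
rss_reduced = np.sum((y - y.mean()) ** 2)

# Overall F-statistic: improvement per extra parameter over residual variance
f_stat = ((rss_reduced - rss_full) / k) / (rss_full / (n - k - 1))
p_value = stats.f.sf(f_stat, k, n - k - 1)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")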

The F-test for overall significance has the following two hypotheses:

The null hypothesis states that the model with no independent variables fits the data as well as your model.
The alternative hypothesis says that your model fits the data better than the intercept-only model.

In statistical output, you can find the overall F-test in the ANOVA table. An example is below.

[Figure: Analysis of variance table that displays the F-test of overall significance.]

Interpreting the Overall F-test of Significance

Compare the p-value for the F-test to your significance level. If the p-value is less than the significance level, your sample data provide sufficient evidence to conclude that your regression model fits the data better than the model with no independent variables.

This finding is good news because it means that the independent variables in your model improve the fit!

Generally speaking, if none of your independent variables are statistically significant, the overall F-test is also not statistically significant. Occasionally, the tests can produce conflicting results. This disagreement can occur because the F-test of overall significance assesses all of the coefficients jointly, whereas the t-test for each coefficient examines them individually. For example, the overall F-test can find that the coefficients are significant jointly while the t-tests can fail to find significance individually.

These conflicting test results can be hard to understand, but think about it this way. The F-test sums the predictive power of all independent variables and determines that it is unlikely that all of the coefficients equal zero. However, it's possible that each variable isn't predictive enough on its own to be statistically significant. In other words, your sample provides sufficient evidence to conclude that your model is significant, but not enough to conclude that any individual variable is significant.
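A small simulation can make this concrete. The sketch below (an illustrative assumption using statsmodels, not something from the original post) fits a model with two strongly correlated predictors; with data like this, the overall F-test is typically significant even though the individual t-tests often are not.

import numpy as np
import statsmodels.api as sm

# Two nearly collinear predictors (illustrative simulated data)
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 is almost a copy of x1
y = 2.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

# Overall F-test of the model vs. the intercept-only model
print("Overall F p-value:", fit.f_pvalue)
# Individual t-tests of each coefficient; with heavy collinearity these
# p-values are often much larger than the overall F-test p-value
print("Coefficient p-values:", fit.pvalues)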

Notes:
1) The null hypothesis here is that the means are equal, and the alternative hypothesis is that they are not. A big t, with a small p-value, means that the null hypothesis is discredited, and we would assert that the means are significantly different (while a small t, with a big p-value, indicates that they are not significantly different).

2) The null hypothesis here is that one mean is not greater than the other (it is less than or equal to it), and the alternative hypothesis is that it is greater. A big t, with a small p-value, means that the null hypothesis is discredited, and we would assert that the means differ significantly in the direction specified by the alternative hypothesis (while a small t, with a big p-value, means they do not differ significantly in that direction).

3) The null hypothesis here is that the group means are all equal, and the alternative hypothesis is that they are not. A big F, with a small p-value, means that the null hypothesis is discredited, and we would assert that the means are significantly different (while a small F, with a big p-value, indicates that they are not significantly different).

4) The null hypothesis here is that the group variances are all equal, and the alternative hypothesis is that they are not. A big chi-squared (χ²) value, with a small p-value, means that the null hypothesis is discredited, and we would assert that the group variances are significantly different (while a small χ², with a big p-value, indicates that they are not significantly different).

5) The null hypothesis here is that there is no general relationship between the response (dependent) variable and one or more of the predictor (independent) variables, and the alternative hypothesis is that there is one. A big F, with a small p-value, means that the null hypothesis is discredited, and we would assert that there is a general relationship between the response and predictors (while a small F, with a big p-value, indicates that there is no evidence of such a relationship).

6) The null hypothesis here is that the value of the p-th regression coefficient is 0, and the alternative hypothesis is that it isn't. A big t, with a small p-value, means that the null hypothesis is discredited, and we would assert that the regression coefficient is not 0 (while a small t, with a big p-value, indicates that it is not significantly different from 0).
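For readers who want to reproduce these tests, the following sketch maps the notes to SciPy functions (the simulated groups are illustrative assumptions, not data from the notes): ttest_ind for notes 1 and 2, f_oneway for note 3, and bartlett for note 4.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)
group_b = rng.normal(loc=11.5, scale=2.0, size=30)
group_c = rng.normal(loc=12.0, scale=2.0, size=30)

# Note 1: two-sided t-test (H0: means equal)
t, p = stats.ttest_ind(group_a, group_b)

# Note 2: one-tailed t-test (H0: mean_b <= mean_a, H1: mean_b > mean_a)
t_one, p_one = stats.ttest_ind(group_b, group_a, alternative="greater")

# Note 3: one-way ANOVA F-test (H0: all group means equal)
f, p_f = stats.f_oneway(group_a, group_b, group_c)

# Note 4: Bartlett's test, a chi-squared statistic (H0: all group variances equal)
chi2, p_chi2 = stats.bartlett(group_a, group_b, group_c)

print(p, p_one, p_f, p_chi2)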


The p-value can be perceived as an oracle that judges our results. If the p-value is 0.05 or lower, the result is trumpeted as significant, but if it is higher than 0.05, the result is non-significant and tends to be passed over in silence. So what is the p-value really, and why is 0.05 so important?

A p-value less than 0.05 is conventionally treated as statistically significant, while a value higher than 0.05 means there is not enough evidence to reject the null hypothesis; the result is therefore not statistically significant (it does not prove that the null hypothesis is true).

The F ratio is the ratio of two mean square values. If the null
hypothesis is true, you expect F to have a value close to 1.0 most of
the time. A large F ratio means that the variation among group
means is more than you'd expect to see by chance.


What Is a Variance Inflation Factor (VIF)?


A variance inflation factor (VIF) is a measure of the amount of
multicollinearity in regression analysis. Multicollinearity exists when
there is a correlation between multiple independent variables in a
multiple regression model. This can adversely affect the regression
results. Thus, the variance inflation factor can estimate how much the
variance of a regression coefficient is inflated due to multicollinearity.

KEY TAKEAWAYS
A variance inflation factor (VIF) provides a measure of
multicollinearity among the independent variables in a multiple
regression model.
Detecting multicollinearity is important because while
multicollinearity does not reduce the explanatory power of the
model, it does reduce the statistical significance of the independent
variables.
A large VIF on an independent variable indicates a highly collinear
relationship to the other variables that should be considered or
adjusted for in the structure of the model and selection of
independent variables.
Understanding a Variance Inflation Factor (VIF)
A variance inflation factor is a tool to help identify the degree of
multicollinearity. Multiple regression is used when a person wants to
test the effect of multiple variables on a particular outcome. The
dependent variable is the outcome that is being acted upon by the
independent variables—the inputs into the model. Multicollinearity
exists when there is a linear relationship, or correlation, between two
or more of the independent variables or inputs.

The Problem of Multicollinearity


Multicollinearity creates a problem in the multiple regression model
because the inputs are all influencing each other. Therefore, they are
not actually independent, and it is difficult to test how much the
combination of the independent variables affects the dependent
variable, or outcome, within the regression model.

While multicollinearity does not reduce a model's overall predictive power, it can produce estimates of the regression coefficients that are
not statistically significant. In a sense, it can be thought of as a kind
of double-counting in the model.

In statistical terms, a multiple regression model with high multicollinearity will make it more difficult to estimate the relationship between each of the independent variables and the dependent variable. In other words, when two or more independent variables are closely related or measure almost the same thing, then the underlying effect that they measure is being accounted for twice (or more) across the variables. When the independent variables are closely related, it becomes difficult to say which variable is influencing the dependent variable.
Small changes in the data used or in the structure of the model
equation can produce large and erratic changes in the estimated
coefficients on the independent variables. This is a problem because
the goal of many econometric models is to test exactly this sort of
statistical relationship between the independent variables and the
dependent variable.

Tests to Solve Multicollinearity


To ensure the model is properly specified and functioning correctly,
there are tests that can be run for multicollinearity. The variance
inflation factor is one such measuring tool. Using variance inflation
factors helps to identify the severity of any multicollinearity issues so
that the model can be adjusted. Variance inflation factor measures
how much the behavior (variance) of an independent variable is
influenced, or inflated, by its interaction/correlation with the other
independent variables.

Variance inflation factors allow a quick measure of how much a variable is contributing to the standard error in the regression. When
significant multicollinearity issues exist, the variance inflation factor
will be very large for the variables involved. After these variables are
identified, several approaches can be used to eliminate or combine
collinear variables, resolving the multicollinearity issue.

Formula and Calculation of VIF


The formula for VIF is:

VIF_i = 1 / (1 − R_i²)

where:

R_i² = unadjusted coefficient of determination for regressing the ith independent variable on the remaining ones
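As a sketch of how this formula can be applied (the NumPy implementation and the synthetic data are illustrative assumptions, not part of the original article), each predictor is regressed on the remaining predictors and its R² is plugged into VIF_i = 1 / (1 − R_i²):

import numpy as np

def vif(X):
    """Return VIF_i = 1 / (1 - R_i^2) for each column of the predictor matrix X."""
    n, k = X.shape
    vifs = []
    for i in range(k):
        y_i = X[:, i]                          # treat the ith predictor as the response
        others = np.delete(X, i, axis=1)       # regress it on the remaining predictors
        design = np.column_stack([np.ones(n), others])
        beta, _, _, _ = np.linalg.lstsq(design, y_i, rcond=None)
        resid = y_i - design @ beta
        r2 = 1.0 - resid @ resid / np.sum((y_i - y_i.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Illustrative use on simulated predictors, one of which is nearly collinear
rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + rng.normal(scale=0.1, size=100)     # almost a copy of x1 -> high VIF expected
print(vif(np.column_stack([x1, x2, x3])))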
What Can VIF Tell You?
When R_i² is equal to 0, and therefore, when VIF or tolerance is equal to 1, the ith independent variable is not correlated with the remaining ones, meaning that multicollinearity does not exist.

In general terms:

VIF equal to 1 = variables are not correlated
VIF between 1 and 5 = variables are moderately correlated
VIF greater than 5 = variables are highly correlated

Source: iSixSigma, "Variance Inflation Factor (VIF)."

What Does a VIF of 1 Mean?


A VIF equal to one means variables are not correlated and
multicollinearity does not exist in the regression model.

What Is VIF Used for?


VIF measures the strength of the correlation between the
independent variables in regression analysis. This correlation is
known as multicollinearity, which can cause problems for regression
models.

The Bottom Line


While a moderate amount of multicollinearity is acceptable in a regression model, higher levels of multicollinearity can be a cause for concern.

What Is R-Squared?
R-squared (R2) is a statistical measure that represents the proportion
of the variance for a dependent variable that’s explained by an
independent variable in a regression model.
Whereas correlation explains the strength of the relationship between
an independent and a dependent variable, R-squared explains the
extent to which the variance of one variable explains the variance of
the second variable. So, if the R2 of a model is 0.50, then
approximately half of the observed variation can be explained by the
model’s inputs.
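A short sketch of this definition (the numbers and the NumPy code are illustrative assumptions): R-squared is one minus the residual sum of squares divided by the total sum of squares, so a value of 0.50 means the model accounts for about half of the variation in the dependent variable.

import numpy as np

# Illustrative observed values and model predictions (assumed data)
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_hat = np.array([3.5, 4.5, 7.5, 8.5, 11.5])

ss_res = np.sum((y - y_hat) ** 2)          # variation the model fails to explain
ss_tot = np.sum((y - y.mean()) ** 2)       # total variation in the dependent variable
r_squared = 1.0 - ss_res / ss_tot
print(r_squared)                           # close to 1 when the model fits well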

KEY TAKEAWAYS
R-squared is a statistical measure that indicates how much of the
variation of a dependent variable is explained by an independent
variable in a regression model.
In investing, R-squared is generally interpreted as the percentage of a
fund’s or security’s price movements that can be explained by
movements in a benchmark index.
An R-squared of 100% means that all movements of a security (or
other dependent variable) are completely explained by movements in
the index (or whatever independent variable you are interested in).

Partial Least Squares Structural Equation Modeling (PLS-SEM) Using R

Chapter: Evaluation of the Structural Model

Joseph F. Hair Jr., G. Tomas M. Hult, Soumya Ray

Open Access | First Online: 04 November 2021

Part of the Classroom Companion: Business book series (CCB)

Abstract

Structural model assessment in PLS-SEM focuses on evaluating the significance and relevance of path coefficients, followed by the
model’s explanatory and predictive power. In this chapter, we
discuss the key metrics relevant to structural model assessment in
PLS-SEM. We also discuss model comparisons and introduce key
criteria for assessing and selecting a model given the data and a set
of competing models.

Keywords

Akaike weights

Bayesian information criterion (BIC)

BIC

Coefficient of determination (R2)

Collinearity
Cross-validated predictive ability test (CVPAT)

CVPAT

Explanatory power

f2 effect size

Geweke and Meese criterion (GM)

GM

Holdout sample

In-sample predictive power

k-fold cross-validation

Linear regression model (LM) benchmark

MAE

Mean absolute error (MAE)

Model comparison

Out-of-sample predictive power

PLSpredict

Prediction error

Prediction statistics

R2
Relevance of the path coefficients

RMSE

Root-mean-square error (RMSE)

Significance of the path coefficients

Training sample

Variance inflation factor (VIF)

Learning Objectives

After reading this chapter, you should understand:

1.

The steps involved in structural model assessment

2.

The concept of explanatory power and how to evaluate it

3.

How to use PLSpredict to assess a model’s predictive power

4.

The concept of model comparison and metrics for selecting the best
model
5.

How to assess structural models using SEMinR

Once you have confirmed that the measurement of constructs is reliable and valid, the next step addresses the assessment of the
structural model results. ◘ Figure 6.1 shows a systematic approach to
the structural model assessment. In the first step, you need to
examine the structural model for potential collinearity issues. The
reason is that the estimation of path coefficients in the structural
models is based on ordinary least squares (OLS) regressions of each
endogenous construct on its corresponding predictor constructs. Just
as in an OLS regression, the path coefficients might be biased if the
estimation involves high levels of collinearity among predictor
constructs. Once you have ensured that collinearity is not a problem,
you will evaluate the significance and relevance of the structural
model relationships (i.e., the path coefficients). Steps 3 and 4 of the
procedure involve examining the model’s explanatory power and
predictive power. In addition, some research situations involve the
computation and comparison of alternative models, which can
emerge from different theories or contexts. PLS-SEM facilitates the
comparison of alternative models using established criteria, which are
well known from the regression literature. As model comparisons are
not relevant for every PLS-SEM analysis, Step 5 should be considered
optional.

Fig. 6.1 Structural model assessment procedure. (Source: Hair, Hult, Ringle, & Sarstedt, 2022, Chap. 6; used with permission by Sage)

6.1 Assess Collinearity Issues of the Structural Model

Structural model coefficients for the relationships between constructs are derived from estimating a series of regression equations. As the
point estimates and standard errors can be biased by strong
correlations of each set of predictor constructs (Sarstedt &
Mooi, 2019; Chap. 7), the structural model regressions must be
examined for potential collinearity issues. This process is similar to
assessing formative measurement models, but in this case, the
construct scores of the predictor constructs in each regression in the
structural model are used to calculate the variance inflation factor
(VIF) values. VIF values above 5 are indicative of probable collinearity
issues among predictor constructs, but collinearity can also occur at
lower VIF values of 3–5 (Becker, Ringle, Sarstedt, & Völckner, 2015;
Mason & Perreault, 1991). If collinearity is a problem, a frequently
used option is to create higher-order constructs (Hair, Risher,
Sarstedt, & Ringle, 2019; Hair, Sarstedt, Ringle, & Gudergan, 2018;
Chap. 2; Sarstedt et al., 2019).

6.2 Assess the Significance and Relevance of the Structural Model Relationships

In the next step, the significance of the path coefficients and relevance of the path coefficients are evaluated.
Analogous to the assessment of formative indicator weights ( ►
Chap. 5), the significance assessment builds on bootstrapping
standard errors as a basis for calculating t-values of path coefficients
or alternatively confidence intervals (Streukens & Leroi-
Werelds, 2016). A path coefficient is significant at the 5% level if the
value zero does not fall into the 95% confidence interval. In general,
the percentile method should be used to construct the confidence
intervals (Aguirre-Urreta & Rönkkö, 2018). For a recap on
bootstrapping, return to ► Chap. 5.
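As a rough illustration of the percentile bootstrap idea described here (a generic OLS sketch in Python with simulated data; it is not the SEMinR implementation, and the variable names are assumptions), a coefficient is treated as significant at the 5% level when zero lies outside the 95% percentile interval of its bootstrap estimates.

import numpy as np

rng = np.random.default_rng(4)
n = 200
x = rng.normal(size=n)
y = 0.4 * x + rng.normal(size=n)            # true slope of 0.4 (assumed example)

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

# Percentile bootstrap: resample cases, re-estimate, collect the coefficient
boot = []
for _ in range(5000):
    idx = rng.integers(0, n, size=n)
    boot.append(slope(x[idx], y[idx]))
lower, upper = np.percentile(boot, [2.5, 97.5])

# Significant at the 5% level if zero is not inside the 95% interval
print(f"95% CI: [{lower:.3f}, {upper:.3f}]",
      "significant" if not (lower <= 0.0 <= upper) else "not significant")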
In terms of relevance, path coefficients are usually between −1 and
+1, with coefficients closer to −1 representing strong negative
relationships and those closer to +1 indicating strong positive
relationships. Note that values below −1 and above +1 may
technically occur, for instance, when collinearity is at very high levels.
Path coefficients larger than +/−1 are not acceptable, and
multicollinearity reduction methods must be implemented. As PLS-
SEM processes standardized data, the path coefficients indicate the
changes in an endogenous construct’s values that are associated
with standard deviation unit changes in a certain predictor construct,
holding all other predictor constructs constant. For example, a path
coefficient of 0.505 indicates that when the predictor construct
increases by one standard deviation unit, the endogenous construct
will increase by 0.505 standard deviation units.

The research context is important when determining whether the size of a path coefficient is meaningful. Thus, when examining structural
model results, researchers should also interpret total effects, defined
as the sum of the direct effect (if any) and all indirect effects linking
one construct to another in the model. The examination of total
effects between constructs, including all their indirect effects,
provides a more comprehensive picture of the structural model
relationships (Nitzl, Roldán, & Cepeda Carrión, 2016).

6.3 Assess the Model’s Explanatory Power

The next step involves examining the coefficient of determination (R2) of the endogenous construct(s). The R2 represents
the variance explained in each of the endogenous constructs and is a
measure of the model’s explanatory power (Shmueli &
Koppius, 2011), also referred to as in-sample predictive
power (Rigdon, 2012). The R2 ranges from 0 to 1, with higher values
indicating a greater explanatory power. As a general guideline,
R2 values of 0.75, 0.50, and 0.25 can be considered substantial,
moderate, and weak, respectively, in many social science disciplines
(Hair, Ringle, & Sarstedt, 2011). But acceptable R2 values are based
on the research context, and in some disciplines, an R2 value as low
as 0.10 is considered satisfactory, as for example, in predicting stock
returns (e.g., Raithel, Sarstedt, Scharf, & Schwaiger, 2012).

Researchers should also be aware that R2 is a function of the number of predictor constructs – the greater the number of predictor
constructs, the higher the R2. Therefore, the R2 should always be
interpreted relative to the context of the study, based on the
R2 values from related studies as well as models of similar complexity.
R2 values can also be too high when the model overfits the data. In
case of model overfit, the (partial regression) model is too complex,
which results in fitting the random noise inherent in the sample
rather than reflecting the overall population. The same model would
likely not fit on another sample drawn from the same population
(Sharma, Sarstedt, Shmueli, Kim, & Thiele, 2019). When measuring a
concept that is inherently predictable, such as physical processes,
R2 values of (up to) 0.90 might be plausible. However, similar
R2 value levels in a model that predicts human attitudes, perceptions,
and intentions would likely indicate model overfit (Hair, Risher, et
al., 2019). A limitation of R2 is that the metric will tend to increase as
more explanatory variables are introduced to a model. The adjusted
R2 metric accounts for this by adjusting the R2 value based upon the
number of explanatory variables in relation to the data size and is
seen as a more conservative estimate of R2 (Theil, 1961). But because
of the correction factor introduced to account for data and model
size, the adjusted R2 is not a precise indication of an endogenous
construct’s explained variance (Sarstedt & Mooi, 2019; Chap. 7).
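The adjustment mentioned here is usually computed with the standard formula (stated as general background, not quoted from the chapter): adjusted R2 = 1 − (1 − R2)(n − 1)/(n − k − 1), where n is the sample size and k is the number of explanatory variables. A minimal sketch:

def adjusted_r2(r2, n, k):
    """Standard adjusted R-squared: penalizes R2 for the number of predictors k given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Illustrative values (assumed): the penalty grows as predictors are added relative to the sample size
print(adjusted_r2(0.50, n=100, k=5))   # about 0.473
print(adjusted_r2(0.50, n=30, k=10))   # about 0.237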
