Marginal Effects For Generalized Linear Models: The MFX Package For R
Alan Fernihough
Queen’s University Belfast
Abstract
mfx is an R package which provides functions that estimate a number of popular gen-
eralized linear models, returning marginal effects as output. This paper briefly describes
the method used to compute these marginal effects and their associated standard errors,
and demonstrates how this is implemented with mfx in R. I also illustrate how the package
extends to incorporate the calculation of odds and incidence rate ratios for certain gener-
alized linear models. Finally, I present an example showing how the output produced via
mfx can be translated into LaTeX.
Keywords: Marginal effects, odds ratio, incidence rate ratio, generalized linear models, R, mfx.
1. Introduction
The Generalized Linear Model (GLM) is a modified version of the classic linear regression
model typically estimated via Ordinary Least Squares (OLS).1 Researchers will generally
use a GLM approach when the response variable being modeled does not have a normally
distributed error term. Since the absence of a normally distributed error term violates the
Gauss-Markov assumptions, the use of a GLM is preferable in many scenarios.2 The GLM
works by permitting the regressors to be related to the response variable by means of a link
function. For example, in cases where the response variable is binary (takes a value of either
zero or one), the probit or logit link functions are commonly used because these functions
bound the predicted response between zero and one.
One drawback associated with the GLM is that the estimated model coefficients cannot be
directly interpreted as marginal effects (i.e., the change in the response variable predicted
after a one unit change in one of the regressors), as they can in an OLS regression. The
estimated coefficients are multiplicative effects, dependent on both the link function chosen
for the GLM and the other variables alongside their estimated coefficient values. Therefore,
it is difficult to judge the magnitude of an effect in a GLM regression from the estimated
coefficient values alone.
The open-source R environment offers a number of functions that facilitate GLM estimation.
Furthermore, two R packages are available that contain functions providing a platform from
which users can interpret an estimated GLM. The package effects (Fox et al. 2013), described
in Fox (2003), contains a comprehensive array of functions that allow users to graphically
illustrate a GLM in effect plots. While effect plots are arguably a better representation of
the results, these plots may become unwieldy for researchers trying to display the effects for
a large number of variables and/or multiple model specifications. In such cases, a table of
marginal effect results may offer a more concise method of displaying results. The erer (Sun
2013) package allows users to calculate marginal effects for either a binary logit or probit
model.
1 McCullagh and Nelder (1989) provide a complete overview of the GLM.
2 Since the error term is non-normal this induces heteroskedasticity.
While the packages effects and erer host a number of functions aiding the interpretation of the
GLM, the package described in this article, mfx (Fernihough 2014), contains important addi-
tional features that are useful in empirical research. First, mfx both estimates the GLM and
calculates the associated marginal effects in one function. Second, mfx can estimate adjusted
standard errors, robust to either heteroskedasticity or clustering. Third, mfx provides the
user with the ability to estimate marginal effects for a variety of GLM specifications, namely:
binary logit, binary probit, count Poisson, count negative binomial, and beta distributed re-
sponses. Fourth, since odds ratios or incidence rate ratios are more commonly used in certain
academic disciplines, like epidemiology, mfx also contains functions that return these values
instead of marginal effects. Fifth, mfx allows the user to decide if they want to compute
either the marginal effects for the average individual in the sample or the “average partial
effects” as advocated in Wooldridge (2002). Finally, the output produced in mfx can easily
be accommodated using the texreg package (Leifeld 2013), so that publication quality LaTeX
tables can be generated with relative simplicity.
The paper proceeds as follows. Section 2 contains a brief overview on the methods by which
marginal effects are computed. Section 3 outlines details of the software. Section 4 offers a
worked example that demonstrates how to use the software in practice and how the output
can be used to generate publication standard LaTeX tables. Finally, Section 5 summarizes the
main contributions of the paper, highlights a number of the package’s drawbacks, and offers
possible areas for future development.
2. Marginal effects
Let $E(y_i \mid x_i)$ represent the expected value of a dependent variable $y_i$ given a vector
of explanatory variables $x_i$, for an observation unit $i$. In the case where $y$ is a linear
function of $(x_1, \cdots, x_k) = X$ and $y$ is a continuous variable, the following model with
$k$ regressors can be estimated via OLS:
$$ y = X\beta + \varepsilon, \tag{1} $$
where $\varepsilon$ represents the error term, or
$$ E(y \mid X) = X\beta, \tag{2} $$
so the additive vector of predicted coefficients can be obtained from the usual computation:
$\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y$. From (1) and (2) it is straightforward to see that the
marginal effect of the variable $x_j$, where $j \in \{1, \cdots, k\}$, on the dependent variable
is $\partial y/\partial x_j = \beta_j$. In other words, a unit increase in the variable $x_j$
increases the variable $y$ by $\beta_j$ units.
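The closed-form OLS computation can be checked numerically. The following is a minimal sketch on simulated data; the variable names and the coefficient values 2 and 0.5 are arbitrary choices made for illustration.

```r
# Simulated data; the coefficients 2 and 0.5 are arbitrary illustration values.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 2 + 0.5 * x + rnorm(n)

# Design matrix with an intercept column.
X <- cbind(1, x)

# Normal equations: solve (X'X) b = X'y instead of forming the inverse explicitly.
beta_hat <- drop(solve(crossprod(X), crossprod(X, y)))

# lm() should return the same estimates (it uses a QR decomposition internally).
fit <- lm(y ~ x)
all.equal(unname(beta_hat), unname(coef(fit)))
```

In this linear setting the slope estimate is itself the marginal effect of x, which is what no longer holds once a nonlinear link is introduced.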
A GLM takes the following form:
$$ g(y) = X\beta, \tag{3} $$
where the link function $g(\cdot)$ transforms the expectation of the response to a linear equation.
The function $g(\cdot)$ is invertible, and thus we can rewrite Equation 3:
$$ y = g^{-1}(X\beta), \tag{4} $$
so the inverse link function, also known as the mean function, is applied to the product of
the regressors (X) and the model coefficients (β). Therefore, the GLM in Equation 4 can be
seen as the linear regression model nested within a nonlinear transformation. The choice of
g(·) should depend on the distribution of the response y.
Since the GLM typically implies that the linear model is nested inside a nonlinear function,
one cannot directly infer the marginal effects from the estimated coefficients.3 Instead, based
on Equation 4 and the chain rule, we can see that:
$$ \frac{\partial y}{\partial x_j} = \beta_j \times \frac{\partial g^{-1}(X\beta)}{\partial (X\beta)}. \tag{5} $$
Thus, the nonlinearity in the link function means that the marginal effect of xj now depends
on the derivative of the inverse link function, and contained within this function are all of the
other regressors and their associated regression coefficient values.
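As a worked illustration of Equation 5 (not part of the original derivation), the derivative of the inverse link can be written out for two other links that the package handles. The symbol $\Lambda$ for the logistic function is introduced here only for convenience.

```latex
% Logit link: g^{-1}(\eta) = \Lambda(\eta) = 1 / (1 + e^{-\eta})
\frac{\partial y}{\partial x_j}
  = \beta_j \times \Lambda(X\beta)\,[1 - \Lambda(X\beta)]

% Log link (Poisson, negative binomial): g^{-1}(\eta) = e^{\eta}
\frac{\partial y}{\partial x_j}
  = \beta_j \times e^{X\beta}
```

In both cases the scale factor multiplying $\beta_j$ depends on the full linear index $X\beta$, which is why the marginal effect varies with the values of the other regressors.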
Here we use the probit model as an example, although the calculations for other GLM
approaches are similar. The link function for the probit is based on the inverse normal
distribution, so:
$$ P(y = 1 \mid x) = \int_{-\infty}^{X\beta} \phi(z)\,dz = \Phi(X\beta), \tag{6} $$
where $\Phi(\cdot)$ and $\phi(\cdot)$ denote the standard normal cumulative distribution and
probability density functions, respectively. The marginal effect for a continuous variable in a
probit model is:
$$ \frac{\partial y}{\partial x_j} = \hat{\beta}_j \times \phi(X\hat{\beta}) \tag{7} $$
since $\Phi'(\cdot) = \phi(\cdot)$, so the marginal effect for a continuous variable $x_j$ depends on all of
the estimated β̂ coefficients, which are fixed, and the complete design matrix X, the values
for which are variable. Because the values for X vary, the marginal effects depend on the
procedure one employs. The literature offers two common approaches (Kleiber and Zeileis
2008). The first, and simplest, calculates the marginal effects when each variable in the design
matrix is at its average value. Otherwise known as the partial effects for the average individual
(Greene 2008), they can be calculated as:
$$ \frac{\partial y}{\partial x_j} = \hat{\beta}_j \times \phi(\bar{X}\hat{\beta}). \tag{8} $$
The alternative approach calculates the average partial effects (Wooldridge 2002) or average
of the sample marginal effects (Kleiber and Zeileis 2008), by calculating a partial effect for
each observation unit (where there are n observations) and then averaging:
$$ \frac{\partial y}{\partial x_j} = \hat{\beta}_j \times \frac{\sum_{i=1}^{n} \phi(X_i\hat{\beta})}{n}. \tag{9} $$
Usually, the choice over which method one uses is unimportant as the difference in values
returned by both methods is likely to be small (Greene 2008).
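Both approaches can be computed by hand from a probit model fitted with glm. The sketch below uses simulated data with arbitrary coefficient values; it illustrates Equations 8 and 9 and is not the package's own code.

```r
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- as.integer(0.5 + 0.8 * x1 - 0.4 * x2 + rnorm(n) > 0)

fit <- glm(y ~ x1 + x2, family = binomial(link = "probit"))
b   <- coef(fit)
X   <- model.matrix(fit)

# Equation 8: density evaluated once, at the column means of X.
pea <- b[-1] * dnorm(sum(colMeans(X) * b))

# Equation 9: density evaluated for every observation, then averaged.
ape <- b[-1] * mean(dnorm(drop(X %*% b)))

cbind(pea, ape)
```

Because the normal density is strictly positive, both versions always carry the sign of the underlying coefficient; only the scale factor differs.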
3 In the case where g(·) is the identity function, the estimated GLM will be identical to the
standard linear regression model.
The partial effects calculation in Equation 5 is not applicable in cases where xj is a
binary/dummy variable like gender. This is because the derivative in Equation 5 is with respect
to an infinitesimally small change in xj, not the binary change from zero to one. Fortunately,
calculating the marginal effects in such instances is very straightforward. In the probit model
where the j-th regressor is a dummy variable the partial effect for the average individual is
simply:
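$$ \frac{\Delta y}{\Delta x_j} = \Phi(\bar{X}^{-j}\hat{\beta}^{-j} + \hat{\beta}_j) - \Phi(\bar{X}^{-j}\hat{\beta}^{-j}), \tag{10} $$
where $\bar{X}^{-j}$ is a vector of the average values of the design matrix $X$ that excludes the
$j$-th variable, and $\hat{\beta}^{-j}$ is the coefficient vector without $\hat{\beta}_j$. The
corresponding sample marginal effect is:
$$ \frac{\Delta y}{\Delta x_j} = \frac{\sum_{i=1}^{n} \left[ \Phi(X_i^{-j}\hat{\beta}^{-j} + \hat{\beta}_j) - \Phi(X_i^{-j}\hat{\beta}^{-j}) \right]}{n}. \tag{11} $$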
All functions in mfx automatically detect dummy regressors and perform the calculation in
either Equation 10 or Equation 11, depending on the type of marginal effect the user wants.
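The discrete-change calculation can be sketched in a few lines of base R. The data are simulated, and the names d, others, xb_mean and xb_each are hypothetical helpers introduced for this illustration only.

```r
set.seed(2)
n <- 500
x <- rnorm(n)
d <- rbinom(n, 1, 0.4)                               # binary regressor
y <- as.integer(0.2 + 0.6 * x + 0.7 * d + rnorm(n) > 0)

fit <- glm(y ~ x + d, family = binomial(link = "probit"))
b   <- coef(fit)
X   <- model.matrix(fit)

# Linear index built from every column except the dummy.
others  <- setdiff(colnames(X), "d")
xb_mean <- sum(colMeans(X[, others]) * b[others])    # input to Equation 10
xb_each <- drop(X[, others] %*% b[others])           # input to Equation 11

# Difference in predicted probability when d switches from 0 to 1.
pea_d <- pnorm(xb_mean + b["d"]) - pnorm(xb_mean)
ape_d <- mean(pnorm(xb_each + b["d"]) - pnorm(xb_each))

c(pea = unname(pea_d), ape = unname(ape_d))
```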
We have already seen that the marginal effect for the j-th regressor in a probit GLM, β̂j ×
φ(X β̂), is a nonlinear function of β̂. Therefore, the standard errors that correspond to these
marginal effects must be calculated via the delta method of finding approximations based on
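Taylor series expansions to the variance of functions of random variables:
$$ \mathrm{VAR}[f(X\hat{\beta})] = \left[ \frac{\partial f(X\hat{\beta})}{\partial \hat{\beta}} \right]^{\top} \mathrm{VAR}[\hat{\beta}] \left[ \frac{\partial f(X\hat{\beta})}{\partial \hat{\beta}} \right], \tag{12} $$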
where f is the nonlinear transformation and VAR[β̂] is the usual variance-covariance of the
estimated parameters. With respect to the probit model previously used the variance of the
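marginal effects (for the average individual) is:
$$ \mathrm{VAR}[\hat{\beta} \times \phi(\bar{X}\hat{\beta})] = \left[ \frac{\partial[\hat{\beta} \times \phi(\bar{X}\hat{\beta})]}{\partial \hat{\beta}} \right]^{\top} \mathrm{VAR}[\hat{\beta}] \left[ \frac{\partial[\hat{\beta} \times \phi(\bar{X}\hat{\beta})]}{\partial \hat{\beta}} \right], \tag{13} $$
and since
$$ \frac{\partial[\hat{\beta} \times \phi(\bar{X}\hat{\beta})]}{\partial \hat{\beta}} = \phi(\bar{X}\hat{\beta}) \times [I_k - \bar{X}\hat{\beta} \times (\hat{\beta}\bar{X})], \tag{14} $$
the probit marginal effect standard errors will be derived from the diagonal elements of the
following matrix of derivatives:
$$ \mathrm{VAR}[\hat{\beta} \times \phi(\bar{X}\hat{\beta})] = [\phi(\bar{X}\hat{\beta})]^2 \times [I_k - \bar{X}\hat{\beta} \times (\hat{\beta}\bar{X})][\mathrm{VAR}[\hat{\beta}]][I_k - \bar{X}\hat{\beta} \times (\hat{\beta}\bar{X})] \tag{15} $$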
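The delta-method calculation for the probit case can be sketched as follows. This is an illustration on simulated data rather than the package's internal code. It assumes the Jacobian J is laid out with one row per marginal effect and one column per coefficient, so the variance is formed as J V J'.

```r
set.seed(3)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- as.integer(0.5 + 0.8 * x1 - 0.4 * x2 + rnorm(n) > 0)

fit  <- glm(y ~ x1 + x2, family = binomial(link = "probit"))
b    <- coef(fit)
xbar <- colMeans(model.matrix(fit))
xb   <- sum(xbar * b)                       # scalar index at the means

# Equation 14: row i holds the derivatives of the i-th marginal effect
# with respect to every coefficient.
J <- dnorm(xb) * (diag(length(b)) - xb * outer(b, xbar))

# Delta method: sandwich the coefficient covariance between the Jacobians.
V_mfx <- J %*% vcov(fit) %*% t(J)

mfx_hat <- b * dnorm(xb)
cbind(dFdx = mfx_hat, se = sqrt(diag(V_mfx)))[-1, ]   # drop the intercept row
```

The term `outer(b, xbar)` arises because the derivative of the normal density is $\phi'(z) = -z\,\phi(z)$, which produces the subtraction inside the brackets of Equation 14.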
White (1980) correction to the estimated variance-covariance matrix to account for this het-
eroskedasticity.5 Another example applies in cases where the researcher is estimating models
with clustered data. Ignoring the clustered nature of certain data will lead to an underesti-
mate of the standard errors. The mfx package allows the user to correct for clustering using
either a one-way or two-way correction in the variance-covariance matrix (Cameron et al.
2011) using the functionality offered in the sandwich package (Zeileis 2004, 2006).
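To make the heteroskedasticity correction concrete, the sketch below builds a basic White-type (HC0) sandwich estimator by hand for a probit fit. It is a simplified illustration in base R: the package relies on sandwich, whose estimators may apply finite-sample adjustments that are omitted here.

```r
set.seed(4)
n <- 400
x <- rnorm(n)
y <- as.integer(0.3 + 0.7 * x + rnorm(n) > 0)
fit <- glm(y ~ x, family = binomial(link = "probit"))

X <- model.matrix(fit)

# Per-observation score contributions: x_i * working weight * working residual.
scores <- X * (weights(fit, "working") * residuals(fit, "working"))

bread <- vcov(fit)            # inverse of X'WX (dispersion is 1 for binomial)
meat  <- crossprod(scores)    # sum of outer products of the scores
vc_robust <- bread %*% meat %*% bread

cbind(classical = sqrt(diag(vcov(fit))),
      robust    = sqrt(diag(vc_robust)))
```

The resulting matrix would take the place of VAR[β̂] in the delta-method formula when robust marginal effect standard errors are requested.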
Typically economists use marginal effects to display the output after estimating a GLM.
However, other disciplines, particularly the medical sciences, use odds ratios (for example, in
a logistic regression) or incidence rate ratios (for count regression models). Both ratios are
derived from the fact that the underlying GLM is a log-linear model, so taking the exponent
of the coefficient results in a multiplicative effect. Odds are defined as the ratio of the
probability of success to the probability of failure and therefore range between zero and
infinity; an odds ratio is the factor by which these odds change after a one unit increase in
a regressor. Thus, an explanatory variable in a logistic regression with an odds ratio of 2
indicates that a one unit change in the explanatory variable doubles the odds of the event.
Alternatively, an odds ratio of 1 would indicate that the regressor of interest does
not influence the response. The incidence rate ratios used in Poisson and negative binomial
count regression models are analogous to the aforementioned odds ratios. Once again they
are multiplicative effects. For example, an incidence rate ratio of two will indicate that a
one unit increase in the explanatory variable of interest doubles the underlying rate by which
the count event is occurring. The mfx package accommodates odds ratios and incidence rate
ratios in the applicable log-linear models.
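Both ratios can be obtained from a fitted glm by exponentiating the coefficients. The following sketch uses simulated data (the true ratios of 2 and 1.5 are arbitrary) and confirms the interpretation by comparing predictions one unit apart.

```r
set.seed(6)
n <- 500
x <- rnorm(n)
y_bin <- rbinom(n, 1, plogis(-0.5 + log(2) * x))    # true odds ratio of 2
y_cnt <- rpois(n, exp(0.2 + log(1.5) * x))          # true rate ratio of 1.5

logit_fit <- glm(y_bin ~ x, family = binomial(link = "logit"))
pois_fit  <- glm(y_cnt ~ x, family = poisson(link = "log"))

odds_ratio <- unname(exp(coef(logit_fit)["x"]))
rate_ratio <- unname(exp(coef(pois_fit)["x"]))

# Same quantities from predictions at x = 0 and x = 1.
p    <- predict(logit_fit, newdata = data.frame(x = c(0, 1)), type = "response")
odds <- p / (1 - p)
mu   <- predict(pois_fit, newdata = data.frame(x = c(0, 1)), type = "response")

c(odds_ratio = odds_ratio, from_predictions = unname(odds[2] / odds[1]))
c(rate_ratio = rate_ratio, from_predictions = unname(mu[2] / mu[1]))
```

Unlike marginal effects, these ratios do not depend on where in the covariate space they are evaluated, which is part of their appeal.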
3. Package details
The mfx software is an add-on package to the statistical software R, and is freely available
from the Comprehensive R Archive Network (CRAN, http://CRAN.R-project.org/package=mfx).
In addition to the base implementation of R, it requires the following packages:
MASS (Venables and Ripley 2002), sandwich (Zeileis 2004, 2006), lmtest (Zeileis and
Hothorn 2002), and betareg (Cribari-Neto and Zeileis 2010; Grün et al. 2012). Once R and
the required packages have been installed, mfx can be loaded using the following code.
R> library("mfx")
Table 1 summarizes the GLM approaches that are compatible with the functions provided in
mfx. The functions in mfx will first estimate the specified GLM and, after the GLM is fitted,
compute the marginal effects (or odds/incidence rate ratios). These functions all return the requested
output in the familiar coefficient table summary.
First, we look at the function that estimates a probit model, and returns its marginal effects
as an output. The probitmfx function and its arguments are shown below.
The function is similar to either the lm or glm functions. The first argument, formula,
requires an object suitable for the formula class in R. The formula argument is identical
to that required when estimating a probit model via the glm function, and is required by
probitmfx. The next argument, data, is for a data frame object. This argument is necessary,
so users should group their data into a data frame object prior to use. When atmean =
TRUE, the resulting marginal effects will be for the average observation (as in Equation 8 and
Equation 10), while if atmean = FALSE, the average of the sample marginal effects will be
calculated (as in Equation 9 and Equation 11). In general, the average of the sample marginal
effects will take longer to calculate. The robust argument allows users to apply
White’s correction for the presence of heteroskedasticity in the calculation of marginal effect
standard errors. The clustervar1 and clustervar2 arguments are reserved for the
names of the variables on which the user wishes to calculate either one-way or two-way
clustered standard errors. These cluster names must correspond to variables contained within
the data object. The start and control arguments correspond to the identically named
arguments used to fit a model with glm.
5 The presence of heteroskedasticity in models with a binary response is best handled
explicitly using glmx (Zeileis et al. 2013).
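A short sketch of how these arguments might be combined is given below. It assumes the mfx package is installed; the data are simulated and the grouping column g is a hypothetical cluster identifier. Calling args() shows the argument list and defaults of whichever package version is installed.

```r
library("mfx")

set.seed(7)
n <- 500
d <- data.frame(x = rnorm(n), g = rep(1:50, each = 10))
d$y <- as.integer(0.3 + 0.7 * d$x + rnorm(n) > 0)

# Inspect the argument list and defaults of the installed version.
args(probitmfx)

# Average of the sample marginal effects with White-corrected standard errors.
m_robust <- probitmfx(y ~ x, data = d, atmean = FALSE, robust = TRUE)

# One-way clustering on the (hypothetical) grouping column g.
m_cluster <- probitmfx(y ~ x, data = d, clustervar1 = "g")

m_robust$mfxest
m_cluster$mfxest
```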
Let’s take a look at the output produced by probitmfx with a simple simulated example.
R> set.seed(12345)
R> n = 1000
R> x = rnorm(n)
R> y = ifelse(pnorm(1 + 0.5 * x + rnorm(n)) > 0.5, 1, 0)
R> data = data.frame(y, x)
R> (mod1 = probitmfx(formula = y ~ x, data = data))
Call:
probitmfx(formula = y ~ x, data = data)
Marginal Effects:
dF/dx Std. Err. z P>|z|
x 0.121643 0.012165 9.9997 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R> names(mod1)
R> mod1$mfxest
R> mod1$fit
Coefficients:
(Intercept) x
0.9911 0.5102
R> mod1$dcvar
character(0)
R> mod1$call
Calling the probitmfx object returns a printCoefmat object similar to that produced when
summary(glm(...)) is used for a GLM. However, instead of the model coefficients, the
probitmfx produces the marginal effects: dF/dx. The probitmfx object contains four ob-
jects. The first, mfxest, is a table of the marginal effects, their standard errors, a z-test
statistic (testing if the marginal effect is equal to zero) and the corresponding p-value asso-
ciated with the z-test representing a two-tailed test. The name fit refers to the stored glm
object—in this case a probit model. Note that using summary(probitmfx$fit) reports un-
corrected standard errors, not ones that have been adjusted using the robust, clustervar1,
and clustervar2 arguments in probitmfx. The dcvar element records the variables for which a
discrete change marginal effect has been calculated. Finally, call is the matched call
object.
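Since the fitted glm is stored in fit, the reported dF/dx can be cross-checked against Equation 8. The sketch below assumes the mfx package is installed and that, for a continuous regressor with atmean = TRUE, the first column of mfxest holds the effect evaluated at the column means of the design matrix.

```r
library("mfx")

set.seed(12345)
n <- 1000
x <- rnorm(n)
y <- ifelse(pnorm(1 + 0.5 * x + rnorm(n)) > 0.5, 1, 0)
mod <- probitmfx(y ~ x, data = data.frame(y, x))

b    <- coef(mod$fit)                      # probit coefficients from the stored glm
xbar <- colMeans(model.matrix(mod$fit))    # the intercept column averages to 1
by_hand <- unname(b["x"] * dnorm(sum(xbar * b)))

c(package = unname(mod$mfxest["x", 1]), by_hand = by_hand)
```

If the two numbers agree, the stored glm object and the marginal effect table are consistent with one another, which is a useful sanity check before passing either to other functions.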
The mfx package also contains the following other functions: betamfx, betaor, logitmfx,
logitor, negbinirr, negbinmfx, poissonirr, poissonmfx. Each of these functions is self
explanatory, with mfx, or, or irr indicating marginal effects, odds ratios, or incidence rate
ratios respectively. The logit and Poisson models are fit with the glm function available as a
base package in R. The negative binomial is fit using the glm.nb function in MASS. Finally,
the beta regression is fit via the betareg package. Both betamfx and betaor functions use a
logit link for the mean function, so it is feasible to calculate both marginal effects and odds
ratios for these models.
4. Example analysis
This section illustrates how a simple analysis can be performed in mfx. For this analysis, I use
the Swiss labor market participation data SwissLabor that is included in the AER (Kleiber
and Zeileis 2008) package. The code below clears the workspace and loads the relevant data
frame.
as.Date, as.Date.numeric
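The loading commands themselves are not reproduced in this text. A sketch of what such a step might look like, assuming the AER package is installed, is given below; it is a reconstruction for illustration and not the author's verbatim code.

```r
# Assumed reconstruction of the loading step described in the text.
rm(list = ls())                        # clear the workspace
library("AER")                         # provides the SwissLabor data
data("SwissLabor", package = "AER")

# Quick look at the variables used in the models that follow.
str(SwissLabor)
```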
For this example, we want to model labor force participation as a function of covariates. In
the next step we load the mfx package and estimate the baseline probit model, returning the marginal
effects as an output.
R> library("mfx")
Loading required package: MASS
Loading required package: betareg
Loading required package: Formula
R> (mod1 = probitmfx(participation ~ income + age + education +
+ youngkids + oldkids + foreign,
+ data = SwissLabor))
Call:
probitmfx(formula = participation ~ income + age + education +
youngkids + oldkids + foreign, data = SwissLabor)
Marginal Effects:
dF/dx Std. Err. z P>|z|
income -0.1992314 0.0485655 -4.1023 4.090e-05 ***
age -0.1232260 0.0214953 -5.7327 9.885e-09 ***
[1] "foreignyes"
This creates a probitmfx object called mod1. The printed object shows the function call, a
table of the marginal effects, and a notification that the foreign variable represents a
discrete change and the marginal effects for this variable have been calculated accordingly.
The marginal effect values appear sensible. For example, a one-unit change in the number of
young children associated with an observation reduces the probability of labor force
participation by approximately 31 percentage points. We must keep in mind that these marginal
effects refer to the average individual. However, we can also calculate the average of the
sample marginal effects by setting atmean = FALSE.
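A call of the following form would request the average of the sample marginal effects. The object name mod2 is an assumption, inferred from the model list used in the texreg code later in this section, and the sketch assumes the AER and mfx packages are installed.

```r
library("AER")
library("mfx")
data("SwissLabor", package = "AER")

# atmean = FALSE requests the average of the sample marginal effects.
(mod2 = probitmfx(participation ~ income + age + education +
                    youngkids + oldkids + foreign,
                  data = SwissLabor, atmean = FALSE))
```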
Marginal Effects:
dF/dx Std. Err. z P>|z|
income -0.1729131 0.0409901 -4.2184 2.460e-05 ***
age -0.1069480 0.0176141 -6.0717 1.265e-09 ***
education 0.0070203 0.0060165 1.1668 0.2433
youngkids -0.2699201 0.0321591 -8.3933 < 2.2e-16 ***
oldkids -0.0046379 0.0154409 -0.3004 0.7639
foreignyes 0.2856119 0.0397184 7.1909 6.436e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[1] "foreignyes"
This leads to comparable results. Given that the response variable is binary, we can also
calculate the odds ratios obtained after fitting a logit regression. The code below demonstrates
this.
R> (mod3 = logitor(participation ~ income + age + education +
+ youngkids + oldkids + foreign,
+ data = SwissLabor))
Call:
logitor(formula = participation ~ income + age + education +
youngkids + oldkids + foreign, data = SwissLabor)
Odds Ratio:
OddsRatio Std. Err. z P>|z|
income 0.442621 0.090959 -3.9661 7.305e-05 ***
age 0.600298 0.054338 -5.6379 1.721e-08 ***
education 1.032237 0.029972 1.0927 0.2745
youngkids 0.264286 0.047616 -7.3859 1.514e-13 ***
oldkids 0.978254 0.072162 -0.2980 0.7657
foreignyes 3.707675 0.740637 6.5600 5.382e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The output for the odds ratios is as we would expect. The negative marginal effects have
odds ratios below one, and the positive marginal effects, above one. The next mfx feature
that is worth highlighting is the ability of the functions in the package to compute clustered
standard errors. This functionality is displayed in the example below.
Marginal Effects:
dF/dx Std. Err. z P>|z|
income -0.1992314 0.0280393 -7.1054 1.199e-12 ***
age -0.1232260 0.0124103 -9.9293 < 2.2e-16 ***
education 0.0080889 0.0040117 2.0163 0.04377 *
youngkids -0.3110035 0.0236506 -13.1499 < 2.2e-16 ***
oldkids -0.0053438 0.0102732 -0.5202 0.60294
foreignyes 0.3112408 0.0247808 12.5598 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[1] "foreignyes"
R> (mod5 = probitmfx(participation ~ income + age + education +
+ youngkids + oldkids + foreign,
+ data = SwissLabor3, clustervar1 = "id"))
Call:
probitmfx(formula = participation ~ income + age + education +
youngkids + oldkids + foreign, data = SwissLabor3, clustervar1 = "id")
Marginal Effects:
dF/dx Std. Err. z P>|z|
income -0.1992314 0.0453073 -4.3973 1.096e-05 ***
age -0.1232260 0.0210013 -5.8675 4.423e-09 ***
education 0.0080889 0.0069325 1.1668 0.2433
youngkids -0.3110035 0.0467831 -6.6478 2.976e-11 ***
oldkids -0.0053438 0.0174540 -0.3062 0.7595
foreignyes 0.3112408 0.0437153 7.1197 1.081e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
[1] "foreignyes"
In this example the data frame is duplicated twice and added to the existing data frame.
Replicating the first probitmfx command on the enlarged data frame results in much lower
standard errors. However, this can be corrected if we cluster the standard errors on the
observation unit. As we can see this “fixes” the problem created by duplicating the data
frame.
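The effect of duplicating observations, and of clustering on the observation unit, can be reproduced on simulated data in base R. The helper cluster_vcov below is a hypothetical, simplified one-way cluster estimator without finite-sample adjustments; it is not the estimator used inside mfx, so the package's numbers will generally differ slightly.

```r
set.seed(5)
n <- 300
d <- data.frame(id = seq_len(n), x = rnorm(n))
d$y <- as.integer(0.3 + 0.7 * d$x + rnorm(n) > 0)
d3 <- rbind(d, d, d)     # original data plus two copies, as in the text

# Simplified one-way cluster-robust covariance for a glm (no small-sample factor).
cluster_vcov <- function(fit, cluster) {
  X <- model.matrix(fit)
  scores <- X * (weights(fit, "working") * residuals(fit, "working"))
  meat <- crossprod(rowsum(scores, cluster))   # sum scores within each cluster first
  vcov(fit) %*% meat %*% vcov(fit)
}

fit1 <- glm(y ~ x, family = binomial(link = "probit"), data = d)
fit3 <- glm(y ~ x, family = binomial(link = "probit"), data = d3)

se <- cbind(
  original_robust = sqrt(diag(cluster_vcov(fit1, d$id))),
  tripled_naive   = sqrt(diag(vcov(fit3))),
  tripled_cluster = sqrt(diag(cluster_vcov(fit3, d3$id)))
)
se
```

Summing the scores within each id before forming the outer product is what undoes the artificial precision gain: the three copies of an observation are treated as one unit of information.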
The final part of this example analysis illustrates how the mfx objects estimated above
can be used to create a table with LaTeX. This involves coercing the mfx objects so that they
are compatible with the texreg package. Since the mfx objects return the fitted glm object,
we can use this as an input in the texreg function and override the model coefficients with
the estimated marginal effects/odds ratios. Since the estimated glm models contain intercept
values (and this will nearly always be the case), we take care to create an additional
placeholder value in our marginal effects/odds ratios and then remove this output using the
omit.coef argument. Finally, some aesthetic alterations are made to the texreg object before
saving the object. Table 2 shows what the table looks like when compiled in LaTeX.
R> library("texreg")
Version: 1.29.6
Date: 2013-09-27
Author: Philip Leifeld (University of Konstanz)
R> mods = list(mod1$fit, mod2$fit, mod3$fit, mod4$fit, mod5$fit)
R> coefs = list(c(0, mod1$mfxest[, 1]), c(0, mod2$mfxest[, 1]),
+ c(0, mod3$oddsratio[, 1]), c(0, mod4$mfxest[, 1]),
+ c(0, mod5$mfxest[, 1]))
R> ses = list(c(0, mod1$mfxest[, 2]), c(0, mod2$mfxest[, 2]),
+ c(0, mod3$oddsratio[, 2]), c(0, mod4$mfxest[, 2]),
+ c(0, mod5$mfxest[, 2]))
R> pvals = list(c(0, mod1$mfxest[, 4]), c(0, mod2$mfxest[, 4]),
+ c(0, mod3$oddsratio[, 4]), c(0, mod4$mfxest[, 4]),
+ c(0, mod5$mfxest[, 4]))
5. Summary
This article introduces the mfx package for R. The package hosts a number of useful functions
that should be of interest to those who conduct empirical research. Similarities between the
functions provided in mfx and the well-known glm function mean that using mfx should be
trivial for existing R users. There are a number of areas upon which the package could be
improved. One such area would be to extend the number of models available. Examples of
models that could be added include: ordered probit, multinomial logit, heteroskedastic probit,
and instrumental variables probit. Another area for future expansion would be to improve
the manner in which mfx handles nonlinear and interaction terms. For example, the current
version of mfx calculates the marginal effect for each regressor separately, even if the same
variable is included twice, albeit in two different forms, e.g., as a linear value and its squared
term. In instances like this, it may be preferable to have one marginal effect for each unique
regressor and therefore mfx users should exercise caution before interpreting such values.
Table 2: Models explaining labor force participation, marginal effects and odds ratio example.
References
Cameron AC, Gelbach J, Miller D (2011). “Robust Inference with Multi-way Clustering.”
Journal of Business and Economic Statistics, 29(2), 238–249.
Fernihough A (2014). mfx: Marginal Effects, Odds Ratios and Incidence Rate Ratios for
GLMs. R package version 1.1, URL http://cran.r-project.org/web/packages/mfx.
Fox J (2003). “Effect Displays in R for Generalised Linear Models.” Journal of Statistical
Software, 8(15), 1–27. URL http://www.jstatsoft.org/v08/i15/.
Greene WH (2008). Econometric Analysis. 6th edition. Prentice Hall, New York.
McCullagh P, Nelder JA (1989). Generalized Linear Models. 2nd edition. Chapman & Hall,
London.
Sun C (2013). erer: Empirical Research in Economics with R. R package version 1.4, URL
http://cran.r-project.org/web/packages/erer.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S. 4th edition. Springer-
Verlag, New York.
Wooldridge JM (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press,
Cambridge.
Zeileis A (2004). “Econometric Computing with HC and HAC Covariance Matrix Estimators.”
Journal of Statistical Software, 11(10), 1–17. URL http://www.jstatsoft.org/v11/i10/.
Affiliation:
Alan Fernihough
Queen’s University Management School
Queen’s University Belfast
185 Stranmillis Road
Belfast
BT9 5EE, United Kingdom
E-mail: [email protected]