
Panel Data Analysis Using EViews Chapter - 2 PDF

The document discusses heterogeneous regression models (HRMs) that allow for differential effects of explanatory variables between groups. It presents the general equation specification for various HRMs, including those with one or two numerical explanatory variables, lagged variable models, and autoregressive models. Example 2x2x2 factorial models are shown to illustrate how hypotheses about differential effects between groups can be tested using Wald tests. The objective is to study how the effects of explanatory variables on an outcome variable may differ between groups defined by classification factors.


2 Heterogeneous Regressions, ANCOVA, and Fixed-Effects Models

2.1 Introduction
Consider the various DVMs whose independent variables are interactions of the factors A and B
and the time period TP. If one or more numerical variables are added as exogenous (cause,
source, upstream, or independent) variables, we can be very confident, in a theoretical sense,
that they have differential effects on the corresponding endogenous (impact, downstream, or
dependent) variable, whether numerical, zero-one, or ordinal, between the cells or groups
generated by A, B, and TP. For this reason, a heterogeneous regression model (HRM) should be
considered the most suitable model to apply.
This chapter presents alternative HRMs, starting with the simplest possible model in each
specific group of models, by giving their equation specifications (ESs). The other equation
specifications are left as exercises.

2.2 HRMs having a Numerical Exogenous Variable


Following the DVMs presented in the previous chapter, the HRMs considered here have the
following general equation specification.
G(Y) F(X)*@Expand(A,B,TP) @Expand(A,B,TP) (2.1)
All possible HRMs by the factors A, B, and TP (time period) have the following
characteristics.

(i) G(Y) represents a numerical endogenous variable Yit, including a proportion or percentage
variable, or one of its transformations, such as log[(Yit-L)/(U-Yit)], log(Yit-L), or
log(U-Yit), where L and U are fixed lower and upper bounds of Yit.
(ii) F(X) represents any numerical exogenous variable Xit, including Yi,t-p, an
environmental variable, namely Zt or Zt-p, and the numerical time t.
(iii) Alternative functions of X are those that have no parameter and no constant, such as
log(Xit) for Xit > 0, 1/Xit, and Xit^α for a fixed number α ≠ 0. Compare these with the
models (6.20) in the main book, which use a different type of function, for instance
Fk(X) = C(k1)+C(k2)*X^α(k), α(k) ≠ 0, in (6.20d). That function has two parameters,
including a constant parameter, because those models use independent dummy variables Dij
instead of the function @Expand(*).
(iv) In addition, note that (aX+b), b ≠ 0, cannot be inserted for F(X) in (2.1), because the
design matrix would be singular. However, to use the function F(X) = aX+b, generate a new
variable, namely X_New = aX+b, and then use X_New in place of F(X) in (2.1).
(v) On the other hand, the inverse function F(X) = 1/(aX+b) can be applied directly.
(vi) However, only some simple alternative HRMs are presented in the following subsections.
The main objective of all HRMs in (2.1) is to study and test the differential linear
effects of F(X) on G(Y) between the cells generated by the cause or classification factors A
and B and the time period TP. As an illustration, Table 2.1 presents the slope parameters, or
the linear effects of F(X) on G(Y), of a 2x2x2 HRM in (2.1), which are denoted by the
parameters C(1) to C(8).

Table 2.1 Slope Parameters of a 2x2x2 factorial HRM in (2.1)

          A=1                              A=2                              A(1-2)
TP        B=1        B=2        B(1-2)     B=1        B=2        B(1-2)     B=1        B=2
1         C(1)       C(2)       C(1)-C(2)  C(3)       C(4)       C(3)-C(4)  C(1)-C(3)  C(2)-C(4)
2         C(5)       C(6)       C(5)-C(6)  C(7)       C(8)       C(7)-C(8)  C(5)-C(7)  C(6)-C(8)
TP(1-2)   C(1)-C(5)  C(2)-C(6)             C(3)-C(7)  C(4)-C(8)

Based on this table, hypotheses on the differential linear effects of F(X) on G(Y) between
relevant cells/groups (A=i,B=j,TP=k) can easily be defined and then tested using the Wald test.
Some of the hypotheses are as follows:
(1) Conditional on (A=i,B=j), the linear effect of F(X) on G(Y) depends on the factor TP. In
other words, F(X) has different linear effects on G(Y) between levels of the factor TP,
conditional on (A=i,B=j). As an illustration, conditional on (A=1,B=1), the statistical
hypothesis is as follows:
H0: C(1) = C(5), versus H1: C(1) ≠ C(5)
(2) Conditional on (A=i), the linear effect of F(X) on G(Y) depends on the factors B and TP.
In other words, F(X) has different linear effects on G(Y) between the cells/groups generated
by the factors B and TP, conditional on (A=i). As an illustration, conditional on (A=1),
the statistical hypothesis is as follows:
H0: C(1) = C(2) = C(5) = C(6), versus H1: Otherwise
(3) The linear effect of F(X) on G(Y) depends on the factors A, B, and TP, with the following
hypothesis.
H0: C(1)=C(2)=C(3)=C(4)=C(5)=C(6)=C(7)=C(8), versus H1: Otherwise
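
The hypotheses above can be written down mechanically from the layout of Table 2.1. The following Python sketch is an illustration only. The helper names slope_index and equality_restrictions are hypothetical, the cell-to-parameter mapping is copied from Table 2.1 (an assumption about how the estimated parameters are ordered in a particular output), and the result is just the text of the restrictions in pairwise form, not a call to any EViews function.

```python
# Slope parameter layout copied from Table 2.1: rows are TP, columns are (A, B).
# C(1)..C(4) belong to TP=1 and C(5)..C(8) to TP=2.
def slope_index(a, b, tp):
    """Return k such that C(k) is the slope of F(X) in cell (A=a, B=b, TP=tp)."""
    return (tp - 1) * 4 + (a - 1) * 2 + (b - 1) + 1


def equality_restrictions(cells):
    """Write H0 'all slopes in these cells are equal' as pairwise restrictions."""
    ks = [slope_index(*cell) for cell in cells]
    return ", ".join(f"C({i})=C({j})" for i, j in zip(ks, ks[1:]))


# Hypothesis (1): conditional on (A=1, B=1), compare TP=1 with TP=2.
h1 = equality_restrictions([(1, 1, 1), (1, 1, 2)])

# Hypothesis (2): conditional on A=1, compare the four (B, TP) cells.
h2 = equality_restrictions([(1, 1, 1), (1, 2, 1), (1, 1, 2), (1, 2, 2)])

print(h1)
print(h2)
```

Writing a chained equality as adjacent pairs keeps the number of restrictions equal to the number of cells minus one, which is the degrees of freedom of the corresponding Wald test.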

2.2.1 Heterogeneous Classical Growth Models (HCGMs)


Three types of HCGMs can be considered, as follows. Refer to the classical growth models
presented in Agung (2009a, 2011b).
(i) The HCGM for all firms or individuals has the following equation specification (ES). Note
that if the data contain hundreds of firms, then this model represents hundreds of classical
growth models, for any positive endogenous variable Yit.
log(Y) t*@Expand(Firm) @Expand(Firm) (2.2)
(ii) HCGMs by firm groups/sectors or by cause/classification factors, namely A and B, have the
following ES. Note that researchers should be very confident that all firms or research
objects within each group generated by A and B can be considered a single homogeneous
group.
log(Y) t*@Expand(A,B) @Expand(A,B) (2.3)
(iii) Piecewise HCGMs by the cause or classification factors A and B and the time periods TP
have the following ES. Note that the piecewise growth is indicated by the time periods TP.
log(Y) t*@Expand(A,B,TP) @Expand(A,B,TP) (2.4)
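
Because every slope and intercept in (2.2) to (2.4) is specific to one cell, the least squares point estimates of such a model coincide with those obtained by fitting log(Y) = a + b*t separately within each cell (the standard errors can differ, since the pooled model uses one residual variance). The sketch below illustrates the per-cell fit in plain Python. The function name fit_growth_by_group, the record layout, and the toy numbers are assumptions made for this illustration and are not taken from the source data.

```python
import math
from collections import defaultdict


def fit_growth_by_group(records):
    """Fit log(y) = a + b*t by ordinary least squares within each group.

    records: iterable of (group, t, y) with y > 0.
    Returns {group: (a, b)}; b is the estimated growth rate of that group.
    """
    by_group = defaultdict(list)
    for group, t, y in records:
        by_group[group].append((t, math.log(y)))

    estimates = {}
    for group, points in by_group.items():
        n = len(points)
        t_mean = sum(t for t, _ in points) / n
        ly_mean = sum(ly for _, ly in points) / n
        sxx = sum((t - t_mean) ** 2 for t, _ in points)
        sxy = sum((t - t_mean) * (ly - ly_mean) for t, ly in points)
        slope = sxy / sxx
        estimates[group] = (ly_mean - slope * t_mean, slope)
    return estimates


# Toy data: two cells with exact exponential growth at 5% and 2% per period.
toy = [("A=1,B=1", t, 100 * math.exp(0.05 * t)) for t in range(1, 9)]
toy += [("A=1,B=2", t, 50 * math.exp(0.02 * t)) for t in range(1, 9)]
growth = fit_growth_by_group(toy)
```

In the notation of (2.3), the second element of each pair plays the role of the coefficient of t*(A=i AND B=j) and the first element that of the cell intercept.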

2.2.2 First-order Lagged Variable HRMs


First-order lagged variable HRMs, namely LV(1)_HRMs, have the following general ES.
G(Y) Y(-1)*@Expand(A,B,TP) @Expand(A,B,TP) (2.5)

2.2.3 First-order Autoregressive HRMs


First-order autoregressive HRMs, namely AR(1)_HRMs, have the following general ES.
G(Y) F(X)*@Expand(A,B,TP) @Expand(A,B,TP) AR(1) (2.6)

2.3 HRMs having Two Numerical Exogenous Variables


2.3.1 Hierarchical IxJxK Factorial HRMs
The HRMs of an endogenous variable Y and two exogenous numerical variables, X1
and X2, have the following general equation specification:
G(Y) F1(X1)* F2(X2)*@Expand(A,B,TP) F1(X1)*@Expand(A,B,TP)
F2(X2)*@Expand(A,B,TP) @Expand(A,B,TP) (2.7)

where F1(X1), F2(X2), and G(Y), respectively, can be any functions of the exogenous variables
X1it and X2it and the endogenous variable Yit, without a parameter. A researcher could
therefore propose or define a great many possible HRMs. Note that the functions F1(X1) and
F2(X2) do not have constant terms. Refer to the notes on the function F(X) for the model
in (2.1).
As an illustration, Table 2.2 presents the parameters of a 2x2x2 factorial HRM in (2.7).
Based on the statistical results of this model and this table, the following findings, notes, and
comments are presented.

Table 2.2 Parameters of the 2x2x2 factorial HRMs in (2.7)

                 A=1                         A=2
                 B=1           B=2           B=1           B=2
Variable         TP=1   TP=2   TP=1   TP=2   TP=1   TP=2   TP=1   TP=2
F1(X1)*F2(X2)    C(1)   C(2)   C(3)   C(4)   C(5)   C(6)   C(7)   C(8)
F1(X1)           C(9)   C(10)  C(11)  C(12)  C(13)  C(14)  C(15)  C(16)
F2(X2)           C(17)  C(18)  C(19)  C(20)  C(21)  C(22)  C(23)  C(24)
Intercept        C(25)  C(26)  C(27)  C(28)  C(29)  C(30)  C(31)  C(32)

(1) The differential effects of the numerical interaction independent variable, namely
F1(X1)*F2(X2), on G(Y) between cells/groups (A=i,B=j,TP=k), for some relevant i, j, and k,
can easily be defined and tested using the Wald test. The hypotheses are similar to
the three types of hypotheses presented in Section 2.2, such as the following:
1.1 Conditional on (A=i,B=j), the adjusted effect of F1(X1)*F2(X2) on G(Y) depends on the
factor TP. In other words, F1(X1)*F2(X2) has different effects on G(Y) between levels of the
factor TP, conditional on (A=i,B=j). As an illustration, conditional on (A=1,B=1), the
statistical hypothesis is as follows:
H0: C(1) = C(2), versus H1: C(1) ≠ C(2)
1.2 Conditional on (A=i), the adjusted effect of F1(X1)*F2(X2) on G(Y) depends on the factors B
and TP. In other words, F1(X1)*F2(X2) has different effects on G(Y) between the
cells/groups generated by the factors B and TP, conditional on (A=i). As an illustration,
conditional on (A=1), the statistical hypothesis is as follows:
H0: C(1) = C(2) = C(3) = C(4), versus H1: Otherwise
(2) Within each of the cells (A=i,B=j,TP=k), the effect of F1(X1)*F2(X2) on G(Y), adjusted for
F1(X1) and F2(X2), can easily be tested using the test statistic shown in the statistical results.
However, for testing the effect of a numerical main independent variable, the following
alternative hypotheses should be considered, because it is inappropriate to test a main
effect if its interaction is in the model.
(3) The hypothesis that the effect of F1(X1) on G(Y) depends on F2(X2) within each
cell (A=i,B=j,TP=k). In other words, the hypothesis on the joint effects of F1(X1) and
F1(X1)*F2(X2) on G(Y) within each cell (A=i,B=j,TP=k). For instance, within the cell
(A=1,B=1,TP=1), the statistical hypothesis is as follows:
H0: C(1)=C(9)=0; versus H1: Otherwise
(4) The hypothesis that F1(X1) and F1(X1)*F2(X2) have different joint effects on G(Y)
between two or more relevant cells (A=i,B=j,TP=k). For instance:
4.1 Between the two cells (A=1,B=1,TP=1) and (A=1,B=1,TP=2), we have the following
statistical hypothesis.
H0: C(1)=C(2), C(9)=C(10); versus H1: Otherwise
4.2 Between the four cells (A=1,B=j,TP=k), for j, k = 1, 2, we have the following
statistical hypothesis.
H0: C(1)=C(2)=C(3)=C(4), C(9)=C(10)=C(11)=C(12); versus H1: Otherwise
(5) If one or more numerical independent variables have large p-values, say Prob. > 0.30,
within a cell (A=i,B=j,TP=k), then an acceptable reduced model should be explored. In
this case, it is recommended to apply the manual stepwise selection method, as presented in
the main book.
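
The parameter numbers used in hypotheses (1) to (4) follow directly from the position of each term and cell in Table 2.2. As a minimal sketch, assuming exactly the layout of Table 2.2 (terms in row order, with TP varying fastest within A and B), the Python helpers below reproduce those restrictions as text. The names coef_index, joint_zero, and joint_equal are hypothetical and serve only this illustration.

```python
# Row order of Table 2.2; each row holds eight parameters, one per cell.
TERMS = ["F1(X1)*F2(X2)", "F1(X1)", "F2(X2)", "Intercept"]


def coef_index(term, a, b, tp):
    """Return k such that C(k) multiplies `term` in cell (A=a, B=b, TP=tp)."""
    cell = (a - 1) * 4 + (b - 1) * 2 + (tp - 1)  # TP varies fastest in Table 2.2
    return TERMS.index(term) * 8 + cell + 1


def joint_zero(terms, cell):
    """H0: the listed terms have no effect within one cell."""
    return "=".join(f"C({coef_index(t, *cell)})" for t in terms) + "=0"


def joint_equal(terms, cells):
    """H0: each listed term has the same coefficient in all listed cells."""
    parts = []
    for t in terms:
        parts.append("=".join(f"C({coef_index(t, *c)})" for c in cells))
    return ", ".join(parts)


# Joint effects of F1(X1): its interaction term and its main term.
effect_terms = ["F1(X1)*F2(X2)", "F1(X1)"]
h3 = joint_zero(effect_terms, (1, 1, 1))
h41 = joint_equal(effect_terms, [(1, 1, 1), (1, 1, 2)])
h42 = joint_equal(effect_terms, [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2)])
```

The same helpers can be pointed at F2(X2) by using the terms F1(X1)*F2(X2) and F2(X2), which gives the joint hypotheses on the effect of F2(X2).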

2.3.2 Nonhierarchical IxJxK Factorial HRMs


Note that all HRMs in (2.7) are hierarchical two-way interaction models within each of
the cells/groups generated by the factors A, B, and TP. In practice, however, either one of the
following nonhierarchical reduced models could be a good fit model.

(1) Nonhierarchical two-way interaction models:


These HRMs have the following alternative equation specifications.

G(Y) F1(X1)*F2(X2)*@Expand(A,B,TP)
F1(X1)*@Expand(A,B,TP) @Expand(A,B,TP) (2.8)

G(Y) F1(X1)*F2(X2)*@Expand(A,B,TP)
F2(X2)*@Expand(A,B,TP) @Expand(A,B,TP) (2.9)

G(Y) F1(X1)*F2(X2)*@Expand(A,B,TP) @Expand(A,B,TP) (2.10)

(2) Additive models within each cell/group (A=i,B=j,TP=k):
These HRMs have the following general equation specification:
G(Y) F1(X1)*@Expand(A,B,TP) F2(X2)*@Expand(A,B,TP)
@Expand(A,B,TP) (2.11)

One of the HRMs, which has been widely applied is the translog linear model having the
following ES, for positive variables X1, X2, and Y.
log(Y) log(X1)*@Expand(A,B,TP) log(X2)*@Expand(A,B,TP)
@Expand(A,B,TP) (2.12)

This model could be extended to a bounded translog linear model having the following ES,
where L and U are the fixed lower and upper bound of Yit.
log((Y-L)/(U-Y)) log(X1)*@Expand(A,B,TP) log(X2)*@Expand(A,B,TP)
@Expand(A,B,TP) (2.13)
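
The dependent variable in (2.13) maps a bounded Y onto the whole real line, so fitted values converted back to the original scale always stay strictly between L and U. The sketch below shows the transformation and its inverse in Python. The function names are hypothetical, and the inverse formula is derived here from the definition log((Y-L)/(U-Y)) rather than quoted from the source.

```python
import math


def bounded_logit(y, lower, upper):
    """Map y in (lower, upper) to the real line: log((y - L) / (U - y))."""
    if not lower < y < upper:
        raise ValueError("y must lie strictly between the bounds")
    return math.log((y - lower) / (upper - y))


def inverse_bounded_logit(z, lower, upper):
    """Map a fitted value z back to the original scale, inside (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-z))


# A percentage variable with L = 0 and U = 100.
z_mid = bounded_logit(50.0, 0.0, 100.0)
y_back = inverse_bounded_logit(bounded_logit(30.0, 0.0, 100.0), 0.0, 100.0)
```

The midpoint of the interval maps to zero, and values close to either bound map to large negative or large positive numbers.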

On the other hand, for comparison, the following ES represents the worst additive
model. Refer to Table 1.12, which shows the worst 2x2x2 factorial ANOVA model.
log(Y) C log(X1) log(X2) @Expand(A,@Droplast)
@Expand(B,@Droplast) @Expand(TP,@Droplast) (2.14)

Referring to the simple models in (2.2) up to (2.4), then the exogenous variables X1, and
X2, could be the time t as a numerical environmental variable, and Y(-1). In addition, either one
of X1, and X2, or both can be replaced by X1(-1), and X2(-1). Some of the models are as follows:

2.3.1 Piecewise HRMs with trends, having the general ES as follows:


G(Y) F1(X1)*@Expand(A,B,TP) t*@Expand(A,B,TP) @Expand(A,B,TP) (2.15)

2.3.2 Piecewise HRMs with the Time-Related-Effects (TRE), having the general
ES as follows:

G(Y) t*F1(X1)*@Expand(A,B,TP) t*@Expand(A,B,TP)
F1(X1)*@Expand(A,B,TP) @Expand(A,B,TP) (2.16)

2.3.3 HRMs with an Environmental Variable, Zt, having the general ES as follows:
G(Y) F1(X1)* F2(Z)*@Expand(A,B,TP) F1(X1)*@Expand(A,B,TP)
F2(Z)*@Expand(A,B,TP) @Expand(A,B,TP) (2.17)

2.3.4 LV(1)_HRMs, having the general ES as follows:


G(Y) Y(-1)*F1(X1)*@Expand(A,B,TP) Y(-1)*@Expand(A,B,TP)
F1(X1)*@Expand(A,B,TP) @Expand(A,B,TP) (2.18)

2.3.5 AR(1)_HRMs, having the general ES as follows:


G(Y) F1(X1)* F2(X2)*@Expand(A,B,TP) F1(X1)*@Expand(A,B,TP)
F2(X2)*@Expand(A,B,TP) @Expand(A,B,TP) AR(1) (2.19)

2.4 HRMs having three Numerical Exogenous Variables


2.4.1 General Equation
The HRMs of an endogenous variable Y and three exogenous numerical variables, X1, X2,
and X3, have the following general equation specification:

G(Y) F1(X1)*F2(X2)*F3(X3)*@Expand(A,B,TP) F1(X1)*F2(X2)*@Expand(A,B,TP)
F1(X1)*F3(X3)*@Expand(A,B,TP) F2(X2)*F3(X3)*@Expand(A,B,TP)
F1(X1)*@Expand(A,B,TP) F2(X2)*@Expand(A,B,TP)
F3(X3)*@Expand(A,B,TP) @Expand(A,B,TP) (2.20)

where F1(X1), F2(X2), F3(X3), and G(Y), respectively, can be any functions of the exogenous
variables X1it, X2it, and X3it and the endogenous variable Yit, without a parameter, and the
functions F1(X1), F2(X2), and F3(X3) do not have constant terms. Refer to the notes on the
various functions F(X) in the model (2.1).

Table 2.3 Parameters of the 2x2x2 factorial HRMs in (2.20)

                      A=1                         A=2
                      B=1           B=2           B=1           B=2
Variable              TP=1   TP=2   TP=1   TP=2   TP=1   TP=2   TP=1   TP=2
F1(X1)*F2(X2)*F3(X3)  C(1)   C(2)   C(3)   C(4)   C(5)   C(6)   C(7)   C(8)
F1(X1)*F2(X2)         C(9)   C(10)  C(11)  C(12)  C(13)  C(14)  C(15)  C(16)
F1(X1)*F3(X3)         C(17)  C(18)  C(19)  C(20)  C(21)  C(22)  C(23)  C(24)
F2(X2)*F3(X3)         C(25)  C(26)  C(27)  C(28)  C(29)  C(30)  C(31)  C(32)
F1(X1)                C(33)  C(34)  C(35)  C(36)  C(37)  C(38)  C(39)  C(40)
F2(X2)                C(41)  C(42)  C(43)  C(44)  C(45)  C(46)  C(47)  C(48)
F3(X3)                C(49)  C(50)  C(51)  C(52)  C(53)  C(54)  C(55)  C(56)
Intercept             C(57)  C(58)  C(59)  C(60)  C(61)  C(62)  C(63)  C(64)

As an illustration, Table 2.3 presents the parameters of a 2x2x2 factorial HRM in (2.20).
Based on the statistical results of this model and Table 2.3, various hypotheses similar to
those based on Table 2.2 can easily be defined and tested. This is left as an exercise.
In addition, if a reduced model should be explored, it is recommended to conduct the
analysis using the manual stepwise selection method.
Note that all HRMs in (2.20) are hierarchical three-way interaction models within each of
the cells/groups generated by the factors A, B, and TP. In practice, however, a good-fit
acceptable model could be a nonhierarchical three-way interaction model, a hierarchical or
nonhierarchical two-way interaction model, or an additive model. Refer to the models in
(2.7) to (2.11).
Furthermore, note that the exogenous variables X1it, X2it, and X3it can be any
numerical variables, as presented for the independent variable X of the models in (2.1). Refer to
the models in (2.12) to (2.16).

2.4.2 Some Specific HRMs
Corresponding to all possible HRMs in (2.20), Figure 2.1 presents path diagrams of
three specific up-and-down or seemingly causal relationships (SCMs), based on an
endogenous variable Yit and two exogenous variables X1it and X2it, within each cell or
group generated by A, B, and TP. These path diagrams have the following characteristics:

[Path diagrams. Panel (a): Y(-1), X1, X2, and Y. Panel (b): Y(-1), X1, X2(-1), and Y.
Panel (c): Y(-1), X1(-1), X2(-1), and Y.]

Figure 2.1 Alternative up-and-down relationships based on the variables LY, X1 and X2
(1) The arrows with dashed lines indicate that the causal or up-and-down relationship
between the corresponding pair of variables is not modeled as a relationship between a
dependent and an independent variable. Instead, it is taken into account in the model of Y
by using an interaction, either two- or three-way, as an independent variable, as
presented in the models (2.20), as follows:
1.1 Figure 2.1(a) shows that the effect of X1 on Y depends on Y(-1) and X2. Therefore the
interactions Y(-1)*X1*X2, Y(-1)*X1, and X1*X2 should be used as possible independent
variables of the model, in a theoretical sense. In addition, the effect of X2 on Y depends on
Y(-1), so the interaction Y(-1)*X2 should also be taken into account as an independent
variable of the model. Finally, the three main variables Y(-1), X1, and X2 are included. Hence,
the general model has the following ES:
Y Y(-1)*X1*X2*@Expand(A,B,TP) Y(-1)*X1*@Expand(A,B,TP)
Y(-1)*X2*@Expand(A,B,TP) X1*X2*@Expand(A,B,TP) Y(-1)*@Expand(A,B,TP)
X1*@Expand(A,B,TP) X2*@Expand(A,B,TP) @Expand(A,B,TP) (2.20a)

However, since the seven exogenous variables are, in general, highly correlated, a good-fit
model within each cell or group would have only some of the seven numerical
independent variables, which should be selected using the manual stepwise selection method.
Refer to Example 2.2.
1.2 Similarly, Figure 2.1(b) represents the model of Y on Y(-1)*X1*X2(-1), Y(-1)*X1,
Y(-1)*X2(-1), X1*X2(-1), Y(-1), X1, and X2(-1).
1.3 Finally, Figure 2.1(c) represents the model of Y on Y(-1)*X1(-1)*X2(-1), Y(-1)*X1(-1),
Y(-1)*X2(-1), X1(-1)*X2(-1), Y(-1), X1(-1), and X2(-1).
(2) These models can be extended to models with trends, models with time-related
effects, and models with environmental variable(s), similar to the models in (2.15),
(2.16), and (2.17), respectively.
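
The seven numerical terms of a hierarchical model such as (2.20a) are simply all non-empty products of the three main variables. As a small illustration, the Python sketch below enumerates them and assembles the ES text in the notation used above. The helper names hierarchical_terms and build_es are hypothetical, and the output is only a string for inspection, not something checked against EViews.

```python
from itertools import combinations


def hierarchical_terms(variables):
    """List every main variable and interaction, highest order first."""
    terms = []
    for size in range(len(variables), 0, -1):
        for combo in combinations(variables, size):
            terms.append("*".join(combo))
    return terms


def build_es(dependent, variables, factors):
    """Assemble an equation specification in the notation of (2.20a)."""
    expand = "@Expand(" + ",".join(factors) + ")"
    pieces = [dependent]
    pieces += [f"{term}*{expand}" for term in hierarchical_terms(variables)]
    pieces.append(expand)
    return " ".join(pieces)


# Path diagram (a): Y on Y(-1), X1, and X2.
terms_a = hierarchical_terms(["Y(-1)", "X1", "X2"])
es_a = build_es("Y", ["Y(-1)", "X1", "X2"], ["A", "B", "TP"])

# Path diagram (c): Y on Y(-1), X1(-1), and X2(-1).
terms_c = hierarchical_terms(["Y(-1)", "X1(-1)", "X2(-1)"])
```

With three main variables and a 2x2x2 design, the full hierarchical model has 8 x 8 = 64 parameters, which is one reason a reduced model is usually explored within each cell.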

2.5 Advanced HRMs


2.5.1 Polynomial HRMs
As an extension of the simplest HRMs in (2.1), polynomial HRMs have the following
general equation specification.
G(Y) F(X)*@Expand(Group,TP) … F(X)^k*@Expand(Group,TP)
… F(X)^K*@Expand(Group,TP) @Expand(Group,TP) (2.22)
where the categorical variable Group can be generated by one or more variables, and TP is a
time-period variable. Refer to all possible functions of the variable X for the models in (2.1).
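
To make the ellipses in (2.22) concrete, the following Python sketch writes out the full specification for a chosen degree K. It is a text-building illustration only: the function name polynomial_es is hypothetical, and the F(X)^k notation is copied from (2.22) without any claim about how a particular EViews version parses it.

```python
def polynomial_es(dependent, regressor, factors, degree):
    """Build the ES of a polynomial HRM of the given degree, as in (2.22)."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    expand = "@Expand(" + ",".join(factors) + ")"
    pieces = [dependent]
    for k in range(1, degree + 1):
        power = regressor if k == 1 else f"{regressor}^{k}"
        pieces.append(f"{power}*{expand}")
    pieces.append(expand)
    return " ".join(pieces)


# Third-degree polynomial HRM by Group and TP.
es_cubic = polynomial_es("G(Y)", "F(X)", ["Group", "TP"], 3)
print(es_cubic)
```

A polynomial of degree K has K + 1 parameters per cell, so the number of cells generated by Group and TP limits how large K can sensibly be.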

2.5.2 General Equation Specification


As an extension of the HRMs in (2.20), which have three numerical independent variables,
advanced HRMs of G(Y) are presented using the following general equation specification.
G(Y) V1*@Expand(Group,TP) … Vk*@Expand(Group,TP)
… VK*@Expand(Group,TP) @Expand(Group,TP) (2.23)
where G(Y) is a function of an endogenous variable Yit without a parameter; Vk can be any main
factor or variable (refer to all possible choices of the variable X in ES (2.1)) or a two- or
three-way interaction of specifically selected main factors; the categorical variable Group can be
generated by one or more variables; and TP is a time-period variable. Note that, instead of
the time period TP, the time t (year or other) can also be used as the categorical independent
variable, if and only if the time t is not used as a numerical variable. Refer to the models in
(2.2), (2.3), and (2.4).

Note that these models can easily be modified to the following HRMs.

2.5.2.1 HRMs by Individuals/Firm, and TP


The general equation of these HRMs is as follows:
G(Y) C V1*@Expand(Firm,TP) … Vk*@Expand(Firm,TP)
… VK*@Expand(Firm,TP) @Expand(Firm,TP,@Drop(*)) (2.23a)
Note that this ES presents a set of N time series models, one for each firm in the sample, by
the time periods TP.

2.5.2.2 HRMs by Cell-Factor(CF), and Time-T


The general equation of these HRMs is as follows:
G(Y) C V1*@Expand(CF,T) … Vk*@Expand(CF,T)
… VK*@Expand(CF,T) @Expand(CF,T,@Drop(*)) (2.23b)
where CF is a cell factor or a group variable, which should be invariant or constant over time,
and which can be generated based on one or more variables.

2.5.2.3 Selected Specific HRMs


Several specific alternative HRMs having the general ES (2.23), which need to be
considered, are as follows:
(1) Translog linear HRMs by the Group and TP, with the ES as follows:
log(Y) log(X1)*@Expand(Group,TP) … log(Xk)*@Expand(Group,TP)
… log(XK)*@Expand(Group,TP) @Expand(Group,TP) (2.24)
where Y and Xk, for all k = 1, …, K, are positive variables. These models are an extension of the
Cobb-Douglas production functions by Group and TP.
(2) HRMs with trend by the Group, with the ES as follows:
G(Y) V1*@Expand(Group) … Vk*@Expand(Group)
t*@Expand(Group) @Expand(Group) (2.25)
(3) HRMs with the Time-Related-Effects by the Group, with the ES as follows:
G(Y) V1*@Expand(Group) … Vk*@Expand(Group) t*@Expand(Group)
t*V1*@Expand(Group) … t*Vk*@Expand(Group) @Expand(Group) (2.26)

2.6 Various Alternative HRMs
All equation specifications above can easily be applied to the following models.
(1) Models having the numerical endogenous variable presented in subsection 1.3.1, such as
LS regressions, quantile regressions, and instrumental variables models.
(2) Binary choice (probit, logit, and extreme value) models, for a dummy problem indicator.
(3) Ordered choice (probit, logit, and extreme value) models, for an ordinal problem
indicator, as presented in subsection 1.3.1.3.
(4) Firm, or cross-section, fixed-effects HRMs and time, or period, fixed-effects HRMs,
as extensions of the FEMs presented in Section 1.4.3.
The data analyses are left as exercises. The following examples present the application
of the manual stepwise selection method, which is unfamiliar to most readers, and
some selected specific models.

2.7 Application of the Manual Stepwise Selection Method


To obtain a reduced model that is acceptable in both a theoretical and a statistical
sense, it is recommended to apply the manual stepwise selection method (Agung, 2011).
Applying the STEPLS (Stepwise Least Squares) estimation method once to all possible
independent variables may give unexpected statistical results or the worst regression:
important exogenous (cause, source, or upstream) variables, the interaction variables in
particular, might not be selected as independent variables.

2.7.1 Empirical Statistical Results


The following examples present two empirical statistical results, of a common
heterogeneous regression model (HRM) and of the simplest lagged variable model, respectively,
based on the data in CES.wf1.

Example 2.1 (A two-way interaction HRM by two dichotomous factors) Table 2.4 presents
a summary of the statistical results of an HRM of the numerical variable LY = log(Y) on LK =
log(K) and LL = log(L) by two dichotomous factors, Group and TP = 1+1*(Year>14), in the data
CES.wf1. The stages of analysis are as follows:

Table 2.4 Summary of statistical results
Dependent Variable: LY
Method: Panel Least Squares
Date: 03/22/13 Time: 13:08
Sample: 1 28
Periods included: 28
Cross-sections included: 82
Total panel (balanced) observations: 2296

                                Stages-1,2,&3             Stage-4             Stage-5
Variable                       Coeff.     Prob.    Coeff.     Prob.    Coeff.     Prob.
GROUP=1 AND TP=1                4.476     0.000    -1.695     0.280    -1.695     0.280
GROUP=1 AND TP=2                1.747     0.000    -1.103     0.515     1.747     0.000
GROUP=2 AND TP=1                7.856     0.000    -2.354     0.285    -2.354     0.285
GROUP=2 AND TP=2                4.527     0.000    -2.121     0.376    -2.121     0.377
LK*LL*(GROUP=1 AND TP=1)        0.011     0.000    -0.005     0.198    -0.005     0.198
LK*LL*(GROUP=1 AND TP=2)        0.005     0.000    -0.002     0.693     0.005     0.000
LK*LL*(GROUP=2 AND TP=1)        0.011     0.000    -0.016     0.006    -0.016     0.006
LK*LL*(GROUP=2 AND TP=2)        0.007     0.000    -0.010     0.114    -0.010     0.114
LK*(GROUP=1 AND TP=1)           0.621     0.000     0.865     0.000     0.865     0.000
LK*(GROUP=1 AND TP=2)           0.806     0.000     0.916     0.000     0.806     0.000
LK*(GROUP=2 AND TP=1)           0.450     0.000     0.865     0.000     0.865     0.000
LK*(GROUP=2 AND TP=2)           0.655     0.000     0.918     0.000     0.918     0.000
LL*(GROUP=1 AND TP=1)                               0.407     0.000     0.407     0.000
LL*(GROUP=1 AND TP=2)                               0.185     0.085
LL*(GROUP=2 AND TP=1)                               0.681     0.000     0.681     0.000
LL*(GROUP=2 AND TP=2)                               0.429     0.005     0.429     0.005
R-squared                       0.970               0.970               0.970
Adjusted R-squared              0.969               0.970               0.970
S.E. of regression              0.336               0.333               0.333
Sum squared resid             258.125             252.648             252.977
Log likelihood               -748.953            -724.330            -725.825
Durbin-Watson stat              0.032               0.034               0.034

The statistical results of the first three stages are presented in one column, since no reduced
model needs to be developed at any of these stages. The equation specifications (ESs)
used are as follows:
(1) At Stage-1, the ES is LY @Expand(Group,TP), which represents a two-way ANOVA
model. At this stage, no reduced model should be developed, even if one or more
dummy variables have large p-values.
1.1 At Stage-2, the ES is LY @Expand(Group,TP) LK*LL*@Expand(Group,TP). At this stage,
no reduced model should be developed, since all numerical independent variables have
significant effects on LY. In my view, a numerical interaction independent variable
should be kept in the model if it has a p-value < 0.30, because it then has
either a positive or a negative significant effect on LY at the α = 0.15 level of
significance (Lapin, 1973). For comparison, Hosmer & Lemeshow (2000)
propose keeping an independent variable if it has a p-value < 0.25.
1.2 The final statistical results of the first three stages are obtained using the following ES:
LY @Expand(Group,TP) LK*LL*@Expand(Group,TP) LK*@Expand(Group,TP).
1.3 The statistical results show that each independent variable LK*LL*(Group=i and TP=j) has
a significant effect, so the model does not have to be reduced.
(2) At Stage-4, the function LL*@Expand(Group,TP) is inserted as an additional set of
independent variables. The statistical results show that the variable LK*LL*(Group=1 and
TP=2) has the largest p-value, 0.693. In a statistical sense, a reduced model should be
explored. However, this interaction independent variable should not be deleted from the
model, because it is a more important independent variable than LL*(Group=1 and TP=2).
Therefore the newly added independent variable, namely LL*(Group=1 and TP=2), should
be deleted from the model, even though it has a smaller p-value of 0.085.
(3) At Stage-5, the final results are obtained by using LL*@Expand(Group,TP,@Drop(1,2)) to
replace the function LL*@Expand(Group,TP).
(4) In addition, note that at the α = 0.10 level of significance, each of LK*LL*(Group=1 and
TP=1) and LK*LL*(Group=2 and TP=2) has a negative significant effect on LY, with one-sided
p-values of 0.198/2 = 0.099 and 0.114/2 = 0.057, respectively. The final model can therefore
be considered acceptable, in both a theoretical and a statistical sense, for showing that
the effect of LK (or LL) on LY depends significantly on LL (or LK).
(5) Furthermore, note that the statistical results of the three models have small values of the
Durbin-Watson statistic. The models can be improved by using the lag(s)
of the dependent variable. The simplest lagged HRM is obtained by inserting the
independent variable LY(-1)*@Expand(Group,TP) in the final model, or at the second stage
of data analysis. This is left as an exercise. See the special LV(1)_HRMs presented in the
following example.
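
The decision taken at Stage-4 and the halving of p-values in note (4) can be traced with a few lines of Python. This sketch is an illustration under stated assumptions: it reduces the author's judgment to a mechanical rule (flag any cell in which a term kept from an earlier stage reaches a two-sided p-value of 0.30 or more, and drop the newly inserted term in that cell), and the function names cells_to_drop and one_sided_p are hypothetical. The p-values and coefficients are copied from Table 2.4.

```python
def cells_to_drop(earlier_pvalues, threshold=0.30):
    """Cells in which an earlier term's p-value reaches the threshold.

    earlier_pvalues: {term: {cell: two-sided p-value}} for terms kept from
    previous stages, re-estimated after the new term has been inserted.
    """
    flagged = set()
    for by_cell in earlier_pvalues.values():
        for cell, p in by_cell.items():
            if p >= threshold:
                flagged.add(cell)
    return sorted(flagged)


def one_sided_p(two_sided_p, coefficient, direction):
    """One-sided p-value for H1 'effect < 0' (direction=-1) or '> 0' (+1)."""
    if coefficient * direction > 0:
        return two_sided_p / 2
    return 1 - two_sided_p / 2


# Stage-4 p-values of the earlier numerical terms, copied from Table 2.4.
# Cells are (Group, TP).
stage4 = {
    "LK*LL": {(1, 1): 0.198, (1, 2): 0.693, (2, 1): 0.006, (2, 2): 0.114},
    "LK": {(1, 1): 0.000, (1, 2): 0.000, (2, 1): 0.000, (2, 2): 0.000},
}
drop = cells_to_drop(stage4)

# Stage-5 interaction terms tested against a negative effect.
p_11 = one_sided_p(0.198, -0.005, direction=-1)
p_22 = one_sided_p(0.114, -0.010, direction=-1)
```

Only the numerical terms are screened here; the intercepts are left out of the rule because, as stated for Stage-1, the cell intercepts are kept regardless of their p-values.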

Example 2.2 (Application of LV(1)_HRMs) Referring to the path diagram in Figure 2.1(a),
Figure 2.2 presents the statistical results of the final model of a special LV(1)_HRM of LY =
log(Liability) on X1 = log(Sale) and X2 = log(Size), by two dichotomous factors A and B, in
Special_BPD.wf1, using the following equation specification.
ly @expand(a,b) ly(-1)*x1*x2*@expand(a,b,@drop(1,1),@drop(2,1))
ly(-1)*x1*@expand(a,b) ly(-1)*x2*@expand(a,b,@drop(2,2))
x1*x2*@expand(a,b,@drop(2,2)) ly(-1)*@expand(a,b,@drop(1,1),@drop(1,2))
x1*@expand(a,b,@drop(1,2),@drop(2,1))
x2*@expand(a,b,@drop(1,1),@drop(2,2)) (2.27)

Based on the ES in (2.27) and its statistical results presented in Figure 2.2, the following
notes are presented.
(1) The ES in (2.27) shows eight groups of independent variables, so there are eight stages
of data analysis, as follows:
1.1 At Stage-1, the ES applied is "LY @Expand(a,b)", which is an ANOVA model. At this
stage, no reduced model should be developed, even though two of the means of LY have very
large p-values, because the final model should have four intercepts.
1.2 At Stage-2, the ES applied is "LY @Expand(a,b) LY(-1)*X1*X2*@Expand(a,b)", but it is
found that LY(-1)*X1*X2*(A=1 and B=1) and LY(-1)*X1*X2*(A=2 and B=1) have large
p-values of 0.3305 and 0.3958, respectively, which are greater than 0.30. So I decided to
reduce the model by using LY(-1)*X1*X2*@Expand(a,b,@Drop(1,1),@Drop(2,1)), as presented in
the ES (2.27).
1.3 Similar steps can easily be carried out by inserting each of the other functions, one at a
time, starting from LY(-1)*X1*@Expand(a,b) up to X2*@Expand(a,b), and deleting one or two of
the newly inserted independent variables, in order to keep the independent variables of the
previous model at p-values < 0.25 (or 0.30).

Dependent Variable: LY
Method: Panel Least Squares
Sample (adjusted): 2 8
Periods included: 7
Cross-sections included: 218
Total panel (unbalanced) observations: 1436

Variable                         Coeff.     s.e.   t-Stat.   Prob.
A=1 AND B=1                       1.711    0.139    12.286   0.000
A=1 AND B=2                       2.778    0.156    17.853   0.000
A=2 AND B=1                       6.008    0.370    16.219   0.000
A=2 AND B=2                       6.196    0.810     7.645   0.000
LY(-1)*X1*X2*(A=1 AND B=2)       -0.022    0.014    -1.578   0.115
LY(-1)*X1*X2*(A=2 AND B=2)        0.002    0.001     3.387   0.001
LY(-1)*X1*(A=1 AND B=1)           0.061    0.008     7.408   0.000
LY(-1)*X1*(A=1 AND B=2)           0.076    0.006    13.675   0.000
LY(-1)*X1*(A=2 AND B=1)           0.054    0.006     8.587   0.000
LY(-1)*X1*(A=2 AND B=2)           0.068    0.013     5.081   0.000
LY(-1)*X2*(A=1 AND B=1)          -0.151    0.025    -5.954   0.000
LY(-1)*X2*(A=1 AND B=2)           0.299    0.074     4.024   0.000
LY(-1)*X2*(A=2 AND B=1)          -0.269    0.062    -4.333   0.000
X1*X2*(A=1 AND B=1)               0.204    0.021     9.800   0.000
X1*X2*(A=1 AND B=2)              -0.105    0.058    -1.815   0.070
X1*X2*(A=2 AND B=1)               0.140    0.046     3.062   0.002
LY(-1)*(A=2 AND B=1)             -0.256    0.090    -2.853   0.004
LY(-1)*(A=2 AND B=2)             -0.276    0.114    -2.425   0.015
X1*(A=1 AND B=1)                  0.283    0.057     5.005   0.000
X1*(A=2 AND B=2)                 -0.073    0.101    -0.728   0.467
X2*(A=1 AND B=2)                 -0.230    0.314    -0.734   0.463
X2*(A=2 AND B=1)                  1.142    0.273     4.178   0.000

R-squared                 0.875    Mean dependent var       5.791
Adjusted R-squared        0.873    S.D. dependent var       1.908
S.E. of regression        0.680    Akaike info criterion    2.082
Sum squared resid       654.126    Schwarz criterion        2.163
Log likelihood        -1473.020    Hannan-Quinn criter.     2.112
Durbin-Watson stat        1.472
Figure 2.2 Statistical results of a LV(1)_HRM using the equation specification in (2.27)

(2) Note that the independent variables X1*(A=2,B=2) and X2*(A=1,B=2) do not have to be
deleted from the final model, even though each has a large p-value, because their interactions,
namely LY(-1)*X1*X2*(A=2,B=2) and LY(-1)*X1*X2*(A=1,B=2), have significant effects
on LY. However, if you wish to delete them, an acceptable reduced model would still be
obtained.
(3) Since the model has a DW statistic of 1.472, an AR(1) model is applied. The resulting
model has a DW statistic of 2.0528, but some of the interaction independent variables
have large p-values; for example, LY(-1)*X1*X2*(A=1,B=2) has a p-value of 0.6052. Hence a
reduced model should be explored, which is left as an exercise.
(4) The causal or up-and-down stream relationships between the numerical variables LY, X1 and
X2 are in fact defined based on the path diagram in Figure 2.1(a), which shows that LY(-1)
is an upstream variable of the endogenous variable LY and of both exogenous variables X1
and X2. In addition, X2 = log(Size) is an upstream variable of X1 = log(Sale). So, in a
theoretical sense, the effect of LY(-1) on LY should depend on X1 and X2. Hence the
interactions between the variables LY(-1), X1, and X2 should be used as independent variables
within each cell generated by the factors A and B. This model could be extended to
models by the factors A, B, and TP, as well as to LV(p)_Models, for p > 1.
(5) On the other hand, the simplest reduced model is the translog linear model by the two factors
A and B, since x1 = log(Sale) and x2 = log(Size), with the ES as follows:
ly @expand(a,b) ly(-1)*@expand(a,b) x1*@expand(a,b) x2*@expand(a,b) (2.28)

2.7.2 Special Notes and Comments


Referring to the statistical results of the models presented in the above examples, it is
recognized that the good fit models obtained are highly dependent on the order in which the
numerical independent variables are inserted, step by step, into the models. For this reason, it is
important to present the following notes and comments.
(1) Referring to the full model in Example 2.1, the following notes are presented.
1.1 The sets of numerical variables LK*LL*@Expand(Group,TP), LK*@Expand(Group,TP),
and LL*@Expand(Group,TP) could be inserted in 3! (= 3 factorial) = 1*2*3 = 6 possible
orderings or permutations. Do it as an exercise.
1.2 It happens that the estimates obtained at Stage-4 are very simple, where only one of the
independent variables, namely LK*LL*(GROUP=1 AND TP=2), has a large p-value > 0.30.
So a reduced model could easily be obtained. One possible reduced model is
presented at Stage-5, in Table 2.1.

(2) Now, referring to the full model of the reduced model in Example 2.2, the following notes
are presented.
2.1 It is found that the statistical results based on the full model of the reduced model in (2.27)
would have many independent variables with large p-values, say p-values > 0.30. So,
by inserting the independent variables LY(-1)*X1*X2*@Expand(a,b) up to
X2*@Expand(a,b) in different orderings, several other good fit models would be obtained.
In this case, we have 7! = 5,040 possible orderings or permutations.
2.2 However, it is recommended to insert the highest order interaction numerical variables at the
first stage, followed by the lower order interactions of selected main variables/factors which
are considered the most important upstream (cause, or source) variables, then the most
important main variables, and finally other additional upstream variable(s).
2.3 So a researcher has to use his/her knowledge and experience (judgment) to select
the best possible ordering, in a theoretical sense, or some alternative orderings for a
comparison of results.
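
As an illustration of notes (1) and (2), the number of possible insertion orderings can be counted with a short script. This is only a sketch in Python, not an EViews program; the term lists are the numerical parts of the functions named above, and the helper names interaction_order and recommended_start are hypothetical. The sort mirrors only the first part of note 2.2 (highest order interactions first); the choice among terms of equal order remains a matter of judgment.

```python
from itertools import permutations
from math import factorial

# Numerical parts of the terms in notes (1) and (2); in the actual ES each one
# is multiplied by @Expand(Group,TP) or @Expand(a,b).
EX21_TERMS = ["LK*LL", "LK", "LL"]
EX22_TERMS = ["LY(-1)*X1*X2", "LY(-1)*X1", "LY(-1)*X2", "X1*X2",
              "LY(-1)", "X1", "X2"]


def interaction_order(term):
    """Number of numerical variables multiplied together in a term."""
    return len(term.split("*"))


def recommended_start(terms):
    """Put the highest order interactions first (first part of note 2.2).

    Terms of equal order keep their listed order.
    """
    return sorted(terms, key=interaction_order, reverse=True)


orderings_ex21 = list(permutations(EX21_TERMS))
print(len(orderings_ex21))          # number of orderings for Example 2.1
print(factorial(len(EX22_TERMS)))   # number of orderings for Example 2.2
print(recommended_start(EX22_TERMS))
```

The two counts reproduce the 6 and 5,040 orderings stated in notes 1.1 and 2.1.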

2.8 Cross Section Fixed-Effects Models


Fixed-effects models are in fact ANCOVA models. Based on cross-section data, Agung
(2011, and 2006) has presented special notes that ANCOVA models are not recommended
models, because they assume that the covariates have the same
effects on the corresponding endogenous variable within all groups generated by the factors
considered, and this assumption is not valid in general. In addition, an additive
ANCOVA model is considered the worst model among alternative ANCOVA models having
the same set of numerical variables.
Corresponding to the ANCOVA models, I would also consider that fixed-effects models
are not recommended models. However, fixed-effects models are acceptable models in a
statistical sense, since they have been widely applied by many researchers and presented
in textbooks, such as Baltagi (2009), Gujarati (2003), and Wooldridge (2002). This section
presents special fixed-effects models, namely the cross-section fixed-effects models (CSFEMs),
with special notes and comments, compared to other alternative models.

2.8.1 CSFEMs Based On a Variable Yit
2.8.1.1 Classical Growth Model by Firms, and its CSFEM
As a modification of the HCGM in (2.2), the classical growth model of Yit, i =1, … ,N;
t=1, … ,T, by firms would be considered the simplest heterogeneous regressions model,
namely the HCGM by firms. The model would have the equation as follows:
ln(Yit) = αi + βi*t + εit, for i =1, …, N (2.29)
where αi and βi, respectively, are the intercept and the exponential growth parameters of the
model for the firm i, and Yit is a positive endogenous variable. The model (2.29) thus
represents the N firms as having N different growth rates, indicated by the parameters βi, for
i=1,…,N. Furthermore, note that the N regression functions of ln(Y) on the time t are in fact a
set of N time-series functions, and they can be presented graphically using N heterogeneous
regression lines in a two-dimensional coordinate system.
For the analysis using EViews, the following equation specification (ES) can be applied.
log(Y) C t*@Expand(Firm) @Expand(Firm,@Dropfirst) (2.30)
The most important reduced model of this model (2.30) is a cross-section fixed-effects
model (CSFEM), or firm-fixed-effects model, having the ES as follows:
log(Y) C t @Expand(Firm,@Dropfirst) (2.31)
Note that this model can be considered a one-way ANCOVA model of log(Y) by a
single factor FIRM, with the time t as a covariate. Based on panel data, a CSFEM can present
hundreds or thousands of firms as having the same growth rate, indicated by the parameter C(2).
For sure, this condition would never be observed in practice. Moreover, a single classical growth
model (CGM) has the following equation:
ln(Yit) = α + β*t + εit, (2.32)
where α and β are fixed parameters for all N*T firm-time observations. So this model
would be considered the worst continuous panel data model. The same holds for all continuous
panel data models.

Example 2.3 (Application of CSFEM in (2.31)) Figure 2.3 presents a summary of the statistical
results of the CSFEM in (2.31), based on CES.wf1, and the omitted variables test of
t*@Expand(Firm,@Dropfirst), for the following hypothesis.
H0 : The CSFEM in (2.31), versus H1: The HCGM by Firms in (2.30)

Based on this summary, the following notes and comments are presented.
(1) The CSFEM represents 82 parallel regression lines of LY on the numerical time t, for
t=1,…,28. The model has only 81 = (82-1) firm dummies, because the model contains the
intercept parameter ‘C’. However, the list of their coefficients is not presented.

Figure 2.3 Summary of the statistical results of CSFEM in (2.31), and an Omitted Variables test

(2) It is found that t*@Expand(Firm,@Dropfirst) has a significant effect, based on the
F-statistic of F0 = 65.412, with df = (81,2132) and a p-value of 0.0000. In other words, the data
supports the HCGM by Firms.
(3) It is also found that even for the first three firms, t*@Expand(Firm,@Dropfirst) has a
significant effect, based on the F-statistic of F0 = 10.11448, with df = (2,78) and a p-value of
0.0001. So it can be said that the CSFEM is not an acceptable model, in both a theoretical
and a statistical sense, and even less so the continuous classical growth model in (2.4), which
is the worst growth model for all panel data.
(4) Note that the model has a very small DW-statistic. Hence it is recommended to apply a
lagged variable or autoregressive model. See the following example.
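
The degrees of freedom reported in notes (2) and (3) can be checked by counting the coefficients of the nested growth models. The following Python sketch does this under the assumption of a balanced panel (every firm observed at t=1,…,28, as stated in note (1)); the function names are hypothetical and the script is not part of the EViews output.

```python
def growth_model_params(n_firms):
    """Number of coefficients in each growth model for n_firms firms."""
    return {
        "CGM (2.32)": 2,               # one intercept and one slope
        "CSFEM (2.31)": n_firms + 1,   # C, t, and (n_firms - 1) firm dummies
        "HCGM (2.30)": 2 * n_firms,    # an intercept and a slope per firm
    }


def omitted_test_df(n_firms, n_periods):
    """df of the F-test of the CSFEM (H0) against the HCGM by firms (H1).

    Assumes a balanced panel, that is n_firms * n_periods observations.
    """
    params = growth_model_params(n_firms)
    numerator = params["HCGM (2.30)"] - params["CSFEM (2.31)"]
    denominator = n_firms * n_periods - params["HCGM (2.30)"]
    return numerator, denominator


print(omitted_test_df(82, 28))   # all 82 firms
print(omitted_test_df(3, 28))    # the first three firms
```

Under this assumption the counts agree with the df = (81,2132) and df = (2,78) reported above: the numerator is the number of omitted slope terms, and the denominator is the number of observations minus the number of coefficients of the HCGM by firms.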

2.8.1.2 Lagged Variables Model by Firms and its CSFEM


The first-order lagged variable model of Yit, by firms, would be considered as the
simplest lagged variables heterogeneous regressions model, namely LV(1)_HRM, having the
following equation specification.
Y C Y(-1)*@Expand(Firm) @Expand(Firm,@Dropfirst) (2.33)

Its two possible reduced models are as follows:
(i) A CSFEM having the following ES, which is a one-way ANCOVA model of Y on Y(-1)
with FIRM as the factor.
Y C Y(-1) @Expand(Firm,@Dropfirst) (2.34)
(ii) A continuous LV(1) model, having the following ES, which is considered as the
worst LV(1) panel data model.
Y C Y(-1) (2.35)

2.8.1.3 LV(1)_HRM with Trend, and its CSFEM


As an extension of the model (2.6), the LV(1)_HRM with Trend, would have the ES as
follows:
Y C Y(-1)*@Expand(Firm) t*@Expand(Firm) @Expand(Firm,@Dropfirst) (2.36)
Its two possible reduced models are as follows:
(i) A LV(1)_CSFEM with trend, having the ES as follows:
Y C Y(-1) t @Expand(Firm,@Dropfirst) (2.37)
(ii) A continuous LV(1) model with trend, having the following ES, which is considered
the worst LV(1) panel data model with trend.
Y C Y(-1) t (2.38)

Example 2.4 (Application of LV(2)_CSFEMs) As an extension of the CSFEM in Figure 2.3,
which has a very small DW-statistic, Figure 2.4 presents a summary of the statistical results based
on two alternative LV(2)_CSFEMs. Based on this summary, the following notes and comments
are presented.
(1) Specifically for the first model, the omitted variables test shows that the three variables
t*@Expand(Firm,@Dropfirst), LY(-1)*@Expand(Firm,@Dropfirst), and
LY(-2)*@Expand(Firm,@Dropfirst)
have a joint effect, based on the F-statistic of F0 = 1.772624 with df = (6,66) and a
p-value of 0.1183, which is significant at the α = 0.15 level but not at α = 0.10. So a
heterogeneous regression model (HRM) should be explored. For an additional illustration,
Figure 2.5 presents the statistical results of a HRM using two different equation
specifications, as follows:

LY C t LY(-1) LY(-2) t*@Expand(Firm,@Dropfirst)
LY(-1)*@Expand(Firm,@Dropfirst) LY(-2)*@Expand(Firm,@Dropfirst)
@Expand(Firm,@Dropfirst) (2.39)

LY t*@Expand(Firm) LY(-1)*@Expand(Firm)
LY(-2)*@Expand(Firm) @Expand(Firm,@Dropfirst) (2.40)

Based on these results, the following findings and notes are presented.
1.1 Both outputs present exactly the same regression functions, as indicated by the same values
of the 13 statistics presented at the bottom of the print-outs. The model in (2.39) uses
Firm=1 as a reference group, and the model in (2.40) does not use a reference group.

Figure 2.4 Summary of the statistical results based on two LV(2)_CSFEMs, and
their Omitted Variables Tests

Figure 2.5 Statistical results based on the HRM, using the ESs (2.39), and (2.40)

1.2 In a statistical sense, a reduced model should be explored. It is recommended that the
reduced model be developed based on the model in (2.40), by deleting LY(-2)*(Firm=3) from
the model, using the following ES.
LY t*@Expand(Firm) LY(-1)*@Expand(Firm)
LY(-2)*@Expand(Firm,@Drop(3)) @Expand(Firm,@Dropfirst) (2.41)

(2) Both LV(2)_CSFEMs are additive models of LY on t, LY(-1), and LY(-2); they
have sufficient values of the DW-statistic, and each of the numerical variables has a
significant effect. So both models can be considered acceptable models, but only in a
statistical sense.
(3) However, these models use the hidden assumption that all numerical independent
variables, namely t, LY(-1), and LY(-2), have the same effects on the dependent variable LY
for the first three firms and for the 82 firms, respectively, which is inappropriate in a
theoretical sense. Compare them to the following models.
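
Finding 1.1 above states that the ESs (2.39) and (2.40) give the same regression functions. For the slopes, the link between the two parameterizations can be sketched as follows: with a reference group, the slope of firm j is the slope of Firm=1 plus the coefficient of the corresponding interaction with the firm dummy. The Python sketch below only illustrates this arithmetic; the function names and the coefficient values are hypothetical and are not taken from Figure 2.5.

```python
def slopes_from_reference_coding(base, differences):
    """Per-firm slopes implied by a reference-group ES such as (2.39).

    base: slope of the reference firm (Firm=1).
    differences: {firm: coefficient of X*(Firm=firm)} for the other firms.
    """
    slopes = {1: base}
    for firm, diff in differences.items():
        slopes[firm] = base + diff
    return slopes


def reference_coding_from_slopes(slopes):
    """Inverse mapping, starting from per-firm slopes as in an ES such as (2.40)."""
    base = slopes[1]
    differences = {f: s - base for f, s in slopes.items() if f != 1}
    return base, differences


# Hypothetical coefficients of LY(-1) for three firms, for illustration only.
per_firm = slopes_from_reference_coding(0.5, {2: 0.25, 3: -0.125})
print(per_firm)
print(reference_coding_from_slopes(per_firm))
```

The same mapping applies to each numerical variable (t, LY(-1), and LY(-2)), which is why the summary statistics of the two outputs coincide even though the individual coefficients differ.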

2.8.1.4 LV(1)_HRM with Time-Related-Effects, and its CSFEM


As an extension of the model (2.36), the LV(1)_HRM with a Time-Related-Effect (TRE),
would have the ES as follows.
Y C Y(-1)*t*@Expand(Firm) Y(-1)*@Expand(Firm)
t*@Expand(Firm) @Expand(Firm,@Dropfirst) (2.42)

Note that this model presents a set of N two-way interaction LV(1)_HRMs, having
the following equation.
Yit = αi + βi*Yi,t-1*t + δi*Yi,t-1 + γi*t + εit (2.43)
All of the models above can be considered as its reduced models. Furthermore, two
additional reduced models that should be considered are as follows:
(i) A LV(1)_CSFEM with TRE, having the ES as follows:
Y C Y(-1)*t Y(-1) t @Expand(Firm,@Dropfirst) (2.44)
(ii) A continuous LV(1) model with TRE, having the following ES, which is considered
as the worst LV(1) panel data model with TRE.
Y C Y(-1)*t Y(-1) t (2.45)

Example 2.5 (Application of LV(1)_HRM in (2.42), and LV(1)_CSFEM in (2.44)) Figure 2.6
presents the statistical results of the LV(1)_HRM in (2.42) and the LV(1)_CSFEM in (2.44), for an
endogenous variable LY, based on the subsample {FIRM < 4}. Based on these results, the
following findings and notes are presented.
(1) The LV(1)_HRM is applied since it is defined that each of the exogenous variables LY(-1)*t,
LY(-1), and t has different effects on LY for the first three firms. On the other hand, the
LV(1)_CSFEM is applied under the special assumption that each exogenous variable has the
same effect for the first three firms. So they have different prerequisites.
(2) At the first stage, the choice of the LV(1)_CSFEM depends completely on personal
expert judgment; it does not depend on hypothesis testing. If the assumption that the
variables LY(-1)*t, LY(-1), and t have the same effect for the first three firms can be
accepted, then the LV(1)_CSFEM can be considered a good fit model. However, for a large
number of firms, such as hundreds or thousands of firms, the assumption would never be
acceptable.

Figure 2.6 Statistical results based on the LV(1)_HRM in (2.42), and LV(1)_CSFEM in (2.44)

Figure 2.7 Statistical results of the omitted variables tests, based on the LV(1)_CSFEM

(3) For illustration, Figure 2.7 presents the statistical results of the omitted variables tests
based on the LV(1)_CSFEM, with the findings as follows:
3.1 At the α = 0.10 level of significance, the variables LY(-1)*t*@Expand(Firm,@Dropfirst) and
LY(-1)*@Expand(Firm,@Dropfirst) have an insignificant joint effect, based on the
F-statistic of F0 = 1.624578 with df = (4,71), and a p-value = 0.1778 > α = 0.10. However, the
variables LY(-1)*t*@Expand(Firm,@Dropfirst) alone have a significant joint effect, based on the
F-statistic of F0 = 2.782379, with df = (2,73), and a p-value = 0.0685 < α = 0.10. These
findings indicate that at least the interaction LY(-1)*t*@Expand(Firm,@Dropfirst) should
be inserted as additional independent variables in the LV(1)_CSFEM.
3.2 For comparison, Figure 2.8 presents the statistical results of a reduced model of the
LV(1)_HRM in Figure 2.7, which is better than the LV(1)_CSFEM, in both theoretical and
statistical senses.
3.3 Furthermore, for this model, it is found that the interactions LY(-1)*t*@Expand(Firm) have
a significant joint effect on LY, based on the F-statistic of F0 = 2.485546, with df = (3,71),
and a p-value = 0.0676 < α = 0.10.

2.8.2 Generalized CSFEMs


2.8.2.1 CSFEMs with Time-Related-Effects
As an extension of the CSFEM with TRE in (2.44), the CSFEMs with TRE considered would
have the following general equation specification.
G(Y) C X1 … Xk t t*X1 … t*Xk @Expand(Firm,@Dropfirst) (2.46)
where G(Y) = Y or a transformed endogenous variable Y, and Xk, k=1,…,K, are the exogenous
variables Xkit, where Xk can be a lag of the endogenous variable, an original numerical exogenous
variable, a transformed variable, an environmental variable, namely Zt, a dummy of a defined
time-period, or a two-way or higher interaction of the previous types of variables.

Figure 2.8 Statistical results of a reduced model of the LV(1)_HRM in Figure 2.7

These models could be considered one-way ANCOVA models having a single factor,
namely FIRM, with a set of covariates: X1,…, Xk, t, t*X1, …, and t*Xk, for a lot of possible Xk.
So a researcher could subjectively define various alternative CSFEMs, under the
assumption that all covariates have the same effect on the dependent variable for all firms (the
research objects). Even though the assumption might not be relevant or valid for a large number
of research objects, fixed-effects models have been presented in international
journals, such as the Journal of Finance, based on panel data sets having thousands or hundreds
of thousands of observations.
Note that all CSFEMs previously presented, as well as other CSFEMs, could be
considered reduced models of this model, as could the following reduced models.

2.8.2.2 CSFEMs with trend


They have the following general ES.
G(Y) C X1 … Xk… t @Expand(Firm,@Dropfirst) (2.47)

2.8.2.3 CSFEMs without the time-t


It is recognized that most of the papers in international journals present fixed-effects
models without using the numerical time independent variable. So those CSFEMs
can be represented using the following general ES, which is a direct reduced model of (2.47).
G(Y) C X1 … Xk… @Expand(Firm,@Dropfirst) (2.48)
See the following example.

Example 2.6 (An application of a CSFEM in (2.48)) Figure 2.9 presents two kinds of
statistical results of a LV(1)_CSFEM in (2.48), based on the three variables LK, LL and LY in
CES.wf1, using the following ESs, respectively:
ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll C @Expand(Firm,@Dropfirst) (2.49a)
and
ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll @Expand(Firm) (2.49b)

Based on these results, the following findings and notes are presented.
(1) Note that for each firm i, the model represents a three-way hierarchical time series
model. Hence, various alternative time series models in Agung (2009) could be modified
to CSFEMs.
(2) The ES in (2.49a) is applied with the objective of testing the joint effects of all independent
variables, namely 7 numerical variables and 81 firm dummies, on the dependent
variable LY. The result shows that the 88 independent variables have a significant joint
effect, based on the F-statistic of F0 = 32519.19 with a p-value = 0.0000.

(3) The LV(1)_CSFEM is a three-way interaction hierarchical model, which could be
considered an acceptable model, under the assumption that all independent variables
have the same effect on LY for the 82 firms, because each of the numerical independent
variables has a probability < 0.30, and only one of them has a probability > 0.25.

Figure 2.9 Statistical results of the LV(1)_CSFEMs in (2.49a), and (2.49b)

(4) On the other hand, it is found that LY(-1)*LK*LL*@Expand(Firm,@Dropfirst) has a
significant joint effect, based on the F-statistic of F0 = 3.833972 with df = (81,2044) and a
p-value = 0.0000, using the omitted variables test. The other interactions can easily be tested
in the same way. Based on this finding, the following notes are very important to consider.
4.1 Even though LY(-1)*LK*LL*@Expand(Firm,@Dropfirst) has a significant effect, it does not
mean that all of the 81 variables presented by this function have significant effects. So
some of them should be deleted.

4.2 If the interactions between the numerical variables and the dummies are used as
additional independent variables, then the model will present 82 heterogeneous regressions
by FIRM, which are too many regressions to present in a paper, thesis or
dissertation. For this reason, it is recommended to apply a HRM by group of the firms,
even though the variable GROUP should be subjectively defined or generated based on one
or more variables. Refer to the HRMs presented in previous sections.
4.3 Furthermore, the HRM might be reduced to the ANCOVA model or group-fixed-effects
model (GRFEM), which is acceptable in a statistical sense.
(5) The CSFEMs in (2.49a) and (2.49b) can easily be extended to CSFEMs with trend and
CSFEMs with time-related effects. When inserting additional independent variables, it is
recommended to apply the manual stepwise selection method, as presented in Example 2.2.
(6) In addition, they could be extended to CSFEMs by inserting a defined time-period (TP)
independent variable, either as a function @Expand(TP) (time-period dummies) or as an
interaction NV*@Expand(TP) for at least one numerical variable (NV).
(7) Note that the LV(1) model is applied in order to have a sufficient value of the DW-statistic,
and it could be extended to LV(p) models, for p > 1.

2.8.2.4 CSFEMs by Time-Period (TP)


It is recognized that all CSFEMs could be modified or extended to CSFEMs by a
time-period (TP). So, based on the most general CSFEMs in (2.46), we may have two groups of
CSFEMs, namely the heterogeneous CSFEMs by TP and the bi-factorial ANCOVA models,
having the following equation specifications. In addition, refer to all possible reduced models of
the CSFEMs in (2.46), which can also be reduced models of the CSFEMs by TP.

(i). Heterogeneous CSFEMs by TP.


These models can be represented using the following general equation specification (ES),
which shows that all numerical independent variables have different slopes between the
time-periods considered.
G(Y) C … Xk*@Expand(TP) … t*@Expand(TP)
… t*Xk*@Expand(TP) … @Expand(Firm,@Dropfirst) (2.50)

(ii) Bi-factorial ANCOVA Models.
These models can be represented using the following general ES, with the assumption
that all numerical independent variables have the same slopes in all groups generated by
the variables FIRM and TP. Two types of ANCOVA models should be considered,
as follows:
1. The interaction ANCOVA models having the following ES, which shows that the
dummy independent variables are the dummies of the interactions of the firm dummies
and the time-period dummies.
G(Y) C X1… Xk t t*X1 … t*Xk @Expand(Firm,TP,@Dropfirst) (2.51)

2. The additive ANCOVA models having the following ES, which shows that the
dummy independent variables are the additive set of the firm dummies and the
time-period dummies. These models are considered the worst ANCOVA models among
the ANCOVA models having the same independent variables (Agung, 2011). Note
that these models are in fact two-way FEMs having the firm-fixed-effects and the
time-period-fixed-effects.

G(Y) C X1… Xk t t*X1 … t*Xk
@Expand(Firm,@Dropfirst) @Expand(TP,@Dropfirst) (2.52)

2.9 Time or Period Fixed Effects Models


2.9.1 Generalized Period Fixed Effects Models
Similar to the generalized CSFEMs in (2.48), the period fixed-effects models are also
one-way ANCOVA models of G(Yit), but with the discrete time variable as the single factor, and a
set of covariates, namely Xkit, k=1,…,K, where Xk can be a lag of the endogenous variable, an
original numerical exogenous variable, a transformed variable, an environmental variable, a
dummy variable of the firm-groups, or a two-way or higher interaction of the previous types of
variables.
Then the generalized time or period fixed-effects models, namely PEFEMs, can be
represented using the following equation specification. Note that these models represent a
set of (T-p) cross-section models if the lagged variables models, namely LV(p)_Models, are
applied.
G(Y) C X1 … Xk… @Expand(Time,@Dropfirst) (2.53)

2.9.2 Some Specific PEFEMs


2.9.2.1 Lagged Variables PEFEMs based on a single variable Yit
The first-order lagged variable PEFEM of Yit, namely the LV(1)_PEFEM, is the simplest
PEFEM, having the following equation.
G(Yit) = δt + γ*G(Yi,t-1) + εit, for t=2,…,T (2.54)
which represents a set of (T-1) homogeneous regressions of G(Yit) on G(Yi,t-1) for all t=2,…,T,
indicated by the (T-1) intercept parameters δt and a constant slope parameter γ.
For the analysis using EViews, the following equation specification would be applied.
G(Y) C G(Y(-1)) @Expand(Time,@Dropfirst) (2.55)
This model can easily be extended to LV(p)_PEFEMs, having the following equation
specification, where the value of p should be highly dependent on the data set used.
G(Y) C G(Y(-1)) … G(Y(-p)) @Expand(Time,@Dropfirst) (2.56)
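
To illustrate what the constant slope γ in (2.54) means, the sketch below computes the common slope of a model with one intercept per period by demeaning both variables within each period. This is the standard within-group (Frisch-Waugh-Lovell) computation written in plain Python, not the EViews routine; the function name, the firm labels, and the noise-free data are hypothetical and are used only so that the recovered slope can be compared with the value used to generate the data.

```python
def within_period_slope(rows):
    """Common slope of y on x in a model with one intercept per period.

    rows: iterable of (period, x, y) tuples. Demeaning x and y within each
    period gives the same slope as OLS with a full set of period dummies.
    """
    by_period = {}
    for period, x, y in rows:
        by_period.setdefault(period, []).append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in by_period.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in obs)
        sxx += sum((x - x_bar) ** 2 for x, _ in obs)
    return sxy / sxx


# Hypothetical noise-free panel generated as y_it = delta_t + 0.5 * y_i,t-1.
delta = {2: 1.0, 3: 2.0, 4: 0.5}
start = {"firm_a": 4.0, "firm_b": 8.0, "firm_c": 6.0}
rows = []
for firm, y_prev in start.items():
    for t in sorted(delta):
        y = delta[t] + 0.5 * y_prev
        rows.append((t, y_prev, y))   # (period, lagged y, current y)
        y_prev = y

gamma_hat = within_period_slope(rows)
print(gamma_hat)
```

Because the data contain no error term, the computed slope equals the generating value of 0.5 in every period, which is exactly the homogeneity that the LV(1)_PEFEM imposes.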

2.9.2.2 The First-Order Lagged Variable PEFEMs based on (Xit,Yit)


It is recognized that the effect of a numerical independent variable on the dependent
variable depends on the other numerical independent variables. Since the LV(1)_PEFEMs based
on (Xit,Yit) would have at least two numerical independent variables, namely Yi,t-1 and Xi,t, then
it can be accepted, in a theoretical sense, that the effect of the exogenous variable Xit on the
endogenous variable Yit depends on Yi,t-1, or that the effect of Yi,t-1 on Yit depends on Xi,t. So the
LV(1)_PEFEMs can be represented by the following general equation specification (ES).
G(Y) G(Y(-1))*F(X) G(Y(-1)) F(X) C @Expand(Time,@Dropfirst) (2.57)

where G(Y) = Y and F(X) = X, or the transformed variables of Y and X, respectively. Hence
a lot of possible LV(1)_PEFEMs could be defined by a researcher, based on a
bivariate (Xit,Yit). Note that these models are two-way interaction hierarchical models. However,
the good fit models obtained can be either nonhierarchical two-way interaction models or
additive models, which are highly dependent on the data sets used.

Furthermore, each of those models could easily be extended to higher-order lagged
variables models and various polynomial models, such as the simplest polynomial model, which
is an additive LV(1)_PEFEM having the ES as follows:
G(Y) G(Y(-1)) X X^2 … X^n C @Expand(Time,@Dropfirst) (2.58)

2.9.2.3 The First-Order Lagged Variable PEFEMs based on (X1it,X2it,Yit)


As an extension of the LV(1)_PEFEMs in (2.57), based on the variables (X1it,X2it,Yit)
we would have the three-way interaction LV(1)_PEFEMs, having the following general
equation specification.
G(Y) G(Y(-1))*F(X1)*F(X2) G(Y(-1))*F(X1) G(Y(-1))*F(X2)
F(X1)*F(X2) G(Y(-1)) F(X1) F(X2) C @Expand(Time,@Dropfirst) (2.59)

where G(Y) = Y, F(X1) = X1, and F(X2) = X2, or the transformed variables of Y, X1 and X2,
respectively. Hence a lot of possible LV(1)_PEFEMs could be defined by a
researcher, based on a trivariate (X1it, X2it, Yit). Note that these models are three-way interaction
hierarchical models. However, the good fit models obtained can be either nonhierarchical
three-way interaction models, two-way hierarchical and nonhierarchical models, or additive models,
which are highly dependent on the data sets used.

Example 2.6 (An application of PEFEM in (2.59)) As a modification of the LV(1)_CSFEM in
(2.49), Figure 2.10 presents the statistical results of a LV(1)_PEFEM based on the variables LK,
LL, and LY, using the following ES, and its acceptable reduced model.
ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll @Expand(t) (2.60)
Based on these results, the following findings and notes are presented.
(1) For each time point t, the models are presenting a cross-section model. Hence, all models
based on cross-section data sets, presented in Agung (2011) could be modified to PEFEMs.
(2) Furthermore, the models can easily be extended to the PEFEMs by a variable: GROUP of the
firms, either as a function @Expand(Group), or an interaction NV*@Expand(Group), for at
least one numerical variable (NV).

ly(-1)*x1*@expand(a,b) ly(-1)*x2*@expand(a,b,@drop(2,2))
x1*x2*@expand(a,b,@drop(2,2)) ly(-1)*@expand(a,b,@drop(1,1),@drop(1,2))
x1*@expand(a,b,@drop(1,2),@drop(2,1))

Figure 2.10 Statistical results of the LV(1)_PEFEM (2.60), and its reduced model

2.9.2.4 The PEFEMs by GROUP


As a modification or extension of all possible PEFEMs in (2.53), we may have the
PEFEMs by defined groups of the firms, say the variable GROUP, which can be generated or
defined based on one or more variables. Note that the groups of the firms should be invariant or
constant over time. Two types of PEFEMs should be considered, namely the
heterogeneous PEFEMs by GROUP and the bi-factorial ANCOVA models, having the
following equation specifications. In addition, refer to all possible reduced models of the
PEFEMs in (2.53), which can also be reduced models of the PEFEMs by GROUP.

(i). Heterogeneous PEFEMs by GROUP


These models can be represented using the following general ES. They are
heterogeneous PEFEMs, since the numerical variables Xk have different effects
on G(Y) between the levels of the categorical variable GROUP.

G(Y) C X1*@Expand(Group) … Xk*@Expand(Group)…
@Expand(Time,@Dropfirst) (2.61)

(ii) Bi-factorial ANCOVA Models.


These models can be represented using the following general ES, with the assumption
that all numerical independent variables have the same slopes in all groups generated by
the variables GROUP and TIME-PERIOD (TP). Two types of ANCOVA models should
be considered, as follows:
1. The interaction ANCOVA models having the following ES, which shows that the
dummy independent variables are the dummies of the interactions of the group
dummies and the time-period dummies.
G(Y) C X1… Xk t t*X1 … t*Xk @Expand(Group,TP,@Dropfirst) (2.62)

2. The additive ANCOVA models having the following ES, which shows that the
dummy independent variables are the additive set of the group dummies and the
time-period dummies. These models are considered the worst ANCOVA models among
the ANCOVA models having the same independent variables (Agung, 2011). Note
that these models are in fact two-way FEMs having the group fixed effects and the
time-period fixed effects.
G(Y) C X1… Xk t t*X1 … t*Xk
@Expand(Group,@Dropfirst) @Expand(TP,@Dropfirst) (2.63)

2.10 Firm-Year or Two-Way Fixed Effects Models


It is recognized that all period-fixed-effects models (PEFEMs) can easily be modified to
the firm-year or two-way fixed-effects models (TWFEMs). So, based on the general equation
specification (ES) of the PEFEMs in (2.53), we have the general ES of the TWFEMs, as follows:

G(Y) C X1… Xk… @Expand(Firm,@Dropfirst) @Expand(Time,@Dropfirst) (2.64)

Note that this general ES could represent all possible TWFEMs, as well as their
reduced models, which are highly dependent on the data used. However, it has been found that
no researcher or book in econometrics presents the specific characteristics of a TWFEM,
let alone its limitations. For this reason, the following sections present special notes and
comments on selected TWFEMs.

2.10.1 Limitations of the TWFEMs


Note that the model in (2.64) presents an additive bi-factorial ANCOVA model
with the factors FIRM and TIME, and the covariates X1 … Xk … . If the panel data has (NxT)
firm-time observations, then this model represents a set of (NxT) homogeneous regressions
(regressions having the same slopes) with a special pattern of the (NxT) intercepts, or Nx(T-1)
intercepts if and only if the LV(1) model is applied. See the following simple illustration, which
shows the limitation of TWFEMs.

Example 2.7 (A TWFEM based on a small subsample) Figure 2.11 presents the statistical
results of the model in (2.65), based on the subsample {Firm < 4 and t < 5}, with its table of
parameters.
ly lk*ll lk ll c @expand(firm,@dropfirst) @expand(t,@dropfirst) (2.65)

Figure 2.11 Statistical results of the TWFEM in (2.65), and its table of parameters

Based on this figure, the following findings and notes are presented.
(1) The subsample {FIRM < 4 and T < 5} generates a 3x4 cross tabulation having 12 cells with a
single observation in each cell.
(2) Note that the model represents 12 = 3x4 = NxT homogeneous multiple regressions
having the same slopes, namely C(1), C(2), and C(3), with a special pattern of the
intercepts, which would never be observed in practice. Each regression contains only
a single observation, so this model would be considered the worst panel data
model. The same holds for the model in the following example, based on the whole sample in
CES.wf1.
(3) Note that the regression has a very large R-squared, reported as R2 = 1, because the fitted
regressions pass through practically all of the observations. For an additional illustration,
a regression line of Y on X, based on only two observations, would have R2 = 1. An R2 of 1
does not mean that the model is the best-fit model. In this case, the model is the worst,
or an inappropriate, model.
(4) In addition, the results unexpectedly present a very large F-statistic of 77896.58 and a
DW-statistic of 4.071138.
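
The remark that a regression line based on only two observations has R2 = 1 can be checked with a few lines of Python. This is a minimal sketch using made-up numbers and a hypothetical helper `simple_ols_r2`; it is not connected to CES.wf1.

```python
def simple_ols_r2(x, y):
    """Fit y = a + b*x by least squares and return (a, b, R-squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Two arbitrary observations: the fitted line passes through both points.
a, b, r2_two = simple_ols_r2([1.0, 3.0], [2.0, 7.0])
print(round(r2_two, 6))

# Adding a third point that is off the line lowers R-squared below 1.
_, _, r2_three = simple_ols_r2([1.0, 3.0, 5.0], [2.0, 7.0, 6.0])
print(round(r2_three, 6))
```

The perfect fit in the first case says nothing about the quality of the model; it only reflects that there are as many parameters as observations.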

Example 2.7 (A TWFEM corresponding to the CSFEM in (2.49), and PEFEM in (2.60))
For an illustration, Figure 2.12 presents the statistical results of the TWFEM in (2.66), based on
the subsample {T > 1} of CES.wf1.
ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll
C @Expand(Firm,@Dropfirst) @Expand(t,@Dropfirst) (2.66)
Based on these statistical results, the following notes and comments are presented.
(1) Similar to the TWFEM in Figure 2.10, this model represents 2,214 (=82x27) homogeneous
regressions with a special pattern of intercepts. I would therefore consider this model the
worst panel data model, in a theoretical sense, among all possible panel data models having
the same set of numerical independent variables.
(2) However, it is an acceptable model, in a statistical sense, since the statistical results show
acceptable estimates of the parameters, including the coefficients of the 81 firm dummies
and 27 year dummies, which are not presented in Figure 2.11. However, some or many of the
dummies might have large p-values.
(3) It is recognized that lagged variable models tend to have a large R-squared, as do
autoregressive models using the AR term(s).
(4) For additional illustrations, the following section presents selected TWFEMs published in
international journals, with special notes.

Figure 2.12 Statistical results of the TWFEM in (2.66), and one of its possible reduced models

2.10.2 Fixed Effects Models Presented in International Journals


2.10.2.1 Four-Way FEMs presented by He, Qian, and Strahan (2012)
He, Qian, and Strahan (2012) present several types of fixed effects models, namely three
sets of models using (i) Cohort-Year Fixed Effects, Issuer Fixed Effects, and Initial Rating
Category Dummies, which are Four-Way FEMs; (ii) Cohort-Year Fixed Effects and Initial
Rating Category Dummies; and (iii) Cohort-Year Fixed Effects, based on data having
thousands of observations. The models also have interaction independent variables. The models
presented have a minimum R2 = 0.613 and a maximum R2 = 0.730.
For an illustration, the Four-Way FEMs can be presented using the following general ES.
G(Y) V1…Vk…VK C @Expand(Cohort,@Dropfirst) @Expand(Year,@Dropfirst)
@Expand(Issuer,@Dropfirst) @Expand(InitialRating,@Dropfirst) (2.67)

However, note that these models represent a set of thousands of homogeneous multiple
regressions of G(Y) on Vk, k=1,…,K, with a special set of intercepts, which would never be
observed in practice. Refer to the additive DVM in (1.14), with the model parameters presented
in Table 1.16.

2.10.2.2 Four-Way FEMs presented by Engelberg and Parsons (2011)


Engelberg and Parsons (2011) present several four-way fixed effects models using
Industry, Paper, City, and Date fixed effects, based on data sets having 265,928 and 273,999
observations, with a minimum adjusted R2 of 0.049 and a maximum R2 of 0.089.

2.10.2.3 Two-Way FEMs presented by Puri and Zarutskie (2012)


Puri and Zarutskie (2012) present industry-year fixed effects models having interaction
independent variables, in addition to the hundreds of thousands of industry and year dummies,
based on a data set having 105,031 observations. The models in fact represent hundreds of
thousands of multiple regressions having the same slopes, with a special pattern of the
intercepts, by industry and year. Refer to the dummy variables models in (1.19). Since the
models have a dummy independent variable, in addition to the fixed effects, they have a small
R2. One of the models has R2 = 0.063. Compare the following models, which also have a very
small R2.

2.10.2.4 One-Way FEMs presented by Jotikasthira, Lundblad, and Ramadorai (2012)


Jotikasthira, Lundblad, and Ramadorai (2012) present country fixed effects models in
which all other independent variables are dummy variables, including selected interactions
among them. It is therefore to be expected that the models have a very small R-squared. The
models presented have R2 within the range of 0.000 to 0.060.

2.10.2.5 FEMs presented in Baltagi (2000a, and 2000b)


It is recognized that the four previous papers, and many other papers, do not present the
models with their Durbin-Watson statistics, so it cannot be identified whether or not the models
have autocorrelation problems. On the other hand, Baltagi (2000b) presents examples of fixed
effects models having the following characteristics:

(1) The models have various values of the DW-statistic: a cross-section fixed effects model
(p. 19) has a small DW-statistic of 0.326578, and two two-way fixed effects models (p. 59
and p. 254) have DW-statistics of 0.333512 and 1.433777, respectively. It is well known
that the value of the DW-statistic can easily be increased by using a lagged variable model
or an autoregressive model, so the models presented in Baltagi could easily be modified.
(2) Most of the fixed effects models are additive models. One of the simple models is quoted
from Grunfeld (1958), having the basic equation as follows:
$I_{it} = \alpha + \beta_1 F_{it} + \beta_2 C_{it} + u_{it}$ (2.68)
where $I_{it}$ denotes real gross investment for firm i in year t, $F_{it}$ is the real value
of the firm (shares outstanding), and $C_{it}$ is the real value of the capital stock. Based
on the three variables I, F, and C, the data analysis can be done using the CSFEM, PEFEM,
and TWFEM, and the random effects models, which will be presented in the following chapter.

2.11 Groups or Time-Period Fixed Effects Models


The TWFEMs presented in Figure 2.11 and Figure 2.12 have the limitations noted above, and
the CSFEMs and PEFEMs represent hundreds or thousands of homogeneous regressions, which
would never be observed in practice. For this reason, it is recommended to apply modified fixed
effects models, namely groups or time-period fixed effects models, under the precondition that
the firms within each level of the variable GROUP can be considered homogeneous, so that their
scores for each variable can be represented using their mean. Even so, the models still
use the assumption that each numerical independent variable has the same effect on the
corresponding dependent variable within all cells generated by GROUP and TP. For these
models, two types of FEMs are considered, as follows:

2.11.1 Group*TimePeriod Fixed Effects Models


2.11.1.1 Group*TimePeriod Fixed Effects Models without Trend

Corresponding to the general HRMs presented by the equation specification (ES) in
(2.23), the Group*TimePeriod fixed effects models (G*TP_FEMs), which in fact are
ANCOVA models by G*TP, without trend (the numerical time independent variable), can be
presented using the ES as follows:
G(Y) V1… Vk… VK C @Expand(Group,TP,@Dropfirst) (2.69)

where Vk (Vkit for firm i at time t) can be any numerical variable, a dummy variable, an
environmental variable, say Zt, or a two- or three-way interaction, including the interaction
between the dummy variable(s) and the numerical variable(s), and the lags of the numerical
dependent and numerical independent variable(s). Thus a lot of possible models can be defined,
even based on a small number of variables, such as three to five numerical exogenous
variables; for instance, the special lagged variable G*TP_FEMs in the examples that follow.
Note that these models also use the assumption that each Vk has the same effect on
G(Y) within all cells generated by the categorical variables GROUP and Time-Period (TP).
Furthermore, note that a good-fit model obtained could be an unexpected reduced model, which
is highly dependent on the data used, specifically on the unpredictable impacts of
multicollinearity between the independent variables. See Example 2.10 below.
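
The cells generated by GROUP and TP can be pictured with a small Python sketch. It is only a conceptual illustration of one dummy per cell with the first cell dropped; it does not reproduce how EViews builds or names the @Expand dummies. The function `cell_dummies` and the six observations are hypothetical.

```python
def cell_dummies(groups, periods, drop_first=True):
    """Return the kept (group, period) cells and one row of 0/1 dummies per observation."""
    cells = sorted(set(zip(groups, periods)))
    kept = cells[1:] if drop_first else cells
    rows = [[1 if (g, tp) == cell else 0 for cell in kept]
            for g, tp in zip(groups, periods)]
    return kept, rows


# Hypothetical data: six firm-time observations in a 2 x 2 Group-by-TP layout.
groups = [1, 1, 2, 2, 1, 2]
periods = [1, 2, 1, 2, 1, 2]
kept, rows = cell_dummies(groups, periods)

print(kept)
for row in rows:
    print(row)
```

Observations in the dropped cell have all dummies equal to zero, so their intercept is C alone; every other cell gets its own shift of the intercept while the slopes of the Vk stay common.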

Example 2.8 (G*TP_FEMs based on (X1,Y)) One type of G*TP_FEMs, out of a lot of possible
models based on a bivariate (X1,Y), is the first-order lagged variable
G*TPFEMs, which can be presented using the following general equation specification.

G(Y) G(Y(-1))*F(X1) G(Y(-1)) F(X1) C @Expand(Group,TP,@Dropfirst) (2.70)

where F(X1) and G(Y), respectively, can be any functions of the variables X1 and Y, having no
parameter.

Example 2.9 (G*TPFEMs based on (X1,X2,Y)) One type of G*TPFEMs, out of a lot of possible
models defined or proposed based on a trivariate (X1,X2,Y), is the first-order lagged variable
G*TPFEMs, which can be presented using the following general equation specification.

G(Y) G(Y(-1))*F1(X1)*F2(X2) G(Y(-1))*F1(X1) G(Y(-1))*F2(X2) F1(X1)*F2(X2)
G(Y(-1)) F1(X1) F2(X2) C @Expand(Group,TP,@Dropfirst) (2.71)

where F1(X1), F2(X2), and G(Y), respectively, can be any functions of the variables X1, X2, and Y,
having no parameter.

2.11.1.2 G*TPFEMs with Trend


As an extension of the G*TPFEMs in (2.69), and referring to the HRMs with trend in
(2.25), we have the G*TPFEMs with trend, which can be represented using the general ES as
follows.
G(Y) V1… Vk… VK t C @Expand(Group,TP,@Dropfirst) (2.72)

2.11.1.3 G*TPFEMs with Time-Related-Effects

As an extension of the G*TPFEMs with trend in (2.72), and referring to the HRMs with the
time-related-effects (TRE) in (2.26), we have the G*TPFEMs with TRE, which can be
represented using the general ES as follows.
G(Y) V1… Vk… VK t t*V1… t*Vk… t*VK C @Expand(Group,TP,@Dropfirst) (2.73)

Example 2.10 (An unexpected reduced model) Compared with the statistical results of the full
CSFEM in Figure 2.9, the full PEFEM in Figure 2.10, and the full Firm-Year FEM or TWFEM
in Figure 2.12, Figure 2.13 presents the statistical results based on a full G*TPFEM, using
exactly the same set of numerical variables, and one of its unexpected reduced models. Based on
these results, the following notes and comments are presented.
(1) The statistical results of the full model are obtained using the following ES, which is similar
to the ES in (2.66). However, these results represent a set of four homogeneous
regressions of LY, compared with 2,214 homogeneous regressions based on the model in (2.66).
ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll
C @Expand(Group,TP,@Dropfirst) (2.74)

(2) The number of levels of the dichotomous variable GROUP of the firms could easily be
extended, depending on the characteristics of the sampled firms. On the other hand, the
sampled firms could be considered as a single homogeneous group.

Figure 2.13 Statistical results of the model in (2.74), and two of its possible reduced models

(3) The reduced model-1 is obtained by using the manual selection method. It is an unexpected
reduced model, because the main variable LY(-1) is deleted, even though it has a very small
probability of 0.0000. It is recognized that other acceptable reduced models could be
obtained, where each of the independent variables has a probability less than 0.25 (or 0.30).
(4) On the other hand, the reduced model-2 is obtained by deleting the main factor having the
greatest probability, that is LK, and then LL. The interaction independent variable(s) should
be kept in the model, since the effect of LY(-1) on LY is defined to depend on LK and LL.
(5) This raises the question of which reduced model should be considered the better model.
(6) Note that based on the trivariate (LK,LL,LY) a lot of possible models could be proposed or
defined by a researcher, including a simpler continuous LV(1) Model presented in the
following example.

2.11.2 Group+TimePeriod Fixed Effects Models


Referring to the limitations of the TWFEMs presented in Example 2.7, it is
recommended to apply Group+TimePeriod Fixed Effects Models (G+TP_FEMs), which in fact
are ANCOVA models by (Group + TimePeriod), namely additive ANCOVA models. Refer
to the limitation of additive ANOVA models presented in the previous chapter. Corresponding to
the models in (2.69) up to (2.73), the equation specifications of the G+TP_FEMs can easily be
obtained.
For instance, corresponding to the G*TP_FEMs in (2.69), the G+TP_FEMs would have
the general equation specification as follows:
G(Y) V1… Vk… VK C @Expand(Group,@Dropfirst) @Expand(TP,@Dropfirst) (2.75)

Similarly, corresponding to the G*TP_FEMs in (2.72), the G+TP_FEMs with trend
would have the general equation specification as follows:
G(Y) V1… Vk… VK t C @Expand(Group,@Dropfirst) @Expand(TP,@Dropfirst) (2.76)
And corresponding to the G*TP_FEMs in (2.73), the G+TP_FEMs with TRE would
have the general equation specification as follows:
G(Y) V1… Vk… VK t t*V1… t*Vk… t*VK
C @Expand(Group,@Dropfirst) @Expand(TP,@Dropfirst) (2.77)

2.12 Continuous Regression Model


In addition to the models presented above, another type of model that can be considered is
the continuous regression model, that is, a model without a dummy independent variable.
However, such a continuous regression model would be considered an inappropriate model, or
the worst model, because all (N*T) firm-time observations are represented by a single regression
function only, especially for a large number of observations. Nevertheless, the regression
functions obtained can show that they are good-fit models, in a statistical sense. See the
following examples.
2.12.1 The Simplest Lagged Variable Model and Alternatives
Example 2.11 (The simplest lagged variable model and alternatives) Figure 2.14 presents the
statistical results based on three LV(1) Models of LY, namely Model-1 is the simplest lagged
variable continuous model, Model-2 is a LV(1) G*TP_FEM, and Model-3 is a LV(1) HRM
(Heterogeneous Regression Model). Based on these statistical results, the following notes and
comments are presented.

Figure 2.14 Statistical results of the simplest LV(1) Models of LY, and alternatives

(1) Model-1 is a continuous regression model of LY on LY(-1), which shows that LY(-1) is the
best predictor for LY, since the model has a very large R2 of 0.999. In a two-dimensional
space, this regression presents a single line, based on 2,214 firm-time observations.
(2) Model-2 is a LV(1) G*TP_FEM in a four-dimensional space generated by the variables LY,
LY(-1), Group, and Time-Period, which is an abstract space. However, in the two-
dimensional space of LY and LY(-1), the model can be presented by four parallel regression
lines of LY on LY(-1).
(3) Model-3 is a LV(1) HRM in a four-dimensional space generated by the variables LY, LY(-1),
Group, and Time-Period, which is an abstract space. However, in the two-dimensional space
of LY and LY(-1), the model can be presented by four heterogeneous regression lines of LY on
LY(-1).
(4) Since the three models have a very large R2 of 0.999, and each of the independent variables
has a significant effect, researchers can argue about which is the best possible model. From my
point of view, the best model is Model-3.
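
The three models can be written side by side. The notation below is an assumption made for illustration: g indexes the levels of Group, s indexes the levels of TP, and the symbols c and b are not the C(·) parameter labels reported by EViews in Figure 2.14.

```latex
\begin{aligned}
\text{Model-1 (continuous):}\quad & LY_{it} = c + b\,LY_{i,t-1} + \varepsilon_{it} \\
\text{Model-2 (G*TP\_FEM):}\quad & LY_{it} = c_{gs} + b\,LY_{i,t-1} + \varepsilon_{it} \\
\text{Model-3 (HRM):}\quad & LY_{it} = c_{gs} + b_{gs}\,LY_{i,t-1} + \varepsilon_{it}
\end{aligned}
```

With two levels each for Group and TP, Model-2 gives four lines that share the slope b and differ only in their intercepts, whereas Model-3 gives four lines that each have their own intercept and slope.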

2.12.2 LV(1) Translog Linear Model and Alternatives


Example 2.12 (LV(1) Translog Linear Model and Alternatives) Figure 2.15 presents the
statistical results based on three LV(1) Translog Linear Models of LY on LY(-1), LK, and LL,
namely Model-1 is a continuous model, Model-2 is a LV(1) G*TPFEM, and Model-3 is an
advanced LV(1) HRM. Based on these models, the following notes and comments are
presented.
(1) Model-1 is a continuous translog linear model. Since LL has a very large probability, a
reduced model could be developed by deleting LL, in a statistical sense. However, the
reduced model is not presented, because the model will be compared with Model-2 and
Model-3, which have exactly the same numerical independent variables.
(2) Model-2 is a homogeneous translog linear model, or a G*TP_FEM, which shows that LK
and LL have very large probabilities. Since LK and LL have insignificant joint
effects on LY, both variables can be deleted to obtain a reduced model, which is the
Model-2 in Figure 2.14.
(3) Model-3 is a set of heterogeneous translog linear regressions of LY on LY(-1), LK, and LL, by
Group and TP. Unexpectedly, the statistical results show that only one of the independent
variables, namely LL*(Group=2 and TP=1), has a p-value ≥ 0.30. Note that at the α = 0.15
level of significance, each of the variables LK*(Group=1 and TP=1), LK*(Group=1 and
TP=2), and LL*(Group=1 and TP=2) has a significant positive effect, with one-sided
p-values of 0.2902/2 = 0.1451, 0.1463/2 = 0.07315, and 0.2422/2 = 0.1211, respectively.
(4) Model-3 should be the best-fit model to study the differential effects of LY(-1), LK, and LL
on LY, between the cells generated by the variables Group and TP.
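
The halving of the probabilities in note (3) can be reproduced with a short Python sketch. It assumes that the reported probabilities are two-sided t-test p-values and that the three estimated coefficients are positive, as the note implies; the helper `one_sided_p` is hypothetical and not an EViews function.

```python
def one_sided_p(two_sided_p, estimate_sign_matches):
    """Convert a two-sided t-test p-value to a one-sided p-value."""
    half = two_sided_p / 2.0
    # If the estimate has the hypothesized sign, the one-sided p-value is p/2;
    # otherwise it is 1 - p/2.
    return half if estimate_sign_matches else 1.0 - half


ALPHA = 0.15
two_sided = {
    "LK*(Group=1 and TP=1)": 0.2902,
    "LK*(Group=1 and TP=2)": 0.1463,
    "LL*(Group=1 and TP=2)": 0.2422,
}
one_sided = {name: one_sided_p(p, estimate_sign_matches=True)
             for name, p in two_sided.items()}

for name, p1 in one_sided.items():
    print(name, round(p1, 5), p1 < ALPHA)
```

The halving is valid only for a right-sided hypothesis with a positive estimate (or a left-sided hypothesis with a negative estimate); with the opposite sign the one-sided p-value would exceed 0.5.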

Figure 2.15 Statistical results of the LV(1) Translog Linear Model of LY, and alternatives

2.12.3 Three-Way Interaction Continuous Model


Example 2.13 (Three-Way Interaction LV(1) Translog LS Regression) Figure 2.16 presents
the statistical results of a full PLS regression, by using the following ES, and two of its possible
reduced models. Based on these results, the following notes and comments are presented.

ly ly(-1)*lk*ll ly(-1)*lk ly(-1)*ll lk*ll ly(-1) lk ll C (2.78)

Figure 2.16 Statistical results of the PLS Model in (2.78) and two of its possible reduced models

Figure 2.17 Statistical results of two translog linear models, and three interaction translog models

(1) Referring to the path diagram in Figure 2.1a, the effect of LY(-1) on LY is defined to be
dependent on LK and LL. The reduced models should therefore be explored by deleting the
main variables, say LY(-1), LK, or LL.
(2) The reduced Model-1 and Model-2 can be considered good-fit models, in a statistical
sense, since each of the numerical independent variables has a significant effect on LY, and
their R-squared values are very large. However, each of the regression functions is developed
based on a very large number of observations, namely 82*27 = 2,214 firm-time observations.
(3) For a comparison, Figure 2.17 presents the statistical results based on two translog linear
models, namely Model-1, which is exactly the same model as Model-1 in Figure 2.15, and
Model-2, which is a reduced form of Model-1. In this case, Model-2 should be considered the
better model, in a statistical sense.
(4) Model-3 is a continuous two-way interaction regression function, which is an acceptable
model, since each of the independent variables has a probability < 0.30.
(5) Model-4 is a continuous three-way interaction regression function, with Model-5 as its
acceptable reduced model.
(6) Looking at the five regression functions in Figure 2.17, we would have a choice between
Model-2 as the simplest model and the most advanced Model-5.
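
The list of independent variables in (2.78) follows a fixed pattern: the three-way interaction, then all two-way interactions, then the main variables. As a convenience, such a specification string can be generated rather than typed. The sketch below is a hypothetical Python helper (`full_interaction_es` is not part of EViews); it only builds the text that would be pasted into the equation specification box.

```python
from itertools import combinations


def full_interaction_es(dependent, regressors):
    """Build an equation-specification string with all interactions and main terms."""
    terms = []
    # Highest-order interaction first, main variables last, as in (2.78).
    for order in range(len(regressors), 0, -1):
        for combo in combinations(regressors, order):
            terms.append("*".join(combo))
    return " ".join([dependent] + terms + ["c"])


es = full_interaction_es("ly", ["ly(-1)", "lk", "ll"])
print(es)
```

A reduced model is then obtained by deleting selected terms from this string, for example one or more of the main variables, and estimating again.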

2.13 Various FEMs and Continuous Models


All equation specifications presented above for the FEMs and Continuous Models can
easily be applied to the following models. Do these as exercises using your own panel data sets.
(1) The models having a numerical endogenous variable presented in subsection 1.3.1, such as
the LS Regressions, Quantile Regressions, and Instrumental Variables Models.
(2) The binary choice (probit, logit, and extreme value) models, for a dummy problem indicator.
(3) The ordered choice (probit, logit, and extreme value) models, for any ordinal problem
indicator, as presented in subsection 1.3.1.3.
(4) The firm, or cross-section, fixed-effects HRMs, and the time, or period, fixed-effects HRMs,
as extensions of the FEMs presented in Section 1.4.3.

Reference
Jie (Jack) He, Jun (QJ) Qian, and Philip E. Strahan, 2012. Are All Ratings Created Equal?
The Impact of Issuer Size on the Pricing of Mortgage-Backed Securities. The Journal of
Finance, Vol. LXVII, No. 6, December 2012, 2097-2113.
