Week 4: Nonlinear Models

Introduction to Econometrics
5th Edition
C. DOUGHERTY

Chapter 4: Nonlinear Models and Transformations of Variables

FALL 2024

LINEARITY AND NONLINEARITY

Linear in variables and parameters:

Y = β1 + β2X2 + β3X3 + β4X4 + u

This sequence introduces the topic of fitting nonlinear regression


models. First, we need a definition of linearity.
The model shown above is linear in two senses.
1. The right side is linear in variables because the variables are
included precisely as defined rather than as functions.
2. It is also linear in parameters since a different parameter
appears as a multiplicative factor in each term.

1
LINEARITY AND NONLINEARITY

Linear in variables and parameters:

Y = β1 + β2X2 + β3X3 + β4X4 + u

Linear in parameters, nonlinear in variables:

Y = β1 + β2X2² + β3√X3 + β4 log X4 + u

The second model above is linear in parameters, but nonlinear in variables.

4
LINEARITY AND NONLINEARITY

Linear in variables and parameters:

Y = β1 + β2X2 + β3X3 + β4X4 + u

Linear in parameters, nonlinear in variables:

Y = β1 + β2X2² + β3√X3 + β4 log X4 + u

Z2 = X2²,  Z3 = √X3,  Z4 = log X4

Such models present no problem at all. Define new variables as shown.

5
LINEARITY AND NONLINEARITY

Linear in variables and parameters:

Y = β1 + β2X2 + β3X3 + β4X4 + u

Linear in parameters, nonlinear in variables:

Y = β1 + β2X2² + β3√X3 + β4 log X4 + u

Z2 = X2²,  Z3 = √X3,  Z4 = log X4

Y = β1 + β2Z2 + β3Z3 + β4Z4 + u

With these cosmetic transformations, we have made the model linear in both
variables and parameters.
6
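As a minimal Stata sketch of this step (the variable names Y, X2, X3, and X4 are hypothetical, assuming they already exist in the data set):

. gen Z2 = X2^2
. gen Z3 = sqrt(X3)
. gen Z4 = ln(X4)
. reg Y Z2 Z3 Z4

The regression of Y on Z2, Z3, and Z4 is then an ordinary linear regression.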
LINEARITY AND NONLINEARITY

Linear in variables and parameters:

Y = β1 + β2X2 + β3X3 + β4X4 + u

Linear in parameters, nonlinear in variables:

Y = β1 + β2X2² + β3√X3 + β4 log X4 + u

Z2 = X2²,  Z3 = √X3,  Z4 = log X4

Y = β1 + β2Z2 + β3Z3 + β4Z4 + u

Nonlinear in parameters:

Y = β1 + β2X2 + β3X3 + β2β3X4 + u

This model is nonlinear in parameters because the coefficient of X4 is the product of the
coefficients of X2 and X3. As we will see, some nonlinear models can be linearized by
appropriate transformations, but this is not one of them.
7
LINEARITY AND NONLINEARITY

Average annual percentage growth rates


Employment GDP Employment GDP

Australia 2.57 3.52 Korea 1.11 4.48


Austria 1.64 2.66 Luxembourg 1.34 4.55
Belgium 1.06 2.27 Mexico 1.88 3.36
Canada 1.90 2.57 Netherlands 0.51 2.37
Czech Republic 0.79 5.62 New Zealand 2.67 3.41
Denmark 0.58 2.02 Norway 1.36 2.49
Estonia 2.28 8.10 Poland 2.05 5.16
Finland 0.98 3.75 Portugal 0.13 1.04
France 0.69 2.00 Slovak Republic 2.08 7.04
Germany 0.84 1.67 Slovenia 1.60 4.82
Greece 1.55 4.32 Sweden 0.83 3.47
Hungary 0.28 3.31 Switzerland 0.90 2.54
Iceland 2.49 5.62 Turkey 1.30 6.90
Israel 3.29 4.79 United Kingdom 0.92 3.31
Italy 0.89 1.29 United States 1.36 2.88
Japan 0.31 1.85

We will start with an example of a simple nonlinear model that can be linearized
by a cosmetic transformation. The table displays the average annual employment
and GDP growth rates for 31 OECD countries. 8
LINEARITY AND NONLINEARITY

2
e = 1 + +u
g
3
Employment growth rate

0
0 1 2 3 4 5 6 7 8 9
GDP growth rate

A plot of the data reveals that the relationship is clearly nonlinear. We will
consider various nonlinear specifications for the relationship in the course
of this chapter, starting with the hyperbolic model shown. 9
LINEARITY AND NONLINEARITY

2 1
e = 1 + +u z= e = 1 +  2 z + u
g g
3
Employment growth rate

0
0 1 2 3 4 5 6 7 8 9
GDP growth rate

This is nonlinear in g, but if we define z = 1/g, we can rewrite the model to be


linear in variables and parameters.
10
LINEARITY AND NONLINEARITY

Average annual percentage growth rates


e g z e g z

Australia 2.57 3.52 0.2841 Korea 1.11 4.48 0.2235


Austria 1.64 2.66 0.3757 Luxembourg 1.34 4.55 0.2199
Belgium 1.06 2.27 0.4401 Mexico 1.88 3.36 0.2976
Canada 1.90 2.57 0.3891 Netherlands 0.51 2.37 0.4221
Czech Republic 0.79 5.62 0.1781 New Zealand 2.67 3.41 0.2929
Denmark 0.58 2.02 0.4961 Norway 1.36 2.49 0.4013
Estonia 2.28 8.10 0.1234 Poland 2.05 5.16 0.1938
Finland 0.98 3.75 0.2664 Portugal 0.13 1.04 0.9603
France 0.69 2.00 0.5004 Slovak Republic 2.08 7.04 0.1420
Germany 0.84 1.67 0.5980 Slovenia 1.60 4.82 0.2075
Greece 1.55 4.32 0.2315 Sweden 0.83 3.47 0.2885
Hungary 0.28 3.31 0.3021 Switzerland 0.90 2.54 0.3941
Iceland 2.49 5.62 0.1779 Turkey 1.30 6.90 0.1449
Israel 3.29 4.79 0.2089 United Kingdom 0.92 3.31 0.3024
Italy 0.89 1.29 0.7723 United States 1.36 2.88 0.3476
Japan 0.31 1.85 0.5417

The data table displays the values of z, which are derived from the values of
g. In practice, you won’t need to do these calculations manually. Regression
applications typically include features that allow you to generate new
variables based on existing ones. 11
LINEARITY AND NONLINEARITY

. gen z = 1/g
. reg e z
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 31
-----------+------------------------------ F( 1, 29) = 13.68
Model | 5.80515811 1 5.80515811 Prob > F = 0.0009
Residual | 12.3041069 29 .424279548 R-squared = 0.3206
-----------+------------------------------ Adj R-squared = 0.2971
Total | 18.109265 30 .603642167 Root MSE = .65137
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
z | -2.356137 .6369707 -3.70 0.001 -3.658888 -1.053385
_cons | 2.17537 .249479 8.72 0.000 1.665128 2.685612
----------------------------------------------------------------------------

Here is the output for a regression of e on z.

12
LINEARITY AND NONLINEARITY

ê = 2.18 − 2.36z

e          |      Coef.
-----------+-----------
z          |  -2.356137
_cons      |    2.17537

[Figure: scatter plot of employment growth rate against z = 1/g, with the fitted regression line]
The figure shows the transformed data and the regression line for the
regression of e on z.
13
LINEARITY AND NONLINEARITY

ê = 2.18 − 2.36z = 2.18 − 2.36/g

[Figure: fitted hyperbolic relationship plotted on the scatter of employment growth rate against GDP growth rate]

Substituting 1/g for z, we obtain the nonlinear relationship between e and g.


The figure shows this relationship plotted in the original diagram.
14
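Evaluating the fitted hyperbolic equation at a few values of g (simple arithmetic from the equation above) shows how the curve behaves:

ê = 2.18 − 2.36/g
g = 1:  ê = 2.18 − 2.36  = −0.18
g = 2:  ê = 2.18 − 1.18  =  1.00
g = 4:  ê = 2.18 − 0.59  =  1.59
g = 8:  ê = 2.18 − 0.295 ≈  1.89

The predicted employment growth rate rises steeply at low values of g and then flattens out, approaching the intercept 2.18 as g becomes large.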
LINEARITY AND NONLINEARITY

ê = 2.18 − 2.36z = 2.18 − 2.36/g

[Figure: fitted hyperbolic relationship plotted on the scatter of employment growth rate against GDP growth rate]

In this case, the relationship between e and g was nonlinear. In multiple


regression analysis, nonlinearity might be detected using the graphical
technique described in a previous slideshow. 15
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

Y = β1 + Σ(j=2 to k) βjXj + u

Ŷ = β̂1 + Σ(j=2 to k) β̂jXj

Ramsey’s RESET test of functional misspecification is intended to


provide a simple indicator of evidence of nonlinearity. To implement it,
one runs the regression and saves the fitted values of the dependent
variable.
Since, by definition, the fitted values are a linear combination of the
explanatory variables, Ŷ² is a linear combination of the squares of the X
variables and their interactions.

1
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

Y = β1 + Σ(j=2 to k) βjXj + u

Ŷ = β̂1 + Σ(j=2 to k) β̂jXj

Add Ŷ² to the regression specification

If Ŷ² is added to the regression specification, it should pick up quadratic and
interactive nonlinearity, if present, without necessarily being highly correlated
with any of the X variables. 3
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

Y = β1 + Σ(j=2 to k) βjXj + u

Ŷ = β̂1 + Σ(j=2 to k) β̂jXj

Add Ŷ² to the regression specification

If the t statistic for the coefficient is significant, some kind of nonlinearity


may be present.
4
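As an aside on tooling (an assumption about Stata's facilities, not part of the original slides), Stata has a built-in version of this test: after regress, the postestimation command estat ovtest performs Ramsey's RESET using powers of the fitted values, so its output will not be numerically identical to the single-Ŷ² version worked through below.

. reg EARNINGS S EXP
. estat ovtest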
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

. reg EARNINGS S EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 35.24
Model | 8735.42401 2 4367.712 Prob > F = 0.0000
Residual | 61593.5422 497 123.930668 R-squared = 0.1242
-----------+------------------------------ Adj R-squared = 0.1207
Total | 70328.9662 499 140.939812 Root MSE = 11.132
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | 1.877563 .2237434 8.39 0.000 1.437964 2.317163
EXP | .9833436 .2098457 4.69 0.000 .5710495 1.395638
_cons | -14.66833 4.288375 -3.42 0.001 -23.09391 -6.242752
----------------------------------------------------------------------------
. predict FITTED
(option xb assumed; fitted values)
. gen FITTEDSQ = FITTED*FITTED

We will do this for a wage equation. Here is the output from a regression of
EARNINGS on S and EXP using EAWE Data Set 21. We save the fitted
values as FITTED and generate FITTEDSQ as the square.
5
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

. reg EARNINGS S EXP FITTEDSQ


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 3, 496) = 25.46
Model | 9386.33186 3 3128.77729 Prob > F = 0.0000
Residual | 60942.6344 496 122.868214 R-squared = 0.1335
-----------+------------------------------ Adj R-squared = 0.1282
Total | 70328.9662 499 140.939812 Root MSE = 11.085
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | -1.334163 1.413072 -0.94 0.346 -4.110507 1.442181
EXP | -.6441233 .7373115 -0.87 0.383 -2.092762 .8045155
FITTEDSQ | .0460798 .0200203 2.30 0.022 .0067447 .0854148
_cons | 25.09321 17.79509 1.41 0.159 -9.86984 60.05626
----------------------------------------------------------------------------

The coefficient of FITTEDSQ is significant at the 5 percent level, suggesting that
some kind of nonlinearity is present and that the linear specification may be
misspecified.
6
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

. reg EARNINGS S EXP FITTEDSQ


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 3, 496) = 25.46
Model | 9386.33186 3 3128.77729 Prob > F = 0.0000
Residual | 60942.6344 496 122.868214 R-squared = 0.1335
-----------+------------------------------ Adj R-squared = 0.1282
Total | 70328.9662 499 140.939812 Root MSE = 11.085
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | -1.334163 1.413072 -0.94 0.346 -4.110507 1.442181
EXP | -.6441233 .7373115 -0.87 0.383 -2.092762 .8045155
FITTEDSQ | .0460798 .0200203 2.30 0.022 .0067447 .0854148
_cons | 25.09321 17.79509 1.41 0.159 -9.86984 60.05626
----------------------------------------------------------------------------

However, we also saw that a semilogarithmic specification was better. The


RESET test is intended to detect nonlinearity but not to be specific about
the most appropriate nonlinear model. 7
RAMSEY’S RESET TEST OF FUNCTIONAL MISSPECIFICATION

. reg EARNINGS S EXP FITTEDSQ


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 3, 496) = 25.46
Model | 9386.33186 3 3128.77729 Prob > F = 0.0000
Residual | 60942.6344 496 122.868214 R-squared = 0.1335
-----------+------------------------------ Adj R-squared = 0.1282
Total | 70328.9662 499 140.939812 Root MSE = 11.085
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | -1.334163 1.413072 -0.94 0.346 -4.110507 1.442181
EXP | -.6441233 .7373115 -0.87 0.383 -2.092762 .8045155
FITTEDSQ | .0460798 .0200203 2.30 0.022 .0067447 .0854148
_cons | 25.09321 17.79509 1.41 0.159 -9.86984 60.05626
----------------------------------------------------------------------------

It may fail to detect some types of nonlinearity. However, it is very easy to


implement and requires only one degree of freedom.
8
ELASTICITIES AND LOGARITHMIC MODELS

Definition: the elasticity of Y with respect to X is the proportional change in Y
per proportional change in X:

elasticity = (dY/Y) / (dX/X) = (dY/dX) / (Y/X)

[Figure: a curve relating Y to X, with a point A marked on it]

This sequence defines elasticities and shows how to fit nonlinear models
with constant elasticities—first, the general definition of elasticity.
1
ELASTICITIES AND LOGARITHMIC MODELS

Definition: the elasticity of Y with respect to X is the proportional change in Y
per proportional change in X:

elasticity = (dY/Y) / (dX/X) = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)

[Figure: a curve relating Y to X, with a point A, the tangent at A, and the ray OA from the origin]

Re-arranging the expression for the elasticity, we can obtain a graphical


interpretation.
2
ELASTICITIES AND LOGARITHMIC MODELS

Definition: the elasticity of Y with respect to X is the proportional change in Y
per proportional change in X:

elasticity = (dY/Y) / (dX/X) = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)

[Figure: a curve relating Y to X, with a point A, the tangent at A, and the ray OA from the origin]

The elasticity at any point on the curve is the ratio of the slope of the
tangent at that point to the slope of the line joining the point to the origin.
3
ELASTICITIES AND LOGARITHMIC MODELS

Definition: the elasticity of Y with respect to X is the proportional change in Y
per proportional change in X:

elasticity = (dY/Y) / (dX/X) = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)

Case shown: elasticity < 1

[Figure: the tangent at A is flatter than the ray OA]

In this case, the tangent at A is clearly flatter than the line OA, so the
elasticity must be less than 1.
4
ELASTICITIES AND LOGARITHMIC MODELS

Definition: the elasticity of Y with respect to X is the proportional change in Y
per proportional change in X:

elasticity = (dY/Y) / (dX/X) = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)

Case shown: elasticity > 1

[Figure: the tangent at A is steeper than the ray OA]

In this case, the tangent at A is steeper than OA, and the elasticity is greater than 1.

5
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1 + β2X

elasticity = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)
           = β2 / ((β1 + β2X)/X)
           = β2 / ((β1/X) + β2)

[Figure: the line Y = β1 + β2X with a point A and the ray OA from the origin]

The elasticity will generally be different at different points on the function


relating Y to X.
6
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1 + β2X

elasticity = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)
           = β2 / ((β1 + β2X)/X)
           = β2 / ((β1/X) + β2)

[Figure: the line Y = β1 + β2X with a point A and the ray OA from the origin]

In the example above, Y is a linear function of X.

7
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1 + β2X

elasticity = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)
           = β2 / ((β1 + β2X)/X)
           = β2 / ((β1/X) + β2)

[Figure: the line Y = β1 + β2X with a point A and the ray OA from the origin]

The tangent at any point is coincident with the line itself, so its slope is
always β2 in this case. The elasticity depends on the slope of the line
joining the point to the origin. 8
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1 + β2X

elasticity = (dY/dX) / (Y/X)
           = (slope of the tangent at A) / (slope of OA)
           = β2 / ((β1 + β2X)/X)
           = β2 / ((β1/X) + β2)

[Figure: the line Y = β1 + β2X with two points A and B marked on it]

OB is flatter than OA, so the elasticity is greater at B than at A. (This ties in
with the mathematical expression: (β1/X) + β2 is smaller at B than at A,
assuming that β1 is positive.) 9
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

However, a function of the type shown above has the same elasticity for all
values of X.
10
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

dY/dX = β1β2X^(β2−1)

For the numerator of the elasticity expression, we need the derivative of Y


with respect to X.
11
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

dY/dX = β1β2X^(β2−1)

Y/X = β1X^β2 / X = β1X^(β2−1)

For the denominator, we need Y/X.

12
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

dY/dX = β1β2X^(β2−1)

Y/X = β1X^β2 / X = β1X^(β2−1)

elasticity = (dY/dX) / (Y/X) = β1β2X^(β2−1) / (β1X^(β2−1)) = β2

Hence we obtain the expression for the elasticity. It simplifies to β2 and is
therefore constant.
13
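A quick numerical check of the constant-elasticity property (simple arithmetic, not from the slides): with β2 = 0.25, a 1 percent increase in X multiplies Y by

1.01^0.25 = e^(0.25 log 1.01) ≈ 1.0025

so Y rises by about 0.25 percent, whatever the starting value of X.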
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 0.25

[Figure: plot of the function for β2 = 0.25]

By way of illustration, the function will be plotted for a range of values of β2.
We will start with a very low value, 0.25.
14
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 0.50

[Figure: plot of the function for β2 = 0.50]

We will increase β2 in steps of 0.25 and see how the shape of the function
changes.
15
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 0.75

[Figure: plot of the function for β2 = 0.75]

β2 = 0.75.

16
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 1.00

[Figure: plot of the function for β2 = 1.00]

When β2 equals 1, the curve becomes a straight line through the origin.

17
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 1.25

[Figure: plot of the function for β2 = 1.25]

β2 = 1.25.

18
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 1.50

[Figure: plot of the function for β2 = 1.50]

β2 = 1.50.

19
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 1.75

[Figure: plot of the function for β2 = 1.75]

β2 = 1.75. Note that the curvature can be quite gentle over wide ranges of X.

20
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2 with β2 = 1.75

[Figure: plot of the function for β2 = 1.75]

This means that even if the true model is of the constant elasticity form, a
linear model may be a good approximation over a limited range.
21
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

log Y = log(β1X^β2)
      = log β1 + log X^β2
      = log β1 + β2 log X

Fitting a constant elasticity function using a sample of observations is easy.


You can linearize the model by taking the logarithms of both sides.
22
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

log Y = log(β1X^β2)
      = log β1 + log X^β2
      = log β1 + β2 log X

Y′ = β1′ + β2X′   where Y′ = log Y, X′ = log X, β1′ = log β1

You thus obtain a linear relationship between Y' and X', as defined. All serious
regression applications allow you to generate logarithmic variables from
existing ones. 23
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

log Y = log(β1X^β2)
      = log β1 + log X^β2
      = log β1 + β2 log X

Y′ = β1′ + β2X′   where Y′ = log Y, X′ = log X, β1′ = log β1

The coefficient of X′ will be a direct estimate of the elasticity, β2.
24
ELASTICITIES AND LOGARITHMIC MODELS

Y = β1X^β2

log Y = log(β1X^β2)
      = log β1 + log X^β2
      = log β1 + β2 log X

Y′ = β1′ + β2X′   where Y′ = log Y, X′ = log X, β1′ = log β1

The constant term will be an estimate of log β1. To obtain an estimate of β1,
calculate exp(β̂1′), where β̂1′ is the estimate of β1′. (This assumes that you have
used natural logarithms, that is, logarithms to base e, to transform the model.)
25
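As a minimal Stata sketch of this last step (assuming the logarithmic regression run below; not part of the original slides), the estimate of β1 can be recovered directly from the stored constant:

. reg LGFDHO LGEXP
. display exp(_b[_cons])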
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of FDHO against EXP]

Here is a scatter diagram showing annual household expenditure on FDHO,


food eaten at home, and EXP, total annual household expenditure, both
measured in dollars, for 1995 for a sample of 869 households in the United
States (Consumer Expenditure Survey data). 26
ELASTICITIES AND LOGARITHMIC MODELS

. reg FDHO EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 3431.01
Model | 972602566 1 972602566 Prob > F = 0.0000
Residual | 1.7950e+09 6332 283474.003 R-squared = 0.3514
-----------+------------------------------ Adj R-squared = 0.3513
Total | 2.7676e+09 6333 437006.15 Root MSE = 532.42
----------------------------------------------------------------------------
FDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
EXP | .0627099 .0010706 58.57 0.000 .0606112 .0648086
_cons | 369.4418 10.65718 34.67 0.000 348.5501 390.3334
----------------------------------------------------------------------------

Here is a linear regression of FDHO on EXP. When using household data, it


is usual to relate types of consumer expenditures to total expenditures rather
than income. Household income data tend to be relatively erratic.
27
ELASTICITIES AND LOGARITHMIC MODELS

. reg FDHO EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 3431.01
Model | 972602566 1 972602566 Prob > F = 0.0000
Residual | 1.7950e+09 6332 283474.003 R-squared = 0.3514
-----------+------------------------------ Adj R-squared = 0.3513
Total | 2.7676e+09 6333 437006.15 Root MSE = 532.42
----------------------------------------------------------------------------
FDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
EXP | .0627099 .0010706 58.57 0.000 .0606112 .0648086
_cons | 369.4418 10.65718 34.67 0.000 348.5501 390.3334
----------------------------------------------------------------------------

The regression implies that, at the margin, 6.3 cents out of each dollar of
expenditure is spent on food at home. Does this seem plausible? Probably,
though a little low. 28
ELASTICITIES AND LOGARITHMIC MODELS

. reg FDHO EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 3431.01
Model | 972602566 1 972602566 Prob > F = 0.0000
Residual | 1.7950e+09 6332 283474.003 R-squared = 0.3514
-----------+------------------------------ Adj R-squared = 0.3513
Total | 2.7676e+09 6333 437006.15 Root MSE = 532.42
----------------------------------------------------------------------------
FDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
EXP | .0627099 .0010706 58.57 0.000 .0606112 .0648086
_cons | 369.4418 10.65718 34.67 0.000 348.5501 390.3334
----------------------------------------------------------------------------

It also suggests that $369 would be spent on food at home if total expenditure
were zero. This is impossible. It may be possible to interpret it as baseline
expenditure, but we must consider family size and composition.
29
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of FDHO against EXP with the fitted linear regression line]

Here is the regression line plotted on the scatter diagram.
30
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of LGFDHO against LGEXP]

We will now fit a constant elasticity function using the same data. The scatter
diagram shows the FDHO logarithm plotted against the EXP logarithm.
31
ELASTICITIES AND LOGARITHMIC MODELS

. g LGFDHO = ln(FDHO)
. g LGEXP = ln(EXP)
. reg LGFDHO LGEXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 4719.99
Model | 1642.9356 1 1642.9356 Prob > F = 0.0000
Residual | 2204.04385 6332 .348080204 R-squared = 0.4271
-----------+------------------------------ Adj R-squared = 0.4270
Total | 3846.97946 6333 .60744978 Root MSE = .58998
----------------------------------------------------------------------------
LGFDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
LGEXP | .6657858 .0096909 68.70 0.000 .6467883 .6847832
_cons | .7009498 .0843607 8.31 0.000 .5355741 .8663254
----------------------------------------------------------------------------

Here is the result of regressing LGFDHO on LGEXP. The first two commands
generate the logarithmic variables.
32
ELASTICITIES AND LOGARITHMIC MODELS

. g LGFDHO = ln(FDHO)
. g LGEXP = ln(EXP)
. reg LGFDHO LGEXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 4719.99
Model | 1642.9356 1 1642.9356 Prob > F = 0.0000
Residual | 2204.04385 6332 .348080204 R-squared = 0.4271
-----------+------------------------------ Adj R-squared = 0.4270
Total | 3846.97946 6333 .60744978 Root MSE = .58998
----------------------------------------------------------------------------
LGFDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
LGEXP | .6657858 .0096909 68.70 0.000 .6467883 .6847832
_cons | .7009498 .0843607 8.31 0.000 .5355741 .8663254
----------------------------------------------------------------------------

The estimate of the elasticity is 0.67. Does this seem plausible?

33
ELASTICITIES AND LOGARITHMIC MODELS

. g LGFDHO = ln(FDHO)
. g LGEXP = ln(EXP)
. reg LGFDHO LGEXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 4719.99
Model | 1642.9356 1 1642.9356 Prob > F = 0.0000
Residual | 2204.04385 6332 .348080204 R-squared = 0.4271
-----------+------------------------------ Adj R-squared = 0.4270
Total | 3846.97946 6333 .60744978 Root MSE = .58998
----------------------------------------------------------------------------
LGFDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
LGEXP | .6657858 .0096909 68.70 0.000 .6467883 .6847832
_cons | .7009498 .0843607 8.31 0.000 .5355741 .8663254
----------------------------------------------------------------------------

Yes, definitely. Food is a normal good, so its elasticity should be positive;


however, it is also a basic necessity. Expenditure on food should grow less
rapidly than overall expenditure, indicating that its elasticity is less than 1. 34
ELASTICITIES AND LOGARITHMIC MODELS

. g LGFDHO = ln(FDHO)
. g LGEXP = ln(EXP)
. reg LGFDHO LGEXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 6334
-----------+------------------------------ F( 1, 6332) = 4719.99
Model | 1642.9356 1 1642.9356 Prob > F = 0.0000
Residual | 2204.04385 6332 .348080204 R-squared = 0.4271
-----------+------------------------------ Adj R-squared = 0.4270
Total | 3846.97946 6333 .60744978 Root MSE = .58998
----------------------------------------------------------------------------
LGFDHO | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
LGEXP | .6657858 .0096909 68.70 0.000 .6467883 .6847832
_cons | .7009498 .0843607 8.31 0.000 .5355741 .8663254
----------------------------------------------------------------------------

Fitted equation:   LGFDHOˆ = 0.701 + 0.666 LGEXP        ⇒        FDHOˆ = 2.02 EXP^0.666

The intercept has no substantive meaning. To obtain an estimate of β1, we
calculate e^0.701, which is 2.02.
35
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of LGFDHO against LGEXP with the fitted regression line]

Here is the scatter diagram with the regression line plotted.

36
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of FDHO against EXP with the fitted logarithmic and linear regression lines]

Here is the regression line from the logarithmic regression plotted in the
original scatter diagram and the linear regression line for comparison.
37
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of FDHO against EXP with the fitted logarithmic and linear regression lines]

The logarithmic regression line gives a somewhat better fit, especially at low
expenditure levels.
38
ELASTICITIES AND LOGARITHMIC MODELS

[Figure: scatter plot of FDHO against EXP with the fitted logarithmic and linear regression lines]

However, the difference in the fit is not dramatic. The main reason for
preferring the constant elasticity model is that it makes more sense
theoretically. It also has a technical advantage that we will discuss later
when we discuss heteroskedasticity. 39
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

This sequence introduces the semilogarithmic model and shows how it may
be applied to an earnings function. In this model the explanatory variable,
multiplied by its coefficient, appears in the exponent of e. 1
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

dY/dX = β1β2e^(β2X) = β2Y

The derivative of Y with respect to X simplifies to β2Y.

2
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

dY/dX = β1β2e^(β2X) = β2Y

(dY/Y) / dX = β2

Hence the proportional change in Y per unit change in X equals β2. It is
therefore independent of the value of X.
3
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

Strictly speaking, this interpretation is valid only for small values of β2.
When β2 is not small, the interpretation may be a little more complex.
4
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

Suppose that X increases by an amount ΔX and that, as a consequence, Y
increases by an amount ΔY.
5
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

We can rewrite the right side of the equation as shown.

6
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

We can simplify the right side of the equation as shown.

7
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

e^Z = 1 + Z + Z²/2! + Z³/3! + ...

Now expand the exponential term using the standard expression for e to
some power.
8
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

Y + ΔY = β1e^(β2(X + ΔX))
       = β1e^(β2X) e^(β2ΔX)
       = Y e^(β2ΔX)
       = Y (1 + β2ΔX + (β2ΔX)²/2 + ...)

ΔY = Y (β2ΔX + (β2ΔX)²/2 + ...)

Subtract Y from both sides.

9
SEMILOGARITHMIC MODELS

 (  2 X )2

Y = Y   2 X + + ... 
 2 

( 2 X )2 negligible

We now consider two cases: where 2 and X are so small that (2 X)2 is
negligible, and the alternative.
10
SEMILOGARITHMIC MODELS

 (  2 X )2

Y = Y   2 X + + ... 
 2 

( 2 X )2 negligible

Y = Y 2 X

Y / Y
= 2
X

If (2 X)2 is negligible, we obtain the same interpretation of 2 as we did


using the calculus, as expected.
11
SEMILOGARITHMIC MODELS

 (  2 X )2

Y = Y   2 X + + ... 
 2 

( 2 X )2 not negligible

Y / Y  22 X
= 2 + + ...
X 2
 22
= 2 + + ...
2

If (2 X)2 is not negligible, the proportional change in Y given a X change


in X has an extra term. (We are assuming that 2 and X are small enough
that terms with higher powers of X can be neglected.) 12
SEMILOGARITHMIC MODELS

 (  2 X )2

Y = Y   2 X + + ... 
 2 

( 2 X )2 not negligible

Y / Y  22 X
= 2 + + ...
X 2
 22
= 2 + + ... if X is one unit
2

Usually we talk about the effect of a one-unit change in X. If X = 1, the


proportional change in Y is as shown. The issue now becomes whether 2 is
so small that the second and subsequent terms can be neglected. 13
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

X = 0   ⇒   Y = β1e^0 = β1

β1 is the value of Y when X is equal to zero (note that e^0 is equal to 1).


14
SEMILOGARITHMIC MODELS

Y = β1e^(β2X)

log Y = log(β1e^(β2X))
      = log β1 + log e^(β2X)
      = β1′ + β2X log e
      = β1′ + β2X

To fit a function of this type, you take logarithms of both sides. The right side
of the equation becomes a linear function of X (note that the logarithm of e, to
base e, is 1). Hence we can fit the model with a linear regression, regressing
log Y on X. 15
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

Here is the regression output for a wage equation fitted using EAWE Data Set 21.
The estimate of β2 is 0.066. As an approximation, this implies that an extra
year of schooling increases hourly earnings by a proportion of 0.066. 16
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

In everyday language, it is usually more natural to talk about percentages


rather than proportions, so we multiply the coefficient by 100. This implies
that an extra year of schooling increases hourly earnings by 6.6%. 17
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

(β2ΔX)² not negligible

If ΔX is one unit,   (ΔY/Y)/ΔX = β2 + β2²/2 + ... = 0.066 + (0.066)²/2 = 0.068

If we consider that a year of schooling is not a marginal change and work out
the effect exactly, the proportional increase is 0.068, and the percentage
increase is 6.8%. 18
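Equivalently (an added check, not in the original slides), the exact proportional effect of a one-unit change is e^b − 1, which can be computed in Stata from the stored coefficient:

. reg LGEARN S
. display exp(_b[S]) - 1

With the coefficient 0.0665 this evaluates to about 0.0687, essentially the same answer as the 0.068 obtained above.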
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

(β2ΔX)² not negligible

If ΔX is one unit,   (ΔY/Y)/ΔX = β2 + β2²/2 + ... = 0.066 + (0.066)²/2 = 0.068

In general, if a unit change in X is genuinely marginal, the estimate of 2 will be


small, and one can interpret it directly as an estimate of the proportional
change in Y per unit change in X. 19
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

(β2ΔX)² not negligible

If ΔX is one unit,   (ΔY/Y)/ΔX = β2 + β2²/2 + ... = 0.066 + (0.066)²/2 = 0.068

However, if a unit change in X is not small, the coefficient may be large, and the
second term might not be negligible. In the present case, a year of schooling is
not marginal, but even so, the refinement makes only a small difference.
20
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

(β2ΔX)² not negligible

If ΔX is one unit,   (ΔY/Y)/ΔX = β2 + β2²/2 + ... = 0.066 + (0.066)²/2 = 0.068

In general, when β2 is less than 0.1, working out the effect exactly makes
little practical difference.
21
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

log β̂1 = 1.836        β̂1 = e^1.836 = 6.27

The intercept in the regression is an estimate of log β1. From it, we obtain
an estimate of β1 equal to e^1.836, which is 6.27.
22
SEMILOGARITHMIC MODELS

. reg LGEARN S
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 1, 498) = 60.71
Model | 16.5822819 1 16.5822819 Prob > F = 0.0000
Residual | 136.016938 498 .273126381 R-squared = 0.1087
-----------+------------------------------ Adj R-squared = 0.1069
Total | 152.59922 499 .30581006 Root MSE = .52261
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0664621 .0085297 7.79 0.000 .0497034 .0832207
_cons | 1.83624 .1289384 14.24 0.000 1.58291 2.089571
----------------------------------------------------------------------------

log β̂1 = 1.836        β̂1 = e^1.836 = 6.27

This literally implies that a person with no schooling would earn $6.27 per hour.
However, it is dangerous to extrapolate so far from the range for which we have
data. 23
SEMILOGARITHMIC MODELS

[Figure: scatter plot of the logarithm of hourly earnings against years of schooling, with the fitted semilogarithmic regression line]

Here is the scatter diagram with the semilogarithmic regression.

24
SEMILOGARITHMIC MODELS

[Figure: scatter plot of hourly earnings against years of schooling, with the semilogarithmic and linear regression lines]

Here is the semilogarithmic regression line plotted in a scatter diagram with


the untransformed data, with the linear regression shown for comparison.
25
SEMILOGARITHMIC MODELS

[Figure: scatter plot of hourly earnings against years of schooling, with the semilogarithmic and linear regression lines]

The fit of the regression lines is not much different, but the semilogarithmic
regression is more satisfactory in two respects.
26
SEMILOGARITHMIC MODELS

[Figure: scatter plot of hourly earnings against years of schooling, with the semilogarithmic and linear regression lines]

The linear specification predicts that hourly earnings will increase by a fixed
amount, $1.27, with each additional year of schooling. This is implausible for
high levels of education. The semi-logarithmic specification allows the
increment to increase with the level of education. 27
SEMILOGARITHMIC MODELS

[Figure: scatter plot of hourly earnings against years of schooling, with the semilogarithmic and linear regression lines]

Second, the linear specification predicts very low earnings for an individual
with no schooling. The semilogarithmic specification predicts hourly
earnings of $6.27, which at least is not obvious nonsense. 28
SUMMARY OF THE INTERPRETATION OF COEFFICIENTS OF DIFFERENT NONLINEAR REGRESSION MODELS

X     X+1   MODEL                     Y = f(X)   Y = f(X+1)   CHANGE IN X   CHANGE IN Y
100   101   Y = 3 + 5X                   503        508          1 unit       5 units
100   101   LN(Y) = 2 + 0.08 LN(X)     10.68      10.69             1%          0.08%
100   101   LN(Y) = 0.2 + 0.04X         66.7       69.4          1 unit         4.08%
100   101   Y = 3 + 0.2 LN(X)           3.92       3.92             1%     0.002 units

For the semilogarithmic model LN(Y) = 0.2 + 0.04X, the exact percentage change is
100(exp(b) − 1)% = 100(e^0.04 − 1)% = 4.08%.
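As an arithmetic check of the semilogarithmic row of the table (worked from the model shown, nothing new assumed):

LN(Y) = 0.2 + 0.04X
X = 100:  LN(Y) = 0.2 + 4.00 = 4.20,  Y = e^4.20 ≈ 66.7
X = 101:  LN(Y) = 0.2 + 4.04 = 4.24,  Y = e^4.24 ≈ 69.4
Change in Y:  69.4/66.7 − 1 = e^0.04 − 1 ≈ 0.0408, i.e. 4.08%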
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

2
Y = 1 + +u
X
1
Z=
X

Y = 1 +  2 Z + u

Thus far, nothing has been said about the disturbance term in nonlinear
regression models.
1
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

2
Y = 1 + +u
X
1
Z=
X

Y = 1 +  2 Z + u

For the regression results in a linearized model to have the desired properties,
the disturbance term in the transformed model should be additive and satisfy
the regression model conditions. 2
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

2
Y = 1 + +u
X
1
Z=
X

Y = 1 +  2 Z + u

It should be normally distributed in the transformed model to perform the


usual tests.
3
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

2
Y = 1 + +u
X
1
Z=
X

Y = 1 +  2 Z + u

In the case of the first example of a nonlinear model, there was no problem.
If the disturbance term had the required properties in the original model, it
would have them in the regression model. It has not been affected by the
transformation. 4
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

In the discussion of the logarithmic model, the disturbance term was


omitted altogether.
5
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

However, implicitly it was being assumed that there was an additive


disturbance term in the transformed model.
6
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

For this to be possible, the random component in the original model must be
a multiplicative term, eu.
7
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

We will denote this multiplicative term v.

8
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

When u is equal to 0, not modifying the value of log Y, v is equal to 1,


likewise not modifying the value of Y.
9
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

Positive values of u correspond to values of v greater than 1, the random factor


having a positive effect on Y and log Y. Likewise, negative values of u
correspond to values of v between 0 and 1, the random factor having a negative
impact on Y and log Y. 10
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

[Figure: density function f(v) of the lognormal distribution of v]

Besides satisfying the regression model conditions, we need u to be


normally distributed if we are to perform t tests and F tests.
11
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

[Figure: density function f(v) of the lognormal distribution of v]

This will be the case if v has a lognormal distribution, shown above.

12
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 e^u = β1X^β2 v

log Y = log β1 + β2 log X + u

[Figure: density function f(v) of the lognormal distribution of v]

The mode of the distribution is located at v = 1, where u = 0.

13
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1e^(β2X) e^u = β1e^(β2X) v

log Y = log β1 + β2X + u

[Figure: density function f(v) of the lognormal distribution of v]

The same multiplicative disturbance term is needed in the semilogarithmic


model.
14
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1e^(β2X) e^u = β1e^(β2X) v

log Y = log β1 + β2X + u

[Figure: density function f(v) of the lognormal distribution of v]

Note that, with this distribution, one should expect a small proportion of
observations to be subject to large positive random effects.
15
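A small simulation sketch makes this visible (illustrative only; the sample size and the standard deviation of u are assumptions, not from the slides). Drawing u from a normal distribution and setting v = e^u produces a right-skewed distribution in which a few observations receive large positive random effects:

. clear
. set obs 1000
. set seed 1
. gen u = rnormal(0, 0.5)
. gen v = exp(u)
. sum v, detail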
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

[Figure: scatter plot of hourly earnings against years of schooling]

Here is the scatter diagram for earnings and schooling using Data Set 21.
You can see that there are several outliers, with the three most extreme
highlighted. 16
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

[Figure: scatter plot of the logarithm of hourly earnings against years of schooling, with the fitted regression line]

Here is the scatter diagram for the semilogarithmic model with its regression
line. The same three observations remain outliers, but they do not appear to
be so extreme. 17
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

[Figure: histograms of the standardized residuals from the linear and semilogarithmic regressions]

The histogram above compares the distributions of the residuals from the linear
and semilogarithmic regressions. To make them comparable, the distributions
have been standardized, that is, scaled so that their standard deviation is equal
to 1. 18
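A sketch of how such a comparison can be produced in Stata (assuming the EARNINGS, LGEARN, S, and EXP variables used earlier; the use of egen's std() function for standardization is a convenience assumed here, not necessarily the authors' exact procedure):

. reg EARNINGS S EXP
. predict RLIN, residuals
. reg LGEARN S EXP
. predict RLOG, residuals
. egen RLINSTD = std(RLIN)
. egen RLOGSTD = std(RLOG)
. histogram RLINSTD
. histogram RLOGSTD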
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

[Figure: histograms of the standardized residuals from the linear and semilogarithmic regressions]

It can be shown that if the disturbance term in a regression model has a


normal distribution, so will the residuals.
19
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

[Figure: histograms of the standardized residuals from the linear and semilogarithmic regressions]

Obviously, the residuals from the semilogarithmic regression are approximately


normal, but those from the linear regression are not. This is evidence that the
semi-logarithmic model is the better specification. 20
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 + u

What would happen if the disturbance term in the logarithmic or semilogarithmic


model were additive rather than multiplicative?
21
THE DISTURBANCE TERM IN LOGARITHMIC MODELS

Y = β1X^β2 + u

log Y = log(β1X^β2 + u)

If this were the case, we could not linearize the model by taking logarithms.
There is no way of simplifying log(β1X^β2 + u). We would have to use some
nonlinear regression technique. 22


COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

When alternative regression model specifications have the same dependent


variable, R2 can be used to compare their goodness of fit.
However, when the dependent variable is different, this is not legitimate.
In the case of the linear model, R² measures the proportion of the variance
in Y explained by the model. In the semilogarithmic model, it measures the
proportion of the variance of the logarithm of Y explained by the model.
These are related, but not the same, and direct comparisons are invalid.

1
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

However, the goodness of fit of models with linear and logarithmic versions of
the same dependent variable can be compared indirectly by subjecting the
dependent variable to the Box–Cox transformation and fitting the model
shown.
This is a family of specifications that depends on the parameter λ. Like the
other parameters, λ is determined empirically.
The model is nonlinear in its parameters, so a nonlinear regression method
should be used. In practice, maximum likelihood estimation is used.
5
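As a tooling note (an assumption about Stata's facilities, not part of the original slides), a model of this kind can be fitted by maximum likelihood with Stata's boxcox command, transforming the dependent variable only; check the command's documentation for the exact syntax and options:

. boxcox EARNINGS S EXP, model(lhsonly)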
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

This transformation is of interest in the present context because specifications


with linear and logarithmic dependent variables are special cases.
8
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

Putting λ = 1 gives the linear model. The dependent variable is then Y − 1
rather than Y, but subtracting a constant from the dependent variable does
not affect the regression results, except for the intercept estimate. 9
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

Putting λ = 0 gives the (semi-)logarithmic model. Of course, one cannot discuss
putting λ precisely equal to 0 because the dependent variable then becomes zero
divided by zero. We are talking about the limiting form as λ tends to zero, and
we have used L'Hôpital's rule. 10
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

So one could fit the general model and see whether λ is close to 0 or close
to 1. Of course, 'close' has no meaning in econometrics. One should test
the hypotheses λ = 0 and λ = 1 to approach this issue technically. 11
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

The hoped-for outcome is that one hypothesis is rejected and the other is not.
Of course, it may turn out that neither is rejected, or that both are rejected,
at your chosen significance level. 12
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Box–Cox transformation:

(Y^λ − 1)/λ = β1 + β2X + u

(Y^λ − 1)/λ = Y − 1       when λ = 1
(Y^λ − 1)/λ → log Y       when λ → 0

If you are interested only in comparing the fits of the linear and logarithmic
specifications, there is a short-cut procedure that involves only standard
least squares regressions. 13
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Y* = Y / geometric mean of Y

The first step is to divide the observations on the dependent variable by


their geometric mean. We will call the transformed variable Y*.
14
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Y* = Y / geometric mean of Y

Y* = β1′ + β2′X + u
log Y* = β1′ + β2′X + u

You now run these two regressions with Y* and log Y* as the dependent variables,
leaving the right side of the equation unchanged. (The parameters have been given
prime marks to emphasize that the coefficients will not be estimates of the original
β1 and β2.) 15
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Y* = Y / geometric mean of Y

Y* = β1′ + β2′X + u
log Y* = β1′ + β2′X + u

The residual sums of squares are now directly comparable. Therefore, the
specification with the smaller RSS provides a better fit.
16
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

Y = β1 + β2X + u
log Y = β1 + β2X + u

Y* = Y / geometric mean of Y

Y* = β1′ + β2′X + u
log Y* = β1′ + β2′X + u

We will use the transformation to compare the fits of the linear and
semilogarithmic versions of a simple wage equation using EAWE Data Set 21.
17
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

The first step is to calculate the geometric mean of the dependent variable. The
easiest way to do this is to take the exponential of the mean of the logarithm of
the dependent variable. 18
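A minimal Python sketch of this identity (the sample values in y are hypothetical):

import numpy as np

# hypothetical sample of a positive dependent variable
y = np.array([5.0, 8.0, 12.0, 20.0, 35.0])

# geometric mean as the exponential of the mean of the logs
gm_via_logs = np.exp(np.mean(np.log(y)))

# geometric mean computed directly as the n-th root of the product
gm_direct = np.prod(y) ** (1 / len(y))

print(gm_via_logs, gm_direct)   # the two numbers coincide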
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

The sum of the logarithms of the Y observations is equal to the logarithm of their product.

19
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

Now we use the rule that a log X is the same as log X^a.

20
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

And finally we use the fact that the exponential of the logarithm of X reduces to X.

21
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

. sum LGEARN

Variable | Obs Mean Std. Dev. Min Max


-------------+--------------------------------------------------------
LGEARN | 500 2.824265 .553001 .6931472 4.642948

LGEARN has already been defined as the logarithm of EARNINGS. We find


its mean. In Stata this is done with the ‘sum’ command.
22
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

. sum LGEARN

Variable | Obs Mean Std. Dev. Min Max


-------------+--------------------------------------------------------
LGEARN | 500 2.824265 .553001 .6931472 4.642948

. gen EARNSTAR = EARNINGS/exp(2.8243)

We then define EARNSTAR, dividing EARNINGS by the exponential of the


mean of LGEARN.
23
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS


e^((1/n) Σ log Yi) = e^((1/n) log(Y1Y2...Yn))
                   = e^(log((Y1Y2...Yn)^(1/n)))
                   = (Y1Y2...Yn)^(1/n)

. sum LGEARN

Variable | Obs Mean Std. Dev. Min Max


-------------+--------------------------------------------------------
LGEARN | 500 2.824265 .553001 .6931472 4.642948

. gen EARNSTAR = EARNINGS/exp(2.8243)

. gen LGEARNST = ln(EARNSTAR)

We also define LGEARNST, the logarithm of EARNSTAR.

24
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

. reg EARNSTAR S EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 35.24
Model | 30.7698527 2 15.3849264 Prob > F = 0.0000
Residual | 216.958472 497 .436536161 R-squared = 0.1242
-----------+------------------------------ Adj R-squared = 0.1207
Total | 247.728325 499 .496449549 Root MSE = .66071
----------------------------------------------------------------------------
EARNSTAR | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .1114334 .0132792 8.39 0.000 .0853432 .1375236
EXP | .0583614 .0124543 4.69 0.000 .0338918 .0828311
_cons | -.8705654 .254515 -3.42 0.001 -1.370623 -.3705073
----------------------------------------------------------------------------

Here is the regression of EARNSTAR on S and EXP. The residual sum of


squares is 217.0.
25
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

. reg LGEARNST S EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 40.12
Model | 21.2104061 2 10.6052031 Prob > F = 0.0000
Residual | 131.388814 497 .264363811 R-squared = 0.1390
-----------+------------------------------ Adj R-squared = 0.1355
Total | 152.59922 499 .30581006 Root MSE = .51416
----------------------------------------------------------------------------
LGEARNST | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | -1.624505 .1980634 -8.20 0.000 -2.013649 -1.23536
----------------------------------------------------------------------------

We run the parallel regression for LGEARNST. The residual sum of squares is
131.4. Thus we conclude that the semilogarithmic version gives a better fit.
26
BOX-COX TESTS

y = α + βx + u          log y = α + βx + u
y* = y / geometric mean of y

y* = α' + β'x + u       log y* = α' + β'x + u

(n/2) log_e(larger RSS / smaller RSS) ≈ χ²(1)

• The residual sums of squares are now directly comparable.
• The test statistic is as shown.
• It is distributed as a χ² (chi-squared) statistic with one degree of freedom under
  the null hypothesis that there is no difference in the fit.
• If the calculated statistic is greater than the critical value, reject H0.
8
BOX-COX TESTS

(n/2) log_e(larger RSS / smaller RSS) = (500/2) log_e(216.9 / 131.4) = 125.3

(216.9 is the RSS of the LINEAR specification; 131.4 is the RSS of the NONLINEAR,
semilogarithmic, specification)

χ²crit = 10.83, 1 d.f., 0.1% level

• The test statistic is 125.3.


• This far exceeds the critical value of χ² with one degree of freedom,
  even at the 0.1% level.
• So, we conclude that the semilogarithmic version gives a significantly
  better fit.
19
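As a numerical check, here is a hedged Python sketch of this test, plugging in the residual sums of squares reported in the regressions above; the chi-squared critical value is taken from scipy:

import numpy as np
from scipy.stats import chi2

n = 500               # number of observations
rss_linear = 216.9    # RSS from the regression of EARNSTAR on S and EXP
rss_semilog = 131.4   # RSS from the regression of LGEARNST on S and EXP

# Box-Cox comparison statistic: (n/2) * log(larger RSS / smaller RSS)
stat = (n / 2) * np.log(max(rss_linear, rss_semilog) / min(rss_linear, rss_semilog))

crit = chi2.ppf(0.999, df=1)    # 0.1% critical value with 1 degree of freedom
print(round(stat, 1), round(crit, 2))   # approximately 125.3 and 10.83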
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

. boxcox EARNINGS S EXP

Number of obs = 500


LR chi2(2) = 76.08
Log likelihood = -1785.403 Prob > chi2 = 0.000

------------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/theta | .1088657 .05362 2.03 0.042 .0037726 .2139589
------------------------------------------------------------------------------

---------------------------------------------------------
Test          Restricted       LR statistic    P-value
H0:           log likelihood   chi2            Prob > chi2
---------------------------------------------------------
theta = -1    -2025.7902       480.77          0.000
theta = 0     -1787.4901         4.17          0.041
theta = 1     -1912.8953       254.98          0.000
---------------------------------------------------------

(Recall that (Y^λ − 1)/λ → log Y when λ → 0, so theta = 0 corresponds to the logarithmic specification.)

Here is the output for the complete Box-Cox regression. The parameter that
we have denoted λ (lambda) is called theta by Stata. It is estimated at 0.11.
Since it is closer to 0 than to 1, it indicates that the dependent variable should
be logarithmic rather than linear. 27
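The idea behind this estimate can be illustrated with a simplified grid search (a sketch, not Stata's exact maximum-likelihood algorithm): after scaling Y by its geometric mean, the λ that minimizes the RSS of the transformed regression is, under normally distributed errors, also the λ that maximizes the likelihood. The data below are simulated stand-ins for EARNINGS, S, and EXP.

import numpy as np

rng = np.random.default_rng(0)

# simulated stand-ins for EARNINGS, S, and EXP
n = 500
S = rng.uniform(8, 20, n)
EXP = rng.uniform(0, 14, n)
y = np.exp(1.2 + 0.09 * S + 0.04 * EXP + rng.normal(0, 0.5, n))

ystar = y / np.exp(np.mean(np.log(y)))     # scale by the geometric mean
X = np.column_stack([np.ones(n), S, EXP])

def rss_for_lambda(lam):
    """RSS from regressing the scaled Box-Cox transform on X by OLS."""
    z = np.log(ystar) if abs(lam) < 1e-8 else (ystar**lam - 1) / lam
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    return resid @ resid

grid = np.linspace(-1, 1, 201)
best = grid[np.argmin([rss_for_lambda(l) for l in grid])]
print(best)   # data generated from a log-linear model, so the minimum lies near 0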
COMPARING LINEAR AND LOGARITHMIC SPECIFICATIONS

. boxcox EARNINGS S EXP

Number of obs = 500


LR chi2(2) = 76.08
Log likelihood = -1785.403 Prob > chi2 = 0.000

------------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
/theta | .1088657 .05362 2.03 0.042 .0037726 .2139589
------------------------------------------------------------------------------

---------------------------------------------------------
Test Restricted LR statistic P-value
H0: log likelihood chi2 Prob > chi2
---------------------------------------------------------
theta = -1 -2025.7902 480.77 0.000
theta = 0 -1787.4901 4.17 0.041
theta = 1 -1912.8953 254.98 0.000
---------------------------------------------------------

However, even the value of 0 does not (quite) lie in the 95 percent confidence
interval. (The log-likelihood tests will be explained in Chapter 10.)
28
Introduction to Econometrics
Chapter heading
QUADRATIC EXPLANATORY
VARIABLES
QUADRATIC EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X2² + u

We will now consider models with quadratic explanatory variables of the


type shown. Such a model can be fitted using OLS with no modification.

However, the usual interpretation of a parameter that represents the effect


of a unit change in its associated variable, holding all other variables
constant, cannot be applied. X2 can’t change without X2² also changing.

1
QUADRATIC EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X2² + u

dY/dX2 = β2 + 2β3 X2

Differentiating the equation with respect to X2, one obtains the change in Y per
unit change in X2.
Thus the impact of a unit change in X2 on Y, (β2 + 2β3X2), is itself a function of X2.
This means that the interpretation of β2 differs from that in the ordinary linear
model, where it measures the unqualified effect of a unit change in X2 on Y.
In this model, β2 should be interpreted as the effect of a unit change in X2 on
Y for the special case where X2 = 0. The marginal effect will be different for
nonzero values of X2.

3
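A small Python sketch of this point, evaluating the marginal effect β2 + 2β3X2 at several values of X2; the coefficients used are the point estimates from the quadratic earnings function reported a few slides below:

# point estimates of the quadratic earnings function reported below
b2, b3 = 0.191, 0.0367      # coefficients of S and of S squared

for s in [0, 8, 12, 16, 20]:                 # years of schooling
    marginal = b2 + 2 * b3 * s               # dY/dS evaluated at S = s
    print(f"S = {s:2d}: an extra year of schooling raises hourly earnings by ${marginal:.2f}")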
QUADRATIC EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X2² + u

dY/dX2 = β2 + 2β3 X2

Y = β1 + (β2 + β3 X2) X2 + u

β3 also has a special interpretation. If we rewrite the model as shown, β3 can be


interpreted as the rate of change of the coefficient of X2, per unit change in X2.
6
QUADRATIC EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X2² + u

dY/dX2 = β2 + 2β3 X2

Y = β1 + (β2 + β3 X2) X2 + u

Only 1 has a conventional interpretation. As usual, it is the value of Y (apart


from the random component) when X2 = 0.
7
QUADRATIC EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X2² + u

dY/dX2 = β2 + 2β3 X2

Y = β1 + (β2 + β3 X2) X2 + u

If 3>0 then Y may have a minimum point. If 3<0 then Y may has a maximum point.

8
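The turning point is where dY/dX2 = β2 + 2β3X2 equals zero, that is, at X2 = −β2/(2β3). A minimal sketch using the point estimates that appear later in this section:

def turning_point(b2, b3):
    """Value of X2 at which dY/dX2 = b2 + 2*b3*X2 equals zero."""
    return -b2 / (2 * b3)

# quadratic earnings function below: b3 > 0, so the turning point is a minimum
print(turning_point(0.191, 0.0367))     # about -2.6 years of schooling

# OECD employment-growth example below: b3 < 0, so the turning point is a maximum
print(turning_point(0.6616, -0.0491))   # about 6.7 percent GDP growth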
QUADRATIC EXPLANATORY VARIABLES

. gen SSQ = S*S


. reg EARNINGS S SSQ
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 23.44
Model | 6061.38243 2 3030.69122 Prob > F = 0.0000
Residual | 64267.5838 497 129.311034 R-squared = 0.0862
-----------+------------------------------ Adj R-squared = 0.0825
Total | 70328.9662 499 140.939812 Root MSE = 11.372
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .1910651 1.785822 0.11 0.915 -3.317626 3.699757
SSQ | .0366817 .0606266 0.61 0.545 -.0824344 .1557978
_cons | 8.358401 12.86047 0.65 0.516 -16.90919 33.62599
----------------------------------------------------------------------------

We will illustrate this with the earnings function. The table gives the output of
a quadratic regression of earnings on schooling (SSQ is defined as the square
of schooling). 9
QUADRATIC EXPLANATORY VARIABLES

. gen SSQ = S*S


. reg EARNINGS S SSQ
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 23.44
Model | 6061.38243 2 3030.69122 Prob > F = 0.0000
Residual | 64267.5838 497 129.311034 R-squared = 0.0862
-----------+------------------------------ Adj R-squared = 0.0825
Total | 70328.9662 499 140.939812 Root MSE = 11.372
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .1910651 1.785822 0.11 0.915 -3.317626 3.699757
SSQ | .0366817 .0606266 0.61 0.545 -.0824344 .1557978
_cons | 8.358401 12.86047 0.65 0.516 -16.90919 33.62599
----------------------------------------------------------------------------

The coefficient of S implies that, for an individual with no schooling, the


impact of a year of schooling is to increase hourly earnings by $0.19.
10
QUADRATIC EXPLANATORY VARIABLES

. gen SSQ = S*S


. reg EARNINGS S SSQ
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 23.44
Model | 6061.38243 2 3030.69122 Prob > F = 0.0000
Residual | 64267.5838 497 129.311034 R-squared = 0.0862
-----------+------------------------------ Adj R-squared = 0.0825
Total | 70328.9662 499 140.939812 Root MSE = 11.372
----------------------------------------------------------------------------
EARNINGS | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .1910651 1.785822 0.11 0.915 -3.317626 3.699757
SSQ | .0366817 .0606266 0.61 0.545 -.0824344 .1557978
_cons | 8.358401 12.86047 0.65 0.516 -16.90919 33.62599
----------------------------------------------------------------------------

It is also doubtful whether the intercept has any sensible interpretation. It


implies that an individual with no schooling would have hourly earnings of
$8.36, which seems implausibly high. 11
QUADRATIC EXPLANATORY VARIABLES
[Figure: fitted quadratic relationship between hourly earnings ($) and years of schooling (highest grade completed), using the coefficients above: S = 0.191, SSQ = 0.0367, constant = 8.36]

The quadratic relationship is illustrated in the figure. Over the range of the
actual data, it fits the observations tolerably well. The fit is not dramatically
different from the linear and semilogarithmic specifications. 12
QUADRATIC EXPLANATORY VARIABLES
[Figure repeated: quadratic fit of hourly earnings ($) against years of schooling (highest grade completed)]

Most wage equation studies prefer the semilogarithmic specification. The


slope coefficient has a simple interpretation, and the specification does not give
rise to nonsensical predictions within the data range. 13
QUADRATIC EXPLANATORY VARIABLES

Average annual percentage growth rates


Employment GDP Employment GDP

Australia 2.57 3.52 Korea 1.11 4.48


Austria 1.64 2.66 Luxembourg 1.34 4.55
Belgium 1.06 2.27 Mexico 1.88 3.36
Canada 1.90 2.57 Netherlands 0.51 2.37
Czech Republic 0.79 5.62 New Zealand 2.67 3.41
Denmark 0.58 2.02 Norway 1.36 2.49
Estonia 2.28 8.10 Poland 2.05 5.16
Finland 0.98 3.75 Portugal 0.13 1.04
France 0.69 2.00 Slovak Republic 2.08 7.04
Germany 0.84 1.67 Slovenia 1.60 4.82
Greece 1.55 4.32 Sweden 0.83 3.47
Hungary 0.28 3.31 Switzerland 0.90 2.54
Iceland 2.49 5.62 Turkey 1.30 6.90
Israel 3.29 4.79 United Kingdom 0.92 3.31
Italy 0.89 1.29 United States 1.36 2.88
Japan 0.31 1.85

The data on employment growth rate, e, and GDP growth rate, g, for 31
OECD countries in Exercise 1.5 provide another example of using a
quadratic function. 14
QUADRATIC EXPLANATORY VARIABLES

. gen gsq = g*g


. reg e g gsq
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 31
-----------+------------------------------ F( 2, 28) = 7.03
Model | 6.05131556 2 3.02565778 Prob > F = 0.0034
Residual | 12.0579495 28 .430641052 R-squared = 0.3342
-----------+------------------------------ Adj R-squared = 0.2866
Total | 18.109265 30 .603642167 Root MSE = .65623
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
g | .6616232 .2988805 2.21 0.035 .0493942 1.273852
gsq | -.0490589 .0336736 -1.46 0.156 -.1180362 .0199185
_cons | -.2576489 .5845635 -0.44 0.663 -1.455073 .939775
----------------------------------------------------------------------------

The output from a quadratic regression is shown. gsq has been defined as
the square of g.
15
QUADRATIC EXPLANATORY VARIABLES

[Figure: quadratic and hyperbolic fits of employment growth rate against GDP growth rate, with the quadratic coefficients above: g = 0.662, gsq = −0.049, constant = −0.258]

The quadratic specification appears to improve on the hyperbolic function fitted


in a previous slideshow. It is more satisfactory than the latter for low values of g
in that it does not yield implausibly large negative predicted values of e. 16
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic and hyperbolic fits of employment growth rate against GDP growth rate]

The only defect is that it predicts that the fitted value of e starts to fall when
g exceeds 7.
17
QUADRATIC EXPLANATORY VARIABLES

[Figure: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

Why stop at a quadratic? Why not consider a cubic, quartic, or a polynomial of


even higher order? There are usually several good reasons for not doing so.
18
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

Diminishing marginal effects are standard in economic theory, justifying


quadratic specifications, at least as an approximation. Still, economic theory
seldom suggests that a cubic or higher-order polynomial might sensibly
represent a relationship. 19
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

The second reason follows from the first. Higher-order terms will improve fit,
but because these terms are not theoretically justified, the improvement will
be sample-specific. 20
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

Third, unless the sample is very small, the fits of higher-order polynomials are
unlikely to be very different from those of a quadratic over the main part of the
data range. 21
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

The figure illustrates these points by comparing cubic and quartic regressions
with the quadratic regression. Over the main data range, from g = 1.5 to g = 5,
the cubic and quartic fits are very similar to those of the quadratic. 22
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

R2 for the quadratic specification is 0.334. For the cubic and quartic it is
0.345 and 0.355, relatively small improvements.
23
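A hedged Python sketch of this kind of comparison, using numpy's polynomial fit; the arrays e and g here are simulated stand-ins for the OECD data, so the printed R² values will not reproduce 0.334, 0.345, and 0.355 exactly:

import numpy as np

rng = np.random.default_rng(1)

# simulated stand-ins for the OECD employment (e) and GDP (g) growth rates
g = rng.uniform(1, 8, 31)
e = -0.26 + 0.66 * g - 0.05 * g**2 + rng.normal(0, 0.6, 31)

def r_squared(y, x, degree):
    """R-squared of an OLS polynomial of the given degree."""
    coefs = np.polyfit(x, y, degree)
    fitted = np.polyval(coefs, x)
    return 1 - np.sum((y - fitted) ** 2) / np.sum((y - np.mean(y)) ** 2)

for degree in (2, 3, 4):
    print(degree, round(r_squared(e, g, degree), 3))   # gains beyond the quadratic are small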
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

Further, the cubic and quartic curves both exhibit implausible characteristics.

24
QUADRATIC EXPLANATORY VARIABLES

[Figure repeated: quadratic, cubic, and quartic fits of employment growth rate against GDP growth rate]

As g increases, the slope of the cubic first diminishes and then increases.
There is no reasonable explanation. The quartic curve declines for g values
from 5 to 7 and then exhibits a strange upward twist at its end. 25
Introduction to Econometrics
Chapter heading
INTERACTIVE EXPLANATORY
VARIABLES
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

The model shown above is linear in parameters and may be fitted using
straightforward OLS, provided that the regression model assumptions are
satisfied. However, its nonlinearity in variables has implications for
interpreting the parameters.
When multiple regression was introduced at the beginning of the previous
chapter, it was stated that the slope coefficients represented the variables'
separate, individual marginal effects on Y, holding the other variables
constant.
In this model, such an interpretation is not possible. In particular, it is
impossible to interpret b2 as the effect of X2 on Y, holding X3 and X2X3
constant, because it is impossible to hold both X3 and X2X3 constant if X2
changes.

1
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u

To make a proper interpretation of the coefficients, we can rewrite the model as shown. The
coefficient of X2, (β2 + β4X3), can now be interpreted as the marginal effect of X2 on Y,
conditional on the value of X3.
4
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u

This representation explicitly states that the marginal effect of X2 depends on


the value of X3. The interpretation of b2 now becomes the marginal effect of X2
on Y when X3 is equal to zero. 5
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u
Y = β1 + β2 X2 + (β3 + β4 X2) X3 + u

One may equally rewrite the model as shown in the third line. From this, it may be seen that the
marginal effect of X3 on Y, conditional on the value of X2, is (β3 + β4X2), and that
β3 may be interpreted as the marginal effect of X3 on Y when X2 is equal to zero.
6
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u
Y = β1 + β2 X2 + (β3 + β4 X2) X3 + u

β4 may be interpreted as the change in the coefficient of X2 when X3 changes


by one unit. Equally, it may be interpreted as the change in the coefficient of
X3 when X2 changes by one unit. 7
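A small Python sketch making these conditional marginal effects explicit; the coefficient values are hypothetical placeholders:

# hypothetical coefficients of Y = b1 + b2*X2 + b3*X3 + b4*X2*X3 + u
b1, b2, b3, b4 = 1.0, 0.08, 0.02, 0.001

def marginal_effect_X2(X3):
    """Effect on Y of a unit change in X2, conditional on X3."""
    return b2 + b4 * X3

def marginal_effect_X3(X2):
    """Effect on Y of a unit change in X3, conditional on X2."""
    return b3 + b4 * X2

print(marginal_effect_X2(0), marginal_effect_X2(10))   # b2 alone applies only when X3 = 0
print(marginal_effect_X3(0), marginal_effect_X3(10))   # b3 alone applies only when X2 = 0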
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u
Y = β1 + β2 X2 + (β3 + β4 X2) X3 + u

If X3 = 0 is a long way outside its range in the sample, interpreting β2 as the
marginal effect of X2 when X3 = 0 should be treated with caution. The same
applies to the interpretation of β3 as the marginal effect of X3 when X2 = 0. 8
applies to the interpretation of 3 as the marginal effect of X2 when X3 = 0. 8
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u
Y = β1 + β2 X2 + (β3 + β4 X2) X3 + u

Sometimes, the estimate is completely implausible, just as the estimate of the


intercept in a regression is often implausible if given a literal interpretation.
9
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1 + (β2 + β4 X3) X2 + β3 X3 + u
Y = β1 + β2 X2 + (β3 + β4 X2) X3 + u

This can make it difficult to compare the estimates of the effects of X2 and X3
on Y in models excluding and including the interactive term.
10
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

X2* = X2 − X̄2        X3* = X3 − X̄3
X2 = X2* + X̄2        X3 = X3* + X̄3

One way of mitigating the problem is to rescale X2 and X3 so that they are
measured from their sample means.
11
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

X2* = X2 − X̄2        X3* = X3 − X̄3
X2 = X2* + X̄2        X3 = X3* + X̄3

Y = β1 + β2(X2* + X̄2) + β3(X3* + X̄3) + β4(X2* + X̄2)(X3* + X̄3) + u

β1* = β1 + β2X̄2 + β3X̄3 + β4X̄2X̄3        β2* = β2 + β4X̄3
β3* = β3 + β4X̄2

Y = β1* + β2*X2* + β3*X3* + β4X2*X3* + u

Substituting for X2 and X3, the model is as shown, with new parameters
defined in terms of the original ones.
12
INTERACTIVE EXPLANATORY VARIABLES

Y = β1 + β2 X2 + β3 X3 + β4 X2 X3 + u

Y = β1* + (β2* + β4 X3*)X2* + β3*X3* + u
Y = β1* + β2*X2* + (β3* + β4 X2*)X3* + u

The point of doing this is that the coefficients of X2 and X3 now give the
marginal effects of the variables when the other variable is held at its sample
mean, which is, to some extent, a representative value. 13
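A hedged sketch of this device in Python with statsmodels; the data are simulated stand-ins for LGEARN, S, and EXP, so the point is the relationship between the two sets of coefficients rather than the particular numbers:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 500

# simulated stand-ins for schooling (S) and experience (EXP)
S = rng.uniform(8, 20, n)
EXP = rng.uniform(0, 14, n)
LGEARN = 1.2 + 0.09 * S + 0.04 * EXP + 0.001 * S * EXP + rng.normal(0, 0.5, n)

def fit(x2, x3):
    """OLS of LGEARN on x2, x3, and their interaction; returns the coefficients."""
    X = sm.add_constant(np.column_stack([x2, x3, x2 * x3]))
    return sm.OLS(LGEARN, X).fit().params

b_raw = fit(S, EXP)                           # variables in original form
b_ctr = fit(S - S.mean(), EXP - EXP.mean())   # variables measured from their means

print(b_raw)               # main-effect coefficients refer to S = 0 and EXP = 0
print(b_ctr)               # main-effect coefficients refer to the sample means
print(b_raw[3], b_ctr[3])  # the interaction coefficient is unchanged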
INTERACTIVE EXPLANATORY VARIABLES

Y = β1* + β2*X2* + β3*X3* + β4X2*X3* + u

Y = β1* + (β2* + β4 X3*)X2* + β3*X3* + u
Y = β1* + β2*X2* + (β3* + β4 X2*)X3* + u

X3* = 0  ⟺  X3 = X̄3

For example, it can be seen that β2* gives the marginal effect of X2*, and
hence X2, when X3* = 0, that is, when X3 is at its sample mean.
14
INTERACTIVE EXPLANATORY VARIABLES

Y = β1* + β2*X2* + β3*X3* + β4X2*X3* + u

Y = β1* + (β2* + β4 X3*)X2* + β3*X3* + u
Y = β1* + β2*X2* + (β3* + β4 X2*)X3* + u

X2* = 0  ⟺  X2 = X̄2

β3* has a similar interpretation.

15
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 40.12
Model | 21.2104059 2 10.6052029 Prob > F = 0.0000
Residual | 131.388814 497 .264363811 R-squared = 0.1390
-----------+------------------------------ Adj R-squared = 0.1355
Total | 152.59922 499 .30581006 Root MSE = .51416
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

We will illustrate the analysis with a wage equation in which the logarithm of
hourly earnings is regressed on years of schooling and work experience. We
start with a straightforward linear specification using EAWE Data Set 21. 16
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 40.12
Model | 21.2104059 2 10.6052029 Prob > F = 0.0000
Residual | 131.388814 497 .264363811 R-squared = 0.1390
-----------+------------------------------ Adj R-squared = 0.1355
Total | 152.59922 499 .30581006 Root MSE = .51416
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

The regression implies that an extra year of schooling increases earnings


by 9.2 percent and that an extra year of work experience increases them by
4.1 percent. 17
INTERACTIVE EXPLANATORY VARIABLES

. gen SEXP = S*EXP


. reg LGEARN S EXP SEXP
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 3, 496) = 26.75
Model | 21.254031 3 7.08467699 Prob > F = 0.0000
Residual | 131.345189 496 .264808848 R-squared = 0.1393
-----------+------------------------------ Adj R-squared = 0.1341
Total | 152.59922 499 .30581006 Root MSE = .5146
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

The interactive variable SEXP is defined as the product of S and EXP, and
the regression is performed again, including this term.
18
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

. gen SEXP = S*EXP


. reg LGEARN S EXP SEXP
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

The schooling coefficient has fallen. It has now changed its meaning. It now
estimates the impact of an extra year of schooling for individuals without
work experience. 19
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

. gen SEXP = S*EXP


. reg LGEARN S EXP SEXP
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

The experience coefficient has fallen sharply, and its meaning has also
changed. Now, it refers to individuals with no schooling, even though every
individual in the sample had at least 8 years of schooling. 20
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

. gen SEXP = S*EXP


. reg LGEARN S EXP SEXP
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

The SEXP coefficient indicates that the schooling coefficient rises by 0.0012,
that is, by 0.12 percent, for every additional year of work experience. Equally, it
suggests that the experience coefficient rises by 0.12 percent for every extra year
of schooling. 21
INTERACTIVE EXPLANATORY VARIABLES

. sum S EXP

Variable | Obs Mean Std. Dev. Min Max


-------------+--------------------------------------------------------
S | 500 14.866 2.742825 8 20
EXP | 500 6.444577 2.924476 0 13.92308

. gen S1 = S - 14.866
. gen EXP1 = EXP - 6.445
. gen SEXP1 = S1*EXP1

We now define S1, EXP1, and SEXP1 as the corresponding schooling,


experience, and interactive variables with subtracted means and repeat the
regressions. We first use the sum (summarize) command to find the mean
values of S and EXP. 22
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 2, 497) = 40.12
Model | 21.2104059 2 10.605203 Prob > F = 0.0000
Residual | 131.388814 497 .26436381 R-squared = 0.1390
-----------+------------------------------ Adj R-squared = 0.1355
Total | 152.59922 499 .30581006 Root MSE = .51416
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------

Here is the regression without the interactive term. The top half of the
output is identical to that when LGEARN was regressed on S and EXP. What
differences do you expect in the bottom half? 23
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

The slope coefficients (and their standard errors and t statistics) are precisely
the same as before. Only the intercept has been changed by subtracting the
means from S and EXP. 24
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------
e^1.20 = 3.32

In the original specification, the constant estimates predicted LGEARN when


S = 0 and EXP = 0. This implies hourly earnings of e^1.20 = $3.32, but it is
doubtful whether this is meaningful. 25
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------
e^2.82 = 16.78
. reg LGEARN S EXP
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------
e^1.20 = 3.32

In the revised specification, the constant estimates predicted LGEARN when


S1 = 0 and EXP1 = 0, that is, when S and EXP are at their sample means. This implies
hourly earnings of e^2.82 = $16.78, which makes much better sense. 26
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
Source | SS df MS Number of obs = 500
-----------+------------------------------ F( 3, 496) = 26.75
Model | 21.2540309 3 7.08467697 Prob > F = 0.0000
Residual | 131.345189 496 .264808848 R-squared = 0.1393
-----------+------------------------------ Adj R-squared = 0.1341
Total | 152.59922 499 .30581006 Root MSE = .5146
----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

Here is the output from the regression using S and EXP, with means extracted
and their interactive term. The top half of the output is identical to that when
LGEARN was regressed on S, EXP, and SEXP. 27
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

. reg LGEARN S EXP SEXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

However, the bottom half is different. The coefficients of S1 and EXP1 measure
the effects of those variables for the mean value of the other variable, that is, for
a ‘typical’ individual. The coefficients of S and EXP measure their effects when
the other variable is zero. 28
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

. reg LGEARN S EXP SEXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

Note that the coefficient of the interactive term is the same.

29
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

. reg LGEARN S EXP SEXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

As before, it measures the change in the schooling coefficient per unit (one
year) change in experience and is unaffected by the extraction of the means.
It also measures the change in the experience coefficient per unit change in
schooling. 30
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

With the means-extracted variables, we can see more clearly the impact of
including the interactive term.
31
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S1 EXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP1 | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 2.824265 .0229941 122.83 0.000 2.779088 2.869443
----------------------------------------------------------------------------

. reg LGEARN S1 EXP1 SEXP1


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S1 | .092194 .0104156 8.85 0.000 .0717299 .1126581
EXP1 | .0415275 .0099934 4.16 0.000 .0218929 .0611621
SEXP1 | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 2.829957 .0269497 105.01 0.000 2.777008 2.882907
----------------------------------------------------------------------------

Assuming that the interactive term belongs in the model, omitting it has little effect on the


schooling and experience coefficients.
32
INTERACTIVE EXPLANATORY VARIABLES

. reg LGEARN S EXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0916942 .0103338 8.87 0.000 .0713908 .1119976
EXP | .0405521 .009692 4.18 0.000 .0215098 .0595944
_cons | 1.199799 .1980634 6.06 0.000 .8106537 1.588943
----------------------------------------------------------------------------

. reg LGEARN S EXP SEXP


----------------------------------------------------------------------------
LGEARN | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
S | .0843417 .0208594 4.04 0.000 .0433581 .1253253
EXP | .0234143 .0433233 0.54 0.589 -.0617055 .1085341
SEXP | .0012184 .0030019 0.41 0.685 -.0046796 .0071165
_cons | 1.308507 .3332092 3.93 0.000 .6538312 1.963182
----------------------------------------------------------------------------

Here, again, are the corresponding results with the original variables for
comparison, where the introduction of the interactive term appears to have a
much larger effect. 33
Introduction to Econometrics
Chapter heading
NONLINEAR REGRESSION
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

➢ Suppose you believe that a variable Y depends on a variable X


according to the relationship shown, and you wish to obtain
estimates of 1, 2, and 3 given data on Y and X.
➢ There is no way of transforming the relationship to obtain a linear
relationship, so the usual regression procedure cannot be applied.
➢ Nevertheless, the principle of minimizing the sum of the squares of
the residuals can still be used to obtain estimates of the parameters.
We will describe a simple nonlinear regression algorithm that uses
this principle. It consists of a series of repeated steps.

1
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

You start by guessing plausible values for the parameters.

4
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

You calculate the corresponding fitted values of Y from the data on X,


conditional on these values of the parameters.
5
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

You calculate the residual for each observation in the sample, and hence
RSS, the sum of the squares of the residuals.
6
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

You then make small changes in one or more of your estimates of the
parameters.
7
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

6. Re-calculate Ŷi, ûi, RSS.

Using the new estimates of 1, 2, and 3, you recalculate the fitted values of Y.
Then, you recalculate the residuals and RSS.
8
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

6. Re-calculate Ŷi, ûi, RSS.

7. If new RSS < old RSS, continue adjustment.
   Otherwise try different adjustment.

If RSS is smaller than before, your new estimates of the parameters are better
than the old ones, and you continue adjusting your estimates in the same
direction. Otherwise, you would try different adjustments. 9
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

6. Re-calculate Ŷi, ûi, RSS.

7. If new RSS < old RSS, continue adjustment.
   Otherwise try different adjustment.

8. Repeat steps 5, 6, and 7 to convergence.

You repeat steps 5, 6, and 7 until you cannot make any changes
in the estimates of the parameters that would reduce RSS.
10
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

6. Re-calculate Ŷi, ûi, RSS.

7. If new RSS < old RSS, continue adjustment.
   Otherwise try different adjustment.

8. Repeat steps 5, 6, and 7 to convergence.

You conclude that you have minimized RSS, and you can describe the final
estimates of the parameters as the least squares estimates.
11
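A minimal Python sketch of this idea for Y = β1 + β2 X^β3 + u, using a crude coordinate-by-coordinate search (real software uses the much more refined updating rules mentioned on the next slide); the data are simulated:

import numpy as np

rng = np.random.default_rng(3)

# simulated data from Y = 2 + 3 * X**0.5 + u
X = rng.uniform(1, 10, 200)
Y = 2.0 + 3.0 * X**0.5 + rng.normal(0, 0.3, 200)

def rss(b):
    b1, b2, b3 = b
    return np.sum((Y - (b1 + b2 * X**b3)) ** 2)

b = np.array([1.0, 1.0, 1.0])      # step 1: initial guesses
step = 0.5
while step > 1e-6:                 # steps 5-8: keep adjusting until convergence
    improved = False
    for i in range(3):
        for delta in (step, -step):
            trial = b.copy()
            trial[i] += delta
            if rss(trial) < rss(b):    # keep an adjustment only if it lowers RSS
                b = trial
                improved = True
    if not improved:
        step /= 2                      # no improvement: shrink the step and retry

print(b, rss(b))                       # final estimates and the minimized RSS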
NONLINEAR REGRESSION

Y = β1 + β2 X^β3 + u

Nonlinear regression algorithm

1. Guess β1, β2, and β3. The guesses are denoted β̂1, β̂2, and β̂3.

2. Calculate Ŷi = β̂1 + β̂2 Xi^β̂3 for each observation.

3. Calculate ûi = Yi − Ŷi for each observation.

4. Calculate RSS = Σ ûi².

5. Adjust β̂1, β̂2, and β̂3.

6. Re-calculate Ŷi, ûi, RSS.

7. If new RSS < old RSS, continue adjustment.
   Otherwise try different adjustment.

8. Repeat steps 5, 6, and 7 to convergence.

It should be emphasized that, long ago, mathematicians devised sophisticated


techniques to minimize the number of steps required by algorithms of this
type. 12
NONLINEAR REGRESSION

2
e = 1 + +u
g
3
Employment growth rate

0
0 1 2 3 4 5 6 7 8 9
GDP growth rate

We now return to the relationship between employment growth rate, e, and GDP
growth rate, g, considered in the first slideshow for this chapter. e and g are
hypothesized to be related as shown. 13
NONLINEAR REGRESSION

2
e = 1 + +u
g
3
Employment growth rate

0
0 1 2 3 4 5 6 7 8 9
GDP growth rate

According to this specification, as g becomes large, e will tend to a limit of β1.
Looking at the figure, we see that the maximum value of e is about 3. So, we will
take this as our initial value for β1. We then hunt for the optimal value of β2,
conditional on this guess for β1. 14
conditional on this guess for 1. 14
NONLINEAR REGRESSION

[Figure: RSS plotted against the estimate of β2, conditional on β̂1 = 3; the minimum is at −4.22]

The figure shows RSS plotted as a function of β̂2, conditional on β̂1 = 3.
From this we see that the optimal value of β̂2, conditional on β̂1 = 3, is −4.22.
15
NONLINEAR REGRESSION

[Figure: RSS plotted against the estimate of β1, conditional on β̂2 = −4.22; the minimum is at 2.82]

Next, holding β̂2 at −4.22, we look to improve our guess for β1. The figure shows
RSS as a function of β̂1, conditional on β̂2 = −4.22. We see that the optimal
value of β̂1 is 2.82. 16
NONLINEAR REGRESSION

[Figure repeated: RSS plotted against the estimate of β1, conditional on β̂2 = −4.22]
If we continue to do this, both parameter estimates will converge on limits and


cease to change. We will then have reached the values that yield minimum RSS.
17
NONLINEAR REGRESSION

[Figure repeated: RSS plotted against the estimate of β1, conditional on β̂2 = −4.22]

The limits must be the values from the transformed linear regression shown
in the first slideshow for this chapter: β̂1 = 2.18 and β̂2 = −2.36. The same
criterion, the minimization of RSS, has determined them. All we have done is
use a different method. 18
NONLINEAR REGRESSION

2
e = 1 +
. nl (e = {beta1} + {beta2}/g)
(obs = 31) +u
Iteration 0: residual SS = 12.30411
g
Iteration 1: residual SS = 12.30411
----------------------------------------------------------------------------
Source | SS df MS
-----------+------------------------------ Number of obs = 31
Model | 5.80515805 1 5.80515805 R-squared = 0.3206
Residual | 12.304107 29 .42427955 Adj R-squared = 0.2971
-----------+------------------------------ Root MSE = .6513674
Total | 18.109265 30 .603642167 Res. dev. = 59.32851
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
/beta1 | 2.17537 .249479 8.72 0.000 1.665128 2.685612
/beta2 | -2.356136 .6369707 -3.70 0.001 -3.658888 -1.053385
----------------------------------------------------------------------------

Here is the output for the present hyperbolic regression of e on g using


nonlinear regression. It is, as usual, Stata output, but output from other
regression applications will look similar. 19
NONLINEAR REGRESSION

2
e = 1 +
. nl (e = {beta1} + {beta2}/g)
(obs = 31) +u
Iteration 0: residual SS = 12.30411
g
Iteration 1: residual SS = 12.30411
----------------------------------------------------------------------------
Source | SS df MS
-----------+------------------------------ Number of obs = 31
Model | 5.80515805 1 5.80515805 R-squared = 0.3206
Residual | 12.304107 29 .42427955 Adj R-squared = 0.2971
-----------+------------------------------ Root MSE = .6513674
Total | 18.109265 30 .603642167 Res. dev. = 59.32851
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
/beta1 | 2.17537 .249479 8.72 0.000 1.665128 2.685612
/beta2 | -2.356136 .6369707 -3.70 0.001 -3.658888 -1.053385
----------------------------------------------------------------------------

The Stata command for nonlinear regression is 'nl'. It is followed by the
hypothesized mathematical relationship, within parentheses. The parameters
must be given names placed within braces. Here β1 is {beta1} and β2 is {beta2}.
20
NONLINEAR REGRESSION

2
e = 1 +
. gen z = 1/g
. reg e z
+u
g
----------------------------------------------------------------------------
Source | SS df MS Number of obs = 31
-----------+------------------------------ F( 1, 29) = 13.68
Model | 5.80515811 1 5.80515811 Prob > F = 0.0009
Residual | 12.3041069 29 .424279548 R-squared = 0.3206
-----------+------------------------------ Adj R-squared = 0.2971
Total | 18.109265 30 .603642167 Root MSE = .65137
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
z | -2.356137 .6369707 -3.70 0.001 -3.658888 -1.053385
_cons | 2.17537 .249479 8.72 0.000 1.665128 2.685612
----------------------------------------------------------------------------

ê = 2.18 − 2.36 z = 2.18 − 2.36/g

The output is effectively the same as the linear regression output in the first
slideshow for this chapter.
21
NONLINEAR REGRESSION

2
4 e = 1 + +u
g
3
Employment growth rate

0
0 1 2 3 4 5 6 7 8 9

-1

-2

-3
GDP growth rate

The hyperbolic function imposes the constraint that the function plunges to
minus infinity for positive g as g approaches zero.
22
NONLINEAR REGRESSION

2
e = 1 +
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 31) +u
Iteration 0: residual SS = 12.30411
3 + g
Iteration 1: residual SS = 12.27327
.....................................
Iteration 8: residual SS = 11.98063
----------------------------------------------------------------------------
Source | SS df MS
-----------+------------------------------ Number of obs = 31
Model | 6.12863996 2 3.06431998 R-squared = 0.3384
Residual | 11.9806251 28 .427879466 Adj R-squared = 0.2912
-----------+------------------------------ Root MSE = .654125
Total | 18.109265 30 .603642167 Res. dev. = 58.5026
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
/beta1 | 2.714411 1.017058 2.67 0.013 .6310616 4.79776
/beta2 | -6.140415 8.770209 -0.70 0.490 -24.10537 11.82454
/beta3 | 1.404714 2.889556 0.49 0.631 -4.514274 7.323702
----------------------------------------------------------------------------

This feature can be relaxed by using the variation shown. Unlike the
previous function, this cannot be linearized by any transformation. Here,
nonlinear regression must be used. 23
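A hedged Python sketch of fitting this non-linearizable specification by least squares with scipy's curve_fit (this stands in for Stata's nl command; the arrays e and g are simulated stand-ins for the OECD data):

import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(4)

# simulated stand-ins for the OECD growth rates
g = rng.uniform(1, 8, 31)
e = 2.7 - 6.1 / (1.4 + g) + rng.normal(0, 0.6, 31)

def model(g, b1, b2, b3):
    """Hyperbolic specification that cannot be linearized: e = b1 + b2/(b3 + g)."""
    return b1 + b2 / (b3 + g)

beta, cov = curve_fit(model, g, e, p0=(3.0, -5.0, 1.0))
print(beta)                   # least squares estimates of b1, b2, b3
print(np.sqrt(np.diag(cov)))  # their standard errors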
NONLINEAR REGRESSION

2
e = 1 +
. nl (e = {beta1} + {beta2}/({beta3} + g))
(obs = 31) +u
Iteration 0: residual SS = 12.30411
3 + g
Iteration 1: residual SS = 12.27327
.....................................
Iteration 8: residual SS = 11.98063
----------------------------------------------------------------------------
Source | SS df MS
-----------+------------------------------ Number of obs = 31
Model | 6.12863996 2 3.06431998 R-squared = 0.3384
Residual | 11.9806251 28 .427879466 Adj R-squared = 0.2912
-----------+------------------------------ Root MSE = .654125
Total | 18.109265 30 .603642167 Res. dev. = 58.5026
----------------------------------------------------------------------------
e | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-----------+----------------------------------------------------------------
/beta1 | 2.714411 1.017058 2.67 0.013 .6310616 4.79776
/beta2 | -6.140415 8.770209 -0.70 0.490 -24.10537 11.82454
/beta3 | 1.404714 2.889556 0.49 0.631 -4.514274 7.323702
----------------------------------------------------------------------------

The output for this specification is shown, with most of the iteration
messages deleted.
24
NONLINEAR REGRESSION

2
4 e = 1 + +u
3 + g
3
(4.47)
Employment growth rate

(4.46)
1

0
0 1 2 3 4 5 6 7 8 9

-1

-2

-3
GDP growth rate

The figure compares the original (black) and new (red) hyperbolic functions.
The overall fit is not significantly improved, but the specification does seem
more satisfactory. 25
