Econometrics II Chapter One


CHAPTER ONE:
REGRESSION ON DUMMY VARIABLES
1. Introduction

• In regression analysis, we often face a qualitative response (dependent) variable of the “yes” or “no” type.

• Discrete choice models that deal with such binary responses are called binary choice models.

• Technically, it is possible to estimate binary choice models using OLS.

• A linear model for a binary choice that is estimated by OLS is called a linear probability model (LPM).
• Such models are very useful in that they allow us to address questions for which there is a “yes or no” answer.

• When examining dummy dependent variables, we need to ensure that there are sufficient numbers of 0s and 1s.
The nature of dummy variables: In regression analysis the dependent
variable is frequently influenced not only by variables that can be readily
quantified on some well-defined scale (e.g., income, output, prices, costs,
height, and temperature), but also by variables that are essentially
qualitative in nature (e.g., sex, race, color, religion, nationality, wars,
earthquakes, strikes, political upheavals, and changes in government
economic policy). For example, holding all other factors constant, female
college professors are found to earn less than their male counterparts,
and nonwhites are found to earn less than whites.
• This pattern may result from sex or racial discrimination, but whatever the reason, qualitative variables such as sex and race do influence the dependent variable and clearly should be included among the explanatory variables.

• Since such qualitative variables usually indicate the presence or absence of a “quality” or an attribute, such as male or female, black or white, or Christian or Muslim, one method of “quantifying” such attributes is to construct artificial variables that take on values of 1 or 0, with 0 indicating the absence of an attribute and 1 indicating the presence (or possession) of that attribute.
• For example, 1 may indicate that a person is a male, and 0 may designate a female; or 1 may indicate that a person is a college graduate, and 0 that the person is not, and so on. Variables that assume such 0 and 1 values are called dummy variables.

• Alternative names are indicator variables, binary variables, categorical variables, and dichotomous variables.
• Dummy variables can be used in regression models just as easily as quantitative variables. As a matter of fact, a regression model may contain explanatory variables that are exclusively dummy, or qualitative, in nature.

• Example:

$$Y_i = a + \beta D_i + U_i \qquad (5.01)$$

where $Y_i$ = annual salary of a college professor
$D_i$ = 1 if male college professor
      = 0 otherwise (i.e., female professor)

Note that (5.01) is like the two-variable regression models encountered previously, except that instead of a quantitative X variable we have a dummy variable D (hereafter, we shall designate all dummy variables by the letter D).
• Model (5.01) may enable us to find out whether sex makes any difference in a college professor’s salary, assuming, of course, that all other variables such as age, degree attained, and years of experience are held constant. Assuming that the disturbances satisfy the usual assumptions of the classical linear regression model, we obtain from (5.01):

• Mean salary of female college professors: $E(Y_i \mid D_i = 0) = a$

• Mean salary of male college professors: $E(Y_i \mid D_i = 1) = a + \beta$
• That is, the intercept term $a$ gives the mean salary of female college professors, and the slope coefficient $\beta$ tells by how much the mean salary of a male college professor differs from the mean salary of his female counterpart, with $a + \beta$ reflecting the mean salary of the male college professor.

• A test of the null hypothesis that there is no sex discrimination ($H_0\colon \beta = 0$) can easily be made by running regression (5.01) in the usual manner and checking, on the basis of the t test, whether the estimated $\beta$ is statistically significant.
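The interpretation of the intercept and slope in a dummy-only regression can be checked numerically. The following is a minimal sketch in Python, assuming a small hypothetical salary sample (the numbers and the variable names `salary`, `D`, `a_hat`, `beta_hat` are illustrative and not taken from the slides). It applies the usual two-variable OLS formulas and compares the estimates with the group means.

```python
from statistics import mean

# Hypothetical salaries (in thousands); D = 1 for male, 0 for female.
salary = [22.0, 19.0, 18.0, 21.7, 18.5, 21.0, 20.5, 17.0, 17.5, 21.2]
D = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

# OLS for Y = a + beta*D + U using the two-variable regression formulas.
d_bar, y_bar = mean(D), mean(salary)
beta_hat = (sum((d - d_bar) * (y - y_bar) for d, y in zip(D, salary))
            / sum((d - d_bar) ** 2 for d in D))
a_hat = y_bar - beta_hat * d_bar

# Group means for comparison with the regression estimates.
mean_female = mean(y for y, d in zip(salary, D) if d == 0)
mean_male = mean(y for y, d in zip(salary, D) if d == 1)

print(f"a_hat = {a_hat:.3f}, female mean = {mean_female:.3f}")
print(f"a_hat + beta_hat = {a_hat + beta_hat:.3f}, male mean = {mean_male:.3f}")
```

In this sketch the estimated intercept coincides with the female mean and the intercept plus slope coincides with the male mean, which is the algebraic property the text describes.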
1.2. Regression on one quantitative variable and one qualitative variable with two classes or categories

• Consider the model:

$$Y_i = \alpha_1 + \alpha_2 D_i + \beta X_i + U_i \qquad (5.03)$$

where $Y_i$ = annual salary of a college professor
$X_i$ = years of teaching experience
$D_i$ = 1 if male
      = 0 otherwise (i.e., female)

• Model (5.03) contains one quantitative variable (years of teaching experience) and one qualitative variable (sex) that has two classes (or levels, classifications, or categories), namely, male and female. What is the meaning of this equation? Assuming, as usual, that $E(U_i) = 0$, we see that
• Mean salary of female college professors:

$$E(Y_i \mid X_i, D_i = 0) = \alpha_1 + \beta X_i \qquad (5.04)$$

• Mean salary of male college professors:

$$E(Y_i \mid X_i, D_i = 1) = (\alpha_1 + \alpha_2) + \beta X_i \qquad (5.05)$$

• Geometrically, we have the situation shown in fig. 5.01 (for illustration, it is assumed that $\alpha_1 > 0$).

• In words, model (5.03) postulates that the male and female college professors’ salary functions in relation to the years of teaching experience have the same slope ($\beta$) but different intercepts.

• In other words, it is assumed that the level of the male professors’ mean salary differs from that of the female professors’ mean salary by $\alpha_2$, but the rate of change in the mean annual salary with years of experience is the same for both sexes.
• If the assumption of common slopes is valid, a test of the hypothesis that the two regressions (5.04) and (5.05) have the same intercept (i.e., there is no sex discrimination) can be made easily by running the regression (5.03) and noting the statistical significance of the estimated $\alpha_2$ on the basis of the traditional t test.

• If the t test shows that $\alpha_2$ is statistically significant, we reject the null hypothesis that the male and female college professors’ levels of mean annual salary are the same.
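The t test on the differential intercept can be sketched as follows. This is an illustration under stated assumptions: the ten observations are hypothetical, the helper `invert` is written only for this example, and OLS is computed directly from the normal equations rather than with a statistics package.

```python
import math

def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(A)
    M = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(k):
            if r != c:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [row[k:] for row in M]

# Hypothetical data: years of experience, D = 1 if male, salary in thousands.
X_years = [1, 3, 5, 7, 9, 2, 4, 6, 8, 10]
D = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
Y = [20.5, 23.1, 26.4, 28.9, 32.2, 25.2, 27.8, 31.1, 33.7, 37.0]

rows = [[1.0, d, x] for d, x in zip(D, X_years)]  # columns: constant, D, X
k, n = 3, len(Y)
XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(k)]
XtX_inv = invert(XtX)
coef = [sum(XtX_inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]

# Residual variance and standard errors under the classical assumptions.
resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, Y)]
sigma2 = sum(e * e for e in resid) / (n - k)
se = [math.sqrt(sigma2 * XtX_inv[i][i]) for i in range(k)]
t_alpha2 = coef[1] / se[1]

print(f"alpha1={coef[0]:.4f} alpha2={coef[1]:.4f} beta={coef[2]:.4f}")
print(f"t statistic for alpha2 = {t_alpha2:.2f}")
```

A large absolute t statistic for the dummy coefficient would lead us to reject the hypothesis of a common intercept; the critical value depends on the chosen significance level and on n − 3 degrees of freedom.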
Before proceeding further, note the following features of the dummy variable regression model considered previously.

1. To distinguish the two categories, male and female, we have introduced only one dummy variable $D_i$. If $D_i = 1$ always denotes a male, then when $D_i = 0$ we know that the person is a female, since there are only two possible outcomes.

• Hence, one dummy variable suffices to distinguish two categories. The general rule is this: if a qualitative variable has m categories, introduce only m − 1 dummy variables.

• In our example, sex has two categories, and hence we introduced only a single dummy variable. If this rule is not followed in a model that contains an intercept, we shall fall into what might be called the dummy variable trap, that is, the situation of perfect multicollinearity.
2. The assignment of 1 and 0 values to two categories, such as male and female, is arbitrary in the sense that in our example we could have assigned D = 1 for female and D = 0 for male.

3. The group, category, or classification that is assigned the value of 0 is often referred to as the base, benchmark, control, comparison, reference, or omitted category. It is the base in the sense that comparisons are made with that category.

4. The coefficient $\alpha_2$ attached to the dummy variable D can be called the differential intercept coefficient because it tells by how much the value of the intercept term of the category that receives the value of 1 differs from the intercept coefficient of the base category.
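The dummy variable trap mentioned in feature 1 can be made concrete. The sketch below, on a hypothetical six-person sample, shows that including an intercept together with a dummy for every category makes the constant column an exact sum of the dummy columns, so that X'X is singular. The helper `det3` is written only for this illustration.

```python
# Hypothetical sample: sex recorded for six professors.
sex = ["male", "female", "female", "male", "male", "female"]

d_male = [1 if s == "male" else 0 for s in sex]
d_female = [1 if s == "female" else 0 for s in sex]
constant = [1] * len(sex)

# With an intercept AND both dummies, the constant column is an exact
# linear combination of the dummy columns: 1 = d_male + d_female.
exact_dependence = all(c == m + f for c, m, f in zip(constant, d_male, d_female))

# Consequently the 3x3 matrix X'X has determinant zero, and the normal
# equations have no unique solution.
cols = [constant, d_male, d_female]
XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Following the m - 1 rule (intercept plus one dummy) avoids the problem:
# the 2x2 matrix built from the constant and d_male is non-singular.
det_m_minus_1 = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]

print(exact_dependence, det3(XtX), det_m_minus_1)
```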
1.3. Regression on one quantitative variable and one qualitative variable with more than two classes

• Suppose that, on the basis of cross-sectional data, we want to regress the annual expenditure on health care by an individual on the income and education of the individual.

• Since the variable education is qualitative in nature, suppose we consider three mutually exclusive levels of education: less than high school, high school, and college.

• Now, unlike the previous case, we have more than two categories of the qualitative variable education.

• Therefore, following the rule that the number of dummies be one less than the number of categories of the variable, we should introduce two dummies to take care of the three levels of education.

• Assuming that the three educational groups have a common slope but different intercepts in the regression of annual expenditure on health care on annual income, we can use the following model:
• Consider the model:

$$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + U_i \qquad (5.06)$$

where $Y_i$ = annual expenditure on health care
$X_i$ = annual income
$D_{2i}$ = 1 if high school education, 0 otherwise
$D_{3i}$ = 1 if college education, 0 otherwise

• Note that in the preceding assignment of the dummy variables we are arbitrarily treating the “less than high school education” category as the base category. Therefore, the intercept $\alpha_1$ will reflect the intercept for this category.
• The differential intercepts $\alpha_2$ and $\alpha_3$ tell by how much the intercepts of the other two categories differ from the intercept of the base category, which can be readily checked as follows. Assuming $E(U_i) = 0$, we obtain from (5.06):

$$E(Y_i \mid D_{2i} = 0, D_{3i} = 0, X_i) = \alpha_1 + \beta X_i$$

$$E(Y_i \mid D_{2i} = 1, D_{3i} = 0, X_i) = (\alpha_1 + \alpha_2) + \beta X_i$$

$$E(Y_i \mid D_{2i} = 0, D_{3i} = 1, X_i) = (\alpha_1 + \alpha_3) + \beta X_i$$

• These are, respectively, the mean health care expenditure functions for the three levels of education, namely, less than high school, high school, and college. Geometrically, the situation is shown in fig 5.2 (for illustrative purposes it is assumed that $\alpha_3 > \alpha_2$).
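The coding of a three-category variable into two dummies can be sketched in Python. The helper `make_dummies`, the sample of education labels, and the coefficient values are all assumptions made for illustration; the coefficients are not estimates from any data set.

```python
def make_dummies(values, base):
    """Return one 0/1 list per non-base category (m - 1 dummies for m categories)."""
    categories = [c for c in dict.fromkeys(values) if c != base]
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical education levels for six individuals.
education = ["less_hs", "hs", "college", "hs", "less_hs", "college"]
dummies = make_dummies(education, base="less_hs")
D2, D3 = dummies["hs"], dummies["college"]

# Illustrative (assumed) coefficients for the health-expenditure model.
alpha1, alpha2, alpha3, beta = 1.0, 0.5, 1.2, 0.04

def mean_health_spending(d2, d3, income):
    """Conditional mean implied by the model with dummies D2 and D3."""
    return alpha1 + alpha2 * d2 + alpha3 * d3 + beta * income

# The three parallel lines differ only in their intercepts (income = 0).
intercepts = {
    "less_hs": mean_health_spending(0, 0, 0.0),
    "hs": mean_health_spending(1, 0, 0.0),
    "college": mean_health_spending(0, 1, 0.0),
}
print(D2, D3, intercepts)
```

Because the categories are mutually exclusive, no observation has both dummies equal to 1, and the base category is identified by both dummies being 0.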
1.4. Regression on one quantitative variable and two qualitative variables

• The technique of dummy variables can easily be extended to handle more than one qualitative variable. Let us revert to the college professors’ salary regression (5.03), but now assume that, in addition to years of teaching experience and sex, the skin color of the teacher is also an important determinant of salary.

• For simplicity, assume that color has two categories: black and white. We can now write (5.03) as:
• Consider the model:

$$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + U_i \qquad (5.07)$$

where $Y_i$ = annual salary
$X_i$ = years of teaching experience
$D_{2i}$ = 1 if male
         = 0 if female
$D_{3i}$ = 1 if white
         = 0 otherwise
• Notice that each of the two qualitative variables, sex and color, has two categories and hence needs one dummy variable for each. Note also that the omitted, or base, category now is “black female professor.”

• Assuming $E(U_i) = 0$, we can obtain the following regressions from (5.07):

Mean salary for black female professor:
$$E(Y_i \mid D_{2i} = 0, D_{3i} = 0, X_i) = \alpha_1 + \beta X_i$$

Mean salary for black male professor:
$$E(Y_i \mid D_{2i} = 1, D_{3i} = 0, X_i) = (\alpha_1 + \alpha_2) + \beta X_i$$

Mean salary for white female professor:
$$E(Y_i \mid D_{2i} = 0, D_{3i} = 1, X_i) = (\alpha_1 + \alpha_3) + \beta X_i$$

Mean salary for white male professor:
$$E(Y_i \mid D_{2i} = 1, D_{3i} = 1, X_i) = (\alpha_1 + \alpha_2 + \alpha_3) + \beta X_i$$
• Once again, it is assumed that the preceding regressions differ only in the intercept coefficient but not in the slope coefficient $\beta$.

• An OLS estimation of (5.07) will enable us to test a variety of hypotheses. Thus, if $\alpha_3$ is statistically significant, it will mean that color does affect a professor’s salary.

• Similarly, if $\alpha_2$ is statistically significant, it will mean that sex also affects a professor’s salary.

• If both these differential intercepts are statistically significant, it would mean that sex as well as color is an important determinant of professors’ salaries.
• From the preceding discussion it follows that we can extend our model to include more than one quantitative variable and more than two qualitative variables.

• The only precaution to be taken is that the number of dummies for each qualitative variable should be one less than the number of categories of that variable.
1.5. Testing for structural stability of regression models

• Until now, in the models considered in this chapter we assumed that the qualitative variables affect the intercept but not the slope coefficient of the various subgroup regressions.

• But what if the slopes are also different? If the slopes are in fact different, testing for differences in the intercepts may be of little practical significance.

• Therefore, we need to develop a general methodology to find out whether two (or more) regressions are different, where the difference may be in the intercepts or the slopes or both.
1.6. Interaction effects

• Consider the following model:

$$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + U_i \qquad (5.08)$$

where $Y_i$ = annual expenditure on clothing
$X_i$ = income
$D_{2i}$ = 1 if female
         = 0 if male
$D_{3i}$ = 1 if college graduate
         = 0 otherwise
• Implicit in this model is the assumption that the differential effect of the sex dummy $D_2$ is constant across the two levels of education and the differential effect of the education dummy $D_3$ is also constant across the two sexes. That is, if, say, the mean expenditure on clothing is higher for females than for males, this is so whether they are college graduates or not.

• Likewise, if, say, college graduates on average spend more on clothing than non-graduates, this is so whether they are females or males.
• In many applications such an assumption may be untenable. A female college graduate may spend more on clothing than a male graduate.

• In other words, there may be interaction between the two qualitative variables $D_2$ and $D_3$, and therefore their effect on mean Y may not be simply additive as in (5.08) but multiplicative as well, as in the following model:
• Consider the following model:

$$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 (D_{2i} D_{3i}) + \beta X_i + U_i \qquad (5.09)$$

• From (5.09) we obtain

$$E(Y_i \mid D_{2i} = 1, D_{3i} = 1, X_i) = (\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4) + \beta X_i \qquad (5.10)$$

which is the mean clothing expenditure of graduate females. Notice that

• $\alpha_2$ = differential effect of being a female
• $\alpha_3$ = differential effect of being a college graduate
• $\alpha_4$ = differential effect of being a female graduate

which shows that the mean clothing expenditure of graduate females differs (by $\alpha_4$) from the mean clothing expenditure of females or college graduates.
• If $\alpha_2$, $\alpha_3$, and $\alpha_4$ are all positive, the average clothing expenditure of females is higher (than the base category, which here is male non-graduate), but it is much more so if the females also happen to be graduates.

• Similarly, the average expenditure on clothing by a college graduate tends to be higher than the base category, but much more so if the graduate happens to be a female.
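The role of the interaction coefficient can be traced with a few lines of arithmetic. The sketch below uses assumed, purely illustrative coefficient values (not estimates) for the interaction model above and compares the effect of being female among non-graduates and among graduates.

```python
# Illustrative (assumed) coefficient values; they are not estimates from data.
alpha1, alpha2, alpha3, alpha4 = 2.0, 0.6, 0.9, 0.4

def intercept(d2, d3):
    """Intercept of the mean clothing-expenditure line for sex dummy d2
    and education dummy d3, including the interaction term d2*d3."""
    return alpha1 + alpha2 * d2 + alpha3 * d3 + alpha4 * d2 * d3

# Effect of being female at each education level.
female_effect_nongrad = intercept(1, 0) - intercept(0, 0)  # equals alpha2
female_effect_grad = intercept(1, 1) - intercept(0, 1)     # equals alpha2 + alpha4

print(female_effect_nongrad, female_effect_grad)
```

Without the interaction term (alpha4 = 0) the two effects would coincide, which is the additivity assumption of the model without the product term.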
1.7. The use of dummy variables in seasonal analysis

• Many economic time series based on monthly or quarterly data exhibit seasonal patterns (regular oscillatory movements).

• Examples are sales of department stores at Christmastime, demand for money (cash balances) by households at holiday times, demand for ice cream and soft drinks during the summer, and prices of crops right after the harvesting season.
• Often it is desirable to remove the seasonal factor, or component, from a time series so that one may concentrate on the other components, such as the trend.

• The process of removing the seasonal component from a time series is known as deseasonalization, or seasonal adjustment, and the time series thus obtained is called the deseasonalized, or seasonally adjusted, time series.

• Important economic time series, such as the consumer price index, the wholesale price index, and the index of industrial production, are usually published in seasonally adjusted form.
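One simple way to deseasonalize with dummy variables, offered here as a sketch since the slides do not spell out a procedure, is to regress the series on an intercept and three quarterly dummies and keep the residuals plus the overall mean. With only an intercept and a full set of seasonal dummies, the OLS fitted values equal the quarter means, so the example computes those directly. The quarterly sales figures are hypothetical.

```python
from statistics import mean

# Hypothetical quarterly sales for three years (Q1..Q4 repeated).
sales = [10.0, 12.0, 15.0, 20.0, 11.0, 13.0, 16.0, 22.0, 12.0, 14.0, 17.0, 24.0]
quarter = [(t % 4) + 1 for t in range(len(sales))]  # 1, 2, 3, 4, 1, ...

# Fitted values of a regression on an intercept and dummies for Q2, Q3, Q4
# (Q1 as base) are the quarter means, so compute those directly.
q_mean = {q: mean(s for s, qq in zip(sales, quarter) if qq == q) for q in (1, 2, 3, 4)}
intercept = q_mean[1]                                          # base-quarter mean
differential = {q: q_mean[q] - q_mean[1] for q in (2, 3, 4)}   # dummy coefficients

# Deseasonalized series: regression residual plus the overall mean.
overall = mean(sales)
adjusted = [s - q_mean[q] + overall for s, q in zip(sales, quarter)]

print(differential)
print([round(a, 2) for a in adjusted])
```

After the adjustment every quarter has the same average, while year-to-year movements in the series are retained. Official statistical agencies use more elaborate procedures; this is only the dummy-variable idea in its simplest form.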
1.8. Piecewise linear regression

• To illustrate yet another use of dummy variables, consider fig 5.3, which shows how a hypothetical company remunerates its sales representatives.

• It pays commissions based on sales in such a manner that up to a certain level, the target, or threshold, level X*, there is one (stochastic) commission structure and beyond that level another. (Note: Besides sales, other factors affect sales commission. Assume that these other factors are represented by the stochastic disturbance term.)

• More specifically, it is assumed that sales commission increases linearly with sales until the threshold level X*, after which it also increases linearly with sales but at a much steeper rate.

• Thus, we have a piecewise linear regression consisting of two linear pieces or segments, which are labeled I and II in fig. 5.3, and the commission function changes its slope at the threshold value.

• Given the data on commission, sales, and the value of the threshold level X*, the technique of dummy variables can be used to estimate the (differing) slopes of the two segments of the piecewise linear regression shown in fig. 5.3. We proceed as follows:
• Consider the model:

$$Y_i = \alpha_1 + \beta_1 X_i + \beta_2 (X_i - X^*) D_i + U_i \qquad (5.11)$$

where $Y_i$ = sales commission
$X_i$ = volume of sales generated by the salesperson
$X^*$ = threshold value of sales, also known as a knot (known in advance)
$D_i$ = 1 if $X_i > X^*$
      = 0 if $X_i < X^*$
• Assuming $E(U_i) = 0$, we see at once that

$$E(Y_i \mid D_i = 0, X_i, X^*) = \alpha_1 + \beta_1 X_i \qquad (5.12)$$

which gives the mean sales commission up to the target level X*, and

$$E(Y_i \mid D_i = 1, X_i, X^*) = \alpha_1 - \beta_2 X^* + (\beta_1 + \beta_2) X_i \qquad (5.13)$$

which gives the mean sales commission beyond the target level X*. The term $-\beta_2 X^*$ comes from expanding $\beta_2 (X_i - X^*)$ in (5.11) with $D_i = 1$.

• Thus, $\beta_1$ gives the slope of the regression line in segment I, and $(\beta_1 + \beta_2)$ gives the slope of the regression line in segment II of the piecewise linear regression shown in fig 5.3.

• A test of the hypothesis that there is no break in the regression at the threshold value X* can be conducted easily by noting the statistical significance of the estimated differential slope coefficient $\beta_2$.
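The construction of the regressor $(X_i - X^*) D_i$ and the recovery of the two slopes can be sketched as follows. The data are generated without noise from assumed parameter values, so OLS should return those values up to rounding; the helper `ols`, the threshold, and all numbers are illustrative assumptions.

```python
def ols(rows, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

x_star = 10.0  # assumed threshold (knot), known in advance
sales = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0]
D = [1.0 if x > x_star else 0.0 for x in sales]

# Commission generated without noise from assumed parameters.
a1, b1, b2 = 1.0, 0.5, 0.75
commission = [a1 + b1 * x + b2 * (x - x_star) * d for x, d in zip(sales, D)]

# Regressors: constant, X, and (X - X*) D.
rows = [[1.0, x, (x - x_star) * d] for x, d in zip(sales, D)]
alpha1_hat, beta1_hat, beta2_hat = ols(rows, commission)

slope_segment_1 = beta1_hat
slope_segment_2 = beta1_hat + beta2_hat
print(round(slope_segment_1, 6), round(slope_segment_2, 6))
```

With real, noisy data the same regression would be run and the t statistic on the coefficient of $(X_i - X^*) D_i$ examined to judge whether there is a break at the knot.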
2. REGRESSION ON DUMMY DEPENDENT VARIABLE

• Binary dependent variables are extremely common in the social sciences. Suppose we want to study the labor-force participation of adult males as a function of the unemployment rate, average wage rate, family income, education, etc.

• A person either is in the labor force or not. Hence, the dependent variable, labor-force participation, can take only two values: 1 if the person is in the labor force and 0 if he or she is not. We can consider another example. A family may or may not own a house. If it owns a house, the dependent variable takes the value 1, and 0 if it does not.
• There are several such examples where the dependent variable is dichotomous. A common feature of all these examples is that the dependent variable is of the type that elicits a yes or no response; that is, it is dichotomous in nature.

• Before we discuss the estimation of models involving dichotomous response variables, let us briefly discuss the concept of qualitative response models. These are models in which the dependent variable is a discrete outcome.

• Example 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
  Y = 1, if individual i attended college
    = 0, otherwise

• In the above example the dependent variable Y takes on only two values (i.e., 0 and 1).

• Conventional regression is generally not appropriate for analyzing a qualitative dependent variable model.

• Such models are analyzed in a general framework of probability models.
2.1. Categories of Qualitative Response Models (QRM)

Two broad categories of QRM:

1. Binomial models
• The choice is between two alternatives.

2. Multinomial models
• The choice is between more than two alternatives.
• Example: Y = 1, occupation is farming
            = 2, occupation is carpentry
            = 3, occupation is fishing

• Let us define some important terminology.

• Binary variables: variables that have two categories and are often used to indicate that an event has occurred or that some characteristic is present.
• Example: - Decision to participate in the labor force or not to participate
           - Decision to vote or not to vote
• Ordinal variables: variables that have categories that can be ranked.
• Example: - Rank to indicate political orientation
             Y = 1, radical
               = 2, liberal
               = 3, conservative
           - Rank according to educational attainment
             Y = 1, primary education
               = 2, secondary education
               = 3, university education
• Nominal variables: These variables occur when there are multiple outcomes that cannot be ordered.

• Example: Occupation can be grouped as farming, fishing, carpentry, etc. Note that the numbers are assigned arbitrarily:
  Y = 1, farming
    = 2, fishing
    = 3, carpentry
    = 4, livestock
• Count variables: These variables indicate the number of times some event has occurred.
• Example: How many strikes have occurred.

2.3. Types of Binomial Models

The four most commonly used approaches to estimating binary response models (types of binomial models) are:

1. Linear probability models

2. The logit model

3. The probit model

4. The tobit (censored regression) model
2.3.1. THE LINEAR PROBABILITY MODEL (LPM)

The linear probability model is the regression model applied to a binary dependent variable. To fix ideas, consider the following simple model:

$$Y_i = \beta_0 + \beta_1 X_i + U_i \qquad (1)$$

where $X_i$ = family income
$Y_i$ = 1 if the family owns a house
      = 0 if the family does not own a house
$U_i$ is the disturbance term

• The independent variable $X_i$ can be a discrete or a continuous variable. The model can be extended to include additional explanatory variables.
• The above model expresses the dichotomous $Y_i$ as a linear function of the explanatory variable $X_i$.

• Such models are called linear probability models (LPM) because $E(Y_i \mid X_i)$, the conditional expectation of $Y_i$ given $X_i$, can be interpreted as the conditional probability that the event will occur given $X_i$; that is, $\Pr(Y_i = 1 \mid X_i)$.

• Thus, in the preceding case, $E(Y_i \mid X_i)$ gives the probability that a family whose income is the given amount $X_i$ owns a house. The justification of the name LPM can be seen as follows.

Assuming $E(U_i) = 0$, as usual (to obtain unbiased estimators), we obtain

$$E(Y_i \mid X_i) = \beta_0 + \beta_1 X_i \qquad (2)$$
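The step that links the conditional expectation to a probability can be written out explicitly. What follows is the standard textbook argument, given as a sketch because the corresponding slide content is not recoverable from this copy; $P_i$ denotes the probability that $Y_i = 1$.

```latex
\begin{aligned}
P_i &= \Pr(Y_i = 1 \mid X_i), \qquad 1 - P_i = \Pr(Y_i = 0 \mid X_i) \\
E(Y_i \mid X_i) &= 0 \cdot (1 - P_i) + 1 \cdot P_i = P_i \\
\Rightarrow\quad P_i &= \beta_0 + \beta_1 X_i, \qquad 0 \le P_i \le 1
\end{aligned}
```

Since $P_i$ is a probability, the restriction $0 \le E(Y_i \mid X_i) \le 1$ must hold, and nothing in OLS estimation enforces it.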
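2.3.1.1. Problems with the LPM

• While the interpretation of the parameters is unaffected by having a binary outcome, several assumptions of the linear regression model are necessarily violated.

1. Heteroscedasticity of Ui
• With a binary outcome, the variance of $U_i$ depends on $X_i$: $\operatorname{Var}(U_i \mid X_i) = P_i (1 - P_i)$, where $P_i = \Pr(Y_i = 1 \mid X_i)$. The disturbances are therefore heteroscedastic.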
2. Non-normality of Ui
• Although OLS does not require the disturbances ($U_i$) to be normally distributed, we assumed them to be so distributed for the purpose of statistical inference, that is, hypothesis testing, etc. But the assumption of normality for $U_i$ is no longer tenable for the LPM because, like $Y_i$, $U_i$ takes on only two values:

$$U_i = Y_i - \beta_0 - \beta_1 X_i$$

When $Y_i = 1$: $U_i = 1 - \beta_0 - \beta_1 X_i$

When $Y_i = 0$: $U_i = -\beta_0 - \beta_1 X_i$

Obviously, $U_i$ cannot be assumed to be normally distributed. Recall, however, that normality is not required for the OLS estimates to be unbiased.
3. Non-sensical predictions
• The LPM can produce predicted values outside the admissible range of probabilities [0, 1]: it may predict values of Y that are negative or greater than 1. This is the real problem with the OLS estimation of the LPM.

4. Functional form
• Since the model is linear, a unit increase in X results in a constant change of $\beta_1$ in the probability of an event, holding all other variables constant. The increase is the same regardless of the current value of X. In many applications, this is unrealistic. When the outcome is a probability, it is often substantively reasonable that the effects of independent variables will have diminishing returns as the predicted probability approaches 0 or 1.

• Remark: Because of the above-mentioned problems, the LPM is not recommended for empirical work.
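The problem of predictions outside the unit interval is easy to reproduce. The sketch below fits an LPM by OLS to a hypothetical sample of ten families (income in arbitrary units, ownership coded 0/1; the numbers are invented for illustration) and lists the fitted "probabilities" that fall below 0 or above 1.

```python
from statistics import mean

# Hypothetical data: family income and home ownership (1 = owns a house).
income = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
owns = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

# OLS estimates of the LPM  Y = b0 + b1*X + U.
x_bar, y_bar = mean(income), mean(owns)
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(income, owns))
      / sum((x - x_bar) ** 2 for x in income))
b0 = y_bar - b1 * x_bar

# Fitted values are meant to be probabilities but are not constrained to [0, 1].
fitted = [b0 + b1 * x for x in income]
outside = [(x, round(p, 3)) for x, p in zip(income, fitted) if p < 0 or p > 1]

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print("fitted values outside [0, 1]:", outside)
```

In this sample the lowest-income family receives a negative fitted value and the highest-income family a fitted value above 1, while the slope implies the same change in probability per unit of income everywhere.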
2.3.2. THE LOGIT MODEL

• We have seen that the LPM has many problems, such as non-normality of $U_i$, heteroscedasticity of $U_i$, the possibility of predicted probabilities lying outside the 0-1 range, and generally lower $R^2$ values. But these problems are surmountable. The fundamental problem with the LPM is that it is not logically a very attractive model, because it assumes that $P_i = E(Y_i \mid X_i) = \Pr(Y_i = 1 \mid X_i)$ increases linearly with X, that is, the marginal or incremental effect of X remains constant throughout.

• Example: The LPM estimated by OLS (on home ownership) is given as follows:

$$\hat{Y}_i = -0.9457 + 0.1021 X_i$$
se = (0.1228) (0.0082)
t = (-7.6984) (12.515)
$R^2$ = 0.8048
The above regression is interpreted as follows.

• The intercept of –0.9457 gives the “probability” that a family with zero income will own a house. Since this value is negative, and since probability cannot be negative, we treat this value as zero.

• The slope value of 0.1021 means that for a unit change in income, on average the probability of owning a house increases by 0.1021, or about 10 percentage points. This is so regardless of the income level at which the change occurs, which seems patently unrealistic. In reality one would expect that $P_i$ is non-linearly related to $X_i$.
Therefore, what we need is a (probability) model that has the following two features:

1. As $X_i$ increases, $P_i = E(Y = 1 \mid X_i)$ increases but never steps outside the 0-1 interval.

2. The relationship between $P_i$ and $X_i$ is non-linear, that is, “one which approaches zero at slower and slower rates as $X_i$ gets small and approaches one at slower and slower rates as $X_i$ gets very large.”
• Therefore, one can easily use the cumulative distribution function (CDF) of a random variable to model regressions where the response variable is dichotomous, taking 0-1 values.

• The CDFs commonly chosen to represent the 0-1 response models are:

a. the logistic – which gives rise to the logit model

b. the normal – which gives rise to the probit (or normit) model
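Both CDFs have the two features required above: they stay strictly between 0 and 1 and they flatten out in the tails. The short sketch below evaluates the standard logistic and standard normal CDFs at a few points; the function names are chosen for this illustration only.

```python
import math

def logistic_cdf(z):
    """Standard logistic CDF: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

grid = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
for z in grid:
    print(f"z = {z:5.1f}   logistic = {logistic_cdf(z):.4f}   normal = {normal_cdf(z):.4f}")
```

The two curves have the same S shape; the logistic has somewhat heavier tails, so it approaches 0 and 1 more slowly than the normal.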

• Now let us see how one can estimate and interpret the logit model.

Recall that the LPM (for home ownership) was

$$P_i = E(Y = 1 \mid X_i) = \beta_0 + \beta_1 X_i$$

where X is income and Y = 1 means the family owns a house.
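The slides' own derivation of the logit model is not recoverable from this copy. As a sketch, the standard formulation replaces the linear expression for $P_i$ with the logistic CDF; the symbols $Z_i$ (the linear index) and $L_i$ (the log-odds, or logit) are notation introduced here.

```latex
\begin{aligned}
P_i &= \frac{1}{1 + e^{-Z_i}}, \qquad Z_i = \beta_0 + \beta_1 X_i \\
\frac{P_i}{1 - P_i} &= e^{Z_i} \\
L_i = \ln\!\left(\frac{P_i}{1 - P_i}\right) &= \beta_0 + \beta_1 X_i
\end{aligned}
```

$P_i$ is non-linear in $X_i$ and bounded between 0 and 1, while the log of the odds ratio is linear in $X_i$ and in the parameters.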


3. THE PROBIT MODEL

• The estimating model that emerges from the normal CDF is popularly known as the probit model.

• Here the observed dependent variable Y takes on one of the values 0 and 1 according to the following criteria, where Y* is an underlying latent variable and the threshold is normalized to zero:
  Y = 1 if Y* > 0
  Y = 0 if Y* ≤ 0

• The latent variable Y* is continuous (−∞ < Y* < ∞). It generates the observed binary variable Y.

• An observed variable Y can be observed in two states:
  i) if an event occurs, it takes a value of 1
  ii) if an event does not occur, it takes a value of 0

• The latent variable is assumed to be a linear function of the observed X’s through the structural model.
• Example: Let Y measure whether one is employed or not. It is a binary variable taking values 0 and 1.

• Y* measures the willingness to participate in the labor market. It changes continuously and is unobserved. If X is a wage rate, then as X increases the willingness to participate in the labor market will increase. The decision of the individual changes (Y becomes zero) if the wage rate is below the critical point.

• Since Y* is continuous, the model avoids the problems inherent in the LPM (i.e., the problems of non-normality of the error term and heteroscedasticity).
• However, since the latent dependent variable is unobserved, the model cannot be estimated using OLS. Maximum likelihood can be used instead.

• Most often, the choice is between normal errors and logistic errors, resulting in the probit (normit) and logit models, respectively. The coefficients derived from the maximum likelihood (ML) function will be the coefficients of the probit model if we assume a normal distribution.

• If we assume that the appropriate distribution of the error term is a logistic distribution, the coefficients that we get from the ML function will be the coefficients of the logit model. In both cases, as with the LPM, it is assumed that $E[U_i \mid X_i] = 0$.
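Maximum likelihood estimation of the logit model can be sketched with a few Newton-Raphson iterations. The example below is an illustration under stated assumptions: the ten observations are hypothetical, the model has a single regressor, and Newton-Raphson is one common way (not the only one) to maximize the logit log-likelihood. A probit fit would follow the same pattern with the normal CDF in place of the logistic.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: family income and home ownership (1 = owns a house).
income = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
owns = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = 0.0, 0.0  # starting values
for _ in range(25):
    p = [logistic(b0 + b1 * x) for x in income]
    # Score: gradient of the log-likelihood with respect to (b0, b1).
    g0 = sum(y - pi for y, pi in zip(owns, p))
    g1 = sum((y - pi) * x for y, pi, x in zip(owns, p, income))
    # Information matrix (negative Hessian), a symmetric 2x2 matrix.
    w = [pi * (1.0 - pi) for pi in p]
    h00 = sum(w)
    h01 = sum(wi * x for wi, x in zip(w, income))
    h11 = sum(wi * x * x for wi, x in zip(w, income))
    det = h00 * h11 - h01 * h01
    # Newton step: add (information matrix)^(-1) times the score.
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (h00 * g1 - h01 * g0) / det

fitted = [logistic(b0 + b1 * x) for x in income]
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print([round(pi, 3) for pi in fitted])
```

Unlike the LPM, every fitted probability lies strictly between 0 and 1. The sample is deliberately not perfectly separable by income; with perfect separation the ML estimates would not be finite.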
4. THE TOBIT MODEL

• An extension of the probit model is the tobit model developed by James Tobin. To explain this model, let us consider the home ownership example.

• Suppose we want to find out the amount of money the consumer spends in buying a house in relation to his or her income and other economic variables. Now we have a problem.

• If a consumer does not purchase a house, obviously we have no data on housing expenditure for such consumers; we have such data only on consumers who actually purchase a house.
• Thus consumers are divided into two groups, one consisting of, say, N1 consumers about whom we have information on the regressors (say income, interest rate, etc.) as well as the regressand (amount of expenditure on housing), and another consisting of, say, N2 consumers about whom we have information only on the regressors but not on the regressand.

• A sample in which information on the regressand is available only for some observations is known as a censored sample. Therefore, the tobit model is also known as a censored regression model.
• Mathematically, we can express the tobit model as

$$Y_i = \begin{cases} \beta_0 + \beta_1 X_i + U_i & \text{if RHS} > 0 \\ 0 & \text{otherwise} \end{cases}$$

where RHS = right-hand side.

• The method of maximum likelihood can be used to estimate the parameters of such models.
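The likelihood that ML maximizes for the tobit model can be sketched as follows, assuming normally distributed disturbances with standard deviation sigma (an assumption of the standard tobit model, not stated on the slide). Uncensored observations contribute a normal density term and censored observations contribute the probability that the right-hand side is non-positive. The data and function names are hypothetical, and the sketch only evaluates the log-likelihood; a numerical optimizer would be needed to maximize it.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_logpdf(z):
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def tobit_loglik(b0, b1, sigma, x, y):
    """Log-likelihood of a tobit model censored at zero."""
    ll = 0.0
    for xi, yi in zip(x, y):
        mu = b0 + b1 * xi
        if yi > 0:
            # Uncensored: density of the observed expenditure.
            ll += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
        else:
            # Censored: probability that b0 + b1*X + U <= 0.
            ll += math.log(normal_cdf(-mu / sigma))
    return ll

# Hypothetical data: income and housing expenditure (0 = no purchase).
income = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
spending = [0.0, 0.0, 0.0, 0.3, 1.2, 1.8, 3.1, 4.0]

ll_plausible = tobit_loglik(-4.0, 1.0, 1.0, income, spending)
ll_poor = tobit_loglik(0.0, 0.0, 1.0, income, spending)
print(round(ll_plausible, 3), round(ll_poor, 3))
```

Parameter values that fit both the zeros and the positive expenditures receive a higher log-likelihood, which is the criterion the ML estimator uses to choose among candidate values.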
END OF CHAPTER ONE
