Chapter 5
Note that (1.1) is like the two-variable regression model encountered in Chapter Two, except that instead of a quantitative X variable we have a dummy variable D (hereafter, we shall designate all dummy variables by the letter D).
Model (1.1) may enable us to find out whether sex makes any difference in a college professor's salary, assuming, of course, that all other variables such as age, degree attained, and years of experience are held constant. Assuming that the disturbances satisfy the usual assumptions of the classical linear regression model, we obtain from (1.1):
Mean salary of female college professors: E(Yi | Di = 0) = α …………………………… (1.2)
Mean salary of male college professors: E(Yi | Di = 1) = α + β
That is, the intercept term α gives the mean salary of female college professors, and the slope coefficient β tells by how much the mean salary of a male college professor differs from the mean salary of his female counterpart, α + β reflecting the mean salary of the male college professor.
A test of the null hypothesis that there is no sex discrimination (H0: β = 0) can easily be made by running regression (1.1) in the usual manner and finding out whether, on the basis of the t-test, the estimated β is statistically significant.
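The estimation just described can be sketched with a small hypothetical sample (the salary figures below are illustrative, not from the text). With a single 0/1 dummy as the only regressor, the OLS intercept equals the mean of the D = 0 (female) group and the slope equals the difference between the two group means:

```python
# Hypothetical data: (D, salary) pairs, D = 1 for male, 0 for female
data = [(0, 18.0), (0, 20.0), (0, 19.0), (1, 23.0), (1, 25.0), (1, 24.0)]

d = [row[0] for row in data]
y = [row[1] for row in data]
n = len(d)

d_bar = sum(d) / n
y_bar = sum(y) / n

# Simple-regression OLS: beta_hat = S_dy / S_dd, alpha_hat = y_bar - beta_hat * d_bar
s_dy = sum((d[i] - d_bar) * (y[i] - y_bar) for i in range(n))
s_dd = sum((d[i] - d_bar) ** 2 for i in range(n))
beta_hat = s_dy / s_dd
alpha_hat = y_bar - beta_hat * d_bar

female_mean = sum(y[i] for i in range(n) if d[i] == 0) / d.count(0)
male_mean = sum(y[i] for i in range(n) if d[i] == 1) / d.count(1)

print(alpha_hat)             # mean female salary, per (1.2)
print(alpha_hat + beta_hat)  # mean male salary, per alpha + beta
```

In practice one would also compute the standard error of β̂ and the t-ratio; the point here is only that α̂ and β̂ recover the two group means.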
Note the following features of the dummy variable regression model considered previously.
1. To distinguish the two categories, male and female, we have introduced only one dummy
variable Di . For example, if Di = 1 always denotes a male, when Di = 0 we know that it
is a female since there are only two possible outcomes. Hence, one dummy variable
suffices to distinguish two categories. The general rule is this: If a qualitative variable
has ‘m’ categories, introduce only ‘m-1’ dummy variables. In our example, sex has two
categories, and hence we introduced only a single dummy variable. If this rule is not
followed, we shall fall into what might be called the dummy variable trap, that is, the
situation of perfect multicollinearity.
2. The assignment of 1 and 0 values to two categories, such as male and female, is arbitrary
in the sense that in our example we could have assigned D=1 for female and D=0 for male.
3. The group, category, or classification that is assigned the value of 0 is often referred to as
the base, benchmark, control, comparison, reference, or omitted category. It is the base in
the sense that comparisons are made with that category.
4. The coefficient β attached to the dummy variable D can be called the differential
intercept coefficient because it tells by how much the value of the intercept term of the
category that receives the value of 1 differs from the intercept coefficient of the base
category.
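Feature 1, the dummy variable trap, can be verified numerically with made-up data: if we include an intercept plus both a male dummy and a female dummy (m dummies for m = 2 categories), then D_male + D_female = 1 = the intercept column for every observation, so X'X is singular and the OLS normal equations have no unique solution.

```python
males = [1, 0, 1, 0, 0, 1]          # made-up sample of six observations
X = [[1, m, 1 - m] for m in males]  # columns: [intercept, D_male, D_female]

# Form the 3x3 matrix X'X
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]

# Determinant of a 3x3 matrix by cofactor expansion
def det3(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

print(det3(xtx))  # 0: X'X is singular — perfect multicollinearity
```

Dropping either dummy column (the m − 1 rule) restores a nonsingular X'X.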
Suppose we want to study house ownership as a function of income, house price, etc. A person either owns a house or does not. Hence, the dependent variable, house ownership, can take only two values: 1 if the person owns a house and 0 if he or she does not. In this case, the dependent variable is of the type that elicits a yes or no response; that is, it is dichotomous in nature.
B. Multinomial Models: The choice is between more than two alternatives
Example 1.3.2: Y = α0 + α1X1 + α2X2
In this case, the dependent variable Y takes more than two values. For instance,
Y = 1, if individual i attends college
¹ The Tobit model will not be discussed in this chapter.
E(Yi | Xi) = β0 + β1Xi …………………………………….(2)
Now, letting Pi = probability that Yi = 1 (that is, that the event occurs) and 1 – Pi = probability that
Yi = 0 (that is, that the event does not occur), the variable Yi has the following distributions:
Yi      Probability
0       1 − Pi
1       Pi
Total   1
i.e., the conditional expectation of model (2) can, in fact, be interpreted as the conditional probability of Yi. Since the probability Pi must lie between 0 and 1, we have the restriction 0 ≤ E(Yi | Xi) ≤ 1, i.e., the conditional expectation, or conditional probability, must lie between 0 and 1.
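The identity E(Yi) = Pi for a 0/1 variable can be seen in a short simulation (the probability value is made up for illustration): the sample mean of Bernoulli(p) draws converges to p = P(Y = 1).

```python
import random

random.seed(42)  # fixed seed so the run is reproducible
p = 0.3          # illustrative probability that the event occurs
n = 100_000

# Draw n Bernoulli(p) outcomes: Y = 1 with probability p, else 0
draws = [1 if random.random() < p else 0 for _ in range(n)]

sample_mean = sum(draws) / n
print(sample_mean)  # close to p = 0.3, i.e. E(Y) equals P(Y = 1)
```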
Example: The LPM estimated by OLS (on home ownership) is given as follows:
Ŷi = -0.9457 + 0.1021Xi
se = (0.1228) (0.0082)
t = (-7.6984) (12.515) R2 = 0.8048
The above regression is interpreted as follows:
❖ The intercept of –0.9457 gives the “probability” that a family with zero income will own a house. Since this value is negative, and since a probability cannot be negative, we treat this value as zero. The slope value of 0.1021 means that for a unit change in income, on average the probability of owning a house increases by 0.1021, or about 10 percent. This is so regardless of the level of income, which seems patently unrealistic. In reality one would expect that Pi is non-linearly related to Xi (see next section).
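The fitted LPM above, Ŷi = −0.9457 + 0.1021Xi, makes both problems concrete (the income values fed in below are illustrative): the implied change per unit of income is always 0.1021, and the fitted "probabilities" escape the [0, 1] interval at low and high incomes.

```python
def lpm_prob(x):
    """Fitted value from the estimated LPM on home ownership."""
    return -0.9457 + 0.1021 * x

for x in [0, 5, 15, 25]:
    print(x, round(lpm_prob(x), 4))

# Low incomes give negative "probabilities"; high incomes give values above 1,
# and the slope lpm_prob(x + 1) - lpm_prob(x) is 0.1021 at every income level.
```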
From the preceding discussion it would seem that OLS can be easily extended to binary dependent
variable regression models. So, perhaps there is nothing new here. Unfortunately, this is not the
case, for the LPM poses several problems. That is, while the interpretation of the parameters is
unaffected by having a binary outcome, several assumptions of the LPM are necessarily violated.
1. Heteroscedasticity
The variance of the disturbance terms depends on the X’s and is thus not constant. Let us see this
as follows. We have the following probability distributions for U.
Yi     Ui                 Probability
0      −β0 − β1Xi         1 − Pi
1      1 − β0 − β1Xi      Pi
Now by definition Var(𝑈𝑖 ) = E[𝑈𝑖 − E(𝑈𝑖 )]2 = 𝐸(𝑈𝑖 2 ) since E(𝑈𝑖 ) = 0 and Cov(𝑈𝑖 , 𝑈𝑗 ) = 0 for
all 𝑖 ≠ 𝑗 (no serial correlation) by assumption.
Therefore, using the preceding probability distribution of Ui, we obtain
Var(Ui) = E(Ui²) = (−β0 − β1Xi)²(1 − Pi) + (1 − β0 − β1Xi)² Pi = Pi(1 − Pi)
since Pi = β0 + β1Xi. The variance of Ui thus depends on Xi and is not constant: the disturbances are heteroscedastic.
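A quick numeric check of this derivation, with made-up parameter values: writing Pi = β0 + β1Xi, the two-term expression collapses to Pi(1 − Pi), which changes as Xi changes.

```python
beta0, beta1 = 0.1, 0.05  # illustrative LPM parameters

for x in [2.0, 4.0, 10.0]:
    p = beta0 + beta1 * x  # Pi = E(Yi | Xi) under the LPM
    # The two-branch variance from the probability distribution of Ui
    var_u = (-beta0 - beta1 * x) ** 2 * (1 - p) + (1 - beta0 - beta1 * x) ** 2 * p
    # It equals Pi * (1 - Pi) exactly
    assert abs(var_u - p * (1 - p)) < 1e-12
    print(x, round(var_u, 4))  # the variance differs across X values
```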
2. Non-normality of Ui
Although OLS does not require the disturbance (U’s) to be normally distributed, we assumed them
to be so distributed for the purpose of statistical inference, that is, hypothesis testing, etc. But the
assumption of normality for Ui is no longer tenable for the LPMs because like Yi, Ui takes on only
two values.
Ui = Yi − β0 − β1Xi
Now when Yi = 1, Ui = 1 − β0 − β1Xi
and when Yi = 0, Ui = −β0 − β1Xi
Obviously 𝑈𝑖 cannot be assumed to be normally distributed. Recall that normality is not required
for the OLS estimates to be unbiased.
3. Non-fulfillment of 0 ≤ E(Yi | Xi) ≤ 1
The LPM can produce predicted values outside the admissible range of probabilities [0, 1]: it may predict values of Y that are negative or greater than 1. This is the real problem with the OLS estimation of the LPM.
4. Functional Form:
Since the model is linear, a unit increase in X results in a constant change of β1 in the probability
of an event, holding all other variables constant. The increase is the same regardless of the current
value of X. In many applications, this is unrealistic. When the outcome is a probability, it is often
substantively reasonable that the effects of independent variables will have diminishing returns as
the predicted probability approaches 0 or 1.
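The diminishing-returns point can be sketched with the logistic model introduced in the next section (the β value below is made up): there the marginal effect of X on the probability is β·P(1 − P), which is largest near P = 0.5 and shrinks toward zero as P approaches 0 or 1, unlike the constant effect of the LPM.

```python
import math

def logistic(z):
    """Logistic CDF: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

beta = 0.5  # illustrative coefficient
for z in [-4.0, 0.0, 4.0]:
    p = logistic(z)
    marginal_effect = beta * p * (1 - p)  # dP/dX in a logit model
    print(round(p, 3), round(marginal_effect, 4))
# The effect peaks where p = 0.5 and fades as p nears 0 or 1.
```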
Remark: Because of the above-mentioned problems, the LPM is not recommended for empirical work.
Therefore, what we need is a (probability) model that has these two features: (1) As Xi increases,
Pi = E(Y = 1 | X) increases but never steps outside the 0–1 interval, and (2) the relationship between
Pi and Xi is nonlinear, that is, “one which approaches zero at slower and slower rates as Xi gets
small and approaches one at slower and slower rates as Xi gets very large.’’ Geometrically, the
model we want would look something like Figure 1.1.
[S-shaped curve rising from 0 to 1 as X increases]
Fig 1.1: A Cumulative Distribution Function (CDF)
The above S-shaped curve closely resembles the cumulative distribution function (CDF) of a random variable. (Note that the CDF of a random variable X is simply the probability that it takes a value less than or equal to X0, where X0 is some specified numerical value of X. In short, F(X), the CDF of X, is F(X0) = P(X ≤ X0).) Therefore, one can easily use the CDF to model regressions where the response variable is dichotomous, taking 0-1 values.
The CDFs commonly chosen to represent the 0-1 response models are:
1. the logistic – which gives rise to the logit model;
2. the normal – which gives rise to the probit (or normit) model.
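Both CDFs can be written with the standard library alone; the normal CDF is expressed through the error function `math.erf`. Both are S-shaped, equal 0.5 at zero, and stay strictly inside (0, 1), which is exactly the property the LPM lacked.

```python
import math

def logistic_cdf(z):
    """CDF of the standard logistic distribution (logit model)."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """CDF of the standard normal distribution (probit model), via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    print(z, round(logistic_cdf(z), 4), round(normal_cdf(z), 4))
# Both curves pass through 0.5 at z = 0 and approach 0 and 1 in the tails,
# the logistic doing so slightly more slowly than the normal.
```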