
1 Dummy Variables - FINAL

Uploaded by

Rahul Singh

Regression analysis with cross-section data
Regression with Qualitative explanatory variables
Recommended Text: Gujarati-Econometrics by Example
Consider the issue of gender discrimination in the salary of employees in some
industries. In examining this issue, suppose a random sample is drawn from a pool of
employed laborers in a particular industry. The following are the variables on which
data are collected.

Wage: annual salary in dollars
Gender: M/F
Race: White/Non-white
Age: in years
Education: years of education

Strictly for lecture purpose at NMIMS SOC


The data generating process in the population
$$\text{Wage}_i = \beta_0 + \beta_1\,\text{Gender}_i + \beta_2\,\text{Race}_i + \beta_3\,\text{Education}_i + \beta_4\,\text{Age}_i + \varepsilon_i$$

The Regression Model

$$\widehat{\text{Wage}}_i = b_0 + b_1\,\text{Gender}_i + b_2\,\text{Race}_i + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$



How would you use the data on Gender and Race?
For gender, define a variable:
D1 = 0 if male (base group)
   = 1 if female
[If you prefer, you can take female as the base group instead]
Similarly for race, you may define
D2 = 0 if white (base group)
   = 1 if non-white
[If you prefer, you can take non-white as the base group instead]
$$\widehat{\text{Wage}}_i = b_0 + b_1 D_{1i} + b_2 D_{2i} + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$
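
The coding rule above can be sketched in a few lines of Python. This is only an illustration: the `records` list and the `encode_dummies` helper are hypothetical names, and the four rows are made up to show every gender/race combination, not taken from the lecture data.

```python
# Hypothetical raw records; the values are made up purely to show the coding step.
records = [
    {"gender": "M", "race": "White"},
    {"gender": "F", "race": "White"},
    {"gender": "M", "race": "Non-white"},
    {"gender": "F", "race": "Non-white"},
]

def encode_dummies(record):
    """Return (D1, D2) using male and white as the base groups."""
    d1 = 1 if record["gender"] == "F" else 0          # D1 = 1 if female
    d2 = 1 if record["race"] == "Non-white" else 0    # D2 = 1 if non-white
    return d1, d2

encoded = [encode_dummies(r) for r in records]
print(encoded)  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

The base group (white male) is the row where both dummies are 0.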



Regression Results
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.610285867
R Square             0.37244884
Adjusted R Square    0.346025633
Standard Error       13704.89414
Observations         100

ANOVA
              df    SS             MS            F              Significance F
Regression    4     10589914766    2647478691    14.09552002    4.57061E-09
Residual      95    17843291717    187824123
Total         99    28433206483

             Coefficients     Standard Error    t Stat         P-value        Lower 95%       Upper 95%
Intercept    -12725.84107     9068.128509       -1.40335914    0.163769949    -30728.35229    5276.670154
Education    2746.375373      511.4234204       5.3700618      5.57284E-07    1731.071515     3761.679232
D2           -8029.584532     4611.575277       -1.74118041    0.084889263    -17184.71898    1125.549915
D1           -11691.51836     2773.02663        -4.21615798    5.66728E-05    -17196.67226    -6186.364456
Age          380.8722742      113.5825283       3.35326462     0.001148186    155.3824459     606.3621025
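
As a quick consistency check, the summary statistics can be recomputed from the ANOVA figures. The sketch below assumes the standard definitions (R Square = SS Regression / SS Total, F = MS Regression / MS Residual, t Stat = coefficient / standard error); the variable names are illustrative.

```python
# Figures copied from the ANOVA and coefficient tables above.
ss_regression = 10589914766
ss_residual = 17843291717
ss_total = 28433206483
df_regression, df_residual = 4, 95
n = 100

r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# t Stat for D1 is its coefficient divided by its standard error.
t_d1 = -11691.51836 / 2773.02663

print(round(r_square, 4), round(adj_r_square, 4), round(f_stat, 2), round(t_d1, 2))
```

The recomputed values should agree with the reported R Square, Adjusted R Square, F and t Stat up to rounding.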



Is there any evidence of gender discrimination in salary?
Consider a person with the following data:
Education: 12 years, Gender: male (D1=0), Race: White (D2=0), Age: 51 years
What is the predicted annual wage of this person?

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51\,D_{1i} - 8029.58\,D_{2i} + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51(0) - 8029.58(0) + 2746.37(12) + 380.87(51) = 39654.97$$



Is there any evidence of gender discrimination in salary?
Consider a person with the following data:
Education: 12 years, Gender: female (D1=1), Race: White (D2=0), Age: 51 years
What is the predicted annual wage of this person?

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51\,D_{1i} - 8029.58\,D_{2i} + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51(1) - 8029.58(0) + 2746.37(12) + 380.87(51) = 27963.46$$

Holding education, age and race constant, females on average earn 11691.51 dollars less per year than males. The coefficient on D1 is statistically significant (p-value of about 0.00006).
Is there any evidence of racial discrimination in salary?
Consider a person with the following data:
Education: 12 years, Gender: male (D1=0), Race: White (D2=0), Age: 51 years
What is the predicted annual wage of this person?

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51\,D_{1i} - 8029.58\,D_{2i} + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51(0) - 8029.58(0) + 2746.37(12) + 380.87(51) = 39654.97$$



Is there any evidence of racial discrimination in salary?
Consider a person with the following data:
Education: 12 years, Gender: male (D1=0), Race: Non-white (D2=1), Age: 51 years
What is the predicted annual wage of this person?

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51\,D_{1i} - 8029.58\,D_{2i} + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51(0) - 8029.58(1) + 2746.37(12) + 380.87(51) = 31625.39$$

Holding education, age and gender constant, non-whites on average earn 8029.58 dollars less per year than whites. This coefficient is significant at the 10% level but not at the 5% level (p-value of about 0.085).
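
The predictions worked out above can be reproduced with a small function. This is a minimal sketch using the rounded coefficients from the fitted equation; `predict_wage` and the constant names are illustrative, not part of the original slides.

```python
# Rounded coefficients as used in the fitted equation on the slides.
B0, B_D1, B_D2, B_EDU, B_AGE = -12725.84, -11691.51, -8029.58, 2746.37, 380.87

def predict_wage(d1, d2, education, age):
    """Predicted annual wage; d1 = 1 if female, d2 = 1 if non-white."""
    return B0 + B_D1 * d1 + B_D2 * d2 + B_EDU * education + B_AGE * age

white_male = predict_wage(0, 0, 12, 51)
white_female = predict_wage(1, 0, 12, 51)
nonwhite_male = predict_wage(0, 1, 12, 51)
print(round(white_male, 2), round(white_female, 2), round(nonwhite_male, 2))
# 39654.97 27963.46 31625.39
```

Using the full-precision coefficients from the regression output instead would change the predictions slightly (by well under a dollar).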
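$$\widehat{\text{Wage}}_i = b_0 + b_1 D_{1i} + b_2 D_{2i} + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$
Base group is white male.
Predicted wage for a white male (D1=0, D2=0):

$$\widehat{\text{Wage}}_i = b_0 + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$

Predicted wage for a white female (D1=1, D2=0):

$$\widehat{\text{Wage}}_i = b_0 + b_1 + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$

Difference in predicted wages between a white female and a white male with the same education and age: (b0+b1)-b0 = b1
Similarly, if we compare a non-white female with a non-white male, the difference in predicted wage is again b1.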
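
The non-white comparison can be written out in the same way. A short worked version, assuming both people have the same education and age:

```latex
\begin{aligned}
\widehat{\text{Wage}}_{\text{non-white female}} &= b_0 + b_1 + b_2 + b_3\,\text{Education}_i + b_4\,\text{Age}_i \\
\widehat{\text{Wage}}_{\text{non-white male}} &= b_0 + b_2 + b_3\,\text{Education}_i + b_4\,\text{Age}_i \\
\text{Difference} &= (b_0 + b_1 + b_2) - (b_0 + b_2) = b_1
\end{aligned}
```

The race term b2 cancels, so the gender gap implied by this model is the same for both races.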



The Regression Model

$$\widehat{\text{Wage}}_i = b_0 + b_1 D_{1i} + b_2 D_{2i} + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$

In any regression model, the coefficient of a dummy variable is the differential
impact on the dependent variable of belonging to the category coded 1 rather than
the base group, holding other variables constant.



Changing the base group of the dummy
For gender, define a variable:
newD1 = 0 if female (base group)
      = 1 if male

Similarly for race, you may define
newD2 = 0 if non-white (base group)
      = 1 if white

$$\widehat{\text{Wage}}_i = b_0 + b_1\,\text{newD1}_i + b_2\,\text{newD2}_i + b_3\,\text{Education}_i + b_4\,\text{Age}_i$$



SUMMARY OUTPUT

Regression Statistics
Multiple R           0.610285867
R Square             0.37244884
Adjusted R Square    0.346025633
Standard Error       13704.89414
Observations         100

ANOVA
              df    SS             MS             F           Significance F
Regression    4     10589914766    2647478691     14.09552    4.57061E-09
Residual      95    17843291717    187824123.3
Total         99    28433206483

             Coefficients     Standard Error    t Stat         P-value     Lower 95%       Upper 95%
Intercept    -32446.94396     9696.558839       -3.34623287    0.001175    -51697.04712    -13196.84079
education    2746.375373      511.4234204       5.3700618      5.57E-07    1731.071515     3761.679232
age          380.8722742      113.5825283       3.353264625    0.001148    155.3824459     606.3621025
newd1        11691.51836      2773.02663        4.21615798     5.67E-05    6186.364456     17196.67226
newd2        8029.584532      4611.575277       1.74118041     0.084889    -1125.549915    17184.71898

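Observation: the coefficients of education and age, and the overall fit, remain unchanged. The dummy
coefficients keep their magnitudes and only their signs are reversed. The intercept does change, because it now
refers to the new base group (non-white female).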
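
One way to see where the new estimates come from is to substitute newD1 = 1 - D1 and newD2 = 1 - D2 into the first fitted equation. The sketch below checks this against the two regression outputs; the variable names are illustrative.

```python
# Full-precision estimates from the first regression (base group: white male).
b0, b1, b2 = -12725.84107, -11691.51836, -8029.584532

# With newD1 = 1 - D1 and newD2 = 1 - D2, substituting into the fitted
# equation gives a new intercept of b0 + b1 + b2 and dummy slopes -b1, -b2.
new_intercept = b0 + b1 + b2
new_b1, new_b2 = -b1, -b2

print(round(new_intercept, 5), new_b1, new_b2)
```

The new intercept matches the value reported in the second output up to rounding, and the slopes on education and age are unaffected by the recoding.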



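Base group: white male
D1 = 0 if male (base group)
   = 1 if female
D2 = 0 if white (base group)
   = 1 if non-white

$$\widehat{\text{Wage}}_i = -12725.84 - 11691.51\,D_{1i} - 8029.58\,D_{2i} + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$

Base group: non-white female
newD1 = 0 if female (base group)
      = 1 if male
newD2 = 0 if non-white (base group)
      = 1 if white

$$\widehat{\text{Wage}}_i = -32446.94 + 11691.51\,\text{newD1}_i + 8029.58\,\text{newD2}_i + 2746.37\,\text{Education}_i + 380.87\,\text{Age}_i$$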



More than two categories

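If a qualitative variable has m categories, then the number of dummies to be created is m-1
(when the model includes an intercept). The omitted category is the base group; including
all m dummies together with the intercept causes perfect collinearity (the dummy variable trap).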
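
The m-1 rule can be sketched as a small helper. Everything here is illustrative: `make_dummies` is a hypothetical function, and the three-category `region` variable is made up for the example and is not part of the wage data.

```python
def make_dummies(values, base):
    """Create m-1 dummy columns for a categorical variable, omitting `base`."""
    categories = sorted(set(values))
    if base not in categories:
        raise ValueError("base group must be one of the observed categories")
    kept = [c for c in categories if c != base]
    return {c: [1 if v == c else 0 for v in values] for c in kept}

# Hypothetical variable with m = 3 categories (not part of the wage data).
region = ["North", "South", "West", "South", "North"]
dummies = make_dummies(region, base="North")
print(dummies)  # {'South': [0, 1, 0, 1, 0], 'West': [0, 0, 1, 0, 0]}
```

With m = 3 categories, two dummies are created; observations in the base group ("North" here) have 0 in both columns.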
