Assignment 3
The data contain 3,861 observations. The insurance variable is the dependent variable, and
the seven explanatory variables are age, male, urban, education, income, occupation and
doctor.
We summarize the age and income variables separately for people with and without private
health insurance.
. by insurance: sum age income
-> insurance = 0
-> insurance = 1
private
insurance Freq. Percent Cum.
Out of 3,861 observations, about 85% have no private health insurance, and the remaining
15% or so have private health insurance.
. tab male
type of
place of
residence Freq. Percent Cum.
About 53% of people live in rural areas, and the rest live in urban areas.
. tab education
highest
educational
level Freq. Percent Cum.
The share of people with secondary education is the highest, at around 48%, followed by
people with primary education, at about 36%. The share with higher education is lower, at 13%,
and the share with no education is the lowest, at around 3%.
. tab occupation
. tab doctor
usual
doctor use Freq. Percent Cum.
People with no usual doctor make up the largest share, around 69% of all observations. The
other three categories of usual doctor are general practitioner, specialist and hospital doctor,
with over 18%, about 8% and just above 4% respectively.
Question 2: Interpret the estimation results
. logit insurance age income male urban primary secondary higher professional clerical
> sales agri_self agri_employee household services GP specialist hospital_doctors
First, we can test the null hypothesis that all the slope coefficients are simultaneously zero
using the likelihood ratio statistic. In these results, the likelihood ratio statistic is about
2,089 and its p-value is approximately zero, so we reject the null hypothesis at the 5% level.
Therefore, at least one of the variables included in the logit model helps explain the decision
to have private health insurance.
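For reference, the likelihood ratio statistic used here can be written as follows. This is the standard textbook form rather than something shown in the Stata output: $\ln L_0$ denotes the log likelihood of the constant-only model and $\ln L_1$ that of the full model (both symbols are introduced here for illustration), and the 17 degrees of freedom correspond to the 17 regressors listed in the logit command.

```latex
LR = 2\,(\ln L_1 - \ln L_0) \;\overset{a}{\sim}\; \chi^2(17)
\quad \text{under } H_0:\ \beta_1 = \beta_2 = \dots = \beta_{17} = 0
```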
The coefficients of the age and male variables are not statistically significant; their p-values
are about 30% and above 50% respectively. Consequently, we find no evidence that age affects
the decision to purchase private health insurance, or that gender affects the odds of having
private health insurance (the ratio of the probability of having it to the probability of not
having it).
The coefficient of income has the expected positive sign and is statistically significant
at the 5% level. Holding the other variables constant, the higher the income, the higher the
probability that a person has private health insurance.
The urban variable has the expected positive sign and is statistically significant at the
5% level, since its p-value is practically zero. Ceteris paribus, the log odds of having private
health insurance are higher by 2.39 for urban residents than for people who live in rural areas.
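A logit coefficient is a change in log odds, which is hard to read directly; exponentiating it gives an odds ratio. The short Python sketch below does this for the urban coefficient of 2.39 reported in the text. The function name `odds_ratio` is only an illustrative helper, not part of the original analysis.

```python
import math

def odds_ratio(coef):
    """Convert a logit coefficient (a change in log odds) into an odds ratio."""
    return math.exp(coef)

# Urban coefficient as reported in the text: 2.39
urban_or = odds_ratio(2.39)
print(f"odds ratio for urban: {urban_or:.2f}")
```

Under this reading, the odds of having private health insurance are roughly 11 times as large for urban residents as for otherwise identical rural residents.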
For the highest educational level variables, the coefficient of the primary level is not
statistically significant at any conventional significance level, because its p-value is about 88%,
well above 10%. So we find no evidence of a difference in the log odds between the primary level
and no education. However, the coefficients of the secondary and higher educational levels are
individually statistically significant at the 5% level. With a secondary dummy coefficient of about
2.45, holding all other factors constant, the log odds of having private health insurance are
higher by around 2.45 for people with secondary education than for people with no education,
which is the reference category. Similarly, the coefficient of the higher educational level is nearly
4.02, which means the log odds for people with higher education are higher by nearly 4.02 than
for people without education, again holding other variables constant.
Looking at the occupation variables, the coefficients of sales, agricultural employee,
household and domestic, and services are not statistically significant even at the 10% level; their
p-values are around 27%, 93%, 70% and 55%, in that order. Therefore, we find no evidence of
differences in the log odds between these jobs and the skilled manual job, which is the base
category. Three jobs, namely professional/technical/managerial, clerical and agricultural
self-employed, are individually statistically significant at the 10% level. The
professional/technical/managerial coefficient of about 0.47 suggests that, ceteris paribus, the log
odds for people in the professional group are higher by around 0.47 than for people with a skilled
manual occupation. Similarly, the coefficient of the clerical job is approximately 0.6, which means
the log odds for individuals with a clerical job are higher by about 0.6 than for people with a
skilled manual job, all other things being equal. Likewise, the coefficient of agricultural
self-employed is nearly -0.86, so the log odds for the agricultural self-employed are lower by
about 0.86 than for skilled manual workers, the comparison category, ceteris paribus.
The GP, specialist and hospital doctor variables are individually highly statistically significant,
since their p-values are practically zero. The GP coefficient of 2.8 suggests that, holding other
factors constant, the log odds of having private health insurance are higher by 2.8 for people
whose usual doctor is a general practitioner than for people with no doctor, which is the
reference category. Likewise, the log odds for people who usually see a specialist are higher by
5.5 than for people with no doctor, ceteris paribus. Similarly, the log odds for people who
usually see a hospital doctor are higher by 8.4 than for people with no doctor, ceteris paribus.
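Because the coefficients act on the log odds, their effect on the probability itself depends on where a person starts. The Python sketch below illustrates this for the three doctor coefficients from the text. The baseline probability of 1% for the no-doctor group is a purely hypothetical value chosen for illustration, and `shifted_probability` is an illustrative helper name.

```python
import math

def shifted_probability(p_base, coef):
    """Probability obtained after adding `coef` to the log odds of `p_base`."""
    odds = p_base / (1.0 - p_base) * math.exp(coef)
    return odds / (1.0 + odds)

P_BASE = 0.01  # hypothetical baseline for the no-doctor group, NOT taken from the data
doctor_coefs = {"GP": 2.8, "specialist": 5.5, "hospital_doctors": 8.4}  # from the text

probs = {name: shifted_probability(P_BASE, b) for name, b in doctor_coefs.items()}
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

With a different baseline the resulting probabilities would change, even though the log odds differences stay at 2.8, 5.5 and 8.4.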
Question 3: The marginal effects for the regression
. mfx
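As a reminder of what mfx reports after logit, here is a sketch in standard textbook notation (the symbols are introduced here and do not come from the original output). By default the effects are evaluated at the sample means $\bar{x}$, with $\Lambda$ the logistic distribution function. The first line applies to a continuous regressor $x_k$; for a dummy variable $d$, the reported effect is instead the discrete change from 0 to 1, shown in the second line, where $\bar{x}_{(d=1)}$ and $\bar{x}_{(d=0)}$ denote the mean vector with $d$ set to 1 or 0.

```latex
\frac{\partial \Pr(\text{insurance}=1 \mid \bar{x})}{\partial x_k}
  = \beta_k \,\Lambda(\bar{x}'\beta)\,\bigl[1 - \Lambda(\bar{x}'\beta)\bigr],
  \qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}}

\Delta_d = \Lambda\bigl(\bar{x}_{(d=1)}'\beta\bigr) - \Lambda\bigl(\bar{x}_{(d=0)}'\beta\bigr)
```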
Question 4:
. test professional clerical sales agri_self agri_employee household services
( 1) [insurance]professional = 0
( 2) [insurance]clerical = 0
( 3) [insurance]sales = 0
( 4) [insurance]agri_self = 0
( 5) [insurance]agri_employee = 0
( 6) [insurance]household = 0
( 7) [insurance]services = 0
chi2( 7) = 20.47
Prob > chi2 = 0.0046
To test whether occupation affects the purchase of private insurance, we test the seven
occupation dummies simultaneously.
The hypotheses are:
H0: βprofessional = βclerical = βsales = βagri_self = βagri_employee = βhousehold = βservices = 0
Ha: At least one β is different from 0
With a p-value of 0.0046, we reject the null hypothesis at the 5% level of significance, so
we conclude that occupation has an impact on the decision to purchase private
health insurance.
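The reported p-value can be checked by hand from the test statistic chi2(7) = 20.47. The Python sketch below uses only the standard library and a finite series for the chi-square upper tail that holds for odd degrees of freedom; the function name `chi2_sf_odd_df` is illustrative, and the routine is a cross-check under that restriction, not how Stata computes the value.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with odd df."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this sketch handles odd, positive df only")
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))               # two-sided normal tail at chi
    pdf = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal density at chi
    term, series = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term             # adds chi^(2r-1) / (1*3*...*(2r-1))
        term *= x / (2 * r + 1)    # prepares the next term of the series
    return tail + 2.0 * pdf * series

# Wald statistic and degrees of freedom as reported by the test command
p_value = chi2_sf_odd_df(20.47, 7)
print(f"p-value: {p_value:.4f}")
```

The result agrees with the reported Prob > chi2 of about 0.0046, which is well below 0.05.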
Question 5: Predict the probability of purchasing private insurance
. predict purchase
(option pr assumed; Pr(insurance))
. sum purchase
Delta-method
Margin Std. Err. z P>|z| [95% Conf. Interval]
With all variables at their means, the predicted probability of purchasing private insurance is 1.1%.
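The probability evaluated at the means is generally not the same as the mean of the individual predicted probabilities stored in purchase, because the logistic function is nonlinear. In a logit with a constant, the mean of the fitted probabilities over the estimation sample equals the sample share of insured people, about 15% here, which is much larger than 1.1%. The Python sketch below illustrates the gap with five hypothetical linear index values; these numbers are invented for illustration and are not taken from the data.

```python
import math

def invlogit(xb):
    """Logistic function: maps a linear index x'b to a probability."""
    return 1.0 / (1.0 + math.exp(-xb))

# Hypothetical linear indices x'b for five people (NOT from the data)
xb = [-8.0, -7.0, -6.0, -5.0, 3.0]

mean_of_probs = sum(invlogit(v) for v in xb) / len(xb)  # analogue of the mean of purchase
prob_at_mean = invlogit(sum(xb) / len(xb))              # analogue of evaluating at the means

print(f"mean of predicted probabilities: {mean_of_probs:.3f}")
print(f"probability at the mean index:   {prob_at_mean:.3f}")
```

In this toy example one person with a high index pulls the average probability up, while the probability at the average index stays close to zero.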