Assignment 3

- The document analyzes survey data on health insurance using descriptive statistics and logistic regression. It summarizes characteristics of respondents with and without private health insurance.
- The logistic regression finds that higher income, urban residence, secondary education or higher, and seeing a general practitioner, specialist or hospital doctor increase the likelihood of having private health insurance, while being self-employed in agriculture decreases it. Age and gender were not statistically significant predictors.
- The analysis provides insight into factors that influence the decision to purchase private health insurance in Vietnam.
Full name: Nguyen Thi Xuan Tham

Student ID: 419212625


Class: VNP26
ASSIGNMENT 3
Question 1: Descriptive statistics
. sum insurance age male urban education income occupation doctor

Variable        Obs       Mean   Std. Dev.        Min        Max
insurance     3,861   .1574722    .3642925          0          1
age           3,861   36.55581    8.765382         15         54
male          3,861   .4858845    .4998654          0          1
urban         3,861   .4695675    .4991376          0          1
education     3,861   1.703445    .7343257          0          3
income        3,861    3503.23    1241.201   248.7494   7364.188
occupation    3,861     4.7669    2.164373          1          8
doctor        3,861   1.478373    .8228354          1          4

The data contain 3,861 observations. The insurance variable is the dependent variable, and the
seven other variables (age, male, urban, education, income, occupation and doctor) are the
explanatory variables.
We summarize the age and income variables separately for people with and without private
health insurance.
. by insurance: sum age income

-> insurance = 0

Variable      Obs       Mean   Std. Dev.        Min        Max
age         3,253   36.49616    8.809819         15         54
income      3,253   3316.491    1209.929   248.7494   7364.188

-> insurance = 1

Variable      Obs       Mean   Std. Dev.        Min        Max
age           608     36.875    8.523685         17         54
income        608   4502.348    875.7924    1630.28   6796.672

For the 3,253 people who have no private health insurance, the mean age is about 36.5 years,
the minimum age is 15 and the maximum age is 54. In addition, the mean income in this group is
over 3.3 thousand USD, the minimum income is about 249 USD and the maximum income is above
7.3 thousand USD.
For the 608 people who have private health insurance, the mean age is about 36.9 years, the
minimum age is 17 and the maximum age is 54. Moreover, the mean income in this group is over
4.5 thousand USD, the minimum income is around 1.6 thousand USD and the maximum income is
nearly 6.8 thousand USD.
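As a quick consistency check on the two summaries, the group means should average back to the overall means when weighted by group size. The sketch below is written in Python rather than Stata and uses only figures printed in the output above; the variable and function names are our own.

```python
# Figures copied from the Stata output (sum, and by insurance: sum age income).
n_uninsured, n_insured = 3253, 608
mean_income = {"uninsured": 3316.491, "insured": 4502.348}
mean_age = {"uninsured": 36.49616, "insured": 36.875}


def pooled_mean(group_means):
    """Weighted average of the two group means, weights = group sizes."""
    total = n_uninsured + n_insured
    return (n_uninsured * group_means["uninsured"]
            + n_insured * group_means["insured"]) / total


pooled_income = pooled_mean(mean_income)   # should be close to 3503.23
pooled_age = pooled_mean(mean_age)         # should be close to 36.55581
insured_share = n_insured / (n_uninsured + n_insured)  # close to .1574722

print(round(pooled_income, 2), round(pooled_age, 4), round(insured_share, 4))
```

The pooled values agree with the overall means reported by the first sum command, so the split by insurance status is consistent with the full sample.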
. tab insurance

private insurance    Freq.   Percent      Cum.
0                    3,253     84.25     84.25
1                      608     15.75    100.00
Total                3,861    100.00

Out of 3,861 observations, about 84% have no private health insurance, and the remaining
16% have private health insurance.

. tab male

male=1     Freq.   Percent      Cum.
0          1,985     51.41     51.41
1          1,876     48.59    100.00
Total      3,861    100.00


Out of over 3.8 thousand observations, about 49% are male (male = 1) and around 51% are
female (male = 0).
. tab urban

type of place of residence    Freq.   Percent      Cum.
rural                         2,048     53.04     53.04
urban                         1,813     46.96    100.00
Total                         3,861    100.00

About 53% of people live in rural areas and the remaining 47% live in urban areas.

. tab education

highest educational level    Freq.   Percent      Cum.
no education                   134      3.47      3.47
primary                      1,381     35.77     39.24
secondary                    1,842     47.71     86.95
higher                         504     13.05    100.00
Total                        3,861    100.00

The proportion of people whose highest level is secondary education is the largest, at around
48%, followed by people whose highest level is primary education, at about 36%. The figure for
higher education is lower, at 13%, and the percentage of people who have no education is the
lowest, at around 3%.
. tab occupation

respondent's occupation (grouped)    Freq.   Percent      Cum.
professional/technical/managerial      355      9.19      9.19
clerical                               214      5.54     14.74
sales                                  581     15.05     29.79
agricultural - self employed           820     21.24     51.02
agricultural - employee                224      5.80     56.82
household and domestic                 912     23.62     80.45
services                                33      0.85     81.30
skilled manual                         722     18.70    100.00
Total                                3,861    100.00



The occupation of respondents is divided into 8 groups. Four of them (sales, agricultural
self-employed, household and domestic, and skilled manual) each account for between 15% and
24% of the sample. Each of the other four groups accounts for less than 10%, and services is the
smallest with under 1%.

. tab doctor

usual doctor use    Freq.   Percent      Cum.
no doctor           2,668     69.10     69.10
GP                    710     18.39     87.49
Specialist            312      8.08     95.57
Hosp doctor           171      4.43    100.00
Total               3,861    100.00

People who have no usual doctor make up the largest share, at around 69% of all
observations. The three other types of usual doctor are general practitioner, specialist and
hospital doctor, with over 18%, about 8% and above 4% respectively.
Question 2: Interpret the estimation results
. logit insurance age income male urban primary secondary higher professional clerical
> sales agri_self agri_employee household services GP specialist hospital_doctors

Iteration 0:   log likelihood = -1681.2889
Iteration 1:   log likelihood = -932.50257
Iteration 2:   log likelihood = -719.82368
Iteration 3:   log likelihood = -638.13673
Iteration 4:   log likelihood = -636.50329
Iteration 5:   log likelihood = -636.49667
Iteration 6:   log likelihood = -636.49667

Logistic regression                 Number of obs = 3,861
                                    LR chi2(17)   = 2089.58
                                    Prob > chi2   = 0.0000
Log likelihood = -636.49667         Pseudo R2     = 0.6214

insurance              Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
age                 .0090934   .0092554    0.98   0.326   -.0090469   .0272336
income              .0007057   .0000947    7.45   0.000    .0005201   .0008913
male                .1088062   .1723292    0.63   0.528   -.2289528   .4465651
urban               2.385983      .2119   11.26   0.000    1.970666   2.801299
primary            -.1659744   1.095825   -0.15   0.880   -2.313753   1.981804
secondary           2.452606     1.0795    2.27   0.023    .3368252   4.568387
higher              4.018254    1.09214    3.68   0.000    1.877699   6.158809
professional        .4662352   .2747132    1.70   0.090   -.0721928   1.004663
clerical            .5988048   .2889753    2.07   0.038    .0324236   1.165186
sales              -.2885207   .2615986   -1.10   0.270   -.8012446   .2242031
agri_self          -.8625839   .3705571   -2.33   0.020   -1.588862  -.1363054
agri_employee      -.0299311   .3334069   -0.09   0.928   -.6833967   .6235345
household          -.1050945   .2721009   -0.39   0.699   -.6384025   .4282135
services           -.6623256   1.093141   -0.61   0.545   -2.804843   1.480192
GP                  2.802244   .1829739   15.31   0.000    2.443622   3.160866
specialist            5.5107   .2791884   19.74   0.000    4.963501   6.057899
hospital_doctors    8.399037   .4476357   18.76   0.000    7.521687   9.276387
_cons              -11.24834   1.208275   -9.31   0.000   -13.61652  -8.880165

First of all, we can test the null hypothesis that all the slope coefficients are simultaneously
zero with the likelihood ratio statistic. In these results, the likelihood ratio statistic is 2,089.58
with 17 degrees of freedom and the p-value is approximately zero, thus we reject the null
hypothesis at the 5% level. Therefore, at least one of the variables included in the logit model
helps explain the decision to have private health insurance.
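The likelihood ratio statistic and the pseudo R2 in the header can be reproduced from the two log likelihoods in the iteration log. The sketch below is in Python rather than Stata and assumes, as is standard for Stata's logit, that Iteration 0 is the log likelihood of the constant-only model; the variable names are our own.

```python
# Log likelihoods copied from the iteration log above.
ll_null = -1681.2889   # Iteration 0: model with the constant only
ll_full = -636.49667   # final iteration: model with all 17 regressors

# Likelihood ratio statistic: twice the gain in log likelihood.
lr_stat = 2 * (ll_full - ll_null)

# McFadden's pseudo R2: one minus the ratio of the two log likelihoods.
pseudo_r2 = 1 - ll_full / ll_null

print(f"LR chi2(17) = {lr_stat:.2f}")
print(f"Pseudo R2   = {pseudo_r2:.4f}")
```

Both values agree with the figures reported in the regression header (2089.58 and 0.6214).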
The coefficients of the age and male variables are not statistically significant: their p-values
are about 33% and 53% respectively. Consequently, we find no evidence that age or gender
affects the log odds of having private health insurance, that is, the log of the ratio between the
probability of having private health insurance and the probability of not having it, holding the
other variables constant.
The coefficient of income has the expected positive sign and is statistically significant at the
5% level. Holding the other variables constant, the higher the income, the higher the probability
that a person has private health insurance.
The urban variable has the expected positive sign and is statistically significant at the 5%
level, as its p-value is practically zero. Ceteris paribus, the log odds of having private health
insurance are higher by 2.39 for urban residents than for people who live in rural areas.
For the highest educational level variables, the coefficient of the primary level is not
statistically significant at any conventional significance level, because its p-value is about
88% > 10%. So we find no evidence of a difference in the log odds between the primary level and
no education. However, the coefficients of the secondary and higher educational levels are
individually statistically significant at the 5% level. With the secondary dummy coefficient of
about 2.45, holding all other factors constant, the log odds for people with secondary education
are higher by around 2.45 than for people with no education, which is the reference category.
Similarly, the coefficient of the higher educational level is nearly 4.02, which means that the log
odds for people with higher education are higher by nearly 4.02 than for people without
education, again holding the other variables constant.
Looking at the occupation variables, the coefficients of sales, agricultural employee,
household and domestic, and services are not statistically significant even at the 10% level, with
p-values of around 27%, 93%, 70% and 55% in that order. Therefore, we find no evidence that the
log odds for these jobs differ from those for skilled manual jobs, which is the base category. The
coefficients of clerical and agricultural self-employed are individually statistically significant at
the 5% level, while the coefficient of professional/technical/managerial is significant only at the
10% level (p-value of about 9%). The professional/technical/managerial coefficient of about 0.47
suggests that, ceteris paribus, the log odds for people in the professional group are higher by
around 0.47 than for people in skilled manual occupations. Similarly, the coefficient of clerical
jobs is approximately 0.6, which means that the log odds for individuals in clerical jobs are higher
by about 0.6 than for people in skilled manual jobs, all other things being equal. Likewise, the
coefficient of agricultural self-employed is nearly -0.86, so the log odds for agricultural
self-employed people are lower by about 0.86 than for skilled manual workers, which is the
comparison category, ceteris paribus.
The GP, specialist and hospital doctors variables are individually statistically highly
significant, as their p-values are practically zero. The GP coefficient of 2.8 suggests that, holding
other factors constant, the log odds for people whose usual doctor is a general practitioner are
higher by 2.8 than for people with no doctor, which is the reference category. Likewise, the log
odds for people who use a specialist are higher by 5.5 than for people with no doctor, ceteris
paribus. Similarly, the log odds for people who use hospital doctors are higher by 8.4 than for
people with no doctor, ceteris paribus.
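Because a logit coefficient is a change in the log odds, exponentiating it gives the factor by which the odds themselves are multiplied. The sketch below is in Python rather than Stata and converts some of the statistically significant coefficients from the table into such odds multipliers; the dictionary and variable names are our own.

```python
import math

# Coefficients copied from the logit output (changes in the log odds).
coefs = {
    "urban": 2.385983,
    "secondary": 2.452606,
    "higher": 4.018254,
    "agri_self": -0.8625839,
    "GP": 2.802244,
    "specialist": 5.5107,
    "hospital_doctors": 8.399037,
}

# exp(coefficient) = multiplicative change in the odds P / (1 - P)
# relative to the reference category, other variables held constant.
odds_multipliers = {name: math.exp(b) for name, b in coefs.items()}

for name, factor in odds_multipliers.items():
    print(f"{name:<18} odds multiplied by {factor:.2f}")
```

For example, the urban coefficient of 2.39 corresponds to odds of having private insurance that are roughly 11 times those of rural residents, while the negative agri_self coefficient corresponds to a multiplier below one.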
Question 3: The marginal effects for the regression

. mfx

Marginal effects after logit
      y  = Pr(insurance) (predict)
         =  .0111466

variable       dy/dx  Std. Err.       z   P>|z|   [    95% C.I.   ]         X
age         .0001002     .0001     0.97   0.333  -.000103   .000303   36.5558
income      7.78e-06    .00000     5.46   0.000   5.0e-06   .000011   3503.23
male*       .0012017    .00191     0.63   0.530  -.002547    .00495   .485884
urban*       .034764    .00565     6.16   0.000   .023698    .04583   .469567
primary*   -.0017896    .01159    -0.15   0.877   -.02451   .020931   .357679
second~y*    .035571    .02269     1.57   0.117  -.008894   .080036   .477078
higher*     .2639523    .18384     1.44   0.151  -.096368   .624273   .130536
profes~l*   .0062386    .00451     1.38   0.167  -.002602   .015079   .091945
clerical*   .0086723    .00554     1.57   0.117  -.002182   .019526   .055426
sales*     -.0028907     .0024    -1.20   0.229  -.007601   .001819   .150479
agri_s~f*  -.0076759    .00281    -2.74   0.006  -.013175  -.002176    .21238
agri_e~e*  -.0003257    .00358    -0.09   0.928  -.007345   .006693   .058016
househ~d*  -.0011279    .00285    -0.40   0.692  -.006704   .004448   .236208
services*  -.0053976    .00641    -0.84   0.399  -.017952   .007156   .008547
GP*         .0931996    .01495     6.24   0.000   .063905   .122494    .18389
specia~t*   .6338931    .04371    14.50   0.000   .548223   .719563   .080808
hospit~s*   .9641389    .00962   100.19   0.000   .945278   .982999   .044289

(*) dy/dx is for discrete change of dummy variable from 0 to 1


All marginal effects are evaluated at the sample means of the explanatory variables (the X
column). The marginal effects of the age, male, primary, secondary, higher, professional,
clerical, sales, agri_employee, household and services variables are not statistically significant,
because their p-values are higher than 10%, so we do not interpret them. We interpret the
marginal effects of the income, urban, agri_self, GP, specialist and hospital doctors variables,
whose p-values are all below 1%.
The marginal effect of income is 7.78 x 10^-6, which means that when income increases from
3,503 USD to 3,504 USD, the probability of purchasing private health insurance increases by about
7.78 x 10^-4 percentage points.
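For a continuous variable in a logit model, the marginal effect at the means equals the coefficient multiplied by p(1 - p), where p is the predicted probability at the means. The sketch below is in Python rather than Stata and re-computes the income and age effects from the reported coefficients and from y = .0111466; the variable names are our own.

```python
# Values copied from the logit and mfx output above.
beta_income = 0.0007057
beta_age = 0.0090934
p_at_means = 0.0111466   # y = Pr(insurance) reported by mfx

# Logit marginal effect of a continuous regressor: beta * p * (1 - p).
scale = p_at_means * (1 - p_at_means)
me_income = beta_income * scale
me_age = beta_age * scale

# Expressed in percentage points for a 1 USD increase in income.
me_income_pp = me_income * 100

print(f"dy/dx income = {me_income:.3g}")
print(f"dy/dx age    = {me_age:.3g}")
print(f"income effect in percentage points = {me_income_pp:.3g}")
```

The results match the dy/dx column for income (7.78e-06) and age (.0001002), and the scale factor of about 0.011 shows why the marginal effects are so much smaller than the coefficients.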
The marginal effect of urban of about 0.035 suggests that the probability of purchasing
private insurance is about 3.5 percentage points higher for people who live in urban areas than
for individuals living in rural areas.
The marginal effect of agri_self is around -0.0077, which means that the probability of
purchasing private health insurance is lower by about 0.77 percentage points for agricultural
self-employed people than for skilled manual workers.
The marginal effect of GP of about 0.093 means that the probability of having private
insurance is higher by about 9.3 percentage points for people who go to a general practitioner
than for people who have no usual doctor. Similarly, the marginal effect of specialist is about
0.63, so the probability of buying private insurance is about 63 percentage points higher for
people who visit a specialist than for people with no doctor. Likewise, the marginal effect of
hospital doctors of about 0.96 suggests that the probability of having private health insurance is
higher by about 96 percentage points for people who visit hospital doctors than for people not
using doctor services.
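For a dummy variable, the reported effect is the change in the predicted probability when the dummy moves from 0 to 1 with all other variables held at their means. The sketch below is in Python rather than Stata; it recovers the linear index at the means from the reported probability .0111466 instead of re-summing all coefficients, and the helper names logistic and discrete_change are our own.

```python
import math


def logistic(z):
    """Logistic CDF: converts a linear index into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Linear index at the sample means, recovered from y = Pr(insurance).
p_at_means = 0.0111466
xb_at_means = math.log(p_at_means / (1 - p_at_means))


def discrete_change(beta, mean_x):
    """Pr(dummy = 1) - Pr(dummy = 0), other regressors at their means."""
    xb_dummy_off = xb_at_means - beta * mean_x   # remove the dummy's mean share
    return logistic(xb_dummy_off + beta) - logistic(xb_dummy_off)


# Coefficients from the logit table, means from the X column of mfx.
effect_urban = discrete_change(2.385983, 0.4695675)
effect_gp = discrete_change(2.802244, 0.1838902)
effect_hospital = discrete_change(8.399037, 0.044289)

print(round(effect_urban, 4), round(effect_gp, 4), round(effect_hospital, 4))
```

These agree with the dy/dx values for urban, GP and hospital doctors up to rounding of the inputs.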

Question 4: Joint test of the occupation dummies
. test professional clerical sales agri_self agri_employee household services

( 1) [insurance]professional = 0
( 2) [insurance]clerical = 0
( 3) [insurance]sales = 0
( 4) [insurance]agri_self = 0
( 5) [insurance]agri_employee = 0
( 6) [insurance]household = 0
( 7) [insurance]services = 0

chi2( 7) = 20.47
Prob > chi2 = 0.0046
To test whether occupation affects the purchase of private insurance, we test the seven
dummies relating to occupation simultaneously.
The hypotheses are:
H0: βprofessional = βclerical = βsales = βagri_self = βagri_employee = βhousehold = βservices = 0
Ha: At least one β is different from 0
The test statistic is chi2(7) = 20.47 with a p-value of 0.0046, so we can reject the null
hypothesis at the 5% level of significance and conclude that occupation has an impact on the
decision to purchase private health insurance.
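The p-value of the joint test is the upper-tail probability of a chi-square distribution with 7 degrees of freedom at 20.47. The sketch below is in Python rather than Stata and uses the closed form that exists for odd degrees of freedom, so it needs only the standard library; the function name chi2_sf_odd_df is our own.

```python
import math


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with odd df.

    Uses P(X > x) = erfc(sqrt(x/2))
                    + exp(-x/2) * sum_{j=1..m} (x/2)^(j - 1/2) / Gamma(j + 1/2),
    where df = 2m + 1.
    """
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form only covers odd degrees of freedom")
    half_x = x / 2.0
    tail = math.erfc(math.sqrt(half_x))
    for j in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-half_x) * half_x ** (j - 0.5) / math.gamma(j + 0.5)
    return tail


# Statistic and degrees of freedom copied from the test output above.
p_value = chi2_sf_odd_df(20.47, 7)
print(f"Prob > chi2 = {p_value:.4f}")
```

The computed probability rounds to the Prob > chi2 value reported by the test command.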
Question 5: Predict the probability of purchasing private insurance
. predict purchase
(option pr assumed; Pr(insurance))

. sum purchase

Variable      Obs       Mean   Std. Dev.        Min        Max
purchase    3,861   .1574722    .2873847   7.56e-06    .999898


The mean predicted probability of having private health insurance is about 15.7%, which equals
the share of people with private health insurance in the sample.
. margins, atmean

Adjusted predictions                 Number of obs = 3,861
Model VCE    : OIM

Expression   : Pr(insurance), predict()
at           : age          = 36.55581 (mean)
               income       =  3503.23 (mean)
               male         = .4858845 (mean)
               urban        = .4695675 (mean)
               primary      = .3576794 (mean)
               secondary    = .4770785 (mean)
               higher       = .1305361 (mean)
               professional = .0919451 (mean)
               clerical     = .0554261 (mean)
               sales        = .1504792 (mean)
               agri_self    = .2123802 (mean)
               agri_emplo~e = .0580161 (mean)
               household    = .2362082 (mean)
               services     =  .008547 (mean)
               GP           = .1838902 (mean)
               specialist   = .0808081 (mean)
               hospital_d~s =  .044289 (mean)

                      Delta-method
            Margin    Std. Err.       z    P>|z|    [95% Conf. Interval]
_cons     .0111466    .0020701     5.38    0.000    .0070892    .0152039

The predicted probability of purchasing private insurance when all variables are at their means
is about 1.1%.
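The probability at the means (about 1.1%) is much lower than the mean of the individual predicted probabilities (about 15.7%) because the logistic function is nonlinear, so evaluating it at the average characteristics is not the same as averaging the individual probabilities. The sketch below is in Python rather than Stata and rebuilds the probability at the means from the reported coefficients and means; the dictionary names are our own.

```python
import math

# Coefficients from the logit output and means from the margins output.
coefs = {
    "age": 0.0090934, "income": 0.0007057, "male": 0.1088062,
    "urban": 2.385983, "primary": -0.1659744, "secondary": 2.452606,
    "higher": 4.018254, "professional": 0.4662352, "clerical": 0.5988048,
    "sales": -0.2885207, "agri_self": -0.8625839,
    "agri_employee": -0.0299311, "household": -0.1050945,
    "services": -0.6623256, "GP": 2.802244, "specialist": 5.5107,
    "hospital_doctors": 8.399037,
}
means = {
    "age": 36.55581, "income": 3503.23, "male": 0.4858845,
    "urban": 0.4695675, "primary": 0.3576794, "secondary": 0.4770785,
    "higher": 0.1305361, "professional": 0.0919451, "clerical": 0.0554261,
    "sales": 0.1504792, "agri_self": 0.2123802,
    "agri_employee": 0.0580161, "household": 0.2362082,
    "services": 0.008547, "GP": 0.1838902, "specialist": 0.0808081,
    "hospital_doctors": 0.044289,
}
constant = -11.24834

# Linear index at the means, then the logistic transformation.
xb_at_means = constant + sum(coefs[name] * means[name] for name in coefs)
prob_at_means = 1.0 / (1.0 + math.exp(-xb_at_means))

mean_of_predictions = 0.1574722   # from "sum purchase" above

print(f"Pr(insurance) at means  = {prob_at_means:.4f}")
print(f"mean of predicted probs = {mean_of_predictions:.4f}")
```

The first figure reproduces the margin of about .0111 reported by margins, atmean, while the second is the sample average of the individual predictions.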
