1 - Credit Risk Log File
--------------------------------------------------------------------------------
name: <unnamed>
log: \Credit Risk Log File_Last Version.log
log type: text
. * Descriptive statistics: the Stata command "sum" summarizes the continuous variables, and
> the command "tab1" displays the frequency tables of all the categorical variables at once.
.
.
. sum checking savings monthscustomer monthsemployed age
maritalstat |
         us |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        156       36.71       36.71
          2 |         36        8.47       45.18
          3 |        233       54.82      100.00
------------+-----------------------------------
      Total |        425      100.00
. tab creditrisk
. * Comment: the bank ranks 50.35% of borrowers as low risk and 49.65% as high risk.
.
.
. tab2 creditrisk loanpurpose, chi2
.
.
. * Joint distribution of creditrisk and gender, with a chi-squared test of association
> (the Stata command is: tab2 creditrisk gender, chi2)
.
.
. tab2 creditrisk gender, chi2
           |        gender
creditrisk |         0          1 |     Total
-----------+----------------------+----------
         0 |        57        157 |       214
         1 |        78        133 |       211
-----------+----------------------+----------
     Total |       135        290 |       425
.
.
. * The association between creditrisk and gender is statistically significant at the 5% level
> (p-value = 2.2% < 5%).
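As a cross-check, the Pearson chi-squared statistic behind the chi2 option can be recomputed by hand from the cell counts of the creditrisk by gender table. The sketch below is written in Python rather than Stata and is illustrative only: the function names are hypothetical, and the closed-form p-value used here is valid only for one degree of freedom (a 2x2 table).

```python
import math

def pearson_chi2(table):
    """Return (chi2, degrees of freedom) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Cell counts copied from the creditrisk x gender table in the log.
gender_table = [[57, 157],
                [78, 133]]
chi2, df = pearson_chi2(gender_table)
p_value = p_value_df1(chi2)
print(f"chi2({df}) = {chi2:.3f}, p-value = {p_value:.3f}")
```

The p-value comes out at roughly 2.2%, in line with the comment in the log.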
. * Joint distribution of creditrisk and housing, with a chi-squared test of association
> (the Stata command is: tab2 creditrisk housing, chi2)
           |             housing
creditrisk |         1          2          3 |     Total
-----------+---------------------------------+----------
         0 |       161         32         21 |       214
         1 |       131         49         31 |       211
-----------+---------------------------------+----------
     Total |       292         81         52 |       425
. * Logistic regression is used when the dependent variable is binary, with the typical coding
> 0 for a negative outcome (the event did not occur) and 1 for a positive outcome (the event
> did occur). We use a logit model when we are interested in how the independent variables
> affect the probability of the event occurring (or not occurring).
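To make the link between the regressors and the probability concrete, the sketch below shows the logistic transformation that a logit model applies to the linear index xb. It is a Python illustration, not Stata output; the function names and the values of xb are hypothetical.

```python
import math

def logistic(xb):
    """Map the linear index xb to P(y = 1) under a logit model."""
    return 1.0 / (1.0 + math.exp(-xb))

def odds(p):
    """Odds of the event for a probability p."""
    return p / (1.0 - p)

# Hypothetical values of the linear index, for illustration only.
for xb in (-2.0, 0.0, 2.0):
    p = logistic(xb)
    print(f"xb = {xb:+.1f}  P(y=1) = {p:.3f}  odds = {odds(p):.3f}")
```

Because the odds equal exp(xb), each coefficient acts additively on the log-odds and multiplicatively on the odds.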
.
.
. * The Stata command to estimate a logit model is: "logit depvar indvars"
--------------------------------------------------------------------------------
    creditrisk |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
      checking |  -.0000476   .0000348    -1.37   0.171    -.0001158    .0000206
       savings |  -.0000496   .0000316    -1.57   0.117    -.0001116    .0000124
monthscustomer |   .0502559   .0105246     4.78   0.000      .029628    .0708837
monthsemployed |  -.0039044   .0037417    -1.04   0.297     -.011238    .0034291
           age |  -.0116182   .0112195    -1.04   0.300    -.0336081    .0103717
               |
   loanpurpose |
            2  |   .1805578   .5798451     0.31   0.756    -.9559176    1.317033
            3  |   .0208036   .4146947     0.05   0.960     -.791983    .8335902
            4  |   1.511012   1.296294     1.17   0.244    -1.029677    4.051702
            5  |   .7164681   .4034342     1.78   0.076    -.0742485    1.507185
            6  |  -.7741574   .7415011    -1.04   0.296    -2.227473    .6791581
            7  |   .5033596   1.546789     0.33   0.745    -2.528292    3.535011
            8  |  -.2122486    .404214    -0.53   0.600    -1.004494    .5799963
            9  |  -1.345953   .5263792    -2.56   0.011    -2.377637   -.3142685
           10  |   .0141124   1.029348     0.01   0.989    -2.003373    2.031598
               |
      1.gender |    .129157   .5282329     0.24   0.807    -.9061604    1.164474
               |
 maritalstatus |
            2  |  -.4163424   .6175879    -0.67   0.500    -1.626792    .7941076
            3  |  -.6619016   .5082489    -1.30   0.193    -1.658051     .334248
               |
       housing |
            2  |   .5931464   .2893651     2.05   0.040     .0260012    1.160292
            3  |   .5939747   .3756847     1.58   0.114    -.1423538    1.330303
               |
           job |
            2  |  -.2862753   .3604276    -0.79   0.427    -.9927004    .4201498
            3  |  -.1103014    .412744    -0.27   0.789    -.9192648    .6986621
            4  |  -.5714215   .7860727    -0.73   0.467    -2.112096    .9692526
               |
         _cons |  -.1554973   .6956042    -0.22   0.823    -1.518857    1.207862
--------------------------------------------------------------------------------
.
.
. logit creditrisk savings monthscustomer age i.loanpurpose i.gender
> i.maritalstatus i.housing
--------------------------------------------------------------------------------
    creditrisk |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
       savings |  -.0000523    .000031    -1.69   0.091     -.000113    8.43e-06
monthscustomer |   .0491996   .0102011     4.82   0.000     .0292058    .0691934
           age |  -.0135457   .0106361    -1.27   0.203    -.0343921    .0073007
               |
   loanpurpose |
            2  |   .2306476   .5752784     0.40   0.688    -.8968774    1.358172
            3  |   .0283228   .4071357     0.07   0.945    -.7696485    .8262941
            4  |   1.619859   1.280687     1.26   0.206    -.8902408    4.129959
            5  |   .7146552   .3995288     1.79   0.074    -.0684068    1.497717
            6  |  -.7673558   .7213625    -1.06   0.287      -2.1812    .6464887
            7  |   .3543893   1.514912     0.23   0.815    -2.614784    3.323563
            8  |  -.2122968   .3982476    -0.53   0.594    -.9928478    .5682542
            9  |  -1.228455   .5094138    -2.41   0.016    -2.226887   -.2300221
           10  |   .2539983   .9985979     0.25   0.799    -1.703218    2.211214
               |
      1.gender |   .1892455   .5261844     0.36   0.719     -.842057    1.220548
               |
 maritalstatus |
            2  |  -.5019915   .6130893    -0.82   0.413    -1.703625    .6996415
            3  |  -.7653826    .502026    -1.52   0.127    -1.749336    .2185704
               |
       housing |
            2  |   .5846078    .286382     2.04   0.041     .0233094    1.145906
            3  |   .5378179   .3707556     1.45   0.147    -.1888497    1.264486
               |
         _cons |  -.4400372   .5845148    -0.75   0.452    -1.585665    .7055908
--------------------------------------------------------------------------------
. * To check the predictive power of the estimated model, the Stata postestimation command is
> "estat classification".
. estat classification
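The idea behind a classification table can be sketched as follows: each observation is classified as positive when its predicted probability is at least a cutoff (0.5 is the default in estat classification), and the predictions are then compared with the observed outcomes. The Python sketch below is illustrative only; the function name is hypothetical and the outcomes and probabilities are made-up toy values, not results from this log.

```python
def classification_summary(y_true, p_hat, cutoff=0.5):
    """Compare observed 0/1 outcomes with predictions made at a probability cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, p_hat):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1  # true positive
        elif predicted == 1 and y == 0:
            fp += 1  # false positive
        elif predicted == 0 and y == 0:
            tn += 1  # true negative
        else:
            fn += 1  # false negative
    return {
        "sensitivity": tp / (tp + fn),            # share of y = 1 classified as 1
        "specificity": tn / (tn + fp),            # share of y = 0 classified as 0
        "correctly_classified": (tp + tn) / (tp + fp + tn + fn),
    }

# Toy data, hypothetical and for illustration only.
y_toy = [1, 1, 1, 0, 0, 0, 1, 0]
p_toy = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.8, 0.3]
summary = classification_summary(y_toy, p_toy)
print(summary)
```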
. * To report the odds ratio exp(b) for each independent variable, the Stata command is
> "logit depvar indvars, or". Standard errors and confidence intervals are transformed as well.
.
.
. logit creditrisk savings monthscustomer age i.loanpurpose i.gender
> i.maritalstatus i.housing, or
--------------------------------------------------------------------------------
    creditrisk | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
       savings |   .9999477    .000031    -1.69   0.091     .9998871    1.000008
monthscustomer |    1.05043   .0107155     4.82   0.000     1.029636    1.071643
           age |   .9865457    .010493    -1.27   0.203     .9661926    1.007327
               |
   loanpurpose |
            2  |   1.259415   .7245144     0.40   0.688     .4078412    3.889079
            3  |   1.028728   .4188318     0.07   0.945     .4631758    2.284836
            4  |   5.052379   6.470514     1.26   0.206     .4105569    62.17537
            5  |   2.043482   .8164298     1.79   0.074     .9338805     4.47147
            6  |    .464239   .3348846    -1.06   0.287     .1129059    1.908826
            7  |    1.42531    2.15922     0.23   0.815     .0731836    27.75908
            8  |   .8087246   .3220727    -0.53   0.594       .37052    1.765183
            9  |   .2927446   .1491281    -2.41   0.016     .1078636    .7945161
           10  |    1.28917   1.287362     0.25   0.799     .1820967    9.126792
               |
      1.gender |   1.208338   .6358084     0.36   0.719     .4308234    3.389044
               |
 maritalstatus |
            2  |    .605324   .3711177    -0.82   0.413     .1820226    2.013031
            3  |   .4651559   .2335204    -1.52   0.127     .1738895    1.244297
               |
       housing |
            2  |   1.794287   .5138515     2.04   0.041     1.023583     3.14529
            3  |   1.712266   .6348324     1.45   0.147     .8279109     3.54127
               |
         _cons |   .6440125   .3764348    -0.75   0.452     .2048115    2.025043
--------------------------------------------------------------------------------
Note: _cons estimates baseline odds.
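The odds-ratio table can be reproduced from the coefficient table: the odds ratio is exp(b), the confidence limits are the exponentiated limits of b, and the standard error follows from the delta method as exp(b) times the standard error of b. The Python sketch below checks this for the monthscustomer row; the function name is hypothetical and the inputs are copied from the log.

```python
import math

def to_odds_ratio(b, se, ci_low, ci_high):
    """Convert a logit coefficient, its standard error and CI to the odds-ratio scale."""
    odds_ratio = math.exp(b)
    return {
        "odds_ratio": odds_ratio,
        "std_err": odds_ratio * se,                    # delta-method standard error
        "ci": (math.exp(ci_low), math.exp(ci_high)),   # endpoints are exponentiated
    }

# monthscustomer row of the reduced logit model in the log.
months = to_odds_ratio(0.0491996, 0.0102011, 0.0292058, 0.0691934)
print(months)
```

The z statistic and p-value are unchanged, because the test is still carried out on the coefficient scale.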
.
.
. * Notes: 1) For positive b, exp(b) > 1: "the odds are exp(b) times larger" or "the odds
> increase by a factor of exp(b)". 2) For negative b, exp(b) < 1: the odds are multiplied by
> exp(b), i.e. "the odds are 1/exp(b) times smaller" or "the odds decrease by a factor of
> 1/exp(b)". 3) Odds ratios close to 1 indicate a small change (multiplying by 1.01 or 0.99
> does not change the odds much). 4) The odds of Y=1 (high risk) change multiplicatively by
> exp(b) for a one-unit increase in X, holding all other variables constant.
. * Comments: the table above shows that 1) the odds of a customer being ranked high risk
> (creditrisk=1) increase by a factor of 1.05 for each one-unit increase in monthscustomer
> (continuous), and 2) the odds of being ranked high risk increase by a factor of 1.8 when the
> customer rents a house compared with when he owns his house, holding the other variables
> constant.
. * For coefficients that are easier to interpret, the Stata postestimation command
> "listcoef, percent" gives the percent change in the odds for a unit increase in X and for a
> standard deviation increase in X.
. listcoef, percent
Odds of: 1 vs 0
----------------------------------------------------------------------
  creditrisk |         b        z   P>|z|        %    %StdX      SDofX
-------------+--------------------------------------------------------
     savings |  -0.00005   -1.688   0.091     -0.0    -17.1  3597.2850
monthscust~r |   0.04920    4.823   0.000      5.0     82.9    12.2676
         age |  -0.01355   -1.274   0.203     -1.3    -13.9    11.0451
2.loanpurp~e |   0.23065    0.401   0.688     25.9      5.4     0.2265
3.loanpurp~e |   0.02832    0.070   0.945      2.9      1.1     0.4005
4.loanpurp~e |   1.61986    1.265   0.206    405.2     17.0     0.0967
5.loanpurp~e |   0.71466    1.789   0.074    104.3     36.0     0.4304
6.loanpurp~e |  -0.76736   -1.064   0.287    -53.6    -11.9     0.1658
7.loanpurp~e |   0.35439    0.234   0.815     42.5      2.5     0.0685
8.loanpurp~e |  -0.21230   -0.533   0.594    -19.1     -8.8     0.4318
9.loanpurp~e |  -1.22845   -2.412   0.016    -70.7    -30.2     0.2923
10.loanpur~e |   0.25400    0.254   0.799     28.9      3.0     0.1181
    1.gender |   0.18925    0.360   0.719     20.8      9.2     0.4661
2.maritals~s |  -0.50199   -0.819   0.413    -39.5    -13.1     0.2788
3.maritals~s |  -0.76538   -1.525   0.127    -53.5    -31.7     0.4983
   2.housing |   0.58461    2.041   0.041     79.4     25.8     0.3932
   3.housing |   0.53782    1.451   0.147     71.2     19.3     0.3281
----------------------------------------------------------------------
. * Table description: 1) b = raw coefficient; 2) z = z-score for the test of b=0; 3) P>|z| =
> p-value for the z-test; 4) % = percent change in odds for a unit increase in X; 5) %StdX =
> percent change in odds for a SD increase in X; 6) SDofX = standard deviation of X.
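The % and %StdX columns follow directly from b and SDofX: the percent change in the odds is 100*(exp(b*delta) - 1), with delta = 1 for a unit increase and delta = SDofX for a standard deviation increase. The Python sketch below recomputes two rows of the table as a check; the function name is hypothetical and the inputs are copied from the listcoef output.

```python
import math

def percent_change_in_odds(b, delta=1.0):
    """Percent change in the odds when X increases by delta."""
    return 100.0 * (math.exp(b * delta) - 1.0)

# monthscust~r row: b and SDofX as printed by listcoef.
pct_unit = percent_change_in_odds(0.04920)
pct_sd = percent_change_in_odds(0.04920, delta=12.2676)

# 9.loanpurp~e row: a negative coefficient gives a percent decrease.
pct_purpose9 = percent_change_in_odds(-1.22845)

print(f"{pct_unit:.1f} {pct_sd:.1f} {pct_purpose9:.1f}")
```

The three values agree, after rounding to one decimal, with the 5.0, 82.9 and -70.7 reported in the table.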
. prvalue, rest(mean)
.
********************************************************************************
. log close
--------------------------------------------------------------------------------