1 - Credit Risk Log File

The document analyzes credit risk data using logistic regression. Descriptive statistics and frequency tables are presented for the variables. Logistic regression is used to predict the probability of high credit risk based on independent variables like checking/savings amounts, customer history, demographics and loan details. Several independent variables are found to have statistically significant correlations with credit risk including loan purpose, gender, and housing status.

Uploaded by

Abdullah alsilme


-----------------------------------------------------------------------------------
name: <unnamed>
log: \Credit Risk Log File_Last Version.log
log type: text

. Data : "2_Credit Risk data.dta"

. * Data description: Dependent variable: creditrisk (binary variable taking 1 if
the customer is high risk and 0 if the customer is low risk). Independent
continuous variables: checking, savings, monthscustomer, monthsemployed, age.
Independent categorical variables: loanpurpose, gender, maritalstatus, housing, job.
***********************************************************************************

. * Descriptive statistics: the Stata command "sum" summarizes the continuous
variables, and the Stata command "tab1" displays the frequency tables of all the
categorical variables at once.

.
.
. sum checking savings monthscustomer monthsemployed age

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
    checking |        425    1048.014    3147.183          0      19812
     savings |        425    1812.562    3597.285          0      19811
monthscust~r |        425    22.89647     12.2676          5         73
monthsempl~d |        425    31.89647    32.25932          0        119
         age |        425    34.39765    11.04513         18         73

. tab1 loanpurpose gender maritalstatus housing job

-> tabulation of loanpurpose

loanpurpose | Freq. Percent Cum.

------------+-----------------------------------
1 | 44 10.35 10.35
2 | 23 5.41 15.76
3 | 85 20.00 35.76
4 | 4 0.94 36.71
5 | 104 24.47 61.18
6 | 12 2.82 64.00
7 | 2 0.47 64.47
8 | 105 24.71 89.18
9 | 40 9.41 98.59
10 | 6 1.41 100.00
------------+-----------------------------------
Total | 425 100.00

-> tabulation of gender

gender | Freq. Percent Cum.

------------+-----------------------------------
0 | 135 31.76 31.76
1 | 290 68.24 100.00
------------+-----------------------------------
Total | 425 100.00

-> tabulation of maritalstatus

maritalstat |
us | Freq. Percent Cum.
------------+-----------------------------------
1 | 156 36.71 36.71
2 | 36 8.47 45.18
3 | 233 54.82 100.00
------------+-----------------------------------
Total | 425 100.00

-> tabulation of housing

housing | Freq. Percent Cum.

------------+-----------------------------------
1 | 292 68.71 68.71
2 | 81 19.06 87.76
3 | 52 12.24 100.00
------------+-----------------------------------
Total | 425 100.00

-> tabulation of job

job | Freq. Percent Cum.

------------+-----------------------------------
1 | 54 12.71 12.71
2 | 271 63.76 76.47
3 | 89 20.94 97.41
4 | 11 2.59 100.00
------------+-----------------------------------
Total | 425 100.00

. * frequency table of the dependent variable : creditrisk

. tab creditrisk

creditrisk | Freq. Percent Cum.

------------+-----------------------------------
0 | 214 50.35 50.35
1 | 211 49.65 100.00
------------+-----------------------------------
Total | 425 100.00

. * Comment: the bank ranks 50.35% of borrowers as low risk and 49.65% as high
risk.

. * Joint distribution of creditrisk and loanpurpose, with a Pearson chi-squared
test of association (the Stata command is: tab2 creditrisk loanpurpose, chi2)

.
.
. tab2 creditrisk loanpurpose, chi2

-> tabulation of creditrisk by loanpurpose

           |                              loanpurpose                              |
creditrisk |      1      2      3      4      5      6      7      8      9     10 |  Total
-----------+-----------------------------------------------------------------------+-------
         0 |     21      9     42      1     39      8      1     63     28      2 |    214
         1 |     23     14     43      3     65      4      1     42     12      4 |    211
-----------+-----------------------------------------------------------------------+-------
     Total |     44     23     85      4    104     12      2    105     40      6 |    425

Pearson chi2(9) = 21.2695 Pr = 0.012

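. * There is a statistically significant association between creditrisk and
loanpurpose at the 5% level (p-value = 1.2% < 5%). The test establishes that an
association exists; it does not measure how strong it is.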

.
.
. * Joint distribution of creditrisk and gender, with a Pearson chi-squared test
of association (the Stata command is: tab2 creditrisk gender, chi2)

.
.
. tab2 creditrisk gender, chi2

-> tabulation of creditrisk by gender

| gender
creditrisk | 0 1 | Total
-----------+----------------------+----------
0 | 57 157 | 214
1 | 78 133 | 211
-----------+----------------------+----------
Total | 135 290 | 425

Pearson chi2(1) = 5.2320 Pr = 0.022

.
.
. * There is a statistically significant association between creditrisk and gender
at the 5% level (p-value = 2.2% < 5%).
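The Pearson test above can also be reproduced from the cell counts alone, which may be handy when the raw data are not in memory. The lines below are a minimal sketch: the first uses Stata's immediate command tabi with the four counts from the gender table, the second assumes the dataset is loaded and adds Cramér's V (a measure of the strength of the association, which the chi-squared p-value does not give), and the third recomputes the p-value from the reported statistic.

```stata
* Reproduce the test from the cell counts (rows: creditrisk 0 and 1).
tabi 57 157 \ 78 133, chi2

* With the data in memory, also report Cramer's V (strength of association).
tabulate creditrisk gender, chi2 V

* p-value of the reported Pearson statistic, chi2(1) = 5.2320.
display chi2tail(1, 5.2320)
```

The same pattern applies to the loanpurpose and housing tables, with 9 and 2 degrees of freedom respectively.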

. * Joint distribution of creditrisk and housing, with a Pearson chi-squared test
of association (the Stata command is: tab2 creditrisk housing, chi2)

. tab2 creditrisk housing , chi2

-> tabulation of creditrisk by housing

| housing
creditrisk | 1 2 3 | Total
-----------+---------------------------------+----------
0 | 161 32 21 | 214
1 | 131 49 31 | 211
-----------+---------------------------------+----------
Total | 292 81 52 | 425

Pearson chi2(2) = 8.5524 Pr = 0.014

. * There is a statistically significant association between creditrisk and housing
at the 5% level (p-value = 1.4% < 5%).
***********************************************************************************
***********************************************************************************
**********************

. **** LOGISTIC REGRESSION ****

. * Logistic regression is used when the dependent variable is binary with the
usual coding: 0 for a negative outcome (the event did not occur) and 1 for a
positive outcome (the event did occur). We use a Logit model when we are interested
in how the independent variables affect the probability of the event occurring (or
not occurring).

. * Logit model: y* = c + bx + e, where y* is the latent (unobserved) propensity
behind the observed dependent variable y (creditrisk; y = 1 if y* > 0 and y = 0
otherwise), x is a set of independent continuous and categorical variables
(checking savings monthscustomer monthsemployed age loanpurpose gender
maritalstatus housing job), and c (the constant, which has no real significance in
a logistic regression model) and b are parameters to be estimated. The error term e
follows a standard logistic distribution, with mean 0 and variance π^2/3 (pi
squared divided by 3). Pr(y=1|x) = exp(c+bx)/(1+exp(c+bx)). So a positive
coefficient b indicates that higher levels of x are associated with an increase in
Pr(y=1|x), and a negative coefficient indicates that higher levels of x are
associated with a decrease in Pr(y=1|x).
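As a small numerical illustration of the probability formula, the following sketch evaluates it with Stata's display command. The index value 0.5 is an arbitrary assumption chosen only for illustration; invlogit() is Stata's built-in function for exp(z)/(1+exp(z)).

```stata
* Both lines evaluate Pr(y=1|x) for a linear index c + bx equal to 0.5.
display exp(0.5)/(1 + exp(0.5))
display invlogit(0.5)

* A linear index of 0 corresponds to a probability of exactly 0.5.
display invlogit(0)

* A higher index gives a higher probability, a lower index a lower one.
display invlogit(1.5)
display invlogit(-1.5)
```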

.
.
. * The Stata command to estimate a Logit model is: "logit depvar indvars"

. logit creditrisk checking savings monthscustomer monthsemployed age i.loanpurpose

i.gender i.maritalstatus i.housing i.job

Iteration 0: log likelihood = -294.57696

Iteration 1: log likelihood = -257.67227
Iteration 2: log likelihood = -257.54937
Iteration 3: log likelihood = -257.54926
Iteration 4: log likelihood = -257.54926

Logistic regression Number of obs = 425

LR chi2(22) = 74.06
Prob > chi2 = 0.0000
Log likelihood = -257.54926 Pseudo R2 = 0.1257

--------------------------------------------------------------------------------
creditrisk | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------------+----------------------------------------------------------------
checking | -.0000476 .0000348 -1.37 0.171 -.0001158 .0000206
savings | -.0000496 .0000316 -1.57 0.117 -.0001116 .0000124
monthscustomer | .0502559 .0105246 4.78 0.000 .029628 .0708837
monthsemployed | -.0039044 .0037417 -1.04 0.297 -.011238 .0034291
age | -.0116182 .0112195 -1.04 0.300 -.0336081 .0103717
|
loanpurpose |
2 | .1805578 .5798451 0.31 0.756 -.9559176 1.317033
3 | .0208036 .4146947 0.05 0.960 -.791983 .8335902
4 | 1.511012 1.296294 1.17 0.244 -1.029677 4.051702
5 | .7164681 .4034342 1.78 0.076 -.0742485 1.507185
6 | -.7741574 .7415011 -1.04 0.296 -2.227473 .6791581
7 | .5033596 1.546789 0.33 0.745 -2.528292 3.535011
8 | -.2122486 .404214 -0.53 0.600 -1.004494 .5799963
9 | -1.345953 .5263792 -2.56 0.011 -2.377637 -.3142685
10 | .0141124 1.029348 0.01 0.989 -2.003373 2.031598
|
1.gender | .129157 .5282329 0.24 0.807 -.9061604 1.164474
|
maritalstatus |
2 | -.4163424 .6175879 -0.67 0.500 -1.626792 .7941076
3 | -.6619016 .5082489 -1.30 0.193 -1.658051 .334248
|
housing |
2 | .5931464 .2893651 2.05 0.040 .0260012 1.160292
3 | .5939747 .3756847 1.58 0.114 -.1423538 1.330303
|
job |
2 | -.2862753 .3604276 -0.79 0.427 -.9927004 .4201498
3 | -.1103014 .412744 -0.27 0.789 -.9192648 .6986621
4 | -.5714215 .7860727 -0.73 0.467 -2.112096 .9692526
|
_cons | -.1554973 .6956042 -0.22 0.823 -1.518857 1.207862
--------------------------------------------------------------------------------
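The summary statistics in the header of this output can be checked by hand from the log likelihoods in the iteration log. A minimal sketch follows; the scalar names ll0 and ll1 are illustrative, and it relies on Iteration 0 being the log likelihood of the constant-only model.

```stata
* Log likelihoods copied from the iteration log above.
* ll0: iteration 0 (constant-only model); ll1: final iteration (full model).
scalar ll0 = -294.57696
scalar ll1 = -257.54926

* LR chi2(22) = 2*(ll1 - ll0); the output reports 74.06.
display 2*(ll1 - ll0)

* p-value of the LR statistic with 22 degrees of freedom.
display chi2tail(22, 2*(ll1 - ll0))

* McFadden's pseudo R2 = 1 - ll1/ll0; the output reports 0.1257.
display 1 - ll1/ll0
```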

. * Model 2: logit creditrisk savings monthscustomer age i.loanpurpose i.gender
i.maritalstatus i.housing (checking, monthsemployed and job are dropped; none of
them is statistically significant in the first model)

.
.
. logit creditrisk savings monthscustomer age i.loanpurpose i.gender
i.maritalstatus i.housing

Iteration 0: log likelihood = -294.57696

Iteration 1: log likelihood = -259.73212
Iteration 2: log likelihood = -259.65054
Iteration 3: log likelihood = -259.65049
Iteration 4: log likelihood = -259.65049

Logistic regression Number of obs = 425

LR chi2(17) = 69.85
Prob > chi2 = 0.0000
Log likelihood = -259.65049 Pseudo R2 = 0.1186

--------------------------------------------------------------------------------
creditrisk | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------------+----------------------------------------------------------------
savings | -.0000523 .000031 -1.69 0.091 -.000113 8.43e-06
monthscustomer | .0491996 .0102011 4.82 0.000 .0292058 .0691934
age | -.0135457 .0106361 -1.27 0.203 -.0343921 .0073007
|
loanpurpose |
2 | .2306476 .5752784 0.40 0.688 -.8968774 1.358172
3 | .0283228 .4071357 0.07 0.945 -.7696485 .8262941
4 | 1.619859 1.280687 1.26 0.206 -.8902408 4.129959
5 | .7146552 .3995288 1.79 0.074 -.0684068 1.497717
6 | -.7673558 .7213625 -1.06 0.287 -2.1812 .6464887
7 | .3543893 1.514912 0.23 0.815 -2.614784 3.323563
8 | -.2122968 .3982476 -0.53 0.594 -.9928478 .5682542
9 | -1.228455 .5094138 -2.41 0.016 -2.226887 -.2300221
10 | .2539983 .9985979 0.25 0.799 -1.703218 2.211214
|
1.gender | .1892455 .5261844 0.36 0.719 -.842057 1.220548
|
maritalstatus |
2 | -.5019915 .6130893 -0.82 0.413 -1.703625 .6996415
3 | -.7653826 .502026 -1.52 0.127 -1.749336 .2185704
|
housing |
2 | .5846078 .286382 2.04 0.041 .0233094 1.145906
3 | .5378179 .3707556 1.45 0.147 -.1888497 1.264486
|
_cons | -.4400372 .5845148 -0.75 0.452 -1.585665 .7055908
--------------------------------------------------------------------------------
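Whether the variables left out of Model 2 (checking, monthsemployed and job) matter jointly can be checked with a likelihood-ratio test. The sketch below is one way to do it in a do-file, not necessarily the procedure the author followed; the stored names full and reduced are illustrative.

```stata
* Fit and store both models (quietly suppresses output already shown above).
quietly logit creditrisk checking savings monthscustomer monthsemployed age ///
    i.loanpurpose i.gender i.maritalstatus i.housing i.job
estimates store full

quietly logit creditrisk savings monthscustomer age ///
    i.loanpurpose i.gender i.maritalstatus i.housing
estimates store reduced

* Likelihood-ratio test of the 5 dropped coefficients (22 - 17 = 5).
lrtest full reduced

* The same statistic and p-value by hand from the reported log likelihoods.
display 2*(-257.54926 - (-259.65049))
display chi2tail(5, 2*(-257.54926 - (-259.65049)))
```

A large p-value would indicate that dropping the three variables does not significantly worsen the fit.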

. * To check the predictive power of the estimated model, the stata post
estimation command is : "estat classification"

. estat classification

Logistic model for creditrisk

-------- True --------

Classified | D ~D | Total
-----------+--------------------------+-----------
+ | 135 65 | 200
- | 76 149 | 225
-----------+--------------------------+-----------
Total | 211 214 | 425

Classified + if predicted Pr(D) >= .5

True D defined as creditrisk != 0
--------------------------------------------------
Sensitivity Pr( +| D) 63.98%
Specificity Pr( -|~D) 69.63%
Positive predictive value Pr( D| +) 67.50%
Negative predictive value Pr(~D| -) 66.22%
--------------------------------------------------
False + rate for true ~D Pr( +|~D) 30.37%
False - rate for true D Pr( -| D) 36.02%
False + rate for classified + Pr(~D| +) 32.50%
False - rate for classified - Pr( D| -) 33.78%
--------------------------------------------------
Correctly classified 66.82%
--------------------------------------------------

. * With the estimated Logit model, the percentage of customers correctly
classified (ranked high or low risk) is around 67% (correctly classified: 66.82%).
For comparison, classifying every customer as low risk would be correct for 50.35%
of the sample, so the model has moderate predictive power.
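The rates in the classification table follow directly from its four cell counts, as the sketch below shows. The last two commands assume the logit model is still the active estimation result; the cutoff of 0.4 is an arbitrary value used only to illustrate the option.

```stata
* Sensitivity: high-risk customers classified as high risk, 135 of 211.
display 100*135/211
* Specificity: low-risk customers classified as low risk, 149 of 214.
display 100*149/214
* Overall share correctly classified, (135 + 149) of 425.
display 100*(135 + 149)/425
* Share correct if every customer were classified as low risk, 214 of 425.
display 100*214/425

* Classification table at a cutoff other than the default 0.5.
estat classification, cutoff(0.4)
* Area under the ROC curve, which summarizes performance over all cutoffs.
lroc
```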

. * To report the odds ratios (exp(b)) for each independent variable, the stata
command is "logit depvar indvars, or". Standard errors and confidence intervals
are also transformed.
.
.
. logit creditrisk savings monthscustomer age i.loanpurpose i.gender
i.maritalstatus i.housing, or

Iteration 0: log likelihood = -294.57696

Iteration 1: log likelihood = -259.73212
Iteration 2: log likelihood = -259.65054
Iteration 3: log likelihood = -259.65049
Iteration 4: log likelihood = -259.65049

Logistic regression Number of obs = 425

LR chi2(17) = 69.85
Prob > chi2 = 0.0000
Log likelihood = -259.65049 Pseudo R2 = 0.1186

--------------------------------------------------------------------------------
creditrisk | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
---------------+----------------------------------------------------------------
savings | .9999477 .000031 -1.69 0.091 .9998871 1.000008
monthscustomer | 1.05043 .0107155 4.82 0.000 1.029636 1.071643
age | .9865457 .010493 -1.27 0.203 .9661926 1.007327
|
loanpurpose |
2 | 1.259415 .7245144 0.40 0.688 .4078412 3.889079
3 | 1.028728 .4188318 0.07 0.945 .4631758 2.284836
4 | 5.052379 6.470514 1.26 0.206 .4105569 62.17537
5 | 2.043482 .8164298 1.79 0.074 .9338805 4.47147
6 | .464239 .3348846 -1.06 0.287 .1129059 1.908826
7 | 1.42531 2.15922 0.23 0.815 .0731836 27.75908
8 | .8087246 .3220727 -0.53 0.594 .37052 1.765183
9 | .2927446 .1491281 -2.41 0.016 .1078636 .7945161
10 | 1.28917 1.287362 0.25 0.799 .1820967 9.126792
|
1.gender | 1.208338 .6358084 0.36 0.719 .4308234 3.389044
|
maritalstatus |
2 | .605324 .3711177 -0.82 0.413 .1820226 2.013031
3 | .4651559 .2335204 -1.52 0.127 .1738895 1.244297
|
housing |
2 | 1.794287 .5138515 2.04 0.041 1.023583 3.14529
3 | 1.712266 .6348324 1.45 0.147 .8279109 3.54127
|
_cons | .6440125 .3764348 -0.75 0.452 .2048115 2.025043
--------------------------------------------------------------------------------
Note: _cons estimates baseline odds.

.
.
. * Notes: 1) For positive b, "the odds are exp(b) times larger" or "the odds
increase by a factor of exp(b)". 2) For negative b, exp(b) is below 1, so the odds
are multiplied by exp(b), that is, they decrease by 100*(1-exp(b)) percent.
3) Odds ratios close to 1 indicate a small change (multiplying by 1.01 or 0.99 does
not change the odds much). 4) The odds of Y=1 (high risk) change multiplicatively
by exp(b) for a one-unit increase in X, holding all other variables constant.
. * Comments: Results in the table above show that 1) the odds of being ranked high
risk (creditrisk=1) increase by a factor of 1.05 for each additional month of
monthscustomer (continuous). 2) The odds of being ranked high risk (creditrisk=1)
are about 1.8 times larger when the customer rents a house than when the customer
owns it.
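The odds ratios are simply the exponentiated Model 2 coefficients, which can be verified with display. This is a short sketch using coefficients copied from the Model 2 output; the 12-month horizon in the last line is an assumption chosen for illustration.

```stata
* Odds ratio for one extra month as a customer; the table reports 1.05043.
display exp(.0491996)
* Odds ratio for housing=2 versus housing=1; the table reports 1.794287.
display exp(.5846078)
* A negative coefficient gives an odds ratio below 1
* (loanpurpose=9; the table reports .2927446).
display exp(-1.228455)
* Effects compound multiplicatively: 12 extra months multiply the odds
* by exp(12*b), not by 12*exp(b).
display exp(12*.0491996)
```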

. * How to obtain coefficients that are easier to interpret: the Stata
postestimation command "listcoef, percent" gives the percent change in odds for a
unit increase in X and the percent change in odds for a standard deviation
increase in X.

. listcoef, percent

logit (N=425): Percentage Change in Odds

Odds of: 1 vs 0

----------------------------------------------------------------------
creditrisk | b z P>|z| % %StdX SDofX
-------------+--------------------------------------------------------
savings | -0.00005 -1.688 0.091 -0.0 -17.1 3597.2850
monthscust~r | 0.04920 4.823 0.000 5.0 82.9 12.2676
age | -0.01355 -1.274 0.203 -1.3 -13.9 11.0451
2.loanpurp~e | 0.23065 0.401 0.688 25.9 5.4 0.2265
3.loanpurp~e | 0.02832 0.070 0.945 2.9 1.1 0.4005
4.loanpurp~e | 1.61986 1.265 0.206 405.2 17.0 0.0967
5.loanpurp~e | 0.71466 1.789 0.074 104.3 36.0 0.4304
6.loanpurp~e | -0.76736 -1.064 0.287 -53.6 -11.9 0.1658
7.loanpurp~e | 0.35439 0.234 0.815 42.5 2.5 0.0685
8.loanpurp~e | -0.21230 -0.533 0.594 -19.1 -8.8 0.4318
9.loanpurp~e | -1.22845 -2.412 0.016 -70.7 -30.2 0.2923
10.loanpur~e | 0.25400 0.254 0.799 28.9 3.0 0.1181
1.gender | 0.18925 0.360 0.719 20.8 9.2 0.4661
2.maritals~s | -0.50199 -0.819 0.413 -39.5 -13.1 0.2788
3.maritals~s | -0.76538 -1.525 0.127 -53.5 -31.7 0.4983
2.housing | 0.58461 2.041 0.041 79.4 25.8 0.3932
3.housing | 0.53782 1.451 0.147 71.2 19.3 0.3281
----------------------------------------------------------------------

. * Table description: 1) b = raw coefficient 2) z = z-score for test of b=0
3) P>|z| = p-value for z-test 4) % = percent change in odds for a unit increase in
X 5) %StdX = percent change in odds for a SD increase in X 6) SDofX = standard
deviation of X.

. * Results: 1) The odds of being ranked high risk increase by 5% for each
additional month of monthscustomer, holding other variables constant. 2) The odds
of being ranked high risk are 70.7% lower when the loan purpose is buying a used
car (loanpurpose=9) than when the loan is for business (loanpurpose=1, the
reference), holding other variables constant. 3) The odds of being ranked high risk
are 79.4% higher for a customer renting a house (housing=2) than for a customer
owning a house (housing=1, the reference), holding other variables constant.
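The percentage columns reported by listcoef can be recomputed from the raw coefficients: the percent change per unit of X is 100*(exp(b)-1), and the percent change per standard deviation is 100*(exp(b*SD)-1). A brief sketch with values copied from the tables above:

```stata
* monthscustomer, per unit: listcoef reports 5.0.
display 100*(exp(.0491996) - 1)
* monthscustomer, per standard deviation (SD = 12.2676): listcoef reports 82.9.
display 100*(exp(.0491996*12.2676) - 1)
* loanpurpose=9 versus the reference category: listcoef reports -70.7.
display 100*(exp(-1.228455) - 1)
* housing=2 versus the reference category: listcoef reports 79.4.
display 100*(exp(.5846078) - 1)
```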
***********************************************************************************
***********************************************************************************
*********************************
. *** Probability Prediction***

. * To predict the probability that a customer is ranked high risk, the Stata
command is "prvalue". Example: for a customer with the specific characteristics
x(savings=5000 monthscustomer=28 age=30 loanpurpose=2 gender=1 maritalstatus=2
housing=2), the first step is to estimate the model this way: "quietly logit
creditrisk savings monthscustomer age loanpurpose gender maritalstatus housing".
Preceding any Stata command with "quietly" suppresses its output (we do not need
the results here). Note that the categorical variables are entered here without the
i. prefix, so this model treats their codes as continuous values. The second step
is to run the command "prvalue, x(savings=5000 monthscustomer=28 age=30
loanpurpose=2 gender=1 maritalstatus=2 housing=2)"

. quietly logit creditrisk savings monthscustomer age loanpurpose gender

maritalstatus housing

. * Stata does not display results...

. prvalue, x( savings=5000 monthscustomer=28 age=30 loanpurpose=2 gender=1

maritalstatus=2 housing=2)

logit: Predictions for creditrisk

Confidence intervals by delta method

95% Conf. Interval

Pr(y=1|x): 0.6512 [ 0.5224, 0.7800]
Pr(y=0|x): 0.3488 [ 0.2200, 0.4776]

       savings  monthscust~r  age  loanpurpose  gender  maritalsta~s  housing
x=        5000            28   30            2       1             2        2

. * The predicted probability that a customer with these characteristics is ranked
high risk is 0.6512, with 95% CI [0.5224, 0.7800].
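prvalue belongs to the user-written SPost package. A possible alternative, sketched below for a do-file, is the built-in margins command (Stata 11 or later), which works with factor-variable notation. This sketch refits Model 2 with the i. prefixes, so its prediction need not equal the 0.6512 obtained above from the model fitted without them. The last command repeats the calculation by hand from the Model 2 coefficients reported earlier.

```stata
* Refit Model 2 with factor-variable notation, without displaying it again.
quietly logit creditrisk savings monthscustomer age ///
    i.loanpurpose i.gender i.maritalstatus i.housing

* Predicted Pr(creditrisk=1) for the customer profile used above.
margins, at(savings=5000 monthscustomer=28 age=30 loanpurpose=2 gender=1 ///
    maritalstatus=2 housing=2)

* The same prediction by hand: _cons + b*savings + b*monthscustomer + b*age
* + 2.loanpurpose + 1.gender + 2.maritalstatus + 2.housing.
display invlogit(-.4400372 - .0000523*5000 + .0491996*28 - .0135457*30 ///
    + .2306476 + .1892455 - .5019915 + .5846078)
```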

. * To predict the probability of being ranked high risk for a customer at the mean
of the set of independent variables, the Stata command is "prvalue, rest(mean)"

. prvalue, rest(mean)

logit: Predictions for creditrisk

Confidence intervals by delta method

95% Conf. Interval

Pr(y=1|x): 0.4976 [ 0.4472, 0.5479]
Pr(y=0|x): 0.5024 [ 0.4521, 0.5528]

       savings  monthscust~r        age  loanpurpose     gender  maritalsta~s    housing
x=   1812.5624     22.896471  34.397647         5.24  .68235294     2.1811765  1.4352941
. * The predicted probability of being ranked high risk is 0.4976, with 95% CI
[0.4472, 0.5479].
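A comparable prediction at the means can be sketched with the built-in margins command after fitting Model 2 with factor-variable notation. This is an alternative under that assumption, not the author's procedure: with i. prefixes, margins holds each category indicator at its sample proportion, whereas the prvalue run above uses the mean of the category codes (for example loanpurpose = 5.24), so the two results need not coincide.

```stata
* Refit Model 2 with factor-variable notation, without displaying it again.
quietly logit creditrisk savings monthscustomer age ///
    i.loanpurpose i.gender i.maritalstatus i.housing

* Predicted probability with every covariate held at its sample mean.
margins, atmeans

* Average of the individual predicted probabilities over the sample.
* For a logit model with a constant this equals the observed share of
* high-risk customers (211 of 425).
margins
```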

.
***********************************************************************************
***********************************************************************************
**********************************

. log close

-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
----------
