Credit Risk Modeling in R

Here are the key steps for building a logistic regression model in R:
1. Split the data into training and test sets, so the model can be trained on one portion of the data and validated on held-out data.
2. Fit a null model with only the intercept term; this serves as a baseline to compare subsequent models against.
3. Fit models adding variables one by one, check their significance levels, and remove non-significant variables.
4. Compare the residual deviance and null deviance of each model; lower residual deviance indicates a better fit after accounting for variables.
5. Check the Akaike Information Criterion (AIC) values and prefer models with lower AIC.

Uploaded by Arjun Khosla

Credit Risk Modeling in R

Credit risk modelling is the best way for lenders to understand how likely a particular loan is to be repaid. In other words, it's a tool to understand the credit risk of a borrower. This is especially important because a borrower's credit risk profile keeps changing with time and circumstances.
What is Credit Risk?

Credit risk refers to the chance that a borrower will be unable to make their payments on time and will default on their debt; in other words, the risk that a lender may not receive the interest due or the principal lent on time.

This results in an interruption of cash flows for the lender and increases the cost of collection. In extreme cases, some part of the loan, or even the entire loan, may have to be written off, resulting in a loss for the lender.

It is extremely difficult and complex to pinpoint exactly how likely a person is to default on their loan. At the same
time, properly assessing credit risk can reduce the likelihood of losses from default and delayed repayment.
What is Credit Risk?

Interest payments from the borrower are the lender's reward for bearing credit risk. If the credit risk is higher, the lender or investor will either charge a higher interest rate or forego the lending opportunity altogether.

For example, a loan applicant with a superior credit history and steady income will be charged a lower interest rate for the same loan than an applicant with a poor credit history.
What is Credit Risk Modelling?

Credit risk modelling refers to the process of using data models to find out two important things.

1. The first is the probability of the borrower defaulting on the loan.

2. The second is the impact on the financials of the lender if this default occurs.

Financial institutions rely on credit risk models to determine the credit risk of potential borrowers. Based on the model's output, they make decisions on whether or not to sanction a loan as well as on the interest rate of the loan.
Which Factors Affect Credit Risk Modelling?

There are several major factors to consider while determining credit risk, ranging from the financial health of the borrower and the consequences of default for both the borrower and the creditor to a variety of macroeconomic considerations. Here are three major factors affecting the credit risk of a borrower.

(i) The Probability of Default (PD)

This refers to the likelihood that a borrower will default on their loans and is obviously the most important part of a credit risk model. For individuals, this score is based on their debt-to-income ratio and existing credit score.

The PD generally determines the interest rate and amount of down payment needed.
Which Factors Affect Credit Risk Modelling?

(ii) Loss Given Default

This refers to the total loss that the lender will suffer if the debt is not repaid, and it is a critical component in credit risk modeling. For instance, two borrowers with the same credit score and a similar debt-to-income ratio will present two very different credit risk profiles if one is borrowing a much larger amount.

That’s because the loss to the lender in case of default is much higher when the amount is larger. This again plays a big role in
determining interest rates and down payments. If the borrower is willing to offer collateral then that has a big impact on the interest
rate offered.

(iii) Exposure at Default

This is a measure of the total exposure a lender faces at any given point in time. It also affects credit risk because it is an indicator of the risk appetite of the lender. It is calculated by multiplying each loan by a certain percentage depending on the particulars of the loan.
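These three factors are often combined into a single expected-loss figure, EL = PD × LGD × EAD. A minimal sketch in R; all numbers below are purely illustrative, not from the slides:

```r
# Expected loss combines the three components: EL = PD * LGD * EAD.
# All figures are illustrative, not from the deck.
pd  <- 0.04     # probability of default (4%)
lgd <- 0.60     # loss given default (60% of the exposure is lost)
ead <- 250000   # exposure at default, in currency units

expected_loss <- pd * lgd * ead
expected_loss   # 6000
```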
Types of Credit Risk Rating Models

(i) The Models Based on Financial Statement Analysis

Examples of these models include the Altman Z-score and Moody's RiskCalc. These models are based on an analysis of the financial statements of borrowing institutions. They chiefly take into account well-known financial ratios that can be useful in determining credit risk. For instance, the Altman Z-score combines ratios such as EBIT/total assets and sales/total assets in different proportions to determine the likelihood of a company going bankrupt.
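As a rough sketch, the classic Z-score for a public manufacturing firm can be computed directly from five ratios. The coefficients below are the standard textbook ones; the input ratios are hypothetical:

```r
# Classic Altman Z-score (public manufacturing firm):
# Z = 1.2*X1 + 1.4*X2 + 3.3*X3 + 0.6*X4 + 1.0*X5
altman_z <- function(wc_ta, re_ta, ebit_ta, mve_tl, sales_ta) {
  1.2 * wc_ta +    # X1: working capital / total assets
  1.4 * re_ta +    # X2: retained earnings / total assets
  3.3 * ebit_ta +  # X3: EBIT / total assets
  0.6 * mve_tl +   # X4: market value of equity / total liabilities
  1.0 * sales_ta   # X5: sales / total assets
}

z <- altman_z(wc_ta = 0.25, re_ta = 0.30, ebit_ta = 0.15,
              mve_tl = 1.10, sales_ta = 1.40)
z   # 3.275: above the conventional "safe" cutoff of roughly 2.99
```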
Types of Credit Risk Rating Models

(ii) Machine Learning Models

The introduction of machine learning and big data to credit risk modeling has made it possible to create credit risk
models that are far more scientific and accurate.

Big data and analytics are enabling credit risk modelling to become more scientific, as it is now based more on past data than on guesswork. In fact, credit risk modeling using R, Python, and other programming languages is becoming mainstream.
Credit Risk Modelling in R

Understand the dataset …… loaddata.rds
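Loading and inspecting the data might look like this. The file name loaddata.rds comes from the slide; the toy data frame below merely stands in for the real dataset so the snippet is self-contained:

```r
# In the course, loan <- readRDS("loaddata.rds") loads the real data.
# A toy data frame stands in here so the snippet runs on its own.
loan <- data.frame(loan_status = c(0, 1, 0, 0),
                   int_rate    = c(10.6, 13.1, NA, 7.9),
                   grade       = factor(c("B", "C", "A", "A")))
saveRDS(loan, "loaddata.rds")          # in practice the file already exists
loan <- readRDS("loaddata.rds")

str(loan)      # variable names, types, and first few values
summary(loan)  # per-column summaries, including NA counts
```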
Important Step

Use contingency tables to understand the dataset. Contingency tables provide a way to display the frequencies and relative frequencies of observations, which are classified according to two categorical variables. The elements of one category are displayed across the columns; the elements of the other category are displayed over the rows.

Key arguments (of CrossTable() in the gmodels package):
prop.r - if TRUE, row proportions will be included
prop.c - if TRUE, column proportions will be included
prop.t - if TRUE, table proportions will be included
prop.chisq - if TRUE, the chi-square contribution of each cell will be included
chisq - if TRUE, the results of a chi-square test will be included
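The slides rely on gmodels::CrossTable() with the arguments above; the dependency-free base-R sketch below shows the same quantities (toy data, hypothetical column names):

```r
# Base-R equivalent of gmodels::CrossTable(x, y, prop.r=, prop.c=, prop.t=, chisq=)
loan <- data.frame(loan_status = c(0, 1, 0, 0, 1, 0),
                   grade       = factor(c("A", "C", "A", "B", "C", "B")))

tab <- table(loan$grade, loan$loan_status)
tab                 # raw frequencies
prop.table(tab, 1)  # row proportions    (prop.r)
prop.table(tab, 2)  # column proportions (prop.c)
prop.table(tab)     # table proportions  (prop.t)
chisq.test(tab)     # chi-square test    (chisq = TRUE)
```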
Removing the outlier record

The code gives the row id of the outlier. EDA is performed on int_rate and all NA values are removed.

Note – the result is saved to a different data frame, loan2 (the original dataset loan still has NA values).
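A hedged sketch of that step; the 30% cutoff and the column values are made up, since the slide only shows the general pattern:

```r
# Find the row id of the outlier in int_rate, drop it, then remove NA rows,
# saving into loan2 so the original loan keeps its NA values.
loan <- data.frame(int_rate = c(10.6, 13.1, NA, 7.9, 99.9))  # 99.9 is the outlier
index_outlier <- which(loan$int_rate > 30)  # row id of the outlier
index_outlier                               # 5

loan2 <- loan[-index_outlier, , drop = FALSE]
loan2 <- loan2[!is.na(loan2$int_rate), , drop = FALSE]       # drop NA int_rate rows

nrow(loan2)           # 3 rows survive
anyNA(loan$int_rate)  # TRUE: the original still has NA values
```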
EDA ends.

Decision Tree
cp: complexity parameter.

Any split that does not decrease the overall lack of fit by a factor of cp is not attempted. In other words, the complexity parameter is the threshold value for a decrease in overall lack of fit for any split; if cp is not met, further splits will no longer be pursued.

cp's default value is 0.01, but for complex problems it is advised to relax cp. Setting cp = -1 means the tree will be fully grown. When cp = 0, the decision tree has no restriction on what a split must add, and it will produce the most complex tree possible.

method:

One of "anova", "poisson", "class" or "exp". If method is missing, the routine tries to make an intelligent guess: if y is a survival object, method = "exp" is assumed; if y has 2 columns, method = "poisson" is assumed; if y is a factor, method = "class" is assumed; otherwise method = "anova" is assumed. It is wisest to specify the method directly, especially as more criteria may be added to the function in future.
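Putting method and cp together, a minimal rpart() call might look like this. The data is simulated; variable names such as loan_status follow the deck's loan example:

```r
library(rpart)

# Toy data standing in for the loan dataset
set.seed(1)
loan <- data.frame(
  loan_status = factor(sample(c(0, 1), 200, replace = TRUE, prob = c(0.8, 0.2))),
  int_rate    = runif(200, 5, 25),
  grade       = factor(sample(LETTERS[1:5], 200, replace = TRUE))
)

tree <- rpart(loan_status ~ int_rate + grade, data = loan,
              method  = "class",                     # classification tree
              control = rpart.control(cp = 0.001))   # relaxed cp for a fuller tree
class(tree)  # "rpart"
```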
Changing prior probabilities

Prior probability: the proportion of events and non-events in an imbalanced data set.

Changing the prior probabilities in a data set indirectly adjusts the importance of incorrectly classifying each class. By making the prior probability for "loan status = 0" bigger, we put more weight on misclassifications involving that class (false positives and false negatives).

How to change prior probabilities:

Applying prior probabilities in rpart is very easy. We use parms, an additional argument inside rpart(), which specifically deals with unbalanced class sizes. Inside the parms argument, define the percentage proportions we want to apply; for the example below, we start with proportions of 70% for loan status = 0 and 30% for loan status = 1.

Note: the prior proportions in the parms argument should always sum up to 1.
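A minimal sketch of the prior change. Note that parms is a list-valued argument of rpart(), not a separate function; the data here is simulated:

```r
library(rpart)

set.seed(1)
loan <- data.frame(
  loan_status = factor(sample(c(0, 1), 300, replace = TRUE, prob = c(0.9, 0.1))),
  int_rate    = runif(300, 5, 25)
)

# 70% prior on loan_status = 0, 30% on loan_status = 1; must sum to 1
tree_prior <- rpart(loan_status ~ int_rate, data = loan, method = "class",
                    parms   = list(prior = c(0.7, 0.3)),
                    control = rpart.control(cp = 0.001))
tree_prior
```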


Prior Probabilities ….
So, what is the best Cp value?
We want to choose the Cp value that produces the lowest amount of cross validation error.

Cross-Validation is a technique used in model selection to better estimate how our decision tree will perform.

The idea behind cross-validation is to create a number of partitions of sample observations, known as the validation sets, from the training data set, then
we measure the performance against each validation set, and then calculate the average error. It gives us a better assessment of how the model will perform
when asked to predict for new observations.

The functions printcp() and plotcp() will help us validate and identify the best Cp value for our model.

The printcp() function generates a list of cp values, and we use this list to find the value with the least cross-validated error (xerror). That cp value will generate a decision tree with the most efficient number of decision nodes for our data set.

For our example below, we print the cp list of tree_prior (the decision tree with changed prior probabilities).
The cp table is a long list; check it in R.
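The cp selection itself can be sketched like this (toy simulated data again; tree_prior follows the slide's naming):

```r
library(rpart)

# Simulated stand-in where a higher int_rate raises default probability
set.seed(1)
int_rate <- runif(500, 5, 25)
loan <- data.frame(
  int_rate    = int_rate,
  loan_status = factor(rbinom(500, 1, plogis(-4 + 0.25 * int_rate)))
)

tree_prior <- rpart(loan_status ~ ., data = loan, method = "class",
                    control = rpart.control(cp = 0.001))

printcp(tree_prior)  # cp table, including the cross-validated error (xerror)
plotcp(tree_prior)   # the same information as a plot

# Choose the cp with the lowest xerror and prune the tree to it
cp_best <- tree_prior$cptable[which.min(tree_prior$cptable[, "xerror"]), "CP"]
tree_pruned <- prune(tree_prior, cp = cp_best)
```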
Credit Modelling Using Logistic Regression in R
Null deviance: the deviance of the model against the actual values of the dataset when only the intercept is used.
🡪 The lower the value, the better the model.

Residual deviance: the deviance once the independent variables (B, C, ...) are included.
🡪 The lower the value, the better the model.

Check the significance levels: look for variables that can be removed.

AIC (Akaike Information Criterion): similar to adjusted R², AIC is a measure of fit that penalizes the model for the number of model coefficients.
🡪 Preference: the model with the minimum AIC value.

Optimize the model (remove the insignificant part): AIC should decrease, whereas the residual deviance should remain roughly the same.
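A minimal glm() sketch tying these pieces together. The data is simulated and the names (loan_status, int_rate, annual_inc) follow the deck's example, not a real dataset:

```r
# Simulate data where a higher interest rate raises default probability
set.seed(1)
n <- 400
loan <- data.frame(int_rate   = runif(n, 5, 25),
                   annual_inc = rlnorm(n, 11, 0.5))
loan$loan_status <- rbinom(n, 1, plogis(-4 + 0.25 * loan$int_rate))

log_model <- glm(loan_status ~ int_rate + annual_inc,
                 family = binomial, data = loan)
summary(log_model)  # coefficients, null/residual deviance, AIC

# Residual deviance drops below the null deviance once predictors are added
log_model$deviance < log_model$null.deviance  # TRUE
AIC(log_model)
```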
A threshold of 0.15 is taken to convert the predicted probabilities into binomial 0 or 1.

Note – compare with the confusion matrix and accuracy of the decision tree.
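Applying the 0.15 cutoff might look like this, continuing with simulated data (the deck's real model and test set are not reproduced here):

```r
set.seed(2)
n <- 400
loan <- data.frame(int_rate = runif(n, 5, 25))
loan$loan_status <- rbinom(n, 1, plogis(-4 + 0.25 * loan$int_rate))
log_model <- glm(loan_status ~ int_rate, family = binomial, data = loan)

pred_prob  <- predict(log_model, newdata = loan, type = "response")
pred_class <- ifelse(pred_prob > 0.15, 1, 0)   # threshold of 0.15

conf_mat <- table(actual = loan$loan_status, predicted = pred_class)
conf_mat
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
```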
Comparative Analysis using ROC

Understanding the AUC - ROC Curve

Understanding the Confusion Matrix

It is a performance measurement for machine learning classification problems where the output can be two or more classes. It is a table with 4 different combinations of predicted and actual values.

True Positive: You predicted positive and it’s true.

True Negative: You predicted negative and it’s true.

False Positive (Type 1 Error): You predicted positive and it's false.
False Negative (Type 2 Error): You predicted negative and it's false.
What is AUC - ROC Curve?

• In machine learning, performance measurement is an essential task, so when it comes to a classification problem, we can count on the AUC - ROC curve. When we need to check or visualize the performance of a classification model, we use the AUC (Area Under the Curve) ROC (Receiver Operating Characteristics) curve. It is one of the most important evaluation metrics for checking any classification model's performance, and is also written as AUROC (Area Under the Receiver Operating Characteristics).

• The AUC - ROC curve is a performance measurement for classification problems at various threshold settings. ROC is a probability curve and AUC represents the degree or measure of separability: it tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s.

TPR (True Positive Rate) = TP / (TP + FN); FPR (False Positive Rate) = FP / (FP + TN).

This classification threshold can be adjusted to tune the behavior of the model for a specific problem. Each threshold value produces one (FPR, TPR) point; by continuously changing the threshold we derive further points, eventually reaching the situation where every observation falls on the same side of the cutoff. Plotting all of these points traces out the ROC curve.
Key Pointers:

The Receiver Operating Characteristic (ROC) curve is an evaluation metric for binary classification problems. It is a probability curve that plots the TPR against the FPR at various threshold values and essentially separates the ‘signal’ from the ‘noise’.

The Area Under the Curve (AUC) is the measure of the ability of a classifier to distinguish between classes and is used as a
summary of the ROC curve.

The higher the AUC, the better the performance of the model at distinguishing between the positive and negative classes.
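In R the pROC package (its roc() and auc() functions) is the usual tool for this; the dependency-free sketch below computes the same curve by hand, sweeping the threshold to get one (FPR, TPR) point per cutoff (simulated scores):

```r
# Simulate scores and outcomes that are genuinely related
set.seed(3)
n <- 500
score  <- runif(n)               # stand-in for predicted probabilities
actual <- rbinom(n, 1, score)    # outcomes correlated with the score

# One (FPR, TPR) point per threshold, from strictest to loosest
thresholds <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(thresholds, function(t) mean(score[actual == 1] >= t))
fpr <- sapply(thresholds, function(t) mean(score[actual == 0] >= t))

# AUC via the trapezoidal rule over the curve (prepend the (0, 0) corner)
x <- c(0, fpr); y <- c(0, tpr)
auc <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
auc  # well above 0.5: the score separates the classes
```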
