PA v0.12
My Goal
Given historical data on issued loans, including whether or not the borrower defaulted (charge-off), I have built
a model that predicts whether or not a borrower will pay back their loan. In the future, when the company gets a
new potential customer, we can use the model to assess whether they are likely to pay back the loan.
Models Used
1. Random Forest
2. Decision Tree
3. Neural Network
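To make the comparison of the three models concrete, here is a minimal sketch that fits each one on the same train/test split. It assumes scikit-learn, which is not in the module list of this document, and it uses synthetic data as a stand-in for the loan data. The feature matrix, the label rule, and all hyperparameters are illustrative assumptions, not the settings used in this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data (assumption): 500 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
# Label 1 = repaid, 0 = charged off; driven by two features plus noise.
noise = rng.normal(scale=0.5, size=500)
y = (X[:, 0] - 0.5 * X[:, 1] + noise > -0.8).astype(int)

# Hold out 30% of the rows, keeping the class balance the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Hyperparameters below are illustrative, not tuned.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    # The neural network is sensitive to feature scale, so it gets a scaler.
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out rows
```

Accuracy alone can mislead when most loans are repaid, so a per-class report (for example `sklearn.metrics.classification_report`) is usually worth adding on the real data.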
Language/Analytics Tools Used
1. Python – Jupyter Notebook
Modules used
1. Pandas
2. NumPy
3. Matplotlib
Data Set Overview: copy from
https://github.com/vishrut18/Data-Science-and-ML-Projects/blob/master/1.%20LendingClub%20Loan_Status%20Predictive%20model%20using%20Decision%20Tress%20and%20Random%20Forests.ipynb
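Present the overview in the form of a table.
The data set has 27 columns.
In one column of the table, mention whether each field is categorical, numerical, or ordinal.
There are 2 files: the 1st holds the data and the 2nd holds the field descriptions.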
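A first draft of the type column for the overview table could be produced from the pandas dtypes, as in the sketch below. The column names and values are made-up placeholders rather than rows from the actual file, and the ordinal columns have to be listed by hand because a dtype cannot tell an ordered category from an unordered one.

```python
import pandas as pd

# Tiny stand-in frame; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "loan_amnt": [10000.0, 8000.0, 15600.0],
    "int_rate": [11.44, 11.99, 10.49],
    "grade": ["B", "B", "A"],
    "loan_status": ["Fully Paid", "Fully Paid", "Charged Off"],
})

# Ordinal fields cannot be detected from the dtype, so they are named manually.
ordinal_columns = {"grade"}


def column_type(name, series):
    """Label a column as Ordinal, Numerical, or Categorical."""
    if name in ordinal_columns:
        return "Ordinal"
    if pd.api.types.is_numeric_dtype(series):
        return "Numerical"
    return "Categorical"


# One row per data column, ready to be joined with the field descriptions.
overview = pd.DataFrame({
    "column": df.columns,
    "type": [column_type(c, df[c]) for c in df.columns],
})
```

The resulting `overview` frame can then be merged with the field-description file on the column name to build the full table.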
OVERALL GOAL: Get an understanding of which variables are important, view summary statistics, and
visualize the data
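As a rough illustration of this step, the sketch below computes summary statistics and the correlation of each numeric feature with a binary repayment flag. The frame is a made-up placeholder, and the names `loan_amnt`, `int_rate`, and `loan_repaid` are assumptions, not confirmed columns of the data file.

```python
import pandas as pd

# Made-up rows for illustration only; loan_repaid is 1 if repaid, 0 if charged off.
df = pd.DataFrame({
    "loan_amnt": [10000, 8000, 15600, 7200, 24375, 20000],
    "int_rate": [11.44, 11.99, 10.49, 6.49, 17.27, 13.33],
    "loan_repaid": [1, 1, 1, 1, 0, 0],
})

# Count, mean, std, min, quartiles, and max for every numeric column.
summary = df.describe()

# Pearson correlation of each feature with the target, most negative first.
corr_with_target = df.corr()["loan_repaid"].drop("loan_repaid").sort_values()
```

Calling `corr_with_target.plot(kind="bar")` draws the correlations with Matplotlib, which gives a quick first view of which variables move with repayment. Correlation only captures linear relationships, so it is a starting point rather than a measure of importance.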
Ratio: XX:YY
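The ratio can be read off the target column with `value_counts`, as sketched below. The series here is synthetic and the label strings `Fully Paid` and `Charged Off` are assumptions, so the numbers it produces are not the XX:YY values for the real data.

```python
import pandas as pd

# Synthetic target column (assumption): 8 repaid loans and 2 charge-offs.
loan_status = pd.Series(
    ["Fully Paid"] * 8 + ["Charged Off"] * 2, name="loan_status"
)

counts = loan_status.value_counts()                # absolute counts per class
shares = loan_status.value_counts(normalize=True)  # fractions that sum to 1

# Express the class balance as "repaid:charged off".
ratio = f"{counts['Fully Paid']}:{counts['Charged Off']}"
```

A strongly uneven ratio matters for modelling, because a classifier that always predicts the majority class already reaches an accuracy equal to the majority share.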
Let's explore the Grade and SubGrade columns that LendingClub attributes to the loans.
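One way to start this exploration is to list the distinct grades and sub-grades and then compare charge-off rates across them. The sketch below assumes the columns are named `grade`, `sub_grade`, and `loan_status`, and it runs on a handful of made-up rows rather than the real data.

```python
import pandas as pd

# Made-up rows for illustration; column names and labels are assumptions.
df = pd.DataFrame({
    "grade": ["A", "A", "B", "B", "B", "C", "C", "C"],
    "sub_grade": ["A1", "A2", "B1", "B1", "B3", "C2", "C2", "C5"],
    "loan_status": [
        "Fully Paid", "Fully Paid", "Fully Paid", "Charged Off",
        "Fully Paid", "Charged Off", "Charged Off", "Fully Paid",
    ],
})

# Distinct values, sorted so that A comes before B and A1 before A2.
grades = sorted(df["grade"].unique())
sub_grades = sorted(df["sub_grade"].unique())

# Share of charged-off loans within each grade.
df["charged_off"] = (df["loan_status"] == "Charged Off").astype(int)
rate_by_grade = df.groupby("grade")["charged_off"].mean()

# Loan counts per sub-grade, split by outcome.
counts_by_sub_grade = pd.crosstab(df["sub_grade"], df["loan_status"])
```

On the real data, `counts_by_sub_grade.plot(kind="bar")` would give a per-sub-grade bar chart of the two outcomes.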