DS Report 1
SUBMITTED BY
T. BALA SAATVIK
71762108005
71762108005 21AD46
DESCRIPTION:
The Australian Credit Dataset consists of 20 variables that describe the demographic and
socio-economic characteristics of 1000 loan applicants and one outcome variable that
indicates whether the applicants are a “good credit risk” (i.e. likely to repay the loan) or a
“bad credit risk” (i.e. unlikely to repay the loan). A predictive model, developed based on
this dataset, is expected to provide guidance for a bank manager to decide whether to
approve a loan based on the profile of a loan applicant.
PROBLEM STATEMENT:
In Phase 1 of the Data Analytics Lifecycle, a data science team learns the business
domain, assesses the resources available and formulates initial hypotheses (IHs) to test
and begin learning the data. The file AUS_CREDIT.xlsx contains two spreadsheets: one
containing a dataset of 21 variables from 1000 loan applicants and one containing
descriptions of all variables in the dataset. Use the following steps to formulate appropriate
hypotheses that can be tested with the given dataset:
1. Review and discuss the roles of the predictor variables in a credit decision.
SOLUTION:
OUTPUT:
The output is a heatmap generated with the Seaborn library to identify the
predictor variables in the aus_credit dataset.
The variables with the highest positive score against Creditability in the
heatmap are given higher preference as predictor variables.
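The selection rule above can be sketched as a ranking of predictors by their score against Creditability. This is a minimal sketch, assuming the heatmap shows pairwise Pearson correlations and that every column is numerically coded. The rows below are made-up toy values, not data from AUS_CREDIT.xlsx, and `rank_predictors` is a hypothetical helper name.

```python
import pandas as pd

# Toy rows for illustration only; NOT taken from AUS_CREDIT.xlsx.
# Column names follow the report; numeric coding is an assumption.
df = pd.DataFrame({
    "Creditability":  [1, 1, 0, 1, 0, 0, 1, 1],
    "Account_status": [4, 3, 1, 4, 2, 1, 3, 4],
    "Real_estate":    [1, 1, 0, 1, 0, 1, 0, 1],
    "Age":            [45, 38, 23, 51, 27, 30, 41, 36],
})


def rank_predictors(data, target="Creditability"):
    """Return each predictor's correlation with the target, highest first."""
    corr = data.corr()  # pairwise Pearson correlation matrix
    # Keep only the target column and drop its self-correlation of 1.0.
    return corr[target].drop(target).sort_values(ascending=False)


ranking = rank_predictors(df)
print(ranking)
```

The full matrix returned by `df.corr()` is what one would typically pass to `seaborn.heatmap` (for example with `annot=True`) to draw a heatmap of this kind.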
Account_status: It refers to the current state of a financial account, such as a bank account
or credit card account.
Real_estate: It refers to property consisting of land and the buildings, structures, or natural
resources on it. Real estate can be residential, commercial, or industrial in nature, and
may be used for a variety of purposes, such as housing, retail, office space,
manufacturing, or agriculture.
Credit history: It provides information about past credit behaviour, such as timely
payments, defaults, or bankruptcies, and can indicate the applicant's ability to pay back a
loan.
Age: It may affect the applicant's ability to repay the loan and may be correlated with other
variables such as employment status.
SET OF HYPOTHESES:
H1: Applicants with a good Account Status are more likely to be approved for a loan.
H2: Applicants with a Savings Account are more likely to be approved for a loan.
H3: Applicants who are employed full-time are more likely to be approved for a loan.
H4: Applicants who own Real Estate are more likely to be approved for a loan.
H5: Applicants with a proper Credit History are more likely to be approved for a loan.
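A first check of a hypothesis such as H4 could compare the share of good credit risks between groups. This is a minimal sketch, assuming Creditability is coded 1 for a good credit risk and 0 for a bad one, that Real_estate is coded 1 for ownership and 0 otherwise, and that a good credit risk stands in for loan approval. The rows are made-up toy values, not data from AUS_CREDIT.xlsx, and `good_rate_by_group` is a hypothetical helper name.

```python
import pandas as pd

# Toy rows for illustration only; NOT taken from AUS_CREDIT.xlsx.
df = pd.DataFrame({
    "Creditability": [1, 1, 0, 1, 0, 0, 1, 1, 0, 1],
    "Real_estate":   [1, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})


def good_rate_by_group(data, predictor, target="Creditability"):
    """Share of good credit risks (target == 1) within each predictor level."""
    return data.groupby(predictor)[target].mean()


rates = good_rate_by_group(df, "Real_estate")

# H4 is supported in direction if owners show a higher good-risk share.
supports_h4 = bool(rates.loc[1] > rates.loc[0])
print(rates)
print("H4 supported in direction:", supports_h4)
```

A difference in rates only indicates direction. On the real 1000-applicant dataset, a significance test such as a chi-square test of independence would be needed before accepting or rejecting the hypothesis.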