Data Analysis Powerpoint
Machine Learning
Factors affecting customer Creditability
Credit Score
The most widely used credit score is the FICO Score,
which was developed by the Fair Isaac Corporation.
FICO Scores range from 300 to 850, with higher scores
indicating better creditworthiness.
The FICO Score is calculated based on a combination of
credit report data from the three major credit bureaus:
Experian, Equifax, and TransUnion.
In addition to the FICO Score, there are other credit
scores available, such as the VantageScore, which was
developed jointly by the three major credit bureaus.
Like the FICO Score, the VantageScore (version 3.0 and
later) ranges from 300 to 850, with higher scores
indicating better creditworthiness.
Data Used in the analysis
Age/Gender
The figure to the left gives the
creditability by Age, grouped by
gender.
Younger adults are more likely to
be creditable, when compared
to older adults.
Across all ages, single males
are more likely to have a bad
credit rating than the other
gender groups.
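A grouped comparison like the one in the figure can be sketched as below. This is a minimal illustration only: the rows are made-up toy values, not the data used in the analysis, and the names age_band and creditable_share are hypothetical helpers.

```python
from collections import defaultdict

# Made-up toy rows for illustration only: (age, gender group, creditable 1/0).
# These are NOT the data used in the analysis.
rows = [
    (23, "male-single", 1),
    (27, "female", 1),
    (45, "male-single", 0),
    (52, "female", 1),
    (31, "male-single", 0),
    (60, "male-married", 1),
]


def age_band(age, width=20):
    """Label the age band an age falls into, e.g. 23 -> '20-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def creditable_share(rows):
    """Share of creditable customers per (age band, gender group)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [creditable count, total count]
    for age, group, creditable in rows:
        key = (age_band(age), group)
        counts[key][0] += creditable
        counts[key][1] += 1
    return {key: good / total for key, (good, total) in counts.items()}


shares = creditable_share(rows)
for key in sorted(shares):
    print(key, round(shares[key], 2))
```

The band width of 20 years is an arbitrary choice for the sketch; the figure may use different age groupings.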
Factors affecting customer Creditability
Status
Customers with a higher average number
of existing credits score higher in
Creditability than those with a lower number.
Status 1 (balance from 0 to under 200):
scores high in creditworthiness.
Logistic Regression
Logistic regression is a type of statistical model that is commonly
used for classification and predictive analytics. It is a variation of
linear regression, but instead of predicting a continuous
numerical value, it predicts the probability of an event occurring
based on a set of input variables.
Logistic regression is used when the dependent variable, also
known as the outcome or response variable, is binary, meaning it
can take one of two possible values, such as yes or no, or 0 or 1.
The independent variables, also known as predictors or features,
can be continuous, categorical, or binary.
The logistic regression model estimates the probability of
the dependent variable being in a certain category based on
the values of the independent variables. The model does this
by applying a logistic function, also known as a sigmoid
function, to a linear combination of the input variables.
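The idea of applying a sigmoid function to a linear combination of the inputs can be sketched as below. The coefficients, intercept and applicant values are hypothetical and chosen only for illustration; they are not estimates from the analysis.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(features, weights, intercept):
    """Probability that the binary outcome equals 1 for one observation."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    # Linear combination of the input variables.
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients and one hypothetical observation.
weights = [0.8, -0.5]
intercept = -0.2
applicant = [1.0, 2.0]

p = predict_probability(applicant, weights, intercept)
label = 1 if p >= 0.5 else 0  # 0.5 is a common default cut-off, not a fixed rule
print(round(p, 4), label)
```

The 0.5 threshold turns the probability into a yes/no class; in credit decisions a different cut-off may be preferred, depending on the cost of each type of error.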
Logistic Regression
• When the logistic regression is run
for Creditability, the coefficient of all
determinants is significant at -3.38.
• Other important variables are the Sex
categories Female-Single and Male-Single.
• Amount, Guarantor and car loan
are the most significant factors in
the glm regression output.
• The AIC of the model is 3171, which
is read here as a poor logistic
regression model. AIC is a relative
measure, so it is best judged against
other models fitted to the same data.
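How an AIC value arises and why it is compared across models can be sketched as below, using the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L the maximised likelihood. The outcomes, predicted probabilities and parameter counts are made up for illustration and are unrelated to the value of 3171 reported above.

```python
import math


def log_likelihood(y_true, p_pred):
    """Bernoulli log-likelihood of binary outcomes given predicted probabilities."""
    eps = 1e-12  # keeps log() away from exactly 0 or 1
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_lik


# Made-up outcomes and predictions from two hypothetical models.
y = [1, 0, 1, 1, 0]
p_model_a = [0.9, 0.2, 0.8, 0.7, 0.1]  # hypothetical model with 3 parameters
p_model_b = [0.6, 0.5, 0.6, 0.6, 0.5]  # hypothetical model with 2 parameters

aic_a = aic(log_likelihood(y, p_model_a), 3)
aic_b = aic(log_likelihood(y, p_model_b), 2)
print(round(aic_a, 3), round(aic_b, 3))
```

In this toy case the first model has the lower AIC despite its extra parameter, because its predictions fit the outcomes much better. The comparison is only valid between models fitted to the same observations.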
Decision Trees
A decision tree is a type of supervised learning
algorithm used in machine learning that can be used
for both classification and regression tasks.