Microsoft Azure Machine Learning Project To Predict Likelihood of Good Credit of Customer

The document summarizes a machine learning project that analyzed customer data to predict credit status. It used Python, R and SQL to clean the data by removing unnecessary columns. A two-class decision forest model was trained on the cleaned data with bagging, 50 decision trees of maximum depth 32, 32 random splits per node, and 4 minimum samples per leaf node. The model achieved an accuracy of 77.9% in predicting customer credit.

Microsoft Azure Machine Learning Project

By: Shubham Dwivedi

Report:

Model Output
The model predicted, with an accuracy of 77.9%, whether a customer is likely to have good or bad credit.
Python, R and SQL were used for data modification and feature engineering.
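
For reference, accuracy is the share of cases whose predicted label matches the actual label. The following is a minimal sketch using made-up labels rather than the project's data; the function name `accuracy` and both lists are illustrative assumptions, not part of the original experiment.

```python
def accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Made-up CreditStatus labels (0 or 1), not taken from the report's data set.
actual_labels = [1, 0, 1, 1, 0, 1, 0, 1]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 1]
toy_accuracy = accuracy(actual_labels, predicted_labels)
```

On the real data the same ratio would be computed over the scored cases, which is what the 77.9% figure reports.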

Inputs:
The data set contains data on 950 customer cases. Its columns comprise 20 features
(data columns which can be used to train a machine learning model) and the label
(the column indicating the actual credit status of the customers).

Selecting the second column, labeled Duration, displays some properties of that feature
(data column) on the right side of the display. These properties include summary statistics
and the data type, as shown here:
Label: CreditStatus (0,1)
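
Outside the Azure ML visual interface, the same kind of column inspection could be approximated with pandas. This is only a sketch: the rows are made up, and only the column names Duration and CreditStatus come from the report.

```python
import pandas as pd

# Made-up rows; only the column names Duration and CreditStatus come from the report.
credit = pd.DataFrame({
    "Duration": [6, 12, 24, 36, 48],
    "CreditStatus": [1, 1, 0, 1, 0],
})

# Data type of the feature column.
duration_dtype = credit["Duration"].dtype

# Summary statistics: count, mean, std, min, quartiles, max.
duration_stats = credit["Duration"].describe()

# Distinct values of the label column.
label_values = sorted(credit["CreditStatus"].unique().tolist())
```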

Data Transformation:

As part of data transformation, the following columns are removed: Housing,
SexAndStatus, OtherDetorsGuarantors, OtherInstalments and ExistingCreditsAtBank.

I used a Python script and an R script to drop four of these columns. The SQL query
omits Housing from its select list.

Python Code:

def azureml_main(creditframe):
    drop_cols = ['SexAndStatus',
                 'OtherDetorsGuarantors']
    creditframe.drop(drop_cols, axis = 1, inplace = True)
    return creditframe

R Code:
credit.frame <- maml.mapInputPort(1)
drop.cols <- c('OtherInstalments',
'ExistingCreditsAtBank')
out.frame <- credit.frame[, !(names(credit.frame) %in% drop.cols)]
maml.mapOutputPort("out.frame")

SQL:
select
CheckingAcctStat,
Duration,
CreditHistory,
Purpose,
Savings,
Employment,
InstallmentRatePecnt,
PresentResidenceTime,
Property,
Age,
Telephone,
CreditStatus
from t1;
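
Taken together, the Python, R and SQL steps reduce the data to the twelve columns in the select list. As a rough single-script equivalent, assuming pandas and a toy one-row frame built only from column names that appear in this report (the helper `select_columns` and the constants are hypothetical names), the result could be sketched as:

```python
import pandas as pd

# Columns the SQL query keeps, as listed in the report.
KEEP_COLS = [
    "CheckingAcctStat", "Duration", "CreditHistory", "Purpose", "Savings",
    "Employment", "InstallmentRatePecnt", "PresentResidenceTime", "Property",
    "Age", "Telephone", "CreditStatus",
]
# Columns the report names for removal.
DROP_COLS = [
    "Housing", "SexAndStatus", "OtherDetorsGuarantors",
    "OtherInstalments", "ExistingCreditsAtBank",
]


def select_columns(frame):
    """Return a copy without DROP_COLS, restricted to KEEP_COLS in that order."""
    trimmed = frame.drop(columns=DROP_COLS, errors="ignore")
    return trimmed[KEEP_COLS].copy()


# Toy frame with placeholder values; the real data set has further columns
# that this report does not name.
toy = pd.DataFrame({name: [0] for name in KEEP_COLS + DROP_COLS})
result = select_columns(toy)
```

Selecting by an explicit keep list mirrors the SQL step: any column not named there is left out, whether or not it was dropped earlier.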

Creating and Evaluating a Machine Learning Model


Next, a classification algorithm is used to train the model: the Two-Class Decision
Forest module, with the following settings.

Resampling method: Bagging


Create trainer mode: Single Parameter
Number of Decision trees: 50
Maximum depth of the decision tree: 32
Number of random splits per node: 32
Minimum number of samples per leaf node: 4
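
For readers without access to Azure ML, a loosely comparable configuration can be sketched with scikit-learn's RandomForestClassifier. This is a different implementation from the Two-Class Decision Forest module, so results would not match. The data is synthetic, the 70/30 split and the random seeds are assumptions, and "Number of random splits per node" has no direct counterpart here, so it is left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 950 cases and 11 features, not the credit data set.
X, y = make_classification(n_samples=950, n_features=11, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0  # split ratio is an assumption
)

forest = RandomForestClassifier(
    n_estimators=50,     # Number of decision trees: 50
    max_depth=32,        # Maximum depth of the decision trees: 32
    min_samples_leaf=4,  # Minimum number of samples per leaf node: 4
    bootstrap=True,      # Resampling method: Bagging
    random_state=0,
)
forest.fit(X_train, y_train)

# Accuracy on the synthetic hold-out set only; unrelated to the reported 77.9%.
synthetic_accuracy = forest.score(X_test, y_test)
```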
Once the model is trained, it is scored and evaluated. The experiment looks
like this:
Evaluation Results:
