Assignment 1 Data Mining

This document shows code for building a decision tree classifier model to predict loan defaults. It loads and preprocesses a CSV dataset containing home ownership, marital status, income, and default labels. It splits the data into features and a target, fits a decision tree classifier, exports a visualization of the tree, and makes predictions on new data points.

Uploaded by

Aldi Renadi


In [1]:
# Import modules
import pandas as pd
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

In [2]:
# Import and read the dataset
dataset = pd.read_csv('https://raw.githubusercontent.com/asrulabdullah99/data_mining/master/dataset_decision/Dataset_Tugas.csv')
dataset

Out[2]:
   Home_Owner  Marital_Status  Annual_Income  Defaulted_Borrower
0         Yes          Single         125000                  No
1          No         Married         100000                  No
2          No          Single          70000                  No
3         Yes         Married         120000                  No
4          No        Divorced          95000                 Yes
5          No         Married          60000                  No
6         Yes        Divorced         220000                  No
7          No          Single          85000                 Yes
8          No         Married          75000                  No
9          No          Single          90000                 Yes
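If the remote CSV is unreachable, the same ten rows can be rebuilt by hand from the table printed in Out[2]. The sketch below assumes that table is the complete dataset; the sanity checks at the end are additions and not part of the original notebook.

```python
import pandas as pd

# The ten rows shown in Out[2], typed in by hand so the rest of the
# notebook can run without network access.
dataset = pd.DataFrame({
    "Home_Owner": ["Yes", "No", "No", "Yes", "No",
                   "No", "Yes", "No", "No", "No"],
    "Marital_Status": ["Single", "Married", "Single", "Married", "Divorced",
                       "Married", "Divorced", "Single", "Married", "Single"],
    "Annual_Income": [125000, 100000, 70000, 120000, 95000,
                      60000, 220000, 85000, 75000, 90000],
    "Defaulted_Borrower": ["No", "No", "No", "No", "Yes",
                           "No", "No", "Yes", "No", "Yes"],
})

# Quick sanity checks on shape and class balance.
print(dataset.shape)
print(dataset["Defaulted_Borrower"].value_counts())
```

With only three defaulters among ten borrowers, any model fitted on this data should be read as a teaching example and not as a reliable predictor.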

In [3]:
# Convert all data to numerical values
d = {'Yes': 1, 'No': 0}
dataset['Home_Owner'] = dataset['Home_Owner'].map(d)
d = {'Single': 0, 'Married': 1, 'Divorced': 2}
dataset['Marital_Status'] = dataset['Marital_Status'].map(d)
d = {'Yes': 1, 'No': 0}
dataset['Defaulted_Borrower'] = dataset['Defaulted_Borrower'].map(d)
dataset

Out[3]:
   Home_Owner  Marital_Status  Annual_Income  Defaulted_Borrower
0           1               0         125000                   0
1           0               1         100000                   0
2           0               0          70000                   0
3           1               1         120000                   0
4           0               2          95000                   1
5           0               1          60000                   0
6           1               2         220000                   0
7           0               0          85000                   1
8           0               1          75000                   0
9           0               0          90000                   1
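Mapping Marital_Status to 0, 1, 2 gives the categories a numeric order that they do not really have. A common alternative, shown here only as a sketch and not as what the notebook does, is one-hot encoding with `pd.get_dummies`. The small `raw` frame is a hypothetical stand-in for the real column.

```python
import pandas as pd

# Hypothetical stand-in for the Marital_Status column of the dataset.
raw = pd.DataFrame({"Marital_Status": ["Single", "Married", "Divorced", "Single"]})

# One indicator column per category instead of a single 0/1/2 code.
encoded = pd.get_dummies(raw, columns=["Marital_Status"], dtype=int)
print(encoded)
```

Each row then has exactly one indicator set to 1, so the tree can split on a single category without assuming that Divorced is "greater than" Married.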

In [4]:
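# Separate columns
# x holds the feature columns, y is the target column
# Note: 'Defaulted_Borrower' is also the target below, so listing it here leaks the label into the features
features = ['Home_Owner', 'Marital_Status', 'Annual_Income', 'Defaulted_Borrower']
x = dataset[features]
y = dataset['Defaulted_Borrower']
print(x)
print(y)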

   Home_Owner  Marital_Status  Annual_Income  Defaulted_Borrower
0           1               0         125000                   0
1           0               1         100000                   0
2           0               0          70000                   0
3           1               1         120000                   0
4           0               2          95000                   1
5           0               1          60000                   0
6           1               2         220000                   0
7           0               0          85000                   1
8           0               1          75000                   0
9           0               0          90000                   1
0    0
1    0
2    0
3    0
4    1
5    0
6    0
7    1
8    0
9    1
Name: Defaulted_Borrower, dtype: int64
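Because the feature list above contains the target column, the tree can learn the label directly from its own copy. A leak-free variant, sketched here under the assumption that only the three descriptive columns should be used as inputs, looks like this. The `random_state` argument and the accuracy check are additions for reproducibility and are not in the original notebook.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Numerically encoded dataset, as printed in Out[3].
dataset = pd.DataFrame({
    "Home_Owner": [1, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    "Marital_Status": [0, 1, 0, 1, 2, 1, 2, 0, 1, 0],
    "Annual_Income": [125000, 100000, 70000, 120000, 95000,
                      60000, 220000, 85000, 75000, 90000],
    "Defaulted_Borrower": [0, 0, 0, 0, 1, 0, 0, 1, 0, 1],
})

# Assumption: the target is excluded from the inputs.
features = ["Home_Owner", "Marital_Status", "Annual_Income"]
x = dataset[features]
y = dataset["Defaulted_Borrower"]

dtree = DecisionTreeClassifier(random_state=0).fit(x, y)

# Accuracy on the training rows only; this says nothing about new data.
train_accuracy = dtree.score(x, y)
print(train_accuracy)
```

An unrestricted tree memorises these ten distinct rows, so a perfect training score is expected and is not evidence that the model generalises.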

In [5]:
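# Fit the tree and render it as an image (pydotplus needs the Graphviz binaries installed)
import pydotplus
import matplotlib.pyplot as plt
import matplotlib.image as pltimg

dtree = DecisionTreeClassifier()
dtree = dtree.fit(x, y)

# Export the fitted tree as DOT source, then write it to a PNG file
data = tree.export_graphviz(dtree, out_file=None, feature_names=features)
graph = pydotplus.graph_from_dot_data(data)
graph.write_png('decisiontree.png')

# Load the PNG and display it inline
img = pltimg.imread('decisiontree.png')
imgplot = plt.imshow(img)
plt.show()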
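When Graphviz or pydotplus is not available, scikit-learn's `export_text` prints the same split rules as plain text. The sketch below is an alternative to the image export, not the notebook's method, and it assumes the three non-target columns as features.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Numerically encoded dataset, as printed in Out[3].
dataset = pd.DataFrame({
    "Home_Owner": [1, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    "Marital_Status": [0, 1, 0, 1, 2, 1, 2, 0, 1, 0],
    "Annual_Income": [125000, 100000, 70000, 120000, 95000,
                      60000, 220000, 85000, 75000, 90000],
    "Defaulted_Borrower": [0, 0, 0, 0, 1, 0, 0, 1, 0, 1],
})

# Assumption: the target column is left out of the inputs.
features = ["Home_Owner", "Marital_Status", "Annual_Income"]
dtree = DecisionTreeClassifier(random_state=0)
dtree.fit(dataset[features], dataset["Defaulted_Borrower"])

# Indented text rendering of the fitted tree, one line per split or leaf.
rules = export_text(dtree, feature_names=features)
print(rules)
```

Each leaf line ends in `class: 0` or `class: 1`, which makes it easy to trace by hand how a single borrower is classified.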

In [6]:
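# Prediction for age 39, 10 years of experience, comedy ranking 7, and USA
# (this comment appears to be left over from another example; the four values
# are read as Home_Owner, Marital_Status, Annual_Income, Defaulted_Borrower)
print(dtree.predict([[39, 15, 3, 0]]))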

[0]

In [8]:
# Predict for the values of the first training row (home owner, single, income 125000)
print(dtree.predict([[1, 0, 125000, 0]]))

[0]
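Passing a bare list to `predict` relies on remembering the column order. A slightly safer pattern, sketched here with the assumed three-column feature set, is to wrap the new borrower in a DataFrame with named columns. The `applicant` frame is hypothetical and simply repeats the first training row.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Numerically encoded dataset, as printed in Out[3].
dataset = pd.DataFrame({
    "Home_Owner": [1, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    "Marital_Status": [0, 1, 0, 1, 2, 1, 2, 0, 1, 0],
    "Annual_Income": [125000, 100000, 70000, 120000, 95000,
                      60000, 220000, 85000, 75000, 90000],
    "Defaulted_Borrower": [0, 0, 0, 0, 1, 0, 0, 1, 0, 1],
})

# Assumption: the target column is left out of the inputs.
features = ["Home_Owner", "Marital_Status", "Annual_Income"]
dtree = DecisionTreeClassifier(random_state=0)
dtree.fit(dataset[features], dataset["Defaulted_Borrower"])

# Hypothetical applicant: home owner, single, annual income 125000.
applicant = pd.DataFrame(
    [{"Home_Owner": 1, "Marital_Status": 0, "Annual_Income": 125000}],
    columns=features,
)

pred = dtree.predict(applicant)          # predicted class label
proba = dtree.predict_proba(applicant)   # one probability per class
print(pred, proba)
```

Since this applicant matches a training row labelled as a non-defaulter, the predicted class is 0; `predict_proba` returns one column per class in the order given by `dtree.classes_`.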
