DT RF
DT_RF.ipynb - Colaboratory
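Decision trees are based on the principle of entropy, a measure of randomness: how homogeneous
or heterogeneous the data in a node is. If a node is homogeneous (all records in one class), its
entropy is 0 and no split can gain any information. If a node is fully heterogeneous (classes
evenly mixed), its entropy is at its maximum, and so is the information a split can gain.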
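To make the entropy idea concrete, here is a minimal sketch that computes Shannon entropy and information gain for a node. The function names `entropy` and `information_gain` and the toy labels are illustrative assumptions, not part of the notebook.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(c) / total * entropy(c) for c in children)
    return entropy(parent) - weighted

pure = ["Y", "Y", "Y", "Y"]    # homogeneous node (hypothetical labels)
mixed = ["Y", "Y", "N", "N"]   # evenly mixed node (hypothetical labels)

print(entropy(pure))    # 0.0
print(entropy(mixed))   # 1.0
# Splitting a pure node gains nothing; a perfect split of a mixed node gains 1 bit.
print(information_gain(pure, [pure[:2], pure[2:]]))          # 0.0
print(information_gain(mixed, [["Y", "Y"], ["N", "N"]]))     # 1.0
```

A tree-building algorithm that uses this criterion would evaluate candidate splits and keep the one with the largest gain.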
There are two main types of decision trees: classification trees, which predict a categorical target, and regression trees, which predict a continuous one.
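The difference between the two tree types shows up in the target they are fitted on. The sketch below uses scikit-learn's `DecisionTreeClassifier` and `DecisionTreeRegressor` on made-up data; the arrays `X`, `y_class` and `y_value` are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Hypothetical toy data: one numeric feature, four rows.
X = [[1], [2], [3], [4]]
y_class = [0, 0, 1, 1]          # categorical target -> classification tree
y_value = [1.5, 1.7, 3.2, 3.4]  # continuous target  -> regression tree

clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)
reg = DecisionTreeRegressor(random_state=0).fit(X, y_value)

print(clf.predict([[1], [4]]))  # class labels
print(reg.predict([[1], [4]]))  # numeric values
```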
https://colab.research.google.com/drive/17egvNAuEBzG6AFyLgKTitvbn4VLj72KS#scrollTo=wVwwj1tgTTiO&printMode=true 1/7
12/24/21, 5:28 PM DT_RF.ipynb - Colaboratory
import pandas as pd
cr = pd.read_csv(r"CreditRisk.csv")
cr.head()
3 LP001006 Male Yes 0.0 Not Graduate No 2583
cr.isnull().sum()
Loan_ID 0
Gender 24
Married 3
Dependents 25
Education 0
Self_Employed 55
ApplicantIncome 0
CoapplicantIncome 0
LoanAmount 27
Loan_Amount_Term 20
Credit_History 79
Property_Area 0
Loan_Status 0
dtype: int64
cr.Gender = cr.Gender.fillna('Male')
cr.Self_Employed = cr.Self_Employed.fillna('Yes')
cr.Credit_History = cr.Credit_History.fillna(0)
cr.Dependents = cr.Dependents.fillna(0)
cr.LoanAmount = cr.LoanAmount.fillna(cr.LoanAmount.mean())
cr.Loan_Amount_Term = cr.Loan_Amount_Term.fillna(cr.Loan_Amount_Term.mean())
cr.Married = cr.Married.fillna("Yes")
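The fill values for the categorical columns above are hard-coded. One alternative, shown here only as a sketch and not as what the notebook does, is to derive them from the data with `Series.mode()`. The small `df` frame is a hypothetical stand-in for two columns of CreditRisk.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for two CreditRisk.csv columns; values are made up.
df = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "LoanAmount": [100.0, np.nan, 140.0, 120.0],
})

# Categorical column: fill with the most frequent value found in the data.
df["Gender"] = df["Gender"].fillna(df["Gender"].mode()[0])
# Numeric column: fill with the column mean, as the notebook does.
df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].mean())

print(df.isnull().sum())
```

Computing the fill value from the column keeps the imputation correct if the data changes, at the cost of hiding which value was actually used.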
cr.isnull().sum()
Loan_ID 0
Gender 0
Married 0
Dependents 0
Education 0
Self_Employed 0
ApplicantIncome 0
CoapplicantIncome 0
LoanAmount 0
Loan_Amount_Term 0
Credit_History 0
Property_Area 0
Loan_Status 0
dtype: int64
import sklearn
DecisionTreeClassifier()
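The cells that produced `cr_x_train`, `cr_x_test` and the fitted `dtree` are not visible in this export. The sketch below shows one common way such objects could be built, assuming label encoding of the text columns, `Loan_Status` as the target and an 80/20 split. The `demo` frame, the split ratio and the seed are assumptions, not the notebook's actual settings.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical miniature of the credit data; values are made up.
demo = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male", "Female"] * 4,
    "Credit_History": [1.0, 0.0, 1.0, 1.0, 0.0] * 4,
    "ApplicantIncome": [2583, 4000, 3500, 6000, 1800] * 4,
    "Loan_Status": ["Y", "N", "Y", "Y", "N"] * 4,
})

# Turn each text column into integer codes so the tree can use it.
for col in ["Gender", "Loan_Status"]:
    demo[col] = LabelEncoder().fit_transform(demo[col])

x = demo.drop(columns="Loan_Status")
y = demo["Loan_Status"]

# Assumed 80/20 split; the notebook's real ratio and seed are not shown.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

tree = DecisionTreeClassifier(random_state=0)
tree.fit(x_train, y_train)
pred = tree.predict(x_test)
print(pred)
```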
pred_dt = dtree.predict(cr_x_test)
71.06598984771574
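The number above reads as an accuracy percentage; it is consistent with 140 correct predictions out of 197 test rows. The cell that computed it is not in the export, so the following is only a sketch of one way to get such a figure. `accuracy_percent` and the label lists are hypothetical.

```python
def accuracy_percent(y_true, y_pred):
    """Share of predictions that match the true labels, as a percentage."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical labels, not the notebook's data.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print(accuracy_percent(y_true, y_pred))  # 75.0 (6 of 8 correct)
```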
cr_x_train.head()
33 1 1 0.0 1 0 3500
dtree.feature_importances_
feature_score
Importance Variable_Name
0 0.023792 Gender
1 0.038591 Married
2 0.025406 Dependents
3 0.022473 Education
4 0.000000 Self_Employed
5 0.283718 ApplicantIncome
6 0.094969 CoapplicantIncome
7 0.176804 LoanAmount
8 0.039185 Loan_Amount_Term
9 0.261147 Credit_History
10 0.033915 Property_Area
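The table above pairs each value of `feature_importances_` with its column name; the importances sum to 1. The cell that built `feature_score` is not in the export, so this is a sketch of how a table with the same two columns could be built and then ranked. The synthetic data from `make_classification` and the names `f0` to `f3` are assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data; column names are hypothetical.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=0
)
names = ["f0", "f1", "f2", "f3"]
model = DecisionTreeClassifier(random_state=0).fit(X, y)

# One row per feature: its importance and its name.
feature_score = pd.DataFrame({
    "Importance": model.feature_importances_,
    "Variable_Name": names,
})

# Rank from most to least important.
ranked = feature_score.sort_values("Importance", ascending=False)
print(ranked)
```

A feature with importance 0, like `Self_Employed` in the notebook's table, was never used for a split in the fitted tree.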
Importance Variable_Name
5 0.283718 ApplicantIncome
9 0.261147 Credit_History
7 0.176804 LoanAmount
6 0.094969 CoapplicantIncome
8 0.039185 Loan_Amount_Term
1 0.038591 Married
10 0.033915 Property_Area
2 0.025406 Dependents
0 0.023792 Gender
3 0.022473 Education
4 0.000000 Self_Employed
Random Forest Model
# Random Forest
# It uses a number of decision trees.
# Ensemble technique: N samples are drawn, and a decision tree is built on each sample.
# Each tree makes a prediction, and at the end votes are taken.
#--------------------------#
# For example, if you have 1000 records and you are going to build 100 trees,
# 100 samples are created randomly.
# A few samples may have 50 records and 3 columns;
# other samples may have a different combination of records.
# Finally, each tree makes its decision individually, and the final decision is taken by the vote.
# Records can also be duplicated within a sample, at random (sampling with replacement).
# The prediction is whichever class, 1 or 0, gets the most votes.
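The sampling-and-voting idea described in these comments can be sketched by hand with plain decision trees. This is an illustration of bagging with a majority vote under assumed settings (25 trees, synthetic data from `make_classification`), not the internals of `RandomForestClassifier`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Synthetic binary-classification data; a stand-in for the credit data.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

n_trees = 25  # odd, so a two-class vote cannot tie
trees = []
for _ in range(n_trees):
    # Bootstrap sample: draw row indices with replacement, so rows can repeat.
    rows = rng.integers(0, len(X), size=len(X))
    # max_features="sqrt" makes each split consider a random subset of columns.
    t = DecisionTreeClassifier(
        max_features="sqrt", random_state=int(rng.integers(1_000_000))
    )
    trees.append(t.fit(X[rows], y[rows]))

# Each tree votes; the class with the most votes wins.
votes = np.stack([t.predict(X) for t in trees])   # shape (n_trees, n_samples)
majority = (votes.mean(axis=0) > 0.5).astype(int)
print(majority[:10])
```

Note that scikit-learn's own forest combines trees by averaging their predicted class probabilities rather than counting hard votes, so its output can differ slightly from a strict majority vote.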
rfc.fit(cr_x_train, cr_y_train)
RandomForestClassifier()
pred_rf = rfc.predict(cr_x_test)
pred_rf
array([1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1,
1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1,
0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0,
1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1,
1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0,
1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1])
76.6497461928934
0.766497461928934
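The two numbers above are the same accuracy on two scales, a percentage and a fraction; both are consistent with 151 correct predictions out of 197. The cells that produced them are not in the export. The sketch below shows how a fraction from `accuracy_score` relates to a percentage, using synthetic data and an assumed 80/20 split rather than the notebook's data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the credit data.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
pred = forest.predict(X_test)

fraction = accuracy_score(y_test, pred)  # value between 0 and 1
percent = 100 * fraction                 # same result as a percentage
print(fraction, percent)
```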