
The document demonstrates how to use a decision tree classifier on the iris dataset to classify iris species based on petal dimensions. It includes steps for loading the dataset, splitting it into training and testing sets, training the model, making predictions, and evaluating performance using accuracy and confusion matrix. Additionally, it shows how to visualize the decision tree and make predictions for specific petal measurements.


08/12/2021, 13:14 decision_trees (1).ipynb - Colaboratory

Decision Tree

We'll use the iris dataset to visualize how a decision tree works.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
iris.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

Let us first understand the dataset.

The dataset consists of:

150 samples

3 labels: species of Iris (Iris setosa, Iris virginica and Iris versicolor)

4 features: sepal length, sepal width, petal length, petal width (in cm)

Scikit-learn only works with numeric data, whether the problem is regression or classification. It also requires the
arrays to be stored as NumPy arrays for optimization. Since this dataset is loaded from scikit-learn, everything is already appropriately formatted.
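A quick standalone check of those claims (its own import included): the loaded arrays really are numeric NumPy arrays of the stated shape, with three integer class labels.

```python
from sklearn.datasets import load_iris
import numpy as np

iris = load_iris()

# The features and targets are already NumPy arrays of numeric dtype,
# so no encoding or conversion is needed before fitting a model.
print(type(iris.data))         # the features are a numpy.ndarray
print(iris.data.shape)         # (150, 4): 150 samples, 4 features
print(iris.data.dtype)         # a floating-point dtype
print(np.unique(iris.target))  # [0 1 2]: the three species labels
```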

iris['feature_names']

['sepal length (cm)',


'sepal width (cm)',
'petal length (cm)',
'petal width (cm)']

Assign the data and target to separate variables: X contains the features and y contains the labels.

https://colab.research.google.com/drive/11xVrqpEgeHcu9DfrErPhEsoqc5qPu70x#scrollTo=HnrpmO2mSgvI&printMode=true 1/6

X = iris.data[:, 2:]  # considering only the petal length and petal width as the features
y = iris.target

Splitting the dataset:

x_train contains the training features

x_test contains the testing features

y_train contains the training labels

y_test contains the testing labels

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
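Because train_test_split shuffles at random, every run of the cell above can produce a different split (and therefore slightly different accuracy later). A small standalone sketch of a reproducible, class-balanced split; the random_state value and the stratify option are choices of this sketch, not from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width, as above
y = iris.target

# random_state makes the split reproducible; stratify keeps the class
# proportions identical in the training and testing sets.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

print(x_train.shape, x_test.shape)  # (105, 2) (45, 2)
```

With stratify=y, each of the three species contributes exactly 15 of the 45 test samples.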

Train the model:

from sklearn import tree

classifier = tree.DecisionTreeClassifier(max_depth=2)
classifier.fit(x_train, y_train)

DecisionTreeClassifier(max_depth=2)
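Beyond that printed repr, the fitted classifier exposes its learned structure through attributes such as tree_. A standalone sketch; the random_state values here are assumptions for reproducibility, since the notebook's own split and tree are unseeded.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data[:, 2:], iris.target
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

classifier = DecisionTreeClassifier(max_depth=2, random_state=0)
classifier.fit(x_train, y_train)

# Inspect the learned tree: its depth is capped at the requested 2.
print(classifier.get_depth())         # at most 2
print(classifier.tree_.node_count)    # total nodes in the fitted tree
print(classifier.tree_.feature[0])    # feature index tested at the root
print(classifier.tree_.threshold[0])  # threshold used at the root split
```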

Make predictions:

predictions = classifier.predict(x_test)
predictions

array([2, 1, 1, 2, 1, 1, 2, 2, 1, 0, 2, 2, 2, 2, 0, 1, 1, 0, 1, 1, 1, 0,
0, 0, 1, 0, 2, 0, 0, 2, 1, 0, 2, 0, 0, 0, 1, 1, 2, 0, 1, 1, 1, 1,
0])

from sklearn.tree import plot_tree

plot_tree(classifier)

[Text(133.92000000000002, 181.2, 'X[0] <= 2.45\ngini = 0.666\nsamples = 105\nvalue = [35, 33, 37]'),
Text(66.96000000000001, 108.72, 'gini = 0.0\nsamples = 35\nvalue = [35, 0, 0]'),
Text(200.88000000000002, 108.72, 'X[1] <= 1.7\ngini = 0.498\nsamples = 70\nvalue = [0, 33, 37]'),
Text(133.92000000000002, 36.23999999999998, 'gini = 0.111\nsamples = 34\nvalue = [0, 32, 2]'),
Text(267.84000000000003, 36.23999999999998, 'gini = 0.054\nsamples = 36\nvalue = [0, 1, 35]')]
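Since plot_tree draws with matplotlib, the figure can also be saved straight to a file without needing Graphviz at all. A standalone sketch, assuming matplotlib is installed; the filename iris_tree_mpl.png is hypothetical.

```python
import matplotlib
matplotlib.use('Agg')  # render off-screen; no display required
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

iris = load_iris()
X, y = iris.data[:, 2:], iris.target
classifier = DecisionTreeClassifier(max_depth=2).fit(X, y)

fig, ax = plt.subplots(figsize=(8, 5))
plot_tree(classifier,
          feature_names=iris.feature_names[2:],
          class_names=iris.target_names,
          filled=True, rounded=True, ax=ax)
fig.savefig('iris_tree_mpl.png')  # hypothetical output filename
```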


Performance measure

from sklearn.metrics import accuracy_score

print(accuracy_score(y_test, predictions))

0.9333333333333333
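Accuracy is simply the fraction of test samples predicted correctly, which we can verify by hand. A standalone sketch; the seeded split is an assumption of this sketch, so its printed number may differ from the 0.933 above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X, y = iris.data[:, 2:], iris.target
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

classifier = DecisionTreeClassifier(max_depth=2).fit(x_train, y_train)
predictions = classifier.predict(x_test)

# Accuracy by hand: mean of the correct/incorrect indicator vector.
manual = np.mean(predictions == y_test)
assert manual == accuracy_score(y_test, predictions)
print(manual)
```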

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_test, predictions)
cm

array([[15,  0,  0],
       [ 0, 16,  1],
       [ 0,  2, 11]])
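Per-class precision and recall can be read straight off this matrix: rows are true classes, columns are predicted classes, and the diagonal holds the correct predictions. A sketch using the matrix printed above.

```python
import numpy as np

# Confusion matrix from the run above: rows = true classes,
# columns = predicted classes (setosa, versicolor, virginica).
cm = np.array([[15,  0,  0],
               [ 0, 16,  1],
               [ 0,  2, 11]])

# Recall per class: correct predictions divided by the row total.
recall = np.diag(cm) / cm.sum(axis=1)
# Precision per class: correct predictions divided by the column total.
precision = np.diag(cm) / cm.sum(axis=0)
# Overall accuracy: the diagonal sum over the grand total.
accuracy = np.trace(cm) / cm.sum()

print(recall)     # [1.     0.9412 0.8462] (approximately)
print(precision)  # [1.     0.8889 0.9167] (approximately)
print(accuracy)   # 0.9333..., matching accuracy_score above
```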

Visualising our decision tree

We can visualize the trained decision tree using the export_graphviz() function.

from sklearn.tree import export_graphviz

export_graphviz(
    classifier,
    out_file='iris_tree.dot',
    feature_names=iris.feature_names[2:],
    class_names=iris.target_names,
    rounded=True,
    filled=True)

%ls

iris_tree.dot sample_data/

Now we can use the following shell command to convert the .dot file to a PNG image:

! dot -Tpng iris_tree.dot -o iris_tree.png

from IPython.display import Image

Image(filename='iris_tree.png')


classifier.predict_proba([[5, 1.5]]) # a petal length of 5 and petal width of 1.5

array([[0. , 0.94117647, 0.05882353]])

classifier.predict_proba([[2.45, 1]]) # a petal length of 2.45 and petal width of 1

array([[0. , 0.94117647, 0.05882353]])

classifier.predict([[5, 1.5]])

array([1])
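predict() is consistent with predict_proba(): it returns the class whose leaf probability is highest, i.e. the argmax of the probability row. A standalone sketch; it fits on the full dataset here rather than the notebook's random train split.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data[:, 2:], iris.target
classifier = DecisionTreeClassifier(max_depth=2).fit(X, y)

sample = [[5, 1.5]]  # petal length 5 cm, petal width 1.5 cm
proba = classifier.predict_proba(sample)

# The predicted class is the argmax of the leaf probabilities,
# so predict() and predict_proba() always agree.
assert classifier.predict(sample)[0] == np.argmax(proba, axis=1)[0]
print(proba)
print(classifier.predict(sample))
```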
