
Ex. No. 6  Build Decision Trees and Random Forests

Aim:

To write a Python program to build decision trees and random forests.

Algorithm:

1. Start.
2. Import the required libraries: pandas, numpy, matplotlib.pyplot, and seaborn.
3. Load the kyphosis dataset into the Python script using pandas' read_csv.
4. Perform exploratory data analysis to determine the size and structure of the dataset: raw_data.info().
5. Visualize the dataset using the seaborn library.
6. Preprocess the data:
   a. Split the dataset into a training set and a test set.
   b. Train the decision tree model.
7. Make predictions using model.predict(x_test_data) and measure the performance using scikit-learn's built-in functions classification_report and confusion_matrix.
8. Train the random forest model and make predictions with it.
9. Measure the performance of the random forest model and generate its confusion matrix.
10. Stop.

Program:

#Numerical computing libraries

import pandas as pd

import numpy as np

#Visualization libraries

import matplotlib.pyplot as plt

import seaborn as sns

%matplotlib inline

raw_data = pd.read_csv('kyphosis.csv')

raw_data.columns

#Exploratory data analysis

raw_data.info()

sns.pairplot(raw_data, hue = 'Kyphosis')


#Split the data set into training data and test data

from sklearn.model_selection import train_test_split

x = raw_data.drop('Kyphosis', axis = 1)

y = raw_data['Kyphosis']

x_training_data, x_test_data, y_training_data, y_test_data = train_test_split(x, y, test_size = 0.3)

#Train the decision tree model

from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()

model.fit(x_training_data, y_training_data)

predictions = model.predict(x_test_data)

#Measure the performance of the decision tree model

from sklearn.metrics import classification_report

from sklearn.metrics import confusion_matrix

print(classification_report(y_test_data, predictions))

print(confusion_matrix(y_test_data, predictions))

#Train the random forests model

from sklearn.ensemble import RandomForestClassifier

random_forest_model = RandomForestClassifier()

random_forest_model.fit(x_training_data, y_training_data)

random_forest_predictions = random_forest_model.predict(x_test_data)

#Measure the performance of the random forest model

print(classification_report(y_test_data, random_forest_predictions))

print(confusion_matrix(y_test_data, random_forest_predictions))
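As a follow-up, the trained random forest can also report how much each feature contributed to its splits via the feature_importances_ attribute. The sketch below is self-contained: since kyphosis.csv may not be available, it builds a small synthetic stand-in with the same three feature columns (Age, Number, Start) and a hypothetical label rule, and it fixes random_state so the split and the forest are reproducible. The column names and label rule are illustrative assumptions, not part of the original dataset.

```python
#Hedged sketch: inspect feature importances of a fitted random forest.
#Synthetic stand-in for kyphosis.csv; label rule below is hypothetical.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
data = pd.DataFrame({
    'Age': rng.integers(1, 200, size=81),      # months, as in the real data
    'Number': rng.integers(2, 10, size=81),    # vertebrae involved
    'Start': rng.integers(1, 18, size=81),     # first vertebra operated on
})
#Hypothetical labels so the example is runnable without the CSV
labels = np.where(data['Start'] < 9, 'present', 'absent')

x_train, x_test, y_train, y_test = train_test_split(
    data, labels, test_size=0.3, random_state=42)

forest = RandomForestClassifier(random_state=42)
forest.fit(x_train, y_train)

#Importances are non-negative and sum to 1 across the features
for name, importance in zip(data.columns, forest.feature_importances_):
    print(f'{name}: {importance:.3f}')
```

On the real kyphosis data the same two lines (fit, then read feature_importances_) apply unchanged after the random_forest_model.fit call in the program above.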

Output:

RangeIndex: 81 entries, 0 to 80
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Kyphosis 81 non-null object
1 Age 81 non-null int64
2 Number 81 non-null int64
3 Start 81 non-null int64
dtypes: int64(3), object(1)
memory usage: 2.7+ KB
              precision    recall  f1-score   support

      absent       0.84      0.76      0.80        21
     present       0.17      0.25      0.20         4

    accuracy                           0.68        25
   macro avg       0.50      0.51      0.50        25
weighted avg       0.73      0.68      0.70        25

[[16  5]
 [ 3  1]]
              precision    recall  f1-score   support

      absent       0.86      0.86      0.86        21
     present       0.25      0.25      0.25         4

    accuracy                           0.76        25
   macro avg       0.55      0.55      0.55        25
weighted avg       0.76      0.76      0.76        25

[[18  3]
 [ 3  1]]
Result:
Thus, a Python program to build decision trees and random forests has been written and
executed successfully.
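Note that with only 81 rows, a single 70/30 split (as used in the program) gives noisy scores, and results will vary from run to run. A hedged sketch of one common remedy, k-fold cross-validation, is shown below; it again uses a synthetic stand-in for kyphosis.csv with a hypothetical label rule, so the printed numbers are illustrative only.

```python
#Hedged sketch: 5-fold cross-validation for a steadier accuracy estimate.
#Synthetic stand-in for kyphosis.csv; label rule is hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
x = pd.DataFrame({
    'Age': rng.integers(1, 200, size=81),
    'Number': rng.integers(2, 10, size=81),
    'Start': rng.integers(1, 18, size=81),
})
y = np.where(x['Start'] < 9, 'present', 'absent')

#Each of the 5 folds is held out once; the mean smooths split-to-split noise
scores = cross_val_score(RandomForestClassifier(random_state=0), x, y, cv=5)
print(f'mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})')
```

On the real dataset, replacing the synthetic x and y with the raw_data columns from the program gives a more stable performance figure than one train/test split.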
