
random forest

This Python notebook trains a Random Forest classifier on the Iris dataset. It covers data loading, separating the features from the target, model training, and evaluation; the model scores an accuracy of 1.0 on the 30-sample test set. It also computes a confusion matrix and plots it as a heatmap to visualize the model's performance.

Uploaded by

Kavya Padarthi

In [2]: import pandas as pd
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

iris = load_iris()

In [4]: dir(iris)

Out[4]: ['DESCR',
'data',
'data_module',
'feature_names',
'filename',
'frame',
'target',
'target_names']
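
The attributes listed above include `feature_names` and `target_names`, which the notebook does not use. A minimal sketch of how they could label the data, assuming the same `load_iris()` call (the variable name `named_df` is hypothetical and not part of the original notebook):

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Human-readable labels stored alongside the raw arrays.
print(iris.feature_names)
print(iris.target_names)

# Use feature_names as column labels instead of the default 0..3.
named_df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(named_df.shape)
```

The rest of the notebook keeps the default integer column labels, so the outputs below still show columns 0 to 3.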

In [6]: df = pd.DataFrame(iris.data)
df

Out[6]:
       0    1    2    3
0    5.1  3.5  1.4  0.2
1    4.9  3.0  1.4  0.2
2    4.7  3.2  1.3  0.2
3    4.6  3.1  1.5  0.2
4    5.0  3.6  1.4  0.2
..   ...  ...  ...  ...
145  6.7  3.0  5.2  2.3
146  6.3  2.5  5.0  1.9
147  6.5  3.0  5.2  2.0
148  6.2  3.4  5.4  2.3
149  5.9  3.0  5.1  1.8

150 rows × 4 columns

In [8]: df[0:12]

Out[8]:
      0    1    2    3
0   5.1  3.5  1.4  0.2
1   4.9  3.0  1.4  0.2
2   4.7  3.2  1.3  0.2
3   4.6  3.1  1.5  0.2
4   5.0  3.6  1.4  0.2
5   5.4  3.9  1.7  0.4
6   4.6  3.4  1.4  0.3
7   5.0  3.4  1.5  0.2
8   4.4  2.9  1.4  0.2
9   4.9  3.1  1.5  0.1
10  5.4  3.7  1.5  0.2
11  4.8  3.4  1.6  0.2

In [10]: df['target'] = iris.target

In [12]: df.head()

Out[12]:
     0    1    2    3  target
0  5.1  3.5  1.4  0.2       0
1  4.9  3.0  1.4  0.2       0
2  4.7  3.2  1.3  0.2       0
3  4.6  3.1  1.5  0.2       0
4  5.0  3.6  1.4  0.2       0
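
The `target` column holds integer class codes, and the first rows are all class 0. As an illustrative sketch, the codes can be mapped to species names and the class balance checked before splitting. The column name `species` and the variable `class_counts` are assumptions introduced here, not part of the original notebook:

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data)
df['target'] = iris.target

# target_names is a NumPy array, so the integer codes can index it directly.
df['species'] = iris.target_names[iris.target]  # hypothetical column name

# Count how many rows belong to each class.
class_counts = df['species'].value_counts()
print(class_counts)
```

Iris has 50 rows per class, so any sizeable random test split should contain all three classes.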

In [14]: X = df.drop('target', axis='columns')
y = df.target

In [16]: from sklearn.model_selection import train_test_split

In [18]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
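
The split above is random, so the rows that land in the test set, and therefore the scores and predictions shown below, can differ between runs. A possible variant, assuming reproducibility and balanced classes are wanted, fixes `random_state` and passes `stratify`. The seed value 42 is an arbitrary choice, not something the original notebook uses:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# random_state makes the shuffle repeatable; stratify keeps class proportions
# equal in the train and test portions. The seed 42 is arbitrary.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
print(np.bincount(y_test))  # number of test rows per class
```

With `test_size=0.2`, 30 of the 150 rows are held out for testing, which matches the 30 predictions shown later in the notebook.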

In [20]: from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=10)
model.fit(X_train, y_train)

Out[20]: RandomForestClassifier(n_estimators=10)

In [22]: model.score(X_test,y_test)

Out[22]: 1.0
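
A score of 1.0 here is measured on a single 30-row test set, so it may be optimistic. One common way to get a steadier estimate is k-fold cross-validation; the sketch below assumes 5 folds and a fixed `random_state=0`, neither of which appears in the original notebook:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Same forest size as the notebook; random_state is added only for repeatability.
model = RandomForestClassifier(n_estimators=10, random_state=0)

# Fit and score the model on 5 different train/test partitions.
scores = cross_val_score(model, X, y, cv=5)
print(scores)
print(scores.mean())
```

Each entry in `scores` is the accuracy on one held-out fold, and the mean summarizes them.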

In [24]: y_predict = model.predict(X_test)

In [26]: y_predict

Out[26]: array([2, 1, 2, 0, 2, 1, 2, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 2, 0, 0, 2,
1, 2, 2, 1, 2, 0, 1, 1])

In [28]: # RandomForestClassifier is already imported in In [20]
model = RandomForestClassifier(n_estimators=20)
model.fit(X_train, y_train)
model.score(X_test, y_test)

Out[28]: 1.0

In [30]: model = RandomForestClassifier(n_estimators=30)
model.fit(X_train, y_train)
model.score(X_test, y_test)

Out[30]: 1.0

In [32]: model = RandomForestClassifier(n_estimators=60)
model.fit(X_train, y_train)
model.score(X_test, y_test)

Out[32]: 1.0

In [34]: model = RandomForestClassifier(n_estimators=90)
model.fit(X_train, y_train)
model.score(X_test, y_test)

Out[34]: 1.0

In [36]: model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
model.score(X_test, y_test)

Out[36]: 1.0
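
The cells above repeat the same fit-and-score steps for six values of `n_estimators`. A compact sketch of the same experiment as a loop follows. The fixed seeds and the dictionary name `scores` are assumptions added for repeatability, so the exact numbers need not match the notebook's run:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The forest sizes tried in the notebook cells.
tree_counts = [10, 20, 30, 60, 90, 100]

scores = {}
for n in tree_counts:
    model = RandomForestClassifier(n_estimators=n, random_state=0)
    model.fit(X_train, y_train)
    # Accuracy on the held-out test rows for this forest size.
    scores[n] = model.score(X_test, y_test)

for n, acc in scores.items():
    print(n, acc)
```

Collecting the results in one dictionary makes it easier to compare forest sizes side by side.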

In [38]: from sklearn.metrics import confusion_matrix

# y_predict was computed in In [24] by the 10-tree model, before the later refits
cm = confusion_matrix(y_test, y_predict)
cm

Out[38]: array([[11,  0,  0],
                [ 0, 10,  0],
                [ 0,  0,  9]], dtype=int64)
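
In this matrix, rows are true classes and columns are predicted classes, so correct predictions sit on the diagonal. As a worked check, the sketch below recomputes accuracy and per-class recall and precision from the matrix values shown above. The variable names are illustrative:

```python
import numpy as np

# Confusion matrix copied from the output above (rows: truth, columns: predicted).
cm = np.array([[11, 0, 0],
               [0, 10, 0],
               [0, 0, 9]])

# Accuracy: correct predictions (the diagonal) over all predictions.
accuracy = np.trace(cm) / cm.sum()

# Recall per class: diagonal over row totals (true counts).
recall = np.diag(cm) / cm.sum(axis=1)

# Precision per class: diagonal over column totals (predicted counts).
precision = np.diag(cm) / cm.sum(axis=0)

print(accuracy)
print(recall)
print(precision)
```

All 30 test rows fall on the diagonal, which agrees with the score of 1.0 reported earlier.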

In [40]: import seaborn as sns

plt.figure(figsize=(10, 7))
sns.heatmap(cm, annot=True)
plt.xlabel('Predicted')
plt.ylabel('Truth')

Out[40]: Text(95.72222222222221, 0.5, 'Truth')
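
The heatmap above labels its ticks 0, 1, 2. As an alternative sketch, scikit-learn's `ConfusionMatrixDisplay` can draw the same matrix with class names on the axes. This swaps in a different plotting helper than the seaborn call the notebook uses; the matrix values are copied from Out[38], the class names are the Iris `target_names`, and the `Agg` backend is chosen only so the code also runs outside a notebook:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay

# Confusion matrix copied from Out[38] (rows: truth, columns: predicted).
cm = np.array([[11, 0, 0],
               [0, 10, 0],
               [0, 0, 9]])

# Class names in the order of the integer codes 0, 1, 2.
class_names = ["setosa", "versicolor", "virginica"]

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
disp.plot(cmap="Blues")

# Match the axis titles used in the notebook's heatmap.
disp.ax_.set_xlabel("Predicted")
disp.ax_.set_ylabel("Truth")
```

In a notebook the figure renders when the cell finishes; in a script, `plt.show()` or `disp.figure_.savefig(...)` would be needed to see it.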

