Practice 2+

This document discusses decision tree classifiers and the steps to build one. It introduces decision trees and the libraries and packages used, including scikit-learn, NumPy, and Pandas. It describes installing these packages and loading/preparing the data, which involves converting categorical variables to numeric. The document explains separating the data into feature and target columns to train the decision tree and make predictions on the target variable.
2 practice

DECISION TREE CLASSIFIER
• The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
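To make "simple decision rules" concrete, here is a hand-written sketch of the kind of rule set a fitted tree amounts to. The function name, the feature names, and the thresholds are hypothetical and chosen only for illustration. A real tree learns its tests from the data.

```python
def predict_go(rank: int, experience: int) -> str:
    """Hand-written stand-in for a learned tree: a chain of if/else tests on features."""
    # Hypothetical thresholds; a trained tree would pick these from the data.
    if rank <= 6:
        return "NO"
    if experience <= 9:
        return "NO"
    return "YES"


print(predict_go(rank=8, experience=12))
print(predict_go(rank=5, experience=12))
```

Each `if` corresponds to one internal node of the tree, and each `return` to a leaf.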
Libraries
1. sklearn:
   1. In Python, sklearn (scikit-learn) is a machine learning package that includes many ML algorithms.
   2. Here we use some of its functions and classes: train_test_split, DecisionTreeClassifier and accuracy_score.
2. NumPy:
   1. A numerical Python module that provides fast math functions for calculations.
   2. It is used to read data into NumPy arrays and to manipulate them.
3. Pandas:
   1. Used to read and write different file formats.
   2. Data manipulation can be done easily with dataframes.
Installation of the packages:
• In Python, scikit-learn (imported as sklearn) provides the tools needed to implement machine learning algorithms. You can install it with the command below.

pip install -U scikit-learn


• scikit-learn depends on the scipy and numpy packages. Recent versions of pip install them automatically as dependencies; otherwise, install them first.
• If you don't have pip, you can install it by running the get-pip.py script:
python get-pip.py
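After installing, it can help to confirm that the packages are importable. The following is a minimal sketch that uses only the standard library. The helper name `missing_packages` is hypothetical, and the list of required names is an assumption based on the packages mentioned above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names, not pip names: scikit-learn is imported as "sklearn".
required = ["numpy", "scipy", "sklearn", "pandas"]
missing = missing_packages(required)

if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are available.")
```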
Read the data set with pandas and print it:
import pandas

df = pandas.read_csv("data.csv")

print(df)
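The file data.csv is not included here. To try the loading step without it, the sketch below reads the same kind of table from an in-memory string. Only the column names come from this text; the rows are made up for illustration.

```python
import io

import pandas

# Made-up rows standing in for data.csv; only the column names come from the text.
csv_text = """Age,Experience,Rank,Nationality,Go
36,10,9,UK,NO
42,12,4,USA,NO
23,4,6,N,NO
52,4,4,USA,NO
43,21,8,USA,YES
"""

df = pandas.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.dtypes)   # shows which columns are numeric and which hold text
```

Checking `df.dtypes` right after loading shows which columns still hold text and therefore need converting.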
To make a decision tree, all data has to be numerical.
We have to convert the non-numerical columns 'Nationality' and 'Go' into numerical values.
Pandas has a map() method that takes a dictionary describing how to convert the values.
{'UK': 0, 'USA': 1, 'N': 2}
means: convert the value 'UK' to 0, 'USA' to 1, and 'N' to 2.
Change string values into numerical values:
d = {'UK': 0, 'USA': 1, 'N': 2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES': 1, 'NO': 0}
df['Go'] = df['Go'].map(d)

print(df)
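One thing to watch for: map() turns any value that is missing from the dictionary into NaN, which would later cause problems when fitting the tree. A small sketch with a made-up column, where 'FR' is a hypothetical value left out of the dictionary on purpose:

```python
import pandas

# Hypothetical column: 'FR' is deliberately absent from the dictionary.
nationality = pandas.Series(['UK', 'USA', 'N', 'FR'])
d = {'UK': 0, 'USA': 1, 'N': 2}

mapped = nationality.map(d)
print(mapped)

# Count the values that were not converted, so they can be fixed before training.
unmapped = int(mapped.isna().sum())
print("unmapped values:", unmapped)
```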
Feature and target columns

• Then we have to separate the feature columns from the target column.
The feature columns are the columns that we try to predict from, and the target column is the column with the values we try to predict.
X holds the feature columns, y is the target column:
features = ['Age', 'Experience', 'Rank', 'Nationality']

X = df[features]
y = df['Go']

print(X)
print(y)
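The Libraries list mentions train_test_split. As a sketch of how X and y could be split into training and test parts, assuming a small made-up table that is already numeric (the rows, `test_size` and `random_state` values are illustrative choices, not taken from the text):

```python
import pandas
from sklearn.model_selection import train_test_split

# Made-up, already-numeric rows standing in for the converted data.csv.
df = pandas.DataFrame({
    'Age':         [36, 42, 23, 52, 43, 44, 66, 35],
    'Experience':  [10, 12, 4, 4, 21, 14, 3, 14],
    'Rank':        [9, 4, 6, 4, 8, 5, 7, 9],
    'Nationality': [0, 1, 2, 1, 1, 0, 2, 0],
    'Go':          [0, 0, 0, 0, 1, 0, 1, 1],
})

features = ['Age', 'Experience', 'Rank', 'Nationality']
X = df[features]
y = df['Go']

# Hold out a quarter of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X_train.shape, X_test.shape)
```

The rows of X and y stay aligned after the split, so each training row keeps its own target value.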
Decision tree
• Now we can create the actual decision tree and fit it to our data.
Start by importing the modules we need:
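A minimal sketch of this step, assuming scikit-learn's DecisionTreeClassifier and accuracy_score (both named in the Libraries list) and a handful of made-up, already-numeric rows in place of data.csv. The variable name `dtree` and the new example row are hypothetical.

```python
import pandas
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# Made-up, already-numeric rows standing in for the converted data.csv.
df = pandas.DataFrame({
    'Age':         [36, 42, 23, 52, 43, 44, 66, 35],
    'Experience':  [10, 12, 4, 4, 21, 14, 3, 14],
    'Rank':        [9, 4, 6, 4, 8, 5, 7, 9],
    'Nationality': [0, 1, 2, 1, 1, 0, 2, 0],
    'Go':          [0, 0, 0, 0, 1, 0, 1, 1],
})

features = ['Age', 'Experience', 'Rank', 'Nationality']
X = df[features]
y = df['Go']

# Create the tree and fit it to the feature and target columns.
dtree = DecisionTreeClassifier(random_state=0)
dtree.fit(X, y)

# Accuracy on the training rows themselves (an optimistic measure).
train_accuracy = accuracy_score(y, dtree.predict(X))
print("training accuracy:", train_accuracy)

# Predict for one new, hypothetical row with the same feature columns.
new_row = pandas.DataFrame([[40, 10, 7, 1]], columns=features)
prediction = dtree.predict(new_row)
print("prediction:", prediction[0])
```

Accuracy measured on the training rows overstates how well the tree generalizes; evaluating on held-out rows gives a fairer estimate.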
