Soft Computing Lab Practical Assignment 2
Assignment No. 2
2.1 Title
Decision Tree Classifier: build a decision tree for the given dataset and find its root node.
2.2 Problem Definition:
A dataset collected in a cosmetics shop, showing details of customers and whether or not they responded to a special offer to buy a new lipstick, is shown in the table below. Use this dataset to build a decision tree, with Buys as the target variable, to help in deciding lipstick purchases in the future. Find the root node of the decision tree. According to the decision tree you have built from this training data set, what is the decision for the test data: [Age < 21, Income = Low, Gender = Female, Marital Status = Married]?
2.3 Prerequisite:
2.4 Software Requirements:
Python 3 with the pandas, scikit-learn, and graphviz packages (as imported in the code below).
2.5 Hardware Requirement:
2.6 Learning Objectives:
Learn how to apply a Decision Tree Classifier to find the root node of a decision tree, and how to classify new test data according to the tree built from the training data set.
2.7 Outcomes:
After completing this assignment, students are able to implement code that creates a decision tree for the given dataset and finds its root node based on the given condition.
2.8 Theory Concepts:
2.8.1 Motivation
Suppose we have the following plot for two classes, represented by black circles and blue squares. Is it possible to draw a single separating line? Perhaps not.
We need two lines here: one separating according to a threshold value of x, and the other for a threshold value of y. A Decision Tree Classifier repetitively divides the working area (plot) into subparts by identifying such lines. (Repetitively, because there may be two distant regions of the same class, separated by others.)
1. Impurity
In the above division, we had a clear separation of classes. But what if we had the following case? Impurity is when we have a trace of one class mixed into another. This can arise for the following reasons:
- We run out of available features to divide the classes upon.
- We tolerate some percentage of impurity (we stop further division) for faster performance. (There is always a tradeoff between accuracy and performance.)
For example, in the second case we may stop our division when fewer than x elements are left in a subpart. This is also known as gini impurity.
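As an illustration, here is a minimal Python sketch of this measure (the helper name gini and the example labels are illustrative, not taken from the assignment):

from collections import Counter

def gini(labels):
    # Gini impurity: 1 minus the sum of squared class probabilities.
    # 0 means a pure node; higher values mean more class mixing.
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini(['Yes', 'Yes', 'No']))   # 1 - (4/9 + 1/9) = 0.444...
print(gini(['Yes', 'Yes', 'Yes']))  # 0.0 (pure node)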
2. Entropy
Entropy is the degree of randomness of the elements; in other words, it is a measure of impurity. Mathematically, it can be calculated from the probabilities of the items as:

H(x) = − Σᵢ p(xᵢ) · log₂ p(xᵢ)
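A minimal sketch of this calculation in Python (the helper name entropy and the example labels are mine, not from the assignment):

import math
from collections import Counter

def entropy(items):
    # H(x) = - sum over the distinct classes of p(x_i) * log2(p(x_i))
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

print(entropy(['Yes', 'No', 'Yes', 'Yes']))  # mixed classes: ~0.811
print(entropy(['Yes', 'Yes']))               # pure set: 0.0 (no randomness)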
3. Information Gain
Suppose we have multiple features to divide the current working set. Which feature should we select for the division? Perhaps the one that gives us the least impurity. Suppose we divide the classes into multiple branches; the information gain at any node is defined as:

Information Gain(n) = Entropy(x) − [weighted average] × Entropy(children for the feature)

This needs a bit of explanation!
Suppose we have the following class to work with initially:
1 1 2 2 3 4 4 4 5
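To make the definition concrete, here is a minimal sketch that computes the information gain for the initial set above (the split into two children is a hypothetical example, not one given in the assignment):

import math
from collections import Counter

def entropy(items):
    # same entropy helper as in the sketch above
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

parent = [1, 1, 2, 2, 3, 4, 4, 4, 5]        # the initial set shown above
children = [[1, 1, 2, 2, 3], [4, 4, 4, 5]]  # a hypothetical split

# weighted average of the children's entropies
weighted = sum(len(c) / len(parent) * entropy(c) for c in children)

# Information Gain = Entropy(parent) - weighted average of children's entropies
print(entropy(parent) - weighted)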
What is the decision for the test data: [Age < 21, Income = Low, Gender = Female, Marital Status = Married]?
Solution: A
[Hint: construct the decision tree to answer this question]
2.10 Algorithm
1. Read the dataset into a pandas DataFrame.
2. Separate the features (Age, Income, Gender, Marital Status) from the target (Buys).
3. Encode each categorical feature column as integers.
4. Fit a Decision Tree Classifier on the encoded data.
5. Export and render the tree with graphviz; the attribute at the top of the rendered tree is the root node.
Code:
#import packages
import pandas as pd
import graphviz
from sklearn import tree
from sklearn.preprocessing import LabelEncoder

#reading Dataset
dataset = pd.read_csv('E:/Soft Computing Practicals/Lab Practicals Assignment2/dataset.csv')

#separating features (Age, Income, Gender, Marital Status) from the target (Buys)
X = dataset.iloc[:, 1:-1].copy()
Y = dataset.iloc[:, -1].values

#encoding each categorical feature column as integers
le = LabelEncoder()
for i, col in enumerate(X.columns):
    X[col] = le.fit_transform(X[col])
    if (i == 1):
        print("income= High as 0, income= Medium as 2 & income= Low as 1")
    if (i == 2):
        print("gender= Male as 1, gender= Female as 0")
    if (i == 3):
        print("Marital Status= Single as 1, Marital Status= Married as 0")

#training the Decision Tree Classifier (entropy criterion, matching the theory above)
classifier = tree.DecisionTreeClassifier(criterion='entropy')
classifier.fit(X, Y)

#Creating Graph of the trained tree; the attribute at the top is the root node
dot_data = tree.export_graphviz(classifier, out_file=None, feature_names=X.columns,
                                filled=True, rounded=True, special_characters=True)
graph = graphviz.Source(dot_data)
graph.render("tree")
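As a follow-up, a minimal sketch of classifying the test instance from the problem definition with the trained tree. The integer codes are assumptions based on the mappings printed during encoding; the code for Age < 21 depends on how LabelEncoder orders the age categories in the actual dataset:

# test data: [Age < 21, Income = Low, Gender = Female, Marital Status = Married]
# assumed encodings: Age < 21 -> 1 (hypothetical), Income Low -> 1,
# Gender Female -> 0, Marital Status Married -> 0
test = pd.DataFrame([[1, 1, 0, 0]], columns=X.columns)
print(classifier.predict(test))  # the predicted value of Buys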
2.11 Output
(Output: the printed feature encodings and the decision tree rendered by graphviz to tree.pdf.)
2.12 Conclusion
We learned how to create a decision tree for the given dataset and find its root node using a Decision Tree Classifier.