Assignment-1 of Machine Learning on Decision Tree
Submitted To:
Submitted by:
Algorithm:
ID3 (Examples, Target_Attribute, Attributes)
Create a root node Root for the tree.
If all examples are positive, Return the single-node tree Root, with label = +.
If all examples are negative, Return the single-node tree Root, with label = -.
If the set of predicting attributes is empty, then Return the single-node tree Root, with label = most
common value of the target attribute in the examples.
Otherwise Begin
A = the attribute that best classifies the examples (the one with the highest information gain)
The decision attribute for Root = A
For each possible value, vi, of A:
•Add a new tree branch below Root, corresponding to the test A = vi.
•Let Examples(vi) be the subset of examples that have the value vi for A.
•If Examples(vi) is empty:
–Then below this new branch add a leaf node with label = most common target
value in the examples
•Else below this new branch add the subtree ID3 (Examples(vi), Target_Attribute,
Attributes – {A})
End
Return Root
ID3 algorithm
I have taken a database of responses from 62 people (attached with the mail). The questionnaire consists of seven questions regarding the facilities provided by an employer, to be considered before accepting any job offer; these questions serve as the attributes of the dataset.
At every split, the attribute with the highest information gain becomes the root node of the corresponding subtree.
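For a binary ('yes'/'no') target, entropy and information gain can be computed as follows. This is a minimal illustrative sketch; the helper names BinaryEntropy and InformationGain are mine and are not taken from the attached program:

% Entropy (in bits) of a 0/1 label vector.
function H = BinaryEntropy(labels)
    p = mean(labels);            % fraction of positive examples
    terms = [p, 1 - p];
    terms = terms(terms > 0);    % treat 0*log2(0) as 0
    H = -sum(terms .* log2(terms));
end

% Information gain from splitting the labels on a 0/1 attribute column:
% the parent entropy minus the weighted entropy of the two subsets.
function G = InformationGain(attributeColumn, labels)
    onIdx = (attributeColumn == 1);
    G = BinaryEntropy(labels);
    if any(onIdx)
        G = G - mean(onIdx) * BinaryEntropy(labels(onIdx));
    end
    if any(~onIdx)
        G = G - mean(~onIdx) * BinaryEntropy(labels(~onIdx));
    end
end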
In the program, a decision tree is created with the ID3 algorithm, and its performance is compared with that of a baseline that predicts from the prior probability of true and false (i.e., the majority class obtained by counting the number of true and false decisions in the training set).
% Print the attribute names as column headers
for ii = 1:numAttributes
    fprintf('%s\t', attributes{ii});
end
fprintf('\n');
% Print each training example as a row of yes/no values
for ii = 1:size_of_traingset
    for jj = 1:numAttributes
        if (trainingSet(ii, jj))
            fprintf('%s\t', 'yes');
        else
            fprintf('%s\t', 'no');
        end
    end
    fprintf('\n');
end
% Estimate the majority class of the training set, to be used as the
% prior-probability baseline prediction (yes wins ties)
if (sum(trainingSet(:, numAttributes)) >= size_of_traingset / 2)
    expectedPrior = 'yes';
else
    expectedPrior = 'no';
end
% Construct a decision tree on the training set using the ID3 algorithm
activeAttributes = ones(1, length(attributes) - 1);
new_attributes = attributes(1:length(attributes)-1);
tree = ID3(trainingSet, attributes, activeAttributes);
% Record the true label (column 2) and the baseline's prediction
% (column 1) for test example k
ExpectedPrior_Classifications(k, 2) = testingSet(k, numAttributes);
if (strcmp(expectedPrior, 'yes'))
    ExpectedPrior_Classifications(k, 1) = 1;
else
    ExpectedPrior_Classifications(k, 1) = 0;
end
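The listing omits how the per-run accuracy arrays below are filled. A plausible sketch, assuming two-column classification matrices (column 1 = predicted label, column 2 = true label), an ID3_Classifications matrix gathered in the same loop via ClassifyByTree, and a hypothetical run index runIdx:

% Per-run accuracy (%) of each classifier: fraction of test examples
% whose predicted label matches the true label.
% runIdx and ID3_Classifications are assumptions, not from the listing.
numTest = size(testingSet, 1);
ID3_Percentages(runIdx) = 100 * ...
    sum(ID3_Classifications(:, 1) == ID3_Classifications(:, 2)) / numTest;
ExpectedPrior_Percentages(runIdx) = 100 * ...
    sum(ExpectedPrior_Classifications(:, 1) == ExpectedPrior_Classifications(:, 2)) / numTest;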
% Average accuracy of each method across all runs
meanID3 = round(mean(ID3_Percentages));
meanPrior = round(mean(ExpectedPrior_Percentages));
Part 2: ID3: This function calculates entropy and information gain and uses them to build the decision tree.
function [tree] = ID3(examples, attributes, activeAttributes)
if (isempty(examples))
    error('Must provide examples');
end
% Constants
numberAttributes = length(activeAttributes);
numberExamples = size(examples, 1);
% Create the tree node
tree = struct('value', 'null', 'left', 'null', 'right', 'null');
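The remainder of the function is not shown in the listing. A sketch of how the body presumably continues, reusing the illustrative InformationGain helper defined earlier (the exact code in the attached program may differ):

% -- assumed continuation of ID3; the original listing stops above --
lastCol = examples(:, end);                   % target column (0/1)
if (all(lastCol == 1))                        % all positive: 'yes' leaf
    tree.value = 'yes'; return
elseif (all(lastCol == 0))                    % all negative: 'no' leaf
    tree.value = 'no'; return
elseif (~any(activeAttributes))               % no attributes left: majority leaf
    if (mean(lastCol) >= 0.5), tree.value = 'yes'; else, tree.value = 'no'; end
    return
end
% Choose the active attribute with the highest information gain
bestGain = -1; bestAttribute = 0;
for a = 1:numberAttributes
    if (activeAttributes(a))
        g = InformationGain(examples(:, a), lastCol);
        if (g > bestGain), bestGain = g; bestAttribute = a; end
    end
end
tree.value = attributes{bestAttribute};
activeAttributes(bestAttribute) = 0;          % attribute is now consumed
% Split the examples and recurse; empty branches get a majority-label leaf
majorityLeaf = struct('value', 'no', 'left', 'null', 'right', 'null');
if (mean(lastCol) >= 0.5), majorityLeaf.value = 'yes'; end
noRows  = examples(examples(:, bestAttribute) == 0, :);
yesRows = examples(examples(:, bestAttribute) == 1, :);
if isempty(noRows),  tree.left  = majorityLeaf;
else,  tree.left  = ID3(noRows, attributes, activeAttributes);  end
if isempty(yesRows), tree.right = majorityLeaf;
else,  tree.right = ID3(yesRows, attributes, activeAttributes); end
end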
Part 3: ClassifyByTree: Once the tree has been created from the entropy and information-gain values, this function classifies a given example by traversing that tree.
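The body of this function is not included in the listing. A minimal sketch, assuming the struct layout from Part 2 ('left' is the no-branch, 'right' is the yes-branch) and a 0/1 example row:

function [classification] = ClassifyByTree(tree, attributes, example)
% Leaf node: return the stored class label as 1 ('yes') or 0 ('no')
if (strcmp(tree.value, 'yes'))
    classification = 1;
    return
elseif (strcmp(tree.value, 'no'))
    classification = 0;
    return
end
% Internal node: look up which attribute this node tests ...
index = find(strcmp(attributes, tree.value));
% ... and follow the branch matching the example's value for it
if (example(index))
    classification = ClassifyByTree(tree.right, attributes, example);
else
    classification = ClassifyByTree(tree.left, attributes, example);
end
end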
Part 4: PrintTree: This function prints the tree in the command window.
function [] = PrintTree(tree, parent)
% Print current node
if (strcmp(tree.value, 'yes'))
    fprintf('parent: %s\tyes\n', parent);
    return
elseif (strcmp(tree.value, 'no'))
    fprintf('parent: %s\tno\n', parent);
    return
else
    % Current node is an attribute splitter
    fprintf('parent: %s\tattribute: %s\tnoChild: %s\tyesChild: %s\n', ...
        parent, tree.value, tree.left.value, tree.right.value);
    % Recurse into both branches (assumed; needed to print the whole tree)
    PrintTree(tree.left, tree.value);
    PrintTree(tree.right, tree.value);
end
end
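A typical top-level call, with an illustrative label for the root's parent:

% Print the whole decision tree built by ID3
PrintTree(tree, 'root');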
Dataset: