Lab 08: ID3 - Decision Tree and Linear Regression Objectives

This document introduces the objectives of Lab 08, which are to implement decision tree and linear regression algorithms in Python. For decision trees, it describes entropy, information gain, and provides sample datasets. For linear regression, it gives the slope-intercept formula and provides a sample dataset to fit a line to temperature and yield observations. Students are instructed to write programs for each algorithm that take a data file, output the resulting model, and calculate accuracy on decision trees.

Bahria University CSL-411: Artificial Intelligence Lab

Department of Computer Science Semester 07 (Spring 2018)

Lab 08: ID3 - Decision Tree and Linear Regression


Objectives
Introduce the following techniques and write programs to implement them:
1. Decision Tree
2. Linear Regression

1. Decision Tree

A decision tree is a tree in which each branch node represents a choice between a number of
alternatives, and each leaf node represents a decision. Decision trees are commonly used for gaining
information for the purpose of decision-making. A decision tree starts with a root node, from which
users begin. From this node, each node is split recursively according to the decision tree learning
algorithm. The final result is a decision tree in which each branch represents a possible decision
scenario and its outcome.
Entropy
In information theory, entropy is a measure of the uncertainty about a source of messages. The more
uncertain a receiver is about a source of messages, the more information that receiver will need in
order to know what message has been sent.
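
As a rough illustration, entropy can be computed from the class labels of a dataset. The sketch below assumes the standard Shannon definition, H = -sum(p * log2(p)) over the class proportions p, which the handout does not spell out; the function name entropy is only illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels (assumed definition)."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# The playtennis column of the tennis dataset has 9 "yes" and 5 "no" rows.
labels = ["yes"] * 9 + ["no"] * 5
print(round(entropy(labels), 3))  # 0.94
```

A set whose labels are all the same has entropy 0, and an evenly split two-class set has entropy 1 bit.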

Information Gain
Information gain measures the expected reduction in entropy. To minimize the depth of the decision
tree, we need to select the optimal attribute for splitting each tree node; the attribute that gives
the largest reduction in entropy is the best choice. We define information gain as the expected
reduction in entropy obtained by splitting a decision tree node on a specified attribute.
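
The definition above can be sketched as follows, assuming the usual form Gain(S, A) = Entropy(S) - sum(|Sv| / |S| * Entropy(Sv)) over the values v of attribute A. The helper names and the dict-per-row layout are assumptions made for this example, and only the outlook, wind and playtennis columns of the tennis data are included to keep it short.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Expected entropy reduction from splitting rows (a list of dicts) on attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    # Weighted entropy that remains after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

# (outlook, wind, playtennis) for the 14 tennis rows, in handout order.
raw = [("sunny", "weak", "no"), ("sunny", "strong", "no"), ("overcast", "weak", "yes"),
       ("rain", "weak", "yes"), ("rain", "weak", "yes"), ("rain", "strong", "no"),
       ("overcast", "strong", "yes"), ("sunny", "weak", "no"), ("sunny", "weak", "yes"),
       ("rain", "weak", "yes"), ("sunny", "strong", "yes"), ("overcast", "strong", "yes"),
       ("overcast", "weak", "yes"), ("rain", "strong", "no")]
rows = [dict(zip(("outlook", "wind", "playtennis"), r)) for r in raw]

for attribute in ("outlook", "wind"):
    print(attribute, round(information_gain(rows, attribute, "playtennis"), 3))
```

On this data outlook has a noticeably higher gain than wind, so outlook would be preferred as the split attribute.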

Write a program in Python to implement the ID3 decision tree algorithm. You should read in a tab
delimited dataset, and output to the screen your decision tree and the training set accuracy in some
readable format.

Here are two sample datasets you can try: tennis.txt


BU, CS Department 2/4 Semester 7 (Spring 2018)
CSL-411: AI Lab Lab 08: ID3 &Lin Reg

outlook    temperature   humidity   wind     playtennis
sunny      hot           high       weak     no
sunny      hot           high       strong   no
overcast   hot           high       weak     yes
rain       mild          high       weak     yes
rain       cool          normal     weak     yes
rain       cool          normal     strong   no
overcast   cool          normal     strong   yes
sunny      mild          high       weak     no
sunny      cool          normal     weak     yes
rain       mild          normal     weak     yes
sunny      mild          normal     strong   yes
overcast   mild          high       strong   yes
overcast   hot           normal     weak     yes
rain       mild          high       strong   no

Your decision tree program should be able to work on any dataset (don't hardcode in attributes or
values).

In particular, it should be able to run on the following datasets:

• restaurant data:
  o restaurant.csv: restaurant data
  o restaurant.txt: restaurant metadata
• zoo data:
  o zoo.csv: zoo data
  o zoo.txt: zoo metadata
• credit screening data:
  o crx.csv: credit screening data
  o crx.txt: credit screening metadata

(For testing purposes, you might also want to work with the tennis example you solved by hand
above. Here is the metadata: tennis.txt, and here is the data: tennis.csv.) When you run your
program, it should take a file name containing the data. For example:
tennis.txt
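
One possible way to read such a file is sketched below. It assumes the first line holds the attribute names and the last column is the class to predict; the handout does not state either, and load_dataset and main are hypothetical names.

```python
import csv

def load_dataset(path, delimiter="\t"):
    """Return (attribute_names, target_name, rows), with each row as a dict.

    Assumes a header line and that the last column is the class label.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        rows = list(reader)
        names = list(reader.fieldnames)
    return names[:-1], names[-1], rows

def main(argv):
    """Entry point: argv[1] is the data file name, e.g. tennis.txt."""
    attributes, target, rows = load_dataset(argv[1])
    print(f"{len(rows)} rows, attributes: {attributes}, target: {target}")
```

Calling main(sys.argv) from the bottom of a script would let the program be run as python id3.py tennis.txt, where id3.py is a placeholder script name. Passing delimiter="," would handle the .csv files.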
For output, you can choose how to draw the tree so long as it is clear what the tree is. For example:

outlook = sunny
| humidity = high: no
| humidity = normal: yes
outlook = overcast: yes
outlook = rain
| wind = strong: no
| wind = weak: yes
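
A minimal sketch of how ID3 might build and print such a tree is given below. It is not a reference solution: the nested-tuple tree representation, the helper names, and the majority-class fallback are all assumptions, and the tennis data is inlined so the example runs without a file.

```python
import csv
import io
import math
from collections import Counter

# Tennis data from this handout, inlined as tab-delimited text.
TENNIS_TSV = """outlook\ttemperature\thumidity\twind\tplaytennis
sunny\thot\thigh\tweak\tno
sunny\thot\thigh\tstrong\tno
overcast\thot\thigh\tweak\tyes
rain\tmild\thigh\tweak\tyes
rain\tcool\tnormal\tweak\tyes
rain\tcool\tnormal\tstrong\tno
overcast\tcool\tnormal\tstrong\tyes
sunny\tmild\thigh\tweak\tno
sunny\tcool\tnormal\tweak\tyes
rain\tmild\tnormal\tweak\tyes
sunny\tmild\tnormal\tstrong\tyes
overcast\tmild\thigh\tstrong\tyes
overcast\thot\tnormal\tweak\tyes
rain\tmild\thigh\tstrong\tno
"""

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def gain(rows, attr, target):
    remainder = 0.0
    for value in dict.fromkeys(r[attr] for r in rows):
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def id3(rows, attrs, target):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]  # pure node, or majority class
    best = max(attrs, key=lambda a: gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in dict.fromkeys(r[best] for r in rows):  # values in first-seen order
        branches[value] = id3([r for r in rows if r[best] == value], rest, target)
    return (best, branches)

def show(tree, depth=0):
    if not isinstance(tree, tuple):  # the whole tree is a single leaf
        print(tree)
        return
    attr, branches = tree
    for value, child in branches.items():
        prefix = "| " * depth + f"{attr} = {value}"
        if isinstance(child, tuple):
            print(prefix)
            show(child, depth + 1)
        else:
            print(f"{prefix}: {child}")

def classify(tree, row, default=None):
    while isinstance(tree, tuple):
        attr, branches = tree
        if row[attr] not in branches:  # attribute value never seen in training
            return default
        tree = branches[row[attr]]
    return tree

reader = csv.DictReader(io.StringIO(TENNIS_TSV), delimiter="\t")
rows = list(reader)
*attributes, target = reader.fieldnames  # assumes the last column is the class
tree = id3(rows, attributes, target)
show(tree)
accuracy = sum(classify(tree, r) == r[target] for r in rows) / len(rows)
print(f"training accuracy: {accuracy:.2%}")
```

Nothing in the sketch refers to a specific attribute or value name, so the same functions should work on the other datasets once they are loaded into the same list-of-dicts form.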

2. Linear Regression

Linear regression aims at fitting a straight line to a set of observations.

Formulas

The slope-intercept form of the linear regression prediction equation is Ŷ = a + bX

Where:
Ŷ = predicted score
b = slope of the line
a = Y intercept

Parameter estimators
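
The estimator formulas themselves are missing from this copy of the handout. Assuming ordinary least squares is intended, the standard estimators for the slope b and the intercept a, with X̄ and Ȳ denoting the sample means of the n observations, are:

```latex
b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
a = \bar{Y} - b\,\bar{X}
```

With these values the fitted line Ŷ = a + bX always passes through the point (X̄, Ȳ).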

Write a program in Python to implement the linear regression algorithm. You should read in a tab
delimited dataset, and output to the screen your final linear regression expression.

Observation Number   Y (Yield)   X (Temperature)
1                    122         50
2                    118         53
3                    128         54
4                    121         55
5                    125         56
6                    136         59
7                    144         62
8                    142         65
9                    149         67
10                   161         71
11                   167         72
12                   168         74
13                   162         75
14                   171         76
15                   175         79
16                   182         80
17                   180         82
18                   183         85
19                   188         87
20                   200         90
21                   194         93
22                   206         94
23                   207         95
24                   210         97
25                   219         100

Predict the crop yield when the temperature is 32.
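
One way to approach this is sketched below, assuming an ordinary least-squares fit of Ŷ = a + bX. The data is inlined from the table above instead of being read from a tab-delimited file, and fit_line is a hypothetical helper name.

```python
# (X temperature, Y yield) pairs copied from the observation table.
DATA = [(50, 122), (53, 118), (54, 128), (55, 121), (56, 125),
        (59, 136), (62, 144), (65, 142), (67, 149), (71, 161),
        (72, 167), (74, 168), (75, 162), (76, 171), (79, 175),
        (80, 182), (82, 180), (85, 183), (87, 188), (90, 200),
        (93, 194), (94, 206), (95, 207), (97, 210), (100, 219)]

def fit_line(points):
    """Return (a, b) for the least-squares line Y = a + b * X."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

a, b = fit_line(DATA)
prediction = a + b * 32
print(f"Y = {a:.3f} + {b:.3f} * X")
print(f"Predicted yield at X = 32: {prediction:.2f}")
```

Note that X = 32 lies below the smallest observed temperature (50), so the prediction is an extrapolation of the fitted line.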
