Assignment 1 (Decision Trees (2p) ) : Machine Learning - Sheet 2
Table 2: Training examples (only the genre and watch columns are reproduced here; genre takes the values action, romance, comedy).

Nr. | genre   | watch
----+---------+------
 1  | action  | no
 2  | romance | yes
 3  | action  | yes
 4  | comedy  | yes
 5  | romance | no
(a) (4 p) Consider the five training examples from Table 2. Build the root node of a decision tree from these training examples. To do this, calculate the information gain for all three distinct attributes (genre, main-character, has ninjas) to decide which one would be the best choice for the root node (the one with the largest gain).
The information gain is given as

\[ Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v) \]
The entropy is given as

\[ Entropy(S) = -p_\oplus \log_2 p_\oplus - p_\ominus \log_2 p_\ominus \]

\(S_v\) is the subset of \(S\) for which attribute \(A\) has value \(v\).
Example for attribute main-character:

\[ S_m \leftarrow [1+, 1-], \quad |S_m| = 2 \]
\[ S_f \leftarrow [2+, 1-], \quad |S_f| = 3 \]
Provide all detailed calculations and the result.
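As a sanity check for the hand calculation, the entropy and the information gain can be sketched in Python. Only the genre and watch columns of Table 2 are used here; the counts for the other two attributes are not reproduced on this sheet.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Gain(S, A) = Entropy(S) - sum(|Sv|/|S| * Entropy(Sv))."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [l for val, l in zip(values, labels) if val == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# genre and watch columns from Table 2
genre = ["action", "romance", "action", "comedy", "romance"]
watch = ["no", "yes", "yes", "yes", "no"]

print(round(entropy(watch), 3))                  # 0.971
print(round(information_gain(genre, watch), 3))  # 0.171
```

This only verifies the arithmetic for the genre attribute; the hand calculation for the other two attributes is still required.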
Summer 2015
(b) (2 p) Perform the same calculation as in (a), but use the gain ratio instead of the information gain. Does the result for the root node change?
\[ GainRatio(S, A) = \frac{Gain(S, A)}{SplitInformation(S, A)}, \text{ with} \]

\[ SplitInformation(S, A) = -\sum_{v \in Values(A)} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|} \]
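The split information and gain ratio follow the same pattern; a minimal sketch for the genre attribute (again using only the two columns reproduced in Table 2):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S) = -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_information(values):
    """SplitInformation(S, A) = -sum(|Sv|/|S| * log2(|Sv|/|S|))."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(values, labels):
    """GainRatio(S, A) = Gain(S, A) / SplitInformation(S, A)."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(values):
        subset = [l for val, l in zip(values, labels) if val == v]
        gain -= len(subset) / n * entropy(subset)
    return gain / split_information(values)

genre = ["action", "romance", "action", "comedy", "romance"]
watch = ["no", "yes", "yes", "yes", "no"]

print(round(gain_ratio(genre, watch), 3))  # 0.112
```

Note that split information is just the entropy of the attribute's value distribution, so attributes with many distinct values are penalized.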
(c) (2 p) Let's assume the root node is a node which checks the value of the attribute has ninjas.
Calculate the next level of the decision tree using the information gain.
Programming Exercises
For the following tasks you can use either Matlab or Python. Only use built-in functions where they are explicitly permitted. Basic functions for file handling, array creation and manipulation, as well as plotting, are of course excluded from this regulation. For Python users this covers the use of the following modules:
1. scipy.io for handling .mat files
2. numpy for array creation/manipulation
3. matplotlib.pyplot for plotting
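The permitted modules cover all the data handling needed here. As a small sketch of the .mat round trip with scipy.io (the file name demo.mat is made up for illustration, not a file provided with this sheet):

```python
import os
import tempfile

import numpy as np
import scipy.io

# Write a small array to a .mat file and read it back -- the same
# round trip needed for the data files provided with the exercises.
X = np.arange(6.0).reshape(2, 3)
path = os.path.join(tempfile.mkdtemp(), "demo.mat")
scipy.io.savemat(path, {"X": X})
loaded = scipy.io.loadmat(path)["X"]
print(loaded.shape)  # (2, 3)
```

loadmat returns a dictionary keyed by variable name, so the array is recovered with the same key it was saved under.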
One last piece of advice: do NOT copy code from external sources and submit it as your own. Should a group submit such code, all group members will receive a serious deduction of points.