Mid-Term1 - Exam2019

This document contains a midterm exam for a data mining course. It has two questions. Question 1 asks students to calculate various entropies for a decision tree dataset and draw the full decision tree. It also asks about overfitting. Question 2 asks students to calculate the best splitting position for an attribute and contains a perceptron classification problem to solve.

MENOUFIA UNIVERSITY                          Name: ……….………………..
Faculty of Computer & Information            Section: ……….………………

Subject: Data Mining, 4th Year, 1st Term     Examiner: Dr. Hayam Mousa
Exam: Midterm    Date: 19/11/2019    Marks: 15    Time: 60 Min    Pages: 2
Question 1
A) We will use the dataset below to learn a decision tree which predicts whether people pass machine learning (Yes or No), based on their previous GPA (High, Medium, or Low) and whether or not they studied.

GPA  Studied  Passed
L    F        F
L    T        T
M    F        F
M    T        T
H    F        T
H    T        T

According to this dataset:
a) What is the entropy H(Passed)?
b) What is the entropy H(Passed | GPA)?
c) What is the entropy H(Passed | Studied)?
d) Draw the full decision tree that would be learned for this dataset.
e) What causes over-fitting in a decision tree classifier? Does over-fitting increase with the number of training examples? Explain your answer.
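The entropy quantities in parts a) to c) can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms and the usual definition H(Y | X) = sum over values v of P(X = v) * H(Y | X = v); the function names are illustrative and do not come from the exam.

```python
from collections import Counter
from math import log2

# Dataset from Question 1A as (GPA, Studied, Passed) tuples.
rows = [
    ("L", "F", "F"),
    ("L", "T", "T"),
    ("M", "F", "F"),
    ("M", "T", "T"),
    ("H", "F", "T"),
    ("H", "T", "T"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def conditional_entropy(rows, attr_index, target_index=2):
    """H(target | attribute): entropy of each group, weighted by group size."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[target_index])
    n = len(rows)
    return sum(len(g) / n * entropy(g) for g in groups.values())

h_passed = entropy([row[2] for row in rows])
h_given_gpa = conditional_entropy(rows, 0)
h_given_studied = conditional_entropy(rows, 1)
print(round(h_passed, 4), round(h_given_gpa, 4), round(h_given_studied, 4))
```

The attribute with the lower conditional entropy gives the higher information gain, which is the usual criterion for choosing the root split in part d).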

B) Consider the following dataset. Calculate the best splitting position for the Age attribute.

ID  Age  Car Type  Class Label
0   23   Family    High
1   17   Sport     High
2   43   Sport     High
3   68   Family    Low
4   32   Truck     Low
5   20   Family    High
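One common way to find a split position for a numeric attribute is to sort the values, try the midpoint between each pair of adjacent distinct values, and keep the threshold with the lowest weighted impurity. The sketch below assumes the Gini index as the impurity measure; the exam does not say which measure is expected, and an entropy-based criterion could be substituted in `gini`. The function names are illustrative.

```python
from collections import Counter

# Age and Class Label columns from Question 1B.
ages = [23, 17, 43, 68, 32, 20]
labels = ["High", "High", "High", "Low", "Low", "High"]

def gini(group):
    """Gini impurity of a list of class labels (assumed impurity measure)."""
    n = len(group)
    return 1.0 - sum((c / n) ** 2 for c in Counter(group).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split value <= threshold."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

threshold, score = best_split(ages, labels)
print(threshold, round(score, 4))
```

Using midpoints between adjacent sorted values is a convention; some textbooks instead report the split as one of the observed values.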
Question 2:
A) Consider the three perceptrons below, which correspond to classes A, B, and C respectively. For a given input x, the prediction of the group is the class whose perceptron has the highest value of $\sum_i w_i x_i$.
For the following test set, find the prediction of this set of perceptrons on each example, and create the corresponding confusion matrix. The actual class label of each example is given in the table below.

Example X1 X2 X3 Label
1 0 1 1 A
2 1 0 1 C
3 0 0 0 C
4 1 1 1 B
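The procedure (score each class, predict the class with the largest weighted sum, then tally actual against predicted labels) can be sketched as follows. The perceptron weights are not reproduced in this text, so the `weights` values below are hypothetical placeholders chosen only to make the sketch runnable; they are not the exam's weights and the resulting matrix is not the exam's answer. Breaking ties in favour of the first class is also an assumption.

```python
# HYPOTHETICAL weights: placeholders only, NOT the values from the exam figure.
weights = {
    "A": (0, 1, 1),
    "B": (1, 1, 0),
    "C": (1, 0, 1),
}

# Test set from the table: ((X1, X2, X3), actual label).
test_set = [
    ((0, 1, 1), "A"),
    ((1, 0, 1), "C"),
    ((0, 0, 0), "C"),
    ((1, 1, 1), "B"),
]

def predict(x, weights):
    """Return the class whose perceptron has the highest sum of w_i * x_i."""
    scores = {cls: sum(w * xi for w, xi in zip(ws, x)) for cls, ws in weights.items()}
    # max() keeps the first maximum, so ties go to the earliest class in the dict.
    return max(scores, key=scores.get)

def confusion_matrix(examples, weights):
    """Build matrix[actual][predicted] counts over the given examples."""
    classes = sorted(weights)
    matrix = {actual: {pred: 0 for pred in classes} for actual in classes}
    for x, actual in examples:
        matrix[actual][predict(x, weights)] += 1
    return matrix

matrix = confusion_matrix(test_set, weights)
for actual, row in matrix.items():
    print(actual, row)
```

Replacing `weights` with the values from the exam figure gives the intended predictions. Note that without a bias term, example 3 (all zeros) scores 0 for every class, so its prediction depends entirely on the tie-breaking rule.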

B) Given the vectors x = (0, 1, 0, 1) and y = (1, 0, 1, 0), calculate the cosine, correlation, Euclidean, and Jaccard similarity and distance measures.
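These measures can be computed directly from their definitions. The sketch below assumes the standard formulas: cosine similarity as the normalised dot product, Pearson correlation on mean-centred vectors, Euclidean distance as the L2 norm of the difference, and Jaccard similarity on binary vectors as (positions where both are 1) / (positions where at least one is 1). Taking each distance as 1 minus the similarity is one common convention and is an assumption here.

```python
from math import sqrt

x = (0, 1, 0, 1)
y = (1, 0, 1, 0)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (sqrt(dot(a, a)) * sqrt(dot(b, b)))

def pearson_correlation(a, b):
    """Cosine similarity of the mean-centred vectors."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ca = [p - mean_a for p in a]
    cb = [q - mean_b for q in b]
    return dot(ca, cb) / (sqrt(dot(ca, ca)) * sqrt(dot(cb, cb)))

def euclidean_distance(a, b):
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def jaccard_similarity(a, b):
    """Binary Jaccard; undefined if both vectors are all zeros (not the case here)."""
    both = sum(1 for p, q in zip(a, b) if p == 1 and q == 1)
    either = sum(1 for p, q in zip(a, b) if p == 1 or q == 1)
    return both / either

cos_sim = cosine_similarity(x, y)
corr = pearson_correlation(x, y)
euclid = euclidean_distance(x, y)
jaccard = jaccard_similarity(x, y)

print("cosine similarity:", cos_sim, "distance:", 1 - cos_sim)
print("correlation:", corr, "distance:", 1 - corr)
print("Euclidean distance:", euclid)
print("Jaccard similarity:", jaccard, "distance:", 1 - jaccard)
```

Because x and y never share a 1 in the same position, the dot product is zero, which drives both the cosine and Jaccard similarities to zero, while the correlation shows the two vectors are exact opposites.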
