
Questions and Exercises

Course: Machine Learning

Chapter 4
(Bayes Classifier)

4.1 A study at a university found that 15% of undergraduate students smoke and 23% of graduate
students smoke. If 1/5 of the students at the university are graduate students and the rest are
undergraduate students, what is the probability that a student who smokes is a graduate student?

4.2 (True/false) If P(A|B) = P(A) then P(A,B) = P(A).P(B).

4.3 State the difference between k-nearest neighbor algorithm and Naïve Bayes in classification.

4.4 State the assumption on the characteristic of the dataset which allows us to apply Naïve
Bayes classifier.

4.5 Consider the following data set:

Feature 1 Feature 2 Feature 3 Class


0 0 0 0
1 0 1 1
1 0 0 0
1 1 1 1
0 1 1 1
0 1 1 0

If we have a test pattern P with Feature 1 = 0, Feature 2 = 0, and Feature 3 = 1, classify this
pattern using the Naïve Bayes classifier.
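
A minimal sketch of how the Naïve Bayes computation could be carried out on this table, assuming maximum-likelihood estimates with no smoothing:

```python
# Hedged sketch of Naive Bayes classification for Exercise 4.5
# (maximum-likelihood estimates, no smoothing -- an assumption).
from collections import Counter

data = [
    ((0, 0, 0), 0),
    ((1, 0, 1), 1),
    ((1, 0, 0), 0),
    ((1, 1, 1), 1),
    ((0, 1, 1), 1),
    ((0, 1, 1), 0),
]
test = (0, 0, 1)

classes = Counter(label for _, label in data)
n = len(data)

scores = {}
for c, count_c in classes.items():
    # Prior P(C = c).
    score = count_c / n
    # Multiply by P(feature_i = test_i | C = c) for each feature.
    for i, value in enumerate(test):
        match = sum(1 for x, label in data if label == c and x[i] == value)
        score *= match / count_c
    scores[c] = score

print(scores)
print("predicted class:", max(scores, key=scores.get))
```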

4.6. Given the dataset of Exercise 4.5. Since the attributes are not continuous, we apply the
following method to calculate the distance between two patterns with categorical attributes.
Given two patterns X and Y, each consisting of m categorical attributes, the distance between X
and Y is the total number of differences between the corresponding attribute values of the two
patterns. The smaller the total number of differences, the more similar the two patterns. That means:

d(X, Y) = sum_{i=1}^{m} δ(x_i, y_i)

where δ(x_i, y_i) = 0 if x_i = y_i, and δ(x_i, y_i) = 1 otherwise.

Using this distance measure, apply the 1-nearest neighbor algorithm to classify the test pattern
P = (0, 0, 1), based on the dataset given in Exercise 4.5.
Compare the results of the two classification methods: the 1-nearest neighbor algorithm and Naïve
Bayes (Exercise 4.5).
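
A minimal sketch of the distance measure and the 1-nearest neighbor rule, assuming ties are broken by the order in which patterns appear in the table:

```python
# Hedged sketch of the distance measure and 1-NN for Exercise 4.6,
# using the dataset of Exercise 4.5. Ties broken by order of appearance.
data = [
    ((0, 0, 0), 0),
    ((1, 0, 1), 1),
    ((1, 0, 0), 0),
    ((1, 1, 1), 1),
    ((0, 1, 1), 1),
    ((0, 1, 1), 0),
]
test = (0, 0, 1)

def distance(x, y):
    # Number of attribute positions where the two patterns differ.
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

# 1-nearest neighbor: the test pattern takes the class of the closest pattern.
nearest = min(data, key=lambda item: distance(item[0], test))
print("nearest pattern:", nearest[0], "predicted class:", nearest[1])
```
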
Chapter 5
(Decision Trees)

5.1. Determine the entropy impurity for the following distributions.


a) The dataset has 1/2 of patterns belonging to the first class, 1/4 of patterns belonging to the
second class, 1/8 of patterns belonging to the third class, 1/16 of patterns belonging to the fourth
class, and 1/16 of patterns belonging to the fifth class.
b) The dataset consists of five classes and each class has 1/5 of patterns.
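
A minimal sketch of the entropy impurity computation (base-2 logarithm assumed, as is usual for decision trees):

```python
# Hedged sketch: entropy impurity of a class distribution (log base 2 assumed).
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Distributions of parts (a) and (b) of Exercise 5.1.
print(entropy([1/2, 1/4, 1/8, 1/16, 1/16]))
print(entropy([1/5] * 5))
```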

5.2. Consider the following data set for a binary classification problem. Each pattern has two
binary attributes and one class label (+ or -).

A B Class label
T F +
T T +
T T +
T F -
T T +
F F -
F F -
F F -
T T -
T F -

Use information gain to determine the splitting attribute. Which attribute is selected as the
splitting attribute at the root node of the decision tree for this data set?
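
A minimal sketch of how the information gain of attributes A and B could be computed for this table:

```python
# Hedged sketch: information gain of attributes A and B for Exercise 5.2.
import math
from collections import Counter

# (A, B, class label) for the ten patterns in the table above.
rows = [
    ('T', 'F', '+'), ('T', 'T', '+'), ('T', 'T', '+'), ('T', 'F', '-'),
    ('T', 'T', '+'), ('F', 'F', '-'), ('F', 'F', '-'), ('F', 'F', '-'),
    ('T', 'T', '-'), ('T', 'F', '-'),
]

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def information_gain(rows, attr_index):
    base = entropy([r[-1] for r in rows])
    # Weighted entropy of the subsets after splitting on the attribute.
    remainder = 0.0
    for v in set(r[attr_index] for r in rows):
        subset = [r[-1] for r in rows if r[attr_index] == v]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

print("gain(A):", information_gain(rows, 0))
print("gain(B):", information_gain(rows, 1))
```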

5.3. Consider the following Weather dataset for a binary classification problem. Each pattern has
four discrete attributes and one class label (Yes or No).

Outlook Temperature Humidity Windy Play Tennis


Sunny Hot High False No
Sunny Hot High False No
Overcast Hot High False Yes
Rainy Mild High False Yes
Rainy Cool Normal False Yes
Rainy Cool Normal True No
Overcast Cool Normal True Yes
Sunny Warm High False No
Sunny Cool Normal False Yes
Rainy Warm Normal False Yes
Sunny Warm Normal True Yes
Overcast Warm High True Yes
Overcast Hot Normal False Yes
Rainy Warm High True No

Use information gain to determine the splitting attribute. Which attribute is selected as the
splitting attribute at the root node of the decision tree for this data set?

5.4 (True/false) The depth of a learned decision tree can be larger than the number of training
examples used to create the tree.

Chapter 6
(Clustering)

6.1 State the difference between supervised learning (classification) and unsupervised learning
(clustering).

6.2 Consider the following 10 patterns:


X1 = (1, 1), X2 = (6, 1), X3 = (2, 1), X4 = (6, 7), X5 = (1, 2), X6 = (7, 1), X7 = (7, 7), X8 = (2, 2), X9
= (6, 2), X10 = (7, 6)
Obtain the distance matrix using the Euclidean distance as the distance between two patterns.
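
A minimal sketch of how the 10×10 Euclidean distance matrix could be computed:

```python
# Hedged sketch: pairwise Euclidean distance matrix for Exercise 6.2.
import math

patterns = [(1, 1), (6, 1), (2, 1), (6, 7), (1, 2),
            (7, 1), (7, 7), (2, 2), (6, 2), (7, 6)]

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

matrix = [[euclidean(p, q) for q in patterns] for p in patterns]
for row in matrix:
    print(["%.2f" % d for d in row])
```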

6.3 If there is a set of n patterns and it is required to cluster these patterns to form two clusters,
how many such partitions will there be?

6.4. Given the cluster of 5 patterns:


X1 = (1, 1), X2 = (1, 2), X3 = (2, 1), X4 = (1.6, 1.4), X5 = (2, 2)
Show that the medoid of the cluster is (1.6, 1.4).
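
A minimal sketch, assuming the medoid is defined as the pattern with the smallest total Euclidean distance to the other patterns in the cluster:

```python
# Hedged sketch for Exercise 6.4: the medoid is taken to be the pattern
# in the cluster with the smallest total Euclidean distance to all other patterns.
import math

cluster = [(1, 1), (1, 2), (2, 1), (1.6, 1.4), (2, 2)]

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def total_distance(p):
    return sum(euclidean(p, q) for q in cluster)

medoid = min(cluster, key=total_distance)
print("medoid:", medoid)
```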

6.5 In agglomerative hierarchical clustering, how do we select, among the current clusters, the
most suitable pair of clusters to be merged?

6.6 In divisive hierarchical clustering,

a. How do we find the best way to split a cluster into two clusters?
b. How do we select, among the current clusters, the most suitable cluster to be split?

6.7 State the computational complexity of the k-means algorithm.

6.8 State the strong points and weak points of the k-means algorithm.

6.9. State the computational complexity of agglomerative hierarchical clustering and divisive
hierarchical clustering.

6.10 Consider the two dimensional data set given below:


(1, 1), (1, 2), (2, 1), (2, 1.5), (3, 2), (4, 1.5), (4, 2), (5, 1.5), (4.5, 2), (4, 4), (4.5, 4), (4.5, 5), (4, 5),
(5, 5)
Use the k-means algorithm to cluster these patterns with k = 3.
6.11 Given a set of 2-dimensional patterns: X1 = (1, 3), X2 = (1.5, 3.2), X3 = (1.3, 2.8), X4
= (3, 1). Apply k-means with k = 2 to cluster this dataset. Assume that at a certain iteration, the
dataset is grouped into 2 clusters as follows: the first cluster consists of X1, and the second
cluster consists of X2, X3, X4. Perform the next iteration, which consists of two steps:
recalculating the centroids and assigning the patterns to the clusters.
Note: Euclidean distance is used in the k-means algorithm.
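
A minimal sketch of the requested iteration (centroid recalculation followed by reassignment):

```python
# Hedged sketch of one k-means iteration for Exercise 6.11:
# recompute the two centroids, then reassign each pattern to the nearest centroid.
import math

patterns = [(1, 3), (1.5, 3.2), (1.3, 2.8), (3, 1)]   # X1..X4
clusters = [[0], [1, 2, 3]]                            # current clusters: {X1}, {X2, X3, X4}

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def centroid(indices):
    pts = [patterns[i] for i in indices]
    return tuple(sum(coord) / len(pts) for coord in zip(*pts))

# Step 1: recalculate the centroids of the current clusters.
centroids = [centroid(c) for c in clusters]
print("centroids:", centroids)

# Step 2: reassign each pattern to the cluster of its nearest centroid.
assignment = [min(range(2), key=lambda j: euclidean(p, centroids[j])) for p in patterns]
print("new assignment:", assignment)
```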

6.12 Consider the two dimensional data set given below:


(1, 1), (1, 2), (2, 1), (2, 1.5), (3, 2), (4, 1.5), (4, 2), (5, 1.5), (4.5, 2), (4, 4), (4.5, 4), (4.5, 5), (4, 5),
(5, 5)
Use agglomerative hierarchical clustering with single-link and agglomerative hierarchical
clustering with complete-link to cluster the dataset into 4 clusters.
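
A minimal sketch of the requested clusterings, assuming SciPy is available:

```python
# Hedged sketch for Exercise 6.12, assuming SciPy is available:
# single-link and complete-link agglomerative clustering cut at 4 clusters.
from scipy.cluster.hierarchy import linkage, fcluster

data = [(1, 1), (1, 2), (2, 1), (2, 1.5), (3, 2), (4, 1.5), (4, 2),
        (5, 1.5), (4.5, 2), (4, 4), (4.5, 4), (4.5, 5), (4, 5), (5, 5)]

for method in ("single", "complete"):
    Z = linkage(data, method=method)          # Euclidean distance by default
    labels = fcluster(Z, t=4, criterion="maxclust")
    print(method, labels.tolist())
```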

6.13 State the similarity between two clustering algorithms: k-means and fuzzy-c-means.

6.14 Given a set of 2-dimensional patterns: X1 = (1, 6), X2 = (2, 5), X3 = (3, 8), X4 = (4, 4),
X5 = (5, 7), X6 = (6, 9). Apply fuzzy c-means with k = 2 to cluster this dataset. Assume that at a
certain iteration, the dataset is grouped into 2 clusters with the membership weights as follows.

            X1   X2   X3   X4   X5   X6
Cluster c1  0.8  0.9  0.7  0.3  0.5  0.2
Cluster c2  0.2  0.1  0.3  0.7  0.5  0.8

Perform the next iteration, which consists of two steps: recalculating the centroids and
assigning the membership weights for each pattern.
Note: Euclidean distance is used in the fuzzy c-means algorithm.
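
A minimal sketch of one fuzzy c-means iteration, assuming a fuzzifier m = 2 (the exercise does not specify it) and the standard centroid and membership update formulas:

```python
# Hedged sketch of one fuzzy c-means iteration for Exercise 6.14,
# assuming fuzzifier m = 2 and the standard update formulas.
import math

patterns = [(1, 6), (2, 5), (3, 8), (4, 4), (5, 7), (6, 9)]   # X1..X6
u = [
    [0.8, 0.9, 0.7, 0.3, 0.5, 0.2],   # memberships in cluster c1
    [0.2, 0.1, 0.3, 0.7, 0.5, 0.8],   # memberships in cluster c2
]
m = 2  # assumed fuzzifier

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Step 1: recalculate centroids as membership-weighted means.
centroids = []
for j in range(2):
    weights = [u[j][i] ** m for i in range(len(patterns))]
    total = sum(weights)
    centroids.append(tuple(
        sum(w * p[d] for w, p in zip(weights, patterns)) / total
        for d in range(2)))
print("centroids:", centroids)

# Step 2: update the membership weights of each pattern.
new_u = [[0.0] * len(patterns) for _ in range(2)]
for i, p in enumerate(patterns):
    dist = [euclidean(p, c) for c in centroids]
    for j in range(2):
        new_u[j][i] = 1.0 / sum((dist[j] / dist[k]) ** (2 / (m - 1)) for k in range(2))
print("memberships:", new_u)
```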

6.15. (True/False) K-Means can generate clusters with arbitrary shapes.


6.16. (True/False) DBSCAN can generate clusters with arbitrary shapes.
6.17. (True/False) K-Means can generate clusters only with spherical shapes.

6.18 Give an example in which clustering can be used as a preprocessing step for another data
classification task.

6.19 Explain the term incremental clustering. State the weak point of the Leader algorithm for
incremental clustering.

6.20 Given a set of 2-dimensional patterns:

A = (1, 1), B = (1, 2), C = (2, 2), D = (6, 2), E = (7, 2), F = (6, 6), G = (7, 6)
Apply the Leader algorithm to cluster the dataset. Assume that the data is processed in the
order A, B, C, D, E, F, G, and that the user-specified threshold T is 3.
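
A minimal sketch of the Leader algorithm, assuming the variant in which each pattern is assigned to its nearest existing leader when that distance does not exceed T:

```python
# Hedged sketch of the Leader algorithm for Exercise 6.20 (variant assumed:
# a pattern joins its nearest leader if the distance is <= T, else it becomes a new leader).
import math

patterns = {"A": (1, 1), "B": (1, 2), "C": (2, 2), "D": (6, 2),
            "E": (7, 2), "F": (6, 6), "G": (7, 6)}
T = 3

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

leaders = []          # list of (leader point, list of member names)
for name, point in patterns.items():
    if leaders:
        nearest = min(leaders, key=lambda l: euclidean(l[0], point))
        if euclidean(nearest[0], point) <= T:
            nearest[1].append(name)
            continue
    leaders.append((point, [name]))   # this pattern becomes a new leader

for leader, members in leaders:
    print("leader", leader, "->", members)
```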

6.21 How can clustering quality be evaluated based on an objective function?
