
SLA - Class Test - 5 - AnswerKey

This document contains a 4 question test on statistical learning with applications. The questions cover topics like cross-entropy and gini-index used to measure decision tree performance, calculating radial kernel values between points and determining cluster influence, forming clusters based on radial kernel similarity and determining complete and centroid linkages, and calculating principal component variance and cumulative variance from standard deviation values.

Uploaded by

cadi0761

INDIAN INSTITUTE OF TECHNOLOGY, KHARAGPUR

Department of Industrial Engineering and Management


Class Test 5

Subject Number: IM31202 Subject Name: Statistical Learning with Applications


Full Marks: 30 Time: 1 hour Date: 13.04.2023

Instructions:
1. Attempt all questions.
2. Maximum marks are shown against each question.
3. Answers should be short and to the point.

1. Explain cross-entropy and the Gini index. How are they used to measure the performance of
decision trees? (5)

2. Determine the radial kernel values among the 3 points 𝑎: (10,2), 𝑏: (3,5) and 𝑐: (5,7). What
information is derived from the radial kernel values? Comment on which of the points 𝑎 and 𝑏
has the greater influence on point 𝑐. Assume 𝛾 = 1. (10)

$$K(a, b) = \exp\left(-\gamma \sum_{j=1}^{p} (a_j - b_j)^2\right) = \exp\left(-1\left((10 - 3)^2 + (2 - 5)^2\right)\right)$$

K(a,b) = 6.47023E-26
K(b,c) = 0.000335463
K(a,c) = 1.92875E-22

The radial kernel values show that the pairs (a,b) and (a,c) are relatively far apart and
have little influence on each other, while b and c are relatively close and may have some
influence on each other. Point a is therefore relatively far from both b and c.
Since K(b,c) > K(a,c), point b has the greater influence on point c.
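
The kernel values above can be checked with a short script. This is a minimal sketch in Python; the function name `radial_kernel` and the variable names are illustrative and not part of the original answer key.

```python
import math


def radial_kernel(x, y, gamma=1.0):
    """Radial kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Points from problem 2, with gamma = 1 as stated in the question.
a, b, c = (10, 2), (3, 5), (5, 7)

k_ab = radial_kernel(a, b)
k_bc = radial_kernel(b, c)
k_ac = radial_kernel(a, c)

for name, value in [("K(a,b)", k_ab), ("K(b,c)", k_bc), ("K(a,c)", k_ac)]:
    print(f"{name} = {value:.6g}")

# The point with the larger kernel value against c has the greater influence on c.
more_influential = "b" if k_bc > k_ac else "a"
print("Greater influence on c:", more_influential)
```

Because the kernel decays exponentially with squared distance, even a modest difference in distance (8 versus 50 here) produces kernel values that differ by many orders of magnitude.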

3. Form 2 clusters based on the radial kernel values in problem 2. The clusters should be formed
based on similarity. Determine the complete and centroid linkages between these 2 clusters.
(10)

Based on the radial kernel similarity, the two most similar points are b and c. Hence the
two clusters are C1: {a} and C2: {b,c}.

Euclidean distances:
D(a,b) = 7.616
D(a,c) = 7.071

Complete linkage: max(𝐷(𝑎, 𝑏), 𝐷(𝑎, 𝑐)) = 7.616 (using the Euclidean distance measure)

Centroid of C1: (10,2)
Centroid of C2: (4,6)

Centroid linkage: Euclidean distance between the centroids = 7.211
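
The two linkage values can be reproduced as follows. This is a minimal sketch in Python, assuming Euclidean distance as in the answer above; the helper `centroid` and the variable names are illustrative.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Clusters from problem 3: C1 = {a}, C2 = {b, c}.
cluster_1 = [(10, 2)]
cluster_2 = [(3, 5), (5, 7)]

# Complete linkage: largest pairwise distance between the two clusters.
complete_linkage = max(math.dist(p, q) for p in cluster_1 for q in cluster_2)

# Centroid linkage: distance between the two cluster centroids.
centroid_1 = centroid(cluster_1)
centroid_2 = centroid(cluster_2)
centroid_linkage = math.dist(centroid_1, centroid_2)

print(f"Complete linkage: {complete_linkage:.3f}")
print(f"Centroids: {centroid_1}, {centroid_2}")
print(f"Centroid linkage: {centroid_linkage:.3f}")
```

`math.dist` requires Python 3.8 or later.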

4. The result below shows the first few PCs of the mtcars data (11 features) with the
associated standard deviations:

PC1      PC2      PC3      PC4
2.5707   1.6      0.79196  0.51923

Determine the associated PVE and cumulative PVE of the PCs. (5)
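
One way to carry out this calculation is sketched below in Python. The answer key does not show a worked solution, so this rests on an assumption: the 11 features were standardized before PCA, which makes the total variance equal to 11. Under that assumption, the PVE of each PC is its squared standard deviation divided by 11. The variable names are illustrative.

```python
from itertools import accumulate

# Standard deviations of the first four PCs, as given in the question.
sdev = [2.5707, 1.6, 0.79196, 0.51923]

# ASSUMPTION: features were standardized, so total variance = number of features.
total_variance = 11

# Variance explained by each PC is the square of its standard deviation.
variances = [s ** 2 for s in sdev]

# Proportion of variance explained (PVE) and its running total.
pve = [v / total_variance for v in variances]
cumulative_pve = list(accumulate(pve))

for i, (p, cp) in enumerate(zip(pve, cumulative_pve), start=1):
    print(f"PC{i}: PVE = {p:.4f}, cumulative PVE = {cp:.4f}")
```

If the data were not standardized, the total variance would have to be taken from the sum of all 11 squared standard deviations instead, which the question does not provide.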
