
Data Mining Quiz 1 Clustering

The document contains a quiz on data mining clustering techniques. It has 8 multiple choice questions testing knowledge of concepts like silhouette score calculation, Minkowski distance, Euclidean distance, Manhattan distance, agglomerative clustering and k-means clustering.

Uploaded by

Shripad H

Data Mining Quiz 1 Clustering

Type: Graded Quiz   Questions: 8   Time: 45m
Marks: 10
Q No: 1

Correct Answer
Marks: 1/1

The Silhouette Score is calculated using the following formula:

Silhouette score = (p − q) / max(p, q)

What do p and q represent?

p = mean distance to the points in the nearest cluster & q = mean intra-cluster distance to all the
points.
You Selected
p = mean distance to the points in the farthest cluster & q = mean intra-cluster distance to all the
points.

p = mean distance to the points in the nearest cluster & q = sum of the intra-cluster distance of all the
points.

p = mean distance to the points in the farthest cluster & q = sum of the intra-cluster distance of all
the points.
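
To make the roles of p and q concrete, here is a minimal sketch that computes the silhouette coefficient for a single point. The helper names and the toy points are hypothetical, chosen only for illustration, and Euclidean distance is assumed. The overall Silhouette Score is the mean of this value over all points.

```python
import math

def mean_distance(point, others):
    """Mean Euclidean distance from `point` to every point in `others`."""
    return sum(math.dist(point, o) for o in others) / len(others)

def silhouette_for_point(point, own_cluster, other_clusters):
    """Silhouette coefficient (p - q) / max(p, q) for one point.

    own_cluster: the other members of the point's cluster (point excluded).
    other_clusters: list of clusters the point does not belong to.
    """
    # q: mean intra-cluster distance
    q = mean_distance(point, own_cluster)
    # p: mean distance to the points of the nearest other cluster
    p = min(mean_distance(point, c) for c in other_clusters)
    return (p - q) / max(p, q)

# Hypothetical toy data: one point, its cluster mates, and two other clusters.
point = (0.0, 0.0)
own = [(0.0, 1.0), (1.0, 0.0)]
others = [[(5.0, 0.0), (7.0, 0.0)], [(0.0, 20.0)]]

score = silhouette_for_point(point, own, others)
print(round(score, 4))  # q = 1, p = 6, so (6 - 1) / 6 = 0.8333
```

Note that p uses the nearest cluster (the minimum over clusters), not the farthest one, and q is a mean rather than a sum.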
Q No: 2

Correct Answer
Marks: 1/1
At p=2, the Minkowski distance will resemble which type of distance measure?

Euclidean Distance
You Selected
Manhattan Distance

Chebyshev Distance

None of the mentioned

Minkowski distance: d(x, y) = (Σ |xi − yi|^p)^(1/p)

For p = 2, d(x, y) becomes (Σ |xi − yi|²)^(1/2), which is the Euclidean distance.
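
The Minkowski formula can be checked with a short sketch. The function name and the sample vectors are hypothetical; the point is only that p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)

# Hypothetical sample vectors; coordinate differences are 3, 4 and 0.
x = (1, 2, 3)
y = (4, 6, 3)

manhattan = minkowski(x, y, 1)  # 3 + 4 + 0 = 7.0
euclidean = minkowski(x, y, 2)  # sqrt(9 + 16 + 0) = 5.0
print(manhattan, euclidean)
```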

Q No: 3

Correct Answer
Marks: 1/1
Calculate the Euclidean distance between the points below:
p1= [2,3]
p2= [4,5]

2.626

3.100

2.423

2.828
You Selected

Euclidean Distance:

dist((x, y), (a, b)) = √((x − a)² + (y − b)²)

For p1 = (2, 3) and p2 = (4, 5):

Find the differences: 2 − 4 = −2 and 3 − 5 = −2

Square and add the values: 4 + 4 = 8

Take the square root: √8 = 2 × √2 = 2 × 1.414 = 2.828
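
The arithmetic above can be reproduced in a few lines. This sketch uses `math.dist` from the Python standard library (Python 3.8+); the variable names are illustrative.

```python
import math

p1 = [2, 3]
p2 = [4, 5]

# Euclidean distance: square root of the summed squared differences.
distance = math.dist(p1, p2)
print(round(distance, 3))  # 2.828
```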

Q No: 4

Correct Answer
Marks: 1/1

Calculate the Silhouette Score for the array below, using n_clusters=2:

import numpy as np
np.random.seed(7)
array = np.random.rand(20).reshape(10, 2)

[Hint: scale the array using StandardScaler]

0.4164

0.5478

0.4069
You Selected
0.3209
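
One way to approach this question is sketched below. The question does not name the clustering algorithm or its settings, so KMeans with `random_state=0` and `n_init=10` is an assumption here; the printed value may differ from the quiz answer depending on those choices and on the scikit-learn version.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Data exactly as given in the question.
np.random.seed(7)
array = np.random.rand(20).reshape(10, 2)

# Scale each column to zero mean and unit variance, as the hint suggests.
scaled = StandardScaler().fit_transform(array)

# Assumption: KMeans with these settings; the quiz does not specify them.
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10)
labels = kmeans.fit_predict(scaled)

# Mean silhouette coefficient over all 10 points.
score = silhouette_score(scaled, labels)
print(round(score, 4))
```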
Q No: 5

Correct Answer
Marks: 1/1
Calculate the Manhattan distance between points P1(4,4) and P2(9,9).

10
You Selected
(5,5)

None of the Mentioned

Manhattan Distance:

(4,4) (9,9)

d = |x2 − x1| + |y2 − y1|

d = |9 − 4| + |9 − 4| = 5 + 5 = 10
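
The same calculation as a short sketch; the function name is illustrative.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))

p1 = (4, 4)
p2 = (9, 9)

d = manhattan_distance(p1, p2)
print(d)  # 10
```

The result is a single number, which is why the option (5,5) is not a valid distance.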

Q No: 6

Correct Answer
Marks: 1/1
An agglomerative clustering algorithm generates two different dendrograms. Which of the following
could cause this?

All of the mentioned.


You Selected
Due to the proximity function

Due to the data points used

Due to the variables used
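
As an illustration of the proximity-function case, the sketch below builds two hierarchies from the same points using SciPy's `linkage`, changing only the linkage method. The four 1-D points are hypothetical. Changing the data points or the variables would change the pairwise distances and therefore the dendrogram in a similar way.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical data: four points on a line, one feature each.
X = np.array([[0.0], [1.0], [3.0], [7.0]])

# Same data, two different proximity (linkage) functions.
single = linkage(X, method="single")      # distance between closest members
complete = linkage(X, method="complete")  # distance between farthest members

# Column 2 of the linkage matrix holds the merge heights of the dendrogram.
single_heights = single[:, 2]
complete_heights = complete[:, 2]
print(single_heights)    # merges at heights 1, 2, 4
print(complete_heights)  # merges at heights 1, 3, 7
```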


Q No: 7

Correct Answer
Marks: 1/1
Agglomerative Clustering will start by considering all points as part of one big cluster

True

False
You Selected
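Agglomerative clustering starts by considering all points as individual clusters and then repeatedly merges the closest pair. Starting from one big cluster and splitting it describes divisive clustering.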
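
The bottom-up behaviour can be seen in a naive sketch of agglomerative clustering. Single linkage on hypothetical 1-D points is assumed here purely to keep the example short; this is not an efficient or general implementation.

```python
def agglomerative_single_linkage(points):
    """Naive bottom-up clustering of 1-D points; returns every stage."""
    # Every point starts as its own cluster.
    clusters = [[p] for p in points]
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair into one cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        history.append([list(c) for c in clusters])
    return history

# Hypothetical data.
history = agglomerative_single_linkage([0, 1, 3, 7])
for stage in history:
    print(stage)
```

The first stage printed contains four singleton clusters and the last contains one cluster holding every point.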
Q No: 8

Correct Answer
Marks: 3/3

Use the dataset provided in the instructions.

The within-cluster sum of squares for 4 clusters is:

[Hint: Use KMeans Clustering and keep random_state=0]

1102.32

1694.33

1895.25
You Selected
2123.10

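from sklearn.cluster import KMeans

# dataset_scaled: the scaled dataset from the quiz instructions (not included here)
kmeans = KMeans(n_clusters=4, random_state=0)
km = kmeans.fit(dataset_scaled)
# inertia_ is the within-cluster sum of squared distances to the nearest centroid
print('The within sum of squared for 4 clusters is', round(km.inertia_, 2))

The within sum of squared for 4 clusters is 1895.25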
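
To show what `inertia_` measures, the sketch below recomputes the within-cluster sum of squares by hand and compares it with the value reported by KMeans. The quiz dataset is not available here, so randomly generated data stands in for it (an assumption), and `n_init=10` is set explicitly; the numbers will not match the quiz answer.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumption: random stand-in data, since the quiz dataset is not included.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

km = KMeans(n_clusters=4, random_state=0, n_init=10).fit(X)

# Look up the centroid assigned to each sample, then sum squared distances.
assigned_centers = km.cluster_centers_[km.labels_]
manual_wcss = ((X - assigned_centers) ** 2).sum()

print(round(km.inertia_, 2), round(manual_wcss, 2))
```

The two printed values should agree up to floating-point error.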
