0% found this document useful (0 votes)

16 views17 pages

A Characterization of Linkage-Based Hierarchical Clustering

This document summarizes a research paper that characterizes linkage-based hierarchical clustering algorithms. The paper identifies two key properties that all linkage-based algorithms satisfy, but no other hierarchical algorithms satisfy both properties. It also shows that popular divisive hierarchical algorithms like bisecting k-means produce clusterings that linkage-based algorithms cannot produce. This provides a theoretical basis for distinguishing linkage-based algorithms from other hierarchical approaches based on their input-output behavior alone.

Uploaded by

aegr82

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

16 views17 pages

A Characterization of Linkage-Based Hierarchical Clustering

Uploaded by

aegr82

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 17

Journal of Machine Learning Research 17 (2016) 1-17 Submitted 7/11; Revised 8/15; Published 12/16

A Characterization of Linkage-Based Hierarchical Clustering

Margareta Ackerman [email protected]

Department of Computer Science
San Jose State University
San Jose, CA
Shai Ben-David [email protected]
D.R.C. School of Computer Science
University of Waterloo
Waterloo, ON

Editor: Marina Meila

Abstract

The class of linkage-based algorithms is perhaps the most popular class of hierarchical
algorithms. We identify two properties of hierarchical algorithms, and prove that linkage-
based algorithms are the only ones that satisfy both of these properties. Our character-
ization clearly delineates the difference between linkage-based algorithms and other hier-
archical methods. We formulate an intuitive notion of locality of a hierarchical algorithm
that distinguishes between linkage-based and “global” hierarchical algorithms like bisecting
k-means, and prove that popular divisive hierarchical algorithms produce clusterings that
cannot be produced by any linkage-based algorithm.

1. Introduction
Clustering is a fundamental and immensely useful task, with many important applications.
There are many clustering algorithms, and these algorithms often produce different results
on the same data. Faced with a concrete clustering task, a user needs to choose an appropri-
ate algorithm. Currently, such decisions are often made in a very ad hoc, if not completely
random, manner. Users are aware of the costs involved in employing different clustering
algorithms, such as running times, memory requirements, and software purchasing costs.
However, there is very little understanding of the differences in the outcomes that these
algorithms may produce.
It has been proposed to address this challenge by identifying significant properties
that distinguish between different clustering paradigms (see, for example, Ackerman et al.
(2010b) and Fisher and Van Ness (1971)). By focusing on the input-output behaviour of al-
gorithms, these properties shed light on essential differences between them (Ackerman et al.
(2010b, 2012)). Users could then choose desirable properties based on domain expertise,
and select an algorithm that satisfies these properties.
In this paper, we focus hierarchical algorithms, a prominent class of clustering algo-
rithms. These algorithms output dendrograms, which the user can then traverse to obtain
the desired clustering. Dendrograms provide a convenient method for exploring multiple

c 2016 Margareta Ackerman and Shai Ben-David.

Ackerman and Ben-David

clusterings of the data. Notably, for some applications the dendrogram itself, not any clus-
tering found in it, is the desired final outcome. One such application is found in the field
of phylogeny, which aims to reconstruct the tree of life.
One popular class of hierarchical algorithms is linkage-based algorithms. These algo-
rithms start with singleton clusters, and repeatedly merge pairs of clusters until a den-
drogram is formed. This class includes commonly-used algorithms such as single-linkage,
average-linkage, complete-linkage, and Ward’s method.
In this paper, we provide a property-based characterization of hierarchical linkage-based
algorithms. We identify two properties of hierarchical algorithms that are satisfied by all
linkage-based algorithms, and prove that at the same time no algorithm that is not linkage-
based can satisfy both of these properties.
The popularity of linkage-based algorithms leads to a common misconception that
linkage-based algorithms are synonymous with hierarchical algorithms. We show that even
when the internal workings of algorithms are ignored, and the focus is placed solely on their
input-output behaviour, there are natural hierarchical algorithms that are not linkage-based.
We define a large class of divisive algorithms that includes the popular bisecting k-means al-
gorithm, and show that no linkage-based algorithm can simulate the input-output behaviour
of any algorithm in this class.

2. Previous Work
Our work falls within the larger framework of studying properties of clustering algorithms.
Several authors study such properties from an axiomatic perspective. For instance, Wright
(1973) proposes axioms of clustering functions in a weighted setting, where every domain
element is assigned a positive real weight, and its weight may be distributed among multiple
clusters. A recent, and influential, paper in this line of work is Kleinberg’s impossibility
result (Kleinberg (2003)), where he proposes three axioms of partitional clustering functions
and proves that no clustering function can simultaneously satisfy these properties.
Properties have been used study different aspects of clustering. Ackerman and Ben-
David (2008) consider properties satisfied by clustering quality measures, showing that
properties analogous to Kleinberg’s axioms are consistent in this setting. Meila (2005)
studies properties of criteria for comparing clusterings, functions that map pairs of cluster-
ings to real numbers, and identifies properties that are sufficient to uniquely identify several
such criteria. Puzicha et al. (2000) explore properties of clustering objective functions.
They propose a few natural properties of clustering objective functions, and then focus on
objective functions that arise by requiring functions to decompose into additive form.
Most relevant to our work are previous results distinguishing linkage-based algorithms
based on their properties. Most of these results are concerned with the single-linkage al-
gorithm. In the hierarchial clustering setting, Jardine and Sibson (1971) and Carlsson and
Mémoli (2010) formulate a collection of properties that define single linkage.
Zadeh and Ben-David (2009) characterize single linkage in the partitional setting where
instead of constructing a dendrogram, clusters are merged until a given number of clusters
remain. Finally, Ackerman et al. (2010a) characterize linkage-based algorithms in the same
partitional setting in terms of a few natural properties. These results enable a comparison

2
Linkage-Based Hierarchical Clustering

Figure 1: A dendrogram of domain set {x1 , . . . , x8 }. The horizontal lines represent levels
and every leaf is associated with an element of the domain.

of the input-output behaviour of (a partitional variant of) linkage-based algorithms with

other partitional algorithms.
In this paper, we characterize hierarchical linkage-based algorithms, which map data sets
to dendrograms. Our characterization is independent of any stopping criterion. It enables
the comparison of linkage-based algorithms to other hierarchical algorithms, and clearly
delineates the differences between the input/output behaviour of linkage-based algorithms
and other hierarchical methods.

3. Definitions
A distance function is a symmetric function d : X × X → R+ , such that d(x, x) = 0 for
all x ∈ X. The data sets that we consider are pairs (X, d), where X is some finite domain
set and d is a distance function over X. We say that a distance function d over X extends
distance function d0 over X 0 ⊆ X, denoted d0 ⊆ d, if d0 (x, y) = d(x, y) for all x, y ∈ X 0 . Two
distance function d over X and d0 over X 0 agree on a data set Y if Y ⊆ X, Y ⊆ X 0 , and
d(x, y) = d0 (x, y) for all x, y ∈ Y .
A k-clustering C = {C1 , C2 , . . . , Ck } of a data set X is a partition of X into k non-empty
disjoint subsets of X (so, ∪i Ci = X). A clustering of X is a k-clustering of X for some
1 6 k 6 |X|. For a clustering C, let |C| denote the number of clusters in C. For x, y ∈ X
and clustering C of X, we write x ∼C y if x and y belong to the same cluster in C and
x 6∼C y, otherwise.
Given a rooted tree T where the edges are oriented away from the root, let V (T ) denote
the set of vertices in T , and E(T ) denote the set of edges in T . We use the standard
interpretation of the terms leaf, descendent, parent, and child.
A dendrogram over a data set X is a binary rooted tree where the leaves correspond to
elements of X. In addition, every node is assigned a level, using a level function (η); leaves
are placed at level 0, parents have higher levels than their children, and no level is empty.
See Figure 1 for an illustration. Formally,

3
Ackerman and Ben-David

Definition 1 (dendrogram) A dendrogram over (X, d) is a triple (T, M, η) where T is

a binary rooted tree, M : leaves(T ) → X is a bijection, and η : V (T ) → {0, . . . , h} is onto
(for some h ∈ Z+ ∪ {0}) such that

1. For every leaf node x ∈ V (T ), η(x) = 0.

2. If (x, y) ∈ E(T ), then η(x) > η(y).

Given a dendrogram D = (T, M, η) of X, we define a mapping from nodes to clusters

C : V (T ) → 2X by C(x) = {M (y) | y is a leaf and a descendent of x}. If C(x) = A, then we
write v(A) = x. We think of v(A) as the vertex (or node) in the tree that represents cluster
A.
We say that A ⊆ X is a cluster in D if there exists a node x ∈ V (T ) so that C(x) = A.
We say that a clustering C = {C1 , . . . , Ck } of X 0 ⊆ X is in D if Ci is in D for all 1 6 i 6 k.
Note that a dendrogram may contain clusterings that do not partition the entire domain,
and ∀i 6= j, v(Ci ) is not a descendent of v(Cj ), since Ci ∩ Cj = ∅.

Definition 2 (sub-dendrogram) A sub-dendrogram of (T, M, η) rooted at x ∈ V (T ) is

a dendrogram (T 0 , M 0 , η 0 ) where

1. T 0 is the subtree of T rooted at x,

2. For every y ∈ leaves(T 0 ), M 0 (y) = M (y), and

3. For all y, z ∈ V (T 0 ), η 0 (y) < η 0 (z) if and only if η(y) < η(z).

Definition 3 (Isomorphisms) A few notions of isomorphisms of structures are relevant

to our discussion.

1. We say that (X, d) and (X 0 , d0 ) are isomorphic domains, denoted (X, d) ∼=X (X 0 , d0 ),
0 0
if there exists a bijection φ : X → X so that d(x, y) = d (φ(x), φ(y)) for all x, y ∈ X.

2. We say that two clusterings (or partitions) C of some domain (X, d) and C 0 of some
domain (X 0 , d0 ) are isomorphic clusterings, denoted (C, d) ∼
=C (C 0 , d0 ), if there exists
0
a domain isomorphism φ : X → X so that x ∼C y if and only if φ(x) ∼C 0 φ(y).

3. We say that (T1 , η1 ) and (T2 , η2 ) are isomorphic trees, denoted (T1 , η1 ) ∼
=T (T1 , η1 ), if
there exists a bijection H : V (T1 ) → V (T2 ) so that

(a) for all x, y ∈ V (T1 ), (x, y) ∈ E(T1 ) if and only if (H(x), H(y)) ∈ E(T2 ), and
(b) for all x ∈ V (T1 ), η1 (x) = η2 (H(x)).

4. We say that D1 = (T1 , M1 , η1 ) of (X, d) and D2 = (T2 , M2 , η2 ) of (X 0 , d0 ) are iso-

morphic dendrograms, denoted D1 ∼ =D D2 , if there exists a domain isomorphism φ :
0
X → X and a tree isomorphism H : (T1 , η1 ) → (T2 , η2 ) so that for all x ∈ leaves(T1 ),
φ(M1 (x)) = M2 (H(x)).

4
Linkage-Based Hierarchical Clustering

4. Hierarchical and Linkage-Based Algorithms

In the hierarchical clustering setting, linkage-based algorithms are hierarchical algorithms
that can be simulated by repeatedly merging close clusters. In this section, we formally
define hierarchical algorithms and linkage-based hierarchical algorithms.

4.1 Hierarchical Algorithms

In addition to outputting a dendrogram, we require that hierarchical clustering functions
satisfy a few natural properties.

Definition 4 (Hierarchical clustering function) A hierarchical clustering function F

is a function that takes as input a pair (X, d) and outputs a dendrogram (T, M, η). We
require such a function, F , to satisfy the following:
1. Representation Independence: Whenever (X, d) ∼
=X (X 0 , d0 ), then F (X, d) ∼
=D F (X 0 , d0 ).
2. Scale Invariance: For any domain set X and any pair of distance functions d, d0
over X, if there exists c ∈ R+ such that d(a, b) = c · d0 (a, b) for all a, b ∈ X, then
F (X, d) = F (X, d0 ).

3. Richness: For all data sets {(X1 , d1 ), . . . , (Xk , dk )} where Xi ∩ Xj = ∅ for all i 6= j,
there exists a distance function dˆ over ki=1 Xi that extends each of the di ’s (for i 6 k),
S

so that the clustering {X1 , . . . , Xk } is in F ( ki=1 Xi , d).

ˆ
S

The last condition, richness, requires that by manipulating between-cluster distances

every clustering can be produced by the algorithm. Intuitively, if we place the clusters
sufficiently far apart, then the resulting clustering should be in the dendrogram.
In this work, we focus on distinguishing linkage-based algorithms from other hierarchical
algorithms.

4.2 Linkage-Based Algorithms

The class of linkage-base algorithms includes some of the most popular hierarchical algo-
rithms, such as single-linkage, average-linkage, complete-linkage, and Ward’s method.
Every linkage-based algorithm has a linkage function that can be used to determine
which clusters to merge at every step of the algorithm.

Definition 5 (Linkage Function) A linkage function is a function

` : {(X1 , X2 , d) | d over X1 ∪ X2 } → R+

such that,
1. ` is representation independent: For all (X1 , X2 ) and (X10 , X20 ), if ({X1 , X2 }, d) ∼
=C
({X10 , X20 }, d0 ) then `(X1 , X2 , d) = `(X10 , X20 , d0 ).

2. ` is monotonic: For all (X1 , X2 , d) if d0 is a distance function over X1 ∪ X2 such that

for all x ∼{X1 ,X2 } y, d(x, y) = d0 (x, y) and for all x 6∼{X1 ,X2 } y, d(x, y) 6 d0 (x, y) then
`(X1 , X2 , d0 ) > `(X1 , X2 , d).

5
Ackerman and Ben-David

As in our characterization of partitional linkage-based algorithms, we assume that a

linkage function has a countable range. Say, the set of non-negative algebraic real numbers.
The following are the linkage-functions of some of the most popular linkage-based algo-
rithms,

• Single-linkage: `(A, B, d) = mina∈A,b∈B d(a, b)

P
• Average-linkage: `(A, B, d) = a∈A,b∈B d(a, b)/(|A| · |B|)

• Complete-linkage: `(A, B, d) = maxa∈A,b∈B d(a, b)

For a dendrogram D and clusters A and B in D, if there exists x so that parent(v(A)) =

parent(v(B)) = x, then let parent(A, B) = x, otherwise parent(A, B) = ∅.
We now define hierarchical linkage-based functions.

Definition 6 (Linkage-Based Function) A hierarchical clustering function F is linkage-

based if there exists a linkage function ` so that for all (X, d), F (X, d) = (T, M, η) where
η(parent(A, B)) = m if and only if `(A, B) is minimal in {`(S, T ) : S ∩ T = ∅, η(S) <
m, η(T ) < m, η(parent(S)) > m, η(parent(T )) > m}.

Note that the above definition implies that there exists a linkage function that can be
used to simulate the output of F . We start by assigning every element of the domain to
a leaf node. We then use the linkage function to identify the closest pair of nodes (with
respect to the clusters that they represent), and repeatedly merge the closest pairs of nodes
that do yet have parents, until only one such node remains.

4.3 Locality
We introduce a new property of hierarchical algorithms. Locality states that if we select a
clustering from a dendrogram (a union of disjoint clusters that appear in the dendrogram),
and run the hierarchical algorithm on the data underlying this clustering, we obtain a result
that is consistent with the original dendrogram.

Definition 7 (Locality) A hierarchical function F is local if for all X, d, and X 0 ⊆ X,

whenever clustering C = {C1 , C2 , . . . , Ck } of X 0 is in F (X, d) = (T, M, η), then for all
16i6k

1. Cluster Ci is in F (X 0 , d|X 0 ) = (T 0 , M 0 , η 0 ), and the sub-dendrogram of F (X, d) rooted

at v(Ci ) is also a sub-dendrogram of F (X 0 , d|X 0 ) rooted at v(Ci ).

2. For all x, y ∈ X 0 , η 0 (x) < η 0 (y) if and only if η(x) < η(y).

Locality is often a desirable property. Consider for example the field of phylogenetics,
which aims to reconstruct the tree of life. If an algorithm clusters phylogenetic data cor-
rectly, then if we cluster any subset of the data, we should get results that are consistent
with the original dendrogram.

6
Linkage-Based Hierarchical Clustering

Figure 2: An example of an A-cut.

4.4 Outer Consistency

Clustering aims to group similar elements and separate dissimilar ones. These two require-
ments are often contradictory and algorithms vary in how they resolve this contradiction.
Kleinberg (2003) proposed a formalization of these requirements in his “consistency” axiom
for partitional clustering algorithms. Consistency requires that if within-cluster distances
are decreased, and between-cluster distances are increased, then the output of a clustering
function does not change.
Since then it was found that while many natural clustering functions fail consistency,
most satisfy a relaxation, which requires that the output of an algorithm is not changed by
increasing between-cluster distances (Ackerman et al. (2010b)). Given successfully clustered
data, if points that are already assigned to different clusters are drawn even further apart,
then it is natural to expect that, when clustering the resulting new data set, such points
will not share the same cluster. Here we propose a variation of this requirement for the
hierarchical clustering setting.
Given a dendrogram produced by a hierarchical algorithm, we select a clustering C
from a dendrogram and pull apart the clusters in C (thus making the clustering C more
pronounced). If we then run the algorithm on the resulting data, we can expect that the
clustering C will occur in the new dendrogram. Outer consistency is a relaxation of the
above property, making this requirement only on a subset of clusterings.
For a cluster A in a dendrogram D, the A-cut of D is a clustering in D represented by
nodes on the same level as v(A) or directly below v(A). For convenience, if node u is the
root of the dendrogram, then assume its parent has infinite level, η(parent(u)) = ∞.
Formally,

Definition 8 (A-cut) Given a cluster A in a dendrogram D = (T, M, η), the A-cut of D

is cutA (D) = {C(u) | u ∈ V (T ), η(parent(u)) > η(v(A)) and η(u) 6 η(v(A)).}.

Note that for any cluster A in D of (X, d), the A-cut is a clustering of X, and A is one
of the clusters in that clustering.
For example, consider the diagram in Figure 2. Let A = {x3 , x4 }. The horizontal
line on level 4 of the dendrogram represents the intuitive notion of a cut. To obtain the
corresponding clustering, we select all clusters represented by nodes on the line, and for

7
Ackerman and Ben-David

the remaining clusters, we choose clusters represented by nodes that lay directly below the
horizontal cut. In this example, clusters {x3 , x4 } and {x5 , x6 , x7 , x8 } are represented by
nodes directly on the line, and {x1 , x2 } is a cluster represented by a node directly below
the marked horizontal line.
Recall that a distance function d0 over X is (C, d)-outer-consistent if d0 (x, y) = d(x, y)
whenever x ∼C y, and d0 (x, y) > d(x, y) whenever x 6∼C y.

Definition 9 (Outer-Consistency) A hierarchical function F is outer consistent if for

all (X, d) and any cluster A in F (X, d), if d0 is (cutA (F (X, d)), d)-outer-consistent then
cutA (F (X, d)) = cutA (F (X, d0 )).

5. Main Result
The following is our characterization of linkage-based hierarchical algorithms.

Theorem 10 A hierarchical function F is linkage-based if and only if F is outer consistent

and local.

We prove the result in the following subsections (one for each direction of the iff). In
the last part of this section, we demonstrate the necessity of both properties.

5.1 All Local, Outer-Consistent Hierarchical Functions are

Linkage-Based
Lemma 11 If a hierarchical function F is outer-consistent and local, then F is linkage-
based.

We show that there exists a linkage function ` so that when ` is used in Definition 6 then
for all (X, d) the output is F (X, d). Due to the representation independence of F , one can
assume w.l.o.g., that the domain sets over which F is defined are (finite) subsets of the set
of natural numbers, N.

Definition 12 (The (pseudo-) partial ordering <F ) We consider triples of the form
(A, B, d), where A ∩ B = ∅ and d is a distance function over A ∪ B. Two triples, (A, B, d)
and (A0 , B 0 , d0 ) are equivalent, denoted (A, B, d) ∼= (A0 , B 0 , d0 ) if they are isomorphic as
∼ 0 0
clusterings, namely, if ({A, B}, d) =C ({A , B }, d ).0

<F is a binary relation over equivalence classes of such triples, indicating that F merges
a pair of clusters earlier than another pair of clusters. Formally, denoting ∼ =-equivalence
classes by square brackets, we define it by: [(A, B, d)] <F [(A0 , B 0 , d0 )] if

1. At most two sets in {A, B, A0 , B 0 } are equal and no set is a strict subset of another.

2. The distance functions d and d0 agree on (A ∪ B) ∩ (A0 ∪ B 0 ).

3. There exists a distance function d∗ over X = A ∪ B ∪ A0 ∪ B 0 so that F (X, d∗ ) =

(T, M, η) such that

(a) d∗ extends both d and d0 ,

8
Linkage-Based Hierarchical Clustering

(b) There exist (x, y), (x, z) ∈ E(T ) such that C(x) = A∪B, C(y) = A, and C(z) = B
(c) For all D ∈ {A0 , B 0 }, either D ⊆ A ∪ B, or D ∈ cutA∪B F (X, d∗ ).
(d) η(v(A0 )) < η(v(A ∪ B)) and η(v(B 0 )) < η(v(A ∪ B)).

Since we define hierarchical algorithms to be representation independent, we can just

discuss triples, instead of their equivalence classes. For the sake of simplifying notation, we
will omit the square brackets in the following discussion.
In the following lemma we show that if (A, B, d) <F (A0 , B 0 , d0 ), then A0 ∪ B 0 cannot
have a lower level than A ∪ B.

Lemma 13 Given a local and outer-consistent hierarchical function F , whenever

(A1 , B1 , d1 ) <F (A2 , B2 , d2 ), there is no data set (X, d) such that A1 , B1 , A2 , B2 ⊆ X
and η(v(A2 ∪ B2 )) 6 η(v(A1 ∪ B1 )), where F (X, d) = (T, M, η).

Proof By way of contradiction, assume that such (X, d) exists. Let X 0 = A1 ∪ B1 ∪

A2 ∪ B2 . Since (A1 , B1 , d1 ) <F (A2 , B2 , d2 ), there exists d0 that satisfies the conditions of
Definition 12.
Consider F (X 0 , d|X 0 ). By locality, the sub-dendrogram rooted at v(A1 ∪ B1 ) contains
the same nodes in both F (X 0 , d|X 0 ) and F (X, d), and similarly for the sub-dendrogram
rooted at v(A2 ∪ B2 ). In addition, the relative level of nodes in these subtrees is the same.
Construct a distance function d∗ over X 0 that is both ({A1 ∪ B1 , A2 ∪ B2 }, d|X 0 )-outer
consistent and ({A1 ∪ B2 , A2 , B2 }, d0 )-outer consistent as follows:

• d∗ (x, y) = max(d(x, y), d0 (x, y)) whenever x ∈ A1 ∪ B1 and y ∈ A2 ∪ B2

• d∗ (x, y) = d1 (x, y) whenever x, y ∈ A ∪ B

• d∗ (x, y) = d2 (x, y) whenever x, y ∈ A0 ∪ B 0

Note that {A1 ∪ B1 , A2 ∪ B2 } is an (A1 ∪ B1 )-cut of F (X 0 , d|X 0 ). Therefore, by outer-

consistency, cutA1 ∪B1 (F (X 0 , d∗ )) = {A2 ∪ B2 , A1 ∪ B1 }.
Since d0 satisfies the conditions in Definition 12, cutA1 ∪B1 F (X, d0 ) = {A1 ∪ B1 , A2 , B2 }.
By outer-consistency we get that cutA1 ∪B1 (F (X 0 , d∗ )) = {A2 ∪ B2 , A1 , B1 }. Since these sets
are all non-empty, this is a contradiction.

We now define equivalence with respect to <F .

Definition 14 (= ∼F ) [(A, B, d)] and [(A0 , B 0 , d0 )] are F -equivalent, denoted [(A, B, d)] ∼
=F
0 0 0 ∼ 0 0
[(A , B , d )], if either they are isomorphic as clusterings, ({A, B}, d) =C ({A , B }, d ) or 0

1. At most two sets in {A, B, A0 , B 0 } are equal and no set is a strict subset of another.

2. The distance functions d and d0 agree on (A ∪ B) ∩ (A0 ∪ B 0 ).

3. There exists a distance function d∗ over X = A ∪ B ∪ A0 ∪ B 0 so that F (A ∪ B ∪ A0 ∪

B 0 , d∗ ) = (T, η) where

(a) d∗ extends both d and d0 ,

9
Ackerman and Ben-David

(b) There exist (x, y), (x, z) ∈ E(T ) such that C(x) = A ∪ B, and C(y) = A, and
C(z) = B,
(c) There exist (x0 , y 0 ), (x0 , z 0 ) ∈ E(T ) such that C(x0 ) = A0 ∪ B 0 , and C(y 0 ) = A0 ,
and C(z 0 ) = B 0 , and
(d) η(x) = η(x0 )

(A, B, d) is comparable with (C, D, d0 ) if they are <F comparable or (A, B, d) ∼ =F

(C, D, d0 ).
Whenever two triples are F -equivalent, then they have the same <F or ∼
=F relationship
with all other triples.

Lemma 15 Given a local, outer-consistent hierarchical function F , if (A, B, d1 ) ∼ =F (C, D, d2 ),

then for any (E, F, d3 ), if (E, F, d3 ) is comparable with both (A, B, d1 ) and (C, D, d2 ) then

• if (A, B, d1 ) ∼
=F (E, F, d3 ) then (C, D, d2 ) ∼
=F (E, F, d3 )
• if (A, B, d1 ) <F (E, F, d3 ) then (C, D, d2 ) <F (E, F, d3 )

Proof Let X = A∪B∪C ∪D∪E∪F . By richness (condition 3 of Definition 4), there exists a
distance function d that extends di for i ∈ {1, 2, 3} so that {A∪B, C∪D, E∪F } is a clustering
in F (X, d). Assume that (E, F, d3 ) is comparable with both (A, B, d1 ) and (C, D, d2 ). By
way of contradiction, assume that (A, B, d1 ) ∼ =F (E, F, d3 ) and (C, D, d2 ) <F (E, F, d3 ).
Then by locality, in F (X, d), η(v(A ∪ B)) = η(v(E ∪ F )).
Observe that by locality, since (C, D, d1 ) <F (E, F, d3 ), then η(v(C ∪ D)) < η(v(E ∪ F ))
in F (X, d). Therefore (again by locality) η(v(A ∪ B)) 6= η(v(C ∪ D)) in any data set that
extends d1 and d2 , contradicting that (A, B, d1 ) ∼ =F (C, D, d2 ).

Note that <F is not transitive. In particular, if (A, B, d1 ) <F (C, D, d2 ) and (C, D, d2 ) <F
(E, F, d3 ), it may be that (A, B, d1 ) and (E, F, d3 ) are incomparable. To show that <F can
be extended to a partial ordering, we first prove the following “anti-cycle” property.

Lemma 16 Given a hierarchical function F that is local and outer-consistent, there exists
no finite sequence (A1 , B1 , d1 ) <F · · · <F (An , Bn , dn ) <F (A1 , B1 , d1 ).

Proof Without loss of generality, assume that such a sequence exists. By richness, there
exists a distance function
S d that extends each of the di where {A1 ∪B1 , A1 ∪B2 , . . . , An ∪Bn }
is a clustering in F ( i Ai ∪ Bi , d) = (T, M, η).
Let i0 be so that η(v(Ai0 ∪ Bi0 ) 6 η(v(Aj ∪ Bj )) for all j 6= i0 . By the circular structure
with respect to <F , there exists j0 so that (Aj0 , Bj0 , dj0 ) <F (Ai0 , Bi0 , di0 ). This contradicts
Lemma 13.

We make use of the following general result.

Lemma 17 For any cycle-free, anti-symmetric relation P ( , ) over a finite or countable

domain D there exists an embedding h into R+ so that for all x, y ∈ D, if P (x, y) then
h(x) < h(y).

10
Linkage-Based Hierarchical Clustering

Proof First we convert the relation P into a partial order by defining a < b whenever there
exists a sequence x1 , ...., xk so that P (a, x1 ), P (x2 , x3 ), ..., P (xk , b). This is a partial ordering
because P is antisymmetric and cycle-free. To map the partial order to the positive reals,
we first enumerate the elements, which can be done because the domain is countable. The
first element is then mapped to any value, φ(x1 ). By induction, we assume that the first
n elements are mapped in an order preserving manner. Let xi1 ...xik be all the members of
{x1 , ...xn } that are below xn+1 in the partial order. Let r1 = max{φ(xi1 ), ....φ(xik }, and sim-
ilarly let r2 be the minimum among the images of all the members of {x1 , . . . , xk } that are
above xn+1 in the partial order. Finally, let φ(xn+1 ) be any real number between r1 and r2 .
It is easy to see that now φ maps {x1 , ...xn , xn+1 } in a way that respects the partial order.

Finally, we define our linkage function by embedding the ∼ =F -equivalence classes into the
positive real numbers in an order preserving way, as implied by applying Lemma 17 to <F .
Namely, `F : {[(A, B, d)] : A ⊆ N, B ⊆ N, A ∩ B = ∅ and d is a distance function over A ∪
B} → R+ so that [(A, B, d)] <F [(A0 , B 0 , d0 )] implies `F [(A, B, d)] < `F [(A, B, d)].

Lemma 18 The function `F is a linkage function for any hierarchical function F that
satisfies locality and outer-consistency.

Proof Since `F is defined on ∼ =F -equivalence classes, representation independence of hi-

erarchical functions implies that `F satisfies condition 1 of Definition 5. The function `F
satisfies condition 2 of Definition 5 by Lemma 19, whose proof follows.

Lemma 19 Consider d1 over X1 ∪ X2 and d2 that is ({X1 , X2 }, d1 )-outer-consistent, then

(X1 , X2 , d2 ) 6<F (X1 , X2 , d1 ), whenever F is local and outer-consistent.

Proof Assume that there exist such d1 and d2 where (X1 , X2 , d2 ) <F (X1 , X2 , d1 ). Let d3
over X1 ∪ X2 be a distance function such that d3 is ({X1 , X2 }, d1 )-outer-consistent and d2
is ({X1 , X2 }, d3 )-outer-consistent. In particular, d3 can be constructed as follows:
d1 (x,y)+d2 (x,y)
• d3 (x, y) = 2 whenever x ∈ X1 and y ∈ X2

• d3 (x, y) = d1 (x, y) whenever x, y ∈ X1 or x, y ∈ X2

∼F (X1 , X2 , d2 ) and (X 00 , X 00 , d3 ) ∼
Set (X10 , X20 , d2 ) = =F (X1 , X2 , d3 ).
1 2
0 0 00 00
Let X = X1 ∪ X2 ∪ X1 ∪ X2 ∪ X1 ∪ X2 . By richness, there exists a distance function
d∗ that extends di for all 1 6 i 6 3 so that {X1 ∪ X2 , X10 ∪ X20 , X100 ∪ X200 } is a clustering in
F (X, d∗ ).
Let F (X, d∗ ) = (T, M, η). Since (X10 , X20 , d2 ) <F (X1 , X2 , d1 ), by locality and outer-
consistency, we get that η(v(X10 ∪ X20 )) < η(v(X1 ∪ X2 )). We consider the level (η value) of
v(X100 ∪ X200 ) with respect to the levels of v(X10 ∪ X20 ) and v(X1 ∪ X2 ) in F (X, d∗ ).
We now consider a few cases.
Case 1: η(v(X100 ∪ X200 )) 6 η(v(X10 ∪ X20 )). Then there exists an outer-consistent change
moving X1 and X2 further away from each other until (X1 , X2 , d1 ) = (X100 , X200 , d3 ). Let dˆ be
the distance function that extends d1 and d2 which shows that (X10 , X20 , d2 ) <F (X1 , X2 , d1 ).

11
Ackerman and Ben-David

cutX10 ∪X20 F (X1 ∪ X2 ∪ X10 ∪ X20 , d) ˆ = {X 0 ∪ X 0 , X1 , X2 }. We can apply outer consistency on

1 2
0 0
{X1 ∪ X2 , X1 , X2 } and move X1 and X2 away from each other until {X1 , X2 } is isomorphic
to {X100 , X200 }. By outer consistency, this modification should not effect the (X1 ∪ X2 )-cut.
Applying locality, we have two isomorphic data sets that produce different dendrograms,
one in which the further pair ((X10 , X20 ) with distance function d2 ) is not below the medium
pair ((X100 , X200 ) with distance function d3 ), and the other in which the medium pair is above
the furthest pair.
Case 2: η(v(X100 ∪ X200 )) > η(v(X1 ∪ X2 )). Since Xi00 is isomorphic to Xi for all i ∈ {1, 2},
η(v(Xi )) = η(v(Xi00 )) for all i ∈ {1, 2}. This gives us that in this case, cutX1 ∪X2 F (X1 ∪ X2 ∪
X100 ∪ X200 , d∗ ) = {X1 ∪ X2 , X100 , X200 }. We can therefore apply outer consistency and separate
X100 and X200 until {X100 , X200 } is isomorphic to {X10 ∪ X20 }. So this gives us two isomorphic
data sets, one in which the further pair is not below the closest pair, and the other in which
the further pair is below the closest pair.
Case 3: η(X1 ∪ X2 ) < η(X100 ∪ X200 ) < η(X10 ∪ X20 ). Notice that cutX100 ∪X200 F (X1 ∪ X2 ∪
X100 ∪ X200 , d∗ ) = {X100 ∪ X200 , X1 , X2 }. So outer-consistency applies when we increase the dis-
tance between X1 and X2 until {X1 , X2 } is isomorphic to {X10 ∪ X20 }. This gives us two
isomorphic sets, one in which the medium pair is below the further pair, and another in
which the medium pair is above the furthest pair.

The following Lemma concludes the proof that every local, outer-consistent hierarchical
algorithm is linkage-based.

Lemma 20 Given any hierarchical function F that satisfies locality and outer-consistency,
let `F be the linkage function defined above. Let L`F denote the linkage-based algorithm that
`F defines. Then L`F agrees with F on every input data set.

Proof Let (X, d) be any data set. We prove that at every level s, the nodes at level s in
F (X, d) represent the same clusters as the nodes at level s in L`F (X, d). In both F (X, d) =
(T, M, η) and L`F (X, d) = (T 0 , M 0 , η 0 ), level 0 consists of |X| nodes each representing a
unique elements of X.
Assume the result holds below level k. We show that pairs of nodes that do not have
parents below level k have minimal `F value only if they are merged at level k in F (X, d).
Consider F (X, d) at level k. Since the dendrogram has no empty levels, let x ∈ V (T )
where η(x) = k. Let x1 and x2 be the children of x in F (X, d). Since η(x1 ), η(x2 ) < k,
these nodes also appear in L`F (X, d) below level k, and neither node has a parent below
level k.
If x is the only node in F (X, d) above level k − 1, then it must also occur in L`F (X, d).
Otherwise, there exists a node y1 ∈ V (T ), y1 6∈ {x1 , x2 } so that η(y1 ) < k and η(parent(y1 )) >
k. Let X 0 = C(x) ∪ C(y1 ). By locality, cutC(x) F (X 0 , d|X 0 ) = {C(x), C(y1 )}, y1 is below x,
and x1 and x2 are the children of x. Therefore, (C(x1 ), C(x2 ), d) <F (C(x1 ), C(y1 ), d) and
`F (C(x1 ), C(x2 ), d) < `F (C(x1 ), C(y1 ), d).
Assume that there exists y2 ∈ V (T ), y2 6∈ {x1 , x2 , y1 } so that η(y2 ) < k and η(parent(y2 )) >
k. If parent(y1 ) = parent(y2 ) and η(parent(y1 )) = k, then (C(x1 ), C(x2 ), d) ∼ =F (C(y1 ), C(y2 ), d)
and so `F (C(x1 ), C(x2 ), d) = `F (C(y1 ), C(y2 ), d).

12
Linkage-Based Hierarchical Clustering

Otherwise, let X 0 = C(x) ∪ C(y1 ) ∪ C(y2 ). By richness, there exists a distance function
d∗ that extends d|C(x) and d|(C(y1 ) ∪ C(y1 )), so that {C(x), C(y1 ) ∪ C(y2 )} is in F (X 0 , d∗ ).
Note that by locality, the node v(C(y1 ) ∪ C(y2 )) has children v(C(y1 )) and v(C(y2 )) in
F (X 0 , d∗ ). We can separate C(x) from C(y1 ) ∪ C(y2 ) in both F (X 0 , d∗ ) and F (X 0 , d|X 0 ) until
both are equal. Then by outer-consistency, cutC(x) F (X 0 , d|X 0 ) = {C(x), C(y1 ), C(y2 )} and
by locality y1 and y2 are below x. Therefore, (C(x1 ), C(x2 ), d) <F (C(y1 ), C(y2 ), d) and so
`F (C(x1 ), C(x2 ), d) < `F (C(y1 ), C(y2 ), d).

5.2 All Linkage-Based Functions are Local and Outer-Consistent

Lemma 21 Every linkage-based hierarchical clustering function is local.

Proof Let C = {C1 , C2 , . . . , Ck } be a clustering in F (X, d) = (T, M, η). Let X 0 = ∪i Ci .

For all X1 , X2 ∈ X 0 , `(X1 , X2 , d) = `(X1 , X2 , d|X 0 ). Therefore, for all 1 6 i 6 k, the
sub-dendrogram rooted at v(Ci ) in F (X, d) also appears in F (X, d0 ), with the same relative
levels.

Lemma 22 Every linkage-based hierarchical clustering function is outer-consistent.

Proof Let C = {C1 , C2 , . . . , Ck } be a Ci -cut in F (X, d) for some 1 6 i 6 k. Let d0 be (C, d)-
outer-consistent. Then for all 1 6 i 6 k, and all X1 , X2 ⊆ Ci , `(X1 , X2 , d) = `(X1 , X2 , d0 ),
while for all X1 ⊆ Ci , X2 ⊆ Cj , for any i 6= j, `(X1 , X2 , d) 6 `(X1 , X2 , d0 ) by monotonic-
ity. Therefore, for all 1 6 j 6 k, the sub-dendrogram rooted at v(Cj ) in F (X, d) also
appears in F (X, d0 ). All nodes added after these sub-dendrograms are at a higher level
than the level of v(Ci ). And since the Ci -cut is represented by nodes that occur on levels no
higher than the level of v(Ci ), the Ci -cut in F (X, d0 ) is the same as the Ci -cut in F (X, d).

5.3 Necessity of Both Properties

We now show that both the locality and outer-consistency properties are necessary for
defining linkage-based algorithms. Neither property individually is sufficient for defining
this family of algorithms. Our results above showing that all linkage-based algorithms are
both local and outer-consistent already imply that a clustering function that satisfies one,
but not both, of these requirements is not linkage-based. It remains to show that neither
of these two properties implies the other. We do so by demonstrating the existence of a
hierarchical function that satisfies locality but not outer-consistency, and one that satisfy
outer-consistency but not locality.
Consider a hierarchical clustering function F that applies average-linkage on data sets
with an even number of elements, and single-linkage on data sets consisting of an odd number
of elements. Since both average-linkage and single-linkage are linkage-based algorithms, they
are both outer-consistent. It follows that F is outer-consistent. However, this hierarchical
clustering function fails locality, as it is easy to construct a data set with an even number of

13
Ackerman and Ben-David

elements where average-linkage detects an odd-sized cluster, for which single-linkage would
produce a different dendrogram.
Now, consider the following function
1
`(X1 , X2 , d) = .
maxx∈X1 ,y∈X2 d(x, y)
The function ` is not a linkage-function since it fails the monotonicity condition. The
function ` also does not conform with the intended meaning of a linkage-function. For
instance, `(X1 , X2 , d) is smaller than `(X10 , X20 , d0 ) when all the distances between X1 and
X2 are (arbitrarily) larger than any distance between X10 and X20 . If we then consider the
hierarchical clustering function F that results by utilizing ` in a greedy fashion to construct
a dendrogram (by repeatedly merging the closest clusters according to `), then the function
F is local by the same argument as the proof of Lemma 21. We now demonstrate that F
is not outer-consistent. Consider a data set (X, d) such that for some A ⊂ X, the A-cut
of F (X, d) is a clustering with a least 3 clusters where every cluster consists of a least 2
elements. Then if we move two clusters sufficiently far away from each other and all other
data, they will be merged by the algorithm before any of the other clusters are formed, and
so the A-cut on the resulting data changes following an outer-consistent change. As such,
F is not outer-consistent.

6. Divisive Algorithms
Our formalism provides a precise sense in which linkage-based algorithms make only local
considerations, while many divisive algorithms inevitably take more global considerations
into account. This fundamental distinction between these paradigms can be used to help
select a suitable hierarchical algorithm for specific applications.
This distinction also implies that many divisive algorithms cannot be simulated by any
linkage-based algorithm, showing that the class of hierarchical algorithms is strictly richer
than the class of linkage-based algorithm (even when focusing only on the input-output
behaviour of algorithms).
A 2-clustering function F maps a data set (X, d) to a 2-partition of X. An F-Divisive
algorithm is a divisive algorithm that uses a 2-clustering function F to decide how to split
nodes. Formally,

Definition 23 (F-Divisive) A hierarchical clustering function is F-Divisive with respect

to a 2-clustering function F, if for all (X, d), F(X, d) = (T, M, η) such that for all x ∈
V (T )/leaves(T ) with children x1 and x2 , F(C(x)) = {C(x1 ), C(x2 )}.

Note that Definition 23 does not place restrictions on the level function. This allows for
some flexibility in the levels. Intuitively, it doesn’t force an order on splitting nodes.
The following property represents clustering functions that utilize contextual informa-
tion found in the remainder of the data set when partitioning a subset of the domain.

Definition 24 (Context sensitive) F is context-sensitive if there exist x, y, z, w and

distance functions d and d0 , where d0 extends d, such that F({x, y, z}, d) = {{x}, {y, z}}
and F({x, y, z, w}, d0 ) = {{x, y}, {z, w}}.

14
Linkage-Based Hierarchical Clustering

Many 2-clustering functions, including k-means, min-sum, and min-diameter are context-
sensitive (see Corollary 29, below). Natural divisive algorithms, such as bisecting k-means
(k-means-Divisive), rely on context-sensitive 2-clustering functions.
Whenever a 2-clustering algorithm is context-sensitive, then the F-divisive function is
not local.

Theorem 25 If F is context-sensitive then the F-divisive function is not local.

Proof
Since F is context-sensitive, there exists a distance functions d ⊂ d0 so that {x} and
{y, z} are the children of the root in F({x, y, z}, d), while in F({x, y, z, w}, d0 ), {x, y} and
{z, w} are the children of the root and z and w are the children of {z, w}. Therefore,
{{x, y}, {z}} is clustering in F({x, y, z, w}, d0 ). But cluster {x, y} is not in F({x, y, z}, d),
so the clustering {{x, y}, {z}} is not in F({x, y, z}, d), and so F-divisive is not local.

Applying Theorem 10, we get:

Corollary 26 If F is context-sensitive, then the F-divisive function is not linkage-based.

We say that two hierarchical algorithms disagree if they may output dendrograms with
different clusterings. Formally,

Definition 27 Two hierarchical functions F0 and F1 disagree if there exists a data set
(X, d) and a clustering C of X so that C is in Fi (X, d) but not in F1−i (X, d), for some
i ∈ {0, 1}.

Theorem 28 If F is context-sensitive, then the F-divisive function disagrees with every

linkage-based function.

Proof Let L be any linkage-based function. Since F is context-sensitive, there exists

distance functions d ⊂ d0 so that F({x, y, z}, d) = {{x}, {y, z}} and F({x, y, z, w}, d0 ) =
{{x, y}, {z, w}}.
Assume that L and F-divisive produce the same output on ({x, y, z, w}, d0 ). There-
fore, since {{x, y}, {z}} is a clustering in F-divisive({x, y, z, w}, d0 ), it is also a cluster-
ing in L({x, y, z, w}, d0 ). Since L is linkage-based, by Theorem 10, L is local. Therefore,
{{x, y}, {z}} is a clustering in L({x, y, z}, d0 ). But it is not a clustering in F-divisive({x, y, z},
d).

Corollary 29 The divisive algorithms that are based on the following 2-clustering functions
disagree with every linkage-based function: k-means, min-sum, min-diameter.

Proof Set x = 1, y = 3, z = 4, and w = 6 to show that these 2-clustering functions are

context-sensitive. The result follows by Theorem 28.

15
Ackerman and Ben-David

7. Conclusions
In this paper, we provide the first property-based characterization of hierarchical linkage-
based clustering. Our characterization shows the existence of hierarchical methods that
cannot be simulated by any linkage-based method, revealing inherent input-output differ-
ences between agglomeration and divisive hierarchical algorithms.
This work falls in the larger framework of property-based analysis of clustering algo-
rithms, which aims to provide a better understanding of these techniques as well as aid users
in the crucial task of algorithm selection. It is important to note that our characterization
is not intended to demonstrate the superiority of linkage-based methods over other hier-
archical techniques, but rather to enable users to make informed trade-offs when choosing
algorithms. In particular, properties investigated in previous work should also be consid-
ered, while future work will continue to investigate important properties with the ultimate
goal of providing users with a property-based taxonomy of popular clustering methods that
would enable selecting suitable methods for a wide range of applications.

8. Acknowledgements
We would like to thank David Loker for several helpful discussions. We would also like
to thank the anonymous referees whose comments and suggestions greatly improved this
paper.

References
M. Ackerman and S. Ben-David. Measures of clustering quality: A working set of axioms
for clustering. In Proceedings of Neural Information Processing Systems (NIPS), pages
121–128, 2008.

M. Ackerman, S. Ben-David, and D. Loker. Characterization of linkage-based clustering.

In Proceedings of The 23rd Conference on Learning Theory, pages 270–281, 2010a.

M. Ackerman, S. Ben-David, and D. Loker. Towards property-based classification of clus-

tering paradigms. Lafferty et al., pages 10–18, 2010b.

M. Ackerman, S. Ben-David, S. Branzei, and D. Loker. Weighted clustering. In Association

for the Advancement of Artificial Intelligence (AAAI), pages 858–863, 2012.

G. Carlsson and F. Mémoli. Characterization, stability and convergence of hierarchical

clustering methods. The Journal of Machine Learning Research, 11:1425–1470, 2010.

L. Fisher and J.W. Van Ness. Admissible clustering procedures. Biometrika, 58(1):91–104,
1971.

N. Jardine and R. Sibson. Mathematical taxonomy. London, 1971.

J. Kleinberg. An impossibility theorem for clustering. Proceedings of International Confer-

ences on Advances in Neural Information Processing Systems, pages 463–470, 2003.

16
Linkage-Based Hierarchical Clustering

M. Meila. Comparing clusterings: an axiomatic view. In Proceedings of the 22nd interna-

tional conference on Machine learning, pages 577–584. ACM, 2005.

J. Puzicha, T. Hofmann, and J.M. Buhmann. A theory of proximity based clustering:

Structure detection by optimization. Pattern Recognition, 33(4):617–634, 2000.

W.E. Wright. A formalization of cluster analysis. Pattern Recognition, 5(3):273–282, 1973.

R.B. Zadeh and S. Ben-David. A uniqueness theorem for clustering. In Proceedings of the
Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pages 639–646. AUAI
Press, 2009.

Applied Spatial Statistics and Econometrics 1st Edition Katarzyna Kopczewska PDF Download
No ratings yet
Applied Spatial Statistics and Econometrics 1st Edition Katarzyna Kopczewska PDF Download
62 pages
Clustering Tabular Data - Multi - Task
No ratings yet
Clustering Tabular Data - Multi - Task
1 page
ASSIGNMENT 2 (Business Analytics For Managers)
No ratings yet
ASSIGNMENT 2 (Business Analytics For Managers)
5 pages
Data Science and AI
No ratings yet
Data Science and AI
15 pages
Alternating Decision Tree: Fundamentals and Applications
From Everand
Alternating Decision Tree: Fundamentals and Applications
Fouad Sabry
No ratings yet
Ai - Notes - Unit 1 - Che
No ratings yet
Ai - Notes - Unit 1 - Che
34 pages
Flight Connection Prediction For Airline
No ratings yet
Flight Connection Prediction For Airline
22 pages
Short Questions For Hierarchical Clustering
No ratings yet
Short Questions For Hierarchical Clustering
3 pages
2018 Reinforcement Learning For Solving VRP
No ratings yet
2018 Reinforcement Learning For Solving VRP
21 pages
Zero Touch Networks
No ratings yet
Zero Touch Networks
115 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
3 pages
Statistical Comparison of Classifiers Through Bayesian Hierarchical Modelling
No ratings yet
Statistical Comparison of Classifiers Through Bayesian Hierarchical Modelling
24 pages
10.1007/s10994 015 5486 Z
No ratings yet
10.1007/s10994 015 5486 Z
20 pages
Unit 5 Clustering-2
No ratings yet
Unit 5 Clustering-2
28 pages
Benavoli 14
No ratings yet
Benavoli 14
9 pages
1 s2.0 S1047320308001144 Main
No ratings yet
1 s2.0 S1047320308001144 Main
7 pages
Acoustic Emission Monitoring of Bridge Structures in The Field and Laboratory
No ratings yet
Acoustic Emission Monitoring of Bridge Structures in The Field and Laboratory
10 pages
CLOPE: A Fast and Effective Clustering Algorithm For Transactional Data
No ratings yet
CLOPE: A Fast and Effective Clustering Algorithm For Transactional Data
6 pages
Mukhopadhyay 2015
No ratings yet
Mukhopadhyay 2015
46 pages
Class Topology
No ratings yet
Class Topology
15 pages
2008 Solving The One-Dimensional Bin Packing Problem With A Weight
No ratings yet
2008 Solving The One-Dimensional Bin Packing Problem With A Weight
9 pages
1 s2.0 S0167865516303324 Main
No ratings yet
1 s2.0 S0167865516303324 Main
7 pages
1987 Vehicle Routing With Time Windows
No ratings yet
1987 Vehicle Routing With Time Windows
9 pages
Three-Way K-Means - Integrating K-Means and Three-Way Decision
No ratings yet
Three-Way K-Means - Integrating K-Means and Three-Way Decision
11 pages
Explainable AI: Beware of Inmates Running The Asylum
No ratings yet
Explainable AI: Beware of Inmates Running The Asylum
7 pages
Flow Network Based Generative Models For Non-Iterative Diverse Candidate Generation
No ratings yet
Flow Network Based Generative Models For Non-Iterative Diverse Candidate Generation
25 pages
Deep Neural Networks and Tabular Data: A Survey
No ratings yet
Deep Neural Networks and Tabular Data: A Survey
22 pages
Clustering Analysis of E-Waste Management in BRICS and G7 Countries
No ratings yet
Clustering Analysis of E-Waste Management in BRICS and G7 Countries
10 pages
Algorithm-Agnostic Feature Attributions For Clustering
No ratings yet
Algorithm-Agnostic Feature Attributions For Clustering
24 pages
!discussion 1
No ratings yet
!discussion 1
8 pages
RFM Model For Customer Purchase Behaviour Using K-Means Algorithm
No ratings yet
RFM Model For Customer Purchase Behaviour Using K-Means Algorithm
55 pages
Interpretable Clustering Via Optimal Trees: Prof. Dimitris Bertsimas
No ratings yet
Interpretable Clustering Via Optimal Trees: Prof. Dimitris Bertsimas
8 pages
2002 New Heuristics For One-Dimensional Bin-Packing
No ratings yet
2002 New Heuristics For One-Dimensional Bin-Packing
19 pages
Towards A Grounded Dialog Model For Explainable Artificial Intelligence
No ratings yet
Towards A Grounded Dialog Model For Explainable Artificial Intelligence
15 pages
2016 Hybrid Approach For The Two-Dimensional Bin Packing Problem
No ratings yet
2016 Hybrid Approach For The Two-Dimensional Bin Packing Problem
11 pages
Linktransformer:: A Unified Package For Record Linkage With Transformer Language Models
No ratings yet
Linktransformer:: A Unified Package For Record Linkage With Transformer Language Models
16 pages
2007 What You Should Know About The Vehicle Routing Problem
No ratings yet
2007 What You Should Know About The Vehicle Routing Problem
9 pages
Scalable Hierarchical Agglomerative Clustering
No ratings yet
Scalable Hierarchical Agglomerative Clustering
11 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
4 pages
TDSC Choo 221
No ratings yet
TDSC Choo 221
12 pages
Large Language Models and Where To Use Them - Part 2
No ratings yet
Large Language Models and Where To Use Them - Part 2
12 pages
ИСПИТ AI-900 (QnA)
No ratings yet
ИСПИТ AI-900 (QnA)
39 pages
Unsupervised Learning
No ratings yet
Unsupervised Learning
83 pages
Umd-Dsba 1667369436102
No ratings yet
Umd-Dsba 1667369436102
17 pages
Report 2
No ratings yet
Report 2
7 pages
Clustering Jain Dubes (1) - 69-103
No ratings yet
Clustering Jain Dubes (1) - 69-103
5 pages
001 - Clustering - Jain - Dubes (1) - 69-103
No ratings yet
001 - Clustering - Jain - Dubes (1) - 69-103
40 pages
FDS-Content Beyond Syllabus
No ratings yet
FDS-Content Beyond Syllabus
15 pages
Hierarchical
No ratings yet
Hierarchical
31 pages
Lec.4.D. M. Spring 2025
No ratings yet
Lec.4.D. M. Spring 2025
19 pages
NIPS 2010 Towards Property Based Classification of Clustering Paradigms Paper
No ratings yet
NIPS 2010 Towards Property Based Classification of Clustering Paradigms Paper
9 pages
Mastering Data Structures and Algorithms in C and C++
From Everand
Mastering Data Structures and Algorithms in C and C++
Sachin Naha
No ratings yet
Introductio 1
No ratings yet
Introductio 1
26 pages
Chapter 9-Cluster Analysis
No ratings yet
Chapter 9-Cluster Analysis
12 pages
Density Based Clustering Algorithm
No ratings yet
Density Based Clustering Algorithm
25 pages
Ahc 1
No ratings yet
Ahc 1
6 pages
ClarkeGorley 2006 PrimerV6UserManual
No ratings yet
ClarkeGorley 2006 PrimerV6UserManual
193 pages
DP 100 PDF
No ratings yet
DP 100 PDF
45 pages
Unit5 CSM ML
No ratings yet
Unit5 CSM ML
32 pages
Toward Theoretical Foundations
No ratings yet
Toward Theoretical Foundations
6 pages
Lecture 0
No ratings yet
Lecture 0
33 pages
MA Unit 5
No ratings yet
MA Unit 5
7 pages
ML Unit-5
No ratings yet
ML Unit-5
31 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
23 pages
ML TCS Lecture Hierarchical 1608
No ratings yet
ML TCS Lecture Hierarchical 1608
41 pages
Unit 3 Clustering
No ratings yet
Unit 3 Clustering
101 pages
Chapter-5-Cluster Analysis PDF
No ratings yet
Chapter-5-Cluster Analysis PDF
5 pages
L08 Hierachical Agglomerative Clustering
No ratings yet
L08 Hierachical Agglomerative Clustering
41 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
26 pages
DM Witten 03
No ratings yet
DM Witten 03
56 pages
ML Lec-17
No ratings yet
ML Lec-17
12 pages
ML Unit-4
No ratings yet
ML Unit-4
40 pages
Heirarchical Clustering
No ratings yet
Heirarchical Clustering
22 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
41 pages
Research 1
No ratings yet
Research 1
36 pages
3.2 HierCluster
No ratings yet
3.2 HierCluster
17 pages
Hierarchical Clustering in Machine Learning
No ratings yet
Hierarchical Clustering in Machine Learning
10 pages
Un Supervised Learning
No ratings yet
Un Supervised Learning
22 pages
Unit 3 DVA
No ratings yet
Unit 3 DVA
50 pages
Irs Question Papers
No ratings yet
Irs Question Papers
6 pages
RK Clustering
No ratings yet
RK Clustering
77 pages
Hierarchical Clustering Unit 4 ML
No ratings yet
Hierarchical Clustering Unit 4 ML
14 pages
Abhishek Data Scientist Resume
0% (1)
Abhishek Data Scientist Resume
5 pages
Machine Learning Functionalities
No ratings yet
Machine Learning Functionalities
58 pages
Mfuzzgui PDF
No ratings yet
Mfuzzgui PDF
7 pages
Visualizing Data Structures
From Everand
Visualizing Data Structures
Rhonda Hoenigman
No ratings yet
Clustring
No ratings yet
Clustring
20 pages
Voronoi Diagrams A Survey of A Fundamental Geometric Data Structure
No ratings yet
Voronoi Diagrams A Survey of A Fundamental Geometric Data Structure
61 pages
DWM Exp8 127 133 137
No ratings yet
DWM Exp8 127 133 137
4 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
11 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
34 pages
Hierarchical Clustering - 11.3.2024 - Full
No ratings yet
Hierarchical Clustering - 11.3.2024 - Full
14 pages
CLUSTERING
No ratings yet
CLUSTERING
16 pages
9536 DWM Expt 7 Merged
No ratings yet
9536 DWM Expt 7 Merged
14 pages
SSRN Id3768295
No ratings yet
SSRN Id3768295
7 pages
Agglomerative Hierarchical Clustering Algorithm-A Review: K.Sasirekha, P.Baby
No ratings yet
Agglomerative Hierarchical Clustering Algorithm-A Review: K.Sasirekha, P.Baby
3 pages
Chinninti Venkata Assessment Machine Learning
No ratings yet
Chinninti Venkata Assessment Machine Learning
11 pages
Lecture 6
No ratings yet
Lecture 6
55 pages
Hierarchical Clustering: Class Program University Semester Lecturer Sources
100% (1)
Hierarchical Clustering: Class Program University Semester Lecturer Sources
33 pages
Module-5-Cluster Analysis-Part1
No ratings yet
Module-5-Cluster Analysis-Part1
24 pages
03 Hierarchical Clustering
100% (1)
03 Hierarchical Clustering
15 pages
On Clustering Using Random Walks: Abstract. We Propose A Novel Approach To Clustering, Based On Deter
No ratings yet
On Clustering Using Random Walks: Abstract. We Propose A Novel Approach To Clustering, Based On Deter
24 pages
Hierarchical Clustering PDF
No ratings yet
Hierarchical Clustering PDF
5 pages
Performance Evaluation of Distance Metrics in The Clustering Algorithms
No ratings yet
Performance Evaluation of Distance Metrics in The Clustering Algorithms
14 pages
6902 An Applied Algorithmic Foundation For Hierarchical Clustering
No ratings yet
6902 An Applied Algorithmic Foundation For Hierarchical Clustering
10 pages
What Is Cluster Analysis?
No ratings yet
What Is Cluster Analysis?
20 pages
Hierarchical Clustering
No ratings yet
Hierarchical Clustering
8 pages
Clustering Hierarchical PDF
No ratings yet
Clustering Hierarchical PDF
31 pages
A Density Clustering Based On Outlier
No ratings yet
A Density Clustering Based On Outlier
6 pages
Landscape of Clustering Algorithms: Anil K. Jain, Alexander Topchy, Martin H.C. Law, and Joachim M. Buhmann
No ratings yet
Landscape of Clustering Algorithms: Anil K. Jain, Alexander Topchy, Martin H.C. Law, and Joachim M. Buhmann
4 pages
Agnes
No ratings yet
Agnes
25 pages
Clustering Hierarchical Algorithms
100% (1)
Clustering Hierarchical Algorithms
21 pages
Clustering: EE-671 Prof L. Behera, IITK
No ratings yet
Clustering: EE-671 Prof L. Behera, IITK
33 pages