scipy.cluster.hierarchy.

maxRstat#

scipy.cluster.hierarchy.maxRstat(Z, R, i)[source]#

Return the maximum statistic for each non-singleton cluster and its children.

Parameters:
Zarray_like

The hierarchical clustering encoded as a matrix. See linkage for more information.

Rarray_like

The inconsistency matrix.

iint

The column of R to use as the statistic.

Returns:
MRndarray

Calculates the maximum statistic for the i’th column of the inconsistency matrix R for each non-singleton cluster node. MR[j] is the maximum over R[Q(j)-n, i], where Q(j) the set of all node ids corresponding to nodes below and including j.

See also

linkage

for a description of what a linkage matrix is.

inconsistent

for the creation of a inconsistency matrix.

Notes

Array API Standard Support

maxRstat has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

Library

CPU

GPU

NumPy

n/a

CuPy

n/a

PyTorch

JAX

Dask

⚠️ merges chunks

n/a

See Support for the array API standard for more information.

Examples

>>> from scipy.cluster.hierarchy import median, inconsistent, maxRstat
>>> from scipy.spatial.distance import pdist

Given a data set X, we can apply a clustering method to obtain a linkage matrix Z. scipy.cluster.hierarchy.inconsistent can be also used to obtain the inconsistency matrix R associated to this clustering process:

>>> X = [[0, 0], [0, 1], [1, 0],
...      [0, 4], [0, 3], [1, 4],
...      [4, 0], [3, 0], [4, 1],
...      [4, 4], [3, 4], [4, 3]]
>>> Z = median(pdist(X))
>>> R = inconsistent(Z)
>>> R
array([[1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.        , 0.        , 1.        , 0.        ],
       [1.05901699, 0.08346263, 2.        , 0.70710678],
       [1.05901699, 0.08346263, 2.        , 0.70710678],
       [1.05901699, 0.08346263, 2.        , 0.70710678],
       [1.05901699, 0.08346263, 2.        , 0.70710678],
       [1.74535599, 1.08655358, 3.        , 1.15470054],
       [1.91202266, 1.37522872, 3.        , 1.15470054],
       [3.25      , 0.25      , 3.        , 0.        ]])

scipy.cluster.hierarchy.maxRstat can be used to compute the maximum value of each column of R, for each non-singleton cluster and its children:

>>> maxRstat(Z, R, 0)
array([1.        , 1.        , 1.        , 1.        , 1.05901699,
       1.05901699, 1.05901699, 1.05901699, 1.74535599, 1.91202266,
       3.25      ])
>>> maxRstat(Z, R, 1)
array([0.        , 0.        , 0.        , 0.        , 0.08346263,
       0.08346263, 0.08346263, 0.08346263, 1.08655358, 1.37522872,
       1.37522872])
>>> maxRstat(Z, R, 3)
array([0.        , 0.        , 0.        , 0.        , 0.70710678,
       0.70710678, 0.70710678, 0.70710678, 1.15470054, 1.15470054,
       1.15470054])