Affinity Propagation in ML | To find the number of clusters

Affinity Propagation (AP) is a clustering algorithm that identifies data clusters by sending messages between data points. Unlike traditional clustering methods like K-Means, it doesn’t require specifying the number of clusters beforehand; instead, it determines the number of clusters automatically from the data. In this article, we walk through the mathematical formulation of Affinity Propagation, its core concepts and a Scikit-Learn implementation.

Key Parameters Influencing Clustering

To understand the math behind Affinity Propagation, we first need to look at the two key parameters that influence the clustering process:

1. Preference

  • It controls the number of exemplars (cluster centers) chosen by the algorithm.
  • Higher preference values lead to more exemplars and therefore more clusters.

2. Damping Factor

  • The damping factor helps stabilize the algorithm by limiting how much each update can change between iterations.
  • Without damping, the updates can oscillate between values, making it difficult for the algorithm to converge to a final solution.
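
Concretely, if \lambda is the damping factor (typically 0.5 \le \lambda < 1), each new message value m_{\text{new}} is blended with the value from the previous iteration:

m_{t+1} = \lambda \cdot m_t + (1 - \lambda) \cdot m_{\text{new}}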

These two parameters largely determine the stability and effectiveness of the algorithm as it iterates.

Mathematical Formulation

The core idea behind Affinity Propagation is based on two matrices: responsibility and availability. The algorithm iteratively updates these matrices to find the exemplars (cluster centers) that best represent the data.

Similarity Matrix (Starting Point)

We start with a similarity matrix S where S(i, j) represents the similarity between two points x_i​ and x_j​. The similarity is calculated as the negative squared Euclidean distance:

S(i, j) = -\lVert x_i - x_j \rVert^2

The diagonal elements of this matrix, S(i, i), represent the preference for each point to become an exemplar.
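
As a minimal NumPy sketch (with made-up toy points, not the article's dataset), the similarity matrix and preferences can be set up as follows; the median similarity is a common default preference when no prior knowledge is available:

Python
import numpy as np

# Toy data: five 2-D points (illustrative values only)
X = np.array([[1.0, 1.0], [1.2, 0.8], [-1.0, -1.0], [-0.9, -1.1], [1.1, -0.9]])

# S(i, j) = -||x_i - x_j||^2 (negative squared Euclidean distance)
diff = X[:, None, :] - X[None, :, :]
S = -np.sum(diff ** 2, axis=-1)

# The diagonal holds the preferences; the median of the off-diagonal
# similarities is a common default choice
mask = ~np.eye(len(X), dtype=bool)
np.fill_diagonal(S, np.median(S[mask]))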

Responsibility

The responsibility matrix R is updated to reflect how well-suited point x_k​ is to serve as the exemplar for point x_i​, relative to other candidate exemplars.

This is calculated as:

r(i, k) \leftarrow s(i, k) - \max_{k' \neq k} \left\{ a(i, k') + s(i, k') \right\}

Here r(i, k) quantifies how strongly point x_i​ favors point x_k​ as its exemplar, after accounting for all competing candidates k'.
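
As a sketch of how this update can be vectorized with NumPy (assuming S and A are n × n arrays, with A initialized to zeros on the first iteration):

Python
import numpy as np

def responsibility_update(S, A):
    # r(i, k) <- s(i, k) - max_{k' != k} (a(i, k') + s(i, k'))
    n = S.shape[0]
    AS = A + S
    idx = np.argmax(AS, axis=1)              # strongest candidate per row
    first_max = AS[np.arange(n), idx]
    AS[np.arange(n), idx] = -np.inf          # mask it to expose the runner-up
    second_max = AS.max(axis=1)

    R = S - first_max[:, None]               # subtract the best competitor...
    R[np.arange(n), idx] = S[np.arange(n), idx] - second_max  # ...or the runner-up
    return R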

Availability

The availability matrix A is updated to represent how appropriate it would be for point x_i​ to choose point x_k​ as its exemplar, considering the support x_k​ has gathered from other points.

For points i ≠ k, this is calculated as:

a(i, k) \leftarrow \min \left( 0, r(k, k) + \sum_{i' \notin \{i, k\}} \max \left( 0, r(i', k) \right) \right)

while the self-availability is:

a(k, k) \leftarrow \sum_{i' \neq k} \max \left( 0, r(i', k) \right)
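
A matching NumPy sketch of the availability update (again an illustrative implementation, not Scikit-Learn's internal one):

Python
import numpy as np

def availability_update(R):
    # a(i, k) <- min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
    Rp = np.maximum(R, 0)                    # max(0, r(i', k))
    np.fill_diagonal(Rp, R.diagonal())       # keep r(k, k) unclipped
    col_sums = Rp.sum(axis=0)                # r(k, k) + sum_{i' != k} max(0, r(i', k))

    A = np.minimum(0, col_sums[None, :] - Rp)      # drop each row's own term
    np.fill_diagonal(A, col_sums - R.diagonal())   # a(k, k) excludes r(k, k)
    return A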

Convergence

The responsibility and availability matrices are updated iteratively until convergence, at which point the algorithm selects the exemplars. The final step is to identify the points for which the sum of self-responsibility and self-availability is positive:

r(i, i) + a(i, i) > 0

Points that satisfy this condition become the exemplars, and clusters are formed by assigning every remaining point to its most similar exemplar.
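
Putting the pieces together, a minimal damped message-passing loop (reusing the responsibility_update and availability_update sketches above and the similarity matrix S from earlier) might look like this:

Python
import numpy as np

damping = 0.9                                # typical value between 0.5 and 1
A = np.zeros_like(S)
R = np.zeros_like(S)

for _ in range(200):                         # fixed iteration budget for simplicity
    R = damping * R + (1 - damping) * responsibility_update(S, A)
    A = damping * A + (1 - damping) * availability_update(R)

# Exemplars: points whose self-responsibility plus self-availability is positive
exemplars = np.where(np.diag(R) + np.diag(A) > 0)[0]

# Assign each point to its most similar exemplar (a full implementation
# would also force each exemplar to label itself)
labels = exemplars[np.argmax(S[:, exemplars], axis=1)]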

Visualizing the Process

In Affinity Propagation, messages are passed between data points in two main steps:

  • Responsibility Messages: Each data point sends responsibility messages to its candidate exemplars, indicating how well-suited each candidate is to serve as that point's exemplar.
  • Availability Messages: Each candidate exemplar replies with availability messages, reflecting how appropriate it would be for the data point to choose it, given the support the candidate has gathered from other points.

This iterative exchange, with responsibilities flowing from points to candidates and availabilities flowing back, is what ultimately settles on the final exemplars for clustering.

Python Implementation with Scikit-Learn

Here we generate a synthetic dataset with Scikit-Learn's make_blobs, cluster it with AffinityPropagation and visualize the result with Matplotlib.

  • AffinityPropagation(preference=-50): Initializes the algorithm with a preference value of -50, which influences the number of exemplars (cluster centers) the algorithm selects.
  • n_clusters_: The number of clusters, obtained by counting the exemplars the algorithm identified.
  • cluster_centers_indices_: Retrieves the indices of the data points that serve as cluster centers (exemplars).
Python
from sklearn.cluster import AffinityPropagation
from sklearn import metrics
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
from itertools import cycle

# Generate synthetic data around three distinct centers
centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=400, centers=centers,
                            cluster_std=0.5, random_state=0)

# Fit Affinity Propagation; the preference value influences
# how many exemplars (and hence clusters) are found
af = AffinityPropagation(preference=-50, random_state=0).fit(X)
cluster_centers_indices = af.cluster_centers_indices_
labels = af.labels_

n_clusters_ = len(cluster_centers_indices)

plt.close('all')
plt.figure(1)
plt.clf()

colors = cycle('bgrcmyk')

for k, col in zip(range(n_clusters_), colors):
    class_members = labels == k
    cluster_center = X[cluster_centers_indices[k]]
    plt.plot(X[class_members, 0], X[class_members, 1], col + '.')
    plt.plot(cluster_center[0], cluster_center[1], 'o',
             markerfacecolor=col, markeredgecolor='k',
             markersize=14)

    # Draw a line from each cluster member to its exemplar
    for x in X[class_members]:
        plt.plot([cluster_center[0], x[0]],
                 [cluster_center[1], x[1]], col)

plt.title('Estimated number of clusters: %d' % n_clusters_)
plt.show()

Output:

Clustering and visualization.

The algorithm automatically detects 3 clusters without needing to pre-define the number of clusters.
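
Since make_blobs also returns ground-truth labels (and the metrics module is already imported above), the clustering can optionally be scored as well, for example:

Python
print("Adjusted Rand Index: %0.3f"
      % metrics.adjusted_rand_score(labels_true, labels))
print("Silhouette Coefficient: %0.3f"
      % metrics.silhouette_score(X, labels, metric='sqeuclidean'))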

Advantages of Affinity Propagation

  1. No need to pre-define clusters: The algorithm automatically finds the number of clusters based on the data.
  2. Handles varying cluster sizes and shapes: Unlike K-Means, it can identify clusters of different sizes and densities.
  3. Exemplars as cluster centers: Data points themselves serve as exemplars (cluster centers), which makes the results more interpretable.
  4. Suitable for complex datasets: Performs well on datasets where traditional clustering methods like K-Means might struggle.

Limitations of Affinity Propagation

  1. Computationally expensive: The algorithm can be slow for large datasets due to iterative updates of matrices.
  2. Sensitive to parameters: The preference and damping factor values significantly impact the results and may require tuning.
  3. Memory usage: The need to store and update large similarity matrices can be memory-intensive.
  4. Convergence issues: The algorithm may not converge properly if the damping factor is not tuned correctly.

With Affinity Propagation, one can identify clusters in complex datasets without predefining their number, provided the preference and damping parameters are tuned with the algorithm's computational cost in mind.

