Affinity Propagation in ML | To find the number of clusters

Affinity Propagation (AP) is a clustering algorithm that identifies data clusters by sending messages between data points. Unlike traditional clustering methods like K-Means, it doesn’t require specifying the number of clusters beforehand; instead, it determines the number of clusters automatically from the data. In this article, we walk through the mathematical formulation of Affinity Propagation, its core concepts and a Scikit-Learn implementation.

Key Parameters Influencing Clustering

To understand the math behind Affinity Propagation, we first need to look at the two key parameters that influence the clustering process:

1. Preference

  • It controls the number of exemplars (cluster centers) chosen by the algorithm.
  • Higher preference values lead to more exemplars and therefore more clusters.

2. Damping Factor

  • The damping factor helps stabilize the algorithm by limiting how much each update can change between iterations.
  • Without damping, the updates can oscillate between values, making it difficult for the algorithm to converge to a final solution.
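
Concretely, if \lambda is the damping factor (typically 0.5 \le \lambda < 1), each new message value m_{\text{new}} is blended with the value from the previous iteration:

m_{t+1} = \lambda \cdot m_t + (1 - \lambda) \cdot m_{\text{new}}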

These two parameters largely determine the stability and effectiveness of the algorithm as it iterates.

Mathematical Formulation

The core idea behind Affinity Propagation is based on two matrices: responsibility and availability. The algorithm iteratively updates these matrices to find the exemplars (cluster centers) that best represent the data.

Similarity Matrix (Starting Point)

We start with a similarity matrix S where S(i, j) represents the similarity between two points x_i​ and x_j​. The similarity is calculated as the negative squared Euclidean distance:

S(i, j) = -\lVert x_i - x_j \rVert^2

The diagonal elements of this matrix, S(i, i), represent the preference for each point to become an exemplar.
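
As a minimal NumPy sketch (with made-up toy points, not the article's dataset), the similarity matrix and preferences can be set up as follows; the median similarity is a common default preference when no prior knowledge is available:

Python
import numpy as np

# Toy data: five 2-D points (illustrative values only)
X = np.array([[1.0, 1.0], [1.2, 0.8], [-1.0, -1.0], [-0.9, -1.1], [1.1, -0.9]])

# S(i, j) = -||x_i - x_j||^2 (negative squared Euclidean distance)
diff = X[:, None, :] - X[None, :, :]
S = -np.sum(diff ** 2, axis=-1)

# The diagonal holds the preferences; the median of the off-diagonal
# similarities is a common default choice
mask = ~np.eye(len(X), dtype=bool)
np.fill_diagonal(S, np.median(S[mask]))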

Responsibility

The responsibility matrix R is updated to reflect how well-suited point x_k​ is to serve as the exemplar for point x_i​, relative to other candidate exemplars.

This is calculated as:

r(i, k) \leftarrow s(i, k) - \max_{k' \neq k} \left\{ a(i, k') + s(i, k') \right\}

Here r(i, k) quantifies how strongly point x_i​ favors point x_k​ as its exemplar, after accounting for all competing candidates k'.
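
As a sketch of how this update can be vectorized with NumPy (assuming S and A are n × n arrays, with A initialized to zeros on the first iteration):

Python
import numpy as np

def responsibility_update(S, A):
    # r(i, k) <- s(i, k) - max_{k' != k} (a(i, k') + s(i, k'))
    n = S.shape[0]
    AS = A + S
    idx = np.argmax(AS, axis=1)              # strongest candidate per row
    first_max = AS[np.arange(n), idx]
    AS[np.arange(n), idx] = -np.inf          # mask it to expose the runner-up
    second_max = AS.max(axis=1)

    R = S - first_max[:, None]               # subtract the best competitor...
    R[np.arange(n), idx] = S[np.arange(n), idx] - second_max  # ...or the runner-up
    return R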

Availability

The availability matrix A is updated to represent how appropriate it would be for point x_i​ to choose point x_k​ as its exemplar, considering the support x_k​ has gathered from other points.

For points i ≠ k, this is calculated as:

a(i, k) \leftarrow \min \left( 0, r(k, k) + \sum_{i' \notin \{i, k\}} \max \left( 0, r(i', k) \right) \right)

while the self-availability is:

a(k, k) \leftarrow \sum_{i' \neq k} \max \left( 0, r(i', k) \right)
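
A matching NumPy sketch of the availability update (again an illustrative implementation, not Scikit-Learn's internal one):

Python
import numpy as np

def availability_update(R):
    # a(i, k) <- min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
    Rp = np.maximum(R, 0)                    # max(0, r(i', k))
    np.fill_diagonal(Rp, R.diagonal())       # keep r(k, k) unclipped
    col_sums = Rp.sum(axis=0)                # r(k, k) + sum_{i' != k} max(0, r(i', k))

    A = np.minimum(0, col_sums[None, :] - Rp)      # drop each row's own term
    np.fill_diagonal(A, col_sums - R.diagonal())   # a(k, k) excludes r(k, k)
    return A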

Convergence

The responsibility and availability matrices are updated iteratively until convergence, at which point the algorithm selects the exemplars. The final step is to identify the points for which the sum of self-responsibility and self-availability is positive:

r(i, i) + a(i, i) > 0

Points that satisfy this condition become the exemplars, and clusters are formed by assigning every remaining point to its most similar exemplar.
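
Putting the pieces together, a minimal damped message-passing loop (reusing the responsibility_update and availability_update sketches above and the similarity matrix S from earlier) might look like this:

Python
import numpy as np

damping = 0.9                                # typical value between 0.5 and 1
A = np.zeros_like(S)
R = np.zeros_like(S)

for _ in range(200):                         # fixed iteration budget for simplicity
    R = damping * R + (1 - damping) * responsibility_update(S, A)
    A = damping * A + (1 - damping) * availability_update(R)

# Exemplars: points whose self-responsibility plus self-availability is positive
exemplars = np.where(np.diag(R) + np.diag(A) > 0)[0]

# Assign each point to its most similar exemplar (a full implementation
# would also force each exemplar to label itself)
labels = exemplars[np.argmax(S[:, exemplars], axis=1)]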

Visualizing the Process

In Affinity Propagation, messages are passed between data points in two main steps:

  • Responsibility Messages: Each data point sends responsibility messages to its candidate exemplars, indicating how well-suited each candidate is to serve as that point's exemplar.
  • Availability Messages: Each candidate exemplar replies with availability messages, reflecting how appropriate it would be for the data point to choose it, given the support the candidate has gathered from other points.

This iterative exchange, with responsibilities flowing from points to candidates and availabilities flowing back, is what ultimately settles on the final exemplars for clustering.

Python Implementation with Scikit-Learn

Here we generate a synthetic dataset with Scikit-Learn's make_blobs, cluster it with AffinityPropagation and visualize the result with Matplotlib.

  • AffinityPropagation(preference=-50): Initializes the algorithm with a preference value of -50, which influences the number of exemplars (cluster centers) the algorithm selects.
  • n_clusters_: The number of clusters, obtained by counting the exemplars the algorithm identified.
  • cluster_centers_indices_: Retrieves the indices of the data points that serve as cluster centers (exemplars).
Python
from sklearn.cluster import AffinityPropagation
from sklearn import metrics
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
from itertools import cycle

# Generate synthetic data around three distinct centers
centers = [[1, 1], [-1, -1], [1, -1]]
X, labels_true = make_blobs(n_samples=400, centers=centers,
                            cluster_std=0.5, random_state=0)

# Fit Affinity Propagation; the preference value influences
# how many exemplars (and hence clusters) are found
af = AffinityPropagation(preference=-50, random_state=0).fit(X)
cluster_centers_indices = af.cluster_centers_indices_
labels = af.labels_

n_clusters_ = len(cluster_centers_indices)

plt.close('all')
plt.figure(1)
plt.clf()

colors = cycle('bgrcmyk')

for k, col in zip(range(n_clusters_), colors):
    class_members = labels == k
    cluster_center = X[cluster_centers_indices[k]]
    plt.plot(X[class_members, 0], X[class_members, 1], col + '.')
    plt.plot(cluster_center[0], cluster_center[1], 'o',
             markerfacecolor=col, markeredgecolor='k',
             markersize=14)

    # Draw a line from each cluster member to its exemplar
    for x in X[class_members]:
        plt.plot([cluster_center[0], x[0]],
                 [cluster_center[1], x[1]], col)

plt.title('Estimated number of clusters: %d' % n_clusters_)
plt.show()

Output:

Clustering and visualization.

The algorithm automatically detects 3 clusters without needing to pre-define the number of clusters.
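
Since make_blobs also returns ground-truth labels (and the metrics module is already imported above), the clustering can optionally be scored as well, for example:

Python
print("Adjusted Rand Index: %0.3f"
      % metrics.adjusted_rand_score(labels_true, labels))
print("Silhouette Coefficient: %0.3f"
      % metrics.silhouette_score(X, labels, metric='sqeuclidean'))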

Advantages of Affinity Propagation

  1. No need to pre-define clusters: The algorithm automatically finds the number of clusters based on the data.
  2. Handles varying cluster sizes and shapes: Unlike K-Means, it can identify clusters of different sizes and densities.
  3. Exemplars as cluster centers: Data points themselves serve as exemplars (cluster centers), which makes the results more interpretable.
  4. Suitable for complex datasets: Performs well on datasets where traditional clustering methods like K-Means might struggle.

Limitations of Affinity Propagation

  1. Computationally expensive: The algorithm can be slow for large datasets due to iterative updates of matrices.
  2. Sensitive to parameters: The preference and damping factor values significantly impact the results and may require tuning.
  3. Memory usage: The need to store and update large similarity matrices can be memory-intensive.
  4. Convergence issues: The algorithm may not converge properly if the damping factor is not tuned correctly.

With Affinity Propagation, one can identify clusters in complex datasets without predefining their number, provided the preference and damping parameters are tuned with the algorithm's computational cost in mind.

