Finding the Nearest Number in a DataFrame Using Pandas

Last Updated : 06 Jan, 2025

When working with data, pandas offers several ways to find the number closest to a given target value in a dataset, using methods such as argsort() and idxmin() together with slicing.

Method 1: Using 'argsort' to Find the Nearest Number

Python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33

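# Absolute distance of every value from the target
differences = np.abs(df['values'] - target)
# argsort() returns positions ordered by distance; read the first one positionally
nearest_index = differences.argsort().iloc[0]

nearest_value = df['values'].iloc[nearest_index]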
print(f"Nearest value to {target} is {nearest_value}")

Output:

Nearest value to 33 is 30

Here we compute the absolute difference between the target and each value in the column using np.abs(). argsort() then returns the positions that would sort those differences in ascending order.

This approach is helpful when we need the position of the closest number in a dataset. The first element of the argsort() result is the position of the smallest difference, and therefore of the closest number, so passing it to iloc returns the nearest value. Note that this sorts every difference, which is more work than necessary when only a single nearest value is required.
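
Because argsort() yields positions rather than index labels, it pairs naturally with iloc. The following is a minimal sketch of that idea, assuming a hypothetical Series with string labels (the labels 'a' to 'e' are made up for illustration); it sorts the underlying NumPy array so that positions and labels cannot be confused.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same values as above, but with string labels
# instead of the default 0..n-1 index (an assumption for illustration).
s = pd.Series([10, 20, 30, 40, 50], index=list("abcde"))
target = 33

# Absolute distance of every value from the target
differences = (s - target).abs()

# np.argsort on the raw array returns positions ordered by distance
order = np.argsort(differences.to_numpy())
nearest_pos = int(order[0])

# Positions go with .iloc; the label can be read from the index if needed
nearest_value = s.iloc[nearest_pos]
nearest_label = s.index[nearest_pos]

print(f"position={nearest_pos}, label={nearest_label}, value={nearest_value}")
```

With this data the nearest value is 30, found at position 2 under label 'c'.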

Method 2: Using 'idxmin()' to Find the Nearest Number

Python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33


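# Absolute distance of every value from the target
differences = np.abs(df['values'] - target)
# idxmin() returns the index label of the smallest difference
nearest_index = differences.idxmin()

# idxmin() gives a label, so look the value up with .loc rather than .iloc
nearest_value = df['values'].loc[nearest_index]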
print(f"Nearest value to {target} is {nearest_value}")

Output:

Nearest value to 33 is 30

Here too we first compute the absolute difference between the target and each value in the dataset, but instead of sorting we call idxmin() directly on the absolute differences to get the index of the smallest difference.

Because idxmin() returns the index of the smallest value directly, it is useful when we only need the single nearest value. It needs one pass over the data rather than a full sort, so it is generally faster, which matters most on large datasets where sorting costs more time and computing power.
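
One detail worth keeping in mind is that idxmin() returns an index label, not a position. With the default 0..n-1 index the two coincide, but they differ as soon as the DataFrame has any other index. The sketch below assumes a hypothetical DataFrame whose index starts at 100 (chosen only for illustration) and looks the value up with loc.

```python
import pandas as pd

# Hypothetical data: the index labels 100..104 are an assumption,
# chosen so that labels and positions differ.
df = pd.DataFrame(
    {'values': [10, 20, 30, 40, 50]},
    index=[100, 101, 102, 103, 104],
)
target = 33

# Absolute distance of every value from the target
differences = (df['values'] - target).abs()

# idxmin() returns the label of the row with the smallest difference
nearest_label = differences.idxmin()

# A label must be looked up with .loc; .iloc[102] would be out of bounds here
nearest_value = df.loc[nearest_label, 'values']

print(f"Nearest value to {target} is {nearest_value} (label {nearest_label})")
```

If two values are equally close to the target, idxmin() returns the label of the first one it encounters.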

Method 3: Finding the N Nearest Numbers Using 'argsort()' with Slicing

Python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

# Target number
target = 33
N = 3 # Number of nearest values you want


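# Absolute distance of every value from the target
differences = np.abs(df['values'] - target)
# Positions of the N smallest differences, nearest first
nearest_indices = differences.argsort().iloc[:N]

nearest_values = df['values'].iloc[nearest_indices]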
print(f"The {N} nearest values to {target} are {nearest_values.tolist()}")

Output:

The 3 nearest values to 33 are [30, 40, 20] 

Sometimes we need to find the N nearest values to a given target. To do this, we can use argsort() with slicing to extract the N closest values. It works the same way as Method 1, but here we take the first N entries of the argsort() result, which gives the positions of the N smallest differences, ordered from nearest to farthest.
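
As an alternative that this article does not otherwise cover, pandas also provides Series.nsmallest(), which returns the N smallest entries without us slicing a sorted result ourselves. The following is a minimal sketch of that swapped-in technique on the same data; it is one possible approach, not a replacement for the method above.

```python
import pandas as pd

df = pd.DataFrame({
    'values': [10, 20, 30, 40, 50]
})

target = 33
N = 3  # Number of nearest values wanted

# Absolute distance of every value from the target
differences = (df['values'] - target).abs()

# nsmallest() keeps the original index labels of the N smallest differences,
# ordered from smallest to largest difference
nearest_labels = differences.nsmallest(N).index

# The result carries labels, so use .loc to fetch the matching values
nearest_values = df.loc[nearest_labels, 'values']

print(f"The {N} nearest values to {target} are {nearest_values.tolist()}")
```

On this data the result is the same as with argsort() slicing: 30, 40 and 20, in order of increasing distance from 33.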

Conclusion

When working with numerical data in Pandas, finding the nearest number to a target is a common task. Depending on our needs, we can use argsort() or idxmin().

  • Use idxmin() for a simple, direct approach when we want the single nearest number. It avoids sorting, so it is comparatively fast.
  • Use argsort() when we need the sorted indices and want to extract more than one nearest number.
  • To find multiple nearest numbers, use argsort() with slicing to extract the closest N values.

These methods provide efficient and flexible ways to handle nearest number searches in our datasets.

