How to Count Distinct Values of a Pandas DataFrame Column?

Last Updated : 02 Dec, 2024

Let's discuss how to count distinct values of a Pandas DataFrame column.

Using pandas.unique()

You can use pd.unique() to get all unique values in a column. To count them, apply len() to the result. This method is useful when you want both the distinct values themselves and their count.

Python

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Count unique values in 'height' column using unique()
n = len(pd.unique(df['height']))

print("Number of unique values in 'height':", n)

Output

Number of unique values in 'height': 5
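One detail worth knowing is that unique() keeps missing values, so NaN is counted as a distinct value. The sketch below illustrates this with a small hypothetical Series (not the article's data); drop the missing entries first if they should not be counted.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing entry (illustrative data only)
s = pd.Series([165, 165, np.nan, 158])

# unique() returns the distinct values, including NaN
distinct = s.unique()

# NaN is included in this count
n_with_nan = len(distinct)

# Dropping missing values first excludes NaN from the count
n_without_nan = len(s.dropna().unique())

print("Counting NaN:", n_with_nan)        # 3
print("Ignoring NaN:", n_without_nan)     # 2
```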

In addition to the pandas.unique() method, there are several other approaches to count distinct values in a Pandas DataFrame:

Table of Contents

Using DataFrame.nunique()
Using Series.value_counts()
Using a For Loop
Using drop_duplicates() Method

Count Distinct Values using DataFrame.nunique()

The nunique() method counts the distinct values in each column, which makes it a quick way to summarize unique values across one or more columns.

Python

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Count unique values in each column using nunique()
n = df.nunique()

print("Number of unique values in each column:\n", n)

Output

Number of unique values in each column:
 height    5
weight    4
age       4
dtype: int64
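Since the goal is often a single column, nunique() can also be called on a Series or on a subset of columns. The following is a minimal sketch using the same data; the variable names and the small Series with a missing value are illustrative assumptions. By default nunique() ignores NaN, and dropna=False includes it.

```python
import pandas as pd

df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Distinct count for one column (returns a plain integer)
n_height = df['height'].nunique()

# Distinct counts for a chosen subset of columns (returns a Series)
per_column = df[['height', 'age']].nunique()

# Hypothetical Series with a missing value to show the dropna parameter
with_missing = pd.Series([1, 1, None])
n_default = with_missing.nunique()               # NaN ignored
n_keep_nan = with_missing.nunique(dropna=False)  # NaN counted

print(n_height)               # 5
print(per_column['age'])      # 4
print(n_default, n_keep_nan)  # 1 2
```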

Count Distinct Values in Pandas DataFrame using Series.value_counts()

value_counts() returns the frequency of each unique value in a column. Use it when you want the distribution of values as well as the count; the number of unique values is the length of the result, which you can get with len().

Python

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Count unique values in 'height' column using value_counts()
li = list(df['height'].value_counts())

print("Number of unique values in 'height':", len(li))

Output

Number of unique values in 'height': 5
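To make the "distribution" part concrete, the sketch below keeps the value_counts() result as a Series instead of converting it to a list, so both the frequencies and the distinct count can be read from it. The variable names are illustrative. Note that value_counts() skips NaN by default.

```python
import pandas as pd

df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Series indexed by distinct value, holding how often each one occurs
counts = df['height'].value_counts()

# One entry per distinct value, so the length is the distinct count
n_distinct = len(counts)

# The value that occurs most often, and how many times
most_common = counts.idxmax()
most_common_freq = counts.loc[most_common]

print(n_distinct)                      # 5
print(most_common, most_common_freq)   # 165 3
```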

Using a For Loop

A for loop can count unique values manually by tracking which values have already been seen. This is mainly useful when you need custom logic that the built-in Pandas functions don't cover.

Python

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Count unique values in 'height' column using a for loop
cnt = 0
visited = []

for value in df['height']:
    if value not in visited:
        visited.append(value)
        cnt += 1

print("Number of unique values in 'height':", cnt)
print("Unique values:", visited)

Output

Number of unique values in 'height': 5
Unique values: [165, 164, 158, 167, 160]
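Checking membership in a list gets slow as the number of distinct values grows, because each check scans the whole list. As a possible alternative, the sketch below uses Python's built-in set and dict types, which do the same bookkeeping with hash lookups. This is a variation on the loop above, not a Pandas feature.

```python
import pandas as pd

df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Convert the column to plain Python integers
heights = df['height'].tolist()

# A set keeps each value once; its size is the distinct count
cnt = len(set(heights))

# dict.fromkeys keeps each value once and preserves first-seen order
distinct_in_order = list(dict.fromkeys(heights))

print("Number of unique values in 'height':", cnt)   # 5
print("Unique values:", distinct_in_order)           # [165, 164, 158, 167, 160]
```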

Using drop_duplicates() Method

drop_duplicates() removes duplicate values and returns the remaining distinct values, which you can then count. It’s a good alternative to unique() when you want the distinct values as a new Series or DataFrame that keeps the original index labels.

Python

import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Count unique values in 'height' column using drop_duplicates()
unique_values = df['height'].drop_duplicates()

print("Unique values in 'height':", unique_values)
print("Number of unique values in 'height':", unique_values.count())

Output

Unique values in 'height': Steve    165
Nivi     164
Jane     158
Kate     167
Lucy     160
Name: height, dtype: int64
Number of unique values in 'height': 5
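Because drop_duplicates() also works on a DataFrame, the same idea extends to counting distinct combinations of values across several columns. The following is a minimal sketch using the same data; the choice of the 'height' and 'weight' columns is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'height': [165, 165, 164, 158, 167, 160, 158, 165],
    'weight': [63.5, 64, 63.5, 54, 63.5, 62, 64, 64],
    'age': [20, 22, 22, 21, 23, 22, 20, 21]
}, index=['Steve', 'Ria', 'Nivi', 'Jane', 'Kate', 'Lucy', 'Ram', 'Niki'])

# Keep the first occurrence of every distinct (height, weight) pair
pairs = df[['height', 'weight']].drop_duplicates()

# Number of rows left is the number of distinct pairs
n_pairs = len(pairs)

print(pairs)
print("Number of distinct (height, weight) pairs:", n_pairs)   # 7
```

Here 'Niki' has the same height and weight as 'Ria', so that row is dropped and seven distinct pairs remain.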


erakshaya485
