To count distinct values, use nunique in Pandas. We will group by a column and also find the sum using NumPy sum().
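As a quick illustration of what nunique counts, the minimal sketch below compares it with count() on a single column. The variable names places, total and distinct are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Hypothetical sample column with repeated values
places = pd.Series(['Delhi', 'Bangalore', 'Delhi', 'Chandigarh', 'Chandigarh'])

# count() returns the number of non-null entries
total = places.count()

# nunique() returns the number of distinct values
distinct = places.nunique()

print(total, distinct)
```

Here count() includes the repeated entries, while nunique() counts each distinct value only once.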
At first, import the required libraries −
import pandas as pd
import numpy as np
Create a DataFrame with 3 columns. The columns have duplicate values −
dataFrame = pd.DataFrame(
   {
      "Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Lexus'],
      "Place": ['Delhi', 'Bangalore', 'Delhi', 'Chandigarh', 'Chandigarh'],
      "Units": [100, 150, 50, 110, 90]
   }
)
Count distinct values in the aggregation agg() with nunique. To calculate the sum, we use NumPy sum() −
dataFrame = dataFrame.groupby("Car").agg({"Units": np.sum, "Place": pd.Series.nunique})
Example
Following is the code −
import pandas as pd
import numpy as np
dataFrame = pd.DataFrame(
{
"Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Lexus'],
"Place": ['Delhi', 'Bangalore', 'Delhi', 'Chandigarh', 'Chandigarh'],
"Units": [100, 150, 50, 110, 90]
}
)
print("DataFrame ...\n", dataFrame)
# count distinct in aggregation with nunique
dataFrame = dataFrame.groupby("Car").agg({"Units": np.sum, "Place": pd.Series.nunique})
print("\nUpdated DataFrame ...\n", dataFrame)
Output
This will produce the following output −
DataFrame ...
     Car       Place  Units
0    BMW       Delhi    100
1   Audi   Bangalore    150
2    BMW       Delhi     50
3  Lexus  Chandigarh    110
4  Lexus  Chandigarh     90

Updated DataFrame ...
       Units  Place
Car
Audi     150      1
BMW      150      1
Lexus    200      1
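In the output above, every car is sold in a single place, so the distinct count is 1 for each group. The sketch below uses a modified, hypothetical data set in which BMW appears in two places, so the effect of nunique becomes visible. It also passes the string aliases "sum" and "nunique" to agg(), which Pandas accepts as an alternative to np.sum and pd.Series.nunique. The names df and result are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical data: BMW now appears in two different places
df = pd.DataFrame(
    {
        "Car": ['BMW', 'Audi', 'BMW', 'Lexus', 'Lexus'],
        "Place": ['Delhi', 'Bangalore', 'Mumbai', 'Chandigarh', 'Chandigarh'],
        "Units": [100, 150, 50, 110, 90],
    }
)

# Sum the units and count the distinct places for each car,
# using string aliases instead of np.sum and pd.Series.nunique
result = df.groupby("Car").agg({"Units": "sum", "Place": "nunique"})

print(result)
```

With this data, the BMW group should report two distinct places, while Audi and Lexus still report one each.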