To group a Pandas DataFrame by time intervals, use groupby() together with pd.Grouper, which selects the timestamp column and sets the interval length. In the example below, we group car sale records into fixed intervals of minutes and calculate the sum of Reg_Price (the registration price) for each interval.
First, let's say the following is our Pandas DataFrame with three columns. The Date_of_Purchase column holds timestamps that include both the date and the time −
import pandas as pd

dataFrame = pd.DataFrame(
   {
      "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],

      "Date_of_Purchase": [
         pd.Timestamp("2021-07-28 00:10:00"),
         pd.Timestamp("2021-07-28 00:12:00"),
         pd.Timestamp("2021-07-28 00:15:00"),
         pd.Timestamp("2021-07-28 00:16:00"),
         pd.Timestamp("2021-07-28 00:17:00"),
         pd.Timestamp("2021-07-28 00:20:00"),
         pd.Timestamp("2021-07-28 00:35:00"),
         pd.Timestamp("2021-07-28 00:42:00"),
         pd.Timestamp("2021-07-28 00:57:00"),
      ],

      "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
   }
)
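Next, pass pd.Grouper to groupby() to select the Date_of_Purchase column. The frequency is set to 7min, so the rows are grouped into 7-minute intervals. Only Reg_Price is summed, since the Car column holds text, and min_count=1 leaves intervals without any sale as NaN instead of 0 −
print("\nGroup Dataframe by 7 minutes...")
print(dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', freq='7min'))[['Reg_Price']].sum(min_count=1))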
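As a cross-check, the same 7-minute grouping can also be written with resample(), which is pandas' dedicated method for time-based binning. The sketch below is not part of the original example: the variable names are illustrative, and only the timestamps and prices are copied from the car sale records. It uses the default sum(), so intervals without a sale come out as 0 here rather than NaN.

```python
import pandas as pd

# Timestamps and prices copied from the car sale records above
df = pd.DataFrame(
    {
        "Date_of_Purchase": pd.to_datetime(
            [
                "2021-07-28 00:10:00", "2021-07-28 00:12:00", "2021-07-28 00:15:00",
                "2021-07-28 00:16:00", "2021-07-28 00:17:00", "2021-07-28 00:20:00",
                "2021-07-28 00:35:00", "2021-07-28 00:42:00", "2021-07-28 00:57:00",
            ]
        ),
        "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350],
    }
)

# Grouping with pd.Grouper inside groupby()
by_grouper = df.groupby(pd.Grouper(key="Date_of_Purchase", freq="7min"))["Reg_Price"].sum()

# The same grouping expressed with resample() on the timestamp column
by_resample = df.resample("7min", on="Date_of_Purchase")["Reg_Price"].sum()

print(by_resample)
print("Same result:", by_grouper.equals(by_resample))
```

Grouper is the more flexible choice when the time interval has to be combined with other grouping keys in a single groupby() call, while resample() reads more directly when time is the only key.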
Example
Following is the code −
import pandas as pd

# dataframe with one of the columns as Date_of_Purchase
dataFrame = pd.DataFrame(
   {
      "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],

      "Date_of_Purchase": [
         pd.Timestamp("2021-07-28 00:10:00"),
         pd.Timestamp("2021-07-28 00:12:00"),
         pd.Timestamp("2021-07-28 00:15:00"),
         pd.Timestamp("2021-07-28 00:16:00"),
         pd.Timestamp("2021-07-28 00:17:00"),
         pd.Timestamp("2021-07-28 00:20:00"),
         pd.Timestamp("2021-07-28 00:35:00"),
         pd.Timestamp("2021-07-28 00:42:00"),
         pd.Timestamp("2021-07-28 00:57:00"),
      ],

      "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]
   }
)

print("DataFrame...")
print(dataFrame)

# Grouper to select Date_of_Purchase column within groupby function
# Only Reg_Price is summed; min_count=1 keeps intervals without sales as NaN
print("\nGroup Dataframe by 7 minutes...")
print(dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', freq='7min'))[['Reg_Price']].sum(min_count=1))
Output
This will produce the following output −
DataFrame...
        Car    Date_of_Purchase  Reg_Price
0      Audi 2021-07-28 00:10:00       1000
1     Lexus 2021-07-28 00:12:00       1400
2     Tesla 2021-07-28 00:15:00       1100
3  Mercedes 2021-07-28 00:16:00        900
4       BMW 2021-07-28 00:17:00       1700
5    Toyota 2021-07-28 00:20:00       1800
6    Nissan 2021-07-28 00:35:00       1300
7   Bentley 2021-07-28 00:42:00       1150
8   Mustang 2021-07-28 00:57:00       1350

Group Dataframe by 7 minutes...
                     Reg_Price
Date_of_Purchase
2021-07-28 00:07:00     2400.0
2021-07-28 00:14:00     5500.0
2021-07-28 00:21:00        NaN
2021-07-28 00:28:00        NaN
2021-07-28 00:35:00     1300.0
2021-07-28 00:42:00     1150.0
2021-07-28 00:49:00        NaN
2021-07-28 00:56:00     1350.0
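The interval labels in the output (00:07, 00:14, and so on) may look surprising, since the first sale happened at 00:10. A minimal sketch of how they can be reproduced by hand is shown below. It assumes the intervals are counted in 7-minute steps from midnight of the purchase day, which is consistent with the output above. The column name bin_start and the other variable names are illustrative and not part of the original example.

```python
import pandas as pd

# Timestamps and prices copied from the car sale records above
df = pd.DataFrame(
    {
        "Date_of_Purchase": pd.to_datetime(
            [
                "2021-07-28 00:10:00", "2021-07-28 00:12:00", "2021-07-28 00:15:00",
                "2021-07-28 00:16:00", "2021-07-28 00:17:00", "2021-07-28 00:20:00",
                "2021-07-28 00:35:00", "2021-07-28 00:42:00", "2021-07-28 00:57:00",
            ]
        ),
        "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350],
    }
)

ts = df["Date_of_Purchase"]
midnight = ts.dt.normalize()                  # 00:00:00 of each purchase day
elapsed = ts - midnight                       # time passed since midnight
# Round the elapsed time down to a multiple of 7 minutes to get the interval start
df["bin_start"] = midnight + elapsed.dt.floor("7min")

# Summing per interval start gives the non-empty rows of the grouped output
manual = df.groupby("bin_start")["Reg_Price"].sum()
print(manual)
```

Unlike pd.Grouper, this manual version only lists intervals that contain at least one sale, so the empty intervals shown as NaN in the output do not appear.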