0% found this document useful (0 votes)
37 views2 pages

Data Manipulation With Pandas - Aggregates in Pandas Reference Guide - Codecademy

The document discusses using aggregates in Pandas to summarize DataFrames. It shows how to use the groupby() function to calculate statistics like mean across multiple rows by grouping on a column like Name. Aggregate functions like mean(), median(), and max() can be applied to columns to calculate statistics. Pandas has built-in aggregate functions to calculate metrics like average, standard deviation, median, minimum, maximum, count and unique values for each column.

Uploaded by

Yash Purandare
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
37 views2 pages

Data Manipulation With Pandas - Aggregates in Pandas Reference Guide - Codecademy

The document discusses using aggregates in Pandas to summarize DataFrames. It shows how to use the groupby() function to calculate statistics like mean across multiple rows by grouping on a column like Name. Aggregate functions like mean(), median(), and max() can be applied to columns to calculate statistics. Pandas has built-in aggregate functions to calculate metrics like average, standard deviation, median, minimum, maximum, count and unique values for each column.

Uploaded by

Yash Purandare
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

07/03/2020 Data Manipulation with Pandas: Aggregates in Pandas Reference Guide | Codecademy

Data Manipulation with


Pandas
Aggregates in Pandas
Print cheatsheet

Pandas’ Groupby
In a pandas DataFrame , aggregate statistic functions can be applied across
multiple rows by using a groupby function. In the example, the code takes all of
the elements that are the same in Name and groups them, replacing the values
in Grade with their mean. Instead of mean() any aggregate statistics function,
like median() or max() , can be used. Note that to use the groupby() function,
at least two columns must be supplied.

df = pd.DataFrame([
["Amy","Assignment 1",75],
["Amy","Assignment 2",35],
["Bob","Assignment 1",99],
["Bob","Assignment 2",35]
], columns=["Name", "Assignment", "Grade"])

df.groupby('Name').Grade.mean()

# output of the groupby command


|Name | Grade|
| - | - |
|Amy | 55|
|Bob | 67|

Pandas DataFrame Aggregate Function


Pandas’ aggregate statistics functions can be used to calculate statistics on a
column of a DataFrame. For example, df.columnName.mean() computes the
mean of the column columnName of dataframe df . The code block shows how
https://www.codecademy.com/learn/paths/analyze-data-with-python/tracks/ida-4-data-manipulation-pandas/modules/ida-4-2-aggregates-in-pandas/cheatsheet 1/2
07/03/2020 Data Manipulation with Pandas: Aggregates in Pandas Reference Guide | Codecademy

to calculate statistics on the column columnName of df using Pandas’


aggregate statistics functions.

df.columnName.mean() # Average of all values in column


df.columnName.std() # Standard deviation of column
df.columnName.median() # Median value of column
df.columnName.max() # Maximum value in column
df.columnName.min() # Minimum value in column
df.columnName.count() # Number of values in column
df.columnName.nunique() # Number of unique values in column
df.columnName.unique() # List of unique values in column

https://www.codecademy.com/learn/paths/analyze-data-with-python/tracks/ida-4-data-manipulation-pandas/modules/ida-4-2-aggregates-in-pandas/cheatsheet 2/2

You might also like