
Python Pandas - Descriptive Statistics
Descriptive statistics are essential tools in data analysis, offering a way to summarize and understand your data. In Python's Pandas library, there are numerous methods available for computing descriptive statistics on Series and DataFrame objects.
These methods provide various aggregations like sum(), mean(), and quantile(), as well as operations like cumsum() and cumprod() that return an object of the same size.
In this tutorial, we will discuss some of the most commonly used descriptive statistics functions in Pandas, applied to both Series and DataFrame objects. These methods can be grouped by functionality into categories such as Aggregation Functions, Cumulative Functions, and more.
Aggregation Functions
Aggregation functions produce a single value from a series of data, providing a concise summary of your dataset. Here are some key aggregation functions −
Sr.No. | Methods & Description |
---|---|
1 | mean() Returns the mean of the values over the requested axis. |
2 | sum() Returns the sum of the values over the requested axis. |
3 | median() Returns the median of the values over the requested axis. |
4 | min() Returns the minimum of the values over the requested axis. |
5 | max() Returns the maximum of the values over the requested axis. |
6 | count() Returns the number of non-NA/null observations in the given object. |
7 | quantile() Returns the value at the given quantile(s). |
8 | mode() Returns the mode(s) of the values along the selected axis of a DataFrame, or of a Series. |
9 | var() Returns the unbiased variance over the requested axis. |
10 | kurt() Returns the unbiased kurtosis over the requested axis. |
11 | skew() Returns the unbiased skew over the requested axis. |
12 | sem() Returns the unbiased standard error of the mean over the requested axis. |
13 | corr() Computes the correlation between columns or with another Series, excluding missing values. |
14 | cov() Computes the covariance between two objects, excluding NA/null values. |
15 | autocorr() Computes the lag-N autocorrelation of a Series. |
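As a minimal sketch of how a few of these aggregations behave, the following example applies them to a small Series and DataFrame. The sample values and variable names are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([2, 4, 4, 6, 9])

total = s.sum()           # 25
average = s.mean()        # 5.0
middle = s.median()       # 4.0
n_values = s.count()      # 5 non-null observations
q75 = s.quantile(0.75)    # 6.0
modes = s.mode()          # a Series, because several values can tie
variance = s.var()        # 7.0 (sample variance, divides by N - 1)
std_error = s.sem()       # standard error of the mean

# On a DataFrame, the axis argument picks the direction of the reduction.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
column_means = df.mean()          # one value per column
row_sums = df.sum(axis=1)         # one value per row
correlation = df["a"].corr(df["b"])

print(total, average, middle, n_values, q75, variance)
print(column_means)
print(row_sums)
```

Each call collapses the data to a single value per Series, or to one value per column or row for a DataFrame.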
Cumulative Functions
Cumulative functions provide running totals or products and maintain the same shape as the input data. These are useful in time series analysis or for understanding trends −
Sr.No. | Methods & Description |
---|---|
1 | cumsum() Returns the cumulative sum over a DataFrame or Series axis. |
2 | cumprod() Returns the cumulative product over a DataFrame or Series axis. |
3 | cummax() Returns the cumulative maximum over a DataFrame or Series axis. |
4 | cummin() Returns the cumulative minimum over a DataFrame or Series axis. |
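The cumulative functions can be illustrated with a short sketch like the one below. The input values are arbitrary sample data, not taken from any real dataset.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([3, 1, 4, 1, 5])

running_sum = s.cumsum()    # 3, 4, 8, 9, 14
running_prod = s.cumprod()  # 3, 3, 12, 12, 60
running_max = s.cummax()    # 3, 3, 4, 4, 5
running_min = s.cummin()    # 3, 1, 1, 1, 1

# Each result has the same length and index as the input.
print(pd.DataFrame({
    "value": s,
    "cumsum": running_sum,
    "cumprod": running_prod,
    "cummax": running_max,
    "cummin": running_min,
}))
```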
Boolean Functions
Boolean functions return boolean values: all() and any() reduce the data to a single True or False (or one per axis), while between() returns a boolean for each element of a Series −
Sr.No. | Methods & Description |
---|---|
1 | all() Returns True if all elements are True, potentially along an axis. |
2 | any() Returns True if any element is True, potentially along an axis. |
3 | between() Returns True for each element if it is between the left and right bounds. |
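A brief sketch of these boolean functions follows. The Series values and the thresholds are illustrative assumptions.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([5, 12, 7, 20])

# all() and any() are usually applied to a boolean condition.
all_positive = bool((s > 0).all())    # True: every value is above 0
any_above_15 = bool((s > 15).any())   # True: 20 is above 15

# between() works element-wise and includes both bounds by default.
in_range = s.between(5, 12)           # True, True, True, False

print(all_positive, any_above_15)
print(in_range)
```

The result of between() is itself a boolean Series, so it can be passed to any() or all(), or used as a mask to filter the original data.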
Transformation Functions
Transformation functions compute a new value for each element in the Series, often relative to neighboring elements, and return a Series of the same size −
Sr.No. | Methods & Description |
---|---|
1 | diff() Computes the difference between elements in the object, over the specified number of periods. |
2 | pct_change() Computes the percentage change between the current and a prior element. |
3 | rank() Computes the rank of values in the given object. |
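The sketch below shows these transformations on a small Series of made-up values. Note that diff() and pct_change() have no prior element for the first row, so that position is NaN.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([100.0, 110.0, 99.0, 99.0])

# Difference from the previous element (periods=1 by default).
change = s.diff()           # NaN, 10.0, -11.0, 0.0

# Fractional change from the previous element.
pct = s.pct_change()        # NaN, then roughly 0.1, -0.1, 0.0

# Ranks start at 1; tied values share the average of their ranks by default.
ranks = s.rank()            # 3.0, 4.0, 1.5, 1.5

print(pd.DataFrame({"value": s, "diff": change, "pct_change": pct, "rank": ranks}))
```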
Index Related Functions
These functions relate to the Series index and provide ways to manipulate and analyze index labels −
Sr.No. | Methods & Description |
---|---|
1 | idxmax() Returns the index of the first occurrence of the maximum value. |
2 | idxmin() Returns the index of the first occurrence of the minimum value. |
3 | value_counts() Returns a Series containing counts of unique values. |
4 | unique() Returns an array of unique values in the Series elements. |
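As a rough illustration, the example below runs these functions on a Series with string labels. The labels and values are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with string index labels, for illustration only.
s = pd.Series([3, 7, 7, 1], index=["a", "b", "c", "d"])

# idxmax()/idxmin() return the index label, not the position or the value.
label_of_max = s.idxmax()   # "b", the first of the two 7s
label_of_min = s.idxmin()   # "d"

# value_counts() is indexed by the distinct values themselves.
counts = s.value_counts()   # 7 appears twice, 3 and 1 once each

# unique() keeps the order of first appearance.
distinct = s.unique()       # array([3, 7, 1])

print(label_of_max, label_of_min)
print(counts)
print(distinct)
```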
Statistical Functions
These functions provide various statistical metrics on the Series data −
Sr.No. | Methods & Description |
---|---|
1 | nunique() Returns the number of unique values in the given object. |
2 | std() Returns the sample standard deviation of the values (normalized by N-1 by default). |
3 | abs() Returns a Series/DataFrame with the absolute numeric value of each element. |
4 | clip() Trims values at the input thresholds, setting values outside the bounds to the nearest boundary. |
5 | round() Rounds each value in the given object to the specified number of decimals. |
6 | prod() Returns the product of the given object elements. |
7 | describe() Generates descriptive statistics of the given object. |
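The following sketch exercises these functions on small, made-up Series. The clipping bounds and the rounding precision are arbitrary choices for the example.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([-2.0, 1.5, 1.5, 4.0])

n_distinct = s.nunique()             # 3 distinct values
spread = s.std()                     # sample standard deviation (ddof=1)
magnitudes = s.abs()                 # 2.0, 1.5, 1.5, 4.0
clipped = s.clip(lower=-1, upper=2)  # -1.0, 1.5, 1.5, 2.0
product = s.prod()                   # -18.0

# round() takes the number of decimals to keep.
rounded = pd.Series([1.234, 5.678]).round(2)   # 1.23, 5.68

# describe() bundles count, mean, std, min, quartiles and max in one result.
summary = s.describe()

print(n_distinct, spread, product)
print(clipped)
print(summary)
```

For numeric data, describe() returns a Series indexed by statistic name, so an individual entry can be read with, for example, `summary["mean"]`.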