Understanding Pandas Groupby for Data Aggregation
Introduction
What if I told you that we could derive effective and impactful insights from
our dataset in just a few lines of code? That's the beauty of the Pandas GroupBy
function in Python! I have lost count of the number of times I've relied on GroupBy
to quickly summarize data and aggregate it in a way that's easy to interpret.
This helps not only when we're working on a data science project and need quick
results but also in hackathons! When time is of the essence (and when is it not?),
the GroupBy function in Pandas saves us a ton of effort by delivering
results in a matter of seconds. If you are familiar with the GROUP BY clause in SQL,
this article will be even easier for you to understand!
Loving GroupBy already? In this tutorial, I will first explain the GroupBy function
using an intuitive example before picking up a real-world dataset and
implementing GroupBy in Python. Let’s begin aggregating!
Table of contents
Introduction
What Is the Pandas’ GroupBy Function?
Understanding the Dataset & Problem Statement
First Look at Pandas GroupBy
The Split-Apply-Combine Strategy
Loop Over GroupBy Groups
Applying Functions to GroupBy Groups
Conclusion
Frequently Asked Questions
What Is the Pandas' GroupBy Function?

Let me take an example to elaborate on this. Let's say we are trying to analyze the
weight of a person in a city. We can easily get a fair idea of their weight by
determining the mean weight of all the city dwellers. But here's a question – would
the weight be affected by the gender of a person?
We can group the city dwellers into different gender groups and compute their
mean weight. This would give us a better insight into the weight of a person living
in the city. But we can probably get an even better picture if we further separate
these gender groups into different age groups and then take their mean weight
(because a teenage boy’s weight could differ from that of an adult male)!
You can see how separating people into groups and then computing a summary
statistic for each group allows us to make better analyses than just looking at a
single statistic for the entire population. This is what makes GroupBy so great!
GroupBy allows us to group our data based on different features and get a more
accurate idea about the data. It is a one-stop shop for deriving deep insights from
your data!
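To make this concrete, here is a minimal sketch with a small, made-up dataset (the people DataFrame and its values are hypothetical, used only to illustrate the idea):

import pandas as pd

people = pd.DataFrame({
    'gender': ['m', 'f', 'm', 'f', 'm', 'f'],
    'age_group': ['teen', 'teen', 'adult', 'adult', 'adult', 'teen'],
    'weight': [58, 52, 81, 63, 77, 50]
})

# Mean weight per gender
print(people.groupby('gender')['weight'].mean())

# Mean weight per gender and age group - a finer-grained picture
print(people.groupby(['gender', 'age_group'])['weight'].mean())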
Understanding the Dataset & Problem Statement

We will be working with the Big Mart Sales dataset from our DataHack platform.
It contains attributes related to the products sold at various stores of BigMart. The
aim is to find out the sales of each product at a particular store.
import pandas as pd
import numpy as np

df = pd.read_csv('train_v9rqX0R.csv')

First Look at Pandas GroupBy

df.groupby('Outlet_Location_Type')

This returns a DataFrameGroupBy object. GroupBy is lazy – nothing is actually
computed until we apply an aggregation function:

df.groupby('Outlet_Location_Type').count()
We did not tell GroupBy which column we wanted it to apply the aggregation
function on, so it applied the function to all the relevant columns and
returned the output.
But fortunately, the GroupBy object supports column indexing just like a pandas
DataFrame!
So let’s find out the total sales for each location type:
df.groupby('Outlet_Location_Type')['Item_Outlet_Sales']

df.groupby('Outlet_Location_Type')['Item_Outlet_Sales'].sum()
Awesome! Now, let’s understand the work behind the GroupBy function in
Pandas.
The Split-Apply-Combine Strategy

You just saw how quickly you can get an insight into grouped data using the
GroupBy function. But, behind the scenes, a lot is taking place, which is important
to understand to gauge the true power of GroupBy.
I want to show you how this strategy works in GroupBy by working with a sample
dataset to get the average height for males and females in a group. Let’s create that
dataset:
data = {'Gender': ['m','f','f','m','f','m','m'],
        'Height': [172,171,169,173,170,175,178]}
df_sample = pd.DataFrame(data)
df_sample
Splitting the data into separate groups:
f_filter = df_sample['Gender']=='f'
print(df_sample[f_filter])
m_filter = df_sample['Gender']=='m'
print(df_sample[m_filter])
Applying the mean function to each group:

f_avg = df_sample[f_filter]['Height'].mean()
m_avg = df_sample[m_filter]['Height'].mean()
print(f_avg, m_avg)
170.0 174.5
Combining the results into a new DataFrame:

df_output = pd.DataFrame({'Gender': ['f','m'], 'Height': [f_avg, m_avg]})
df_output
All three of these steps can be achieved with GroupBy in just a single line of
code! Here's how:

df_sample.groupby('Gender').mean()
Now that is smart! GroupBy split the data by gender, applied the mean to each
group, and combined the results into a single DataFrame.
You can see how GroupBy simplifies our task by doing all the work behind the
scenes without us having to worry about a thing!
Now that you understand the Split-Apply-Combine strategy, let's dive deeper into
the GroupBy function and unlock its full potential.

Loop Over GroupBy Groups

obj = df.groupby('Outlet_Location_Type')
obj
We can display the indices in each group by calling the groups attribute on the
GroupBy object:
obj.groups
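You can also iterate over the GroupBy object directly; each iteration yields the group name together with its sub-DataFrame. A quick sketch:

# Loop over the groups: name is the Outlet_Location_Type value,
# group is the matching subset of rows
for name, group in obj:
    print(name, group.shape)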
But what if you want to get a specific group out of all the groups? Well, don’t
worry. Pandas has a solution for that too.
Just provide the specific group name when calling get_group on the group object.
Here, I want to check out the features for the ‘Tier 1’ group of locations only:
obj.get_group('Tier 1')
Now isn't that wonderful! You have all the Tier 1 rows to work with and can
derive wonderful insights! But wait, didn't I say that GroupBy is lazy and doesn't
do anything unless explicitly specified? Alright then, let's see GroupBy in action
with the aggregate functions.
Applying Functions to GroupBy Groups

The apply step is unequivocally the most important step of a GroupBy operation,
where we can perform a variety of computations using aggregation, transformation,
filtration, or even a custom function of our own!
Aggregation
We have looked at some aggregation functions in the article so far, such as mean,
mode, and sum. These perform statistical operations on a set of data. Pandas
provides many more built-in aggregate functions, including count, min, max,
median, std, and var.
But the agg() function in Pandas gives us the flexibility to perform several
statistical computations all at once! Here is how it works:
df.groupby('Outlet_Location_Type').agg([np.mean, np.median])
We can even run GroupBy with multiple grouping columns to get better insights
from our data:
df.groupby(['Outlet_Location_Type', 'Outlet_Establishment_Year'], as_index=False).agg(
    {'Outlet_Size': pd.Series.mode, 'Item_Outlet_Sales': np.mean})
Notice that I have used different aggregation functions for different column names
by passing them in a dictionary with the corresponding operation to be performed.
This allowed me to group and apply computations on nominal and numeric
features simultaneously.
Also, I have changed the value of the as_index parameter to False. This way, the
grouped keys are returned as regular columns rather than as the index, which
makes the output much easier to read.
Transformation
We will try to impute the null values in the Item_Weight column using
the transform() function.
The Item_Fat_Content and Item_Type will affect the Item_Weight, don't you
think? So, let's group the DataFrame by these columns and fill the missing
weights with the mean of each group:

df['Item_Weight'] = df.groupby(['Item_Fat_Content', 'Item_Type'])['Item_Weight'].transform(
    lambda x: x.fillna(x.mean()))
Filtration
Filtration allows us to discard certain values based on computation and return only
a subset of the group. We can do this using the filter() function in Pandas.
df.shape
(8523, 12)
If I wanted only those groups that have item weights within 3 standard deviations, I
could use the filter function to do the job:
def filter_func(x):
    return x['Item_Weight'].std() < 3

df_filter = df.groupby(['Item_Weight']).filter(filter_func)
df_filter.shape
(8510, 12)
GroupBy has conveniently returned a DataFrame with only those groups whose
Item_Weight standard deviation is less than 3.
Pandas’ apply() function applies a function along an axis of the DataFrame. When
using it with the GroupBy function, we can apply any function to the grouped
result.
For example, if I wanted to center the Item_MRP values with the mean of their
establishment year group, I could use the apply() function to do just that:

df_apply = df.groupby(['Outlet_Establishment_Year'])['Item_MRP'].apply(lambda x: x - x.mean())
df_apply
Here, the values have been centered, and you can check whether the item was sold
at an MRP above or below the mean MRP for that year.
Conclusion
I’m sure you can see how amazing the GroupBy function is and how useful it can
be for analyzing your data. I hope this article helped you understand the function
better! But practice makes perfect, so start with the super impressive datasets on
our very own DataHack platform. Moving forward, you can read about how you
can analyze your data using a pivot table in Pandas.
https://pbpython.com/groupby-agg.html
Introduction
One of the most basic analysis functions is grouping and aggregating data. In some
cases, this level of analysis may be sufficient to answer business questions. In other
instances, this activity might be the first step in a more complex data science
analysis. In pandas, the groupby function can be combined with one or more
aggregation functions to quickly and easily summarize data. This concept is
deceptively simple, and most new pandas users will understand it.
However, they might be surprised at how useful complex aggregation functions can
be for supporting sophisticated analysis.
This article will quickly summarize the basic pandas aggregation functions and show
examples of more complex custom aggregations. Whether you are a new or more
experienced pandas user, I think you will learn a few things from this article.
Aggregating
In the context of this article, an aggregation function is one which takes
multiple individual values and returns a summary. In the majority of the cases,
this summary is a single value.
Here’s a quick example of calculating the total and average fare using the Titanic
dataset (loaded from seaborn):
import pandas as pd
import seaborn as sns
df = sns.load_dataset('titanic')
df['fare'].agg(['sum', 'mean'])
sum 28693.949300
mean 32.204208
Name: fare, dtype: float64
This simple concept is a necessary building block for more complex analysis.
One area that needs to be discussed is that there are multiple ways to call an
aggregation function. As shown above, you may pass a list of functions to apply to
one or more columns of data.
What if you want to perform the analysis on only a subset of columns? There are two
other options for aggregations: using a dictionary or a named aggregation.
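As a sketch of those two options, using the Titanic columns already loaded above:

# Dictionary style: map each column to the function(s) to apply
df.agg({'fare': ['sum', 'mean'], 'age': ['min', 'max']})

# Named aggregation style: output_name=(column, function)
df.groupby('class').agg(fare_mean=('fare', 'mean'), age_max=('age', 'max'))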
Groupby
Now that we know how to use aggregations, we can combine this with groupby to
summarize data.
Basic math
The most common built-in aggregation functions are basic math functions,
including sum, mean, median, minimum, maximum, standard deviation, variance,
mean absolute deviation, and product.
We can apply all these functions to the fare while grouping by the embark_town:

agg_func_math = {
    'fare': ['sum', 'mean', 'median', 'min', 'max', 'std', 'var', 'mad', 'prod']
}
df.groupby(['embark_town']).agg(agg_func_math).round(2)
As an aside, I have not found a good usage for the prod function which computes the
product of all the values in a group. For the sake of completeness, I am including it.
One other useful shortcut is to use describe to run multiple built-in aggregations at
one time:
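A sketch of that shortcut on the same grouping:

# describe returns count, mean, std, min, quartiles, and max in one call
df.groupby(['embark_town'])['fare'].describe()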
Counting
After basic math, counting is the next most common aggregation I perform on
grouped data. In some ways, this can be a little more tricky than the basic math.
Here are three examples of counting:
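A sketch of those three, applied to embark_town and grouped by deck (the choice of grouping column here is an assumption for illustration):

# count excludes NaN values, size includes them,
# and nunique counts distinct non-NaN values
agg_func_count = {'embark_town': ['count', 'nunique', 'size']}
df.groupby(['deck']).agg(agg_func_count)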
The major distinction to keep in mind is that count will not include NaN values
whereas size will. Depending on the data set, this may or may not be a useful
distinction. In addition, the nunique function will exclude NaN values in the
unique counts. Keep reading for an example of how to include NaN in the unique
value counts.
Another selection approach is to use idxmax and idxmin to select the index value
that corresponds to the maximum or minimum value.
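A sketch of that selection (the row labels 258 and 378 used below came from the article's original run):

# Index of the highest and lowest fare within each class
df.groupby(['class'])['fare'].agg(['idxmax', 'idxmin'])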
df.loc[[258, 378]]
Here's another shortcut trick you can use to see the rows with the max fare:
df.loc[df.groupby('class')['fare'].idxmax()]
The above example is one of those places where the list-based aggregation is a
useful shortcut.
Other libraries
You are not limited to the aggregation functions in pandas. For instance, you could
use stats functions from scipy or numpy.
Here is an example of calculating the mode and skew of the fare data.
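A sketch of that calculation, assuming scipy is installed (newer scipy/pandas versions may prefer pd.Series.mode, as noted below):

from scipy import stats

# scipy's mode returns both the most frequent value and its count
agg_func_stats = {'fare': [stats.mode, stats.skew]}
df.groupby(['embark_town']).agg(agg_func_stats)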
The mode results are interesting. The scipy.stats mode function returns the most
frequent value as well as the count of occurrences. If you just want the most frequent
value, use pd.Series.mode.
The key point is that you can use any function you want as long as it knows how to
interpret the array of pandas values and returns a single value.
This summary of the class and deck shows how this approach can be useful for
some data sets.
Custom functions
The pandas standard aggregation functions and pre-built functions from the python
ecosystem will meet many of your analysis needs. However, you will likely want to
create your own custom aggregation functions. There are four methods for creating
your own functions.
To illustrate the differences, let’s calculate the 25th percentile of the data using
four approaches:
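The first approach is an inline lambda; the remaining approaches follow the same pattern. A sketch:

# An inline lambda - its column label will show up as '<lambda>'
df.groupby(['embark_town']).agg({'fare': lambda x: x.quantile(.25)}).round(2)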
Next, we define our own function (which is a small wrapper around quantile):

# Define a function
def percentile_25(x):
    return x.quantile(.25)

# Collect the function in an aggregation dictionary to compare labels
agg_func = {'fare': [percentile_25]}
df.groupby(['embark_town']).agg(agg_func).round(2)
As you can see, the results are the same but the labels of the column are all a little
different. This is an area of programmer preference but I encourage you to be
familiar with the options since you will encounter most of these in online solutions.
In most cases, the functions are lightweight wrappers around built in pandas
functions. Part of the reason you need to do this is that there is no way to pass
arguments to aggregations. Some examples should clarify this point.
If you want to count the number of null values, you could use this function:
def count_nulls(s):
    return s.size - s.count()
If you want to include NaN values in your unique counts, you need to
pass dropna=False to the nunique function.
def unique_nan(s):
    return s.nunique(dropna=False)

agg_func_custom_count = {
    'embark_town': ['count', 'nunique', 'size', unique_nan, count_nulls, set]
}
df.groupby(['deck']).agg(agg_func_custom_count)
If you want to calculate the 90th percentile, use quantile:

def percentile_90(x):
    return x.quantile(.9)
If you want to calculate a trimmed mean where the lowest 10th percent is excluded,
use the scipy stats function trim_mean:

from scipy.stats import trim_mean

def trim_mean_10(x):
    return trim_mean(x, 0.1)
If you want the largest value, regardless of the sort order (see the notes above
about first and last):

def largest(x):
    return x.nlargest(1)
This is equivalent to max but I will show another example of nlargest below to
highlight the difference.
I wrote about sparklines before. Refer to that article for install instructions. Here's
how to incorporate them into an aggregate function for a unique view of the data:

import numpy as np
from sparklines import sparklines

def sparkline_str(x):
    bins = np.histogram(x)[0]
    sl = ''.join(sparklines(bins))
    return sl
agg_func_largest = {
    'fare': [percentile_90, trim_mean_10, largest, sparkline_str]
}
df.groupby(['class', 'embark_town']).agg(agg_func_largest)
The nlargest and nsmallest functions can be useful for summarizing the data in
various scenarios. Here is code to show the total fares for the top 10 and bottom
10 individuals:
def top_10_sum(x):
    return x.nlargest(10).sum()

def bottom_10_sum(x):
    return x.nsmallest(10).sum()

agg_func_top_bottom_sum = {
    'fare': [top_10_sum, bottom_10_sum]
}
df.groupby('class').agg(agg_func_top_bottom_sum)
Using this approach can be useful when applying the Pareto principle to your
own data.
Using apply with a custom summary function, you have access to all of the columns
of the data and can choose the appropriate aggregation approach to build up your
resulting DataFrame (including the column labels):
def summary(x):
    result = {
        'fare_sum': x['fare'].sum(),
        'fare_mean': x['fare'].mean(),
        'fare_range': x['fare'].max() - x['fare'].min()
    }
    return pd.Series(result).round(0)

df.groupby(['class']).apply(summary)
Using apply with groupby gives maximum flexibility over all aspects of the results.
However, there is a downside. The apply function is slow, so this approach should
be used sparingly.
As a first example, we can figure out what percentage of the total fares sold can
be attributed to each embark_town and class combination. We use assign and
a lambda function to add a pct_total column:
df.groupby(['embark_town', 'class']).agg({
    'fare': 'sum'
}).assign(pct_total=lambda x: x / x.sum())
One important thing to keep in mind is that you can actually do this more simply
using a pd.crosstab as described in my previous article:
pd.crosstab(df['embark_town'],
            df['class'],
            values=df['fare'],
            aggfunc='sum',
            normalize=True)
While we are talking about crosstab, a useful concept to keep in mind is that agg
functions can be combined with pivot tables too.
Here’s a quick example:
pd.pivot_table(data=df,
               index=['embark_town'],
               columns=['class'],
               aggfunc=agg_func_top_bottom_sum)
Sometimes you will need to do multiple groupby operations to answer your question.
For instance, if we wanted to see a cumulative total of the fares, we can group and
aggregate by town and class, then group the resulting object and calculate a
cumulative sum:
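A sketch of that two-step pattern on the Titanic data:

# Sum fares per town and class, then accumulate within each town
fare_totals = df.groupby(['embark_town', 'class']).agg({'fare': 'sum'})
fare_totals.groupby(level='embark_town').cumsum()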
Here's another example where we want to summarize daily sales data and convert it
to a cumulative daily and quarterly view. Refer to the Grouper article if you are not
familiar with using pd.Grouper():
In the first example, we want to include total daily sales as well as a cumulative
quarter-to-date amount:
sales = pd.read_excel('https://github.com/chris1610/pbpython/blob/master/data/2018_Sales_Total_v2.xlsx?raw=True')
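Here is a sketch of that first example, assuming the 'date' and 'ext price' columns from the file loaded above:

# Total sales per day, plus a running quarter-to-date total alongside
daily = sales.groupby(pd.Grouper(key='date', freq='D')).agg(
    daily_sales=('ext price', 'sum'))
daily['quarter_to_date'] = daily.groupby(pd.Grouper(freq='Q'))['daily_sales'].cumsum()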
To understand this, you need to look at the quarter boundary (end of March through
start of April) to get a good sense of what is going on.
If you want to just get a cumulative quarterly total, you can chain multiple
groupby functions.
First, group the daily results, then group those results by quarter and use a
cumulative sum:
sales.groupby([pd.Grouper(key='date', freq='D')]).agg(
    daily_sales=('ext price', 'sum')).groupby(
        pd.Grouper(freq='Q')).agg({
            'daily_sales': 'cumsum'
        }).rename(columns={'daily_sales': 'quarterly_sales'})
In this example, I included the named aggregation approach to rename the variable
to clarify that it is now daily sales. I then group again and use the cumulative sum to
get a running sum for the quarter. Finally, I rename the column to quarterly sales.
Admittedly this is a bit tricky to understand. However, if you take it step by step and
build out the function and inspect the results at each step, you will start to get the
hang of it. Don’t be discouraged!
I have found that the following approach works best for me. I use the
parameter as_index=False when grouping, then build a new collapsed
column name.
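For the snippet below to stand alone, assume multi_df is a grouped result whose columns form a multi-index, for example:

# A hypothetical grouped result with a column multi-index
multi_df = df.groupby(['class'], as_index=False).agg({'fare': ['sum', 'mean']})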
multi_df.columns = [
'_'.join(col).rstrip('_') for col in multi_df.columns.values
]
Subtotals
One process that is not straightforward with grouping and aggregating in pandas is
adding a subtotal. If you want to add subtotals, I recommend the sidetable package.
Here is how you can summarize fares by class , embark_town and sex with a
subtotal at each level as well as a grand total at the bottom:
import sidetable
df.groupby(['class', 'embark_town', 'sex']).agg({'fare': 'sum'}).stb.subtotal()
https://www.shanelynn.ie/summarising-aggregation-and-grouping-data-in-python-pandas/
A Sample DataFrame
In order to demonstrate the effectiveness and simplicity of the grouping
commands, we will need some data. For an example dataset, I have
extracted my own mobile phone usage records. I analysed this type of data
using Pandas during my work on KillBiller. If you’d like to follow along – the
full csv file is available here.
The dataset contains 830 entries from my mobile phone log spanning a total
time of 5 months. The CSV file can be loaded into a pandas DataFrame using
the pandas.read_csv() function, and looks like this:
Sample CSV file data containing the dates and durations of phone calls made on my mobile
phone.
Built-in summary functions include, among others, min (minimum), max (maximum),
and mode (most frequent value).
data.groupby(['month']).groups.keys()
Out[59]: ['2014-12', '2014-11', '2015-02', '2015-03', '2015-01']
len(data.groupby(['month']).groups['2014-11'])
Out[61]: 230
Functions like max(), min(), mean(), first(), and last() can be quickly applied to the
GroupBy object to obtain summary statistics for each group – immensely
useful functionality. This is similar to the dplyr and plyr libraries for
R. Different variables can be included in or excluded from each summary
requirement.
# Get the first entry for each month
data.groupby('month').first()
Out[69]:
date duration item network network_type
month
2014-11 2014-10-15 06:58:00 34.429 data data data
2014-12 2014-11-13 06:58:00 34.429 data data data
2015-01 2014-12-13 06:58:00 34.429 data data data
2015-02 2015-01-13 06:58:00 34.429 data data data
2015-03 2015-02-12 20:15:00 69.000 call landline landline
# Get the sum of the durations per month
data.groupby('month')['duration'].sum()
Out[70]:
month
2014-11 26639.441
2014-12 14641.870
2015-01 18223.299
2015-02 15522.299
2015-03 22750.441
Name: duration, dtype: float64
# Get the number of dates / entries in each month
data.groupby('month')['date'].count()
Out[74]:
month
2014-11 230
2014-12 157
2015-01 205
2015-02 137
2015-03 101
Name: date, dtype: int64
# What is the sum of durations, for calls only, to each network
data[data['item'] == 'call'].groupby('network')['duration'].sum()
Out[78]:
network
Meteor 7200
Tesco 13828
Three 36464
Vodafone 14621
landline 18433
voicemail 1775
Name: duration, dtype: float64
You can also group by more than one variable, allowing more complex queries.
# How many calls, sms, and data entries are in each month?
data.groupby(['month', 'item'])['date'].count()
Out[76]:
month item
2014-11 call 107
data 29
sms 94
2014-12 call 79
data 30
sms 48
2015-01 call 88
data 31
sms 86
2015-02 call 67
data 31
sms 39
2015-03 call 47
data 29
sms 25
Name: date, dtype: int64
# How many calls, texts, and data are sent per month, split by network_type?
data.groupby(['month', 'network_type'])['date'].count()
Out[82]:
month network_type
2014-11 data 29
landline 5
mobile 189
special 1
voicemail 6
2014-12 data 30
landline 7
mobile 108
voicemail 8
world 4
2015-01 data 31
landline 11
mobile 160
....
Groupby output format – Series or DataFrame?
The output from a groupby and aggregation operation varies between a Pandas
Series and a Pandas DataFrame, which can be confusing for new users. As a rule
of thumb, if you calculate more than one column of results, your result will be
a DataFrame. For a single column of results, the agg function will, by default,
produce a Series.
Using the as_index=False parameter when grouping prevents the group keys from
being set as the row index on the result.
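A quick sketch of the distinction using the phone-log data:

# A single column and a single function returns a Series
data.groupby('month')['duration'].sum()

# Double brackets (or multiple functions) return a DataFrame
data.groupby('month')[['duration']].sum()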
Multiple Statistics per Group
The final piece of syntax that we’ll examine is the “agg()” function for Pandas.
The aggregation functionality provided by the agg() function allows multiple
statistics to be calculated per group in one calculation.
Applying a single function to columns in groups
Instructions for aggregation are provided in the form of a python dictionary or
list. The dictionary keys are used to specify the columns upon which you’d like
to perform operations, and the dictionary values to specify the function to run.
For example:
# Group the data frame by month and item and extract a number of stats from each group
data.groupby(
    ['month', 'item']
).agg(
    {
        'duration': sum,          # Sum duration per group
        'network_type': 'count',  # get the count of networks
        'date': 'first'           # get the first date per group
    }
)
The aggregation dictionary syntax is flexible and can be defined before the
operation. You can also define functions inline using “lambda” functions to
extract statistics that are not provided by the built-in options.
# Define the aggregation procedure outside of the groupby operation
aggregations = {
    'duration': 'sum',
    'date': lambda x: max(x) - min(x)  # range of dates in each group
}
data.groupby('month').agg(aggregations)
Applying multiple functions to columns in groups
To apply multiple functions to a single column in your grouped data, expand
the syntax above to pass in a list of functions as the value in your aggregation
dictionary. See below:
# Group the data frame by month and item and extract a number of stats from each group
data.groupby(
    ['month', 'item']
).agg(
    {
        # Find the min, max, and sum of the duration column
        'duration': [min, max, sum],
        # find the number of network type entries
        'network_type': 'count',
        # minimum, first, and number of unique dates
        'date': [min, 'first', 'nunique']
    }
)
The agg(..) syntax is flexible and simple to use. Remember that you can pass in
custom and lambda functions to your list of aggregated calculations, and each
will be passed the values from the column in your grouped data.
Renaming grouped aggregation columns
We’ll examine two methods to group Dataframes and rename the column
results in your work.
data[data['item'] == 'call'].groupby('month').agg(
    # Get max of the duration column for each group
    max_duration=('duration', max),
    # Get min of the duration column for each group
    min_duration=('duration', min),
    # Get sum of the duration column for each group
    total_duration=('duration', sum),
    # Apply a lambda to the date column
    num_days=('date', lambda x: (max(x) - min(x)).days)
)
Grouping with named aggregation using new Pandas 0.25 syntax. Tuples are used to specify
the columns to work on and the functions to apply to each grouping.
For clearer naming, Pandas also provides the NamedAgg named tuple, which can
be used to achieve the same as normal tuples:
data[data['item'] == 'call'].groupby('month').agg(
    max_duration=pd.NamedAgg(column='duration', aggfunc=max),
    min_duration=pd.NamedAgg(column='duration', aggfunc=min),
    total_duration=pd.NamedAgg(column='duration', aggfunc=sum),
    num_days=pd.NamedAgg(column='date',
                         aggfunc=lambda x: (max(x) - min(x)).days)
)
Note that in some versions of Pandas, applying lambda functions only works for
these named aggregations when the lambda is the only function applied to
a single column; otherwise, a KeyError is raised.
Renaming index using droplevel and ravel
When multiple statistics are calculated on columns, the resulting dataframe
will have a multi-index set on the column axis. The multi-index can be difficult
to work with, and I typically have to rename columns after a groupby
operation.
One option is to drop the top level of the newly created multi-index on columns
using .droplevel:
grouped = data.groupby('month').agg({'duration': [min, max, 'mean']})
grouped.columns = grouped.columns.droplevel(level=0)
grouped = grouped.rename(columns={
    'min': 'min_duration', 'max': 'max_duration', 'mean': 'mean_duration'
})
grouped.head()
However, this approach loses the original column names, leaving only the
function names as column headers. A neater approach, as suggested to me by a
reader, is using the ravel() method on the grouped columns. Ravel() turns a
Pandas multi-index into a simpler array, which we can combine into sensible
column names:
grouped = data.groupby('month').agg({'duration': [min, max, 'mean']})

# Using ravel, and a string join, we can create better names for the columns:
grouped.columns = ['_'.join(x) for x in grouped.columns.ravel()]
Quick renaming of grouped columns from the groupby() multi-index can be
achieved using the ravel() function.
Dictionary groupby format <DEPRECATED>

There were substantial changes to the Pandas aggregation function in
May of 2017. Renaming of variables using nested dictionaries within the agg()
function is deprecated and has been removed from Pandas – see the release notes.
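For reference, here is a sketch of the removed syntax next to its named-aggregation replacement:

# Deprecated/removed nested-dictionary renaming (shown for illustration only):
# data.groupby('month').agg({'duration': {'total_duration': 'sum'}})

# The named-aggregation replacement:
data.groupby('month').agg(total_duration=('duration', 'sum'))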
Summary
Thanks for reading this article. There is a lot of detail here but that is due to how
many different uses there are for grouping and aggregating data with pandas. My
hope is that this post becomes a useful resource that you can bookmark and come
back to when you get stuck with a challenging problem of your own.
If you have other common techniques you use frequently, please let me know in the
comments. If I get some broadly useful ones, I will include them in this post or as an
updated article.