
Style Ipynb

Uploaded by

akhil.s18
Copyright
© © All Rights Reserved

{

"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Table Visualization\n",
"\n",
"This section demonstrates visualization of tabular data using the [Styler]
[styler]\n",
"class. For information on visualization with charting please see [Chart
Visualization][viz]. This document is written as a Jupyter Notebook, and can be
viewed or downloaded [here][download].\n",
"\n",
"## Styler Object and Customising the Display\n",
"Styling and output display customisation should be performed **after** the
data in a DataFrame has been processed. The Styler is **not** dynamically updated
if further changes to the DataFrame are made. The `DataFrame.style` attribute is a
property that returns a [Styler][styler] object. It has a `_repr_html_` method
defined on it so it is rendered automatically in Jupyter Notebook.\n",
"\n",
"The Styler, which can be used for large data but is primarily designed for
small data, currently has the ability to output to these formats:\n",
"\n",
" - HTML\n",
" - LaTeX\n",
" - String (and CSV by extension)\n",
" - Excel\n",
" - (JSON is not currently available)\n",
"\n",
"The first three of these have display customisation methods designed to format
and customise the output. These include:\n",
"\n",
" - Formatting values, the index and columns headers, using [.format()]
[formatfunc] and [.format_index()][formatfuncindex],\n",
" - Renaming the index or column header labels, using [.relabel_index()]
[relabelfunc]\n",
" - Hiding certain columns, the index and/or column headers, or index names,
using [.hide()][hidefunc]\n",
" - Concatenating similar DataFrames, using [.concat()][concatfunc]\n",
" \n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst\n",
"[viz]: visualization.rst\n",
"[download]:
https://nbviewer.org/github/pandas-dev/pandas/blob/main/doc/source/user_guide/style.ipynb\n",
"[format]: https://docs.python.org/3/library/string.html#format-specification-mini-language\n",
"[formatfunc]: ../reference/api/pandas.io.formats.style.Styler.format.rst\n",
"[formatfuncindex]:
../reference/api/pandas.io.formats.style.Styler.format_index.rst\n",
"[relabelfunc]:
../reference/api/pandas.io.formats.style.Styler.relabel_index.rst\n",
"[hidefunc]: ../reference/api/pandas.io.formats.style.Styler.hide.rst\n",
"[concatfunc]: ../reference/api/pandas.io.formats.style.Styler.concat.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"import matplotlib.pyplot\n",
"# We have this here to trigger matplotlib's font cache stuff.\n",
"# This cell is hidden from the output"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Formatting the Display\n",
"\n",
"### Formatting Values\n",
"\n",
"The [Styler][styler] distinguishes the *display* value from the *actual*
value, in both data values and index or columns headers. To control the display
value, the text is printed in each cell as a string, and we can use the [.format()]
[formatfunc] and [.format_index()][formatfuncindex] methods to manipulate this
according to a [format spec string][format] or a callable that takes a single value
and returns a string. It is possible to define this for the whole table, or index,
or for individual columns, or MultiIndex levels. We can also overwrite index
names.\n",
"\n",
"Additionally, the format function has a **precision** argument to specifically
help format floats, as well as **decimal** and **thousands** separators to support
other locales, an **na_rep** argument to display missing data, and **escape** and
**hyperlinks** arguments to help display safe-HTML or safe-LaTeX. The default
formatter is configured to adopt pandas' global options such as the
`styler.format.precision` option, controllable using `with
pd.option_context('styler.format.precision', 2):`\n",
"\n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst\n",
"[format]: https://docs.python.org/3/library/string.html#format-specification-mini-language\n",
"[formatfunc]: ../reference/api/pandas.io.formats.style.Styler.format.rst\n",
"[formatfuncindex]:
../reference/api/pandas.io.formats.style.Styler.format_index.rst\n",
"[relabelfunc]:
../reference/api/pandas.io.formats.style.Styler.relabel_index.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib as mpl\n",
"\n",
"df = pd.DataFrame({\n",
" \"strings\": [\"Adam\", \"Mike\"],\n",
" \"ints\": [1, 3],\n",
" \"floats\": [1.123, 1000.23]\n",
"})\n",
"df.style \\\n",
" .format(precision=3, thousands=\".\", decimal=\",\") \\\n",
" .format_index(str.upper, axis=1) \\\n",
" .relabel_index([\"row 1\", \"row 2\"], axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using Styler to manipulate the display is a useful feature because maintaining
the indexing and data values for other purposes gives greater control. You do not
have to overwrite your DataFrame to display it how you like. Here is a more
comprehensive example of using the formatting functions whilst still relying on the
underlying data for indexing and calculations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"weather_df = pd.DataFrame(np.random.rand(10,2)*5, \n",
" index=pd.date_range(start=\"2021-01-01\",
periods=10),\n",
" columns=[\"Tokyo\", \"Beijing\"])\n",
"\n",
"def rain_condition(v): \n",
" if v < 1.75:\n",
" return \"Dry\"\n",
" elif v < 2.75:\n",
" return \"Rain\"\n",
" return \"Heavy Rain\"\n",
"\n",
"def make_pretty(styler):\n",
" styler.set_caption(\"Weather Conditions\")\n",
" styler.format(rain_condition)\n",
" styler.format_index(lambda v: v.strftime(\"%A\"))\n",
" styler.background_gradient(axis=None, vmin=1, vmax=5, cmap=\"YlGnBu\")\n",
" return styler\n",
"\n",
"weather_df"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"weather_df.loc[\"2021-01-04\":\"2021-01-08\"].style.pipe(make_pretty)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hiding Data\n",
"\n",
"The index and column headers can be completely hidden, and specific rows or
columns can be excluded from the display. Both options use the same method.\n",
"\n",
"The index can be hidden from rendering by calling [.hide()][hideidx] without
any arguments, which might be useful if your index is integer based. Similarly
column headers can be hidden by calling [.hide(axis=\"columns\")][hideidx] without
any further arguments.\n",
"\n",
"Specific rows or columns can be hidden from rendering by calling the same
[.hide()][hideidx] method and passing in a row/column label, a list-like or a slice
of row/column labels to the ``subset`` argument.\n",
"\n",
"Hiding does not change the integer arrangement of CSS classes, e.g. hiding the
first two columns of a DataFrame means the column class indexing will still start
at `col2`, since `col0` and `col1` are simply ignored.\n",
"\n",
"[hideidx]: ../reference/api/pandas.io.formats.style.Styler.hide.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(np.random.randn(5, 5))\n",
"df.style \\\n",
" .hide(subset=[0, 2, 4], axis=0) \\\n",
" .hide(subset=[0, 2, 4], axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To invert the function to a **show** functionality it is best practice to
compose a list of hidden items."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"show = [0, 2, 4]\n",
"df.style \\\n",
" .hide([row for row in df.index if row not in show], axis=0) \\\n",
" .hide([col for col in df.columns if col not in show], axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Concatenating DataFrame Outputs\n",
"\n",
"Two or more Stylers can be concatenated together provided they share the same
columns. This is very useful for showing summary statistics for a DataFrame, and is
often used in combination with DataFrame.agg.\n",
"\n",
"Since the objects concatenated are Stylers they can independently be styled as
will be shown below and their concatenation preserves those styles."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"summary_styler = df.agg([\"sum\", \"mean\"]).style \\\n",
" .format(precision=3) \\\n",
" .relabel_index([\"Sum\", \"Average\"])\n",
"df.style.format(precision=1).concat(summary_styler)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Styler Object and HTML \n",
"\n",
"The [Styler][styler] was originally constructed to support the wide array of
HTML formatting options. Its HTML output creates an HTML `<table>` and leverages
CSS styling language to manipulate many parameters including colors, fonts,
borders, background, etc. See [here][w3schools] for more information on styling
HTML tables. This allows a lot of flexibility out of the box, and even enables web
developers to integrate DataFrames into their existing user interface designs.\n",
"\n",
"Below we demonstrate the default output, which looks very similar to the
standard DataFrame HTML representation. But the HTML here has already attached some
CSS classes to each cell, even if we haven't yet created any styles. We can view
these by calling the [.to_html()][tohtml] method, which returns the raw HTML as a
string that is useful for further processing or adding to a file - read on in
[More about CSS and HTML](#More-About-CSS-and-HTML). This section will also provide
a walkthrough for how to convert this default output to represent a DataFrame
output that is more communicative. For example, here is how we can build `s`:\n",
"\n",
"[tohtml]: ../reference/api/pandas.io.formats.style.Styler.to_html.rst\n",
"\n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst\n",
"[w3schools]: https://www.w3schools.com/html/html_tables.asp"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame([[38.0, 2.0, 18.0, 22.0, 21, np.nan],[19, 439, 6, 452,
226,232]], \n",
" index=pd.Index(['Tumour (Positive)', 'Non-Tumour
(Negative)'], name='Actual Label:'), \n",
" columns=pd.MultiIndex.from_product([['Decision Tree',
'Regression', 'Random'],['Tumour', 'Non-Tumour']], names=['Model:',
'Predicted:']))\n",
"df.style"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to just create the below example: code is covered throughout the
guide.\n",
"s = df.style\\\n",
" .hide([('Random', 'Tumour'), ('Random', 'Non-Tumour')],
axis='columns')\\\n",
" .format('{:.0f}')\\\n",
" .set_table_styles([{\n",
" 'selector': '',\n",
" 'props': 'border-collapse: separate;'\n",
" },{\n",
" 'selector': 'caption',\n",
" 'props': 'caption-side: bottom; font-size:1.3em;'\n",
" },{\n",
" 'selector': '.index_name',\n",
" 'props': 'font-style: italic; color: darkgrey; font-weight:normal;'\n",
" },{\n",
" 'selector': 'th:not(.index_name)',\n",
" 'props': 'background-color: #000066; color: white;'\n",
" },{\n",
" 'selector': 'th.col_heading',\n",
" 'props': 'text-align: center;'\n",
" },{\n",
" 'selector': 'th.col_heading.level0',\n",
" 'props': 'font-size: 1.5em;'\n",
" },{\n",
" 'selector': 'th.col2',\n",
" 'props': 'border-left: 1px solid white;'\n",
" },{\n",
" 'selector': '.col2',\n",
" 'props': 'border-left: 1px solid #000066;'\n",
" },{\n",
" 'selector': 'td',\n",
" 'props': 'text-align: center; font-weight:bold;'\n",
" },{\n",
" 'selector': '.true',\n",
" 'props': 'background-color: #e6ffe6;'\n",
" },{\n",
" 'selector': '.false',\n",
" 'props': 'background-color: #ffe6e6;'\n",
" },{\n",
" 'selector': '.border-red',\n",
" 'props': 'border: 2px dashed red;'\n",
" },{\n",
" 'selector': '.border-green',\n",
" 'props': 'border: 2px dashed green;'\n",
" },{\n",
" 'selector': 'td:hover',\n",
" 'props': 'background-color: #ffffb3;'\n",
" }])\\\n",
" .set_td_classes(pd.DataFrame([['true border-green', 'false', 'true',
'false border-red', '', ''],\n",
" ['false', 'true', 'false', 'true', '',
'']], \n",
" index=df.index, columns=df.columns))\\\n",
" .set_caption(\"Confusion matrix for multiple cancer prediction
models.\")\\\n",
" .set_tooltips(pd.DataFrame([['This model has a very strong true positive
rate', '', '', \"This model's total number of false negatives is too high\", '',
''],\n",
" ['', '', '', '', '', '']], \n",
" index=df.index, columns=df.columns),\n",
" css_class='pd-tt', props=\n",
" 'visibility: hidden; position: absolute; z-index: 1; border: 1px solid
#000066;'\n",
" 'background-color: white; color: #000066; font-size: 0.8em;' \n",
" 'transform: translate(0px, -24px); padding: 0.6em; border-radius:
0.5em;')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first step we have taken is to create the Styler object from the
DataFrame and then select the range of interest by hiding unwanted columns with
[.hide()][hideidx].\n",
"\n",
"[hideidx]: ../reference/api/pandas.io.formats.style.Styler.hide.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s = df.style.format('{:.0f}').hide([('Random', 'Tumour'), ('Random', 'Non-Tumour')], axis=\"columns\")\n",
"s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_hide')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Methods to Add Styles\n",
"\n",
"There are **3 primary methods of adding custom CSS styles** to [Styler]
[styler]:\n",
"\n",
"- Using [.set_table_styles()][table] to control broader areas of the table
with specified internal CSS. Although table styles allow the flexibility to add CSS
selectors and properties controlling all individual parts of the table, they are
unwieldy for individual cell specifications. Also, note that table styles cannot be
exported to Excel. \n",
"- Using [.set_td_classes()][td_class] to directly link either external CSS
classes to your data cells or link the internal CSS classes created by
[.set_table_styles()][table]. See [here](#Setting-Classes-and-Linking-to-External-CSS).
These cannot be used on column header rows or indexes, and also won't export
to Excel. \n",
"- Using the [.apply()][apply] and [.map()][map] functions to add direct
internal CSS to specific data cells. See [here](#Styler-Functions). As of v1.4.0
there are also methods that work directly on column header rows or indexes;
[.apply_index()][applyindex] and [.map_index()][mapindex]. Note that only these
methods add styles that will export to Excel. These methods work in a similar way
to [DataFrame.apply()][dfapply] and [DataFrame.map()][dfmap].\n",
"\n",
"[table]: ../reference/api/pandas.io.formats.style.Styler.set_table_styles.rst\n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst\n",
"[td_class]: ../reference/api/pandas.io.formats.style.Styler.set_td_classes.rst\n",
"[apply]: ../reference/api/pandas.io.formats.style.Styler.apply.rst\n",
"[map]: ../reference/api/pandas.io.formats.style.Styler.map.rst\n",
"[applyindex]: ../reference/api/pandas.io.formats.style.Styler.apply_index.rst\n",
"[mapindex]: ../reference/api/pandas.io.formats.style.Styler.map_index.rst\n",
"[dfapply]: ../reference/api/pandas.DataFrame.apply.rst\n",
"[dfmap]: ../reference/api/pandas.DataFrame.map.rst"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Table Styles\n",
"\n",
"Table styles are flexible enough to control all individual parts of the table,
including column headers and indexes. \n",
"However, they can be unwieldy to type for individual data cells or for any
kind of conditional formatting, so we recommend that table styles are used for
broad styling, such as entire rows or columns at a time.\n",
"\n",
"Table styles are also used to control features which can apply to the whole
table at once such as creating a generic hover functionality. The `:hover` pseudo-
selector, as well as other pseudo-selectors, can only be used this way.\n",
"\n",
"To replicate the normal format of CSS selectors and properties (attribute
value pairs), e.g. \n",
"\n",
"```\n",
"tr:hover {\n",
" background-color: #ffff99;\n",
"}\n",
"```\n",
"\n",
"the necessary format to pass styles to [.set_table_styles()][table] is as a
list of dicts, each with a CSS-selector tag and CSS-properties. Properties can
either be a list of 2-tuples, or a regular CSS-string, for example:\n",
"\n",
"[table]: ../reference/api/pandas.io.formats.style.Styler.set_table_styles.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cell_hover = { # for row hover use <tr> instead of <td>\n",
" 'selector': 'td:hover',\n",
" 'props': [('background-color', '#ffffb3')]\n",
"}\n",
"index_names = {\n",
" 'selector': '.index_name',\n",
" 'props': 'font-style: italic; color: darkgrey; font-weight:normal;'\n",
"}\n",
"headers = {\n",
" 'selector': 'th:not(.index_name)',\n",
" 'props': 'background-color: #000066; color: white;'\n",
"}\n",
"s.set_table_styles([cell_hover, index_names, headers])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_tab_styles1')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we just add a couple more styling artifacts targeting specific parts of
the table. Be careful here, since we are *chaining methods* we need to explicitly
instruct the method **not to** ``overwrite`` the existing styles."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.set_table_styles([\n",
" {'selector': 'th.col_heading', 'props': 'text-align: center;'},\n",
" {'selector': 'th.col_heading.level0', 'props': 'font-size: 1.5em;'},\n",
" {'selector': 'td', 'props': 'text-align: center; font-weight: bold;'},\n",
"], overwrite=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_tab_styles2')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a convenience method (*since version 1.2.0*) we can also pass a **dict** to
[.set_table_styles()][table] which contains row or column keys. Behind the scenes
Styler just indexes the keys and adds relevant `.col<m>` or `.row<n>` classes as
necessary to the given CSS selectors.\n",
"\n",
"[table]: ../reference/api/pandas.io.formats.style.Styler.set_table_styles.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.set_table_styles({\n",
" ('Regression', 'Tumour'): [{'selector': 'th', 'props': 'border-left: 1px
solid white'},\n",
" {'selector': 'td', 'props': 'border-left: 1px
solid #000066'}]\n",
"}, overwrite=False, axis=0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('xyz01')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting Classes and Linking to External CSS\n",
"\n",
"If you have designed a website then it is likely you will already have an
external CSS file that controls the styling of table and cell objects within it.
You may want to use these native files rather than duplicate all the CSS in python
(and duplicate any maintenance work).\n",
"\n",
"### Table Attributes\n",
"\n",
"It is very easy to add a `class` to the main `<table>` using
[.set_table_attributes()][tableatt]. This method can also attach inline styles -
read more in [CSS Hierarchies](#CSS-Hierarchies).\n",
"\n",
"[tableatt]:
../reference/api/pandas.io.formats.style.Styler.set_table_attributes.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"out = s.set_table_attributes('class=\"my-table-cls\"').to_html()\n",
"print(out[out.find('<table'):][:109])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data Cell CSS Classes\n",
"\n",
"*New in version 1.2.0*\n",
"\n",
"The [.set_td_classes()][tdclass] method accepts a DataFrame with matching
indices and columns to the underlying [Styler][styler]'s DataFrame. That DataFrame
will contain strings as css-classes to add to individual data cells: the `<td>`
elements of the `<table>`. Rather than use external CSS we will create our classes
internally and add them to table style. We will save adding the borders until the
[section on tooltips](#Tooltips-and-Captions).\n",
"\n",
"[tdclass]: ../reference/api/pandas.io.formats.style.Styler.set_td_classes.rst\n",
"[styler]: ../reference/api/pandas.io.formats.style.Styler.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.set_table_styles([ # create internal CSS classes\n",
" {'selector': '.true', 'props': 'background-color: #e6ffe6;'},\n",
" {'selector': '.false', 'props': 'background-color: #ffe6e6;'},\n",
"], overwrite=False)\n",
"cell_color = pd.DataFrame([['true ', 'false ', 'true ', 'false '], \n",
" ['false ', 'true ', 'false ', 'true ']], \n",
" index=df.index, \n",
" columns=df.columns[:4])\n",
"s.set_td_classes(cell_color)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_classes')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Styler Functions\n",
"\n",
"### Acting on Data\n",
"\n",
"We use the following methods to pass your style functions. Both of those
methods take a function (and some other keyword arguments) and apply it to the
DataFrame in a certain way, rendering CSS styles.\n",
"\n",
"- [.map()][map] (elementwise): accepts a function that takes a single value
and returns a string with the CSS attribute-value pair.\n",
"- [.apply()][apply] (column-/row-/table-wise): accepts a function that takes a
Series or DataFrame and returns a Series, DataFrame, or numpy array with an
identical shape where each element is a string with a CSS attribute-value pair.
This method passes each column or row of your DataFrame one-at-a-time or the entire
table at once, depending on the `axis` keyword argument. For columnwise use
`axis=0`, rowwise use `axis=1`, and for the entire table at once use `axis=None`.\n",
"\n",
"This method is powerful for applying multiple, complex logic to data cells. We
create a new DataFrame to demonstrate this.\n",
"\n",
"[apply]: ../reference/api/pandas.io.formats.style.Styler.apply.rst\n",
"[map]: ../reference/api/pandas.io.formats.style.Styler.map.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(0)\n",
"df2 = pd.DataFrame(np.random.randn(10,4), columns=['A','B','C','D'])\n",
"df2.style"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example we can build a function that colors text if it is negative, and
chain this with a function that partially fades cells of negligible value. Since
this looks at each element in turn we use ``map``."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def style_negative(v, props=''):\n",
" return props if v < 0 else None\n",
"s2 = df2.style.map(style_negative, props='color:red;')\\\n",
" .map(lambda v: 'opacity: 20%;' if (v < 0.3) and (v > -0.3) else
None)\n",
"s2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s2.set_uuid('after_applymap')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also build a function that highlights the maximum value across rows,
cols, and the DataFrame all at once. In this case we use ``apply``. Below we
highlight the maximum in a column."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def highlight_max(s, props=''):\n",
" return np.where(s == np.nanmax(s.values), props, '')\n",
"s2.apply(highlight_max, props='color:white;background-color:darkblue',
axis=0)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s2.set_uuid('after_apply')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can use the same function across the different axes, highlighting here the
DataFrame maximum in purple, and row maximums in pink."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s2.apply(highlight_max, props='color:white;background-color:pink;', axis=1)\\\n",
" .apply(highlight_max, props='color:white;background-color:purple',
axis=None)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s2.set_uuid('after_apply_again')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This last example shows how some styles have been overwritten by others. In
general the most recent style applied is active but you can read more in the
[section on CSS hierarchies](#CSS-Hierarchies). You can also apply these styles to
more granular parts of the DataFrame - read more in the section on
[subset slicing](#Finer-Control-with-Slicing).\n",
"\n",
"It is possible to replicate some of this functionality using just classes but
it can be more cumbersome. See [item 3) of Optimization](#Optimization).\n",
"\n",
"<div class=\"alert alert-info\">\n",
"\n",
"*Debugging Tip*: If you're having trouble writing your style function, try
just passing it into ``DataFrame.apply``. Internally, ``Styler.apply`` uses
``DataFrame.apply`` so the result should be the same, and with ``DataFrame.apply``
you will be able to inspect the CSS string output of your intended function in each
cell.\n",
"\n",
"</div>"
]
},
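The debugging tip can be tried as follows. This sketch reuses the `highlight_max` function defined earlier in this guide (repeated here so the block runs on its own) and applies it with `DataFrame.apply` to a small made-up frame, so the CSS strings can be read directly instead of being rendered.

```python
import numpy as np
import pandas as pd

def highlight_max(s, props=""):
    # Return `props` where the value equals the column maximum, else "".
    return np.where(s == np.nanmax(s.values), props, "")

# Hypothetical data for illustration only.
df_debug = pd.DataFrame({"A": [1.0, 3.0, 2.0], "B": [9.0, 4.0, 5.0]})

# DataFrame.apply returns the CSS strings themselves, so they can be inspected.
css_debug = df_debug.apply(highlight_max, props="color:red;", axis=0)
print(css_debug)
```

If a cell shows an unexpected string here, the style function is at fault rather than the Styler.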
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Acting on the Index and Column Headers\n",
"\n",
"Similar application is achieved for headers by using:\n",
" \n",
"- [.map_index()][mapindex] (elementwise): accepts a function that takes a
single value and returns a string with the CSS attribute-value pair.\n",
"- [.apply_index()][applyindex] (level-wise): accepts a function that takes a
Series and returns a Series, or numpy array with an identical shape where each
element is a string with a CSS attribute-value pair. This method passes each level
of your Index one-at-a-time. To style the index use `axis=0` and to style the
column headers use `axis=1`.\n",
"\n",
"You can select a `level` of a `MultiIndex` but currently no similar `subset`
application is available for these methods.\n",
"\n",
"[applyindex]: ../reference/api/pandas.io.formats.style.Styler.apply_index.rst\n",
"[mapindex]: ../reference/api/pandas.io.formats.style.Styler.map_index.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s2.map_index(lambda v: \"color:pink;\" if v>4 else \"color:darkblue;\",
axis=0)\n",
"s2.apply_index(lambda s:
np.where(s.isin([\"A\", \"B\"]), \"color:pink;\", \"color:darkblue;\"), axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tooltips and Captions\n",
"\n",
"Table captions can be added with the [.set_caption()][caption] method. You can
use table styles to control the CSS relevant to the caption.\n",
"\n",
"[caption]: ../reference/api/pandas.io.formats.style.Styler.set_caption.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.set_caption(\"Confusion matrix for multiple cancer prediction models.\")\\\n",
" .set_table_styles([{\n",
" 'selector': 'caption',\n",
" 'props': 'caption-side: bottom; font-size:1.25em;'\n",
" }], overwrite=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_caption')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Adding tooltips (*since version 1.3.0*) can be done using the
[.set_tooltips()][tooltips] method in the same way you can add CSS classes to data
cells by providing a string based DataFrame with intersecting indices and columns.
You don't have to specify a `css_class` name or any css `props` for the tooltips,
since there are standard defaults, but the option is there if you want more visual
control. \n",
"\n",
"[tooltips]: ../reference/api/pandas.io.formats.style.Styler.set_tooltips.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tt = pd.DataFrame([['This model has a very strong true positive rate', \n",
" \"This model's total number of false negatives is too
high\"]], \n",
" index=['Tumour (Positive)'], columns=df.columns[[0,3]])\n",
"s.set_tooltips(tt, props='visibility: hidden; position: absolute; z-index: 1;
border: 1px solid #000066;'\n",
" 'background-color: white; color: #000066; font-size:
0.8em;' \n",
" 'transform: translate(0px, -24px); padding: 0.6em;
border-radius: 0.5em;')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_tooltips')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The only thing left to do for our table is to add the highlighting borders to
draw the audience's attention to the tooltips. We will create internal CSS classes as
before using table styles. **Setting classes always overwrites** so we need to make
sure we add the previous classes."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"s.set_table_styles([ # create internal CSS classes\n",
" {'selector': '.border-red', 'props': 'border: 2px dashed red;'},\n",
" {'selector': '.border-green', 'props': 'border: 2px dashed green;'},\n",
"], overwrite=False)\n",
"cell_border = pd.DataFrame([['border-green ', ' ', ' ', 'border-red '], \n",
" [' ', ' ', ' ', ' ']], \n",
" index=df.index, \n",
" columns=df.columns[:4])\n",
"s.set_td_classes(cell_color + cell_border)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hidden cell to avoid CSS clashes and later code overwriting previous formatting\n",
"s.set_uuid('after_borders')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Finer Control with Slicing\n",
"\n",
"The examples we have shown so far for the `Styler.apply` and `Styler.map`
functions have not demonstrated the use of the ``subset`` argument. This is a
useful argument which permits a lot of flexibility: it allows you to apply styles
to specific rows or columns, without having to code that logic into your `style`
function.\n",
"\n",
"The value passed to `subset` behaves similar to slicing a DataFrame;\n",
"\n",
"- A scalar is treated as a column label\n",
"- A list (or Series or NumPy array) is treated as multiple column labels\n",
"- A tuple is treated as `(row_indexer, column_indexer)`\n",
"\n",
"Consider using `pd.IndexSlice` to construct the tuple for the last one. We
will create a MultiIndexed DataFrame to demonstrate the functionality."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df3 = pd.DataFrame(np.random.randn(4,4), \n",
" pd.MultiIndex.from_product([['A', 'B'], ['r1', 'r2']]),\n",
" columns=['c1','c2','c3','c4'])\n",
"df3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use subset to highlight the maximum in the third and fourth columns
with red text. We will highlight the subset sliced region in yellow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"slice_ = ['c3', 'c4']\n",
"df3.style.apply(highlight_max, props='color:red;', axis=0, subset=slice_)\\\
n",
" .set_properties(**{'background-color': '#ffffb3'}, subset=slice_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If combined with the ``IndexSlice`` as suggested then it can index across both
dimensions with greater flexibility."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"idx = pd.IndexSlice\n",
"slice_ = idx[idx[:,'r1'], idx['c2':'c4']]\n",
"df3.style.apply(highlight_max, props='color:red;', axis=0, subset=slice_)\\\
n",
" .set_properties(**{'background-color': '#ffffb3'}, subset=slice_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This also provides the flexibility to sub select rows when used with the
`axis=1`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"slice_ = idx[idx[:,'r2'], :]\n",
"df3.style.apply(highlight_max, props='color:red;', axis=1, subset=slice_)\\\
n",
" .set_properties(**{'background-color': '#ffffb3'}, subset=slice_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is also scope to provide **conditional filtering**. \n",
"\n",
"Suppose we want to highlight the maximum across columns 2 and 4 only in the
case that the sum of columns 1 and 3 is less than -2.0 *(essentially excluding
rows* `(:,'r2')`*)*."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"slice_ = idx[idx[(df3['c1'] + df3['c3']) < -2.0], ['c2', 'c4']]\n",
"df3.style.apply(highlight_max, props='color:red;', axis=1, subset=slice_)\\\
n",
" .set_properties(**{'background-color': '#ffffb3'}, subset=slice_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Only label-based slicing is supported right now, not positional, and not
callables.\n",
"\n",
"If your style function uses a `subset` or `axis` keyword argument, consider
wrapping your function in a `functools.partial`, partialing out that keyword.\n",
"\n",
"```python\n",
"my_func2 = functools.partial(my_func, subset=42)\n",
"```"
]
},
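{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of that advice (the function `flag_large`, its threshold and the small `demo` frame are hypothetical, not part of pandas): the style function below has its own `subset` keyword, which would collide with the `subset` argument of `Styler.apply`, so it is bound beforehand with `functools.partial`.\n",
"\n",
"```python\n",
"import functools\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"def flag_large(s, subset=1.0, props='font-weight:bold;'):\n",
"    # Hypothetical style function: here `subset` is a numeric threshold,\n",
"    # not a slice, so it clashes with the keyword of Styler.apply.\n",
"    return np.where(s > subset, props, '')\n",
"\n",
"demo = pd.DataFrame({'c1': [0.5, 2.0], 'c2': [1.5, -1.0]})\n",
"\n",
"# Bind the function's own keyword first; Styler.apply then keeps\n",
"# its `subset` argument for choosing the columns to style.\n",
"flag_above_one = functools.partial(flag_large, subset=1.0)\n",
"demo.style.apply(flag_above_one, axis=0, subset=['c1'])\n",
"```\n",
"\n",
"Binding the keyword up front leaves `subset` and `axis` in the `apply` call unambiguous."
]
},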
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optimization\n",
"\n",
"Generally, for smaller tables and most cases, the rendered HTML does not need
to be optimized, and we don't really recommend it. There are two cases where it is
worth considering:\n",
"\n",
" - If you are rendering and styling a very large HTML table, certain browsers
have performance issues.\n",
" - If you are using ``Styler`` to dynamically create part of online user
interfaces and want to improve network performance.\n",
" \n",
"Here we recommend the following steps to implement:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1. Remove UUID and cell_ids\n",
"\n",
"Ignore the `uuid` and set `cell_ids` to `False`. This will prevent unnecessary
HTML."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-warning\">\n",
"\n",
"<font color=red>This is sub-optimal:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4 = pd.DataFrame([[1,2],[3,4]])\n",
"s4 = df4.style"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\">\n",
"\n",
"<font color=green>This is better:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pandas.io.formats.style import Styler\n",
"s4 = Styler(df4, uuid_len=0, cell_ids=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Use table styles\n",
"\n",
"Use table styles where possible (e.g. for all cells or rows or columns at a
time) since the CSS is nearly always more efficient than other formats."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-warning\">\n",
"\n",
"<font color=red>This is sub-optimal:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"props = 'font-family: \"Times New Roman\", Times, serif; color: #e83e8c; font-
size:1.3em;'\n",
"df4.style.map(lambda x: props, subset=[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\">\n",
"\n",
"<font color=green>This is better:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.set_table_styles([{'selector': 'td.col1', 'props': props}])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Set classes instead of using Styler functions\n",
"\n",
"For large DataFrames where the same style is applied to many cells it can be
more efficient to declare the styles as classes and then apply those classes to
data cells, rather than directly applying styles to cells. It is, however, probably
still easier to use the Styler function api when you are not concerned about
optimization."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-warning\">\n",
"\n",
"<font color=red>This is sub-optimal:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.style.apply(highlight_max, props='color:white;background-color:darkblue;',
axis=0)\\\n",
" .apply(highlight_max, props='color:white;background-color:pink;',
axis=1)\\\n",
" .apply(highlight_max, props='color:white;background-color:purple',
axis=None)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\">\n",
"\n",
"<font color=green>This is better:</font>\n",
"\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"build = lambda x: pd.DataFrame(x, index=df2.index, columns=df2.columns)\n",
"cls1 = build(df2.apply(highlight_max, props='cls-1 ', axis=0))\n",
"cls2 = build(df2.apply(highlight_max, props='cls-2 ', axis=1,
result_type='expand').values)\n",
"cls3 = build(highlight_max(df2, props='cls-3 '))\n",
"df2.style.set_table_styles([\n",
" {'selector': '.cls-1', 'props': 'color:white;background-
color:darkblue;'},\n",
" {'selector': '.cls-2', 'props': 'color:white;background-color:pink;'},\n",
" {'selector': '.cls-3', 'props': 'color:white;background-color:purple;'}\
n",
"]).set_td_classes(cls1 + cls2 + cls3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Don't use tooltips\n",
"\n",
"Tooltips require `cell_ids` to work and they generate extra HTML elements for
*every* data cell."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. If every byte counts use string replacement\n",
"\n",
"You can remove unnecessary HTML, or shorten the default class names by
replacing the default css dict. You can read a little more about CSS [below](#More-
About-CSS-and-HTML)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"my_css = {\n",
" \"row_heading\": \"\",\n",
" \"col_heading\": \"\",\n",
" \"index_name\": \"\",\n",
" \"col\": \"c\",\n",
" \"row\": \"r\",\n",
" \"col_trim\": \"\",\n",
" \"row_trim\": \"\",\n",
" \"level\": \"l\",\n",
" \"data\": \"\",\n",
" \"blank\": \"\",\n",
"}\n",
"html = Styler(df4, uuid_len=0, cell_ids=False)\n",
"html.set_table_styles([{'selector': 'td', 'props': props},\n",
" {'selector': '.c1', 'props': 'color:green;'},\n",
" {'selector': '.l0', 'props': 'color:blue;'}],\n",
" css_class_names=my_css)\n",
"print(html.to_html())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Builtin Styles"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some styling functions are common enough that we've \"built them in\" to the
`Styler`, so you don't have to write them and apply them yourself. The current list
of such functions is:\n",
"\n",
" - [.highlight_null][nullfunc]: for use with identifying missing data. \n",
" - [.highlight_min][minfunc] and [.highlight_max][maxfunc]: for use with
identifying extremeties in data.\n",
" - [.highlight_between][betweenfunc] and [.highlight_quantile][quantilefunc]:
for use with identifying classes within data.\n",
" - [.background_gradient][bgfunc]: a flexible method for highlighting cells
based on their, or other, values on a numeric scale.\n",
" - [.text_gradient][textfunc]: similar method for highlighting text based on
their, or other, values on a numeric scale.\n",
" - [.bar][barfunc]: to display mini-charts within cell backgrounds.\n",
" \n",
"The individual documentation on each function often gives more examples of
their arguments.\n",
"\n",
"[nullfunc]:
../reference/api/pandas.io.formats.style.Styler.highlight_null.rst\n",
"[minfunc]: ../reference/api/pandas.io.formats.style.Styler.highlight_min.rst\
n",
"[maxfunc]: ../reference/api/pandas.io.formats.style.Styler.highlight_max.rst\
n",
"[betweenfunc]:
../reference/api/pandas.io.formats.style.Styler.highlight_between.rst\n",
"[quantilefunc]:
../reference/api/pandas.io.formats.style.Styler.highlight_quantile.rst\n",
"[bgfunc]:
../reference/api/pandas.io.formats.style.Styler.background_gradient.rst\n",
"[textfunc]: ../reference/api/pandas.io.formats.style.Styler.text_gradient.rst\
n",
"[barfunc]: ../reference/api/pandas.io.formats.style.Styler.bar.rst"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Highlight Null"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.iloc[0,2] = np.nan\n",
"df2.iloc[4,3] = np.nan\n",
"df2.loc[:4].style.highlight_null(color='yellow')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Highlight Min or Max"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.loc[:4].style.highlight_max(axis=1, props='color:white; font-weight:bold;
background-color:darkblue;')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Highlight Between\n",
"This method accepts ranges as float, or NumPy arrays or Series provided the
indexes match."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"left = pd.Series([1.0, 0.0, 1.0], index=[\"A\", \"B\", \"D\"])\n",
"df2.loc[:4].style.highlight_between(left=left, right=1.5, axis=1,
props='color:white; background-color:purple;')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Highlight Quantile\n",
"Useful for detecting the highest or lowest percentile values"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.loc[:4].style.highlight_quantile(q_left=0.85, axis=None, color='yellow')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Background Gradient and Text Gradient"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can create \"heatmaps\" with the `background_gradient` and `text_gradient`
methods. These require matplotlib, and we'll use
[Seaborn](http://seaborn.pydata.org/) to get a nice colormap."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"cm = sns.light_palette(\"green\", as_cmap=True)\n",
"\n",
"df2.style.background_gradient(cmap=cm)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.style.text_gradient(cmap=cm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[.background_gradient][bgfunc] and [.text_gradient][textfunc] have a number of
keyword arguments to customise the gradients and colors. See the documentation.\n",
"\n",
"[bgfunc]:
../reference/api/pandas.io.formats.style.Styler.background_gradient.rst\n",
"[textfunc]: ../reference/api/pandas.io.formats.style.Styler.text_gradient.rst"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set properties\n",
"\n",
"Use `Styler.set_properties` when the style doesn't actually depend on the
values. This is just a simple wrapper for `.map` where the function returns the
same properties for all cells."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.loc[:4].style.set_properties(**{'background-color': 'black',\n",
" 'color': 'lawngreen',\n",
" 'border-color': 'white'})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Bar charts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can include \"bar charts\" in your DataFrame."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.style.bar(subset=['A', 'B'], color='#d65f5f')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Additional keyword arguments give more control on centering and positioning,
and you can pass a list of `[color_negative, color_positive]` to highlight lower
and higher values or a matplotlib colormap.\n",
"\n",
"To showcase an example here's how you can change the above with the new
`align` option, combined with setting `vmin` and `vmax` limits, the `width` of the
figure, and underlying css `props` of cells, leaving space to display the text and
the bars. We also use `text_gradient` to color the text the same as the bars using
a matplotlib colormap (although in this case the visualization is probably better
without this additional effect)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.style.format('{:.3f}', na_rep=\"\")\\\n",
" .bar(align=0, vmin=-2.5, vmax=2.5, cmap=\"bwr\", height=50,\n",
" width=60, props=\"width: 120px; border-right: 1px solid
black;\")\\\n",
" .text_gradient(cmap=\"bwr\", vmin=-2.5, vmax=2.5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following example aims to give a highlight of the behavior of the new
align options:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# Hide the construction of the display chart from the user\n",
"import pandas as pd\n",
"from IPython.display import HTML\n",
"\n",
"# Test series\n",
"test1 = pd.Series([-100,-60,-30,-20], name='All Negative')\n",
"test2 = pd.Series([-10,-5,0,90], name='Both Pos and Neg')\n",
"test3 = pd.Series([10,20,50,100], name='All Positive')\n",
"test4 = pd.Series([100, 103, 101, 102], name='Large Positive')\n",
"\n",
"\n",
"head = \"\"\"\n",
"<table>\n",
" <thead>\n",
" <th>Align</th>\n",
" <th>All Negative</th>\n",
" <th>Both Neg and Pos</th>\n",
" <th>All Positive</th>\n",
" <th>Large Positive</th>\n",
" </thead>\n",
" </tbody>\n",
"\n",
"\"\"\"\n",
"\n",
"aligns = ['left', 'right', 'zero', 'mid', 'mean', 99]\n",
"for align in aligns:\n",
" row = \"<tr><th>{}</th>\".format(align)\n",
" for series in [test1,test2,test3, test4]:\n",
" s = series.copy()\n",
" s.name=''\n",
" row +=
\"<td>{}</td>\".format(s.to_frame().style.hide(axis='index').bar(align=align, \n",
" color=['#d65f5f',
'#5fba7d'], \n",
"
width=100).to_html()) #testn['width']\n",
" row += '</tr>'\n",
" head += row\n",
" \n",
"head+= \"\"\"\n",
"</tbody>\n",
"</table>\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"HTML(head)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sharing styles"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Say you have a lovely style built up for a DataFrame, and now you want to
apply the same style to a second DataFrame. Export the style with
`df1.style.export`, and import it on the second DataFrame with `df1.style.set`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"style1 = df2.style\\\n",
" .map(style_negative, props='color:red;')\\\n",
" .map(lambda v: 'opacity: 20%;' if (v < 0.3) and (v > -0.3) else
None)\\\n",
" .set_table_styles([{\"selector\": \"th\", \"props\": \"color:
blue;\"}])\\\n",
" .hide(axis=\"index\")\n",
"style1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"style2 = df3.style\n",
"style2.use(style1.export())\n",
"style2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that you're able to share the styles even though they're data aware.
The styles are re-evaluated on the new DataFrame they've been `use`d upon."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Limitations\n",
"\n",
"- DataFrame only (use `Series.to_frame().style`)\n",
"- The index and columns do not need to be unique, but certain styling
functions can only work with unique indexes.\n",
"- No large repr, and construction performance isn't great; although we have
some [HTML optimizations](#Optimization)\n",
"- You can only apply styles, you can't insert new HTML entities, except via
subclassing."
]
},
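{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first limitation can be worked around in one line. A minimal sketch, assuming a small hypothetical Series named `ser`:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data; any Series works the same way.\n",
"ser = pd.Series([1.5, -0.3, 2.1], name='value')\n",
"\n",
"# A Series has no .style accessor, so convert it to a\n",
"# one-column DataFrame first and style that instead.\n",
"ser.to_frame().style.highlight_max(color='yellow')\n",
"```"
]
},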
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Other Fun and Useful Stuff\n",
"\n",
"Here are a few interesting examples."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Widgets\n",
"\n",
"`Styler` interacts pretty well with widgets. If you're viewing this online
instead of running the notebook yourself, you're missing out on interactively
adjusting the color palette."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from ipywidgets import widgets\n",
"@widgets.interact\n",
"def f(h_neg=(0, 359, 1), h_pos=(0, 359), s=(0., 99.9), l=(0., 99.9)):\n",
" return df2.style.background_gradient(\n",
" cmap=sns.palettes.diverging_palette(h_neg=h_neg, h_pos=h_pos, s=s,
l=l,\n",
" as_cmap=True)\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Magnify"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def magnify():\n",
" return [dict(selector=\"th\",\n",
" props=[(\"font-size\", \"4pt\")]),\n",
" dict(selector=\"td\",\n",
" props=[('padding', \"0em 0em\")]),\n",
" dict(selector=\"th:hover\",\n",
" props=[(\"font-size\", \"12pt\")]),\n",
" dict(selector=\"tr:hover td:hover\",\n",
" props=[('max-width', '200px'),\n",
" ('font-size', '12pt')])\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np.random.seed(25)\n",
"cmap = cmap=sns.diverging_palette(5, 250, as_cmap=True)\n",
"bigdf = pd.DataFrame(np.random.randn(20, 25)).cumsum()\n",
"\n",
"bigdf.style.background_gradient(cmap, axis=1)\\\n",
" .set_properties(**{'max-width': '80px', 'font-size': '1pt'})\\\n",
" .set_caption(\"Hover to magnify\")\\\n",
" .format(precision=2)\\\n",
" .set_table_styles(magnify())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sticky Headers\n",
"\n",
"If you display a large matrix or DataFrame in a notebook, but you want to
always see the column and row headers you can use the [.set_sticky][sticky] method
which manipulates the table styles CSS.\n",
"\n",
"[sticky]: ../reference/api/pandas.io.formats.style.Styler.set_sticky.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bigdf = pd.DataFrame(np.random.randn(16, 100))\n",
"bigdf.style.set_sticky(axis=\"index\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is also possible to stick MultiIndexes and even only specific levels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bigdf.index = pd.MultiIndex.from_product([[\"A\",\"B\"],[0,1],[0,1,2,3]])\n",
"bigdf.style.set_sticky(axis=\"index\", pixel_size=18, levels=[1,2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### HTML Escaping\n",
"\n",
"Suppose you have to display HTML within HTML, that can be a bit of pain when
the renderer can't distinguish. You can use the `escape` formatting option to
handle this, and even use it within a formatter that contains HTML itself."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4 = pd.DataFrame([['<div></div>', '\"&other\"', '<span></span>']])\n",
"df4.style"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.format(escape=\"html\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.format('<a href=\"https://pandas.pydata.org\"
target=\"_blank\">{}</a>', escape=\"html\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Export to Excel\n",
"\n",
"Some support (*since version 0.20.0*) is available for exporting styled
`DataFrames` to Excel worksheets using the `OpenPyXL` or `XlsxWriter` engines.
CSS2.2 properties handled include:\n",
"\n",
"- `background-color`\n",
"- `border-style` properties\n",
"- `border-width` properties\n",
"- `border-color` properties\n",
"- `color`\n",
"- `font-family`\n",
"- `font-style`\n",
"- `font-weight`\n",
"- `text-align`\n",
"- `text-decoration`\n",
"- `vertical-align`\n",
"- `white-space: nowrap`\n",
"\n",
"\n",
"- Shorthand and side-specific border properties are supported (e.g. `border-
style` and `border-left-style`) as well as the `border` shorthands for all sides
(`border: 1px solid green`) or specified sides (`border-left: 1px solid green`).
Using a `border` shorthand will override any border properties set before it (See
[CSS Working Group](https://drafts.csswg.org/css-backgrounds/#border-shorthands)
for more details)\n",
"\n",
"\n",
"- Only CSS2 named colors and hex colors of the form `#rgb` or `#rrggbb` are
currently supported.\n",
"- The following pseudo CSS properties are also available to set Excel specific
style properties:\n",
" - `number-format`\n",
" - `border-style` (for Excel-specific
styles: \"hair\", \"mediumDashDot\", \"dashDotDot\", \"mediumDashDotDot\", \"dashDo
t\", \"slantDashDot\", or \"mediumDashed\")\n",
"\n",
"Table level styles, and data cell CSS-classes are not included in the export
to Excel: individual cells must have their properties mapped by the `Styler.apply`
and/or `Styler.map` methods."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df2.style.\\\n",
" map(style_negative, props='color:red;').\\\n",
" highlight_max(axis=0).\\\n",
" to_excel('styled.xlsx', engine='openpyxl')"
]
},
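{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Excel-specific pseudo properties are set in the same way as ordinary CSS. A minimal sketch, assuming the `openpyxl` engine is installed; the `rates` frame and the file name `styled_rates.xlsx` are hypothetical:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data and file name, for illustration only.\n",
"rates = pd.DataFrame({'rate': [0.125, 0.5]})\n",
"\n",
"# `number-format` is a pseudo CSS property: it has no effect in HTML\n",
"# and is only used when the styled cells are written to Excel.\n",
"(\n",
"    rates.style\n",
"    .map(lambda v: 'number-format: 0.00%;')\n",
"    .to_excel('styled_rates.xlsx', engine='openpyxl')\n",
")\n",
"```"
]
},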
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A screenshot of the output:\n",
"\n",
"![Excel spreadsheet with styled DataFrame](../_static/style-excel.png)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Export to LaTeX\n",
"\n",
"There is support (*since version 1.3.0*) to export `Styler` to LaTeX. The
documentation for the [.to_latex][latex] method gives further detail and numerous
examples.\n",
"\n",
"[latex]: ../reference/api/pandas.io.formats.style.Styler.to_latex.rst"
]
},
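{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of a LaTeX export, assuming pandas 1.4 or later (for the `convert_css` argument); the `tex_df` frame is hypothetical:\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Hypothetical data, for illustration only.\n",
"tex_df = pd.DataFrame({'a': [1.0, -2.5], 'b': [3.25, 4.0]})\n",
"\n",
"latex = (\n",
"    tex_df.style\n",
"    .format(precision=2)\n",
"    .highlight_max(axis=None, props='font-weight:bold;')\n",
"    # convert_css translates the supported CSS (here bold) into LaTeX commands;\n",
"    # hrules adds booktabs rules, so the LaTeX document needs that package.\n",
"    .to_latex(convert_css=True, hrules=True)\n",
")\n",
"print(latex)\n",
"```"
]
},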
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More About CSS and HTML\n",
"\n",
"Cascading Style Sheet (CSS) language, which is designed to influence how a
browser renders HTML elements, has its own peculiarities. It never reports errors:
it just silently ignores them and doesn't render your objects how you intend so can
sometimes be frustrating. Here is a very brief primer on how ``Styler`` creates
HTML and interacts with CSS, with advice on common pitfalls to avoid."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### CSS Classes and Ids\n",
"\n",
"The precise structure of the CSS `class` attached to each cell is as follows.\
n",
"\n",
"- Cells with Index and Column names include `index_name` and `level<k>` where
`k` is its level in a MultiIndex\n",
"- Index label cells include\n",
" + `row_heading`\n",
" + `level<k>` where `k` is the level in a MultiIndex\n",
" + `row<m>` where `m` is the numeric position of the row\n",
"- Column label cells include\n",
" + `col_heading`\n",
" + `level<k>` where `k` is the level in a MultiIndex\n",
" + `col<n>` where `n` is the numeric position of the column\n",
"- Data cells include\n",
" + `data`\n",
" + `row<m>`, where `m` is the numeric position of the cell.\n",
" + `col<n>`, where `n` is the numeric position of the cell.\n",
"- Blank cells include `blank`\n",
"- Trimmed cells include `col_trim` or `row_trim`\n",
"\n",
"The structure of the `id` is `T_uuid_level<k>_row<m>_col<n>` where `level<k>`
is used only on headings, and headings will only have either `row<m>` or `col<n>`
whichever is needed. By default we've also prepended each row/column identifier
with a UUID unique to each DataFrame so that the style from one doesn't collide
with the styling from another within the same notebook or page. You can read more
about the use of UUIDs in [Optimization](#Optimization).\n",
"\n",
"We can see example of the HTML by calling the [.to_html()][tohtml] method.\n",
"\n",
"[tohtml]: ../reference/api/pandas.io.formats.style.Styler.to_html.rst"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(pd.DataFrame([[1,2],[3,4]], index=['i1', 'i2'], columns=['c1',
'c2']).style.to_html())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### CSS Hierarchies\n",
"\n",
"The examples have shown that when CSS styles overlap, the one that comes last
in the HTML render, takes precedence. So the following yield different results:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4 = pd.DataFrame([['text']])\n",
"df4.style.map(lambda x: 'color:green;')\\\n",
" .map(lambda x: 'color:red;')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.map(lambda x: 'color:red;')\\\n",
" .map(lambda x: 'color:green;')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is only true for CSS rules that are equivalent in hierarchy, or
importance. You can read more about [CSS specificity
here](https://www.w3schools.com/css/css_specificity.asp) but for our purposes it
suffices to summarize the key points:\n",
"\n",
"A CSS importance score for each HTML element is derived by starting at zero
and adding:\n",
"\n",
" - 1000 for an inline style attribute\n",
" - 100 for each ID\n",
" - 10 for each attribute, class or pseudo-class\n",
" - 1 for each element name or pseudo-element\n",
" \n",
"Let's use this to describe the action of the following configurations"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.set_uuid('a_')\\\n",
" .set_table_styles([{'selector': 'td', 'props': 'color:red;'}])\\\n",
" .map(lambda x: 'color:green;')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This text is red because the generated selector `#T_a_ td` is worth 101 (ID
plus element), whereas `#T_a_row0_col0` is only worth 100 (ID), so is considered
inferior even though in the HTML it comes after the previous."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.set_uuid('b_')\\\n",
" .set_table_styles([{'selector': 'td', 'props': 'color:red;'},\n",
" {'selector': '.cls-1', 'props':
'color:blue;'}])\\\n",
" .map(lambda x: 'color:green;')\\\n",
" .set_td_classes(pd.DataFrame([['cls-1']]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the above case the text is blue because the selector `#T_b_ .cls-1` is
worth 110 (ID plus class), which takes precedence."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.set_uuid('c_')\\\n",
" .set_table_styles([{'selector': 'td', 'props': 'color:red;'},\n",
" {'selector': '.cls-1', 'props': 'color:blue;'},\
n",
" {'selector': 'td.data', 'props':
'color:yellow;'}])\\\n",
" .map(lambda x: 'color:green;')\\\n",
" .set_td_classes(pd.DataFrame([['cls-1']]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we have created another table style this time the selector `T_c_ td.data`
(ID plus element plus class) gets bumped up to 111. \n",
"\n",
"If your style fails to be applied, and its really frustrating, try the `!
important` trump card."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df4.style.set_uuid('d_')\\\n",
" .set_table_styles([{'selector': 'td', 'props': 'color:red;'},\n",
" {'selector': '.cls-1', 'props': 'color:blue;'},\
n",
" {'selector': 'td.data', 'props':
'color:yellow;'}])\\\n",
" .map(lambda x: 'color:green !important;')\\\n",
" .set_td_classes(pd.DataFrame([['cls-1']]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally got that green text after all!"
]
},
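{
"cell_type": "markdown",
"metadata": {},
"source": [
"The scores quoted in this section can be reproduced with a rough sketch of the additive rule. The helper `specificity` is hypothetical and deliberately simplified: it only counts IDs, classes and element names, and ignores inline styles, attribute selectors, pseudo-classes and pseudo-elements.\n",
"\n",
"```python\n",
"import re\n",
"\n",
"def specificity(selector):\n",
"    # Count the parts of a simple selector and weight them 100 / 10 / 1.\n",
"    ids = len(re.findall(r'#[\\w-]+', selector))\n",
"    classes = len(re.findall(r'\\.[\\w-]+', selector))\n",
"    # An element name starts the selector or follows whitespace or a combinator.\n",
"    elements = len(re.findall(r'(?:^|[\\s>+~])[a-zA-Z][\\w-]*', selector))\n",
"    return 100 * ids + 10 * classes + elements\n",
"\n",
"for sel in ['#T_a_ td',        # 101\n",
"            '#T_a_row0_col0',  # 100\n",
"            '#T_b_ .cls-1',    # 110\n",
"            '#T_c_ td.data']:  # 111\n",
"    print(sel, specificity(sel))\n",
"```\n",
"\n",
"The rule with the highest score wins, unless a competing declaration is marked `!important`."
]
},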
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extensibility\n",
"\n",
"The core of pandas is, and will remain, its \"high-performance, easy-to-use
data structures\".\n",
"With that in mind, we hope that `DataFrame.style` accomplishes two goals\n",
"\n",
"- Provide an API that is pleasing to use interactively and is \"good enough\"
for many tasks\n",
"- Provide the foundations for dedicated libraries to build on\n",
"\n",
"If you build a great library on top of this, let us know and we'll [link]
(https://pandas.pydata.org/pandas-docs/stable/ecosystem.html) to it.\n",
"\n",
"### Subclassing\n",
"\n",
"If the default template doesn't quite suit your needs, you can subclass Styler
and extend or override the template.\n",
"We'll show an example of extending the default template to insert a custom
header before each table."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from jinja2 import Environment, ChoiceLoader, FileSystemLoader\n",
"from IPython.display import HTML\n",
"from pandas.io.formats.style import Styler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll use the following template:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(\"templates/myhtml.tpl\") as f:\n",
" print(f.read())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we've created a template, we need to set up a subclass of ``Styler``
that\n",
"knows about it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class MyStyler(Styler):\n",
" env = Environment(\n",
" loader=ChoiceLoader([\n",
" FileSystemLoader(\"templates\"), # contains ours\n",
" Styler.loader, # the default\n",
" ])\n",
" )\n",
" template_html_table = env.get_template(\"myhtml.tpl\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that we include the original loader in our environment's loader.\n",
"That's because we extend the original template, so the Jinja environment
needs\n",
"to be able to find it.\n",
"\n",
"Now we can use that custom styler. Its `__init__` takes a DataFrame."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"MyStyler(df3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our custom template accepts a `table_title` keyword. We can provide the value
in the `.to_html` method."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"HTML(MyStyler(df3).to_html(table_title=\"Extending Example\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For convenience, we provide the `Styler.from_custom_template` method that does
the same as the custom subclass."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"EasyStyler = Styler.from_custom_template(\"templates\", \"myhtml.tpl\")\n",
"HTML(EasyStyler(df3).to_html(table_title=\"Another Title\"))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Template Structure\n",
"\n",
"Here's the template structure for both the style generation template and the table generation template:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Style template:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"with open(\"templates/html_style_structure.html\") as f:\n",
"    style_structure = f.read()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"HTML(style_structure)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Table template:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"with open(\"templates/html_table_structure.html\") as f:\n",
"    table_structure = f.read()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"HTML(table_structure)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the template in the [GitHub repo](https://github.com/pandas-dev/pandas) for more details."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"nbsphinx": "hidden"
},
"outputs": [],
"source": [
"# # Hack to get the same style in the notebook as the\n",
"# # main site. This is hidden in the docs.\n",
"# from IPython.display import HTML\n",
"# with open(\"themes/nature_with_gtoc/static/nature.css_t\") as f:\n",
"# css = f.read()\n",
" \n",
"# HTML('<style>{}</style>'.format(css))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.5"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
