Data Visualization using Plotnine and ggplot2 in Python
Last Updated: 22 Jul, 2025
Plotnine is a Python data visualization library built on the principles of the Grammar of Graphics, the same philosophy that powers ggplot2 in R. It allows users to create complex plots by layering components such as data, aesthetics and geometric objects.
Installing Plotnine in Python
Plotnine brings the grammar of graphics from R's ggplot2 to Python. To install it, run the following command in the terminal.
pip install plotnine
Core Components of Plotnine
Plotnine’s design revolves around three essential components:
- Data: The dataset you want to visualize.
- Aesthetics (aes): Mapping of data variables to visual properties like axes, color, size, shape, etc.
- Geometric Objects (geoms): The visual marks used to represent data points (e.g., points, bars, lines).
The basic structure of Plotnine is built around the ggplot() function and geometric objects (geoms). Here's the general template:
from plotnine import ggplot, aes, geom_point
(ggplot(data, aes(x='x_variable', y='y_variable')) + geom_point())
Plotting with Plotnine – Step-by-Step
1. Data: We will use the Iris dataset and will read it using Pandas.
Python
import pandas as pd
from plotnine import ggplot
df = pd.read_csv("Iris.csv")
# passing the data to the ggplot constructor
ggplot(df)
Output

This will give us a blank output as we have not specified the other two main components.
2. Aesthetics: This step involves defining which variables from the dataset correspond to the x and y axes, colors, shapes and other attributes. For instance, you may want to map the species of flowers to colors or map sepal length to the y-axis.
Example: Defining Aesthetics of the Plotnine
Python
import pandas as pd
from plotnine import ggplot, aes
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="Species", y="SepalLengthCm")
Output

In the above example, we can see that Species is shown on the x-axis and sepal length is shown on the y-axis. But still there is no figure in the plot. This can be added using geometric objects.
3. Geometric Objects: After specifying the data and aesthetics, the final step is to define geoms (geometric objects). Whether you want scatter plots, bar charts or histograms, Plotnine provides various geoms to display data effectively.
Python
import pandas as pd
from plotnine import ggplot, aes, geom_col
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_col()
Output

In the above example, we have used the geom_col() geom, which draws a bar plot with its base on the x-axis. Rows that share the same x value are stacked, so each bar shows the total sepal length for that species. We can swap it for any other geom that suits the plot.
Plotting Basic Charts with Plotnine in Python
Plotnine allows users to create complex plots using a declarative syntax, making it easier to build, customize, and manage plots. In this section, we will cover how to create basic charts using Plotnine, including scatter plots, line charts, bar charts, box plots, and histograms.
Example 1: Plotting Histogram with Plotnine
Python
import pandas as pd
from plotnine import ggplot, aes, geom_histogram
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="SepalLengthCm") + geom_histogram()
Output

Example 2: Plotting Scatter plot With Plotnine
Python
import pandas as pd
from plotnine import ggplot, aes, geom_point
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_point()
Output

Example 3: Plotting Box plot with Plotnine
Python
import pandas as pd
from plotnine import ggplot, aes, geom_boxplot
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_boxplot()
Output

Example 4: Plotting Line chart with Plotnine
Python
import pandas as pd
from plotnine import ggplot, aes, geom_line
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="Species", y="SepalLengthCm") + geom_line()
Output

So far we have learnt how to create a basic chart using the grammar of graphics and its three main components. Now let's learn how to customize these charts using the other optional components.
Enhancing Data Visualizations Using Plotnine - Customizations
There are various optional components that can make the plot more meaningful and presentable. These are:
- Facets split the data into subsets and plot each subset in its own panel.
- Statistical transformations compute summaries of the data (such as bin counts) before plotting.
- Coordinates define the position of the object in a 2D plane.
- Themes define the presentation of the data such as font, color, etc.
1. Facets
Let's consider the tips dataset, which contains information about restaurant visits such as the total bill, the tip left, the customer's gender, the day and so on.
Now suppose we want to plot the total bill for each day, split by gender.
Python
import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col
df = pd.read_csv("tips.csv")
(
    ggplot(df)
    + facet_grid("~sex")  # one panel per value of sex
    + aes(x="day", y="total_bill")
    + labs(
        x="day",
        y="total_bill",
    )
    + geom_col()
)
Output

2. Statistical Transformations
In the histogram example above we plotted the distribution of the sepal length column. Suppose we now want to group those measurements into 15 bins. The geom_histogram() function of plotnine computes the bin counts and plots them automatically.
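To make the idea of a statistical transformation concrete, the following standard-library sketch counts values per equal-width bin. It is only an illustration under simplifying assumptions (made-up sample values and a hypothetical helper named bin_counts); it is not plotnine's implementation, and plotnine's own binning may choose slightly different bin edges.

```python
def bin_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals.

    Assumes `values` is non-empty and not all identical.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Made-up sepal lengths in cm, for illustration only
sepal_lengths = [4.3, 4.9, 5.0, 5.1, 5.4, 5.8, 6.0, 6.3, 6.7, 7.9]
counts = bin_counts(sepal_lengths, bins=15)
```

A histogram geom then draws one bar per bin, with the bar height taken from the corresponding count.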
Python
import pandas as pd
from plotnine import ggplot, aes, geom_histogram
df = pd.read_csv("Iris.csv")
ggplot(df) + aes(x="SepalLengthCm") + geom_histogram(bins=15)
Output

3. Coordinates
Taking the histogram example above, suppose we want to draw it horizontally. We can do this with the coord_flip() function.
Python
import pandas as pd
from plotnine import ggplot, aes, geom_histogram, coord_flip
df = pd.read_csv("Iris.csv")
(
ggplot(df)
+ aes(x="SepalLengthCm")
+ geom_histogram(bins=15)
+ coord_flip()
)
Output

4. Themes
Plotnine includes a number of built-in themes. Let's take the facet example above and change its look with theme_xkcd().
Python
import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col, theme_xkcd
df = pd.read_csv("tips.csv")
(
    ggplot(df)
    + facet_grid("~sex")  # one panel per value of sex
    + aes(x="day", y="total_bill")
    + labs(
        x="day",
        y="total_bill",
    )
    + geom_col()
    + theme_xkcd()
)
Output

We can also use color to add more information to this graph. For example, we can color the bars by the time variable using the fill parameter of the aes() function.
Plotting Multidimensional Data with Plotnine
So far we have seen how to plot more than two variables using facets. Now suppose we want to plot four variables. Doing this with facets alone can get cumbersome, but by adding color we can show all four variables in the same plot. We can set the color using the fill parameter of the aes() function.
Example: Adding Color to Plotnine and ggplot in Python
Python
import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col, theme_xkcd
df = pd.read_csv("tips.csv")
(
    ggplot(df)
    + facet_grid("~sex")  # one panel per value of sex
    + aes(x="day", y="total_bill", fill="time")  # bar color encodes time
    + labs(
        x="day",
        y="total_bill",
    )
    + geom_col()
    + theme_xkcd()
)
Output

Exporting Plots With Plotnine
We can simply save the plot using the save() method. This method will export the plot as an image.
Python
import pandas as pd
from plotnine import ggplot, aes, facet_grid, labs, geom_col, theme_xkcd
df = pd.read_csv("tips.csv")
plot = (
    ggplot(df)
    + facet_grid("~sex")  # one panel per value of sex
    + aes(x="day", y="total_bill", fill="time")
    + labs(
        x="day",
        y="total_bill",
    )
    + geom_col()
    + theme_xkcd()
)
plot.save("gfg plotnine tutorial.png")
Output
