Plotting a column-wise bee-swarm plot in Python
Last Updated: 23 Jul, 2025
Bee-swarm plots are a useful way to visualize distributions, especially when you're comparing several categories or several columns of data. They show every point in a dataset while avoiding overlap, which gives a more granular view than box plots or histograms. In this article, we’ll explore how to create a column-wise bee-swarm plot in Python.
What is a Bee-Swarm Plot?
A bee-swarm plot is a type of scatter plot in which each point keeps its true value on the numeric axis and is nudged along the categorical axis just enough to avoid overlapping its neighbors. This results in a "swarm" of points whose width at any value reflects how many observations fall there, which makes the distribution within each category easy to read.
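To make the "nudge until nothing overlaps" idea concrete, here is a minimal sketch of a greedy placement routine. It is a simplified illustration only, not the algorithm Seaborn uses internally; the function names `simple_swarm_offsets` and `_fits` and the quarter-diameter search step are assumptions made for this example.

```python
import numpy as np


def _fits(x, y, placed, diameter):
    """Return True if a circle centred at (x, y) clears every placed circle."""
    return all((x - px) ** 2 + (y - py) ** 2 >= diameter ** 2 - 1e-12
               for px, py in placed)


def simple_swarm_offsets(values, diameter):
    """Greedy sketch: give each value a sideways offset so markers do not overlap.

    Points are handled in ascending order of value. Each point tries offset 0
    first, then moves outward in small steps (right, then left) until it fits.
    """
    values = np.asarray(values, dtype=float)
    offsets = np.zeros_like(values)
    placed = []                      # (offset, value) pairs already positioned
    step = diameter / 4              # hypothetical search resolution
    for i in np.argsort(values, kind="stable"):
        k = 0
        while True:
            candidates = (0.0,) if k == 0 else (k * step, -k * step)
            fit = next((c for c in candidates
                        if _fits(c, values[i], placed, diameter)), None)
            if fit is not None:
                break
            k += 1
        offsets[i] = fit
        placed.append((fit, values[i]))
    return offsets


# Three tied values, one near-tie, and one isolated value
values = [1.0, 1.0, 1.0, 1.05, 2.0]
offsets = simple_swarm_offsets(values, diameter=0.1)
print(offsets)
```

The tied values are pushed apart sideways, while the isolated value at 2.0 stays on the center line. That is the behavior that gives a swarm its characteristic shape.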
Why Use Bee-Swarm Plots?
Bee-swarm plots are particularly useful for:
- Visualizing data density: The width of the swarm shows how many data points fall within each range of values.
- Spotting outliers: The spread of points makes it easy to identify any anomalies in your data.
- Comparing categories: Bee-swarm plots are great when comparing distributions across different groups or categories.
Creating Column-Wise Bee-Swarm Plot
Before we dive into creating a column-wise bee-swarm plot, we need to set up the environment and install the required libraries. For bee-swarm plots, we will use Seaborn, a powerful library built on top of Matplotlib, which simplifies statistical data visualization.
Make sure you have Python installed on your system. To install the required libraries, run the following command:
pip install seaborn matplotlib pandas
Python
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
To create a bee-swarm plot, we need a dataset. Seaborn's load_dataset function provides several example datasets that are well suited to plotting; it downloads them from Seaborn's online data repository, so an internet connection is needed the first time. For this article, we will use the Iris dataset, which contains four measurements (sepal_length, sepal_width, petal_length, and petal_width) for flowers from three Iris species.
Python
# Load the Iris dataset
df = sns.load_dataset('iris')
print(df.head())
Seaborn provides the swarmplot function for creating bee-swarm plots. To compare several columns at once, we will draw one swarm plot per measurement (sepal_length, sepal_width, petal_length, and petal_width), each split by species, and arrange them in a grid of subplots.
Python
# Create a figure with subplots for each column
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Create bee-swarm plots for each feature
sns.swarmplot(ax=axes[0, 0], x='species', y='sepal_length', data=df)
axes[0, 0].set_title('Sepal Length by Species')
sns.swarmplot(ax=axes[0, 1], x='species', y='sepal_width', data=df)
axes[0, 1].set_title('Sepal Width by Species')
sns.swarmplot(ax=axes[1, 0], x='species', y='petal_length', data=df)
axes[1, 0].set_title('Petal Length by Species')
sns.swarmplot(ax=axes[1, 1], x='species', y='petal_width', data=df)
axes[1, 1].set_title('Petal Width by Species')
plt.tight_layout()
plt.show()
Output:
In the plot, we’ve created a 2x2 grid of subplots, each containing a bee-swarm plot for one of the columns (sepal_length, sepal_width, petal_length, and petal_width). This allows us to easily compare the distributions across multiple columns and species.
Customizing the Bee-Swarm Plot
Seaborn and Matplotlib offer many customization options. Let’s look at some of the most commonly used customizations.
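Each customization in this section repeats the same four swarmplot calls with one extra keyword. As a sketch of how that repetition could be factored out, the helper below loops over the columns and forwards any keyword arguments to `sns.swarmplot`. The name `swarm_grid`, its parameters, and the synthetic `demo` data are hypothetical and used only for this example.

```python
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def swarm_grid(df, columns, category, ncols=2, **swarm_kwargs):
    """Draw one swarm plot per column, split by `category`, in a grid."""
    nrows = math.ceil(len(columns) / ncols)
    fig, axes = plt.subplots(nrows, ncols,
                             figsize=(6 * ncols, 5 * nrows), squeeze=False)
    flat_axes = axes.ravel()
    for ax, col in zip(flat_axes, columns):
        # Extra keywords (size, marker, alpha, ...) go straight to swarmplot
        sns.swarmplot(data=df, x=category, y=col, ax=ax, **swarm_kwargs)
        ax.set_title(f"{col.replace('_', ' ').title()} by {category.title()}")
    # Hide any grid cells that have no column to show
    for ax in flat_axes[len(columns):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig, axes


# Synthetic stand-in data with two groups and three measurement columns
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "group": np.repeat(["a", "b"], 20),
    "height": rng.normal(5.0, 0.5, 40),
    "width": rng.normal(3.0, 0.4, 40),
    "depth": rng.normal(1.5, 0.3, 40),
})
fig, axes = swarm_grid(demo, ["height", "width", "depth"], "group",
                       size=4, alpha=0.7)
# Call plt.show() to display the figure interactively
```

With the Iris data, a call such as `swarm_grid(df, ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'], 'species', size=6)` would apply one setting to all four panels. The explicit per-panel calls used in this article remain useful when each panel needs a different value.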
1. Adding Color Palette
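You can set the colors of each bee-swarm plot with Seaborn's color palettes. Seaborn 0.13 and later deprecate passing palette without hue, so the code below also assigns species to hue and hides the redundant legend with legend=False. The legend argument assumes a reasonably recent Seaborn release; on an older version that rejects it, remove legend=False.
Python
# Customizing with Color Palette
# hue='species' avoids the "palette without hue" deprecation warning (Seaborn 0.13+)
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Sepal Length
sns.swarmplot(ax=axes[0, 0], x='species', y='sepal_length', data=df, hue='species', palette='Set1', legend=False)
axes[0, 0].set_title('Sepal Length by Species')
# Sepal Width
sns.swarmplot(ax=axes[0, 1], x='species', y='sepal_width', data=df, hue='species', palette='Set2', legend=False)
axes[0, 1].set_title('Sepal Width by Species')
# Petal Length
sns.swarmplot(ax=axes[1, 0], x='species', y='petal_length', data=df, hue='species', palette='Set3', legend=False)
axes[1, 0].set_title('Petal Length by Species')
# Petal Width
sns.swarmplot(ax=axes[1, 1], x='species', y='petal_width', data=df, hue='species', palette='Dark2', legend=False)
axes[1, 1].set_title('Petal Width by Species')
plt.tight_layout()
plt.show()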
2. Adjusting Point Size
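You can change the size of the data points with the size parameter. Larger markers are easier to see, but if they are too large for the axes, Seaborn cannot place every point without overlap and emits a warning suggesting smaller markers.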
Python
# Customizing Point Size
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Sepal Length
sns.swarmplot(ax=axes[0, 0], x='species', y='sepal_length', data=df, size=10)
axes[0, 0].set_title('Sepal Length by Species')
# Sepal Width
sns.swarmplot(ax=axes[0, 1], x='species', y='sepal_width', data=df, size=8)
axes[0, 1].set_title('Sepal Width by Species')
# Petal Length
sns.swarmplot(ax=axes[1, 0], x='species', y='petal_length', data=df, size=6)
axes[1, 0].set_title('Petal Length by Species')
# Petal Width
sns.swarmplot(ax=axes[1, 1], x='species', y='petal_width', data=df, size=4)
axes[1, 1].set_title('Petal Width by Species')
plt.tight_layout()
plt.show()
3. Setting Marker Styles
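You can change the marker style of the points with the marker parameter, which accepts Matplotlib marker codes such as 'o' (circle), 's' (square), 'D' (diamond), and '^' (triangle).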
Python
# Customizing Marker Style
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Sepal Length
sns.swarmplot(ax=axes[0, 0], x='species', y='sepal_length', data=df, marker='o')
axes[0, 0].set_title('Sepal Length by Species')
# Sepal Width
sns.swarmplot(ax=axes[0, 1], x='species', y='sepal_width', data=df, marker='s')
axes[0, 1].set_title('Sepal Width by Species')
# Petal Length
sns.swarmplot(ax=axes[1, 0], x='species', y='petal_length', data=df, marker='D')
axes[1, 0].set_title('Petal Length by Species')
# Petal Width
sns.swarmplot(ax=axes[1, 1], x='species', y='petal_width', data=df, marker='^')
axes[1, 1].set_title('Petal Width by Species')
plt.tight_layout()
plt.show()
4. Adjusting Alpha (Transparency)
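You can adjust the transparency of the points with the alpha parameter. A swarm plot already positions points so that they do not overlap, so transparency is mainly useful for softening dense swarms or for letting another plot drawn on the same axes show through.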
Python
# Customizing Transparency with Alpha
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
# Sepal Length
sns.swarmplot(ax=axes[0, 0], x='species', y='sepal_length', data=df, alpha=0.9)
axes[0, 0].set_title('Sepal Length by Species')
# Sepal Width
sns.swarmplot(ax=axes[0, 1], x='species', y='sepal_width', data=df, alpha=0.7)
axes[0, 1].set_title('Sepal Width by Species')
# Petal Length
sns.swarmplot(ax=axes[1, 0], x='species', y='petal_length', data=df, alpha=0.6)
axes[1, 0].set_title('Petal Length by Species')
# Petal Width
sns.swarmplot(ax=axes[1, 1], x='species', y='petal_width', data=df, alpha=0.8)
axes[1, 1].set_title('Petal Width by Species')
plt.tight_layout()
plt.show()
Best Practices for Bee-Swarm Plots
When using bee-swarm plots in your analysis, keep the following best practices in mind:
- Use for small to medium datasets: Bee-swarm plots are ideal for datasets with a manageable number of points. Large datasets may require different approaches like density plots.
- Color wisely: Coloring by category adds another layer of insight, but too many colors can overwhelm the reader.
- Overlay with other plots: Combining bee-swarm plots with box plots or violin plots can give a fuller picture of the data distribution.
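The overlay suggestion above can be sketched as follows: draw a box plot first, then a swarm plot on the same axes so the individual points sit on top of the summary. This is a minimal sketch under stated assumptions; the helper name `swarm_over_box`, the colors, and the synthetic `demo` data are illustrative choices, not part of the article's Iris example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def swarm_over_box(df, category, value, ax=None):
    """Overlay a swarm plot on a box plot for the same category/value pair."""
    if ax is None:
        _, ax = plt.subplots(figsize=(7, 5))
    # Hide box-plot outlier markers: the swarm already shows every point
    sns.boxplot(data=df, x=category, y=value, ax=ax,
                color="lightgray", showfliers=False)
    sns.swarmplot(data=df, x=category, y=value, ax=ax,
                  color="black", size=4)
    return ax


# Synthetic stand-in data in long form (hypothetical column names)
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 25),
    "measurement": np.concatenate([
        rng.normal(5.0, 0.4, 25),
        rng.normal(6.0, 0.5, 25),
        rng.normal(6.5, 0.6, 25),
    ]),
})
ax = swarm_over_box(demo, "group", "measurement")
# Call plt.show() to display the figure interactively
```

With the Iris data, `swarm_over_box(df, 'species', 'petal_length')` would produce the same kind of overlay. Swapping `sns.boxplot` for `sns.violinplot` is a natural variation, although the keyword arguments would need adjusting because `showfliers` applies only to box plots.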
Conclusion
Bee-swarm plots are a versatile and informative way to visualize distributions across categories in a dataset. In this article, we demonstrated how to create column-wise bee-swarm plots in Python using Seaborn, and we covered several customization techniques: color palettes, point sizes, marker styles, and transparency. For small to medium datasets, bee-swarm plots provide a visually compelling way to understand your data.