Visualization of Superhero Characters using Python
Last Updated :
24 Apr, 2025
There are a number of different libraries in Python that can be used to create visualizations of superhero characters. Some popular libraries include Matplotlib, Seaborn, and Plotly.
In this article, we use Matplotlib to generate visualizations and get insights from the Superheroes Dataset.
Matplotlib is a plotting library for Python that provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK. It has a wide range of capabilities and can create a variety of different types of plots, including line plots, scatter plots, bar plots, pie plots, and more.
CSV (Comma Separated Values) is a file format that stores data in a tabular form, i.e., in the form of rows and columns where each column is separated by a comma.
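To make the CSV idea concrete, here is a minimal sketch that parses a tiny CSV string with pandas. The two rows and their values are hypothetical and only mimic the shape of the Superheroes Dataset; the real file has more columns and rows.

```python
import io
import pandas as pd

# Hypothetical two-row sample; the real dataset's values may differ.
csv_text = """Name,Alignment,Intelligence,Strength
Hero A,good,80,90
Hero B,bad,70,
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.shape)                      # (rows, columns)
print(sample["Strength"].isna().sum())   # the empty field is read as NaN
```

Each comma-separated field becomes one column, and an empty field is loaded as a missing value (NaN).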
To draw sound conclusions and plot meaningful visualizations, the data must first be reliable and clean. Pre-processing is therefore a major step for any dataset: check whether all the values are present, find any missing values, and fill them in or remove them as needed.
So, let's import the required libraries and clean our dataset. Later, we can perform some visualizations accordingly.
Step 1: Importing required libraries.
Python3
# importing libraries..
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Step 2: Cleaning the dataset and finding any missing values.
Python3
# Reading Superheroes CSV File using pandas..
df = pd.read_csv("C:/Users/admin/Downloads/superheroes_stats.csv")
# displaying first 10 rows
df.head(10)
Output:
We can observe that rows 7 and 8 have missing values (NaN), so such rows need to be removed.
Superheroes Dataset
Let's list how many missing values each column of the dataset contains using the code below.
Python3
# Missing values in dataset..
columns = list(df)
for column in columns:
    print("No. of missing values in", column,
          "attribute:", df[column].isnull().sum())
# Dropping missing values
df = df.dropna(axis=0)
Output:
From the above Python code, we found that some rows of the dataset contain null values across entire groups of columns. Such rows are dropped entirely with the dropna() method so that the remaining data is complete.
Missing Values in each column of the dataset
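The counting and dropping steps above can also be tried on a small frame. This is a minimal sketch with hypothetical toy data (the names and numbers are assumptions, not values from the dataset); it also shows filling as an alternative to dropping.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "Hero B" has no stats at all.
toy = pd.DataFrame({
    "Name": ["Hero A", "Hero B", "Hero C"],
    "Strength": [90.0, np.nan, 55.0],
    "Speed": [70.0, np.nan, 40.0],
})

# One call gives the missing-value count for every column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# axis=0 drops rows (not columns) that contain at least one NaN.
cleaned = toy.dropna(axis=0)
print(cleaned)

# Alternative: keep every row and fill gaps, here with the column median.
filled = toy.fillna(toy.median(numeric_only=True))
print(filled)
```

Dropping is the simpler choice when a row has no usable stats at all, while filling keeps the row count unchanged at the cost of introducing estimated values.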
Step 3: Getting insights from the Superheroes dataset.
Data Insight 1:
Let's find the nature (good, bad and neutral) of superheroes with the help of the Alignment column from the dataset.
Python3
# Getting count of good, bad and neutral characters
cnt = df['Alignment'].value_counts()
print(cnt)
Output:
Nature of Superhero characters count
Let's plot a pie chart to see the percentage of superheroes with good, bad and neutral natures.
Python3
# Plotting a pie chart of the nature of superheroes..
# cnt.index keeps each label matched to its own count
plt.pie(cnt, labels=cnt.index, autopct='%.2f%%')
plt.show()
Output:
percentage of good, bad & neutral nature of superheroes
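The percentages printed on the pie slices can also be computed directly. The sketch below uses a hypothetical list of alignment labels rather than the real column, so the numbers are only illustrative.

```python
import pandas as pd

# Hypothetical alignment labels; the real column has many more rows.
alignment = pd.Series(["good", "good", "bad", "good", "neutral", "bad"])

counts = alignment.value_counts()
percentages = (alignment.value_counts(normalize=True) * 100).round(2)

# counts.index holds the labels in the same order as the counts,
# so passing labels=counts.index to plt.pie keeps slices and labels aligned.
print(list(counts.index))
print(percentages)
```

Since value_counts() sorts by frequency, taking the labels from its index avoids relying on a hand-written label order.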
Data Insight 2:
Let's find the top 10 superheroes who are good-natured.
Python3
# Top ten good superheroes
good = df[df['Alignment'] == "good"]
Top_ten = good.sort_values(by=['Total'], ascending=False).head(10)
x = Top_ten['Name']
y = Top_ten['Total']
# setting width and height of the figure
plt.figure(figsize=(10, 5))
y_ticks = np.arange(0, y.max()+50, 50)
plt.xticks(rotation=80, fontsize=12)
plt.yticks(y_ticks)
plt.title("Top 10 good super-heroes", fontsize=22)
# plt.grid(visible=None)
plt.bar(x, y, color="g")
plt.show()
Output:
From the output, we can see that the top 10 good superheroes by Total are Martian Manhunter, Superman, Stardust, Thor, Supergirl, Nova, Goku, Jean Grey, Phoenix and Iron Man.
Top 10 Superheroes
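As a side note, pandas also offers nlargest() for this kind of top-N selection. The following is a minimal sketch on hypothetical toy rows (names and totals are assumptions), not a replacement for the code above.

```python
import pandas as pd

# Hypothetical rows used only to illustrate the selection.
toy = pd.DataFrame({
    "Name": ["Hero A", "Hero B", "Hero C", "Hero D"],
    "Alignment": ["good", "bad", "good", "good"],
    "Total": [500, 580, 450, 520],
})

# Filter to good characters, then keep the 2 rows with the largest Total.
good_toy = toy[toy["Alignment"] == "good"]
top_two = good_toy.nlargest(2, "Total")
print(top_two[["Name", "Total"]])
```

Here nlargest(2, "Total") selects the same rows as sort_values(by="Total", ascending=False).head(2).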
Data Insight 3:
Now, let's find all the good superheroes having the Highest Strength and Intelligence.
Python3
# Good Superheroes with highest Strength and Intelligence...
Max_strength_Intelligence = good.sort_values(
    by=['Strength', 'Intelligence'], ascending=False)
Max_strength_Intelligence
Output:
Filtered Dataset with high Strength & Intelligence Superheroes
Python3
# Top Good Superheroes with both highest strength & Intelligence
X = Max_strength_Intelligence['Name'][0:5]
Intelligence = Max_strength_Intelligence['Intelligence'][0:5]
Strength = Max_strength_Intelligence['Strength'][0:5]
X_axis = np.arange(len(X))
plt.figure(figsize=(10, 5))
# creating bar graph
plt.bar(X_axis - 0.2, Intelligence, 0.4, label='Intelligence')
plt.bar(X_axis + 0.2, Strength, 0.4, label='Strength')
plt.xticks(X_axis, X)
plt.xlabel("Super-heroes", fontsize=18)
plt.ylabel("Strength and Intelligence", fontsize=18)
plt.title("Good Superheroes with highest Strength and Intelligence", fontsize=18)
plt.legend()
plt.show()
Output:
From this output, we can conclude that Captain Marvel, Martian Manhunter, Superman, Beyonder and Hulk have high Strength and Intelligence compared to other characters.
Comparing both the highest Strengths & Intelligence of Good Superheroes
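It helps to know how sort_values() treats a list of columns: rows are ordered by the first column, and the second column only breaks ties. The sketch below shows this on hypothetical scores, together with one possible alternative (a combined score, which is an assumption here and not the method used in this article).

```python
import pandas as pd

# Hypothetical scores chosen only to show the ordering.
toy = pd.DataFrame({
    "Name": ["Hero A", "Hero B", "Hero C"],
    "Strength": [100, 100, 95],
    "Intelligence": [40, 85, 99],
})

# Sorts by Strength first; Intelligence only breaks Strength ties.
by_strength = toy.sort_values(by=["Strength", "Intelligence"], ascending=False)
print(list(by_strength["Name"]))

# One possible alternative (an assumption, not the article's method):
# rank by the sum of both attributes.
combined = toy.assign(Score=toy["Strength"] + toy["Intelligence"])
by_score = combined.sort_values(by="Score", ascending=False)
print(list(by_score["Name"]))
```

The two orderings differ, so the choice depends on whether Strength should dominate or both attributes should count equally.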
Data Insight 4:
Let's find the top 5 good superheroes with the highest Power, along with their Speed.
Python3
# Good Superheroes with both highest Powers and Speeds...
Max_Power_Speed = good.sort_values(by=['Power', 'Speed'], ascending=False)
Max_Power_Speed
Output:
Python3
# Top Superheroes with Good character who have highest speed and power..
X = Max_Power_Speed['Name'][0:5]
Speed = Max_Power_Speed['Speed'][0:5]
Power = Max_Power_Speed['Power'][0:5]
X_axis = np.arange(len(X))
plt.figure(figsize=(9, 5))
plt.bar(X_axis - 0.2, Speed, 0.4, label='Speed', color='y')
plt.bar(X_axis + 0.2, Power, 0.4, label='Power', color='g')
plt.xticks(X_axis, X)
plt.xlabel("Super-heroes", fontsize=18)
plt.ylabel("Speed and Power", fontsize=18)
plt.title("Good Superheroes with highest Speed and Power", fontsize=18)
plt.legend(bbox_to_anchor=(1.05, 1.0), loc='upper left')
plt.show()
Output:
Bar plot shows Superheroes with the highest Speeds & Powers
Data Insight 5:
Plotting Histogram to know the distribution of Speeds of Good Super-heroes from the dataset:
Python3
# plotting histogram for knowing the speeds of good superheroes..
plt.figure(figsize=(12, 6))
X = good['Speed']
# plotting a histogram
plt.hist(X)
# one tick every 5 units across the range of Speed values
plt.xticks(np.arange(0, X.max() + 5, 5))
plt.title("Distribution of Speed", fontsize=20)
plt.xlabel("Speed", fontsize=18)
plt.ylabel("Number of Super-heroes", fontsize=18)
plt.show()
Output:
From the histogram of the Speed distribution, we observe that about 20 good superheroes fall in the highest speed range of 90-100, while about 80 good superheroes fall in the 25-35 speed range.
Histogram showing the Distribution of Speed
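Reading bar heights off a histogram is approximate. To get the exact count per bin, np.histogram() can be used with the same number of bins. The speed values below are hypothetical and only illustrate the call.

```python
import numpy as np

# Hypothetical speed values on a 0-100 scale.
speeds = np.array([5, 12, 27, 30, 33, 35, 48, 67, 92, 100])

# Same default as plt.hist: 10 equal-width bins between min and max.
counts, edges = np.histogram(speeds, bins=10)

# Print each bin's range next to the number of values it contains.
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {n}")
```

The edges array has one more entry than counts, because each bin is described by its left and right edge.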
Data Insight 6:
Plotting Line chart to know the superheroes with Total Superpower
The 'Total' column value in the dataset includes the sum of the superhero's Intelligence, Strength, Speed, Durability, Power and Combat values.
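If needed, this relationship can be checked by recomputing the sum row by row. The sketch below assumes the six column names listed above and uses hypothetical rows, with one Total deliberately wrong to show what a mismatch looks like.

```python
import pandas as pd

stat_columns = ["Intelligence", "Strength", "Speed",
                "Durability", "Power", "Combat"]

# Hypothetical rows; the second "Total" is deliberately wrong.
toy = pd.DataFrame({
    "Name": ["Hero A", "Hero B"],
    "Intelligence": [80, 60], "Strength": [90, 50], "Speed": [70, 40],
    "Durability": [85, 55], "Power": [95, 45], "Combat": [75, 65],
    "Total": [495, 300],
})

# Recompute the row-wise sum and compare it with the stored Total.
recomputed = toy[stat_columns].sum(axis=1)
mismatch = toy[recomputed != toy["Total"]]
print(mismatch[["Name", "Total"]])
```

An empty result would mean every stored Total agrees with the sum of its six attributes.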
Python3
# Plotting superheroes with total superpower
plt.figure(figsize=(12, 6))
Top_ten_total = df.sort_values(by='Total', ascending=False).head(10)
X = Top_ten_total['Name']
Y = Top_ten_total['Total']
plt.xticks(rotation=80)
# plotting line chart
plt.plot(X, Y, 'o-', color='g')
plt.ylabel("Total Superpower", fontsize=18)
plt.xlabel("Superheroes", fontsize=18)
plt.title("Line chart with Total Superpower of Superheroes", fontsize=20)
plt.show()
Output:
Line chart of top-ten superheroes with Total power
In this way, we can generate many such visualizations, customize them and gather insights from the data.
Data Insight 7:
Plotting a bar chart of only the good superheroes with the highest Strength and Durability
To defeat an enemy and win fights easily, durability is as important as sheer strength. So, in this plot, we will check which good-natured superheroes have the highest Strength and Durability.
Python3
good = df[df['Alignment'] == "good"]
Max_strength_durability = good.sort_values(
    by=['Strength', 'Durability'], ascending=False)
Max_strength_durability
Python3
# Top Good Superheroes with both highest strength & Durability
X = Max_strength_durability['Name'][0:5]
Durability = Max_strength_durability['Durability'][0:5]
Strength = Max_strength_durability['Strength'][0:5]
X_axis = np.arange(len(X))
plt.figure(figsize=(10, 5))
# creating bar graph
plt.bar(X_axis - 0.2, Durability, 0.4, label='Durability')
plt.bar(X_axis + 0.2, Strength, 0.4, label='Strength')
plt.xticks(X_axis, X)
plt.xlabel("Super-heroes", fontsize=18)
plt.ylabel("Strength and Durability", fontsize=18)
plt.title("Good Superheroes with highest Durability and Strength", fontsize=18)
plt.legend()
plt.show()
Output: