Data Visualization Using Matplotlib For Beginners. - by Chinmai Rane - GDSC UMIT - Medium

This document is a beginner's tutorial on data visualization using the Matplotlib library in Python. It covers how to create various types of plots, including line graphs, bar graphs, histograms, and scatter plots, using a dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The tutorial provides code examples and explanations for each type of visualization, emphasizing Matplotlib's ease of use and versatility.

Published in GDSC UMIT

Chinmai Rane

Dec 28, 2021 · 5 min read

Data visualization using Matplotlib for beginners.

Introduction

Matplotlib was the first library I used in Python for data visualisation. It’s simple to use,
yet versatile and flexible. To present our results, we have a variety of visualisations to
select from, and Matplotlib provides colours, themes, palettes, and other options to create
and personalise our plots, from histograms to scatterplots.

By the end of this tutorial, we’ll have learned how to use Matplotlib to visualise data in
a variety of ways.

The following are the visualisations we’ll create using Matplotlib:

1. Line graph

2. Bar graph

3. Histogram

4. Scatter plot

In this article we are going to use the dataset from the National Institute of Diabetes and
Digestive and Kidney Diseases. The dataset consists of several medical predictor
variables and one target variable, Outcome.

In this tutorial, I used Jupyter; however, you can use whatever is most convenient or
available at the time.

Let’s import the relevant libraries and check out the dataset. Type and run the following
(replace NAME_OF_YOUR_CSV_FILE with the name or path of your CSV file).

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

df = pd.read_csv('NAME_OF_YOUR_CSV_FILE')

df.head()

From this we see that the predictor variables include the number of pregnancies the
patient has had, their BMI, insulin level, age, and so on. The dataset is intended for
predicting whether or not a patient has diabetes, based on the diagnostic
measurements it contains.
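
Beyond df.head(), a few other pandas calls give a quick feel for the data before plotting. The snippet below is only a sketch: the rows are made-up values for illustration, not taken from the real dataset, and only the column names follow the ones used in this tutorial. With your own CSV you would keep the pd.read_csv line from above instead of building the DataFrame by hand.

```python
import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
# Only the column names follow the ones used in this tutorial.
df = pd.DataFrame({
    "Glucose": [90, 120, 150, 110, 135],
    "SkinThickness": [20, 25, 30, 15, 35],
    "BMI": [22.5, 27.0, 31.5, 24.0, 36.0],
    "DiabetesPedigreeFunction": [0.2, 0.4, 0.6, 0.3, 0.8],
    "Age": [25, 32, 47, 29, 51],
    "Outcome": [0, 0, 1, 0, 1],
})

print(df.shape)                      # (number of rows, number of columns)
print(df.dtypes)                     # data type of each column
print(df.describe())                 # count, mean, min, max, ... per numeric column
print(df["Outcome"].value_counts())  # how many rows fall in each target class
```

Checking the shape and the summary statistics first makes it easier to pick sensible columns and axis ranges for the plots that follow.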

Line graphs-

Line graphs are the most basic graphs you can make using Matplotlib. Let’s create a
graph to see how skin thickness and the diabetes pedigree function relate to each
other in the dataset, using only a few lines of code.

Type the following lines (change the variable names according to your choice):

skin = df['SkinThickness']

diabetes = df['DiabetesPedigreeFunction']

plt.plot(skin, diabetes)

plt.title('Skin thickness vs Diabetes pedigree function')

plt.xlabel('Skin thickness')

plt.ylabel('Diabetes pedigree function')

plt.show()

In the above code block we assigned the columns we wished to compare to more simply
named variables, making them easier to refer to. We then passed those variables to
plt.plot, with the first as the x-values and the second as the y-values. The title
function makes the provided string the plot’s primary title. The xlabel and ylabel
functions label the x- and y-axes, respectively. The plot is displayed using the show
function. What if we wish to look at the relationships of many variables on the same
graph? To do so, just call plt.plot() twice, passing the same x-values and the two
separate series you want to compare as the y-value arguments, as illustrated here:

skin = df['SkinThickness']

diabetes = df['DiabetesPedigreeFunction']

glucose = df['Glucose']

plt.plot(skin, diabetes)

plt.plot(skin, glucose)

plt.show()
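
Two small additions can make a multi-line graph like this easier to read. plt.plot connects the points in the order they appear in the data, so sorting by the x column first keeps the line running left to right, and giving each call a label lets plt.legend() show which line is which. The sketch below uses made-up values for illustration (only the column names come from this tutorial).

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up values for illustration; only the column names come from the tutorial.
df = pd.DataFrame({
    "SkinThickness": [30, 15, 35, 20, 25],
    "DiabetesPedigreeFunction": [0.6, 0.3, 0.8, 0.2, 0.4],
    "Glucose": [150, 110, 135, 90, 120],
})

# Sort by the x column so the line runs left to right instead of zigzagging.
ordered = df.sort_values("SkinThickness")
skin = ordered["SkinThickness"]

plt.figure()  # start a fresh figure
# label= gives each line a name that plt.legend() can display.
plt.plot(skin, ordered["DiabetesPedigreeFunction"], label="Diabetes pedigree function")
plt.plot(skin, ordered["Glucose"], label="Glucose")

plt.xlabel("Skin thickness")
plt.legend()
ax = plt.gca()  # keep a handle on the current axes for later inspection
plt.show()
```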

Bar graph-

Constructing bar graphs in Matplotlib is a bit more difficult than you would think. It
may be accomplished with a few lines of code, but it is essential to comprehend what
this code does.

The following code block is used to generate a bar graph (change the variable names
according to your choice):

age = df['Age']

diabetes = df['DiabetesPedigreeFunction']

plt.bar(age, diabetes, color="blue")

plt.xlabel("Age")

plt.ylabel("Diabetes Pedigree Function")

plt.title("Diabetes Pedigree function variation due to age")

plt.show()

The last four lines of code are rather self-explanatory, but what exactly is going on in
the first three? In the first two lines we assigned the columns we wished to compare
to more simply named variables, making them easier to refer to. In the third line we
called plt.bar to generate the bar graph, passing the ages as the x positions and the
pedigree values as the bar heights. When run, this code produces a bar graph of the
diabetes pedigree function against age.
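
One detail worth knowing about the bar graph above: plt.bar draws one bar per row, so when several patients share the same age their bars are stacked on top of each other at the same x position and only the tallest one is visible. If you would rather see one summarised bar per age, one option (a sketch, not the only way) is to average the values with pandas groupby before plotting. The values below are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up values for illustration; note that some ages repeat.
df = pd.DataFrame({
    "Age": [25, 25, 32, 32, 47, 51],
    "DiabetesPedigreeFunction": [0.2, 0.4, 0.3, 0.5, 0.6, 0.8],
})

# One value per age: the mean pedigree function of all rows with that age.
mean_by_age = df.groupby("Age")["DiabetesPedigreeFunction"].mean()

plt.figure()  # start a fresh figure
bars = plt.bar(mean_by_age.index, mean_by_age.values, color="blue")
plt.xlabel("Age")
plt.ylabel("Mean diabetes pedigree function")
plt.title("Mean diabetes pedigree function by age")
plt.show()
```

Here the six rows collapse into four bars, one for each distinct age.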

Histogram-

A histogram depicts the distribution of a specific data characteristic. Simply put, it tells
us how many observations fall within each range of values. Just like line graphs,
histograms are very easy to create.

To graph a histogram, type the following (change the variable names according to your
choice):

bmi = df['BMI']

plt.hist(bmi)

plt.title("BMI frequency")

plt.xlabel("BMI")

plt.ylabel("Frequency")

plt.show()

In the code block above we used a new function, plt.hist, to create a histogram. The
other lines of code are similar to the ones we used before.

This histogram was created in just a few simple lines of code. It tells us how many people
fall within each range of BMI values. Because BMI is a continuous measurement, plt.hist
groups the values into bins (10 by default), so we can get a general idea of the
distribution just by looking at it.
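
The number of bins changes how coarse or fine the histogram looks, and plt.hist also returns the counts and bin edges it computed, which is handy for checking what the picture shows. Below is a minimal sketch with made-up BMI values; the choice of 5 bins is arbitrary and only for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up BMI values for illustration only.
bmi = pd.Series([19.5, 22.0, 23.5, 26.0, 27.5, 28.0, 31.0, 33.5, 36.0, 41.0], name="BMI")

plt.figure()  # start a fresh figure
# bins=5 splits the range from the smallest to the largest value
# into 5 equal-width intervals.
counts, edges, patches = plt.hist(bmi, bins=5, edgecolor="black")
plt.title("BMI frequency")
plt.xlabel("BMI")
plt.ylabel("Frequency")
plt.show()

print(edges)   # 6 bin edges for 5 bins
print(counts)  # number of observations in each bin
```

Every observation lands in exactly one bin, so the counts always add up to the number of values passed in.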

Scatter plots-

Scatterplots are an excellent method to display a relationship between two variables
without the risk of a wacky trend line that a line graph may produce. Scatter plots are
useful for discovering linear correlations in data. A scatter plot in Matplotlib is as easy
to make as a line graph, and just takes a few lines of code, as seen below.

Run the following lines of code (change the variable names according to your choice):

age = df['Age']

bmi = df['BMI']

plt.scatter(age, bmi)

plt.xlabel('Age')

plt.ylabel('BMI')

plt.show()

To create a scatter plot, we used the scatter function in the code block above. The
other functions are identical to those we used previously.

By convention, graph axes should begin at 0, with a few exceptions. In this graph the
lowest x-tick does not start at zero, which can be deceptive. Fortunately, this is easy
to fix. Just before calling plt.show(), add the line plt.xlim(0, end_point), replacing
end_point with a real value. plt.ylim may be used to do the same on the y-axis.
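
Here is how that might look in practice. This is a sketch with made-up values, and the upper limits of 60 and 40 are arbitrary choices that sit a little above the largest value in each column; pick whatever suits your own data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up values for illustration only.
df = pd.DataFrame({
    "Age": [25, 32, 47, 29, 51],
    "BMI": [22.5, 27.0, 31.5, 24.0, 36.0],
})

plt.figure()  # start a fresh figure
plt.scatter(df["Age"], df["BMI"])
plt.xlabel("Age")
plt.ylabel("BMI")

# Start both axes at 0; the upper limits are arbitrary illustrative choices.
plt.xlim(0, 60)
plt.ylim(0, 40)

x_limits = plt.xlim()  # called with no arguments, xlim() returns the current limits
y_limits = plt.ylim()
plt.show()
```

If you only want to pin the lower end and keep the upper end as it is, plt.xlim(left=0) does that.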

Conclusion-

As you can see, Matplotlib is an excellent way to rapidly build basic visualisations.
Most graphs are created with only a few lines of code and may be tastefully improved
to make them even better. For more information on Matplotlib functions, see the
Matplotlib documentation.

I hope you found this article to be informative and easy to grasp.

Thank you for taking the time to read this!
