MCA - S3 - Data Visualisation - U5

Uploaded by

Ramu Atmuri

Data Visualisation

Unit-05
Visualisation Using Pandas

Semester-03
Master of Computer Application

Names of Sub-Units

Setting Up the Environment, Line Plot, Bar Plot, Stacked Plot, Histogram, Box Plot, Area Plot, Scatter
Plot, Hex Plot, Pie Plot, Scatter Matrix, Subplots

Overview

The unit begins by setting up the pandas environment for visualising data. Next, the unit discusses
how to create a line plot, bar plot and stacked plot in Python using pandas. Further, the unit discusses
the functions for creating a histogram, box plot and area plot using pandas. The unit also discusses how
to create a scatter plot, hex plot and pie plot using pandas. Towards the end, the unit explains how to
build a scatter matrix and subplots in Python using pandas.

Learning Objectives

In this unit, you will learn to:

• Explain the process for setting up the pandas environment for visualising data
• Describe how to create a line plot, bar plot and stacked plot using pandas
• Define the functions for creating a histogram, box plot and area plot using pandas
• Explain how to create a scatter plot, hex plot and pie plot using pandas
• Explain how to build a scatter matrix and subplots in Python using pandas

Learning Outcomes

At the end of this unit, you would be able to:

• Evaluate the process for setting up the pandas environment for visualising data
• Assess your knowledge of creating a line plot, bar plot and stacked plot using pandas
• Analyse the functions for creating a histogram, box plot and area plot using pandas
• Understand the functions for creating a scatter plot, hex plot and pie plot using pandas
• Examine how to build a scatter matrix and subplots in Python using pandas

Pre-Unit Preparatory Material

• http://blaqueyard.com/download/Python%20Data%20Visualization%20Cookbook.pdf

5.1 INTRODUCTION
Data visualisation is perhaps the most critical phase in the whole data science, big data, or machine
learning life cycle. When we use colours and images to present a study or analysis, it becomes more
engaging and easier to understand. By using visualisation components such as graphs, charts, and maps,
clients can better comprehend the underlying structure, trends, patterns, and correlations among the
parameters within a dataset. A good visualisation gives a clear and precise picture of what the data is
telling us, so that we can grasp the insights it contains.

Pandas is an open-source Python package for data manipulation and analysis. It is a fast and
powerful tool that provides data structures and operations for manipulating tabular data and series data.
Combining, restructuring, selecting, data cleansing, and data wrangling are all forms of data manipulation.
Using this library, data can be imported from a variety of sources, including SQL databases, MS Excel
files, and comma-separated values (CSV) files.

5.2 SETTING UP THE ENVIRONMENT

The standard Python distribution does not come bundled with the pandas module. A lightweight way
to install pandas is to use the popular Python package installer, pip:
pip install pandas
If you install the Anaconda Python distribution, pandas is installed by default. The options for each
platform are as follows:
• For Windows users: The different ways to install pandas are as follows:
  ◦ Anaconda is a free Python distribution for the SciPy stack. It is also available for Linux and Mac.
  ◦ Canopy is available as a free as well as a commercial distribution with the full SciPy stack
    for Windows, Linux and Mac.
  ◦ Python(x,y) is a free Python distribution with the SciPy stack and the Spyder IDE for Windows OS.
  After this, matplotlib must be installed, because pandas uses it for creating charts.
• For Ubuntu users: The command to install pandas on Ubuntu is as follows:
sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose
• For Fedora users: The command to install pandas on Fedora is as follows:
sudo yum install numpy scipy python-matplotlib ipython python-pandas sympy python-nose atlas-devel
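
Once the installation has finished, a quick check such as the following sketch can confirm that both libraries are importable. It assumes only that pandas and matplotlib were installed by one of the methods above; the printed version numbers will differ from machine to machine.

```python
# Minimal sanity check: both imports must succeed for pandas plotting to work.
import pandas as pd
import matplotlib

# Print the installed versions (values depend on your environment).
print("pandas version:", pd.__version__)
print("matplotlib version:", matplotlib.__version__)
```

If either import fails with ModuleNotFoundError, the corresponding package is not installed in the Python environment that is currently active.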

5.3 LINE PLOT

Line charts are used to plot continuous data in the form of lines, so each point on a line
corresponds to a value. A line chart can show any number of data series (that is, continuous related data
in a column), and you can distinguish the lines by using different colours or line styles. For instance,
plotting the budget and expenses of an organisation as a line chart may enable you to identify cost
fluctuations. To represent data, a line chart uses a horizontal axis (x-axis) and a vertical axis (y-axis).
Line plots can be created directly from pandas DataFrames using the DataFrame.plot() method, which
draws a line plot by default. The syntax for the line plot is as follows:
DataFrame.plot.line(x=None, y=None, **kwargs)
where,
• x: Represents the x-axis data (if omitted, the index of the DataFrame is used)
• y: Represents the y-axis data (if omitted, all numeric columns are plotted)
• color: Sets the colour of the line for each column in the DataFrame
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().

The Python program to create a line plot is as follows:

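import pandas as pd

# Quarterly sales figures; each column becomes one line in the chart
df = pd.DataFrame({
    'Q1 Sales': [125, 156, 175, 121, 172],
    'Q2 Sales': [152, 169, 131, 189, 135],
    'Q3 Sales': [153, 187, 129, 142, 176],
    'Q4 Sales': [143, 176, 153, 198, 176]
})
# DataFrame.plot() draws a line plot by default, with the index (0 to 4) on the x-axis
df.plot(title="Quarterly Sales of an organisation (in Thousands)")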

The output of the given program is as follows:

[Figure: line chart with one line per quarter; x-axis shows the row index (0 to 4), y-axis shows sales values (about 120 to 190)]
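
As a variation on the program above, the following sketch shows how the x, y and color parameters might be passed explicitly. The "Year" column and its values are hypothetical and were added only to give the x-axis a meaningful label; the sales figures are reused from the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021, 2022],   # hypothetical labels for the x-axis
    "Q1 Sales": [125, 156, 175, 121, 172],
    "Q2 Sales": [152, 169, 131, 189, 135],
})

# x selects the column for the horizontal axis, y the columns to draw,
# and color assigns one colour per plotted column.
ax = df.plot.line(
    x="Year",
    y=["Q1 Sales", "Q2 Sales"],
    color=["tab:blue", "tab:orange"],
    title="Quarterly Sales of an organisation (in Thousands)",
)
ax.set_ylabel("Sales (in Thousands)")
# In a notebook the figure renders automatically; in a plain script,
# call matplotlib.pyplot.show() afterwards.
```

The method returns a matplotlib Axes object, so labels and other details can be adjusted after the plot has been drawn.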

5.4 BAR PLOT

A bar chart is a visual presentation of categorical data. The data is represented by a number of
bars, each representing a different category. Each bar's height corresponds to a
specific aggregate (for instance, the sum of the values in the category it represents). Bar plots can be created
directly from pandas DataFrames using the DataFrame.plot.bar() method. The syntax for the bar plot
is as follows:
DataFrame.plot.bar(x=None, y=None, **kwargs)
where,
• x: Represents the x-axis data (if omitted, the index of the DataFrame is used)
• y: Represents the y-axis data (if omitted, all numeric columns are plotted)
• color: Sets the colour of the bars for each column in the DataFrame
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().

The Python program to create a bar plot is as follows:

import pandas as pd

# Quarterly sales figures; each row becomes a group of four bars
df = pd.DataFrame({
    'Q1 Sales': [125, 156, 175, 121, 172],
    'Q2 Sales': [152, 169, 131, 189, 135],
    'Q3 Sales': [153, 187, 129, 142, 176],
    'Q4 Sales': [143, 176, 153, 198, 176]
})
# Draw one bar per column for every row of the DataFrame
df.plot.bar(title="Quarterly Sales of an organisation (in Thousands)")

The output of the given program is as follows:

[Figure: grouped bar chart with four bars (Q1 to Q4 Sales) for each row; y-axis shows sales values from 0 to about 200]
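
The x parameter becomes more useful when the DataFrame has a column of category names. The sketch below assumes a hypothetical "Region" column; the region names are illustrative and do not come from the unit's dataset.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["North", "South", "East", "West"],   # hypothetical categories
    "Q1 Sales": [125, 156, 175, 121],
    "Q2 Sales": [152, 169, 131, 189],
})

# x names the category column; y lists the value columns drawn side by side.
# rot=0 keeps the category labels horizontal.
ax = df.plot.bar(
    x="Region",
    y=["Q1 Sales", "Q2 Sales"],
    rot=0,
    title="Sales by region (in Thousands)",
)
```

Each category receives one bar per value column, so this call draws eight bars in four groups.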

5.5 STACKED PLOT

A stacked bar chart is also known as a stacked bar graph. It is a graph that compares different
sections of a whole. Every bar in a stacked bar chart represents the whole, while the segments
of the bar indicate subcategories within that whole. These subcategories are represented by different
colours. The Python program to create a stacked bar plot is as follows:
import pandas as pd

df = pd.DataFrame({
    'Q1 Sales': [125, 156, 175, 121, 172],
    'Q2 Sales': [152, 169, 131, 189, 135],
    'Q3 Sales': [153, 187, 129, 142, 176],
    'Q4 Sales': [143, 176, 153, 198, 176]
})
# stacked takes a boolean: True places the quarterly segments on top of each other
df.plot.bar(title="Quarterly Sales of an organisation (in Thousands)",
            stacked=True)

The output of the given program is as follows:

[Figure: stacked bar chart with one bar per row (0 to 4); each bar is divided into Q1 to Q4 Sales segments]

5.6 HISTOGRAM
A histogram shows data in the form of frequencies within a distribution. Each column in
a histogram is known as a bin. Histograms are well suited to continuous data, because
they make it easy to analyse how the values are spread across various data ranges. The function syntax
for creating a histogram is as follows:
DataFrame.plot.hist(by=None, bins=10, **kwargs)
where,
• by [str or sequence, optional]: Refers to the column in the DataFrame by which the data is grouped
• bins [int, default 10]: Refers to the number of histogram bins to use
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().

The Python program to create a histogram is as follows:

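import pandas as pd

# BP_Record.csv is expected to hold numeric columns such as Duration, Pulse, Maxpulse and Calories
dataframe = pd.read_csv("BP_Record.csv")
# DataFrame.hist() draws one histogram per numeric column
dataframe.hist()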
The output of the given program is as follows:

[Figure: grid of histograms, one panel for each numeric column of BP_Record.csv]
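
Because BP_Record.csv is not reproduced in this unit, the following self-contained sketch uses randomly generated values to illustrate the bins parameter of DataFrame.plot.hist(). The column names mirror the dataset, but the numbers are synthetic assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (not the real BP_Record.csv values).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Pulse": rng.normal(100, 10, 200),
    "Maxpulse": rng.normal(130, 15, 200),
})

# bins=20 splits the combined value range into 20 intervals;
# alpha makes the overlapping histograms semi-transparent.
ax = df.plot.hist(bins=20, alpha=0.5, title="Distribution of Pulse and Maxpulse")
```

Unlike DataFrame.hist(), which draws one panel per column, DataFrame.plot.hist() overlays all columns on a single set of axes with shared bin edges.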

5.7 BOX PLOT

A box plot, often known as a box-and-whisker plot, is a visual representation of a data set's spread and
centre. This plot is appropriate for comparing related statistical data sets, because it summarises the
raw data directly without requiring any formula. The data is divided into quartiles, with the
median and the outliers highlighted.

The function syntax for creating box plot is as follows:

DataFrame.boxplot(column=None, by=None, ax=None, fontsize=None, rot=0,
grid=True, figsize=None, layout=None, return_type=None, backend=None,
**kwargs)
where,
• column [str or list of str, optional]: Refers to a column name or list of column names to plot
• by [str or array-like, optional]: Refers to a column in the DataFrame that is passed to
DataFrame.groupby(); one box plot is drawn for each group
• ax [object of class matplotlib.axes.Axes, optional]: Specifies the matplotlib axes on which to draw the box plot
• fontsize [float or str]: Specifies the font size of the tick labels
• rot [int or float, default 0]: Refers to the rotation angle of labels (in degrees) with respect to the screen
coordinate system
• grid [bool, default True]: Displays the grid if set to True
• figsize [tuple (width, height) in inches]: Specifies the size of the figure to create in matplotlib
• layout [tuple (rows, columns), optional]: Specifies the layout of the subplots
• return_type [{'axes', 'dict', 'both'} or None, default 'axes']: Refers to the type of object to return
• backend [str, default None]: Specifies the backend to use in place of the one set in the plotting.backend option
• **kwargs: Refers to the additional plotting keyword arguments that are passed to
matplotlib.pyplot.boxplot().

The Python program to create a box plot is as follows:

import pandas as pd

dataframe = pd.read_csv("BP_Record.csv")
# Draw one box of Calories values for every distinct Pulse value
dataframe.boxplot(by='Pulse', column=['Calories'], grid=True)
The output of the given program is as follows:

[Figure: box plots of Calories grouped by Pulse; y-axis shows Calories (about 200 to 450)]
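
The grouping behaviour of the by parameter can be tried without the CSV file. In the sketch below, the "Intensity" column and all values are hypothetical; the medians computed with groupby() correspond to the centre line drawn inside each box.

```python
import pandas as pd

# Hypothetical workout records (illustrative values only).
df = pd.DataFrame({
    "Intensity": ["low", "low", "low", "high", "high", "high"],
    "Calories": [210, 250, 240, 380, 420, 400],
})

# One box is drawn for each distinct value of the "by" column.
df.boxplot(column="Calories", by="Intensity", grid=True)

# The median of each group is the line inside the corresponding box.
medians = df.groupby("Intensity")["Calories"].median()
print(medians)
```

Comparing the printed medians with the plot is a simple way to confirm that the boxes are read correctly.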

5.8 AREA PLOT

In an area chart, areas are used to represent values. It is similar to a line chart in that it displays a series
as a set of points connected by a line. However, the difference is that in an area chart, the area below
the line is filled with the colour of the line. Area charts help to draw attention to the total value across
a data series.

The function syntax to create an area plot is as follows:

DataFrame.plot.area(x=None, y=None, **kwargs)
where,
• x: Represents the x-axis data (if omitted, the index of the DataFrame is used)
• y: Represents the y-axis data (if omitted, all numeric columns are plotted)
• stacked: Shows the area plot in stacked form. It is set to True by default; set it to False to create
an unstacked plot.
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().

The Python program to create an area plot is as follows:

import pandas as pd

dataframe = pd.read_csv("BP_Record.csv")
# Draw a stacked area plot of all numeric columns against the row index
dataframe.plot.area()
The output of the given program is as follows:

[Figure: stacked area chart of the numeric columns of BP_Record.csv plotted against the row index]
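
The effect of the stacked parameter is easiest to see by drawing the same data twice. This sketch reuses the quarterly sales figures from the earlier sections rather than the CSV file.

```python
import pandas as pd

df = pd.DataFrame({
    "Q1 Sales": [125, 156, 175, 121, 172],
    "Q2 Sales": [152, 169, 131, 189, 135],
})

# Default behaviour: the areas are stacked, so the top edge shows the row totals.
ax_stacked = df.plot.area(title="Stacked (default)")

# stacked=False overlays the areas, each drawn from zero with transparency.
ax_unstacked = df.plot.area(stacked=False, title="Unstacked")
```

In the stacked version the vertical axis has to accommodate the sum of both columns, whereas the unstacked version only needs to reach the largest single value.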

5.9 SCATTER PLOT

A scatter plot is a diagram drawn between two variables, X and Y, on a two-dimensional
plane. It is used as an initial screening tool when analysing two
variables for any relationship that may exist between them.

The function syntax for creating a scatter plot is as follows:

DataFrame.plot.scatter(x, y, s=None, c=None, **kwargs)

where,
• x: Refers to the column name to be used as the horizontal coordinate of each point
• y: Refers to the column name to be used as the vertical coordinate of each point
• s: Specifies the size of the dots
• c: Specifies the colour of the dots
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().

The Python program to create a scatter plot is as follows:

import pandas as pd

# Each student becomes one point: name on the x-axis, class on the y-axis
dataset = {
    'Student Name': ['Yash', 'Madhu', 'Gunjan', 'Vihan', 'Dipesh', 'Stuti'],
    'Class': [10, 12, 9, 6, 11, 8]
}
df = pd.DataFrame(data=dataset)
df.plot.scatter(x='Student Name', y='Class')
The output of the given program is as follows:

[Figure: scatter plot of Class (y-axis) against Student Name (x-axis)]
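
Scatter plots are most often drawn between two numeric variables. The following sketch uses hypothetical "Hours Studied" and "Marks" columns (not part of the unit's data) to show the s and c parameters, and prints the correlation coefficient as a numeric counterpart to the visual impression.

```python
import pandas as pd

# Hypothetical study data (illustrative values only).
df = pd.DataFrame({
    "Hours Studied": [2, 4, 5, 7, 8, 10],
    "Marks": [35, 48, 55, 68, 74, 90],
})

# s sets the marker size and c the marker colour.
ax = df.plot.scatter(
    x="Hours Studied",
    y="Marks",
    s=60,
    c="tab:green",
    title="Marks against hours studied",
)

# A coefficient close to +1 matches the upward trend visible in the plot.
correlation = df["Hours Studied"].corr(df["Marks"])
print(correlation)
```

Points that rise from left to right, as here, suggest a positive relationship between the two variables.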

5.10 HEX PLOT

Pandas offers a wide range of utilities, from parsing multiple file formats to converting a whole data
table into a NumPy array. This property makes pandas a trusted ally in data
science and machine learning. Pandas can also help with the creation of multiple types of data
analysis graphs. One such example is the hexagonal binning (hexbin) plot. A hexbin plot is
particularly helpful when a scatter plot is too dense to interpret. It bins
the area of the chart into hexagons and assigns a colour intensity to each hexagon accordingly.
The function syntax for the hex plot is as follows:
DataFrame.plot.hexbin(x, y, C=None, reduce_C_function=None,
gridsize=None, **kwargs)
where,
• x: Refers to the column name to be used as horizontal coordinates
• y: Refers to the column name to be used as vertical coordinates
• C: Refers to the column name that supplies the value at each (x, y) point
• reduce_C_function: Refers to a function that takes a single argument and reduces all the values in a
bin to a single number
• gridsize: Specifies the number of hexagons in the x-direction; a tuple sets the number in the
x-direction and the y-direction
• **kwargs: Refers to the additional keyword arguments that are documented in DataFrame.plot().
The Python program to create a hexbin plot is as follows:
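import pandas as pd
import numpy as np

# 2000 random points drawn from a standard normal distribution on each axis
df = pd.DataFrame({'X-Axis': np.random.randn(2000),
                   'Y-Axis': np.random.randn(2000)})
# gridsize=20 places 20 hexagons along the x-direction
df.plot.hexbin(x='X-Axis', y='Y-Axis', gridsize=20)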
The output of the given program is as follows:

[Figure: hexbin plot of Y-Axis against X-Axis; darker hexagons contain more points]

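
The C and reduce_C_function parameters are not used in the program above. The sketch below adds a hypothetical "Value" column and colours each hexagon by the largest value that falls inside it; the column name, the random data and the choice of np.max are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2000
df = pd.DataFrame({
    "X-Axis": rng.standard_normal(n),
    "Y-Axis": rng.standard_normal(n),
    "Value": rng.uniform(0, 10, n),   # hypothetical measurement at each point
})

# Without C, the colour shows how many points fall in each hexagon.
# With C, the values in each hexagon are reduced to one number;
# here np.max keeps the largest value per bin.
ax = df.plot.hexbin(
    x="X-Axis",
    y="Y-Axis",
    C="Value",
    reduce_C_function=np.max,
    gridsize=15,
)
```

Other reducing functions such as np.sum or np.min can be passed in the same way, depending on which summary of each bin is of interest.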
5.11 PIE PLOT
A pie chart is used to show relative proportions or contributions to a whole, where each
value in a single data series contributes one slice. Pie charts are most effective when representing a
small number of values.

A pie chart presents information and statistics in pie-slice format. This sort of chart represents numbers
as percentages, and the total of all the slices should equal 100%.

The function syntax for creating a pie plot is as follows:

DataFrame.plot.pie(**kwargs)
where,
• y: Refers to the label or position of the column to plot; it is required unless subplots=True is passed
The Python program to create a pie plot is as follows:
import pandas as pd

df = pd.DataFrame({'Party Name': ['BJP', 'SP', 'Congress', 'Others'],
                   'Partywise Votes': [47.2, 32.8, 8, 12]})
# y selects the values for the slices; labels supplies the text for each slice
df.plot.pie(y='Partywise Votes', labels=df['Party Name'])

The output of the given program is as follows:

[Figure: pie chart of Partywise Votes with one slice per party]
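
An alternative sketch stores the same vote shares in a Series whose index holds the party names, so that the slice labels come from the index. The autopct and startangle arguments are matplotlib pie options that pandas passes through; using them here is a presentational choice, not part of the original program.

```python
import pandas as pd

# Same vote shares as above, with the party names as the index.
votes = pd.Series(
    [47.2, 32.8, 8, 12],
    index=["BJP", "SP", "Congress", "Others"],
    name="Partywise Votes",
)

# autopct prints the percentage inside each slice.
ax = votes.plot.pie(autopct="%1.1f%%", startangle=90)
ax.set_ylabel("")   # hide the default y-axis label (the Series name)

# The shares should add up to the whole.
total = votes.sum()
print(total)
```

Checking that the values sum to 100 before plotting helps to catch missing categories.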

5.12 SCATTER MATRIX

A scatter matrix, also referred to as a pairs plot, succinctly plots all the numeric variables
in a dataset against one another. In Python, this data
visualisation technique can be carried out with several different libraries; however, if we are
using pandas to load the data, we can use its built-in scatter_matrix() function to
examine the dataset.

It is important to check for correlation among the independent variables used in regression analysis during
data pre-processing. Scatter plots make it very easy to understand the correlation
between the features. Pandas provides analysts with the scatter_matrix() function to produce these
plots. It is also used to determine whether the correlation is positive or negative.

The function syntax for creating a scatter matrix is as follows:

pandas.plotting.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None,
grid=False, diagonal='hist', marker='.', density_kwds=None,
hist_kwds=None, range_padding=0.05, **kwargs)
where,
• frame: Refers to a DataFrame
• alpha [float, optional]: Refers to the amount of transparency applied
• figsize [(float, float), optional]: Specifies the width and height of the figure in inches
• ax [Matplotlib axis object, optional]: Specifies the axis object for matplotlib
• grid [bool, optional]: Displays the grid if set to True
• diagonal [{'hist', 'kde'}]: Allows you to select either hist for a histogram plot or kde for a kernel density
estimation plot in the diagonal
• marker [str, optional]: Specifies the marker type for matplotlib
• density_kwds [keywords]: Defines the keyword arguments passed to the kernel density estimate plot
• hist_kwds [keywords]: Defines the keyword arguments passed to the hist function
• range_padding [float, default 0.05]: Specifies the relative extension of the axis range in x and y
• **kwargs: Refers to the additional keyword arguments that are passed to the scatter function

The Python program to create a scatter matrix is as follows:

import pandas as pd

df = pd.read_csv("BP_Record.csv")
# Plot every numeric column against every other numeric column
pd.plotting.scatter_matrix(df)
The output of the given program is as follows:

[Figure: 4 x 4 scatter matrix of the columns Duration, Pulse, Maxpulse and Calories, with a histogram of each column on the diagonal]
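
For readers without BP_Record.csv, the following sketch builds a small synthetic DataFrame and draws its scatter matrix. The column names follow the dataset, but the values are generated at random, and the assumption that Calories grows with Duration is built in only to make a positive correlation visible.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (not the real BP_Record.csv values).
rng = np.random.default_rng(1)
duration = rng.integers(30, 91, size=50)
df = pd.DataFrame({
    "Duration": duration,
    "Pulse": rng.normal(100, 8, 50),
    "Calories": duration * 5 + rng.normal(0, 20, 50),
})

# Returns a 2-D array of Axes: one panel for every pair of columns.
axes = pd.plotting.scatter_matrix(df, alpha=0.7, figsize=(6, 6), diagonal="hist")

# The correlation table gives the sign and strength behind each panel.
print(df.corr().round(2))
```

A panel whose points rise from left to right corresponds to a positive entry in the correlation table, and a falling pattern to a negative one.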

5.13 SUBPLOTS
Subplots are a great tool for data visualisation in Python because they give you a lot of flexibility
in how data is shown. The subplots() function of matplotlib.pyplot generates a figure and a set of
subplots. It is a wrapper function that makes it easy to create common subplot layouts in a single call,
including the containing figure object.
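
A minimal sketch of two common ways to obtain subplots with pandas is given below. It reuses the quarterly sales figures from the earlier sections; the layout and figure sizes are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Q1 Sales": [125, 156, 175, 121, 172],
    "Q2 Sales": [152, 169, 131, 189, 135],
    "Q3 Sales": [153, 187, 129, 142, 176],
    "Q4 Sales": [143, 176, 153, 198, 176],
})

# Option 1: let pandas create one subplot per column in a 2 x 2 grid.
axes_grid = df.plot(subplots=True, layout=(2, 2), figsize=(8, 6))

# Option 2: create the figure and axes with plt.subplots(),
# then direct each pandas plot to a chosen axes through the ax parameter.
fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))
df["Q1 Sales"].plot.line(ax=ax_left, title="Q1 Sales (line)")
df["Q1 Sales"].plot.bar(ax=ax_right, title="Q1 Sales (bar)")
fig.tight_layout()
```

The first option is convenient for a quick overview of every column, while the second gives full control over which chart type appears in which position.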

5.14 LAB EXERCISE

1. Use Pandas to Perform Exploratory Data Analysis on the Dataset.
Ans. Exploratory Data Analysis (EDA) is a method of analysing data through the use of visual techniques. It
assists data analysts in determining how best to manipulate data sources to obtain the
information they require, making it easier for them to find trends and patterns, test hypotheses,
and verify assumptions.

EDA helps data scientists in a variety of ways:

• Increasing their knowledge of the data
• Detecting a variety of data patterns
• Improving comprehension of the problem statement
Python is one of the most widely used languages for data science, particularly because of the
various libraries and packages that make data analysis easier.
It also provides various functions and methods that both simplify and expedite the data
analysis process.
The steps for exploratory data analysis on the dataset are as follows:
1. First, import the pandas and numpy libraries:
import pandas as pd
import numpy as np
2. Import and read the dataset:
dataset = pd.read_csv("Automobile.csv")
This will load the Automobile.csv file into a pandas DataFrame.
3. Apply different operations to the dataset to analyse the data. Some of them are as follows:
a. To display all rows of the DataFrame:
print(dataset.to_string())
b. To clean the data, some commands are:
i. To remove the rows that contain empty cells:
df = dataset.dropna()
ii. To remove the duplicate entries:
df = dataset.drop_duplicates()
c. To display the first five rows of the data:
dataset.head()
d. To display the last five rows of the data:
dataset.tail()
4. Display the structure (number of rows and columns) of the dataset:
dataset.shape
5. Display information (columns and their data types) about the dataset:
dataset.info()
6. Display a quick summary of the dataset:
dataset.describe()
7. Prepare the different types of plots or charts that are best suited to the imported dataset.
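
The steps above can be sketched end to end as follows. Because Automobile.csv is not included with this unit, the sketch builds a small hypothetical DataFrame in its place; the column names and values are assumptions and should be replaced by pd.read_csv("Automobile.csv") when the file is available.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Automobile.csv (illustrative values only).
dataset = pd.DataFrame({
    "make": ["audi", "bmw", "bmw", "honda", "toyota", "toyota"],
    "horsepower": [102, 121, 121, 76, np.nan, 62],
    "price": [13950, 16430, 16430, 7295, 6338, 5348],
})

# Step 3b: drop rows with empty cells, then drop exact duplicate rows.
cleaned = dataset.dropna().drop_duplicates()

# Steps 4 to 6: structure and quick summary.
print(cleaned.shape)        # (4, 3): one row had a missing value, one was a duplicate
print(cleaned.describe())

# Step 7: a scatter plot suits two numeric columns.
ax = cleaned.plot.scatter(x="horsepower", y="price", title="Price against horsepower")
```

Both dropna() and drop_duplicates() return new DataFrames by default, which is why their results are assigned to a new variable instead of modifying the original dataset.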

5.15 CONCLUSION

• Data visualisation is perhaps the most critical phase in the whole data science, big data, or machine
learning life cycle.
• Pandas is an open-source Python package for data manipulation and analysis.
• Combining, restructuring, selecting, data cleansing, and data wrangling are forms of data
manipulation.
• The standard Python distribution does not come bundled with the pandas module, so pandas must be
installed separately.
• Line charts are used to plot continuous data in the form of lines.
• Line plots can be created directly from pandas DataFrames using the DataFrame.plot() method.
• A bar chart is a visual presentation of categorical data.
• Bar plots can be created directly from pandas DataFrames using the DataFrame.plot.bar() method.
• A stacked bar chart is also known as a stacked bar graph. It is a graph that compares different
sections of a whole.
• A histogram shows data in the form of frequencies within a distribution.
• A box plot, often known as a box-and-whisker plot, is a visual representation of a data set's spread
and centre.
• In an area chart, areas are used to represent values. It is similar to a line chart in that it displays a
series as a set of points connected by a line.
• A scatter plot is a diagram drawn between two variables, X and Y, on a
two-dimensional plane.
• A hexbin plot is particularly helpful when a scatter plot is too dense to interpret; it bins the area
of the chart into hexagons and assigns colour intensity accordingly.
• A pie chart is used to show relative proportions or contributions to a whole, where each value in a
single data series contributes one slice.
• A scatter matrix, also referred to as a pairs plot, succinctly plots all the numeric
variables in a dataset against one another.
• Subplots are a great tool for data visualisation in Python because they give you a lot of flexibility
in how data is shown.

5.16 GLOSSARY

• Data visualisation: It is the study of representing data or information in a visual form.
• Data: It refers to raw facts and information that are generally gathered in a systematic way
for some kind of analysis.
• Chart: It is a graphical representation used for data visualisation, in which the data is represented
by indicators or symbols.
• Scatter chart: It is used to show the relationship between the numeric values in two data series.
• Histogram chart: It is used to show data in the form of frequencies within a distribution.

5.17 SELF-ASSESSMENT QUESTIONS

A. Essay Type Questions

1. Define the process for setting up the environment for using pandas in Python.
2. Explain the concept of a bar plot.
3. How do you create a hex plot using pandas?
4. Explain the histogram plot.
5. How do you create a scatter matrix in Python using pandas?

5.18 ANSWERS AND HINTS FOR SELF-ASSESSMENT QUESTIONS

A. Hints for Essay Type Questions

1. The standard Python distribution does not come bundled with the pandas module. A lightweight
way to install pandas is to use the popular Python package installer, pip. Refer to Section
Setting Up the Environment
2. A bar chart is a visual presentation of categorical data. The data is represented by a number of
bars, each representing a different category. Each bar's height corresponds
to a specific aggregate (for instance, the sum of the values in the category it represents). Bar plots can be
created directly from pandas DataFrames using the DataFrame.plot.bar() method. Refer to Section
Bar Plot
3. A hexbin plot is particularly helpful when a scatter plot is too dense to interpret. It bins the
area of the chart into hexagons and assigns colour intensity accordingly. It is created using the
DataFrame.plot.hexbin() function. Refer to Section Hex Plot
4. A histogram shows data in the form of frequencies within a distribution. Each column
in a histogram is known as a bin. Histograms are well suited to continuous data, because they make
it easy to analyse how the values are spread across various data ranges. Refer to
Section Histogram
5. A scatter matrix, also referred to as a pairs plot, succinctly plots all the numeric
variables in a dataset against one another. In Python, this
data visualisation technique can be carried out with several different libraries; however,
if we are using pandas to load the data, we can use its built-in
scatter_matrix() function to examine the dataset. Refer to Section Scatter Matrix

5.19 POST-UNIT READING MATERIAL

• https://realpython.com/pandas-plot-python/
• https://stackabuse.com/introduction-to-data-visualization-in-python-with-pandas/

5.20 TOPICS FOR DISCUSSION FORUMS

• Discuss with your friends how to create different plots or charts in Python using pandas.
