Data Visualization

Matplotlib and seaborn are two popular Python libraries for data visualization. Matplotlib covers basic line plots and scatter plots, while seaborn adds statistical plot types such as box plots, distribution plots, and heatmaps. The seaborn plots are demonstrated on an employee dataset to visualize distributions, relationships between variables, and correlations between dataset features.

Uploaded by

Shania Jone


# Data Visualization

Data visualization is the visual representation of data or information.

matplotlib and seaborn are two Python libraries used for data visualization.

matplotlib:
--line plot (plt.plot), scatter plot (plt.scatter)

seaborn:
--univariate analysis (distplot, boxplot, countplot)
--bivariate analysis (barplot, scatterplot)
--multivariate analysis (heatmap)

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
executed in 1.20s, finished 11:39:15 2023-03-28

In [8]:

a=np.array([1,2,3,4,5])
b=np.arange(10,60,10)
executed in 12ms, finished 11:48:15 2023-03-28

In [6]:

a
executed in 17ms, finished 11:46:46 2023-03-28

Out[6]:

array([1, 2, 3, 4, 5])

In [9]:

b
executed in 16ms, finished 11:48:17 2023-03-28

Out[9]:

array([10, 20, 30, 40, 50])

In [10]:

plt.plot(a,b)
executed in 86ms, finished 11:48:20 2023-03-28

Out[10]:

[<matplotlib.lines.Line2D at 0x21af1118310>]

In [11]:

plt.plot(a,b)
plt.xlabel("age")
plt.ylabel("height")
plt.title("age vs height")
executed in 91ms, finished 11:57:18 2023-03-28

Out[11]:

Text(0.5, 1.0, 'age vs height')

In [13]:

c=np.arange(20,70,10)
c
executed in 16ms, finished 12:01:12 2023-03-28

Out[13]:

array([20, 30, 40, 50, 60])

In [14]:

plt.plot(a,b)
plt.plot(a,c)
plt.xlabel("age")
plt.ylabel("height")
plt.title("age vs height")
executed in 89ms, finished 12:02:42 2023-03-28

Out[14]:

Text(0.5, 1.0, 'age vs height')

In [15]:

plt.plot(a,b,label='A')
plt.plot(a,c,label='B')
plt.xlabel("age")
plt.ylabel("height")
plt.title("age vs height")
plt.legend()
executed in 121ms, finished 12:11:45 2023-03-28

Out[15]:

<matplotlib.legend.Legend at 0x21af1ff8370>

In [18]:

a=np.random.randint(10,50,10)
b=np.arange(0,10)
c=np.arange(5,15)
plt.scatter(a,b,label='A')
plt.scatter(a,c,label='B')
plt.legend()
executed in 103ms, finished 12:26:54 2023-03-28

Out[18]:

<matplotlib.legend.Legend at 0x21af19e77f0>
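
The line and scatter cells above all go through the `plt.*` state-based interface. As a minimal sketch, the same two chart types can also be drawn side by side with Matplotlib's object-oriented `Axes` interface. The variable names `age`, `height_a`, and `height_b` are illustrative (taken from the axis labels used above), and the `Agg` backend line is only an assumption so the sketch runs without a display; it can be dropped inside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Same toy values as the cells above
age = np.array([1, 2, 3, 4, 5])
height_a = np.arange(10, 60, 10)  # 10, 20, 30, 40, 50
height_b = np.arange(20, 70, 10)  # 20, 30, 40, 50, 60

# One figure with two Axes: line plot on the left, scatter plot on the right
fig, (ax_line, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(age, height_a, label="A")
ax_line.plot(age, height_b, label="B")
ax_line.set_xlabel("age")
ax_line.set_ylabel("height")
ax_line.set_title("line plot")
ax_line.legend()

ax_scatter.scatter(age, height_a, label="A")
ax_scatter.scatter(age, height_b, label="B")
ax_scatter.set_xlabel("age")
ax_scatter.set_ylabel("height")
ax_scatter.set_title("scatter plot")
ax_scatter.legend()

fig.tight_layout()
```

Working on explicit `Axes` objects makes it clear which subplot each label and legend belongs to, which the `plt.*` calls leave implicit.
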
In [9]:

import matplotlib.pyplot as plt

import numpy as np
import seaborn as sns
executed in 6ms, finished 09:52:20 2023-03-29

In [3]:

a=np.array([1,2,3,4,5])
b=np.array([4,8,10,8,7])
fig,ax=plt.subplots(figsize=(5,4))
plt.plot(a,b)
executed in 147ms, finished 07:45:07 2023-03-29

Out[3]:

[<matplotlib.lines.Line2D at 0x1b82ce43460>]

In [4]:

plt.scatter(a,b)
executed in 88ms, finished 07:45:16 2023-03-29

Out[4]:

<matplotlib.collections.PathCollection at 0x1b82cf46610>
In [5]:

a=np.random.randint(1,10,20)
executed in 29ms, finished 07:45:18 2023-03-29

In [22]:

executed in 5ms, finished 13:51:13 2023-03-28

Out[22]:

array([3, 4, 8, 3, 3, 3, 6, 6, 8, 3, 4, 9, 1, 9, 8, 8, 4, 3, 8, 3])

In [11]:

import pandas as pd
executed in 3ms, finished 09:52:31 2023-03-29

In [7]:

emp=pd.read_csv("employee.csv")
executed in 39ms, finished 07:45:25 2023-03-29

In [8]:

emp.head()
executed in 32ms, finished 07:45:27 2023-03-29

Out[8]:

   Age Attrition     BusinessTravel  DailyRate              Department  DistanceFromHome  Education  E
0   41       Yes      Travel_Rarely       1102                   Sales                 1          2
1   49        No  Travel_Frequently        279  Research & Development                 8          1
2   37       Yes      Travel_Rarely       1373  Research & Development                 2          2
3   33        No  Travel_Frequently       1392  Research & Development                 3          4
4   27        No      Travel_Rarely        591  Research & Development                 2          1

5 rows × 35 columns

In [10]:

import seaborn as sns

executed in 3ms, finished 09:52:24 2023-03-29

# Distribution plot
A distribution plot is used for analyzing the detailed distribution of a single numeric variable in a dataset.

In [28]:

sns.distplot(emp["DistanceFromHome"])
executed in 1.04s, finished 14:00:18 2023-03-28

C:\Users\Harshitha GS\anaconda3\lib\site-packages\seaborn\distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
  warnings.warn(msg, FutureWarning)

Out[28]:

<AxesSubplot:xlabel='DistanceFromHome', ylabel='Density'>

# Box plot
A box plot is used for analyzing the distribution of a numeric variable (median, quartiles, and spread) and for detecting outliers.

In [29]:

sns.boxplot(emp["MonthlyIncome"])
executed in 150ms, finished 14:20:16 2023-03-28

C:\Users\Harshitha GS\anaconda3\lib\site-packages\seaborn\_decorators.py:36: FutureWarning: Pass the following variable as a keyword arg: x. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
  warnings.warn(

Out[29]:

<AxesSubplot:xlabel='MonthlyIncome'>
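
The warning above asks for the variable to be passed as the keyword `x`. The sketch below does that and also shows how the outliers a box plot marks can be found numerically with the usual 1.5 × IQR rule, which is the default whisker length. The `income` series is synthetic and hypothetical; it only stands in for `emp["MonthlyIncome"]`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical right-skewed stand-in for emp["MonthlyIncome"]
rng = np.random.default_rng(1)
income = pd.Series(rng.lognormal(mean=8.5, sigma=0.6, size=300),
                   name="MonthlyIncome")

# Quartiles and the 1.5 * IQR fences used for the whiskers
q1, q3 = income.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = income[(income < lower) | (income > upper)]

fig, ax = plt.subplots(figsize=(5, 2.5))
sns.boxplot(x=income, ax=ax)  # keyword x avoids the positional-argument warning

print(f"{len(outliers)} of {len(income)} values fall outside "
      f"[{lower:.0f}, {upper:.0f}]")
```

Points drawn beyond the whiskers in the plot correspond to the values collected in `outliers`.
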

# Count plot
A count plot is used for univariate analysis of categorical features: it shows how many rows fall into each category.

In [30]:

sns.countplot(emp["Department"])
executed in 171ms, finished 14:36:55 2023-03-28

Out[30]:

<AxesSubplot:xlabel='Department', ylabel='count'>

In [31]:

sns.countplot(data=emp,x="Department")
executed in 91ms, finished 14:48:54 2023-03-28

Out[31]:

<AxesSubplot:xlabel='Department', ylabel='count'>
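
A count plot is a chart of per-category frequencies, so the numbers behind it can be checked with pandas `value_counts()`. The sketch below assumes a small hypothetical `demo` frame in place of the employee data; the department names are reused from the table above, but the counts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: 3 Sales, 5 Research & Development, 2 Human Resources
demo = pd.DataFrame({
    "Department": ["Sales"] * 3
                  + ["Research & Development"] * 5
                  + ["Human Resources"] * 2
})

# The frequencies that the count plot draws as bar heights
counts = demo["Department"].value_counts()
print(counts)

fig, ax = plt.subplots(figsize=(6, 3))
sns.countplot(data=demo, x="Department", ax=ax)
```
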
In [12]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
executed in 10ms, finished 10:12:22 2023-03-29

In [13]:

emp=pd.read_csv("employee.csv")
emp.head()
executed in 79ms, finished 10:14:19 2023-03-29

Out[13]:

   Age Attrition     BusinessTravel  DailyRate              Department  DistanceFromHome  Education  E
0   41       Yes      Travel_Rarely       1102                   Sales                 1          2
1   49        No  Travel_Frequently        279  Research & Development                 8          1
2   37       Yes      Travel_Rarely       1373  Research & Development                 2          2
3   33        No  Travel_Frequently       1392  Research & Development                 3          4
4   27        No      Travel_Rarely        591  Research & Development                 2          1

5 rows × 35 columns

# Bar plot
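A bar plot shows the relationship between a numeric variable and a categorical variable. By default, seaborn's barplot draws the mean of the numeric variable for each category.
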
In [15]:

sns.barplot(data=emp,x="Department",y='MonthlyIncome')
executed in 226ms, finished 10:21:42 2023-03-29

Out[15]:

<AxesSubplot:xlabel='Department', ylabel='MonthlyIncome'>

In [16]:

sns.barplot(data=emp,x="Department",y='MonthlyIncome',hue="Attrition")
executed in 308ms, finished 10:31:30 2023-03-29

Out[16]:

<AxesSubplot:xlabel='Department', ylabel='MonthlyIncome'>
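
Since `barplot` aggregates with the mean by default, its bar heights can be reproduced with a pandas `groupby`. This is a minimal sketch on a hypothetical six-row `demo` frame; the income figures are invented for illustration and the short department labels are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data, two employees per department
demo = pd.DataFrame({
    "Department": ["Sales", "Sales", "HR", "HR", "R&D", "R&D"],
    "MonthlyIncome": [4000, 6000, 3000, 5000, 7000, 9000],
})

# Per-department mean: the value barplot draws as each bar's height
means = demo.groupby("Department")["MonthlyIncome"].mean()
print(means)

fig, ax = plt.subplots(figsize=(5, 3))
sns.barplot(data=demo, x="Department", y="MonthlyIncome", ax=ax)
```

Adding `hue="Attrition"`, as in the second cell above, splits each department's bar into one bar per hue level, each again showing a group mean.
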

# Scatter plot
A scatter plot shows the relationship between two numerical variables, with one point per row.

In [17]:

sns.scatterplot(data=emp,x="DailyRate",y="MonthlyIncome")
executed in 156ms, finished 10:44:58 2023-03-29

Out[17]:

<AxesSubplot:xlabel='DailyRate', ylabel='MonthlyIncome'>

In [19]:

sns.scatterplot(data=emp,x="DailyRate",y="MonthlyIncome",hue="MonthlyIncome",style="Dep
executed in 382ms, finished 10:52:03 2023-03-29

Out[19]:

<AxesSubplot:xlabel='DailyRate', ylabel='MonthlyIncome'>
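
The `hue` and `style` parameters of `scatterplot` map extra columns to point colour and marker shape. The sketch below shows both on synthetic data. The `demo` frame is hypothetical, and mapping `Department` to both `hue` and `style` is an illustrative choice, not a reconstruction of the truncated cell above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the employee data
rng = np.random.default_rng(2)
n = 60
demo = pd.DataFrame({
    "DailyRate": rng.integers(100, 1500, size=n),
    "MonthlyIncome": rng.integers(1000, 20000, size=n),
    "Department": rng.choice(
        ["Sales", "Research & Development", "Human Resources"], size=n
    ),
})

fig, ax = plt.subplots(figsize=(6, 4))
# hue sets the colour and style sets the marker shape, one per category
sns.scatterplot(data=demo, x="DailyRate", y="MonthlyIncome",
                hue="Department", style="Department", ax=ax)
```
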

# Heatmap
A heatmap is a two-dimensional graphical representation of data in which the individual values of a matrix are represented as colours. Here the matrix is the correlation matrix of the dataset's numeric features.

In [20]:

# numeric_only=True (pandas >= 1.5) skips text columns such as Department
sns.heatmap(emp.corr(numeric_only=True))
executed in 330ms, finished 11:02:28 2023-03-29

Out[20]:

<AxesSubplot:>

In [22]:

ins=pd.read_csv("insurance.csv")
executed in 23ms, finished 11:08:24 2023-03-29

In [25]:

# numeric_only=True skips any non-numeric columns; annot=True prints each value
sns.heatmap(ins.corr(numeric_only=True),annot=True)
executed in 198ms, finished 11:18:52 2023-03-29

Out[25]:

<AxesSubplot:>
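
A correlation heatmap is built in two steps: compute the pairwise correlation matrix of the numeric columns, then colour-code it. The sketch below walks through both on a hypothetical `demo` frame. Its column names and values are assumptions and are not taken from `insurance.csv`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data with three numeric columns and one text column
demo = pd.DataFrame({
    "age": [19, 25, 33, 41, 52, 60],
    "bmi": [27.9, 22.1, 30.4, 25.3, 31.8, 28.6],
    "charges": [1700.0, 2100.0, 4400.0, 6200.0, 9800.0, 12600.0],
    "smoker": ["yes", "no", "no", "no", "yes", "no"],
})

# Step 1: Pearson correlation of the numeric columns only
corr = demo.select_dtypes(include="number").corr()

# Step 2: colour-code the matrix; fixing vmin/vmax to [-1, 1] keeps the
# colour scale centred on zero correlation
fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm", ax=ax)
```

The matrix is symmetric with ones on the diagonal, so only the cells off the diagonal carry information about relationships between different features.
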
