Eda 4 5

EDA-PYTHON

Uploaded by

arafaths062


EDA LAB

UNIT-II

4. Generate the following charts for a dataset.

a) Polar Chart b) Histogram c) Lollipop chart

a) Polar Chart
A polar chart is a diagram that is plotted on a polar axis. Its coordinates are
angle and radius, as opposed to the Cartesian system of x and y coordinates.
Sometimes, it is also referred to as a spider web plot.
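
To make the relationship between the two coordinate systems concrete, here is a
minimal sketch that converts an (angle, radius) pair into Cartesian x and y.
The function name polar_to_cartesian is a hypothetical helper chosen for this
illustration; it is not part of matplotlib.

```python
import numpy as np

def polar_to_cartesian(theta, r):
    """Convert an angle in radians and a radius to Cartesian (x, y)."""
    return r * np.cos(theta), r * np.sin(theta)

# A grade of 90 drawn at a quarter turn (pi/2 radians) lies straight "up"
# on the chart: x is (numerically) zero and y equals the radius.
x, y = polar_to_cartesian(np.pi / 2, 90)
print(round(x, 6), round(y, 6))
```

On a polar axis matplotlib performs this mapping itself, so the plotting code
only ever supplies angles and radii.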

# Assume you have five courses in your academic year:
subjects = ["C programming", "Numerical methods", "Operating system",
            "DBMS", "Computer Networks"]
# You planned to obtain the following grades in each subject. The first grade
# is repeated at the end so that the plotted outline closes on itself:
plannedGrade = [90, 95, 92, 68, 68, 90]
# After your final examination, these are the grades you got (again with the
# first grade repeated at the end):
actualGrade = [75, 89, 89, 80, 80, 75]

The first significant step is to initialize the spider plot. This can be done by
setting the figure size and polar projection.

#Import the required libraries:

import numpy as np
import matplotlib.pyplot as plt
# Prepare the dataset and set up theta:
theta = np.linspace(0, 2 * np.pi, len(plannedGrade))
#Initialize the plot with the figure size and polar projection:
plt.figure(figsize = (10,6))
plt.subplot(polar=True)
#Get the grid lines to align with each of the subject names:
(lines,labels) = plt.thetagrids(range(0,360,
int(360/len(subjects))),
(subjects))
#Use the plt.plot method to plot the graph and fill the area under it:
plt.plot(theta, plannedGrade)
plt.fill(theta, plannedGrade, 'b', alpha=0.2)
#Now, we plot the actual grades obtained:
plt.plot(theta, actualGrade)
#We add a legend and a nice comprehensible title to the plot:
plt.legend(labels=('Planned Grades','Actual Grades'),loc=1)
plt.title("Plan vs Actual grades by Subject")
#Finally, we show the plot on the screen:
plt.show()
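
The listing above closes the outline by repeating the first grade by hand and
by giving np.linspace one more point than there are subjects. A minimal sketch
of that bookkeeping as a reusable function follows; close_radar is a
hypothetical name introduced here, not a matplotlib function.

```python
import numpy as np

def close_radar(values):
    """Return (theta, closed_values) ready for a closed radar outline.

    close_radar is a hypothetical helper, not part of matplotlib.
    """
    values = list(values)
    # Repeat the first value so the outline returns to its starting point.
    closed_values = values + values[:1]
    # n categories need n + 1 angles from 0 to 2*pi; the last angle coincides
    # with the first one on the circle.
    theta = np.linspace(0, 2 * np.pi, len(closed_values))
    return theta, closed_values

theta, planned = close_radar([90, 95, 92, 68, 68])
print(len(theta), planned)
```

The returned pair can be passed directly to plt.plot(theta, planned) on a polar
axis, as in the listing above.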

b) Histogram
Histogram plots are used to depict the distribution of any continuous variable.
These types of plots are very popular in statistical analysis.

import matplotlib.pyplot as plt
import numpy as np

# Draw 250 samples from a normal distribution with mean 170 and
# standard deviation 10:
x = np.random.normal(170, 10, 250)
# Plot the distribution of the samples:
plt.hist(x)
plt.show()
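
To see what a histogram actually computes, the binning can be reproduced
without drawing anything. The sketch below uses np.histogram with 10 bins,
which matches the default bin count of plt.hist; the variable names and the
fixed random seed are assumptions made for repeatability.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed is an assumption, for repeatable runs
heights = rng.normal(loc=170, scale=10, size=250)

# np.histogram returns the number of samples per bin and the bin edges.
counts, edges = np.histogram(heights, bins=10)

print(counts.sum())  # every sample falls into exactly one bin
print(len(edges))    # there is always one more edge than there are bins
```

Each bar drawn by plt.hist has a height taken from counts and spans two
consecutive entries of edges.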

c) Lollipop chart
A lollipop chart is a variation of the bar chart in which each thick bar is
replaced with a thin line ending in a dot ("o"). A lollipop plot thus displays
each element of a dataset as a segment and a circle. Lollipop charts are
preferred when there are many values to represent, because that many bars
would crowd together into a cluster.

Lollipop charts can be built in Python with the matplotlib library, as shown
in the example below. The strategy here is to use the stem() function.

# Create a dataframe
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'group': list(map(chr, range(65, 85))),
                   'values': np.random.uniform(size=20)})

# Reorder it following the values:
ordered_df = df.sort_values(by='values')
my_range = range(1, len(df.index) + 1)

# Make the plot, placing each stem at the position that receives its tick label:
plt.stem(my_range, ordered_df['values'])
plt.xticks(my_range, ordered_df['group'])
plt.show()
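
When there are many groups, a horizontal layout often keeps the labels
readable. The sketch below is an alternative to stem(), not the method used
above: it draws each segment with hlines() and each circle with plot(). The
variable names, the fixed seed and the Agg backend are assumptions of this
sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)  # fixed seed is an assumption, for repeatable runs
df = pd.DataFrame({"group": list(map(chr, range(65, 85))),
                   "values": rng.uniform(size=20)})
ordered = df.sort_values(by="values").reset_index(drop=True)
positions = np.arange(len(ordered))

fig, ax = plt.subplots(figsize=(6, 8))
# The stick: a horizontal segment from 0 to each value.
ax.hlines(y=positions, xmin=0, xmax=ordered["values"], color="skyblue")
# The candy: a circle marker at the end of each segment.
ax.plot(ordered["values"], positions, "o")
# Label every row with its group name.
ax.set_yticks(positions)
ax.set_yticklabels(ordered["group"])
ax.set_xlabel("values")
```

In an interactive session, drop the matplotlib.use("Agg") line and finish with
plt.show() to display the figure.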

CONCLUSIONS:


5. Case Study: Perform Exploratory Data Analysis with Personal Email Data

Code:
import pandas as pd  # Python library for data analysis and data frames
import numpy as np   # Numerical Python library for linear algebra computations
pd.set_option('display.max_columns', None)  # display all columns

# Visualisation libraries
import matplotlib.pyplot as plt
import seaborn as sns
# Jupyter/IPython magic; omit this line in a plain Python script
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")  # prevent the kernel from showing any warnings

# Load the dataset (download train_F3fUq2S.csv from Kaggle first)
train_df = pd.read_csv('train_F3fUq2S.csv')
train_df.sample(5)

train_df.shape
train_df.info()

train_df.isnull().sum()

train_df.describe()
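
The inspection calls above can be tried without the Kaggle file. The sketch
below runs them on a small hand-made frame; toy_df and its columns col_a and
col_b are hypothetical stand-ins and do not describe the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train_df; the values are made up for illustration.
toy_df = pd.DataFrame({
    "col_a": [76, 54, np.nan, 90],
    "col_b": [1043, 257, 380, np.nan],
    "click_rate": [0.10, 0.70, 0.00, 0.14],
})

missing = toy_df.isnull().sum()  # number of missing values per column
summary = toy_df.describe()      # count, mean, std, min, quartiles and max

print(toy_df.shape)
print(missing)
print(summary.loc["count"])      # describe() counts only non-missing values
```

Comparing the count row of describe() with the number of rows is a quick way
to spot columns that contain missing values.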

# Select only numeric columns from the DataFrame
numeric_df = train_df.select_dtypes(include=['number'])

# Calculate correlation for numeric columns only
corr = numeric_df.corr()

plt.figure(figsize=(20, 8))
sns.heatmap(corr, cmap="YlGnBu", annot=True)
plt.show()
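
To see what the heatmap is built from, the sketch below computes the same kind
of correlation matrix on a tiny hand-made frame whose relationships are known
in advance. The frame and its column names are hypothetical.

```python
import pandas as pd

toy_df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # rises in step with x
    "z": [5, 4, 3, 2, 1],    # falls in step with x
    "label": list("abcde"),  # non-numeric, removed by select_dtypes
})

# Pearson correlation between every pair of numeric columns.
corr = toy_df.select_dtypes(include=["number"]).corr()
print(corr.round(2))
```

Values near 1 or -1 indicate a strong linear relationship and values near 0 a
weak one; the heatmap simply colours this matrix.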
# Drop redundant columns
train_df.drop(['campaign_id', 'is_timer'], axis=1, inplace=True)

train_df.rename(columns={'is_image': 'no_image', 'is_quote': 'no_quote',
                         'is_emoticons': 'no_emoticons'}, inplace=True)

BIVARIATE ANALYSIS
# Bivariate analysis is one of the simplest forms of quantitative (statistical)
# analysis.
# It involves the analysis of two variables (often denoted as X, Y), for the
# purpose of determining the empirical relationship between them.

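# num_cols and cat_cols are assumed to be lists of the numeric and categorical
# column names of train_df, defined beforehand. The 2x2 grid holds at most
# 4 plots and each 8x2 grid at most 16.
_, ax1 = plt.subplots(2, 2, figsize=(25, 20))

# Scatter plot of each numeric column against the target
for i, col in enumerate(num_cols):
    if col != 'click_rate':
        sns.scatterplot(x=col, y='click_rate', data=train_df, ax=ax1[i//2, i%2])
plt.show()

_, ax1 = plt.subplots(8, 2, figsize=(25, 50))

# Mean click rate per category
for i, col in enumerate(cat_cols):
    sns.barplot(x=col, y='click_rate', data=train_df, ax=ax1[i//2, i%2])
plt.show()

_, ax1 = plt.subplots(8, 2, figsize=(25, 50))

# Distribution of click rate per category
for i, col in enumerate(cat_cols):
    sns.boxplot(x=col, y='click_rate', data=train_df, ax=ax1[i//2, i%2])
plt.show()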
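
The loops above rely on two lists, num_cols and cat_cols, that the listing
does not define. One possible way to build them, sketched here on a
hypothetical toy frame, is to split the columns by dtype and size the subplot
grid from the list length. This is an assumption about how the lists might be
obtained, not the original author's definition; flag-like columns stored as
integers would have to be added to cat_cols by hand.

```python
import math
import pandas as pd

# Hypothetical stand-in for train_df; only click_rate is taken from the text.
toy_df = pd.DataFrame({
    "length": [100, 250, 80, 300],
    "click_rate": [0.10, 0.30, 0.05, 0.40],
    "weekday": ["Mon", "Tue", "Mon", "Wed"],
    "kind": ["a", "b", "a", "c"],
})

num_cols = toy_df.select_dtypes(include=["number"]).columns.tolist()
cat_cols = toy_df.select_dtypes(exclude=["number"]).columns.tolist()

# The target goes on the y-axis, so it is not plotted against itself.
predictors = [c for c in num_cols if c != "click_rate"]

# Two plots per row; round up so that every column gets an axis.
n_plot_cols = 2
n_plot_rows = math.ceil(len(cat_cols) / n_plot_cols)
# Plot i lands in row i // n_plot_cols and column i % n_plot_cols.
positions = [(i // n_plot_cols, i % n_plot_cols) for i in range(len(cat_cols))]
print(predictors, n_plot_rows, positions)
```

Sizing the grid from the list length avoids both index errors and empty axes
when the number of columns changes.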
CONCLUSIONS:
