Ass 10 DSBDL

The document describes an experiment to visualize features of the Iris flower dataset using histograms and box plots. It includes: 1. Loading and examining the Iris dataset, which contains 4 numeric features and 1 categorical class feature for 150 samples. 2. Creating histograms of each numeric feature to illustrate their distributions. 3. Generating box plots of the numeric features to display their distributions and identify outliers. The document provides code samples to load the dataset, select specific features, generate histograms and box plots, and conclude with observations about visualizing the Iris data.

Uploaded by

Anvi


Experiment No. 10
Aim: Data Visualization III
Download the Iris flower dataset or any other dataset into a DataFrame
(e.g., https://archive.ics.uci.edu/ml/datasets/Iris). Scan the dataset and give the inference
as:
1. List down the features and their types (e.g., numeric, nominal) available in the
dataset.
2. Create a histogram for each feature in the dataset to illustrate the feature
distributions.
3. Create a box plot for each feature in the dataset.
4. Compare distributions and identify outliers.

Introduction:

The Iris dataset is often called the "Hello World" of data science: anyone starting out in
data science or machine learning is likely to practice basic ML algorithms on this famous
dataset. It contains five columns: Petal Length, Petal Width, Sepal Length, Sepal Width, and
Species Type. The Iris.csv file used in the code below additionally carries an Id column.
Iris is a genus of flowering plants. Researchers measured these features on different iris
flowers and recorded them digitally.

Code: Reading the dataset “Iris.csv”.
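The reading step can be sketched as follows. This is a minimal sketch, not the original lab code: the helper name load_iris_csv and the five-row SAMPLE_CSV excerpt are hypothetical stand-ins so the example runs without the file, and the column names are assumed to match those used in the later snippets.

```python
import io

import pandas as pd

# Hypothetical five-row excerpt standing in for Iris.csv; the column names
# are assumed to match the ones used later in this document.
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
51,7.0,3.2,4.7,1.4,Iris-versicolor
52,6.4,3.2,4.5,1.5,Iris-versicolor
101,6.3,3.3,6.0,2.5,Iris-virginica
"""


def load_iris_csv(source):
    """Read the Iris CSV from a path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# With the real file on disk this would be: data = load_iris_csv("Iris.csv")
data = load_iris_csv(io.StringIO(SAMPLE_CSV))
print(data.shape)   # (rows, columns)
print(data.head())  # first rows of the DataFrame
```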

1. List down the features and their types (e.g., numeric, nominal) available in the
dataset.

Code: Displaying the number of features and names of the columns.

The columns attribute (it is not a function) holds the names of all the columns of the dataset.
data.columns
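To also report the type of each feature, one possible approach is to inspect the pandas dtypes. The sketch below uses a hypothetical three-row DataFrame and a hypothetical helper, feature_types, which labels every numeric dtype as "numeric" and everything else as "nominal". Note that Id comes out as numeric even though it is only a row identifier, not a measured feature.

```python
import pandas as pd

# Hypothetical three-row stand-in for the Iris DataFrame.
data = pd.DataFrame({
    "Id": [1, 2, 3],
    "SepalLengthCm": [5.1, 7.0, 6.3],
    "SepalWidthCm": [3.5, 3.2, 3.3],
    "PetalLengthCm": [1.4, 4.7, 6.0],
    "PetalWidthCm": [0.2, 1.4, 2.5],
    "Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})


def feature_types(df):
    """Map each column name to 'numeric' or 'nominal' based on its dtype."""
    return {
        col: "numeric" if pd.api.types.is_numeric_dtype(df[col]) else "nominal"
        for col in df.columns
    }


types = feature_types(data)
for name, kind in types.items():
    print(f"{name}: {kind}")
```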

Code: Displaying only specific columns.

In any dataset, it is sometimes necessary to work with only specific features or columns,
which can be done with the following code.
specific_data = data[["Id", "Species"]]
# general form: data[["column_name1", "column_name2", "column_name3"]]
# now print the first 10 rows of the specific_data DataFrame
print(specific_data.head(10))

Histograms

A histogram shows the distribution of the values in a column by counting how many
observations fall into each bin. Histograms can be used for univariate as well as
bivariate analysis.
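The binning behind a histogram can be illustrated with NumPy. This is only a sketch on eight made-up sepal-length values (not taken from the dataset), using three bins over the range 4.0 to 7.0.

```python
import numpy as np

# Hypothetical sepal-length values in cm (not the real dataset).
values = np.array([4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.4, 7.0])

# Three equal-width bins: [4, 5), [5, 6) and [6, 7]; the last bin is closed.
counts, edges = np.histogram(values, bins=3, range=(4.0, 7.0))

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {count}")
```

The bar heights drawn by plt.hist correspond to the counts array computed here.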

Histograms with Distplot

seaborn's distplot is used for a univariate set of observations and visualizes it through
a histogram, i.e. it takes only one variable, so we choose one particular column of the
dataset. Note that distplot is deprecated in recent seaborn releases in favor of histplot
and displot.

The Box Plot:

The box plot displays the distribution of numeric data in terms of quartiles.
The line inside the box marks the median (the second quartile, Q2). The bottom of the box
is the first quartile (Q1) and the top of the box is the third quartile (Q3), so the box
holds the middle 50% of the values. The lowest quarter of the data lies below the box, along
the lower whisker, and the highest quarter lies above it, along the upper whisker. Points
drawn beyond the whiskers are treated as outliers.
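The quantities a box plot draws can be computed directly. The sketch below assumes the common convention of whiskers reaching the most extreme points within 1.5 times the interquartile range (the matplotlib default); the helper box_plot_stats and the seven sample values are hypothetical.

```python
import numpy as np


def box_plot_stats(values, whisker=1.5):
    """Return the quartiles, IQR and whisker ends a box plot would draw."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    # Fences: anything outside them would be drawn as an individual point.
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    inside = values[(values >= lower_fence) & (values <= upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside.min(),    # lowest point still inside the fences
        "whisker_high": inside.max(),   # highest point still inside the fences
    }


# Hypothetical sepal-width values in cm (not the real dataset).
stats = box_plot_stats([2.0, 2.8, 3.0, 3.0, 3.2, 3.4, 4.4])
for key, value in stats.items():
    print(f"{key}: {value:.2f}")
```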
Attribute information about the data set:
-> sepal length in cm
-> sepal width in cm
-> petal length in cm
-> petal width in cm
-> class:
   Iris Setosa
   Iris Versicolour
   Iris Virginica

Number of Instances: 150

Summary Statistics:
               Min  Max  Mean  SD    Class Correlation
sepal length:  4.3  7.9  5.84  0.83   0.7826
sepal width:   2.0  4.4  3.05  0.43  -0.4194
petal length:  1.0  6.9  3.76  1.76   0.9490 (high!)
petal width:   0.1  2.5  1.20  0.76   0.9565 (high!)

Class Distribution: 33.3% for each of 3 classes.
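Statistics of this kind can be recomputed from a DataFrame. The following is a sketch on a hypothetical three-row frame, so its numbers are not the ones listed above. It also assumes the pandas default of a sample standard deviation (ddof=1), which may differ from the convention behind the published SD column.

```python
import pandas as pd

# Hypothetical three-row stand-in for the Iris DataFrame.
data = pd.DataFrame({
    "SepalLengthCm": [5.0, 6.0, 7.0],
    "PetalLengthCm": [1.0, 4.0, 7.0],
    "Species": ["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# Min, max, mean and standard deviation for every numeric column,
# transposed so that each feature becomes one row.
numeric = data.select_dtypes(include="number")
summary = numeric.agg(["min", "max", "mean", "std"]).T
print(summary)

# Share of rows belonging to each class.
class_share = data["Species"].value_counts(normalize=True)
print(class_share)
```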

2. Create a histogram for each feature in the dataset to illustrate the feature
distributions.

Code #1: Histogram for Sepal Length

import matplotlib.pyplot as plt

plt.figure(figsize = (10, 7))
x = data["SepalLengthCm"]
plt.hist(x, bins = 20, color = "green")
plt.title("Sepal Length in cm")
plt.xlabel("Sepal_Length_cm")
plt.ylabel("Count")
plt.show()

Code #2: Histogram for Sepal Width

plt.figure(figsize = (10, 7))
x = data.SepalWidthCm
plt.hist(x, bins = 20, color = "green")
plt.title("Sepal Width in cm")
plt.xlabel("Sepal_Width_cm")
plt.ylabel("Count")
plt.show()

Code #3: Histogram for Petal Length

plt.figure(figsize = (10, 7))
x = data.PetalLengthCm
plt.hist(x, bins = 20, color = "green")
plt.title("Petal Length in cm")
plt.xlabel("Petal_Length_cm")
plt.ylabel("Count")
plt.show()

Code #4: Histogram for Petal Width

plt.figure(figsize = (10, 7))
x = data.PetalWidthCm
plt.hist(x, bins = 20, color = "green")
plt.title("Petal Width in cm")
plt.xlabel("Petal_Width_cm")
plt.ylabel("Count")
plt.show()
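
The four histograms above repeat the same steps, so they could also be drawn in one loop on a 2 x 2 grid. This is a sketch under assumptions: the six-row DataFrame is hypothetical, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical six-row stand-in for the Iris DataFrame.
data = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "PetalLengthCm": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "PetalWidthCm": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
})
features = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

# One subplot per feature, same bins and colour as the single plots.
fig, axes = plt.subplots(2, 2, figsize=(10, 7))
for ax, name in zip(axes.ravel(), features):
    ax.hist(data[name], bins=20, color="green")
    ax.set_title(name)
    ax.set_xlabel("cm")
    ax.set_ylabel("Count")
fig.tight_layout()
# In an interactive session, plt.show() would display the figure here.
```

Placing the features side by side makes it easier to compare their distributions at a glance.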
3. Create a box plot for each feature in the dataset.

Code #5: Data preparation for Box Plot

# keep only the four numeric feature columns (drops Id and Species)
new_data = data[["SepalLengthCm", "SepalWidthCm", "PetalLengthCm",
                 "PetalWidthCm"]]
print(new_data.head())

Code #6: Box Plot for Iris Data

plt.figure(figsize = (10, 7))
new_data.boxplot()
plt.show()
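
For the fourth task (compare distributions and identify outliers), the points a box plot draws beyond its whiskers can also be listed numerically for every feature. The sketch below assumes the usual 1.5 x IQR rule; the helper iqr_outliers and the seven-row DataFrame are hypothetical, so the values it flags are not results for the real dataset.

```python
import pandas as pd


def iqr_outliers(df, whisker=1.5):
    """Return, per numeric column, the values lying beyond the IQR fences."""
    result = {}
    for name in df.select_dtypes(include="number").columns:
        q1 = df[name].quantile(0.25)
        q3 = df[name].quantile(0.75)
        iqr = q3 - q1
        mask = (df[name] < q1 - whisker * iqr) | (df[name] > q3 + whisker * iqr)
        result[name] = df.loc[mask, name].tolist()
    return result


# Hypothetical seven-row stand-in for new_data (not the real dataset).
new_data = pd.DataFrame({
    "SepalWidthCm": [2.0, 2.8, 3.0, 3.0, 3.2, 3.4, 4.4],
    "PetalLengthCm": [1.3, 1.4, 1.5, 4.5, 4.7, 5.1, 6.0],
})

outliers = iqr_outliers(new_data)
for name, values in outliers.items():
    print(f"{name}: {values}")
```

A feature with an empty list has no points beyond its whiskers, while a feature with listed values shows them as separate markers in the box plot.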

Conclusion: Thus we have studied data visualization on the Iris data set using
histograms and box plots.
