
PR 10

The document loads and analyzes the Iris flower dataset. It lists the dataset features and their types, creates histograms and boxplots for each feature, and compares distributions to identify outliers.

Uploaded by prathamesh g

Experiment No. 10
Data Visualization III
Download the Iris flower dataset or any other dataset into a DataFrame (e.g.,
https://archive.ics.uci.edu/ml/datasets/Iris). Scan the dataset and give the inference as:

1. List down the features and their types (e.g., numeric, nominal) available in the dataset.
2. Create a histogram for each feature in the dataset to illustrate the feature distributions.
3. Create a boxplot for each feature in the dataset.
4. Compare distributions and identify outliers.

Step 1:
import pandas as pd
import numpy as np
import random

Step 2:
df=pd.read_csv("iris.csv")
df.describe()

Step 3:
df.info()
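
Task 1 asks for the features and their types. df.info() already reports the pandas dtype of each column. The sketch below shows one possible way to turn those dtypes into the numeric/nominal labels the task mentions. The helper name feature_types and the three sample rows are illustrative assumptions, not part of the original worksheet. In practice the helper would be called on the df loaded in Step 2.

```python
import pandas as pd


def feature_types(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its pandas dtype and a coarse kind."""
    kinds = {
        # Treat any numeric dtype as "numeric" and everything else as "nominal".
        col: "numeric" if pd.api.types.is_numeric_dtype(df[col]) else "nominal"
        for col in df.columns
    }
    return pd.DataFrame({"dtype": df.dtypes.astype(str), "kind": pd.Series(kinds)})


# Tiny stand-in for iris.csv (illustrative values, not the real dataset).
sample = pd.DataFrame({
    "sepal.length": [5.1, 7.0, 6.3],
    "sepal.width": [3.5, 3.2, 3.3],
    "petal.length": [1.4, 4.7, 6.0],
    "petal.width": [0.2, 1.4, 2.5],
    "variety": ["Setosa", "Versicolor", "Virginica"],
})
types = feature_types(sample)
print(types)
```

With the column names used in this worksheet, the four measurement columns should come out as numeric and variety as nominal.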

Step 4:
df.head()

Step 5:
df.tail()

Step 6:
df.isnull().sum()

Step 7:
a=df.groupby(['sepal.length','sepal.width','petal.length','petal.width']).count()
a

Step 8:
b=df.groupby(['variety']).count()
b

Step 9:
# Encode the variety column as an unordered categorical (feature names are not category values)
v=pd.Categorical(df['variety'],categories=['Setosa','Versicolor','Virginica'],ordered=False)
v

Step 10:
s1=df.groupby(['variety','sepal.length']).count()
s1

Step 11:
import matplotlib.pyplot as m
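# Plot sepal.length per variety; m.hist(['s1']) would bin the literal string 's1'
m.hist([g['sepal.length'] for _,g in df.groupby('variety')],label=sorted(df['variety'].unique()))
m.legend()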

Step 12:
s2=df.groupby(['variety','sepal.width']).count()
s2

Step 13:
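m.hist([g['sepal.width'] for _,g in df.groupby('variety')],label=sorted(df['variety'].unique()))  # one series per variety
m.legend()

Step 14: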
s3=df.groupby(['variety','petal.length']).count()
s3

Step 15:
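m.hist([g['petal.length'] for _,g in df.groupby('variety')],label=sorted(df['variety'].unique()))  # one series per variety
m.legend()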

Step 16:
s4=df.groupby(['variety','petal.width']).count()
s4

Step 17:
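m.hist([g['petal.width'] for _,g in df.groupby('variety')],label=sorted(df['variety'].unique()))  # one series per variety
m.legend()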

Step 18:
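m.hist(df.select_dtypes('number').to_numpy(),label=list(df.select_dtypes('number').columns))  # all numeric features side by side
m.legend()

Step 19: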
m.hist(data=df,x='sepal.length')

Step 20:
m.hist(data=df,x='sepal.width')

Step 21:
m.hist(data=df,x='petal.length')

Step 22:
m.hist(data=df,x='petal.width')

Step 23:
m.boxplot(df['sepal.length'],vert=False)

Step 24:
m.boxplot(df['sepal.width'],vert=False)

Step 25:
m.boxplot(df['petal.length'],vert=False)

Step 26:
m.boxplot(df['petal.width'],vert=False)

Step 27:
import seaborn as sns
sns.boxplot(x='variety',y='sepal.length',data=df)

Step 28:
sns.boxplot(x='variety',y='sepal.width',data=df)

Step 29:
sns.boxplot(x='variety',y='petal.length',data=df)

Step 30:
sns.boxplot(x='variety',y='petal.width',data=df)
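
Task 4 asks to identify outliers, which the boxplots above show only visually. A minimal sketch of how the same points could be listed numerically follows. It assumes the 1.5 × IQR rule that matplotlib and seaborn boxplots use for their whiskers by default. The helper name iqr_outliers and the sample values are illustrative assumptions, not taken from the worksheet or the real dataset.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries of `values` lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    # Flag anything below the lower fence or above the upper fence.
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return values[mask]


# Illustrative stand-in for one feature column (not real Iris measurements).
sample = pd.DataFrame({
    "sepal.width": [3.0, 3.1, 2.9, 3.2, 3.0, 4.4, 2.0],
    "variety": ["Setosa"] * 7,
})
out = iqr_outliers(sample["sepal.width"])
print(out)
```

On the DataFrame from Step 2, calling iqr_outliers(df['sepal.width']) would list the points drawn as fliers in Step 24, and df.groupby('variety')['sepal.width'].apply(iqr_outliers) should give the per-variety equivalent of Step 28.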
