Lab 12

The document outlines an experiment to create a ridge plot visualizing sepal length distributions by species in the Iris dataset, emphasizing the use of distinct colors and titles for clarity. It includes pre-lab and post-lab questions to enhance understanding of ridge plots and their advantages over other visualization methods. Additionally, it provides Python code for generating the plot and suggests further exercises for exploring other features of the dataset.

Uploaded by

VICKY K

LAB # 12 - Plot a ridge plot to visualize sepal length distributions by species. Add distinct colors and titles for each ridge.

Aim of the Experiment:

The aim of this experiment is to create a ridge plot to visualize the distribution of sepal length
by species in the Iris dataset. By utilizing distinct colors and adding titles for each ridge, the
plot will allow for a clearer comparison of how sepal length varies across different species.

Pre-lab Questions:

1. What is a ridge plot, and how is it useful in comparing distributions across multiple
categories?
2. How does a ridge plot differ from other distribution visualization methods like
histograms or boxplots?
3. What are the advantages of using distinct colors when visualizing data distributions?
4. In the context of the Iris dataset, what do you expect to observe about the sepal length
distribution for each species?
5. How can ridge plots be used to identify outliers or trends in the data across different
categories?

Python Code:
# Import necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Load the Iris dataset
iris = sns.load_dataset("iris")

# Use a light grid theme for all plots
sns.set_theme(style="whitegrid")

# Create the ridge plot: one row (ridge) per species, each in its own color.
# Using row="species" gives every species its own axes; hue alone would
# draw all three KDE curves on a single shared axes.
g = sns.FacetGrid(iris, row="species", hue="species", palette="Set2",
                  height=2, aspect=4)
g.map(sns.kdeplot, "sepal_length", fill=True, alpha=0.6, linewidth=2)

# Title each ridge with its species name and label the axes
g.set_titles("{row_name}")
g.set_axis_labels("Sepal Length (cm)", "Density")
g.despine(left=True)

# Show the plot
plt.show()

Results:

After running the code, a ridge plot will be generated, showing the density distribution of
sepal length for each species in the Iris dataset. Each species will have its own ridge, and the
distinct colors will help differentiate the species visually. The KDE curves will show how the
sepal length is distributed across the species, with peaks indicating the most common sepal
lengths for each species.
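
The peak of each ridge can also be located numerically. The following is a minimal sketch, assuming a Gaussian kernel with Scott's-rule bandwidth (the default bandwidth rule in sns.kdeplot); the helper names gaussian_kde_1d and kde_peak are hypothetical, and the sample values are made up for illustration rather than taken from the Iris data.

```python
import numpy as np


def gaussian_kde_1d(samples, grid):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`.

    Bandwidth follows Scott's rule: h = std * n ** (-1/5).
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    h = samples.std(ddof=1) * n ** (-1 / 5)
    # One Gaussian bump per sample, averaged over all samples.
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))


def kde_peak(samples, grid_size=512):
    """Return the grid position where the KDE is highest (the ridge's peak)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.linspace(samples.min(), samples.max(), grid_size)
    return float(grid[np.argmax(gaussian_kde_1d(samples, grid))])


# Made-up measurements for illustration only (not taken from the Iris data).
demo = {
    "group_a": [4.8, 4.9, 5.0, 5.0, 5.1, 5.2],
    "group_b": [6.2, 6.4, 6.5, 6.5, 6.6, 6.9],
}
for name, values in demo.items():
    print(name, round(kde_peak(values), 2))
```

For the Iris data, the same helper could be applied to the sepal_length values of each species, for example by iterating over iris.groupby("species").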

Post-lab Questions:

1. How do the ridge plots help you compare the distributions of sepal length across
different species?
2. What would you infer if the KDE curves for sepal length overlap significantly
between species?
3. How can you use ridge plots to identify differences in the spread or central tendency
of sepal length for each species?
4. What impact does adjusting the alpha (transparency) in the sns.kdeplot() function
have on the visualization?
5. How might ridge plots help in choosing appropriate statistical tests or machine
learning models for data analysis?
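
For question 2, the amount of overlap between two KDE curves can be expressed as a single number: the area under the pointwise minimum of the two densities. The sketch below is one possible way to approximate it, assuming Gaussian kernels with Scott's-rule bandwidth; kde_on_grid and overlap_coefficient are hypothetical helper names, and the sample values are made up rather than taken from the Iris data.

```python
import numpy as np


def kde_on_grid(samples, grid):
    """Gaussian KDE with Scott's-rule bandwidth, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    h = samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))


def overlap_coefficient(a, b, grid_size=2000, pad=3.0):
    """Approximate the area shared by the KDEs of `a` and `b`.

    Values near 0 mean the curves barely overlap; values near 1 mean
    the two curves are almost identical.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    both = np.concatenate([a, b])
    spread = both.std(ddof=1)
    # Pad the grid so the tails of both curves are included.
    grid = np.linspace(both.min() - pad * spread,
                       both.max() + pad * spread, grid_size)
    shared = np.minimum(kde_on_grid(a, grid), kde_on_grid(b, grid))
    # Riemann-sum approximation of the integral of min(f, g).
    return float(shared.sum() * (grid[1] - grid[0]))


# Made-up measurements for illustration only (not taken from the Iris data).
group_a = [4.8, 4.9, 5.0, 5.0, 5.1, 5.2]
group_b = [6.2, 6.4, 6.5, 6.5, 6.6, 6.9]
print("a vs b:", round(overlap_coefficient(group_a, group_b), 3))
print("a vs a:", round(overlap_coefficient(group_a, group_a), 3))
```

A large overlap between two species suggests that sepal length alone would not separate them well, which is the kind of inference question 2 asks about.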

Viva Questions:

1. What is a ridge plot, and how does it help visualize distributions of data?
2. How does the hue argument in Seaborn help differentiate between different categories
in the plot?
3. What is the role of KDE (Kernel Density Estimation) in creating the ridge plot?
4. Why would you choose to use a ridge plot instead of a histogram or boxplot for this
type of analysis?
5. How can you interpret overlapping or non-overlapping KDE curves for different
species in the plot?

Additional Exercise Problems:

1. Modify the ridge plot to show the distribution of petal length instead of sepal length,
and compare distributions by species.
2. Create a similar plot for the distribution of sepal width across species, using a boxplot
for comparison.
3. Generate a scatter plot to explore the relationship between sepal length and petal
length, then fit a linear regression line.
4. Use a violin plot to visualize the sepal length distribution across different species,
comparing it to the ridge plot.
5. Perform a pairplot on the Iris dataset and compare how sepal length correlates with
other features like petal width or petal length.
