Module 2 Iris Data Set

The iris dataset contains measurements of 150 iris flowers of 3 species. It includes 4 numeric attributes and 1 categorical species attribute. The dataset structure and descriptive statistics are explored. Visualizations include scatter plots of attributes, histograms, and box plots to examine relationships and distributions.

Uploaded by

Rachell Ann Uson

Iris Dataset

Allan Lao
2023-09-26
##ctrl-alt-i for code blocks

Iris Dataset in R
The iris dataset is a built-in dataset in R that contains measurements of 4 attributes (in centimeters) for 150 flowers, 50 from each of 3 species.

To explore the dataset, we can describe it statistically or visualize using charts.

Load the Iris Dataset

Since iris is a built-in dataset, we simply need to load it and use it.

data(iris)

Explore the Structure of the dataset

First, examine the data structure to determine its size, number of columns, and other attributes. The order in which you inspect these is up to the analyst.

Structure
The structure of the dataset

str(iris)

## 'data.frame': 150 obs. of 5 variables:

## $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
## $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
## $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
## $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
## $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

str() shows the structure of the dataset: the number of observations (records), the number of variables, and the data type of each variable. The iris dataset has 150 rows and 5 columns. Note that the Species variable has the data type Factor.
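As a small illustration of the data types that str() reports, the sketch below lists the class of each column and looks inside the Species factor. The variable names col_classes and species_levels are only illustrative.

```r
data(iris)

# Class of every column: four numeric columns and one factor
col_classes <- sapply(iris, class)
print(col_classes)

# A factor stores a fixed set of labels (its levels) plus integer codes
species_levels <- levels(iris$Species)
print(species_levels)

# Integer codes behind the first few Species values
print(head(as.integer(iris$Species)))
```

The integer codes are what appear after the level names in the str() output for Species.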

The dimension

dim(iris)

## [1] 150 5

The names of the columns

names(iris)

## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"

To take a glimpse at the first 4 rows:

head(iris,4)

##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa

Optionally, you may also check the last 6 records.

tail(iris)

##     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 145          6.7         3.3          5.7         2.5 virginica
## 146          6.7         3.0          5.2         2.3 virginica
## 147          6.3         2.5          5.0         1.9 virginica
## 148          6.5         3.0          5.2         2.0 virginica
## 149          6.2         3.4          5.4         2.3 virginica
## 150          5.9         3.0          5.1         1.8 virginica

Describe the Iris Dataset using Statistical tools

Now, let's use some statistics to describe the dataset.

The descriptive statistics summary

summary(iris)

## Sepal.Length Sepal.Width Petal.Length Petal.Width

## Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
## 1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
## Median :5.800 Median :3.000 Median :4.350 Median :1.300
## Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
## 3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
## Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
## Species
## setosa :50
## versicolor:50
## virginica :50
##
##
##

For each of the numeric variables we can see the following information:

Min: The minimum value.

1st Qu: The value of the first quartile (25th percentile).
Median: The median value.
Mean: The mean value.
3rd Qu: The value of the third quartile (75th percentile).
Max: The maximum value.

For the only categorical variable in the dataset (Species) we see a frequency count of each value:

setosa: This species occurs 50 times.

versicolor: This species occurs 50 times.
virginica: This species occurs 50 times.
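To make the numeric part of the summary concrete, the following sketch recomputes the six values for one column (Sepal.Length) with the individual base R functions. It assumes the default quantile method, which is what summary() uses; the name manual is only illustrative.

```r
data(iris)

x <- iris$Sepal.Length

# Recompute each entry of the summary by hand
manual <- c(
  Min    = min(x),
  Q1     = unname(quantile(x, 0.25)),  # first quartile (25th percentile)
  Median = median(x),
  Mean   = mean(x),
  Q3     = unname(quantile(x, 0.75)),  # third quartile (75th percentile)
  Max    = max(x)
)
print(round(manual, 3))
```

The printed values can be compared against the Sepal.Length column of the summary(iris) output above.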

Visualize the Iris Dataset

The plot() function is the generic function for plotting R objects.

plot(iris)

Plotting the entire dataset provides a glimpse of the relationships between its variables. In the resulting matrix of scatter plots, the panel directly below the Sepal.Length label shows Sepal.Width on the y-axis and Sepal.Length on the x-axis.
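The relationships seen in the scatter plot matrix can also be put into numbers. As one possible companion to the chart (not part of the original walkthrough), this sketch computes the pairwise Pearson correlations of the four numeric columns; cor_matrix is an illustrative name.

```r
data(iris)

# Pairwise Pearson correlations among the four numeric columns
# (Species is a factor, so it is left out)
cor_matrix <- cor(iris[, 1:4])
print(round(cor_matrix, 2))
```

Values close to 1 or -1 correspond to panels where the points fall near a straight line, while values near 0 correspond to diffuse clouds.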

Plot quantitative variables

plot(iris$Sepal.Length) #Quantitative

Plot 2 quantitative variables

plot(iris$Sepal.Width, iris$Sepal.Length,
col=factor(iris$Species),
main='Sepal Length vs Width',
xlab='Sepal Width',
ylab='Sepal Length',

pch=19)

# One legend entry per species, using the same colour indices and
# point symbol as the plot() call above
legend(x = "topleft",
legend = levels(iris$Species),
col = seq_along(levels(iris$Species)),
pch = 19,
text.font = 4,
text.col = "blue")

Plotting a Factor variable

The plot() function automatically detects the type of variable and chooses an appropriate chart by default. For a factor, that default is a bar chart of the counts per level.

plot(iris$Species)
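The bar heights in this chart are the counts of each factor level. A minimal sketch of how to obtain those counts directly, with species_counts as an illustrative name:

```r
data(iris)

# plot() picks its behaviour from the class of its argument
print(class(iris$Species))

# Number of observations per level: these are the bar heights
species_counts <- table(iris$Species)
print(species_counts)
```

Passing species_counts to barplot() would draw the same bars explicitly.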

Next, we use a histogram to see how the data is spread across a range of values, here the distribution of Sepal Length.

hist(iris$Sepal.Length,
col='steelblue',
main='Histogram',
xlab='Length',
ylab='Frequency')
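To see the numbers behind the bars, hist() can be called with plot = FALSE, which returns the bin boundaries and counts without drawing anything. This is a small sketch using R's default bin selection, as in the call above; h is an illustrative name.

```r
data(iris)

# Compute the histogram without drawing it
h <- hist(iris$Sepal.Length, plot = FALSE)

print(h$breaks)  # bin boundaries
print(h$counts)  # number of flowers falling into each bin
```

The counts always add up to the number of observations, 150 in this case.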

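A box plot displays the five-number summary: the minimum, the 25th percentile (first quartile), the median, the 75th percentile (third quartile), and the maximum. By default, R's boxplot() draws points lying more than 1.5 times the interquartile range beyond the box as individual outliers, so the whiskers stop at the most extreme values within that range. This makes it useful for visualizing the spread of the data and comparing groups.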

Using a boxplot() we can determine the distribution of sepal length across species.

boxplot(Sepal.Length~Species,
data=iris,
main='Sepal Length by Species',
xlab='Species',
ylab='Sepal Length',
col='steelblue',
border='black')
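The values drawn in each box can also be computed directly. The sketch below is one way to do this in base R: fivenum() gives the five-number summary per species, and boxplot() with plot = FALSE returns the statistics it would draw. The names five_num and bp are illustrative.

```r
data(iris)

# Five-number summary of Sepal.Length within each species
# (minimum, lower hinge, median, upper hinge, maximum)
five_num <- tapply(iris$Sepal.Length, iris$Species, fivenum)
print(five_num)

# Statistics boxplot() would draw, computed without plotting:
# bp$stats has one column per species, bp$out lists any outliers
bp <- boxplot(Sepal.Length ~ Species, data = iris, plot = FALSE)
print(bp$stats)
print(bp$out)
```

Where a species has outliers, the whisker ends in bp$stats differ from the minimum or maximum reported by fivenum().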
