Business Analytics Assignment NAME: Divyansh Bisht

The document is an assignment by Divyansh Bisht analyzing the Iris dataset, which contains measurements of iris flowers categorized into three species. It includes descriptive statistics, correlation analysis, outlier detection, and visual representations of the data, concluding that the species are well-separated based on their measurements. Suggestions for further research include applying machine learning algorithms and exploring dimensionality reduction techniques.

Uploaded by

Divyansh Bisht


BUSINESS ANALYTICS ASSIGNMENT

NAME: Divyansh Bisht

Hindu College
ROLL NO. : 22-4-02-003293
SEMESTER: VI
SUBJECT: BUSINESS ANALYTICS
SUBMITTED TO: PROF. JASPREET KAUR

Teacher’s Signature Student Signature

In-Depth Analysis of the Iris Dataset

The Iris dataset is a famous dataset in machine learning and statistics.

It contains 150 observations of iris flowers, categorized into three
species: Setosa, Versicolor, and Virginica. Each observation includes
four features:

• Sepal Length (in cm)

• Sepal Width (in cm)

• Petal Length (in cm)

• Petal Width (in cm)

The objective is to analyze and interpret the dataset using statistical formulas.

1. Descriptive Statistics (Mean, Median, Standard Deviation)

Descriptive statistics help us understand the central tendency (mean, median) and variability (standard deviation) of the data.

Setosa

Feature Mean (µ) Median Standard Deviation (σ)

Sepal Length 5.006 5.0 0.35

Sepal Width 3.428 3.4 0.37

Petal Length 1.462 1.5 0.17

Petal Width 0.246 0.2 0.11

Versicolor

Feature Mean (µ) Median Standard Deviation (σ)

Sepal Length 5.936 5.9 0.52

Sepal Width 2.770 2.8 0.31

Petal Length 4.260 4.35 0.47

Petal Width 1.326 1.3 0.20

Virginica
Feature Mean (µ) Median Standard Deviation (σ)

Sepal Length 6.588 6.5 0.64

Sepal Width 2.974 3.0 0.32

Petal Length 5.552 5.55 0.55

Petal Width 2.026 2.0 0.27
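The per-species figures above can be reproduced with Python's standard `statistics` module. The following is a minimal sketch, not the author's actual procedure. The helper name `describe` is hypothetical, and the five sample values are the sepal lengths of the first five Setosa rows in the commonly distributed Iris file, used only for illustration, so the results differ from the 50-row Setosa figures. It also assumes the sample standard deviation (n - 1 denominator), since the assignment does not say which form was used.

```python
import statistics


def describe(values):
    """Return mean, median and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample SD (n - 1 denominator); which form the assignment used is an assumption.
        "std": statistics.stdev(values),
    }


# Illustrative sample: sepal length (cm) of the first five Setosa rows.
setosa_sepal_length_sample = [5.1, 4.9, 4.7, 4.6, 5.0]

summary = describe(setosa_sepal_length_sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Running `describe` once per feature and per species on the full dataset would give tables of the same shape as the three above.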

2. Range & Variance

The range (difference between the largest and smallest values) and variance (the average squared deviation from the mean, i.e. the square of the standard deviation) describe how widely the data are spread.

Range of Features: Min - Max (Range), in cm

Feature Setosa Versicolor Virginica

Sepal Length 4.3 - 5.8 (1.5) 4.9 - 7.0 (2.1) 4.9 - 7.9 (3.0)

Sepal Width 2.3 - 4.4 (2.1) 2.0 - 3.4 (1.4) 2.2 - 3.8 (1.6)

Petal Length 1.0 - 1.9 (0.9) 3.0 - 5.1 (2.1) 4.5 - 6.9 (2.4)

Petal Width 0.1 - 0.6 (0.5) 1.0 - 1.8 (0.8) 1.4 - 2.5 (1.1)
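The range table lists no variance values, but they follow directly from the standard deviations in section 1, because variance is the square of the standard deviation. The sketch below recomputes the sepal length ranges and derives the implied variances. The function names `value_range` and `variance_from_sd` are hypothetical, and the inputs are the rounded figures reported in this document, so the variances are approximate.

```python
# (min, max) in cm for sepal length, copied from the range table above.
sepal_length_bounds = {
    "Setosa": (4.3, 5.8),
    "Versicolor": (4.9, 7.0),
    "Virginica": (4.9, 7.9),
}

# Standard deviations for sepal length, copied from the section 1 tables.
sepal_length_sd = {"Setosa": 0.35, "Versicolor": 0.52, "Virginica": 0.64}


def value_range(lowest, highest):
    """Range = largest value minus smallest value."""
    return round(highest - lowest, 2)


def variance_from_sd(sd):
    """Variance is the square of the standard deviation."""
    return round(sd ** 2, 4)


ranges = {s: value_range(lo, hi) for s, (lo, hi) in sepal_length_bounds.items()}
variances = {s: variance_from_sd(sd) for s, sd in sepal_length_sd.items()}

for species in sepal_length_bounds:
    print(f"{species}: range={ranges[species]} cm, variance={variances[species]} cm^2")
```

Note that variance is expressed in squared units (cm²), which is why the standard deviation is usually easier to interpret.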

3. Correlation Analysis

Correlation measures how strongly features are related.

Correlation (r) Value Interpretation

Sepal Length & Petal Length 0.87 Strong positive correlation

Sepal Width & Petal Length -0.43 Negative correlation

Petal Length & Petal Width 0.96 Very strong positive correlation
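The r values in this table are Pearson correlation coefficients, defined as the covariance of two features divided by the product of their spreads. A minimal sketch of that calculation follows. The function name `pearson_r` and the two short lists are hypothetical illustration data, not Iris measurements; applying the function to the full petal length and petal width columns is what would yield a value like the one reported above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator of r).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical, perfectly linear data used only to show the two extremes.
lengths = [1.0, 2.0, 3.0, 4.0, 5.0]
widths_up = [0.2, 0.4, 0.6, 0.8, 1.0]
widths_down = [1.0, 0.8, 0.6, 0.4, 0.2]

r_positive = pearson_r(lengths, widths_up)
r_negative = pearson_r(lengths, widths_down)
print(round(r_positive, 3), round(r_negative, 3))
```

Values near +1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship.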

4. Z-Score & Outlier Detection

Z-score helps us find outliers (unusual data points). The Z-score formula:

$$Z = \frac{X - \mu}{\sigma}$$

where:
• X = Data point

• μ = Mean

• σ = Standard deviation

Observations:

• No extreme outliers in petal and sepal sizes.

• However, some sepal widths in Setosa (4.4 cm) and some petal
widths in Virginica (2.5 cm) are slightly unusual.
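As a worked check of these observations, the Z-scores of the two values mentioned can be computed from the means and standard deviations in section 1. This is a sketch under stated assumptions: the function name `z_score` is hypothetical, the inputs are the rounded figures from this document, and the cut-off of |Z| > 3 for an "extreme" outlier is a common convention rather than something the assignment specifies.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Setosa sepal width: value 4.4 cm, mean 3.428, SD 0.37 (from section 1).
z_setosa_sepal_width = z_score(4.4, 3.428, 0.37)

# Virginica petal width: value 2.5 cm, mean 2.026, SD 0.27 (from section 1).
z_virginica_petal_width = z_score(2.5, 2.026, 0.27)

EXTREME_THRESHOLD = 3.0  # assumed convention, not stated in the assignment

for label, z in [
    ("Setosa sepal width 4.4 cm", z_setosa_sepal_width),
    ("Virginica petal width 2.5 cm", z_virginica_petal_width),
]:
    verdict = "extreme" if abs(z) > EXTREME_THRESHOLD else "not extreme"
    print(f"{label}: Z = {z:.2f} ({verdict})")
```

Both values come out below 3 (roughly 2.6 and 1.8), which is consistent with calling them slightly unusual rather than extreme.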

5. Visual Analysis Using Graphs

Histogram of Sepal Length

A histogram shows the distribution of sepal length for each species. The
distribution helps identify patterns and variations.

Pie Chart of Species Distribution

A pie chart represents the proportion of each species in the dataset, showing an equal distribution.

Scatter Plot of Petal Length vs. Petal Width

A scatter plot helps visualize the strong correlation between petal length
and petal width. Setosa has distinct clustering, while Versicolor and Virginica
overlap slightly.

Box Plot for Sepal Width

A box plot highlights outliers and the spread of sepal width across species.
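Box plots conventionally flag a point as an outlier when it lies more than 1.5 times the interquartile range (IQR) beyond the first or third quartile. The sketch below applies that rule with the standard library. The function name `iqr_outliers` and the nine sample widths are hypothetical illustration values, and the "inclusive" quartile method is an assumption; plotting libraries may compute quartiles slightly differently.

```python
import statistics


def iqr_outliers(values):
    """Return the values lying outside the 1.5 * IQR box-plot fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical sepal widths in cm, chosen only to illustrate the rule.
sample_widths = [2.3, 3.0, 3.2, 3.3, 3.4, 3.5, 3.6, 3.8, 4.4]

flagged = iqr_outliers(sample_widths)
print(flagged)
```

For this sample the quartiles are 3.2 and 3.6, so the fences sit near 2.6 and 4.2 and only the two end values are flagged.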

6. Conclusion & Insights

Key Takeaways:

• Setosa has the smallest petals & widest sepals → Easily identifiable.

• Virginica has the longest petals & sepals → Distinct from others.

• Versicolor is in-between → Moderate petal & sepal size.

• Petal length & width are highly correlated → If you know one, you can predict the other.

• No extreme outliers, but some high sepal width in Setosa and large petal width in Virginica are unusual.
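The takeaway that petal width can be predicted from petal length can be illustrated with a least-squares line. The sketch below is deliberately rough: it fits the line through only the three species means from section 1, whereas a proper fit would use all 150 observations, so the slope and intercept are indicative at best. The names `fit_line` and `predict_width` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Species means from section 1, ordered Setosa, Versicolor, Virginica.
mean_petal_length = [1.462, 4.260, 5.552]
mean_petal_width = [0.246, 1.326, 2.026]

slope, intercept = fit_line(mean_petal_length, mean_petal_width)


def predict_width(petal_length):
    """Predicted petal width (cm) for a given petal length (cm)."""
    return intercept + slope * petal_length


print(f"width = {intercept:.3f} + {slope:.3f} * length")
print(f"predicted width at 4.0 cm length: {predict_width(4.0):.2f} cm")
```

The fitted slope of roughly 0.43 means each extra centimetre of petal length corresponds to about 0.4 cm of extra petal width in this simplified fit.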

With this extensive statistical and visual analysis, we conclude that Setosa, Versicolor, and Virginica are well-separated species based on their petal and sepal measurements. These insights can be utilized in classification models for accurate species identification.
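To make the classification idea concrete, here is a minimal nearest-centroid sketch: a new flower is assigned to the species whose mean petal measurements are closest. This is a simple stand-in chosen for illustration, not a method used in the assignment. The centroids are the petal length and petal width means from section 1, while the function name `classify` and the three test flowers are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# (petal length, petal width) means in cm, copied from the section 1 tables.
centroids = {
    "Setosa": (1.462, 0.246),
    "Versicolor": (4.260, 1.326),
    "Virginica": (5.552, 2.026),
}


def classify(petal_length, petal_width):
    """Return the species whose mean petal measurements are nearest."""
    point = (petal_length, petal_width)
    return min(centroids, key=lambda species: dist(point, centroids[species]))


# Hypothetical flowers used only to exercise the rule.
print(classify(1.4, 0.2))
print(classify(4.5, 1.5))
print(classify(6.0, 2.2))
```

Because Versicolor and Virginica overlap slightly, a rule this simple would be expected to misclassify some flowers near the boundary between those two species.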

Further Research Suggestions:

• Applying machine learning algorithms like KNN or SVM on the dataset.

• Exploring dimensionality reduction techniques (e.g., PCA) to analyze feature importance.

• Investigating real-world applications of iris species classification in botany and agriculture.

References

• Fisher, R.A. "The Use of Multiple Measurements in Taxonomic Problems." 1936.

• UCI Machine Learning Repository: Iris Dataset.
