EDS Unit 2 ?

The document provides an overview of data types and statistical descriptions, detailing various attributes such as qualitative and quantitative types, along with their subcategories. It explains basic statistical measures including central tendency (mean, median, mode) and dispersion (range, variance, standard deviation), emphasizing their importance in data analysis. Additionally, it covers the structure of datasets and the significance of graphic displays in representing data.

Uploaded by

moinuddinmoin1357

Unit 2 Data Types & Statistical Description
Syllabus
Types of Data: Attributes and Measurement, What is an Attribute? The Type of
an Attribute, The Different Types of Attributes, Describing Attributes by the
Number of Values, Asymmetric Attributes, Binary Attribute, Nominal Attributes,
Ordinal Attributes, Numeric Attributes, Discrete versus Continuous Attributes.
Basic Statistical Descriptions of Data:

Measuring the Central Tendency: Mean, Median, and Mode

Measuring the Dispersion of Data: Range, Quartiles, Variance, Standard Deviation, and Inter-quartile Range

Graphic Displays of Basic Statistical Descriptions of Data.

Definition of Dataset
A dataset is a collection of data objects organized in a structured format,
typically in rows and columns. Each row represents an instance (record), and
each column represents an attribute (feature or variable). Datasets are used for
analysis, prediction, and decision-making.

Data objects are described by a number of attributes (variables).

The dataset consists of rows and columns, where the rows correspond to the
data objects and the columns correspond to the attributes of the data objects.
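As a minimal sketch of the row/column structure described above, a small dataset can be held as a list of records. The student values below are hypothetical and only illustrate the layout; the attribute names (age, gender, GPA) follow the example used in these notes.

```python
# Hypothetical student dataset: each dict is one data object (row),
# and each key is one attribute (column).
students = [
    {"age": 19, "gender": "F", "gpa": 3.6},
    {"age": 21, "gender": "M", "gpa": 3.1},
    {"age": 20, "gender": "F", "gpa": 3.9},
]

# The attributes are the column names shared by every row.
attributes = list(students[0].keys())

# Reading one column means collecting that attribute from every row.
ages = [row["age"] for row in students]

print(attributes)
print(ages)
```

Each row can be read on its own as one record, while a column such as `ages` gathers a single attribute across all records.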


Attributes
An Attribute is a property or characteristic of data objects. Example: In a
student dataset, attributes can be age, gender, or GPA.

Describing Attributes by the Number of Values

Finite Attributes: Limited number of possible values (e.g., the number of
days in a week).

Infinite Attributes: Unlimited or uncountable values (e.g., real numbers
within a range).

Types of Attributes

Attributes are broadly classified into two types:

1. Qualitative

2. Quantitative

https://medium.com/@netrajpatil12mati/data-objects-and-attribute-types-704d7d9ea8a8

Qualitative Attributes
These attributes are descriptive and non-numerical, and are used to describe
characteristics that can't be easily measured.

Nominal Attributes
Definition: Represent categories or labels without a meaningful order or
ranking.

Examples: Colors (Red, Green, Blue), Gender (Male, Female), Nationalities.

Key Point: No arithmetic operations can be performed.

Binary Attributes
Definition: Attributes with only two possible values.

Examples: Yes/No, True/False, On/Off.

Key Point: Often encoded as 0 (False) and 1 (True).


Symmetric Attributes:

Definition: Binary attributes where both outcomes have equal importance.

Examples: Male/Female, Pass/Fail.

Key Point: No bias toward one outcome over the other.

Asymmetric Attributes:

Definition: Binary attributes where one outcome is more significant than the other.

Examples: Presence/Absence of a disease, Positive/Negative test results.

Key Point: The two values are not equally important.

Ordinal Attributes
Definition: Represent categories with a meaningful order or ranking, but the
intervals between values are not defined.

Examples: Education levels (High School < College < Graduate), Likert
scale (Poor, Average, Good).

Key Point: Arithmetic operations are not applicable, but comparisons are.
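One way to make the "comparisons but not arithmetic" point concrete is to attach a rank to each category and use the rank only for ordering. This is a sketch, not a standard encoding: the mapping `EDUCATION_RANK` and the helper `is_lower` are hypothetical names, and the rank numbers carry no meaning beyond their order.

```python
# Hypothetical rank mapping for the education-level example.
# Only the ORDER of the ranks is meaningful, not their differences.
EDUCATION_RANK = {"High School": 0, "College": 1, "Graduate": 2}

def is_lower(level_a, level_b):
    """Return True if level_a ranks below level_b."""
    return EDUCATION_RANK[level_a] < EDUCATION_RANK[level_b]

levels = ["Graduate", "High School", "College"]

# Sorting by rank is a valid operation on ordinal data.
sorted_levels = sorted(levels, key=EDUCATION_RANK.get)

print(sorted_levels)
print(is_lower("High School", "Graduate"))
```

Subtracting or averaging the ranks would imply equal spacing between levels, which ordinal data does not guarantee.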

Quantitative Attributes
These attributes are numerical and quantifiable, and are used to measure
values or counts.

Discrete Attributes
Definition: Numeric attributes with a finite or countable number of values.

Examples: Number of children, Number of cars.

Key Point: Values are distinct and separate.

Continuous Attributes


Definition: Numeric attributes with an infinite number of possible values
within a range.

Examples: Height, Weight, Temperature.

Key Point: Can take any value in a given interval.

Numeric Attributes
Numeric attributes represent measurable quantities and can be classified as:

Interval-Scaled:

Definition: Interval-scaled attributes are measured on a scale with equal-sized units. The values of interval-scaled attributes have a definite order and can be positive, zero, or negative. However, these attributes lack a true zero point. While we can calculate the difference between values, we cannot express one value as a multiple of another.

Example: Temperature in Celsius or Fahrenheit.

Ratio-Scaled:

Definition: A ratio-scaled attribute is a numeric attribute with a natural zero point. This means if a measurement is ratio-scaled, we can talk about one value being a multiple (or ratio) of another value. Plus, the values are ordered, and we can figure out the difference between them, as well as calculate things like the mean, median, and mode.

Example: Height, Weight, Age.
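The difference between the two scales can be illustrated with temperature. The sketch below assumes the Kelvin scale as a ratio-scaled counterpart to Celsius (Kelvin is not mentioned in these notes; it is brought in only because its zero is a true zero point). The two readings are hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius to Kelvin (Kelvin has a true zero point)."""
    return celsius + 273.15

low_c, high_c = 10.0, 20.0  # hypothetical readings in Celsius

# Differences are meaningful on an interval scale.
difference = high_c - low_c

# This ratio can be computed, but it is NOT meaningful on the Celsius scale,
# because 0 degrees Celsius does not mean "no temperature".
naive_ratio = high_c / low_c

# On the ratio-scaled Kelvin values, the ratio is meaningful.
kelvin_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(difference, naive_ratio, round(kelvin_ratio, 4))
```

The Celsius ratio suggests "twice as hot", while the Kelvin ratio shows the warmer reading is only about 3.5% higher in absolute terms.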

Discrete vs Continuous attributes

A discrete attribute has a finite or countably infinite set of values. These values might
be numbers, but they don’t have to be. For example, hair color, smoker status,
medical test results, and drink sizes each have a fixed set of options, so
they’re discrete.
If an attribute isn’t discrete, we call it continuous. Continuous attributes can
take any value within a range. There are no gaps between possible values.

Let’s look at an example to make this clearer. Height is a continuous attribute.
Someone could be 170.5 cm tall, or 170.51 cm, or 170.513 cm: there’s no limit
to how precise we can get. We can always squeeze another possible value
between two heights, no matter how close they are.

Basic Statistical Descriptions of Data

Importance of Basic Statistical Descriptions of Data
1. Understanding Central Trends: Measures like mean, median, and mode
help in identifying the central tendency of the data, giving a snapshot of
what is typical or average in the dataset.

2. Evaluating Data Spread: Dispersion measures like range, variance, and
standard deviation indicate how spread out the data is. Understanding
variability is crucial for determining consistency in the dataset.

3. Detecting Anomalies: Statistical tools like box plots and inter-quartile range
(IQR) help identify outliers, which could indicate errors, special cases, or
significant trends worth investigating.

4. Comparing Datasets: Basic statistical measures make it easier to compare
different datasets, helping to identify trends, similarities, or differences
between groups or categories.

5. Supporting Data-Driven Decisions: By summarizing data effectively, these
statistical descriptions provide a solid foundation for making informed
decisions, conducting hypothesis testing, and selecting appropriate models
for analysis.

These basic descriptions are essential for interpreting data accurately and
making informed, data-driven decisions.

Measure of Central Tendency

A measure of central tendency is a statistical measure that identifies a single
value as representative of an entire distribution.

Mean
The average value, calculated by summing all data points and dividing by the
number of data points.

1. Mean of an Individual Series:

The individual series refers to a set of individual data points or values.

The mean (average) is calculated by summing up all the values and
dividing by the number of observations.

Formula:
$\text{Mean} = \frac{\sum x}{n}$

Example: For the data series [3, 5, 7, 9], the mean would be:
$\text{Mean} = \frac{3 + 5 + 7 + 9}{4} = \frac{24}{4} = 6$
2. Mean of a Discrete Series:

A discrete series consists of distinct, countable data points, often
associated with frequencies (i.e., how many times a value appears).

The mean is calculated by multiplying each data point by its frequency,
summing the results, and dividing by the total number of observations.

Formula:
$\text{Mean} = \frac{\sum (f \cdot x)}{\sum f}$

Where:

f = frequency of each value

x = value of the data point

Example: For the data series with values [2, 4, 6] and corresponding
frequencies [3, 5, 2]:
$\text{Mean} = \frac{(2 \times 3) + (4 \times 5) + (6 \times 2)}{3 + 5 + 2} = \frac{6 + 20 + 12}{10} = \frac{38}{10} = 3.8$
3. Mean of a Continuous Series:

A continuous series deals with data that can take any value within a
given range, often represented in intervals or class groups.

The mean is calculated by finding the midpoint of each class,
multiplying it by the frequency of the class, summing the results, and
dividing by the total number of observations.

Formula:
$\text{Mean} = \frac{\sum (f \cdot m)}{\sum f}$

Where:

f = frequency of each class

m = midpoint of each class interval

Example: For a continuous series with class intervals [10 − 20, 20 −
30, 30 − 40] and corresponding frequencies [5, 8, 7]:

Midpoints: m = 15, 25, 35

$\text{Mean} = \frac{(5 \times 15) + (8 \times 25) + (7 \times 35)}{5 + 8 + 7} = \frac{75 + 200 + 245}{20} = \frac{520}{20} = 26$
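The three mean formulas above can be sketched in plain Python. The function names are illustrative choices, not part of the notes, and class intervals are assumed to be given as (lower, upper) pairs.

```python
def mean_individual(values):
    """Mean of an individual series: sum of values / number of values."""
    return sum(values) / len(values)

def mean_discrete(values, frequencies):
    """Mean of a discrete series: sum(f * x) / sum(f)."""
    weighted_sum = sum(f * x for x, f in zip(values, frequencies))
    return weighted_sum / sum(frequencies)

def mean_continuous(intervals, frequencies):
    """Mean of a continuous series: use each class midpoint as its value."""
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return mean_discrete(midpoints, frequencies)

print(mean_individual([3, 5, 7, 9]))
print(mean_discrete([2, 4, 6], [3, 5, 2]))
print(mean_continuous([(10, 20), (20, 30), (30, 40)], [5, 8, 7]))
```

The continuous case reuses the discrete formula once each class is replaced by its midpoint, which mirrors how the two formulas relate in the text.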

Mode
The most frequently occurring value in the dataset.

1. For an Individual Series:

Identify the value that occurs most frequently in the dataset.

Example: In the series [1, 2, 2, 3, 4], the mode is 2 because it occurs
most often.

2. For a Discrete Series (with frequencies):

The mode is the value (or class interval) that has the highest frequency.

Formula:
Mode = Value with highest frequency
Example: For the dataset values [2, 4, 6] with frequencies [3, 5, 2], the
mode is 4 because it has the highest frequency (5).

3. For a Continuous Series (with class intervals):

The mode for a continuous series can be found using the following
formula:

$\text{Mode} = L + \left( \frac{f_1 - f_0}{2 f_1 - f_0 - f_2} \right) \times h$

Where:

$L$ = Lower boundary of the modal class

$f_1$ = Frequency of the modal class

$f_0$ = Frequency of the class before the modal class

$f_2$ = Frequency of the class after the modal class

$h$ = Width of the class intervals

Example: For the continuous series with class intervals [10 − 20, 20 −
30, 30 − 40] and frequencies [5, 8, 7], the modal class is 20 − 30
(because it has the highest frequency of 8). Plugging in $L = 20$, $f_1 = 8$,
$f_0 = 5$, $f_2 = 7$, and $h = 10$:

$\text{Mode} = 20 + \left( \frac{8 - 5}{2(8) - 5 - 7} \right) \times 10 = 20 + \frac{3}{4} \times 10 = 27.5$
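The three ways of finding the mode can be sketched as follows. The function names are illustrative. Two choices here are assumptions rather than anything stated in the notes: a missing neighbouring class (when the modal class is first or last) is given frequency 0, and ties are resolved by taking the first value or class with the highest frequency.

```python
from collections import Counter

def mode_individual(values):
    """Most frequent value in a plain list of observations."""
    return Counter(values).most_common(1)[0][0]

def mode_discrete(values, frequencies):
    """Value whose frequency is highest."""
    return max(zip(values, frequencies), key=lambda pair: pair[1])[0]

def mode_continuous(intervals, frequencies):
    """Grouped-data mode: L + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    lower, upper = intervals[i]          # modal class boundaries
    f1 = frequencies[i]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f0 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    h = upper - lower                    # class width
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * h

print(mode_individual([1, 2, 2, 3, 4]))
print(mode_discrete([2, 4, 6], [3, 5, 2]))
print(mode_continuous([(10, 20), (20, 30), (30, 40)], [5, 8, 7]))
```

For the class intervals and frequencies used in the example, the grouped formula gives 20 + (3/4) × 10 = 27.5. The sketch does not guard against a zero denominator, which occurs when the modal class and both neighbours have equal frequencies.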

Median
The middle value when data points are arranged in ascending order. If there is
an even number of values, the median is the average of the two middle values.

Odd Number of Data Points:

The median is the middle value.

Example: [1, 3, 5] → Median = 3.

Even Number of Data Points:

The median is the average of the two middle values.

Example: [1, 3, 5, 7] → Median = $\frac{3 + 5}{2} = 4$.
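The odd and even cases above can be handled by one small function. This is a sketch with an illustrative name; Python's standard library also offers `statistics.median`, which follows the same rule.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)   # the data must be in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 3, 5]))
print(median([1, 3, 5, 7]))
```

Sorting inside the function matters: applying the middle-position rule to unsorted data would return an arbitrary value.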

Measure of Dispersion of Data

A measure of dispersion describes the extent to which the data points vary
around a central value (mean, median, or mode).

Measures of dispersion are of two types:

1. Absolute Measure

2. Relative Measure

Absolute Measure of Dispersion

When dispersion is expressed in the original units of the data, it is an
absolute measure of dispersion.

1. Range:

The difference between the maximum and minimum values in a dataset.


Formula: Range = Max − Min
Example: [1, 3, 5, 7] → Range = 7 - 1 = 6

2. Quartiles:

Q1 (First Quartile): The median of the lower half of the dataset.

Q2 (Second Quartile): The median of the entire dataset.

Q3 (Third Quartile): The median of the upper half of the dataset.

Interquartile Range (IQR): The difference between Q3 and Q1.

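Formula:
IQR = Q3 − Q1
Example: For [1, 3, 5, 7, 9], Q1 = 3, Q2 = 5, Q3 = 7, and IQR = 7 − 3 = 4.
Here each half includes the median 5; conventions that exclude the median
from both halves give Q1 = 2 and Q3 = 8 instead.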

3. Variance:

A measure of how much the values in the dataset deviate from the
mean.

Formula: $\text{Variance} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$

Where $x_i$ is each value, $\mu$ is the mean, and $n$ is the number of data points.

Example: For [1, 2, 3], mean = 2, variance = $\frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.67$
4. Standard Deviation:

The square root of the variance. It shows how spread out the numbers
are.

Formula: $\text{Standard Deviation} = \sqrt{\text{Variance}}$

Example: For a variance of 0.67, standard deviation = $\sqrt{0.67} \approx 0.82$

5. Inter-quartile Range (IQR):

The range between Q1 and Q3, representing the middle 50% of the
data.

Formula: IQR = Q3 − Q1

Example: For [1, 3, 5, 7, 9], Q1 = 3, Q3 = 7, and IQR = 7 - 3 = 4
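The absolute measures listed above can be sketched with Python's standard library. The function names are illustrative. The quartiles use `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later), chosen here because it reproduces Q1 = 3 and Q3 = 7 for the example data; other quartile conventions can give different values. The variance divides by n, matching the formula in the notes.

```python
import math
import statistics

def value_range(values):
    """Range = Max - Min."""
    return max(values) - min(values)

def quartiles(values):
    """Return (Q1, Q2, Q3) using the 'inclusive' quantile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3

def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Square root of the variance."""
    return math.sqrt(population_variance(values))

q1, q2, q3 = quartiles([1, 3, 5, 7, 9])
iqr = q3 - q1

print(value_range([1, 3, 5, 7]))
print(q1, q2, q3, iqr)
print(round(population_variance([1, 2, 3]), 2))
print(round(population_std_dev([1, 2, 3]), 2))
```

Dividing by n − 1 instead of n would give the sample variance, which is a different quantity from the one defined in these notes.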


Graphic Displays:
Box Plot: Shows the distribution of data, highlighting the median, quartiles,
and outliers.

Histogram: Displays the frequency of data within specific intervals.

Bar Chart: Compares quantities of different categories.
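A box plot is drawn from a handful of summary values. As a rough sketch, the function below collects the minimum, the three quartiles, and the maximum. The name `five_number_summary` is an illustrative choice, and the quartiles again assume the `"inclusive"` method of `statistics.quantiles`; plotting libraries may use a different quartile convention and a separate rule for marking outliers.

```python
import statistics

def five_number_summary(values):
    """Values a box plot is built from: min, Q1, median, Q3, max."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }

summary = five_number_summary([1, 3, 5, 7, 9])
print(summary)
```

The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.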

Relative Measure of Dispersion

When dispersion is expressed as a ratio based on an absolute measure of
dispersion, it is a relative measure of dispersion. Because the result is
unit-free, it is mostly used to compare the variation of two or more distributions.

Coefficients of Dispersion
1. Coefficient of Range:

Measures the relative spread of the range.

Formula:
$\text{Coefficient of Range} = \frac{\text{Max} - \text{Min}}{\text{Max} + \text{Min}}$

Example: If Max = 50, Min = 10:

$\text{Coefficient of Range} = \frac{50 - 10}{50 + 10} = \frac{40}{60} \approx 0.67$
2. Coefficient of Variation (CV):

Measures relative variability compared to the mean.

Formula:
$\text{CV} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100$
Example: If mean = 20, standard deviation = 4:
$\text{CV} = \frac{4}{20} \times 100 = 20\%$
3. Coefficient of Mean Deviation:

Measures the mean deviation relative to the mean.

Formula:
$\text{Coefficient of Mean Deviation} = \frac{\text{Mean Deviation}}{\text{Mean}}$

Example: If mean = 30, mean deviation = 5:

$\text{Coefficient of Mean Deviation} = \frac{5}{30} \approx 0.167$

4. Coefficient of Quartile Deviation:

Measures the relative dispersion of the middle 50% of data.

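Formula:
$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

Example: If Q1 = 25, Q3 = 75:

$\text{Coefficient of Quartile Deviation} = \frac{75 - 25}{75 + 25} = \frac{50}{100} = 0.5$

The corresponding absolute measure is the quartile deviation:
$\text{Quartile Deviation} = \frac{Q_3 - Q_1}{2}$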
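The coefficients of dispersion above are each a single ratio, so they translate directly into short functions. The function names are illustrative, and the inputs are the same values used in the worked examples.

```python
def coefficient_of_range(max_value, min_value):
    """(Max - Min) / (Max + Min)."""
    return (max_value - min_value) / (max_value + min_value)

def coefficient_of_variation(std_dev, mean):
    """(Standard Deviation / Mean) * 100, expressed as a percentage."""
    return std_dev / mean * 100

def coefficient_of_mean_deviation(mean_deviation, mean):
    """Mean Deviation / Mean."""
    return mean_deviation / mean

def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)

def quartile_deviation(q1, q3):
    """(Q3 - Q1) / 2, an absolute measure in the data's own units."""
    return (q3 - q1) / 2

print(round(coefficient_of_range(50, 10), 2))
print(round(coefficient_of_variation(4, 20), 2))
print(round(coefficient_of_mean_deviation(5, 30), 3))
print(coefficient_of_quartile_deviation(25, 75))
print(quartile_deviation(25, 75))
```

All of the coefficients are unit-free, which is what makes them usable for comparing datasets measured on different scales; the quartile deviation is included only for contrast.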

Basic Graphical Representations of Data

1. Line Graphs
Single Line Graph:
Represents one variable over time or another continuous variable.
Example: A graph showing daily temperatures over a week.

Multiple Line Graphs:

Displays two or more variables on the same graph for comparison.

Example: Sales trends for two products over a year.

Compound Line Graph:

Used to show cumulative data trends, where different variables contribute
to the total.

Example: A graph showing total revenue divided into product categories
over time.

2. Pie Chart
Represents data as a circular chart divided into slices, where each slice is
proportional to the percentage of a category.

Example: Market share of different companies in a sector.

3. Histogram
Displays frequency distribution of continuous data, with adjacent bars to
indicate intervals.


Example: Distribution of student scores in an exam.

4. Bar Charts
Vertical Bar Chart:

Bars are upright, and their height represents the value of each category.

Example: Sales revenue for different products.

Horizontal Bar Chart:

Bars are horizontal, suitable when category names are long.

Example: Population of different cities.

Grouped Bar Chart:

Groups multiple bars for each category to compare subcategories.

Example: Sales revenue for two brands in different regions.

Stacked Bar Chart:

Stacks subcategories within a bar to show their contribution to the total.

Example: Total revenue with individual contributions from different
departments.
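Before a histogram can be drawn, the data has to be counted into class intervals. The sketch below does that counting and prints a rough text version of the chart. The function names and the exam scores are hypothetical, and each interval is assumed to include its lower boundary and exclude its upper boundary.

```python
def frequency_table(values, start, width, num_classes):
    """Count how many values fall in each class interval [lower, upper)."""
    counts = [0] * num_classes
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < num_classes:      # ignore values outside all classes
            counts[index] += 1
    return counts

def text_histogram(counts, start, width):
    """Build one text line per class, with one '#' per observation."""
    lines = []
    for i, count in enumerate(counts):
        lower = start + i * width
        upper = lower + width
        lines.append(f"{lower}-{upper}: " + "#" * count)
    return lines

scores = [12, 18, 25, 27, 22, 31, 38]     # hypothetical exam scores
counts = frequency_table(scores, start=10, width=10, num_classes=3)
chart = text_histogram(counts, start=10, width=10)

for line in chart:
    print(line)
```

The same counts could feed a bar chart; what makes it a histogram is that the classes are adjacent numeric intervals rather than separate categories.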
