Lesson-2-Data-Presentation
PRESENTATION 641
Lesson 2
Jay-R A. Manamtam
October 17, 2024
TABLE OF CONTENTS
01 Tabular Description of Data
02 Graphical Description of Data
03 Measures of Central Tendency
01
Tabular Description
of Data
DATA
Parts of a Table
1. Table Number
2. The Title
3. The Box Head (column captions)
4. The Stub (row captions)
5. The Body
6. Prefatory Notes
7. Foot Notes
8. Source Notes
Types of Table
Commerce 20
Science 28
Total 73
Table 2: Number of Students in Different Rooms and Gender

            No. of Students
Room        Male   Female   Total
Arts          10       15      25
Commerce      10       10      20
Science       18       10      28
Total         38       35      73
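The row and column totals of a two-way table like Table 2 can be checked with a short script. This is a minimal sketch; the dictionary layout and variable names are illustrative assumptions, and only the counts come from the table.

```python
# Counts taken from Table 2: room -> (male, female)
counts = {
    "Arts": (10, 15),
    "Commerce": (10, 10),
    "Science": (18, 10),
}

# Row totals: number of students in each room
row_totals = {room: male + female for room, (male, female) in counts.items()}

# Column totals: number of students of each gender across all rooms
male_total = sum(male for male, _ in counts.values())
female_total = sum(female for _, female in counts.values())
grand_total = male_total + female_total

# Print the table with the stub (row captions) and box head (column captions)
print(f"{'Room':<10}{'Male':>6}{'Female':>8}{'Total':>7}")
for room, (male, female) in counts.items():
    print(f"{room:<10}{male:>6}{female:>8}{row_totals[room]:>7}")
print(f"{'Total':<10}{male_total:>6}{female_total:>8}{grand_total:>7}")
```

The computed totals should agree with the Total row and Total column of the table.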
Table 3: Number of Students in Different Rooms, Gender and Sections

            Section I                 Section II
Room        Male   Female   Total     Male   Female   Total
Arts          10       15      25       12       15      27
Commerce      10       10      20        8       12      20
Science       18       10      28       15       13      28
Total         38       35      73       35       40      75
02
Graphical
Description of Data
Graphical Representation
❖ Pie Chart
❖ Bar Graph
❖ Histogram
❖ Line Graph (Frequency Polygon)
❖ Pictograph
❖ Scatter plots
❖ Heatmaps
Pie Chart
Reference:
https://thirdspacelearning.com/gcse-maths/statistics/pictograph/
Scatter plot
❑ A scatter plot is a data visualization tool used to display individual
data points in a two-dimensional coordinate system.
❑ Scatter plots are commonly used for identifying relationships,
detecting outliers, pattern recognition, and prediction.
Reference:
https://www.math.net/scatter-plot
Heatmap
❑ A heatmap is a data visualization technique that represents data
values using colors on a grid.
❑ Heatmaps are particularly useful for displaying the intensity,
concentration, or relationships between data points within a matrix or
two-dimensional dataset.
❑ They are a powerful tool for exploratory data analysis and
communication of complex datasets.
Reference:
https://en.wikipedia.org/wiki/Heat_map
03
Measures of
Central Tendency
Measures of Central Tendency
or Average
A measure of central tendency is a single value that attempts to describe a set of
data by identifying the central position within that set of data.
Measures of central tendency or averages give us one value for the distribution
and this value represents the entire distribution. In this way averages convert a
group of figures into one value.
Collected and classified figures can be vast. To condense them we use an
average, which converts the whole set of figures into just one figure.
To compare two or more distributions, we have to find the representative
values of these distributions. These representative values are found with the
help of measures of central tendency.
Three Common
Measures of Central Tendency
Mean
• is equal to the sum of all the values in the data set divided by the number of values in the
data set.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where:
$\bar{x}$ = mean
$x_i$ = score of each respondent
$\sum_{i=1}^{n} x_i$ = sum of all scores of the respondents
$n$ = total number of data points
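As a small illustration of the mean formula, the sketch below sums a list of scores and divides by their count. The scores are hypothetical and the function name `mean` is chosen only for this example.

```python
def mean(scores):
    """Arithmetic mean: the sum of all scores divided by the number of scores."""
    if not scores:
        raise ValueError("mean requires at least one score")
    return sum(scores) / len(scores)


scores = [78, 85, 90, 72, 85]  # hypothetical scores of five respondents
x_bar = mean(scores)
print(x_bar)  # 410 / 5 = 82.0
```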
Weighted Mean
• is a kind of average in which each value contributes in proportion to its assigned weight. It is
commonly used to determine the central tendency of the responses to each item of an instrument.
$$M_w = \frac{\sum (vw)}{\sum w}$$
Where:
$M_w$ = computed weighted mean
$v$ = value, score, or actual data point
$w$ = weight assigned to each data point
$\sum (vw)$ = the sum of the products of each data point and its weight
$\sum w$ = the sum of the weights
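A weighted mean can be sketched as follows. The example assumes a hypothetical five-point rating item, where each rating is the value and the number of respondents who chose it is the weight. The data and names are illustrative only.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the sum of the weights must not be zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


ratings = [5, 4, 3, 2, 1]      # hypothetical rating scale (values)
respondents = [6, 9, 3, 1, 1]  # hypothetical number of respondents per rating (weights)
m_w = weighted_mean(ratings, respondents)
print(round(m_w, 2))  # 78 / 20 = 3.9
```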
Median
• The "middle" of a sorted list of numbers.
• It is the number that separates the higher half of the scores from the lower half.
• It is the middle value in a list of scores sorted in either ascending or
descending order.
• To find the median, arrange the scores in ascending or descending order and
get the middle score. The middle score is the median value. If there are two
middle scores, the median is the mean or average of these two middle scores.
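The procedure above can be sketched in a few lines: sort the scores, take the middle one, and average the two middle scores when the count is even. The scores are hypothetical.

```python
def median(scores):
    """Middle score of the sorted list; mean of the two middle scores if the count is even."""
    if not scores:
        raise ValueError("median requires at least one score")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median([7, 3, 9, 5, 1])  # sorted: 1 3 5 7 9 -> middle score is 5
even_result = median([7, 3, 9, 5])    # sorted: 3 5 7 9   -> (5 + 7) / 2 = 6.0
print(odd_result, even_result)
```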
Mode
• It is the number that appears most often in a set of scores.
• A set of scores may have one mode, more than one mode, or no
mode at all.
• In a frequency distribution table, chart, or graph, the mode is
the value with the highest frequency.
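The three cases (one mode, more than one mode, no mode) can be illustrated with a frequency count. This sketch assumes one common convention: a set with several distinct scores that all occur equally often is reported as having no mode. Other texts may treat that case differently.

```python
from collections import Counter


def modes(scores):
    """Return the list of modes (possibly empty) of a set of scores."""
    counts = Counter(scores)
    if not counts:
        return []
    # Assumed convention: several distinct scores, all equally frequent -> no mode
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 3, 3, 5, 7]))  # one mode: [3]
print(modes([1, 1, 2, 2, 3]))  # two modes: [1, 2]
print(modes([4, 5, 6]))        # no mode: []
```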
Advantage and
Disadvantage of Mean
Advantages
• Takes account of all values in the series.
• It is rigidly defined.
• It is suitable for further algebraic treatment.
• It is least affected by fluctuations of sampling.
• Most popular and well known average.
Disadvantages
• It is highly affected by extreme values.
• It cannot be determined by inspection.
• It cannot be computed accurately if any value/score is missing.
• It cannot be used when we are dealing with qualitative characteristics such as
honesty, beauty, etc.
• It cannot be calculated for an open ended distribution.
• Two data sets can have the same arithmetic mean while having completely
different implications.
Advantage and
Disadvantage of Median
Advantages
• Simple to determine and easy to understand.
• Less affected by outliers and skewed data.
• Can be easily represented graphically.
• Suitable for open ended distribution.
• Suitable for qualitative phenomena.
Disadvantages
• The process becomes tedious if the series contains a large number of items.
• It is a less representative average because it does not depend on all the
items in the series.
• It is affected much by fluctuations of sampling.
• It is not capable of algebraic treatment.
• In the case of an even number of observations, the median cannot be
determined exactly.
Advantage and
Disadvantage of Mode
Advantages
• Simple to determine and easy to understand.
• Can be easily located and represented graphically.
• Less affected by outliers and skewed data.
• Suitable for open ended distribution.
• Suitable for qualitative phenomena.
• The only average that can be used in nominal level data.
Disadvantages
• It is a less representative average because it does not depend on all the
items in the series.
• It is affected much by fluctuations of sampling.
• It is not capable of algebraic treatment.
• Mode is ill defined.
Two ways to represent and analyze data
in Statistics
Sturges' Rule:
Sturges' Rule is a simple formula to estimate the number of intervals (𝑘) in a frequency
distribution. It is given by the formula:
$$k = 1 + 3.322 \log_{10} N$$
Where 𝑁 is the number of data points in your dataset. Sturges' Rule is a quick and
straightforward way to get an estimate, but it may not work well for small or highly skewed
datasets.
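A quick way to try Sturges' Rule is sketched below, using its usual statement k = 1 + 3.322 log10(N). Rounding the result up to a whole number of intervals is an assumption made for this example; rounding to the nearest integer is also common.

```python
import math


def sturges_intervals(n):
    """Estimated number of intervals for n data points (usual form of Sturges' Rule)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Assumption: round up so that k is a whole number of intervals
    return math.ceil(1 + 3.322 * math.log10(n))


print(sturges_intervals(50))   # 1 + 3.322 * 1.699 = 6.64 -> 7
print(sturges_intervals(100))  # 1 + 3.322 * 2     = 7.64 -> 8
```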
Determine the Number
of Categories (Intervals)
$$k = \sqrt{N}$$
This rule provides a slightly larger number of intervals compared to Sturges' Rule and can be a
better choice for moderately sized datasets.
Scott's Rule:
Scott's Rule takes into account both the number of data points (𝑁) and the standard deviation (σ) of
the dataset. It is given by the formula for the interval width (ℎ):
$$h = \frac{3.49 \cdot \sigma}{N^{1/3}}$$
Freedman-Diaconis Rule:
Similar to Scott's Rule, the Freedman-Diaconis Rule uses the interquartile range (IQR) instead
of the standard deviation. The formula for the interval width (ℎ) is:
$$h = \frac{2 \cdot IQR}{N^{1/3}}$$
Like Scott's Rule, you can determine the number of intervals by dividing the range of the data
by ℎ.
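Both width-based rules can be compared on the same data with the sketch below. It assumes the usual statement of Scott's Rule, h = 3.49 s / N^(1/3), uses the sample standard deviation, and takes the quartiles from Python's `statistics.quantiles` with its default method. These are choices made for this example; other quartile conventions give slightly different widths.

```python
import math
import statistics


def scott_width(data):
    """Interval width by Scott's Rule (assumed form: 3.49 * s / N^(1/3))."""
    return 3.49 * statistics.stdev(data) / len(data) ** (1 / 3)


def freedman_diaconis_width(data):
    """Interval width by the Freedman-Diaconis Rule: 2 * IQR / N^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


def intervals_from_width(data, h):
    """Number of intervals: the range of the data divided by h, rounded up."""
    return math.ceil((max(data) - min(data)) / h)


data = list(range(1, 28))  # hypothetical dataset: the integers 1 to 27
h_scott = scott_width(data)
h_fd = freedman_diaconis_width(data)
print(h_scott, intervals_from_width(data, h_scott))
print(h_fd, intervals_from_width(data, h_fd))
```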
Expert Judgment:
Sometimes, it's best to rely on the expertise of a subject matter expert or domain knowledge.
If you have insights into the nature of your data or the research objectives, you may choose to
define custom intervals that make sense for your analysis.
Steps to Create a Frequency
Distribution Table
$$\text{Range} = HV - LV$$
Where:
$HV$ = highest value in the data
$LV$ = lowest value in the data
3. Calculate the Interval Width: Divide the range by the
number of intervals to determine the width of each
interval. Round this number to a convenient value that
makes sense for your data.
$$h = \frac{\text{Range}}{k}$$
Where:
$h$ = interval width
$k$ = number of intervals
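The range and interval width can be computed as in the sketch below. The scores are hypothetical, and rounding the width up to a whole number is just one convenient choice for integer scores.

```python
import math

scores = [62, 75, 88, 91, 67, 79, 84, 99, 60, 73]  # hypothetical scores
k = 4                                              # chosen number of intervals

data_range = max(scores) - min(scores)  # Range = HV - LV = 99 - 60 = 39
h = math.ceil(data_range / k)           # 39 / 4 = 9.75, rounded up to 10

print(data_range, h)
```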
4. Create Categories (Intervals): Define the intervals based on the width
and starting point. For example, if your data ranges from 60 to 99 and
you want 4 intervals, each with a width of 10, you can create the
following intervals: 60-69, 70-79, 80-89, 90-99.
5. Tally the Data: Go through your dataset and tally the data points that
belong to each interval. For ungrouped data, simply count how many
data points fall into each category. For grouped data, place each data
point into the appropriate interval.
6. Construct the Table: Create the frequency distribution table with two
main columns: "Categories (Intervals)" and "Frequency." List the
categories and write down the frequency for each category based on
your tallies.
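Steps 4 to 6 can be sketched as follows for the 60-69, 70-79, 80-89, 90-99 example. The function name `frequency_table` and the scores are hypothetical, and the sketch assumes integer scores so that consecutive intervals do not overlap.

```python
def frequency_table(data, start, width, k):
    """Tally integer data into k intervals of equal width, beginning at start."""
    # Step 4: create the intervals, e.g. (60, 69), (70, 79), ...
    intervals = [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]
    frequency = {interval: 0 for interval in intervals}
    # Step 5: tally each data point into the interval it belongs to
    for x in data:
        index = (x - start) // width
        if not 0 <= index < k:
            raise ValueError(f"{x} falls outside the defined intervals")
        frequency[intervals[index]] += 1
    return frequency


scores = [62, 75, 88, 91, 67, 79, 84, 99, 60, 73]  # hypothetical scores
table = frequency_table(scores, start=60, width=10, k=4)

# Step 6: construct the table with the two main columns
print(f"{'Categories (Intervals)':<24}{'Frequency':>9}")
for (low, high), f in table.items():
    print(f"{str(low) + '-' + str(high):<24}{f:>9}")
```

The frequencies should add up to the total number of data points, which is a useful check after tallying.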
Mean
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i \cdot x_i}{n}$$
Where:
$\bar{x}$ = mean
$x_i$ = the midpoint of each interval
$f_i$ = the frequency of each interval
$k$ = the number of intervals
$\sum_{i=1}^{k} f_i \cdot x_i$ = sum of the products of each interval's frequency and midpoint
$n$ = total number of data points
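The grouped mean can be worked through with a small frequency distribution. The classes and frequencies below are hypothetical, and each midpoint is taken as the average of the lower and upper class limits.

```python
def grouped_mean(classes):
    """classes: list of (lower_limit, upper_limit, frequency) tuples."""
    n = sum(f for _, _, f in classes)
    if n == 0:
        raise ValueError("classes must contain at least one data point")
    # Sum of frequency * midpoint over all intervals
    total = sum(f * (low + high) / 2 for low, high, f in classes)
    return total / n


classes = [(60, 69, 3), (70, 79, 3), (80, 89, 2), (90, 99, 2)]  # hypothetical table
x_bar = grouped_mean(classes)
# 3(64.5) + 3(74.5) + 2(84.5) + 2(94.5) = 775, and 775 / 10 = 77.5
print(x_bar)
```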
Measures of Central Tendency
for Grouped Data
Median
$$\tilde{x} = LCB_{Me} + \left( \frac{\frac{N+1}{2} - LCF_{bM}}{f_{Me}} \right) h$$
Where:
$\tilde{x}$ = median
$LCB_{Me}$ = lower class boundary of the median class
$N$ = total number of data points
$LCF_{bM}$ = less-than cumulative frequency of the class below the median class
$f_{Me}$ = frequency of the median class
$h$ = width of class interval
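The grouped median formula, with (N + 1) / 2 as the position of the median, can be sketched as below. The classes are hypothetical. The sketch assumes integer class limits of equal width, so the lower class boundary is the lower limit minus 0.5 and the width is the upper limit minus the lower limit plus 1.

```python
def grouped_median(classes):
    """classes: list of (lower_limit, upper_limit, frequency), in ascending order."""
    n = sum(f for _, _, f in classes)
    target = (n + 1) / 2                    # position of the median
    h = classes[0][1] - classes[0][0] + 1   # class width, e.g. 60-69 -> 10
    cumulative = 0                          # less-than cumulative frequency so far
    for low, _, f in classes:
        if cumulative + f >= target:        # this is the median class
            lcb = low - 0.5                 # lower class boundary
            return lcb + (target - cumulative) / f * h
        cumulative += f
    raise ValueError("classes must contain at least one data point")


classes = [(60, 69, 3), (70, 79, 3), (80, 89, 2), (90, 99, 2)]  # hypothetical table
x_median = grouped_median(classes)
# Median class is 70-79: 69.5 + ((5.5 - 3) / 3) * 10
print(round(x_median, 2))
```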
Mode
$$\hat{x} = LCB_{Mo} + \left( \frac{d_1}{d_1 + d_2} \right) h$$
Where:
$\hat{x}$ = mode
$LCB_{Mo}$ = lower class boundary of the modal class
$d_1$ = positive difference between the frequency of the modal class and the frequency
of the class below the modal class
$d_2$ = positive difference between the frequency of the modal class and the frequency
of the class above the modal class
$h$ = width of class interval
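The grouped mode formula can be sketched in the same way. The classes are hypothetical. The sketch assumes integer class limits, treats a missing neighbouring class as having frequency 0, and takes the first class with the highest frequency as the modal class when there is a tie.

```python
def grouped_mode(classes):
    """classes: list of (lower_limit, upper_limit, frequency), in ascending order."""
    freqs = [f for _, _, f in classes]
    m = freqs.index(max(freqs))            # first class with the highest frequency
    low, high, f_mo = classes[m]
    f_below = freqs[m - 1] if m > 0 else 0               # class below the modal class
    f_above = freqs[m + 1] if m < len(freqs) - 1 else 0  # class above the modal class
    d1 = f_mo - f_below
    d2 = f_mo - f_above
    if d1 + d2 == 0:
        raise ValueError("classes must contain at least one data point")
    h = high - low + 1                     # class width, e.g. 70-79 -> 10
    lcb = low - 0.5                        # lower class boundary
    return lcb + d1 / (d1 + d2) * h


classes = [(60, 69, 3), (70, 79, 7), (80, 89, 4), (90, 99, 2)]  # hypothetical table
x_mode = grouped_mode(classes)
# Modal class is 70-79: d1 = 7 - 3 = 4, d2 = 7 - 4 = 3, so 69.5 + (4 / 7) * 10
print(round(x_mode, 2))
```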