Data Collection and Display
Introduction.
Since the dawn of time, human beings have asked some fundamental questions: Who are we? Why are
we here? Are we alone in the universe? Do aliens exist? Why haven’t aliens visited Earth yet? Is there life
after death? Unable to answer any of the above questions, in this topic we will delve into the various
statistical diagrams used to display data.
Recall that about two years ago we covered a topic called data collection and presentation, in which we
achieved the following learning outcomes:
The learner should be able to:
• understand mode, mean and median as measures of location/central tendency and know how to
find them and when to use them. (k, u, s)
• understand range as a measure of dispersion/spread and how to find it. (k, u, s)
• draw and use frequency tables for ungrouped data. (u, s)
• draw and use frequency tables for grouped data. (u, s)
Data collection/display
Measures of central tendency are statistical measures that give us a single value to describe the center or
typical value of a dataset. The three main measures of central tendency are the mean, median, and mode.
These measures are essential for summarizing data and understanding its central characteristics. They
have various applications in different fields, from understanding the average income in a population to
analyzing test scores in education or sales figures in business.
Mean
The mean is the average of a set of numbers. To find the mean, you add up all the values in the dataset
and then divide by the total number of values. For example, let's consider a set of numbers: 5, 8, 10, 12,
and 15. The mean would be calculated as $\frac{5 + 8 + 10 + 12 + 15}{5} = \frac{50}{5} = 10$.
However, it's important to note that the mean can be influenced by extreme values (outliers) in the dataset,
which can skew its value, making it less representative of the typical value in some cases.
In general, the mean of the numbers $x_1, x_2, x_3, \dots, x_n$ is denoted by $\bar{x}$ and is given by:
$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$
A shorthand way of writing $x_1 + x_2 + x_3 + \cdots + x_n$ is $\sum_{i=1}^{n} x_i$.
So we can write $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.
For discrete data, $\bar{x} = \frac{\sum x}{n}$.
For discrete data in a frequency distribution, $\bar{x} = \frac{\sum fx}{\sum f}$.
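The two formulas for the mean can be tried out with a short Python sketch. The function names and the small frequency table are made up for illustration; only the numbers 5, 8, 10, 12 and 15 come from the text.

```python
def mean(values):
    """Mean of raw data: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def mean_from_frequencies(values, frequencies):
    """Mean of a frequency distribution: sum of f*x divided by sum of f."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    return total_fx / sum(frequencies)


# The worked example from the text.
print(mean([5, 8, 10, 12, 15]))  # 10.0

# Hypothetical frequency table: the value x occurs f times.
x_values = [1, 2, 3, 4]
f_values = [2, 5, 2, 1]
print(mean_from_frequencies(x_values, f_values))  # 2.2
```

Here the sum of fx is 22 and the sum of f is 10, which gives the mean of 2.2.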
Learner’s activity.
1. To obtain grade A, Ben must achieve an average of at least 70 in five tests. If his average mark for
four tests is 68, what is the lowest mark he can score in his fifth test and still obtain grade A?
2. The members of an orchestra were asked how many instruments they could play and the
following results were obtained.
2 5 2 4 1 1 1 2 1 3
3 2 1 2 1 1 2 4 3 2
1 2 3 1 4 2 3 1 1 2
Find the mean number of instruments played.
Grouping data
Grouped data refers to a method of organizing and presenting numerical information in classes or
intervals. It's often used when dealing with large sets of data to make it more manageable and
easier to analyze. Instead of presenting individual values, grouped data organizes values into
ranges or groups, allowing for a more concise representation of the dataset.
For instance, imagine you have a dataset of ages of people in a town. Instead of listing each
individual age, you might group the ages into intervals like 0-10, 11-20, 21-30, and so on. Here
the lower class limits are 0, 11, 21 and so on and the upper class limits are 10, 20, 30 and so on.
This grouping helps in drawing conclusions and insights without overwhelming detail.
When working with grouped data, it's crucial to consider the width or size of the intervals. The
choice of interval width can impact the insights drawn from the data. If intervals are too wide,
valuable information might be lost. Conversely, if intervals are too narrow, the data might become
too detailed, complicating analysis.
Analyzing grouped data involves various statistical measures and techniques. Measures such as
the mean, median, mode, and standard deviation can still be approximated from grouped data
using assumptions about the distribution within each interval. However, this approximation might
introduce some level of error compared to analyzing raw, individual data points.
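To see the size of this approximation error, here is a minimal Python sketch. It assumes, as is common, that every value in a class sits at the class midpoint. The ages and the function name are hypothetical.

```python
def grouped_mean(class_limits, frequencies):
    """Estimate the mean of grouped data using class midpoints."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    total_fx = sum(f * m for m, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


# Hypothetical raw ages and the classes used to group them.
raw_ages = [3, 7, 12, 15, 18, 24, 27, 29]
classes = [(0, 10), (11, 20), (21, 30)]

# Count how many ages fall in each class.
frequencies = [sum(lo <= age <= hi for age in raw_ages) for lo, hi in classes]

print(frequencies)                          # [2, 3, 3]
print(sum(raw_ages) / len(raw_ages))        # exact mean: 16.875
print(grouped_mean(classes, frequencies))   # grouped estimate: 16.625
```

The grouped estimate is close to the exact mean, but not equal to it, because the individual ages are replaced by midpoints.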
Grouped data simplifies the representation of large datasets, making it easier to interpret and draw
conclusions. Still, it's essential to strike a balance between simplification and retaining enough
detail to ensure accurate analysis and decision-making.
Here's an example:
Let's say you have the following grouped data representing the heights(cm) of a group of people:
150 - 160
161 - 170
171 - 180
In this case, the class boundaries would be determined by taking the midpoint between the upper
limit of one class and the lower limit of the next. For the first interval (150 - 160) and the second
interval (161 - 170), the class boundaries would be:
Class 1: 149.5 - 160.5
Class 2: 160.5 - 170.5
This way, the class boundaries help in avoiding confusion about which values belong to which
interval. The use of class boundaries becomes particularly important when dealing with
continuous data, as it ensures that each data point falls unambiguously into a specific interval.
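The rule for class boundaries can be sketched in Python as follows. The function name is illustrative, and the sketch assumes that the gap between consecutive classes is the same throughout the table.

```python
def class_boundaries(class_limits):
    """Turn (lower limit, upper limit) pairs into (lower, upper) class boundaries."""
    # Gap between the upper limit of the first class and the lower limit of the next.
    gap = class_limits[1][0] - class_limits[0][1]
    half_gap = gap / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in class_limits]


# The height classes from the example above.
heights = [(150, 160), (161, 170), (171, 180)]
for limits, bounds in zip(heights, class_boundaries(heights)):
    print(limits, "->", bounds)
```

Each upper boundary coincides with the lower boundary of the next class, so there are no gaps between classes.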
by adding all the scores together and dividing by the number of scores:
$\frac{85 + 78 + 92 + 88 + 95}{5} = \frac{438}{5} = 87.6$.
2. Population Analysis: It's utilized to understand characteristics of populations. For example,
the mean income of a population can provide insights into its economic status. If you have the
incomes of all individuals in a town, summing them up and dividing by the number of people
gives you the mean income.
3. Sample Analysis: In statistical sampling, the mean is used to estimate population parameters.
For instance, a researcher might collect data from a sample of voters to estimate the average
age of voters in a country. The mean age of the sample can provide an estimate of the
population mean.
4. Comparative Analysis: Mean is employed to compare different groups. For instance,
comparing the average monthly sales between different regions or the average test scores of
students in different schools can be done using means.
5. Time-Series Analysis: Mean is used to understand trends over time. In financial analysis, the
moving average, which is a type of mean, is employed to understand stock price trends over a
period.
6. Quality Control: In manufacturing, mean measurements of parts or products are used to
ensure consistency and quality. For example, a company producing screws might measure the
mean length of screws to ensure they meet specific standards.
It's important to note that while the mean is a valuable statistic, it can be influenced by extreme
values (outliers), potentially skewing the interpretation of the data. For instance, in a dataset of
salaries where most employees earn around $50,000, but a few executives earn millions, the mean
salary would be skewed higher due to these outliers.
Therefore, it's often helpful to use other measures of central tendency (like the median or mode)
alongside the mean to gain a more comprehensive understanding of the dataset.
Learner’s activity
1. Thirty bulbs were life-tested and their lifespans, to the nearest hour, are as follows:
167 171 179 167 171 165 175 179 169 171
177 169 171 177 173 165 175 167 174 177
172 164 175 179 179 174 174 168 171 168
a) Find the mean of lifespan by dividing their sum by 30.
b) Find the mean of lifespan by grouping the lifespan using class intervals 164 – 166, 167 – 169, and
so on.
c) Comment on your answers in (a) and (b) above.
2. The following table shows the distribution of marks of some students who took part in a
science quiz.
137 152 127 147 141 157 132 153 166 147 136 134
146 142 162 169 149 135 166 157 141 146 147 148
163 133 148 150 136 127 162 152 143 138 142 153
145 154 144 126 139 126 158 147 136 144 159 161
a) Construct a grouped frequency distribution table with intervals of class width 5 mm, starting with
the interval 125−129.
b) Calculate the mean length.
4. In an examination taken by 400 students, the scores were as shown in the following
distribution table:
5. The marks scored in an IQ test by 500 six year old children are given in the following table:
Marks          Number of children
60 − 79        81
80 − 99        103
100 − 119      127
120 − 139      99
140 − 159      90
• Descriptive Statistics: Mode helps describe the central tendency of a dataset, especially in
scenarios where identifying the most common value is crucial. For example:
In a survey asking people their favorite color, the mode would indicate the color most preferred by the
respondents.
In a classroom, the mode of test scores could identify the most common score achieved by students.
• Business and Economics: Modes are used in analyzing sales data, market trends, and consumer
behavior.
Identifying the most popular product sold in a month can help businesses strategize and manage their
inventory efficiently.
In income distribution studies, the mode income can indicate the salary range where most individuals fall.
• Healthcare and Medicine: In medical research, the mode can be used to analyze patient data.
Identifying the most prevalent blood type within a specific population or region aids in planning blood
donation drives and medical interventions.
In clinical trials, researchers might use the mode to identify the most common side effect experienced by
patients taking a particular medication.
• Education and Psychology: The mode is used to understand student performance or behavioral
trends.
In a study of learning styles, identifying the mode learning method preferred by students can assist
educators in designing effective teaching strategies.
In psychological studies, the mode might help identify the most common behavior among participants in
certain situations.
• Transportation and Urban Planning: Modes are utilized to analyze commuting patterns and urban
development.
Identifying the mode of transportation used by commuters in a city helps urban planners allocate
resources for public transportation.
Studying the mode of travel in different regions can assist in developing infrastructure tailored to specific
transportation needs.
Calculating the mode of grouped data.
Calculating the mode of grouped data involves determining the most frequently occurring class interval or
category in a given dataset. Grouped data is usually presented in the form of a frequency distribution
table, where data is organized into intervals or classes, and the frequency of each class is specified. The
modal class/interval is the class/interval with the highest frequency.
To find the mode of grouped data, you can use the following formula:
$\text{Mode} = L + \left(\frac{d_1}{d_1 + d_2}\right) \times c$
Writing $d_1 = f_1 - f_0$ and $d_2 = f_1 - f_2$, this is the same as:
$\text{Mode} = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times c$
Where:
$L$ is the lower class boundary of the modal class,
$f_1$ is the frequency of the modal class,
$f_0$ is the frequency of the class before the modal class,
$f_2$ is the frequency of the class after the modal class, and
$c$ is the class width of the modal class.
It's essential to understand that the mode obtained from this formula is an approximation. Due to the
grouping of data, the precise values within each class interval are not known. As a result, the mode is only
an estimate, providing a reasonable approximation of the most probable value within the modal class.
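The formula can be applied step by step in a short Python sketch. The function name and the frequency table are hypothetical. The sketch treats a missing neighbouring class as having frequency 0 and assumes the modal class is not tied with both of its neighbours.

```python
def grouped_mode(boundaries, frequencies):
    """Estimate the mode of grouped data.

    boundaries  -- list of (lower, upper) class boundaries
    frequencies -- frequency of each class, in the same order
    """
    i = frequencies.index(max(frequencies))   # position of the (first) modal class
    lower, upper = boundaries[i]
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                      # class before
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0   # class after
    c = upper - lower                                            # class width
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * c


# Hypothetical grouped data.
boundaries = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [3, 7, 4, 2]
print(round(grouped_mode(boundaries, frequencies), 2))  # 15.71
```

Here the modal class is 10 to 20, so L = 10, f1 = 7, f0 = 3, f2 = 4 and c = 10, giving 10 + (4/7) × 10.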
Learner’s activity
1. The table below shows the volume of petrol(litres) William used for his bike in the last 20 days.
No. of Families 7 8 2 2 1
3. The information on the observed lifetimes (in hours) of 225 electrical components are given in
the following frequency table. Find the modal lifetimes of the electrical components.
Frequency 10 35 52 61 38 29
4. The following distribution table shows the number of runs scored by some top batsmen of the
world in one-day international cricket matches. Find the mode of the given data.
3000 – 4000 4
4000 – 5000 18
5000 – 6000 9
6000 – 7000 7
7000 – 8000 6
8000 – 9000 3
9000 – 10000 1
10000 – 11000 1
A histogram or frequency histogram is a pictorial representation of the numerical data with rectangular
bars. Like a bar graph, the height of each bar depicts the frequency of the data values. A histogram differs
from a bar graph in that the bars are drawn with no space in between them.
To construct a histogram using the grouped data, we plot class boundaries on the x-axis and frequencies
on the y-axis. Each bar represents a class interval, and the height of the bar corresponds to the frequency
of that interval.
Note:
It's important to note that estimating the mode using a histogram is a straightforward method but provides
only an approximation. The exact values within each class interval are unknown due to the grouping of
data, and the mode obtained is based on visual inspection. While this method is more accessible than
using formulas, users should be aware that the mode is still an estimate and might not precisely represent
the most frequent value in the dataset.
Given below is the table showing the approximate lengths, in mm, of 40 leaves taken from different parts
of a certain species.
Number of leaves 1 4 8 10 8 7 2
Represent the data in the form of a histogram.
Estimating the mode of grouped data using a histogram involves visually identifying the class interval
(bar) with the highest frequency. While this method is more intuitive than using a formula, it still provides
an estimate rather than an exact value due to the nature of grouped data. Here's a step-by-step guide for
estimating the mode using a histogram:
• Draw a histogram of the data and identify the modal class. The modal class is the class with the
highest bar.
• Draw a straight line connecting the top-left corner of the tallest bar to the top-left corner of the
bar representing the frequency of the following class.
• Draw a straight line connecting the top-right corner of the tallest bar to the top-right corner of the
bar representing the frequency of the class immediately before.
• From the point of intersection of these lines, draw a vertical line down to the 𝑥-axis. This value is
the estimate for the mode.
An example of this is given below. Here, the mode of the frequency distribution has been estimated
graphically as 27.
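The two straight lines in this construction can also be intersected by calculation. The Python sketch below is only an illustration, with a hypothetical function name and hypothetical frequencies. It finds where the lines cross for a modal class with lower boundary `lower`, width `width` and frequency `f1`, whose neighbouring classes have frequencies `f0` (before) and `f2` (after).

```python
def mode_from_histogram_lines(lower, width, f0, f1, f2):
    """Intersect the two lines drawn across the tallest bar of a histogram.

    Line A joins (lower, f1) to (lower + width, f2).
    Line B joins (lower, f0) to (lower + width, f1).
    Returns the (x, y) coordinates of the intersection.
    """
    slope_a = (f2 - f1) / width
    slope_b = (f1 - f0) / width
    # A: y = f1 + slope_a * (x - lower);  B: y = f0 + slope_b * (x - lower)
    x = lower + (f1 - f0) / (slope_b - slope_a)
    y = f1 + slope_a * (x - lower)
    return x, y


# Hypothetical modal class 10 to 20 with frequency 7, and neighbours 3 and 4.
x, y = mode_from_histogram_lines(lower=10, width=10, f0=3, f1=7, f2=4)
print(round(x, 2))  # 15.71
```

The x-coordinate of the intersection equals L + (f1 − f0)/(2f1 − f0 − f2) × c, so the graphical method and the grouped-mode formula agree.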
Consider the examples below:
1. The speeds, in kilometres per hour, of cars driving on a road were recorded in the table below and
are represented in the histogram.
Frequency 8 15 17 25 7
2. The table represents the time taken by a group of people to travel to work.
Frequency 10 15 7 3
Learner’s activity.
1. The table shows the distribution of ages of 100 people attending a school concert. Represent this
data on a histogram and hence estimate the mode.
Frequency 22 35 31 10 2
2. The table shows the results of a survey on the weekly pocket money of 100 sixteen-year-olds.
Draw a histogram for this data and use it to estimate the mode.
Weekly earnings ($) 20-30 30-40 40-50 50-60 60-70 70-80
Frequency 45 20 11 11 10 3
3. The table shows the distribution of the average marks of 40 children in the end-of-year
examinations. Draw a histogram to represent the frequency table and hence estimate the modal mark.
Average marks 0-19 20-39 40-59 60-79 80-99
Frequency 2 4 16 15 3
4. In a survey, the length of the ring finger on the right hand of a sample of adults was measured (to
the nearest mm). Draw a histogram to represent the frequency table and hence estimate the mode.
Length (mm) 45-55 55-65 65-75 75-85 85-95
Frequency 4 10 47 32 7
5. Bags of clips were weighed to the nearest gram. Draw a histogram to represent the frequency
table and hence estimate the mode.
Mass (g) 50-59 60-69 70-79 80-89 90-99
Frequency 2 6 22 12 8
Median.
The median is a statistical measure of central tendency that represents the middle value of a dataset when
it is ordered in either ascending or descending numerical order. Unlike the mean, which is the average of
all values, the median is less sensitive to extreme values and provides a more robust measure of central
location.
Calculating the median of ungrouped data involves finding the middle value when the data set is arranged
in either ascending or descending order. If the number of observations is odd, the median is the middle
value. If the number of observations is even, the median is the average of the two middle values.
Here are the steps to calculate the median for ungrouped data:
Order the data set in either ascending or descending order. This step is crucial for identifying the middle
values.
Count the total number of observations ($n$) in the data set. This determines whether the data set
has an odd or even number of elements.
If $n$ is odd, the median (M) is the middle value:
$M = \text{value at position } \frac{n+1}{2}$
If $n$ is even, the median (M) is the average of the two middle values:
$M = \frac{\text{value at position } \frac{n}{2} + \text{value at position } \left(\frac{n}{2} + 1\right)}{2}$
Calculating the median for ungrouped data is a straightforward process when the data set is relatively
small. However, for larger datasets or grouped data, additional techniques may be employed to simplify
the calculation.
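The steps above can be sketched in Python as follows. The function name and the two small data sets are made up for illustration.

```python
def median(values):
    """Median of ungrouped data."""
    ordered = sorted(values)          # step 1: order the data
    n = len(ordered)                  # step 2: count the observations
    middle = n // 2
    if n % 2 == 1:
        # n odd: the value at position (n + 1) / 2, counting from 1
        return ordered[middle]
    # n even: average of the values at positions n/2 and n/2 + 1
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([9, 3, 7, 1, 5]))  # 5
print(median([9, 3, 7, 1]))     # 5.0
```

Python lists are indexed from 0, which is why position (n + 1)/2 in the formula becomes index n // 2 in the code.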
For discrete data in an ungrouped frequency distribution, a cumulative frequency column is constructed
for purposes of finding the median.
Cumulative frequency represents the running total of frequencies up to a certain point in a dataset.
Cumulative frequency is particularly useful for understanding the distribution of values and identifying
patterns, especially when dealing with grouped data.
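As an illustration, the Python sketch below builds a cumulative frequency column and uses it to find the median of an ungrouped frequency distribution. The table and the function name are hypothetical, and the values are assumed to be listed in ascending order.

```python
from itertools import accumulate


def median_from_frequency_table(values, frequencies):
    """Median of discrete data given as a frequency table (values ascending)."""
    cumulative = list(accumulate(frequencies))   # running totals
    n = cumulative[-1]                           # total number of observations

    def value_at(position):
        # The first value whose cumulative frequency reaches the position.
        for value, running_total in zip(values, cumulative):
            if position <= running_total:
                return value

    if n % 2 == 1:
        return value_at((n + 1) // 2)
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2


# Hypothetical table: a value and how many times it occurs.
values = [0, 1, 2, 3, 4]
frequencies = [2, 5, 6, 4, 3]
print(list(accumulate(frequencies)))                     # [2, 7, 13, 17, 20]
print(median_from_frequency_table(values, frequencies))  # 2.0
```

With 20 observations the median is the average of the 10th and 11th values. Both lie in the row whose cumulative frequency first reaches 13, so the median is 2.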
Create a cumulative frequency table for the following information, which represents the number of hours
per week that Arjun plays indoor games:
Learner’s activity
Number of bolts 8 11 54 20 17 6 4
4. The shoe size of 155 people was recorded and the raw data was presented in the form of the
following frequency table:
Size of shoe 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0
Frequency 10 18 22 25 40 15 10 8 7
5. The time taken, in minutes, for a group of children to complete a puzzle is recorded.
Number of children 8 4 3 10 3
Find the median time taken by the group of children to complete the puzzle.
9. The numbers 3, 7, 13, 14, 16, 19, 20 and 𝑥 are arranged in ascending order. If the mean of the
numbers is equal to the median, find the value of 𝑥.
10. The median of a set of eight numbers is 4.5. Given that seven of the numbers are 7, 2, 13, 4, 8, 2
and 1, find the eighth number.
11. If 10, 13, 15, 18, (𝑥+1), (𝑥+3), 30, 32, 35 and 41 are the observations in the ascending order with
median 24, find the value of 𝑥.
12. The temperatures, taken at midnight, of six consecutive nights in Bengaluru are given as follows:
22° C, 24° C, 26° C, 20° C, 23° C, 22° C.
i. State the median temperature.
ii. If 22° C is added to the above set of data, what will be the new median temperature?
13. As part of the school's Earth Day celebration, 100 students each sowed 5 seeds in one of 100
planters. One week later, the number of seeds germinating in each planter was recorded and the
results are given in the table.
Number of planters 10 20 30 25 10 5
i. Write down the total number of seeds that were sown.
ii. Find the fraction of the seeds that did not germinate.
iii. Calculate the median of the distribution.
14. The number of errors in the first draft of Tina's thesis is shown in the table.
Number of errors 0 1 2 3 4 5 6
Number of pages 11 3 10 7 4 3 2
i. How many pages does the thesis contain?
ii. Find the percentage of pages with fewer than 2 errors.
iii. Calculate the median number of errors made by Tina.
15. The number of magazines read by a group of women in a week is recorded.
Number of magazines 0 1 2 3
Number of women 5 2 1 𝑥
The median can be estimated within the median class using the formula:
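$\text{Median} = L + \left(\frac{\frac{N}{2} - F_b}{f_m}\right) \times c$
Where:
$L$ is the lower class boundary of the median class,
$N$ is the total number of observations ($\sum f$),
$F_b$ is the cumulative frequency of the class before the median class,
$f_m$ is the frequency of the median class, and
$c$ is the class width of the median class.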
4. Interpret the Result:
The calculated value represents the estimated median for the grouped data. It is the point
within the median class where half of the data lies below and half above.
It's important to note that this method provides an approximation of the median for grouped
data since the exact value might not be known. However, it gives a meaningful estimate
based on the available information.
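A minimal Python sketch of this calculation is given below. The function name and the frequency table are hypothetical. The median class is taken to be the first class whose cumulative frequency reaches N/2.

```python
from itertools import accumulate


def grouped_median(boundaries, frequencies):
    """Estimate the median of grouped data.

    boundaries  -- list of (lower, upper) class boundaries
    frequencies -- frequency of each class, in the same order
    """
    cumulative = list(accumulate(frequencies))
    half = cumulative[-1] / 2                     # N / 2
    # Median class: first class whose cumulative frequency reaches N / 2.
    for i, running_total in enumerate(cumulative):
        if running_total >= half:
            break
    lower, upper = boundaries[i]
    cf_before = cumulative[i - 1] if i > 0 else 0  # F_b
    return lower + (half - cf_before) / frequencies[i] * (upper - lower)


# Hypothetical grouped data with N = 30.
boundaries = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [5, 8, 12, 5]
print(round(grouped_median(boundaries, frequencies), 2))  # 21.67
```

Here N/2 = 15 and the cumulative frequencies are 5, 13, 25 and 30, so the median class is 20 to 30 with L = 20, F_b = 13, f_m = 12 and c = 10.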
Learner’s activity
1. The following frequency distribution shows the number of points scored per game by 60
basketball players. Calculate the median.
Number of students 8 20 36 24 12
Mid-points 5 15 25 35 45 55
Frequency 7 10 23 51 6 3
Frequency 2 4 8 9 4 2 1
Frequency 4 5 13 20 14 8 4
6. The median of the following data is 52.5. Find the value of 𝑓1 and 𝑓2 if the total frequency is 100.
Class 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90 90-100
Frequency 2 𝑓1 9 12 17 𝑓2 15 9 7 4
Estimating the median of grouped data using a cumulative frequency curve, or ogive, involves creating a
graphical representation of the cumulative frequencies and then identifying the point on the curve
corresponding to the median. This method is useful when the exact values are not available, and the data
is presented in intervals or classes.
The data organized in the form of a cumulative frequency distribution may be graphically represented
through the cumulative frequency graph. The technique of drawing the cumulative frequency polygons
and cumulative frequency curves or ogives is more or less the same. We plot the cumulative frequencies
on the 𝑦-axis and upper class boundaries on the 𝑥-axis. The only difference is that the cumulative
frequency polygon is obtained by joining the points with line segments, while the cumulative frequency
curve is obtained by joining the points with a smooth freehand curve.
Add a column for cumulative frequency to the frequency distribution table. Calculate the
cumulative frequencies, representing the running total of frequencies as you move down the
table.
Plot the cumulative frequencies on a graph. The x-axis represents the class boundaries, and
the y-axis represents the cumulative frequencies. Plot the cumulative frequencies with the
corresponding upper class boundary. Connect the points with a smooth curve.
Identify the point on the cumulative frequency curve that corresponds to the median. For $n$
observations, this is the point whose cumulative frequency, read on the y-axis, is $\frac{n}{2}$,
that is, $\frac{\sum f}{2}$.
Draw a horizontal line from this value on the y-axis across to the curve. From the point where it
meets the curve, draw a vertical line down to the x-axis. The value where this vertical line meets
the x-axis is the estimated median for the grouped data.
This graphical method provides a visual and approximate way to estimate the median for grouped data. It
is particularly useful when a cumulative frequency curve is available, and it offers insights into the central
tendency of the dataset based on the graphical representation of cumulative frequencies.
1. The cumulative frequency table below shows the marks of senior four students in an
examination graded out of 70.
Marks    Class boundaries    Frequency    Cumulative frequency
20-24    19.5-24.5           1            1
25-29    24.5-29.5           2            3
30-34    29.5-34.5           4            7
35-39    34.5-39.5           8            15
40-44    39.5-44.5           11           26
45-49    44.5-49.5           9            35
50-54    49.5-54.5           7            42
55-59    54.5-59.5           4            46
60-64    59.5-64.5           3            49
65-69    64.5-69.5           1            50
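Reading the median off the graph can be imitated with a short Python sketch. It assumes the plotted points are joined by straight lines (a cumulative frequency polygon), so a value read from a smooth freehand curve may differ slightly. The function name is illustrative. The data are the upper class boundaries and cumulative frequencies from the table above, with the starting point (19.5, 0) added.

```python
def estimate_from_ogive(upper_boundaries, cumulative, target):
    """Find the x-value at which the cumulative frequency polygon reaches `target`."""
    points = list(zip(upper_boundaries, cumulative))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Straight-line interpolation between two plotted points.
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target lies outside the plotted cumulative frequencies")


# Upper class boundaries and cumulative frequencies from the table,
# starting from the lowest class boundary with cumulative frequency 0.
upper = [19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5, 54.5, 59.5, 64.5, 69.5]
cum = [0, 1, 3, 7, 15, 26, 35, 42, 46, 49, 50]

median_mark = estimate_from_ogive(upper, cum, cum[-1] / 2)
print(round(median_mark, 1))  # 44.0
```

With 50 students the target cumulative frequency is 25, which lies between the points (39.5, 15) and (44.5, 26), so the estimate is 39.5 + (10/11) × 5.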
2. Draw the ogive for the data given below and use it to determine the median income.
Number of employees 40 68 86 120 90 40 26
The cumulative frequency distribution for the given distribution is given below.
600-700 40 40
700-800 68 108
800-900 86 194
1000-1100 90 404
1200-1300 40 444
1300-1400 26 470
Learner’s activity.
1. Draw an ogive for the following frequency distribution of test marks for a group of 32
students and hence estimate the median mark.
Marks (%) 1-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100
Number of students 1 2 4 7 5 8 2 2 1 0
2. The table below shows the frequency distribution of test marks for 120 students.
Marks (%) 1-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100
Number of students 1 6 8 15 17 24 22 15 9 3
(a) Draw a cumulative frequency curve for this data and hence estimate the median.
(b) Use your curve in (a) to find the number of students that scored above 75%.
(c) Find the interquartile range.
3. The results for the long jump at school sports day are recorded.
Distance (cm) 170-180 180-190 190-200 200-210 210-220 220-230 230-240 240-250
Number of students 2 6 9 7 15 8 8 2
Draw a cumulative frequency curve and hence obtain the median distance.
4. The temperature in °C recorded over a 60-day period is shown in the table below:
Number of days 1 6 16 18 16 1 1 1
5. The local gym conducted a survey on the age distribution of its 800 members.
Draw a cumulative frequency curve for this data and hence determine the median.
The median income is often used to measure the economic well-being of a population. It is less affected
by extremely high or low incomes, providing a more accurate representation of the typical earning level.
Real Estate:
In the real estate market, the median home price is frequently reported to give a sense of the typical cost
of housing in a specific area. This helps potential buyers and sellers understand the market without being
skewed by exceptionally expensive or inexpensive properties.
Education:
The median score on standardized tests is used to evaluate the performance of students. It offers a
measure that is less sensitive to extreme scores, giving a more realistic indication of the typical student's
achievement level.
Healthcare:
In medical research, the median is often employed to describe the central tendency of patient
characteristics, such as age or recovery time. This helps researchers avoid the influence of outliers and
provides a more representative measure.
Demographics:
When studying demographics, the median age is a useful metric. It gives a better indication of the typical
age in a population, minimizing the impact of outliers that might be present in mean age calculations.
Traffic Analysis:
In transportation planning, the median travel time or distance can be used to represent the typical
commuting experience. This is beneficial for understanding the average conditions without being overly
influenced by extreme cases.
Survival Analysis:
In medical and social sciences, the median survival time is often used to describe the time until an event
of interest (e.g., death or failure) occurs. It provides a more robust estimate, especially when the survival
data is skewed.
Customer Reviews:
In e-commerce, the median rating of a product can be more informative than the mean, as it is less
sensitive to a few exceptionally high or low reviews. This gives a better representation of the typical
customer experience.
Median earnings are commonly used to analyze salary and wage data. Unlike the mean, the median
provides a better sense of the middle ground, making it valuable for understanding income distribution.
In these and many other scenarios, the median is preferred when the dataset is skewed or contains
outliers, as it provides a more robust measure of central tendency that reflects the typical value in a
dataset.
Measures of dispersion/variability/spread.
Measures of dispersion are statistical metrics that quantify the spread or variability of a dataset. These
measures provide insights into how individual data points are distributed around the central tendency,
such as the mean or median. Common measures of dispersion include the range, interquartile range,
variance, and standard deviation.
It is apparent, but nevertheless worth noting, that the measures of central tendency discussed earlier
cannot be used to measure the spread of a data set.
For example, each of these sets of numbers has mean 7 but the spread of each set is different.
(a) 7, 7, 7, 7, 7
(b) 4, 6, 6.5, 7.2, 11.3
Range
The range is the simplest measure of dispersion and is calculated as the difference between the maximum
and minimum values in a dataset.
While easy to compute, the range is sensitive to extreme values and may not provide a robust
representation of variability in the presence of outliers.
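As a quick check, the Python sketch below computes the mean and the range of the two sets (a) and (b) given earlier. The function name is illustrative.

```python
def data_range(values):
    """Range: the largest value minus the smallest value."""
    return max(values) - min(values)


set_a = [7, 7, 7, 7, 7]
set_b = [4, 6, 6.5, 7.2, 11.3]

for name, data in (("a", set_a), ("b", set_b)):
    mean = sum(data) / len(data)
    print(name, "mean:", round(mean, 1), "range:", round(data_range(data), 1))
```

Both sets have a mean of 7, but set (a) has range 0 while set (b) has range 11.3 − 4 = 7.3.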
Interquartile range
The interquartile range is a more robust measure that focuses on the middle 50% of the data. It is
calculated as the difference between the upper quartile ($Q_3$) and the lower quartile ($Q_1$).
Interquartile range = $Q_3 - Q_1$
The semi-interquartile range (SIQR) is a measure of dispersion derived from the interquartile range
(IQR). While the IQR gives the spread of the central 50% of the data, the SIQR is half of the IQR, which
is the average distance of the two quartiles from the median. Like the IQR, it is particularly useful when
the distribution of data is skewed or contains outliers.
$\text{semi-interquartile range} = \frac{Q_3 - Q_1}{2}$
The IQR is less affected by extreme values, making it useful for datasets with outliers.
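A short Python sketch of these two measures follows. It uses `statistics.quantiles` from the standard library with its default "exclusive" method. This is only one of several conventions for locating quartiles, so other textbooks or software may give slightly different values. The data set and function names are hypothetical.

```python
import statistics


def interquartile_range(values):
    """IQR = Q3 - Q1, using the default method of statistics.quantiles."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def semi_interquartile_range(values):
    """SIQR = (Q3 - Q1) / 2."""
    return interquartile_range(values) / 2


data = [1, 2, 3, 4, 5, 6, 7]                 # hypothetical data set
print(statistics.quantiles(data, n=4))       # [2.0, 4.0, 6.0]
print(interquartile_range(data))             # 4.0
print(semi_interquartile_range(data))        # 2.0
```

For this data set the quartiles are 2, 4 and 6, so the IQR is 4 and the SIQR is 2.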
We shall discuss more sophisticated measures of variability like variance and standard deviation in senior
five.
In manufacturing processes, the range is often used to monitor the consistency and precision of products.
A narrow range indicates that the manufacturing process is producing items with similar specifications,
while a wide range may suggest variations that need to be addressed.
Educational Assessments:
In education, the range of scores on tests and exams can provide insights into the diversity of performance
among students. Teachers and educational institutions use the range to assess the spread of scores and
identify areas where additional support may be needed.
Meteorologists use the range of temperatures to describe the variability in weather conditions. A larger
temperature range within a day or across seasons indicates more significant fluctuations, affecting climate
patterns and helping in weather prediction.
Financial Analysis:
In finance, the range is employed to assess the volatility of financial instruments, such as stocks or
currencies. Traders and investors use the range to understand the potential risk and return associated with
particular assets.
Coaches and analysts in sports use the range of performance metrics like scores, running times, or
distances covered to evaluate the consistency of athletes. A narrow range in performance may suggest a
high level of skill and stability.
Real Estate:
In the real estate market, the range of property prices in a particular area provides information about the
variability of housing costs. It helps potential buyers and sellers understand the market dynamics and
make informed decisions.
Medical researchers use the range in various studies, such as clinical trials or health surveys. For example,
the range of blood pressure measurements can indicate the variability in a population, contributing to the
understanding of health conditions.
Businesses involved in supply chain management use the range to evaluate the variability in delivery
times, production rates, or inventory levels. Understanding the range helps in optimizing processes and
ensuring a more efficient supply chain.
Environmental Monitoring:
Ecologists and environmental scientists use the range to analyze ecological data, such as species diversity
or pollutant concentrations. A wide range in these variables may indicate ecosystem instability or
environmental stress.
Human Resources:
In HR analytics, the range of salaries within a company can provide insights into pay equity and
compensation structures. It helps organizations identify potential disparities and make data-driven
decisions to ensure fair and competitive compensation.
Understanding the range of data is crucial in various fields, as it provides a quick and straightforward
measure of the spread or variability in a dataset, allowing for informed decision-making and analysis.
END.