
Statistics

Statistics is a mathematical discipline focused on data collection, analysis, interpretation, and presentation, essential for informed decision-making across various fields such as business, healthcare, and social sciences. It employs methods like descriptive and inferential statistics to summarize data, validate hypotheses, and predict outcomes. Understanding both internal and external data sources enhances statistical analysis, aiding in risk management and problem-solving.

Uploaded by

ishi
Copyright
© All Rights Reserved

Why is Statistics Important?

Statistics is a branch of mathematics that deals with the collection, analysis,
interpretation, and presentation of data in a more understandable and useful form.
Statistical techniques let us present data in a readable way and draw conclusions
from it. Statistics is not confined to mathematics; it is used in almost every area.
From business and economics to healthcare and the social sciences, statistics provides
the tools and methodologies necessary for making informed decisions based on data.

What is Statistics?

Statistics is a branch of mathematics that deals with the collection, analysis, interpretation,
and presentation of data in a more understandable and useful form. It serves various
purposes in different fields, but it is mainly associated with organizing and
analyzing data. It is also used to test hypotheses and to estimate the
probability of outcomes.

Why is Statistics Important?

The importance of statistics is described below:

Informed Decision Making

• Statistics provides methods and tools for analyzing data, enabling better,
evidence-based decisions in fields such as business, healthcare, and public policy.

Data Interpretation

• It helps in interpreting complex data sets, making it easier to understand trends,
patterns, and relationships within the data. These insights are also used in machine
learning models to produce more accurate results.

Problem Solving

• Statistics equips individuals with the skills to solve real-world problems by
applying appropriate statistical techniques to analyze and interpret data.

Statistics in Decision Making

The role of statistics in decision making is described below:

Informed Decision Making

Informed decision making is a process in which a choice is based on the proper collection,
analysis, and interpretation of data gathered for a specific purpose. Statistical tools
and methods transform raw, unorganized data into a form that can be analyzed and from
which insights can be drawn. Analyzing data in this way helps us make better and more
accurate decisions about future events. The approach is used in areas such as business,
healthcare, and public policy, where decisions have significant and far-reaching consequences.

Risk Management and Assessment

Risk management and assessment involve identifying potential risks, evaluating their
likelihood and impact, and implementing strategies to mitigate or manage them. This
process is fundamental to maintaining organizational stability and achieving objectives. Key
components include risk identification, where potential issues are recognized; risk analysis,
which quantifies the probability and potential impact of these risks using statistical
methods such as probability distributions and regression analysis; and risk evaluation,
which prioritizes risks based on their significance. Techniques such as Monte Carlo
simulations, fault tree analysis, and sensitivity analysis are often used to model risk
scenarios and develop mitigation strategies. By systematically assessing and managing
risks, organizations can minimize negative outcomes and capitalize on opportunities,
ensuring long-term success and resilience.
Statistical Methods and Tools

The various methods and tools used in statistics are mentioned below:

Descriptive Statistics

Descriptive statistics describe and summarize the main numerical features of
the data we have. They include:

Measures of central tendency: These locate the central point of the data set.

• Mean: The average value of a dataset.

• Median: The middle value when data is ordered.

• Mode: The most frequently occurring value.

Measures of variability: These describe how widely the data is spread.

• Standard Deviation: A measure of the dispersion or spread of data points around
the mean.

• Variance: The average of the squared differences from the mean.

Together, the measures of central tendency (mean, median, mode) and the measures of
variability (standard deviation, variance) summarize the main features of a dataset.
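
As a minimal sketch of the measures listed above, the following uses Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration; the population forms of variance and standard deviation are used because the text defines variance as the average of the squared differences from the mean.

```python
import statistics

# Hypothetical sample, invented for illustration.
data = [10, 30, 40, 20, 50, 30]

mean = statistics.mean(data)            # arithmetic average
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequently occurring value
variance = statistics.pvariance(data)   # average squared deviation from the mean
std_dev = statistics.pstdev(data)       # square root of the variance

print(mean, median, mode, variance, std_dev)
```

If the data were a sample from a larger population, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would usually be the more appropriate choice.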

Inferential Statistics

Inferential statistics is the other main part of statistics. It enables us to make predictions
or inferences about a large population based on a sample of data taken from
that population. It includes methods like:

• Hypothesis Testing: Determining whether there is enough evidence to reject a
null hypothesis.

• Confidence Intervals: Estimating the range within which a population parameter
is likely to lie.

• Regression Analysis: Exploring the relationship between variables and predicting
future values.

These methods are used for drawing conclusions that extend beyond the immediate data,
allowing for generalized findings and informed predictions.
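
To make the idea of a confidence interval concrete, here is a rough sketch that estimates a 95% interval for a population mean from a sample. The sample values are hypothetical, and the normal approximation is an assumption made for brevity; for a sample this small a t-distribution would normally be preferred.

```python
import math
import statistics

# Hypothetical sample of 10 measurements, invented for illustration.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

n = len(sample)
sample_mean = statistics.mean(sample)
std_error = statistics.stdev(sample) / math.sqrt(n)  # sample std. dev. / sqrt(n)

# z-value for a two-sided 95% interval under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * std_error
upper = sample_mean + z * std_error
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```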

In statistics, internal data comes from within an organization, while external data comes
from outside the organization:

• Internal data

Data that is generated and used within a company or organization. This data can come
from areas such as operations, maintenance, personnel, and finance. Examples of
internal data include expense reports, cash flow reports, production reports, and
budget variance analysis. Internal data is usually stored in spreadsheets, databases, or
customer relationship management (CRM) systems.

• External data

Data that is collected outside an organization from sources like press releases, statistics
departments, government databases, and market research. Examples of external data
include market research reports, social media data, and government data.

Internal data is free to the company, and it can be very relevant and telling. External data can
be purchased from third-party providers or gathered from publicly available sources.

In statistics, **internal** and **external sources of data** refer to the origin from where
data is collected for analysis.

### Internal Sources of Data:

These are data collected from within the organization or system that is being studied.
Internal sources tend to be more specific and relevant to the particular needs of the entity
collecting the data. Examples include:

1. **Sales records** – Data from transactions or sales made by the company.

2. **Employee records** – Information about staff, such as attendance, salary, and performance.

3. **Production reports** – Data related to goods manufactured, costs, and efficiency.

4. **Customer databases** – Data about customer preferences, purchase history, or feedback.

5. **Inventory records** – Data on stock levels, order frequency, or warehouse performance.

### External Sources of Data:

These are data obtained from sources outside the organization. External data is often used
to complement or enrich internal data. Examples include:

1. **Government publications** – Census data, economic reports, or labor statistics.

2. **Industry reports** – Studies or reports on market trends and industry performance.

3. **Surveys and research reports** – Data collected by research agencies or institutions.

4. **Academic publications** – Studies published by scholars in journals or research papers.

5. **Public databases** – Open-source data from organizations like the World Bank,
WHO, or Eurostat.

6. **Social media and web data** – Data from online platforms, user behavior, and
engagement metrics.

Both internal and external sources play critical roles in forming a comprehensive dataset
for statistical analysis. Internal data provides direct insights from within the organization,
while external data helps in benchmarking and understanding broader trends.
Frequency distribution is a method of organizing and summarizing data to show the
frequency (count) of each possible outcome of a dataset. It is an essential tool in statistics
for understanding the distribution and pattern of data. There are several types of
frequency distributions used based on the nature of the data and the analysis
required.

It is not always possible for an investigator to easily measure the items of a series or set of
data. To make the data simple and easy to read and analyze, the items of the series are
placed within a range of values or limits. In other words, the given raw set of data is
categorized into different classes with a range, known as Class Intervals. Every item of
the given series is put against a class interval with the help of tally bars. The number of
items occurring in the specific range or class interval is shown under Frequency against
that particular class range to which the item belongs.

Frequency Distribution Examples

The marks of a class of 20 students are 11, 27, 18, 14, 28, 18, 2, 22, 11, 24, 22, 11, 8, 20,
25, 28, 30, 12, 11, 8. Prepare a frequency distribution table for the same.

Solution:

The marks of the students range from 2 to 30. Let us take class intervals 0-5, 5-10, 10-15,
15-20, 20-25, and 25-30, counting the top mark of 30 in the last class.
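
The tally for this example can be sketched in a few lines of Python. The class width of 5 and the class labels follow the solution above; folding the top mark of 30 into the last class is an assumption made here so that every value is counted, since the highest mark sits exactly on the upper boundary.

```python
marks = [11, 27, 18, 14, 28, 18, 2, 22, 11, 24,
         22, 11, 8, 20, 25, 28, 30, 12, 11, 8]

width = 5
lower_limits = list(range(0, 30, width))   # 0, 5, 10, 15, 20, 25
counts = {f"{lo}-{lo + width}": 0 for lo in lower_limits}

for m in marks:
    # A mark m belongs to the class with lo <= m < lo + width.
    # Assumption: a mark equal to the top boundary (30) goes in the last class.
    lo = min((m // width) * width, lower_limits[-1])
    counts[f"{lo}-{lo + width}"] += 1

for label, freq in counts.items():
    print(f"{label:>6}  {freq}")
```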

Types of Frequency Distribution

The six different types of the frequency distribution are as follows:

1. Exclusive Series

2. Inclusive Series
3. Open End Series

4. Cumulative Frequency Series

5. Mid-Value Frequency Series

6. Equal and Unequal Class Interval Series


1. Exclusive Series

The series with class intervals, in which all the items having the range from the lower
limit to the value just below its upper limit are included, is known as the Exclusive
Series. This kind of frequency distribution is known as an exclusive series because the
frequencies corresponding to the specific class interval do not include the value of its
upper limit. For example, if a class interval is 0-10, and the values of the given series are
4, 10, 2, 15, 8, and 9, then only 4, 2, 8, and 9 will be included in the 0-10 class interval. 10
and 15 will be included in the next class interval, i.e., 10-20. Also, the upper limit of a class
interval is the lower limit of the next class interval.
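
The exclusive rule (lower limit included, upper limit excluded) can be sketched as follows, using the values from the example above. The helper name `exclusive_class` is hypothetical.

```python
values = [4, 10, 2, 15, 8, 9]
intervals = [(0, 10), (10, 20)]

def exclusive_class(value, intervals):
    """Return the (lower, upper) class with lower <= value < upper."""
    for lower, upper in intervals:
        if lower <= value < upper:
            return (lower, upper)
    return None  # value falls outside every class

# Group the values by the class they fall into.
grouped = {iv: [v for v in values if exclusive_class(v, intervals) == iv]
           for iv in intervals}
print(grouped)
```

The value 10 lands in 10-20 rather than 0-10 because the comparison with the upper limit is strict.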

Frequency Distribution in Exclusive Series Example

From the above table of exclusive series, it can be seen that the upper limit of the first class
interval is the lower limit of the second class interval, and so on. Also, as
discussed above, if the data includes the value 10, it is included in the class interval
10-20, not in 0-10.

2. Inclusive Series

The series with class intervals, in which all the items from the lower limit up to and
including the upper limit are included, is known as Inclusive Series. Unlike an exclusive
series, the upper limit of one class interval does not repeat itself as the lower limit of the next
class interval. Therefore, there is a gap (between 0.1 and 1) between the upper limit of one
class interval and the lower limit of the next class interval. For example, class intervals of
an inclusive series can be 0-9, 10-19, 20-29, 30-39, and so on. In this case, the gap
between the upper limit of one class interval and the lower limit of the next class interval is
1, and consecutive class intervals share no boundary value, unlike in an exclusive series.
Sometimes it is difficult to perform statistical analysis with an inclusive series. In those
cases, the inclusive series is converted into an exclusive series.

Frequency Distribution in Inclusive Series Example

From the above table of inclusive series, it can be seen that the upper limit of one class
interval (say, 9 of interval 0-9) is not the same as the lower limit of the next class interval (10
of interval 10-19). Also, all the values that come under 0-9, including 0 and 9 are included
in the frequency against 0-9.

Conversion of Inclusive Series into Exclusive Series

For statistical calculation, it sometimes becomes necessary to convert an inclusive series into
an exclusive series. Suppose, in the above example, some students have
obtained marks such as 10.5, 40.5, etc. In this case, the series will be converted into
an exclusive series.

The steps for converting an inclusive series into an exclusive series are:

• First, calculate the difference between the upper limit of one class
interval and the lower limit of the next class interval.

• Next, divide the difference by two, then add the resulting value to the upper
limit of every class interval and subtract it from the lower limit of every class interval.
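
The two conversion steps can be sketched as below. The function name is hypothetical, and the sketch assumes the gap between consecutive classes is the same throughout the series.

```python
def inclusive_to_exclusive(intervals):
    """Convert inclusive class limits to exclusive ones.

    Assumes at least two classes and a constant gap between them.
    """
    # Step 1: gap between one upper limit and the next lower limit.
    gap = intervals[1][0] - intervals[0][1]
    # Step 2: move every lower limit down and every upper limit up by half the gap.
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in intervals]

inclusive = [(0, 9), (10, 19), (20, 29), (30, 39)]
print(inclusive_to_exclusive(inclusive))
```

With a gap of 1, the class 10-19 becomes 9.5-19.5, so each adjusted upper limit coincides with the next lower limit, as an exclusive series requires.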

Example:
The inclusive series of the above example is converted into exclusive series as under.

Difference between Inclusive and Exclusive Series

• In Inclusive Series, the upper limit of one class interval is not the same as the
lower limit of the next class interval. There is a gap ranging from 0.1 to 1.0
between the upper class limit of one class interval and the lower class limit of the
next class interval. However, in the Exclusive Series, the upper limit of one class
interval is the same as the lower limit of the next class interval.

• In the case of Inclusive Series, the values of both the upper and the lower limit are included
in that class interval. However, in the case of Exclusive Series, the value of the upper
limit of a class interval is not included in that interval; instead, it is included in the next
class interval.

• Inclusive Series is suitable for an investigator only if the values are whole numbers
and not in decimal form. However, an Exclusive Series is suitable for an investigator
whether the values are whole numbers or in decimal form.

• Counting in Inclusive Series is possible only after converting it into an Exclusive
Series. However, counting in Exclusive Series is possible in all cases.

3. Open End Series

Sometimes the lower limit of the first class interval and the upper limit of the last class
interval are not available; instead, Less than or Below is mentioned in the former case (in place of
the lower limit of the first class interval), and More than or Above is mentioned in the
latter case (in place of the upper limit of the last class interval). These types of series are
known as Open End Series.

Frequency Distribution in Open End Series Example

For statistical calculations, if one needs to give definite limits to the open-ended first and last
class intervals, the general practice is to give these intervals the same magnitude
or class size as the other class intervals. In the above
example, the magnitude of the other class intervals is 5. Therefore, the open-end class intervals
can be written as 5-10 and 30-35, respectively.

4. Cumulative Frequency Series

A series whose frequencies are continuously added corresponding to the class intervals
is known as Cumulative Frequency Series.

Conversion of a Simple Frequency Series into Cumulative Frequency Series

A simple frequency series can be converted into a cumulative frequency series. There are
two ways in which it can be done. These are as follows:

• Expressing the cumulative frequencies on the basis of the upper limits of
the class intervals. For example, expressing 10-20, 20-30, and 30-40 as Less
than 20, Less than 30, and Less than 40.

• Expressing the cumulative frequencies on the basis of the lower limits of the
class intervals. For example, expressing 10-20, 20-30, and 30-40 as More than
10, More than 20, and More than 30.
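
Both conversions can be sketched with `itertools.accumulate`. The class frequencies here are hypothetical, invented for illustration. In the "less than" form each running total is labelled with the class's upper limit; in the "more than" form the totals run from the last class backwards and are labelled with each class's lower limit.

```python
from itertools import accumulate

# Hypothetical simple frequency series, invented for illustration.
classes = [(10, 20), (20, 30), (30, 40)]
freqs = [4, 7, 5]

# "Less than" series: running total from the first class, keyed by upper limit.
less_than = {f"Less than {upper}": cf
             for (_, upper), cf in zip(classes, accumulate(freqs))}

# "More than" series: running total from the last class, keyed by lower limit.
reverse_totals = list(accumulate(reversed(freqs)))[::-1]
more_than = {f"More than {lower}": cf
             for (lower, _), cf in zip(classes, reverse_totals)}

print(less_than)
print(more_than)
```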

Frequency Distribution in Cumulative Frequency Series Example

Convert the following simple frequency series into a cumulative frequency series using both
ways.

Solution:

Method-I (On the Basis of Upper Limits)

Method – II (On the Basis of Lower Limits)


Conversion of Cumulative Frequency into Simple Frequency Series

To attain the frequency against a specific class interval of a cumulative frequency series, it can
be converted into a simple frequency series.
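
Going the other way is a matter of taking successive differences. A minimal sketch, assuming a hypothetical "less than" cumulative series:

```python
# Hypothetical "less than" cumulative series, invented for illustration.
upper_limits = [10, 20, 30, 40]
cumulative = [3, 10, 18, 20]

# Each class frequency is its cumulative total minus the previous total.
previous = [0] + cumulative[:-1]
simple = [cf - prev for cf, prev in zip(cumulative, previous)]

print(simple)
```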

Example:

Determine the frequency of the following cumulative frequency series.

Solution:
5. Mid-Value Frequency Series

The series in which, instead of class intervals, their mid-values are given with the
corresponding frequencies, is known as Mid-Value Frequency Series.

Conversion of Mid-Value Frequency Series into Simple Frequency Series

The steps to convert a mid-value frequency series into a simple frequency series are as
follows:

• The first step is to determine the difference between consecutive mid-values.

• The next step is to obtain half of the resulting difference.

• The last step is to subtract the figure from the second
step from the mid-value to get the lower limit of the class interval, and add the
same figure to the mid-value to get the upper limit.

Lower Limit: $l_1 = m - \frac{1}{2} i$

Upper Limit: $l_2 = m + \frac{1}{2} i$

where,

m = Mid-Value

i = Difference between mid-values

$l_1$ = lower limit

$l_2$ = upper limit
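
The conversion steps can be sketched as follows. The function name and the mid-values are hypothetical, and the sketch assumes the mid-values are equally spaced.

```python
def mid_values_to_classes(mid_values):
    """Recover (lower, upper) class limits from equally spaced mid-values."""
    i = mid_values[1] - mid_values[0]   # difference between mid-values
    half = i / 2                        # half of that difference
    return [(m - half, m + half) for m in mid_values]

# Hypothetical mid-values with a difference of 10.
print(mid_values_to_classes([5, 15, 25, 35]))
```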

Frequency Distribution in Mid-Value Frequency Series Example

Convert the following Mid-Value Frequency Series into Simple Frequency Series.

Solution:
Calculation:

Difference between mid-values (i) = 10


6. Equal and Unequal Class Interval Series

Equal Class Interval Series

When the classes of a series are of the same interval, it is known as Equal Class Interval
Series.

Example of Frequency Distribution in Equal Class Interval Series

Following is the frequency distribution of marks of 25 students with equal class intervals.

Unequal Class Interval Series

When the classes of a series are of unequal interval, it is known as Unequal Class Interval
Series.

Example of Frequency Distribution in Unequal Class Interval Series:

Following is the frequency distribution of marks of 30 students with unequal class
intervals.

Summary – Types of Frequency Distribution

Frequency distribution is a crucial tool in statistics used to organize and summarize
data. The main types include ungrouped, grouped, cumulative, and relative frequency
distributions. Ungrouped frequency distribution lists each individual data point and its
frequency, while grouped frequency distribution categorizes data into intervals.
Cumulative frequency distribution provides a running total of frequencies, and relative
frequency distribution shows the proportion of total observations in each category.
These methods help in understanding the distribution and pattern of data, facilitating better
analysis and decision-making.

Simple and Weighted Arithmetic Mean

Simple Arithmetic Mean gives equal importance to all the items in a series. However, in
some situations, greater emphasis is given to some items and less to others, i.e., the items
are ranked according to their significance in that situation. For example, during
inflation, the price of everything in an economy tends to rise, but households attach more
importance to the rise in the price of necessary food items than to the rise in the
price of clothes. In other words, more significance is given to the
price of food and less to the price of clothes. This is when Weighted Arithmetic Mean
comes into the picture.

When every item in a series is assigned some weight according to its significance, the average
of such a series is called the Weighted Arithmetic Mean.

Here, weight stands for the relative importance of the different items. In simple words,
the Weighted Arithmetic Mean is the mean of weighted items and is also known as the
Weighted Average.

Calculation of Weighted Arithmetic Mean

Weighted Arithmetic Mean is calculated as the weighted sum of the items divided by the
sum of the weights.

Steps to calculate Weighted Arithmetic Mean:

• Step-1: All the items (X) in a series are weighted according to their significance.
Weights are denoted as ‘W’.

• Step-2: Add up all the values of weights ‘W’ to get the sum total of weights, i.e.,

∑W = W1 + W2 + W3 + ... + Wn

• Step-3: Items (X) are multiplied by the corresponding weights (W) to get ‘XW’.

• Step-4: Add up all the values of ‘XW’ to get the sum total of the product ‘XW’, i.e.,

∑XW = X1W1 + X2W2 + X3W3 + ... + XnWn

• Step-5: To get the weighted mean, divide the weighted sum of the items ‘∑XW’ by
the sum of weights ‘∑W’.

The formula for calculating the Weighted Arithmetic Mean is:

Weighted Mean = ∑XW / ∑W

Example:

Calculate a weighted mean of the following data:

| Items (X)  | 5 | 10 | 25 | 20 | 25 | 30 |
|------------|---|----|----|----|----|----|
| Weight (W) | 8 | 4  | 5  | 10 | 7  | 6  |

Solution:

| Items (X) | Weight (W) | XW        |
|-----------|------------|-----------|
| 5         | 8          | 40        |
| 10        | 4          | 40        |
| 25        | 5          | 125       |
| 20        | 10         | 200       |
| 25        | 7          | 175       |
| 30        | 6          | 180       |
|           | ∑W = 40    | ∑XW = 760 |

Weighted Mean = ∑XW / ∑W

= 760/40

= 19

Explanation:

1. Multiply each item with its corresponding weight to get XW, i.e.,

[ 5×8=40, 10×4=40, 25×5=125, 20×10=200, 25×7=175, 30×6=180 ]

2. Add up all the values of weight to get the sum of weights, i.e.,
∑W = 8 + 4 + 5 + 10 + 7 + 6 = 40

3. Add up all the values of the product of weight and items (XW) to get the sum of
the product, i.e.,

∑XW = 40 + 40 + 125 + 200 + 175 + 180 = 760

4. Divide ∑XW by ∑W to get the weighted arithmetic mean, i.e., 19.

Mean, Median and Mode

Mean, Median, and Mode are measures of central tendency. Each summarizes a
data set with a single representative value. These measures give useful insights
about the data studied and can be applied to many kinds of data, such as the
average salary of employees in an organization, the
median age of a class, or the most common sport played by members of a sports club.

Measures of Central Tendency

A measure of central tendency is a single value that represents the values of the given data set.
There are various measures of central tendency, and the three most important
ones are:

• Mean

• Median

• Mode

What are Mean, Median, and Mode?

Mean, median, and mode are measures of central tendency used in statistics to summarize
a set of data.

Mean (x̅ or μ): The mean, or arithmetic average, is calculated by summing all the values
in a dataset and dividing by the total number of values. It’s sensitive to outliers and is
commonly used when the data is symmetrically distributed.

Median (M): The median is the middle value when the dataset is arranged in ascending
or descending order. If there’s an even number of values, it’s the average of the two
middle values. The median is robust to outliers and is often used when the data is
skewed.

Mode (Z): The mode is the value that occurs most frequently in the dataset. Unlike the
mean and median, the mode can be applied to both numerical and categorical data. It’s
useful for identifying the most common value in a dataset.

What is Mean?

Mean is the sum of all the values in the data set divided by the number of values in the
data set. It is also called the Arithmetic Average. Mean is denoted as x̅ and is read as x
bar.

Mean Symbol

The symbol used to represent the mean, or arithmetic average, of a dataset is typically the
Greek letter “μ” (mu) when referring to the population mean, and “x̄” (x-bar) when
referring to the sample mean.

• Population Mean: μ (mu)

• Sample Mean: x̄ (x-bar)

These symbols are commonly used in statistical notation to represent the average value
of a set of data points.

Mean Formula

The formula to calculate the mean is:

Mean (x̅) = Sum of Values / Number of Values

If x1, x2, x3,……, xn are the values of a data set then the mean is calculated as:

x̅ = (x1 + x2 + x3 + . . . + xn) / n

Example: Find the mean of the data set 10, 30, 40, 20, and 50.

Solution:

Mean of the data 10, 30, 40, 20, 50 is

Mean = (sum of all values) / (number of values)

Mean = (10 + 30 + 40 + 20+ 50) / 5 = 30

Mean of Grouped Data

Mean for grouped data can be calculated by using various methods. The most common
methods are summarized in the table below, where xi is the class mark (mid-value) and
fi is the frequency of the i-th class:

| Method                | Formula                 | Where                                                                                                |
|-----------------------|-------------------------|------------------------------------------------------------------------------------------------------|
| Direct Method         | x̅ = ∑fixi / ∑fi         | ∑fi is the sum of all frequencies                                                                    |
| Assumed Mean Method   | x̅ = a + ∑fidi / ∑fi     | a is the assumed mean, di is equal to xi – a, ∑fi is the sum of all frequencies                      |
| Step Deviation Method | x̅ = a + h(∑fiui / ∑fi)  | a is the assumed mean, ui = (xi – a)/h, h is the class size, ∑fi is the sum of all frequencies       |
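
As a quick check that the three methods agree, the sketch below applies each of them to the same grouped data. The classes, frequencies, and the choice of assumed mean `a` are hypothetical, invented for illustration.

```python
# Hypothetical grouped data, invented for illustration.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [3, 7, 8, 2]

mids = [(lo + hi) / 2 for lo, hi in classes]   # class marks x_i
n = sum(freqs)                                 # sum of all frequencies

# Direct method: sum(f_i * x_i) / sum(f_i)
direct = sum(f * x for f, x in zip(freqs, mids)) / n

# Assumed mean method: a + sum(f_i * d_i) / sum(f_i), with d_i = x_i - a
a = 15   # assumed mean, chosen arbitrarily as one of the class marks
assumed = a + sum(f * (x - a) for f, x in zip(freqs, mids)) / n

# Step deviation method: a + h * sum(f_i * u_i) / sum(f_i), with u_i = (x_i - a) / h
h = 10   # class size
step = a + h * sum(f * ((x - a) / h) for f, x in zip(freqs, mids)) / n

print(direct, assumed, step)
```

All three give the same mean; the assumed mean and step deviation methods only reduce the size of the numbers being multiplied, which matters mostly for hand calculation.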

What is Median?

A Median is a middle value for sorted data. The sorting of the data can be done either in
ascending order or descending order. A median divides the data into two equal halves.

Median Symbol

The letter “M” is commonly used to represent the median of a dataset, whether it is for a
population or a sample. In Indian statistical practice in particular, “M” is widely accepted
and understood as the symbol for the median.

Median Formula

The formula for the median is:

If the number of values (n value) in the data set is odd then the formula to calculate the
median is:

Median = [(n + 1)/2]th term


If the number of values (n value) in the data set is even then the formula to calculate the
median is:

Median = [(n/2)th term + {(n/2) + 1}th term] / 2

Example: Find the median of the given data set 30, 40, 10, 20, and 50.

Solution:

Step 1: Arrange the given data in ascending order:

10, 20, 30, 40, 50

Step 2: Check whether n (the number of terms in the data set) is even or odd, and use the
formula for that case.

Step 3: Here, n = 5 (odd)

Median = [(n + 1)/2]th term

Median = [(5 + 1)/2]th term = 3rd term

= 30
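
The odd and even cases can be combined in one small function. This is an illustrative sketch (the function name `median_of` is hypothetical); in practice `statistics.median` from the standard library does the same job.

```python
def median_of(values):
    """Median using the [(n + 1)/2]th-term rule for odd n and the
    average of the (n/2)th and (n/2 + 1)th terms for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n + 1)/2)th term, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of the two middle terms

print(median_of([30, 40, 10, 20, 50]))   # odd number of values
print(median_of([10, 20, 30, 40]))       # even number of values (hypothetical data)
```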

Median of Grouped Data

The median of grouped data is calculated using the formula,

Median = l + [(n/2 – cf) / f] × h

where

• l is the lower limit of the median class

• n is the number of observations

• f is the frequency of the median class

• h is the class size

• cf is the cumulative frequency of the class preceding the median class.
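
A minimal sketch of the grouped-median formula follows. The function name and the data are hypothetical; the median class is taken to be the first class whose cumulative frequency reaches n/2.

```python
def grouped_median(classes, freqs):
    """Median = l + ((n/2 - cf) / f) * h for the median class."""
    n = sum(freqs)
    cf = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= n / 2:          # first class whose running total reaches n/2
            h = upper - lower        # class size
            return lower + ((n / 2 - cf) / f) * h
        cf += f
    raise ValueError("no median class found (empty data?)")

# Hypothetical grouped data, invented for illustration.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
print(grouped_median(classes, freqs))
```

Here n = 30, so n/2 = 15 falls in the class 20-30 (cf = 13, f = 12), giving 20 + (2/12) × 10.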

What is Mode?

A mode is the most frequent value or item of the data set. A data set can have
one mode or more than one. If the data set has one mode, it is called “unimodal”.
Similarly, if the data set contains 2 modes it is called “bimodal”, and if it
contains 3 modes it is known as “trimodal”. Any data set with
more than one mode is known as “multimodal” (bimodal and trimodal are special cases). There
is no mode for a data set if every value appears only once.

Symbol of Mode

In statistical notation, the symbol “Z” is commonly used to represent the mode of a
dataset. It indicates the value or values that occur most frequently within the dataset. This
symbol is widely utilised in statistical discourse to signify the mode, enhancing clarity
and precision in statistical discussions and analyses.

Mode Formula

Mode = Highest Frequency Term

Example: Find the mode of the given data set 1, 2, 2, 2, 3, 3, 4, 5.

Solution:

Given set is {1, 2, 2, 2, 3, 3, 4, 5}

The data set is already arranged in ascending order. By observing it, we can see that the
value 2 occurs three times, more often than any other value.

Therefore, Mode = 2, as it has the highest frequency (3).

Mode of Grouped Data

The mode of grouped data is calculated using the formula:

Mode = l + [(f1 – f0) / (2f1 – f0 – f2)] × h

where,

• f1 is the frequency of the modal class,

• f0 is the frequency of the class preceding the modal class,

• f2 is the frequency of the class succeeding the modal class,

• h is the size of class intervals, and l is the lower limit of the modal class.
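
The grouped-mode formula, with the difference f1 – f0 in the numerator, can be sketched as below. The function name and data are hypothetical; the sketch assumes a single modal class and a non-zero denominator, and treats a missing neighbouring class as having frequency 0.

```python
def grouped_mode(classes, freqs):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    lower, upper = classes[k]
    h = upper - lower                                   # size of the class interval
    f1 = freqs[k]                                       # frequency of the modal class
    f0 = freqs[k - 1] if k > 0 else 0                   # class preceding the modal class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0      # class succeeding the modal class
    return lower + ((f1 - f0) / (2 * f1 - f0 - f2)) * h

# Hypothetical grouped data, invented for illustration.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
print(grouped_mode(classes, freqs))
```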

Relation between Mean, Median, and Mode

For a moderately skewed distribution, the three measures of central tendency, mean,
median, and mode, are approximately related by:

Mode = 3 Median – 2 Mean

This is known as the empirical relationship between mean, median, and mode.
When any two of the measures are known for a given set of data, it can be used to estimate
the third. The relationship can be rearranged in various ways, for example
Mean = (3 Median – Mode) / 2.

What is Range?

In a given data set, the difference between the largest value and the smallest value is
called the range of the data set. For example, if the heights (in cm) of 10 students in a class,
given in ascending order, are 160, 161, 167, 169, 170, 172, 174, 175, 177, and 181,
then the range of the data set is (181 – 160) = 21 cm.

Range of Data

Range is the difference between the highest value and the lowest value. It is a way to
understand how the numbers are spread in a data set.

Range Formula

The formula to find the Range is:

Range = Highest value – Lowest Value

Example: Find the range of the given data set 12, 19, 6, 2, 15, 4.

Solution:

Given set is {12, 19, 6, 2, 15, 4}

Here,

Lowest Value = 2

Highest Value = 19

Range = 19 − 2 = 17
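
In code, the range is simply the maximum minus the minimum. A one-line sketch for the values 12, 19, 6, 2, 15, 4:

```python
data = [12, 19, 6, 2, 15, 4]

# Range = highest value - lowest value
data_range = max(data) - min(data)
print(data_range)  # 17
```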
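
Differences between Mean, Median and Mode

Mean, median, and mode are measures of central tendency in statistics.

| Feature        | Mean                                                                                         | Median                                                                                                                         | Mode                                                                  |
|----------------|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------|
| Definition     | Mean is the average of all values.                                                           | Median is the middle value when data is sorted.                                                                                | Mode is the most frequently occurring value in the dataset.           |
| Sensitivity    | Mean is sensitive to outliers.                                                               | Median is not sensitive to outliers.                                                                                           | Mode is not sensitive to outliers.                                    |
| Calculation    | Calculated by adding up all values of a dataset and dividing by the total number of values. | Calculated by finding the middle value in a sorted list of data.                                                               | Calculated by finding which value occurs the most times in a dataset. |
| Representation | Value of mean may or may not be in the dataset.                                              | Value of median is a value from the dataset when the number of values is odd; when it is even, it may not be in the dataset. | Value of mode is always a value from the dataset.                     |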
Note: Mean gets easily affected by extreme values.

The following example shows the difference between mean and median. In a school, there
are 8 teachers whose salaries are 20000 rupees each and a principal with a salary of
35000 rupees. Find their mean salary and median salary.

Mean = (20000+20000+20000+20000+20000+20000+20000+20000+35000)/9 =
195000/9 = 21666.67

Therefore, the mean salary is ₹21,666.67.

For median, in ascending order: 20000, 20000, 20000, 20000, 20000, 20000, 20000,
20000, 35000.

n = 9,

Thus, (n + 1)/2 = (9 + 1)/2 = 5

Thus, the median is the 5th observation.

Median = 20000

Therefore, the median is ₹20,000.

Mode is the value with the maximum frequency.

Mode = 20,000.

The single high salary pulls the mean above what eight of the nine staff members earn,
while the median and mode stay at ₹20,000.
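
The salary example can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and the second calculation (dropping the principal) is an addition meant only to show how strongly one extreme value moves the mean.

```python
from statistics import mean, median, mode

# Eight teachers at 20000 rupees and one principal at 35000 rupees.
salaries = [20000] * 8 + [35000]

mean_salary = mean(salaries)      # 195000 / 9
median_salary = median(salaries)  # 5th of the 9 sorted values
mode_salary = mode(salaries)      # most frequent value

print(round(mean_salary, 2), median_salary, mode_salary)

# Without the principal, the mean falls back to the teachers' salary.
mean_without_principal = mean(salaries[:8])
print(mean_without_principal)
```

The mean shifts by more than 1,600 rupees because of one salary, while the median and mode are unchanged.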

How does Mean Median Mode link to Real Life?

In daily life we come across many situations where the concepts of mean, median and
mode apply. Here is how they link to real life:

• Mean: Mean, or average, is used in everyday situations to understand typical
values. For example, if you want to know the average income of people in a city,
you would calculate the mean income.

• Median: Median is used where extreme values would distort the average. For
example, in household income data, the median income provides a better
representation of the typical income than the mean when there are extreme
values. In real estate, the median house price is often used to gauge the
affordability of homes in a particular area.

• Mode: Mode represents the most frequently occurring value in a dataset and is
used in scenarios where identifying the most common value is important. For
example, in manufacturing, the mode may be used to identify the most common
defect in a production line to prioritize quality control efforts.
Solved Questions on Mean, Median, and Mode

Question 1: Study the bar graph given below and find the mean, median, and mode of
the given data set.

Solution:

Mean = (sum of all data values) / (number of values)

Mean = (5 + 7 + 9 + 6) / 4
= 27 / 4
= 6.75

Order the given data in ascending order as: 5, 6, 7, 9

Here, n = 4 (which is even)

Median = [(n/2)th term + {(n/2) + 1}th term] / 2

Median = (6 + 7) / 2
= 6.5

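Mode = Most frequent value

= 9 (highest value in the bar graph)

Note that if 5, 7, 9 and 6 are treated as plain observations, each occurs only once, so
no value is more frequent than the others.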

Range = Highest value – Lowest value

Range = 9 – 5
= 4
Question 2: Find the mean, median, mode, and range for the given data

190, 153, 168, 179, 194, 153, 165, 187, 190, 170, 165, 189, 185, 153, 147, 161, 127, 180

Solution:

For Mean:

190, 153, 168, 179, 194, 153, 165, 187, 190, 170, 165, 189, 185, 153, 147, 161, 127, 180

Number of observations = 18

Mean = (Sum of observations) / (Number of observations)

= (190+153+168+179+194+153+165+187+190+170+165+189+185+153+147
+161+127+180) / 18

= 3056/18

≈ 169.78

Therefore, the mean is approximately 169.78

For Median:

The ascending order of given observations is,

127, 147, 153, 153, 153, 161, 165, 165, 168, 170, 179, 180, 185, 187, 189, 190, 190, 194

Here, n = 18

Median = 1/2 [(n/2)th observation + (n/2 + 1)th observation]

= 1/2 [9th observation + 10th observation]
= 1/2 (168 + 170)
= 338/2
= 169

Thus, the median is 169

For Mode:

The number with the highest frequency = 153

Thus, mode = 153

For Range:

Range = Highest value – Lowest value


= 194 – 127
= 67
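
The four results for this data set can be cross-checked with Python's standard `statistics` module. This is a minimal sketch with illustrative variable names. Summing the 18 observations directly gives 3056, so the mean works out to about 169.78.

```python
from statistics import mean, median, mode

observations = [190, 153, 168, 179, 194, 153, 165, 187, 190,
                170, 165, 189, 185, 153, 147, 161, 127, 180]

total = sum(observations)
data_mean = mean(observations)        # total / 18
data_median = median(observations)    # average of the 9th and 10th sorted values
data_mode = mode(observations)        # value with the highest frequency
data_range = max(observations) - min(observations)

print(total, round(data_mean, 2), data_median, data_mode, data_range)
```

Recomputing the sum in code is a cheap guard against slips when adding long lists by hand.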
Question 3: Find the Median of the data 25, 12, 5, 24, 15, 22, 23, 25

Solution:

25, 12, 5, 24, 15, 22, 23, 25

Step 1: Order the given data in ascending order as:

5, 12, 15, 22, 23, 24, 25, 25

Step 2: Check whether n (the number of terms in the data set) is even or odd, and find the
median using the formula for that case.

Step 3: Here, n = 8 (even), then

Median = [(n/2)th term + {(n/2) + 1}th term] / 2

Median = [(8/2)th term + {(8/2) + 1}th term] / 2

= (22+23) / 2

= 22.5
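
The three steps above (sort, check whether n is even or odd, then pick or average the middle terms) can be sketched as a small Python function. The name `median_of` is hypothetical and chosen only for this illustration.

```python
def median_of(values):
    """Median of a non-empty list: middle term, or mean of the two middle terms."""
    if not values:
        raise ValueError("median is undefined for an empty data set")
    ordered = sorted(values)          # Step 1: ascending order
    n = len(ordered)                  # Step 2: count the terms
    mid = n // 2
    if n % 2 == 1:                    # odd n: the ((n + 1) / 2)th term
        return ordered[mid]
    # even n: average of the (n/2)th and (n/2 + 1)th terms (1-based positions)
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([25, 12, 5, 24, 15, 22, 23, 25]))  # (22 + 23) / 2 = 22.5
```

Note the index shift: the 1-based (n/2)th term in the formula is `ordered[mid - 1]` in Python's 0-based indexing.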

Question 4: Find the mode of given data 15, 42, 65, 65, 95.

Solution:

Given data set 15, 42, 65, 65, 95

The number with highest frequency = 65

Mode = 65
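
Finding the mode amounts to counting how often each value occurs and keeping the value or values with the highest count. The sketch below uses `collections.Counter` from the standard library; the function name `modes` is an assumption for this example. It returns a list because a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([15, 42, 65, 65, 95]))  # [65]
```

When every value occurs exactly once, this function returns all of them, which signals that no single value stands out as the mode.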
Practice Questions on Mean, Median and Mode
Question 1: A company recorded the weekly sales (in dollars) of five salespersons as
follows: $450, $520, $480, $510, and $490. Find the mean sales value for this group.

Question 2: Find the median of the following data set: 12, 15, 20, 9, 17, 25, 10.

Question 3: A survey collected the number of books read by a group of 10 people last year:
5, 7, 6, 5, 9, 7, 8, 5, 10, 6. What is the mode of the data set?

Question 4: In a classroom, the scores (out of 100) for a test are: 56, 78, 67, 45, 56, 90,
56, 67, 78, 82. Find the mean, median, and mode of the scores.

Question 5: In a skewed distribution the mean of the data is 40 and median of the data
is 35. Calculate the mode of the data set.

Answers to Practice Questions

Ans 1: Mean = $490.

Ans 2: Median = 15.

Ans 3: Mode = 5.

Ans 4: Mean = 67.5, Median = 67, Mode = 56.

Ans 5: Mode = 25.
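
Answer 5 is consistent with the empirical relation Mode ≈ 3 × Median − 2 × Mean, which is commonly used for moderately skewed distributions. Assuming that is the relation Question 5 intends, the calculation can be sketched as follows (the function name `empirical_mode` is hypothetical):

```python
def empirical_mode(mean, median):
    """Estimate the mode from the empirical relation Mode = 3*Median - 2*Mean."""
    return 3 * median - 2 * mean


print(empirical_mode(mean=40, median=35))  # 3*35 - 2*40 = 25
```

The relation is an approximation, so the result is an estimate of the mode and not an exact value computed from the data.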

Conclusion

Mean, Median and Mode are essential statistical measures of central tendency that
provide different perspectives on data sets. The mean provides a general average, making
it useful for evenly distributed data. The median gives a middle value, providing a better
view of central tendency when dealing with skewed distributions or extreme values. The
mode highlights the most frequent value, making it valuable in categorical data analysis.

Mean, in statistical terms, represents the arithmetic average of a dataset. It is calculated by
summing up all the values in the dataset and dividing the sum by the total number of
values. For instance, if you have the numbers 2, 4, 6, 8, and 10, the mean would be
(2 + 4 + 6 + 8 + 10) / 5 = 6.
