Types of Statistics

Uploaded by

leannekeith09
Copyright
© © All Rights Reserved
STATISTICS IN RESEARCH
WHAT IS STATISTICS?
• Statistics is the science of collecting, analyzing,
presenting, and interpreting data.
• A branch of science that deals with the collection,
organization, and analysis of data, and with drawing
conclusions about a whole population from a sample.
• In research, statistics helps interpret whether the data
are clustered near the mean or spread across the
distribution.
TYPES OF STATISTICS
• Descriptive statistics - focuses on summarizing and
describing data
• Experimental statistics - used to analyze data from
experiments where variables are manipulated
• Historical statistics - used to analyze historical data
to understand trends and patterns
• Inferential statistics - involves making predictions or
inferences about a population based on sample data
• Predictive statistics - concerned with analyzing data to
understand past trends and make future predictions
• Qualitative statistics - deals with data that cannot be
measured numerically and is often based on qualities or
attributes
• Quantitative statistics - deals with data that can be
measured or counted numerically
• Together, these types of statistics cover the collection,
description, organization, analysis, and interpretation of
data. They help describe attributes of the data and estimate
the parameters of the population by analyzing samples.
DESCRIPTIVE STATISTICS
• Used to quantitatively describe the attributes of known data; it
provides summaries of either the sample or the population.
• It can be presented through graphs, charts, and tables.
MEASURES OF DESCRIPTIVE STATISTICS
1. Measures of central tendency describe data with respect to a
single central point by computing the mean, median, and mode.
Central tendency is the statistical measure that represents an entire
set of data or distribution through a single value. It provides a
concise summary, not an exact description, of the whole data.
MEAN
• The sum of all the values in a collection of data divided
by the total number of values.
• Commonly known as the "average".
• Represented as x-bar (x̄).
• Used to measure the central tendency of the data.
• Mean formula for ungrouped data: x̄ = (x1 + x2 + ... + xn) / n,
the sum of the observations divided by the total number of
observations.
• Mean formula for grouped data: x̄ = Σ(f × x) / Σf, where x is
the midpoint of each class and f is its frequency.
PROBLEM:
Find the mean of the first five natural odd numbers, using
the mean formula.
SOLUTION:
The first five natural odd numbers = 1, 3, 5, 7, and 9
Mean = (1 + 3 + 5 + 7 + 9) ÷ 5 = 25/5 = 5
Answer: The mean of the first five natural odd numbers is 5.
• Ungrouped data is the raw data gathered from
an experiment or study.
• To find the mean of ungrouped data, we simply
calculate the sum of all collected observations
and divide by the total number of the
observations.
• EXAMPLE:
The heights of five students are 161 in, 130 in, 145 in, 156 in,
and 162 in, respectively. Find the mean height of the
students.
• SOLUTION:
Heights of students = 161 in, 130 in, 145 in, 156 in,
and 162 in
Sum = (161 + 130 + 145 + 156 + 162) = 754
Mean = 754/5 = 150.8
Answer: The mean height of the students is 150.8 inches.
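The mean formula for ungrouped data can be checked with a short Python sketch. The function name `mean` is chosen here for illustration; the two data sets are the ones used in the examples above.

```python
def mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("mean of empty data is undefined")
    return sum(values) / len(values)

# First five odd natural numbers, from the PROBLEM above.
odd_numbers = [1, 3, 5, 7, 9]
# Heights of the five students, from the EXAMPLE above.
heights_in = [161, 130, 145, 156, 162]

print(mean(odd_numbers))  # 5.0
print(mean(heights_in))   # 150.8
```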
MEDIAN
• It represents the middle value for any group.
• For calculation of median, the data has to be arranged in
ascending order, and then the middlemost data point
represents the median of the data.
• For an odd number of data, the median is the middlemost
data, and for an even number of data, the median is the
average of the two middle values.
EXAMPLE
Step 1: Consider the data: 4, 4, 6, 3, and 2. Let's arrange this data in
ascending order: 2, 3, 4, 4, 6.
Step 2: Count the number of values. There are 5 values.
Step 3: Look for the middle value. Thus, median = 4.
MEDIAN FORMULA FOR UNGROUPED DATA
Step 1: Arrange the data in ascending or descending order.
Step 2: Count the total number of observations 'n'.
Step 3: Check whether the number of observations 'n' is even or odd.
Median formula when n is odd: Median = [(n + 1)/2]th term
Median formula when n is even: Median = [(n/2)th term + ((n/2) + 1)th term]/2
EXAMPLE: The ages of the members of a weekend poker team are
listed below. Find the median age of the players. {42, 40, 50,
60, 35, 58, 32}
SOLUTION:
Step 1: Arrange the data items in ascending order.
Ordered Set: {32, 35, 40, 42, 50, 58, 60}
Step 2: Count the number of observations. Here n = 7, which is odd,
so we use the following formula:
Median = [(n + 1)/2]th term
Step 3: Calculate the median using the formula.
Median = [(n + 1)/2]th term
= [(7 + 1)/2]th term = 4th term = 42
Answer: The median age is 42.
MEDIAN FORMULA FOR GROUPED DATA
Step 1: Find the total number of observations(n).
Step 2: Define the class size(h), and divide the data into different
classes.
Step 3: Calculate the cumulative frequency of each class.
Step 4: Identify the class in which the median falls. (Median Class is
the class where n/2 lies.)
Step 5: Find the lower limit of the median class(l), and the
cumulative frequency of the class preceding the median class (c).
EXAMPLE:
There are 5 top management employees in an
organization. The salaries given to the employees are
5,000, 6,000, 4,000, 8,000, and 7,500.
Using the median formula, calculate the median salary.
Solution:
Step 1: Sorting the given data in increasing order,
4,000, 5,000, 6,000, 7,500, and 8,000.
Step 2: Total number of observations = 5
Step 3: The given number of observations is odd.
Step 4: Using median formula for odd observation,
Median = [(n + 1)/2] th term
Median = [(5 + 1)/2]th term = (6/2)th term = 3rd term.
The third term is 6,000.
The median salary is 6,000.
For a set of ungrouped data, we can follow the steps below to
find the median value.
Step 1: Sort the given data in increasing order.
Step 2: Count the number of observations.
Step 3: If the number of observations is odd use median formula:
Median = [(n + 1)/2]th term
Step 4: If the number of observations is even use median formula:
Median = [(n/2)th term + (n/2 + 1)th term]/2
EXAMPLE: The heights (in centimeters) of the members of a school
football team are listed below.
{142, 140, 130, 150, 160,135, 158,132}
Find the median of the above set.
SOLUTION:
Step 1: Arrange the data items in ascending order.
Original set: {142, 140, 130, 150, 160, 135, 158,132}
Ordered Set: {130, 132, 135, 140, 142, 150, 158, 160}
Step 2:
Count the number of observations. Number of observations, n = 8
If number of observations is even, then we will use the
following formula:
Median = [(n/2)th term + ((n/2) + 1)th term]/2
Step 3: Calculate the median using the formula.
Median = [(n/2)th term + ((n/2) + 1)th term]/2
Median = [(8/2)th term + ((8/2) + 1)th term]/2
= (4th term + 5th term)/2
= (140 + 142)/2
= 141
Answer: The median height is 141 cm.
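The odd and even cases of the median formula for ungrouped data can be sketched in Python as follows. The function name `median` is illustrative; note that the "th term" positions in the formulas are 1-based, while Python list indexes are 0-based.

```python
def median(values):
    """Median of ungrouped data using the odd/even formulas above."""
    ordered = sorted(values)          # Step 1: arrange in ascending order
    n = len(ordered)                  # Step 2: count the observations
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n is odd: [(n + 1)/2]th term, which is index mid (0-based)
        return ordered[mid]
    # n is even: average of the (n/2)th and (n/2 + 1)th terms
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [42, 40, 50, 60, 35, 58, 32]                      # n = 7 (odd)
heights_cm = [142, 140, 130, 150, 160, 135, 158, 132]    # n = 8 (even)

print(median(ages))        # 42
print(median(heights_cm))  # 141.0
```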
SET OF GROUPED DATA
When the data is continuous and in the form of a frequency distribution, the
median is calculated through the following sequence of steps.
Step 1: Find the total number of observations(n).
Step 2: Define the class size(h), and divide the data into different classes.
Step 3: Calculate the cumulative frequency of each class.
Step 4: Identify the class in which the median falls.
(Median Class is the class where n/2 lies.)
Step 5: Find the lower limit of the median class (l), the frequency of the
median class (f), and the cumulative frequency of the class preceding the
median class (c).
Step 6: Apply the formula for median for grouped data:
Median = l + [(n/2 − c)/f] × h
EXAMPLE: Calculate the median for the following data.

Marks                0 - 20   20 - 40   40 - 60   60 - 80   80 - 100
Number of students   5        20        35        7         3
Solution: We need to calculate the cumulative frequencies
to find the median.
Marks      Number of students   Cumulative frequency
0 - 20     5                    0 + 5 = 5
20 - 40    20                   5 + 20 = 25
40 - 60    35                   25 + 35 = 60
60 - 80    7                    60 + 7 = 67
80 - 100   3                    67 + 3 = 70
n = Σf = 70
n/2 = 70/2 = 35
Median Class is 40 - 60
l = 40, f = 35, c = 25, h = 20
Using Median formula:
Median = l + [(n/2 − c)/f] × h
= 40 + [(35 - 25)/35] × 20
= 40 + (10/35) × 20
= 40 + (40/7)
= 45.71 (approximately)
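The grouped-data steps can be sketched in Python as below. This is a minimal illustration, assuming equal class sizes and classes listed in ascending order; the function name `grouped_median` and its parameter names are hypothetical.

```python
def grouped_median(lower_limits, frequencies, class_size):
    """Median = l + [(n/2 - c)/f] * h for a frequency distribution.

    Assumes classes are in ascending order and share one class size h.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0  # c: cumulative frequency of the classes before this one
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:
            # This is the median class: n/2 lies within it.
            return lower + ((n / 2 - cumulative) / f) * class_size
        cumulative += f
    raise ValueError("median class not found")

# Marks example from above: classes 0-20, 20-40, 40-60, 60-80, 80-100.
lower_limits = [0, 20, 40, 60, 80]
frequencies = [5, 20, 35, 7, 3]

result = grouped_median(lower_limits, frequencies, class_size=20)
print(round(result, 2))  # 45.71
```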
MODE
• A measure of central tendency that shows which item in a
data set occurs most frequently.
• The mode is the item with the highest frequency in the
data set.
TYPES OF MODE
• Unimodal List: A list of given data with only one mode
• Bimodal List: A list of given data with two modes
• Multimodal List: A list of given data with three or more
modes
MODE FORMULA

Mode = L + h × (fm − f1) / [(fm − f1) + (fm − f2)]
Where:
• 'L' is the lower limit of the modal class,
• 'h' is the size of the class interval,
• 'fm' is the frequency of the modal class,
• 'f1' is the frequency of the class preceding the modal class,
and 'f2' is the frequency of the class succeeding the modal
class.
MODE FOR GROUPED DATA
To find the mode for grouped data, follow the steps shown
below.
Step 1: Find the class interval with the maximum frequency.
This is also called modal class.
Step 2: Find the size of the class. This is calculated by
subtracting the lower limit from the upper limit.
Step 3: Calculate the mode using the mode formula:
Mode = L + h × (fm − f1) / [(fm − f1) + (fm − f2)]
EXAMPLE

Class Interval   0 - 5   5 - 10   10 - 15   15 - 20   20 - 25
Frequency        5       3        7         2         6
Modal class = 10 - 15 (class with the highest frequency).
Lower limit of the modal class = L = 10
Frequency of the modal class = fm = 7
Frequency of the class preceding the modal class = f1 = 3
Frequency of the class succeeding the modal class = f2 = 2
Size of the class interval = h = 5
• Thus, the mode can be found by substituting the above values in
the formula:
Mode = L + h × (fm − f1) / [(fm − f1) + (fm − f2)]
Thus, Mode = 10 + 5 × (7 − 3) / [(7 − 3) + (7 − 2)]
= 10 + 5 × 4/9
= 10 + 20/9
= 10 + 2.22
= 12.22
Therefore, the mode for the above dataset is 12.22.
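The mode formula for grouped data can be sketched in Python as follows. The function name `grouped_mode` is hypothetical. Treating a missing neighbouring class as frequency 0 is an assumption made for this sketch; the slides do not cover that case.

```python
def grouped_mode(lower_limits, frequencies, class_size):
    """Mode = L + h * (fm - f1) / [(fm - f1) + (fm - f2)]."""
    # Step 1: the modal class is the one with the maximum frequency.
    m = max(range(len(frequencies)), key=lambda i: frequencies[i])
    fm = frequencies[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f1 = frequencies[m - 1] if m > 0 else 0
    f2 = frequencies[m + 1] if m + 1 < len(frequencies) else 0
    denominator = (fm - f1) + (fm - f2)
    if denominator == 0:
        raise ValueError("formula undefined: all frequencies are equal")
    return lower_limits[m] + class_size * (fm - f1) / denominator

# Class intervals 0-5, 5-10, 10-15, 15-20, 20-25 from the example above.
lower_limits = [0, 5, 10, 15, 20]
frequencies = [5, 3, 7, 2, 6]

result = grouped_mode(lower_limits, frequencies, class_size=5)
print(round(result, 2))  # 12.22
```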
IMPORTANT NOTES AND TIPS ON MODE:

• The mode can sometimes be the same as the mean and/or
median, but not always.
• The mode is very useful for summarizing categorical data.
• There can be no mode for data that does not have any
repeating values.
• The mode can also be found for data sets that are not
numeric.
• It is easy to find the mode when the given set of numbers
is arranged in ascending order.
• Mode for ungrouped data can be found by observation,
whereas mode for grouped data can be found using the
formula.
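Finding the mode of ungrouped data by observation amounts to counting how often each item occurs. The sketch below uses Python's `collections.Counter`; the function name `modes` and the sample lists are made up for illustration. Returning an empty list when no value repeats follows the note above that such data has no mode.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # No repeated values among several distinct items: no mode.
    if top == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == top]

print(modes([4, 4, 6, 3, 2]))            # [4]       unimodal
print(modes([1, 1, 2, 2, 3]))            # [1, 2]    bimodal
print(modes(["red", "blue", "red"]))     # ['red']   categorical data
print(modes([1, 2, 3]))                  # []        no repeating values
```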
2. MEASURES OF DISPERSION
• These measures describe the variability of data.
• They quantify the spread of a distribution about a
central value.
• They include the range, variance, standard
deviation, mean deviation, quartile
deviation, and coefficients of dispersion.
RANGE
• In everyday use, a range is defined by an upper value and a lower
value and refers to everything between those values.
• When you buy things, they are always sold within a price range.
Take the example of your favorite pair of jeans. The store from
where you made the purchase probably had a range of colors, a
range of fits, a range of sizes, and a range of prices.
The range is the difference between the highest value
and the lowest value of the data. It helps in knowing
the spread of the data.
Example:
Find the range of the data 2, 7, 11, 12, 19,
22, 25, 27, 33, 35
Highest Value = 35
Lowest Value = 2
Range = Highest Value - Lowest Value =
35 - 2 = 33
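As a quick check, the range calculation can be written in a few lines of Python. The function is named `value_range` here (an arbitrary name) to avoid shadowing Python's built-in `range`.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    if not values:
        raise ValueError("range of empty data is undefined")
    return max(values) - min(values)

data = [2, 7, 11, 12, 19, 22, 25, 27, 33, 35]
print(value_range(data))  # 33
```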
TERMS TO REMEMBER
1. Range is the difference between the highest value and the
lowest value of the data.
2. Range is useful to find the class interval (CI).
CI = Range / Number of classes
3. For data having outliers, the interquartile range is used to
represent the data.
4. The interquartile range is the third quartile minus the
first quartile (Q3 − Q1).
5. Outliers are the extreme values in the data.
EXAMPLE
1. Joseph wishes to find the range of the first 40 multiples of 3. Can
you help Joseph to find the range?
• Solution
• Let us first list the first 40 multiples of the number 3
3, 6, 9, 12, .......114, 117, 120
• Here the Lowest Value = 3 and the Highest Value = 120
• Range = Highest Value - Lowest Value = 120 - 3 = 117
2. Peter finds that the highest price of one variety of potato is 79 cents and the lowest price of another
variety is 36 cents. Find the price range of the potatoes.
• Solution
• Given the highest price = 79 cents and the lowest price = 36 cents
• Price range = Highest price - Lowest price = 79 - 36 = 43 cents

3. The marks scored by the students of a class are 3, 4, 42, 47, 51, 55, 57, 63, 69, 74, 74, 75, 97. Find the
outliers of the data. Also find the range of the data after removing the outliers.
• Solution
• The marks scored by the students are: 3, 4, 42, 47, 51, 55, 57, 63, 69, 74, 74, 75, 97
• Here the numbers 3, 4, and 97 lie far from the rest of the marks and are treated as the outliers.
• After removing the outliers the remaining data is as follows.
42, 47, 51, 55, 57, 63, 69, 74, 74, 75
• Here we have the Highest Score = 75 and the Lowest Score = 42
• Range = Highest Score - Lowest Score = 75 - 42 = 33
VARIANCE
• A statistical measurement that is used to determine the spread of numbers in a
data set with respect to the average value or the mean.
• The expected value of the squared differences from the mean.
• The standard deviation squared will give us the variance
• Variance is a measure of dispersion, a quantity used to check the
variability of data about an average value.
Types of variance
• Population Variance - computed when data for all the members of a group
(the population) are available.
• Sample Variance - computed when the population is too large, so a select
number of data points are picked from the population to form a sample that
can describe the entire group.
HOW TO FIND VARIANCE?
• Find the mean of the observations. This can be done by
dividing the sum of all observations by the number of
observations.
• Subtract the mean from each observation.
• Square each of these values.
• Add all the values obtained in the previous step.
• Divide the sum from the previous step by n (for population
variance) or n - 1 (for sample variance).
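The steps above can be sketched in Python as follows. The function name `variance`, the `sample` flag, and the data set are illustrative choices, not taken from the slides.

```python
def variance(values, sample=False):
    """Population variance (divide by n) or sample variance (divide by n - 1)."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough observations")
    mean = sum(values) / n                            # find the mean
    squared = [(x - mean) ** 2 for x in values]       # subtract the mean, then square
    total = sum(squared)                              # add the squared values
    return total / (n - 1 if sample else n)           # divide by n or n - 1

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations, mean = 5

print(variance(data))               # 4.0  (population variance: 32 / 8)
print(variance(data, sample=True))  # sample variance: 32 / 7
```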
STANDARD DEVIATION
• The positive square root of the variance.
• It is one of the basic methods of statistical analysis.
• Commonly abbreviated as SD and denoted by the symbol 'σ'. It
tells how much the data values deviate from the mean value.
• It measures the degree of dispersion, or scatter, of the data
points relative to the mean, showing how the values are spread
across the data sample.
• If we get a low standard deviation then it means that the values tend
to be close to the mean whereas a high standard deviation tells us
that the values are far from the mean value.
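A small comparison illustrates the low versus high standard deviation point. The sketch uses `pstdev` and `pvariance` from Python's standard `statistics` module (population versions); the two data sets are hypothetical and share the same mean of 10.

```python
import math
import statistics

tight = [9, 10, 10, 11]   # values close to the mean of 10
wide = [2, 6, 14, 18]     # values far from the mean of 10

sd_tight = statistics.pstdev(tight)
sd_wide = statistics.pstdev(wide)

# Standard deviation is the positive square root of the variance.
assert math.isclose(sd_tight, math.sqrt(statistics.pvariance(tight)))

print(sd_tight < sd_wide)  # True: the tighter data has the lower SD
```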
MEAN DEVIATION
• Used to compute how far, on average, the values in a data
set are from the center point. The mean, median, or mode can
serve as the center point of the data set.
• A simpler measurement of variability than standard
deviation, used when we want the average deviation from
the data's center point.
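The slides give no formula for mean deviation. The sketch below assumes the usual definition: the average of the absolute differences between each value and the chosen center point. The function name, the `center` parameter, and the data are illustrative.

```python
def mean_deviation(values, center=None):
    """Average absolute deviation from a center point (default: the mean)."""
    if not values:
        raise ValueError("mean deviation of empty data is undefined")
    if center is None:
        center = sum(values) / len(values)
    return sum(abs(x - center) for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations, mean = 5

print(mean_deviation(data))              # 1.5  (about the mean)
print(mean_deviation(data, center=4.5))  # about the median, 4.5
```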
QUARTILE DEVIATION
• a statistic that measures the deviation in the middle of the data.
• also referred to as the semi interquartile range and is half of the difference between the
third quartile and the first quartile value.
• The formula for quartile deviation:
Quartile Deviation = (Q3 − Q1) / 2
• Quartile deviation measures the absolute level of dispersion. The relative measure
corresponding to quartile deviation is known as the coefficient of quartile deviation.
Coefficient of Quartile Deviation = (Q3 − Q1) / (Q3 + Q1)
STEPS IN FINDING QUARTILE DEVIATION
1. Arrange the available data in ascending order, for both grouped and
ungrouped data.
2. Find the first quartile value using one of these formulas. For
ungrouped data use the formula Q1 = [(n + 1)/4]th term,
and for grouped data use the formula
Q1 = l1 + [((N/4) − c)/f] × (l2 − l1).
Here n is the number of observations, N is the total frequency, f is the
frequency of the quartile class, c is the cumulative frequency of the
preceding class, and l1, l2 are the lower and upper boundaries of the
quartile class.
3. Find the third quartile Q3 in the same way, then apply
Quartile Deviation = (Q3 − Q1)/2.
EXAMPLE
The given data points are 23, 8, 5, 16, 33, 7, 24, 5, 30, 33, 37, 30, 9, 11,
26, 32
Let us arrange this data in ascending order.
5, 5, 7, 8, 9, 11, 16, 23, 24, 26, 30, 30, 32, 33, 33, 37
From the above data we have Q1 = (8 + 9)/2 = 17/2 = 8.50, and Q3 = (30
+ 32)/2 = 62/2 = 31
Quartile Deviation = (Q3 − Q1)/2
= (31 − 8.5)/2 = 22.50/2 = 11.25
Coefficient of Quartile Deviation = (Q3 − Q1) / (Q3 + Q1)
= (31 − 8.5) / (31 + 8.5)
= 22.5 / 39.5
= 0.57
Therefore, the quartile deviation is 11.25, and the coefficient of quartile
deviation is 0.57.
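The worked example can be reproduced with the Python sketch below. It assumes quartiles are found by splitting the ordered data into a lower and an upper half and taking the median of each, which matches the Q1 and Q3 values above. Other quartile conventions can give slightly different results. The function names are hypothetical.

```python
from statistics import median

def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the data."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # For odd n the middle value belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

def quartile_deviation(q1, q3):
    return (q3 - q1) / 2

def coefficient_of_quartile_deviation(q1, q3):
    return (q3 - q1) / (q3 + q1)

data = [23, 8, 5, 16, 33, 7, 24, 5, 30, 33, 37, 30, 9, 11, 26, 32]
q1, q3 = quartiles_by_halves(data)

print(q1, q3)                                              # 8.5 31.0
print(quartile_deviation(q1, q3))                          # 11.25
print(round(coefficient_of_quartile_deviation(q1, q3), 2)) # 0.57
```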
