Measurement 1


DESCRIPTIVE STATISTICS:

NUMERICAL MEASURES
PART 1
◦ Measures of Central Tendency

◦ Measures of Variability
Central Tendency
In general terms, central tendency is a statistical measure that determines a
single value that accurately describes the center of the distribution and
represents the entire distribution of scores.

The goal of central tendency is to identify the single value that is the best
representative for the entire set of data.
By identifying the "average score," central tendency allows researchers to summarize or
condense a large set of data into a single value.

Thus, central tendency serves as a descriptive statistic because it allows researchers to
describe or present a set of data in a very simplified, concise form.

In addition, it is possible to compare two (or more) sets of data by simply comparing the
average score (central tendency) for one set versus the average score for another set.
Types
◦ Mean
◦ Median
◦ Mode
◦ Percentiles
◦ Quartiles
Mean
The mean is the arithmetic average of the scores.

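Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n is the number of scores in the sample

Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where N is the number of scores in the population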
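
As a small illustration of the arithmetic average, the sketch below computes the mean of the 20 customer satisfaction ratings used in the percentile example of these slides, once by hand and once with Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean

# The 20 customer satisfaction ratings from the percentile example in this deck.
ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]

# Arithmetic average: the sum of the scores divided by the number of scores.
manual_mean = sum(ratings) / len(ratings)

# The standard library gives the same result.
library_mean = mean(ratings)

print(manual_mean, library_mean)
```

The same computation applies to a sample and to a population; only the notation (x̄ and n versus μ and N) differs.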
The Median
If the scores in a distribution are listed in order from smallest to
largest, the median is defined as the midpoint of the list.

The median divides the scores so that 50% of the scores in the
distribution have values that are equal to or less than the median.

Computation of the median requires scores that can be placed in rank
order (smallest to largest) and are measured on an ordinal, interval,
or ratio scale.
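
A minimal sketch of the midpoint definition, assuming numeric scores: sort the list, take the middle score when the count is odd, and average the two middle scores when the count is even. The function name `median_of` is hypothetical; the two data sets are the ones used in the quartile examples of these slides.

```python
def median_of(scores):
    """Return the midpoint of the scores after ranking them from smallest to largest."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle score exists.
        return ordered[mid]
    # Even count: the midpoint lies halfway between the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = [1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27]    # 11 scores
even_example = [2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9]     # 12 scores

print(median_of(odd_example))   # 9
print(median_of(even_example))  # 6.0
```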

Mode

The mode of a data set is the value that occurs with greatest frequency.

The greatest frequency can occur at two or more different values.

If the data have exactly two modes, the data are bimodal.

If the data have more than two modes, the data are multimodal.
The Mode
The mode is defined as the most frequently occurring category or score in the
distribution.

In a frequency distribution graph, the mode is the category or score
corresponding to the peak or high point of the distribution.

The mode can be determined for data measured on any scale of measurement:
nominal, ordinal, interval, or ratio.
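
The sketch below finds the mode or modes with `statistics.multimode` (Python 3.8+), which returns every value tied for the greatest frequency. The `colors` list is made-up nominal data added only to show that the mode does not require numeric scores, and the helper `modality` is a hypothetical name. It assumes a non-empty list.

```python
from statistics import multimode

# Ratings from the customer satisfaction example in this deck.
ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]

# Made-up nominal data (an assumption for illustration only).
colors = ["red", "blue", "red", "green", "blue"]

def modality(values):
    """Label a data set by how many values tie for the greatest frequency."""
    count = len(multimode(values))
    if count == 1:
        return "unimodal"
    if count == 2:
        return "bimodal"
    return "multimodal"

print(multimode(ratings), modality(ratings))  # [8] unimodal
print(multimode(colors), modality(colors))    # ['red', 'blue'] bimodal
```

Note that when every value occurs exactly once, `multimode` reports all of them, so this helper would label such data multimodal even though no value stands out.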

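Positively skewed (right-skewed) distribution: typically Mean > Median > Mode

Negatively skewed (left-skewed) distribution: typically Mean < Median < Mode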
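
The two orderings above are the usual pattern for skewed distributions. As a rough sketch, the helper below (a hypothetical name, and only a rule of thumb rather than a formal skewness test) labels a data set by comparing its mean with its median.

```python
from statistics import mean, median

def describe_shape(scores):
    """Rule-of-thumb label based only on where the mean sits relative to the median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positively skewed (mean > median)"
    if m < md:
        return "negatively skewed (mean < median)"
    return "roughly symmetric (mean = median)"

# Ratings from the customer satisfaction example in this deck.
ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]
shape = describe_shape(ratings)
print(shape)  # negatively skewed (mean > median does not hold: 7.7 < 8)
```

For these ratings the mean (7.7) falls below the median (8), which is pulled down by the few low scores of 1 and 3.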


Percentiles
A percentile (or a centile) is a measure used in statistics indicating the value
below which a given percentage of observations in a data set falls. For example,
the 20th percentile is the value (or score) below which 20% of the observations
may be found.

The pth percentile of the data set is a measurement such that, after the data are
ordered from smallest to largest, at least p% of the data are at or below this
value and at least (100 - p)% are at or above it.

Percentiles and Quartiles
◦ For a set of measurements arranged in increasing order, the pth percentile is a
value such that p percent of the measurements fall at or below the value and
(100 − p) percent of the measurements fall at or above the value

The first quartile Q1 is the 25th percentile
The second quartile (or median) Md is the 50th percentile
The third quartile Q3 is the 75th percentile
Steps for calculating p-th percentile:

Example: 20 customer satisfaction ratings:
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10

Q1 (25th percentile) = (7+8)/2 = 7.5


Q2 = Md (50th percentile)= (8+8)/2 = 8
Q3 (75th percentile)= (9+9)/2 = 9
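
The individual steps are not listed here, but the three results above are consistent with a common index rule, which the sketch below assumes: compute i = (p/100)·n; if i is not a whole number, round up and take the value in that position; if i is a whole number, average the values in positions i and i + 1. The function name `percentile` and the restriction to whole-number p between 1 and 99 are assumptions of this sketch. Other textbooks and software use different conventions and can give slightly different values.

```python
def percentile(data, p):
    """pth percentile by the index rule i = (p/100) * n (assumed convention)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)
    n = len(ordered)
    # Integer arithmetic avoids floating-point surprises when testing for a whole number.
    whole, remainder = divmod(p * n, 100)
    if remainder != 0:
        # i is not a whole number: round up; position whole + 1 is list index whole.
        return ordered[whole]
    # i is a whole number: average the values in positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2

ratings = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]

print(percentile(ratings, 25))  # 7.5
print(percentile(ratings, 50))  # 8.0
print(percentile(ratings, 75))  # 9.0
```

With n = 20, the index for Q1 is i = 5, so the 5th and 6th ordered values (7 and 8) are averaged, which reproduces the 7.5 shown above.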
Example: For the following data, compute
mean, median and mode. Comment on the
shape of the distribution

We often describe a set of measurements by using a five-number
summary. The summary consists of:

1) the smallest measurement
2) the first quartile, Q1
3) the median, Q2
4) the third quartile, Q3
5) the largest measurement
Quartile Measures
Three Quartiles (Q1, Q2 and Q3) split the ranked data into 4 segments with an
equal number of values per segment

  25%   |   25%   |   25%   |   25%
        Q1        Q2        Q3

■ The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
■ Q2 is the same as the median (50% of the observations
are smaller and 50% are larger)
■ Only 25% of the observations are greater than the third
quartile
The Interquartile Range (IQR)

The IQR is Q3 – Q1 and measures the spread in the middle 50% of the data

The IQR is a measure of variability that is not influenced by outliers or extreme
values

Q1, Q3, and IQR measures are called resistant measures because they are not
influenced by the outliers in the dataset
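
To illustrate resistance, the sketch below compares the range and the IQR of a data set before and after its largest score is replaced by an extreme value. The value 90 is hypothetical. Quartiles come from `statistics.quantiles` with `method="inclusive"`, which interpolates between data points, so its Q1 and Q3 can differ slightly from values obtained with other quartile conventions; the point here is only how each measure reacts to the outlier.

```python
from statistics import quantiles

def spread(data):
    """Return the range and the IQR of the data (quartiles by the 'inclusive' method)."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return {"range": max(data) - min(data), "iqr": q3 - q1}

scores = [2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9]

# Hypothetical extreme value replacing the largest score.
with_outlier = scores[:-1] + [90]

before = spread(scores)
after = spread(with_outlier)

print(before["range"], after["range"])  # the range jumps from 7 to 88
print(before["iqr"] == after["iqr"])    # True: the IQR is unchanged
```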
Think of Q1 as a median in the lower half of the data and think of Q3 as a
median for the upper half of data.

Example: 1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27

The median is 9 (the 6th of the 11 values).

Lower half: 1, 2, 5, 6, 7
Upper half: 12, 15, 18, 19, 27

Q1 = 5 and Q3 = 18
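
Treating Q1 and Q3 as medians of the lower and upper halves, the five-number summary and the IQR can be sketched as follows. The function name is hypothetical, and the sketch assumes that when the number of values is odd the overall median is left out of both halves, which is what reproduces Q1 = 5 and Q3 = 18 for the data above. It needs at least two values.

```python
from statistics import median

def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest) using medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

slide_data = [1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 27]
smallest, q1, q2, q3, largest = five_number_summary(slide_data)
iqr = q3 - q1

print(smallest, q1, q2, q3, largest)  # 1 5 9 18 27
print(iqr)                            # 13
```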
Example
Consider the set of data:
2, 3, 3, 4, 5, 6, 6, 7, 8, 8, 8, 9

Find Q1, Q3 and the IQR.

First quartile = 3.5
Median = 6
Third quartile = 8

Thus we see that the interquartile range is 8 – 3.5 = 4.5
The Boxplot or Box and Whisker Diagram
◦ The Boxplot: A graphical display of the data.

Xsmallest -- Q1 -- Median -- Q3 -- Xlargest

Example:

[Figure: boxplot marked at Xsmallest, Q1, Median, Q3 and Xlargest; each of the four segments between these points contains 25% of the data]


Calculating the Interquartile Range from a Boxplot

Example:

[Figure: boxplot in which each of the four segments contains 25% of the data]

X minimum = 12
Q1 = 30
Median (Q2) = 45
Q3 = 57
X maximum = 70

Interquartile range
IQR = 57 – 30 = 27
