Week 3
Arithmetic Mean or Mean: The most familiar measure of central tendency is the
arithmetic mean. It is the descriptive measure most people have in mind when they
speak of the “average.” One may also refer to the arithmetic mean simply as the
mean. The mean is obtained by adding all the values in a population or sample and
dividing by the number of values that are added.
Suppose there are $n$ values $x_1, x_2, \ldots, x_n$ for a variable $X$. Then the mean, denoted by $\bar{x}$, is defined as
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$
For example, with $n = 10$ values:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{123 + 116 + 122 + 110 + 140 + 120 + 125 + 111 + 118 + 117}{10} = 120.2$$
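The calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the function name `arithmetic_mean` is illustrative, and the data are the ten values from the example.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# The ten values from the worked example.
data = [123, 116, 122, 110, 140, 120, 125, 111, 118, 117]
x_bar = arithmetic_mean(data)
print(x_bar)  # 120.2
```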
Weighted Arithmetic Mean (WAM):
Let us consider an example. Suppose a student took final examinations for 5
courses with different credit hours. The scores (percentages) obtained are 70
(3-credit), 60 (4-credit), 75 (3-credit), 80 (2-credit) and 90 (2-credit).
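A minimal sketch of this example in Python, assuming the credit hours serve as the weights, so that the weighted mean is $\sum w_i x_i / \sum w_i$. The function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


scores = [70, 60, 75, 80, 90]   # percentage scores from the example
credits = [3, 4, 3, 2, 2]       # credit hours, used here as weights

wam = weighted_mean(scores, credits)
simple = sum(scores) / len(scores)
print(wam)     # 72.5
print(simple)  # 75.0
```

The weighted mean (72.5) is lower than the unweighted mean (75) because the lowest score, 60, belongs to the course with the most credit hours.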
Theorem:
If a set consists of $n_1$ observations $x_{11}, x_{12}, \ldots, x_{1n_1}$ with mean $\bar{x}_1$ and
a second set consists of $n_2$ observations $x_{21}, x_{22}, \ldots, x_{2n_2}$ with mean
$\bar{x}_2$, then the mean of all the $n_1 + n_2$ observations, called the combined mean or pooled
mean, is given by
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$
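The theorem can be checked numerically. The sketch below uses two small hypothetical data sets (they do not come from these notes) and compares the combined-mean formula with the mean computed directly from all the observations.

```python
import math


def combined_mean(n1, mean1, n2, mean2):
    """Combined (pooled) mean of two groups from their sizes and means."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical data, chosen only to illustrate the formula.
first = [2, 4, 6]
second = [10, 20]

mean1 = sum(first) / len(first)
mean2 = sum(second) / len(second)

pooled = combined_mean(len(first), mean1, len(second), mean2)
direct = sum(first + second) / (len(first) + len(second))

# Both routes should give the same value, up to floating-point rounding.
print(pooled, direct)
print(math.isclose(pooled, direct))
```

Note that the formula needs only the group sizes and group means, not the individual observations.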
Properties of the Mean: The arithmetic mean possesses certain properties, some
desirable and some not so desirable. These properties include the following:
1. Uniqueness. For a given set of data there is one and only one arithmetic
mean.
2. Simplicity. The arithmetic mean is easily understood and easy to compute.
3. Since each and every value in a set of data enters into the computation of
the mean, it is affected by each value. Extreme values (too large or too small
compared to other values), therefore, have an influence on the mean and, in
some cases, can so distort it that it becomes undesirable as a measure of
central tendency.
As an example of how extreme values may affect the mean, consider the following
situation. Suppose the five physicians who practise in an area are surveyed to
determine their charges for a certain procedure. Assume that they report these
charges: $75, $75, $80, $80, and $280. The mean charge for the five physicians is
found to be $118, a value that is not very representative of the set of data as a
whole. The single atypical value had the effect of inflating the mean.
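To see how much the single atypical charge matters, the short sketch below computes the mean with and without the $280 value. Dropping the value is done here only for illustration; it is not a recommended way of handling extreme values.

```python
charges = [75, 75, 80, 80, 280]  # reported charges in dollars

mean_all = sum(charges) / len(charges)

# Leave out the one atypical charge to see its influence on the mean.
typical = [c for c in charges if c != 280]
mean_without = sum(typical) / len(typical)

print(mean_all)      # 118.0
print(mean_without)  # 77.5
```

One value out of five moves the mean from 77.5 to 118, well above four of the five reported charges.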
a) If $n$ is odd: Median = $\left(\frac{n+1}{2}\right)^{\text{th}}$ observation.
b) If $n$ is even: Median = mean of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2} + 1\right)^{\text{th}}$ observations.
Example:
The ages of seven members of a family are given as 12, 7, 2, 34, 17, 21 and 19.
Find the median age.
Example:
The ages of a family of eight members are given as 12, 7, 2, 34, 17, 40, 21 and 19.
Find the median age.
Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34, 40.
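Both examples can be worked with a short Python sketch of the odd/even rule, applied after sorting the values. The function name `median` is illustrative; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Median by the odd/even rule, applied to the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation, which is index mid (0-based).
        return ordered[mid]
    # n even: mean of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


seven = [12, 7, 2, 34, 17, 21, 19]
eight = [12, 7, 2, 34, 17, 40, 21, 19]

print(median(seven))  # 17
print(median(eight))  # 18.0
```

For the family of seven, the 4th ordered value is 17. For the family of eight, the median is the mean of the 4th and 5th ordered values, (17 + 19)/2 = 18.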
Mode: The mode of measurements is the measurement that occurs most
frequently. If all the values are different there is no mode; on the other hand, a set
of values may have more than one mode. When data are in classes, the class with
the highest frequency is the modal class.
For an example of a set of values that has more than one mode, let us consider a
laboratory with 10 employees whose ages are 20, 21, 20, 20, 34, 22, 24, 27, 27,
and 27. We could say that these data have two modes, 20 and 27. The sample
consisting of the values 10, 21, 33, 53, and 54 has no mode since all the values are
different.
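The two cases above can be reproduced with a small Python sketch. The helper `modes` is illustrative and follows the convention used here: when every value occurs only once, it reports no mode. (The standard-library function `statistics.multimode` would instead return every value in that case.)

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # All values are different, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


ages = [20, 21, 20, 20, 34, 22, 24, 27, 27, 27]
print(modes(ages))                  # [20, 27]
print(modes([10, 21, 33, 53, 54]))  # []
```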
Note: The mode is usually used for describing qualitative data. Consider the
following frequency distribution.

Table: Frequency distribution for internet use among first-year students

Internet Usage Category    Frequency
Not internet user          19
One hour or less           11
One to four hours          9
Four to ten hours          8
More than 10 hours         3
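For qualitative data like this, the mode is simply the category with the highest frequency. A minimal Python sketch, with the table entered as a dictionary (the variable names are illustrative):

```python
# Frequencies from the table, keyed by usage category.
usage = {
    "Not internet user": 19,
    "One hour or less": 11,
    "One to four hours": 9,
    "Four to ten hours": 8,
    "More than 10 hours": 3,
}

# The modal category is the key with the largest frequency.
modal_category = max(usage, key=usage.get)
print(modal_category, usage[modal_category])  # Not internet user 19
```

A mean or median of these categories would not be meaningful, which is why the mode is the natural summary here.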
Comparing Mean, Median and Mode
• Bell-shaped distribution:
Mean = Median = Mode
• Right skewed distribution:
Mean > Median > Mode
• Left-skewed distribution:
Mean < Median < Mode
Remark:
Percentiles and Quartiles
Given a set of $n$ observations, the $p$th percentile, $x_p$, is the value of the random variable
$X$ such that $p$ percent or less of the observations are less than $x_p$ and $(100 - p)$
percent or less of the observations are greater than $x_p$. Note that $0 < p < 100$.
- The first quartile, Q1 is the 25th percentile
- The second quartile, Q2 (the median, Md), is the 50th percentile
- The third quartile, Q3 is the 75th percentile
Note:
1. 25% of observations are between Q1 and Q2
2. 25% of observations are between Q2 and Q3
3. 50% of observations are between Q1 and Q3
Steps for calculating the $p$th percentile
1. Arrange the sample in increasing order.
2. Calculate $k = \frac{p}{100} \times n$, where $n$ is the sample size.
3. If $k$ is a fraction (not an integer), then the next integer greater than $k$ gives the
position of the $p$th percentile in the ordered arrangement.
4. If $k$ is an integer, then the $p$th percentile is the average of the measurements in
positions $k$ and $k + 1$ in the ordered arrangement.
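Example: consider the following $n = 20$ ordered observations.
1 3 5 5 7 8 8 8 8 8 8 9 9 9 9 9 10 10 10 10
Q1: k = 25/100 × 20 = 5, so Q1 = (7+8)/2 = 7.5
Q2: k = 50/100 × 20 = 10, so Q2 = (8+8)/2 = 8
Q3: k = 75/100 × 20 = 15, so Q3 = (9+9)/2 = 9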
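The percentile steps can be sketched in Python and checked against the quartiles above. The function name `percentile` is illustrative, and it implements only the rule described in these notes. Statistical software often uses other interpolation rules, so library results (for example from `numpy.percentile`) may differ slightly.

```python
import math


def percentile(values, p):
    """p-th percentile by the position rule k = p/100 * n."""
    if not 0 < p < 100:
        raise ValueError("p must satisfy 0 < p < 100")
    if not values:
        raise ValueError("the data set must not be empty")
    ordered = sorted(values)          # step 1: arrange in increasing order
    n = len(ordered)
    k = p * n / 100                   # step 2: compute k
    if k.is_integer():
        # Step 4: k is an integer, so average positions k and k + 1
        # (1-based), which are indices k - 1 and k (0-based).
        k = int(k)
        return (ordered[k - 1] + ordered[k]) / 2
    # Step 3: k is a fraction, so take the next integer position above k.
    return ordered[math.ceil(k) - 1]


data = [1, 3, 5, 5, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 10]
q1, q2, q3 = (percentile(data, p) for p in (25, 50, 75))
print(q1, q2, q3)  # 7.5 8.0 9.0
```

With $n = 20$, all three quartiles fall under the integer case (k = 5, 10, 15). A value such as p = 33 gives k = 6.6, so the function returns the 7th ordered observation.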
Problem: