
Data Description

The document discusses various measures of central tendency and weighted averages that can be used to summarize numerical data. It defines the mean, median and mode as common measures of central tendency. It also introduces the weighted mean, which involves multiplying each value by its corresponding weight before dividing by the total sum of weights. Several examples are provided to demonstrate how to calculate the arithmetic mean, weighted mean, and solve word problems involving finding the mean of data sets.

Uploaded by

222041
Copyright
© © All Rights Reserved

Summarizing & Describing Numerical Data.

Chapter Outline
• Measures of Central Tendency:
Summarize data using the measures of central tendency,
such as the Mean, Median, Mode, Geometric mean &
Harmonic mean.
• Measures of Variation & Position:
Describe data using the measures of variation, such as
the Range, Quartile deviation, Mean deviation, Variance
and Standard deviation. Identify the position of a data
value in a data set. Chebyshev’s Theorem. Moments.
Moment Ratios. Box-and-Whisker Plot.
• Shape:
Symmetry, skewness and kurtosis; detecting outliers in data.

Rizwan Yusuf Khan, Associate Professor
A single value that is calculated to represent the whole data set is called an
average. Since averages tend to lie in the center of the distribution,
they are called measures of central tendency. They are also known as
measures of location because they locate the center of a distribution.
Types of Averages
The most used measures of central tendency are:
Mathematical averages: (i) The Arithmetic Mean, (iv) The Geometric Mean, (v) The Harmonic Mean
Positional averages: (ii) The Median, (iii) The Mode
The mean is the arithmetic average of a data set. It is found by
adding the numbers in the data set and dividing by how many numbers
there are, i.e. the sum of all the observations divided by the
number of observations. It is abbreviated as A.M and denoted by $\bar{X}$,
that is
Ungrouped data: $\bar{X} = \dfrac{\sum X}{n}$
Grouped data: $\bar{X} = \dfrac{\sum fX}{\sum f}$
Example # 1
The marks obtained by 8 students are given below:
46, 32, 37, 46, 39, 36, 48, 36. Calculate the arithmetic mean.
$\bar{X} = \dfrac{46 + 32 + 37 + 46 + 39 + 36 + 48 + 36}{8} = \dfrac{320}{8} = 40$ marks
Example # 2
Calculate the mean of the data given below.
9, 16, 9, 13, 15, 7, –12, 4, 11, –8.
$\bar{X} = \dfrac{9 + 16 + 9 + 13 + 15 + 7 - 12 + 4 + 11 - 8}{10} = \dfrac{64}{10} = 6.4$
Example # 3
Find the arithmetic mean from the following distribution.
Classes      f     X (midpoint)   fX
0 – 5        4     2.5            10
6 – 11       8     8.5            68
12 – 17      14    14.5           203
18 – 23      3     20.5           61.5
24 – 29      1     26.5           26.5
Total        30    –              369

$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{369}{30} = 12.30$
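The grouped-data calculation in Example # 3 can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `grouped_mean` and its argument layout are assumptions made for the example. Each class is represented by its midpoint, as in the worked solution.

```python
def grouped_mean(class_limits, frequencies):
    """Arithmetic mean of a frequency distribution, using class midpoints.

    class_limits: list of (lower, upper) class limits (hypothetical layout).
    frequencies:  list of class frequencies, in the same order.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    total_f = sum(frequencies)
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / total_f


# Data of Example # 3
classes = [(0, 5), (6, 11), (12, 17), (18, 23), (24, 29)]
freqs = [4, 8, 14, 3, 1]
print(grouped_mean(classes, freqs))  # 369 / 30 = 12.3
```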
Question # 1
Find the arithmetic mean of the set of numbers 45, 48, 50, 55, 40, 32, 41. Ans: 44.43
Question # 2

Calculate the arithmetic mean for the following frequency distribution.


Classes Frequency
11 – 13 1
13 – 15 4
15 – 17 6
17 – 19 10
19 – 21 7
21 – 23 2 Ans: 17.6

Question # 3
Calculate the mean.
Classes 0 – 10 10 – 20 20 – 30 30 – 40
Frequency 10 20 40 30 Ans: 24

Question # 4
The ages of residents of a town are given below. Find the mean.
Age          29 – 33   34 – 38   39 – 43   44 – 48
Frequency    11        12        2         5
Ans: 36.2
Weighted Mean
The weighted mean is found by multiplying each value by its corresponding
weight, adding the products, and dividing by the sum of the weights. It is denoted by $\bar{X}_w$.
$\bar{X}_w = \dfrac{\sum wX}{\sum w}$
Example # 4: You are taking a class in which your
grade is determined from five sources: 50% from your
test mean, 15% from your midterm, 20% from your final
exam, 9% from your computer lab work and 6% from
your homework. Your scores are 86 (test mean), 96
(midterm), 82 (final exam), 98 (computer lab), and 100
(homework). What is the weighted mean of your scores?
Source          X (score)   w (weight)   wX
Test mean       86          0.50         43.00
Midterm         96          0.15         14.40
Final           82          0.20         16.40
Computer Lab    98          0.09         8.82
Homework        100         0.06         6.00
Total           –           1.00         88.62

$\bar{X}_w = \dfrac{\sum wX}{\sum w} = \dfrac{88.62}{1.00} = 88.62$
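As a quick check of Example # 4, the weighted mean can be computed with a few lines of Python. This is only a sketch; the helper name `weighted_mean` is an assumption introduced here, not something defined in the slides.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    total_w = sum(weights)
    if total_w == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_w


# Example # 4: grade components and their weights
scores = [86, 96, 82, 98, 100]
weights = [0.50, 0.15, 0.20, 0.09, 0.06]
print(round(weighted_mean(scores, weights), 2))  # 88.62
```

The same function handles counts as weights, for instance the car prices weighted by the number of cars sold in Example # 5.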
Example # 5 Find the weighted mean price of three models of cars
sold. The number and price of each model sold are shown in the
following list.
Model Number Price
A 8 $10,000
B 10 $12,000
C 12 $8,000
Solution
Number (W)   Price (X)   WX
8            10000       80000
10           12000       120000
12           8000        96000
Total: 30    –           296000

$\bar{X}_w = \dfrac{\sum WX}{\sum W} = \dfrac{296000}{30} = 9866.67$

Question # 5
A recent survey of a new diet cola reported the following percentages of
people who liked the taste. Find the weighted mean of the percentages.
Area   % Favoured   Number surveyed
1      40           1000
2      30           3000
3      50           800
Ans: 35.42
Question # 6
In an advertisement, a retail store stated that its employees
averaged nine years of service. The distribution is shown here.
Number of employees Years of service
8 2
2 6
3 10
Calculate weighted mean for years of service.
Ans: 4.46
Question # 7
The costs of three models of helicopters are shown below. Find the
weighted mean of the costs of the models.
Model Number sold Cost
Sun scraper 9 $427,000
Sky coaster 6 $365,000
Highflyer 12 $725,000
Ans: $545666.67

Question # 8
For the month of April, a checking account has a balance
of $523 for 24 days, $2415 for 2 days, and $250 for 4 days.
What is the account’s mean daily balance for April? Ans: Σw = 30, Σwx = 18382, Mean = 612.7333
Question # 9
A student receives the following grades, with
an A+ worth 4 points, a B worth 3 points,
a C worth 2 points, and a D worth 1 point.
What is the student’s mean grade point score?
B in 2 three-credit classes
D in 1 two-credit class
A+ in 1 four-credit class
C in 1 three-credit class
Grades   A+   B    C   D
X        4    3    2   1
w        4    6    3   2
wX       16   18   6   2
Ans: Σw = 15, ΣwX = 42, Mean = 2.8
Question # 10
The average starting salaries (by degree attained) for
25 employees at a company are given. What is the
mean starting salary for these employees?
8 with MBAs: $92,500
17 with BBAs in business: $68,000
Question # 11
An instructor grades as follows: exams, 20%; term paper, 30%; final exam,
50%. A student had grades of 83, 72, and 90, respectively, for exams, term
paper, and final exam. Find the student’s final average. Use the weighted
mean. Ans: 83.2

Question # 12
An instructor gives four 1-hour exams and one final exam, which counts as
two 1-hour exams. Find weighted mean of a student’s grade if she received
62, 83, 97, and 90 on the hour exams and 82 on the final exam. Ans: 82.67

Question # 13
Using the weighted mean, find the average number of grams of fat in meat
or fish a person would consume over a five-day period if he ate the
following:
Meat or Fish Fat (grams/oz)
3 oz. fried shrimp 3.33
3 oz. veal cutlet (broiled) 3.00
2 oz. roast beef (lean) 2.50
2.5 oz. fried chicken drumstick 4.40
4 oz. tuna (canned in oil) 1.75
Ans: 2.8959
Properties of the Arithmetic Mean
(i) The sum of the deviations taken from the arithmetic mean is zero, i.e.
$\sum_{i=1}^{n} (X_i - \bar{X}) = 0$
(ii) If $n_1$ values have mean $\bar{X}_1$, $n_2$ values have mean $\bar{X}_2$, ...,
$n_k$ values have mean $\bar{X}_k$, then the combined mean (or mean of means)
for all values is denoted by $\bar{X}$ and is given by
$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3 + \dots + n_k\bar{X}_k}{n_1 + n_2 + n_3 + \dots + n_k} = \dfrac{\sum n\bar{X}}{\sum n}$
(iii) The sum of the squares of the deviations taken from the arithmetic mean is
a minimum, i.e. $\sum (X - \bar{X})^2 \le \sum (X - A)^2$,
where ‘A’ is any value other than the mean.
(iv) The mean is affected by a change of origin and of unit of measurement,
i.e. if $Y = aX \pm b$ then $\bar{Y} = a\bar{X} \pm b$,
where ‘a’ and ‘b’ are any two numbers with a ≠ 0.
Example # 6
If $\bar{X} = 3$ and Y = 50 – 5X, then find $\bar{Y}$.
Solution
According to property (iv) of the A.M, $\bar{Y} = 50 - 5\bar{X} = 50 - 5(3) = 35$


Example # 7
Let a variable X have the values 40, 50, 60, 80, and 100. Find the mean of X and
verify that $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$. Also verify that $\sum (X - \bar{X})^2 \le \sum (X - A)^2$ if A = 60.

Solution
X       X – X̄    (X – X̄)²   X – A    (X – A)²
40      –26      676        –20      400
50      –16      256        –10      100
60      –6       36         0        0
80      14       196        20       400
100     34       1156       40       1600
Total: 330   0   2320       –        2500

$\bar{X} = \dfrac{\sum X}{n} = \dfrac{330}{5} = 66$. Hence $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$.
As $\sum (X - \bar{X})^2 = 2320$ and $\sum (X - A)^2 = 2500$,
hence $\sum (X - \bar{X})^2 \le \sum (X - A)^2$.
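Properties (i) and (iii) can also be checked numerically. The short Python sketch below uses the data of Example # 7; the helper name `sum_sq_dev` is hypothetical and introduced only for this illustration.

```python
def sum_sq_dev(data, a):
    """Sum of squared deviations of the data from an arbitrary value a."""
    return sum((x - a) ** 2 for x in data)


values = [40, 50, 60, 80, 100]
mean = sum(values) / len(values)          # 66.0

# Property (i): deviations from the mean sum to zero
sum_dev = sum(x - mean for x in values)
print(sum_dev)                            # 0.0

# Property (iii): squared deviations are smallest about the mean
print(sum_sq_dev(values, mean))           # 2320.0
print(sum_sq_dev(values, 60))             # 2500
```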
Example # 8
The average marks obtained by three sections of first year class are given below:
Sections No. of Students Means
A 30 75
B 25 82
C 17 84
Find the combined mean of the whole class.
Solution
$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3} = \dfrac{\sum n\bar{X}}{\sum n}$
$\bar{X} = \dfrac{30(75) + 25(82) + 17(84)}{30 + 25 + 17} = \dfrac{5728}{72} = 79.56$

Question # 14
Three teachers of statistics reported mean examination grades of 75,
82, and 85 in their classes, which consisted of 30, 25 and 17 students,
respectively. Determine the mean grade for all the classes. Ans: 79.79

The median is the middle number in a data set when the numbers
are listed in either ascending or descending order, i.e. the middle-most value
of an arrayed series. The median divides the data into two
equal parts: fifty percent of the data falls below the median and fifty
percent falls above it. It is denoted by $\tilde{X}$ (X tilde).
When the number of values is odd, the median is the middle
value; when the number of values is even, the median is the mean of
the two middle values.
Example # 9
The prices, in dollars, for a sample of room air conditioners
are listed. Find the median. 180, 201, 220, 191, 219, 209, 186
In order: 180 186 191 201 209 219 220
Median = $\tilde{X}$ = 201
Example # 10
The prices, in dollars, for a sample of room air
conditioners are listed. Find the median. 18, 24, 20, 35, 19, 23, 26, 18
In order: 18 18 19 20 23 24 26 35
$\tilde{X} = \dfrac{20 + 23}{2} = 21.5$
~ h n 
X =l+ f  2 – C
 
l = lower class boundary of the median class, n = total frequency
h = size of class interval of median class
f = frequency of the median class
c = cumulative frequency of the class preceding the median class.
Example # 11
From the following frequency distribution find median.
Classes       f      C      Class Boundaries
100 – 149     10     10     99.5 – 149.5
150 – 199     30     40     149.5 – 199.5
200 – 249     40     80     199.5 – 249.5
250 – 299     20     100    249.5 – 299.5
Total         100    –      –

$\tilde{X} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right)$, with $\dfrac{n}{2} = \dfrac{100}{2} = 50$, l = 199.5, h = 50, f = 40, C = 40.
Median $= 199.5 + \dfrac{50}{40}(50 - 40) = 212$
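The grouped-median formula translates directly into a short routine. The sketch below is illustrative only: the function name `grouped_median` and the use of (lower, upper) class-boundary pairs are assumptions made here. It locates the median class as the first class whose cumulative frequency reaches n/2 and then interpolates inside it.

```python
def grouped_median(boundaries, frequencies):
    """Median of a frequency distribution: l + (h / f) * (n / 2 - C).

    boundaries:  list of (lower, upper) class boundaries.
    frequencies: list of class frequencies, in the same order.
    """
    n = sum(frequencies)
    if n == 0:
        raise ValueError("the distribution is empty")
    half = n / 2
    cumulative = 0  # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= half:
            h = upper - lower
            return lower + (h / f) * (half - cumulative)
        cumulative += f
    raise ValueError("median class not found")


# Example # 11
boundaries = [(99.5, 149.5), (149.5, 199.5), (199.5, 249.5), (249.5, 299.5)]
freqs = [10, 30, 40, 20]
print(grouped_median(boundaries, freqs))  # 212.0
```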
Question # 15
Twelve secretaries were given a typing test, and the
times (in minutes) to complete it were as follows.
8, 12, 15, 9, 6, 8, 10, 9, 8, 6, 7, 8. Find median. Ans: 8
Question # 16
Calculate the median for the following frequency distribution. Ans: 75.081

Grade             30–39   40–49   50–59   60–69   70–79   80–89   90–99
No. of Students   1       3       11      21      43      32      9
Question # 17
Calculate the median for the following frequency distribution. Ans: 109.67

Hourly wages (Rs)   40–60   60–80   80–100   100–120   120–140   140–160   160–180
No. of employees    13      23      101      182       105       19        7
Question # 18
During a quality assurance check, the actual coffee
contents (in ounces) of six jars of instant coffee were
recorded as 6.03, 5.59, 6.40, 6.00, 5.99, and 6.02.
(a) Find the mean and the median of the coffee content.
(b) The third value was incorrectly measured and is
actually 6.04. Find the mean and median of the
coffee content again.
(c) Which measure of central tendency, the mean or the
median, was affected more by the data entry error?
Ans: (a) Mean = 6.005, Median = 6.01 (b) Mean = 5.945, Median = 6.01 (c) Mean
Question # 19

The distances (in yards) for nine holes of a golf course


are listed. 336 393 408 522 147 504 177 375 360
(a) Find the mean and median of the data.
(b) Convert the distances to feet. Then rework part (a).
(c) Compare the measures you found in part (b) with
those found in part (a). What do you notice?
(d) Use your results from part (c) to explain how to find
quickly the mean and median of the given data set if
the distances are measured in inches.
Ans: (a) Mean = 358, Median = 375 (b) Mean = 1074, Median = 1125 (c) The mean and median in part (b) are three times the mean and
median in part (a). (d) If you multiply the mean and median from part (b) by 12, you will get the mean and median of the data set in inches.
(Mean = 1074 × 12 = 12888, Median = 1125 × 12 = 13500)
Quantiles
Quartiles, deciles, percentiles and other values obtained by
subdividing a given set of data are collectively known as quantiles.

Quartiles
Quartiles are not a measure of central tendency. They split the ordered data into 4 quarters:

| 25% | 25% | 25% | 25% |
      Q1    Q2    Q3

Example # 12
From the following data find Q1, Q2 and Q3.
17, 15, 9, 7, 5, 13, 14, 16, 11, 10, 18.

Data in Ordered Array: 5 7 9 10 11 13 14 15 16 17 18
Q1 = 9, Q2 = 13, Q3 = 16
Example # 13
From the data find Q1, Q2 and Q3. 8, 6, 5, 4, 3, 9, 14, 16.
Data in Ordered Array: 3 4 5 6 8 9 14 16
Q1 = 4.5, Q2 = 7, Q3 = 11.5

Example # 14
From the data find Q1, Q2 and Q3. 2, 3, 5, 4, 6, 9, 7, 8, 10.
Solution
Data in Ordered Array: 2 3 4 5 6 7 8 9 10
Q1 = 3.5, Q2 = 6, Q3 = 8.5

Example # 15
From the data find Q1, Q2 and Q3. 2, 3, 5, 4, 6, 9, 7, 8, 10, 11.
Solution
Data in Ordered Array: 2 3 4 5 6 7 8 9 10 11
Q1 = 4, Q2 = 6.5, Q3 = 9

Example # 16
From the data find Q1, Q2 and Q3. 8, 3, 5, 4, 6, 7, 2.
Solution
Data in Ordered Array: 2 3 4 5 6 7 8
Q1 = 3, Q2 = 5, Q3 = 7

Example # 17
From the data find Q1, Q2 and Q3. 3, 5, 4, 6, 7, 2.
Solution
Data in Ordered Array: 2 3 4 5 6 7
Q1 = 3, Q2 = 4.5, Q3 = 6
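The quartile values in Examples 13 to 17 are consistent with the "median of each half" rule: Q2 is the median of the ordered data, Q1 is the median of the values below Q2, and Q3 is the median of the values above it (the middle value itself is left out when n is odd). The Python sketch below implements that rule; the function names are assumptions for illustration. Library routines often use other interpolation conventions, so their results can differ slightly from these.

```python
def median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


print(quartiles([8, 6, 5, 4, 3, 9, 14, 16]))     # (4.5, 7.0, 11.5)
print(quartiles([2, 3, 5, 4, 6, 9, 7, 8, 10]))   # (3.5, 6, 8.5)
```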
The mode is the value that occurs most often in a data set. A
distribution having only one mode is called a uni-modal distribution; a
distribution having two modes is called a bimodal distribution and a
distribution having more than two modes is called a multi-modal
distribution.
Example # 18 Identify the mode for each of the following lists of numbers.
(i) 2, 4, 5, 5, 5, 6, 9, 16, 22
(ii) 4, 9, 3, 5, 4, 6, 9, 3, 4, 11, 9
(iii) 18, 22, 9, 4, 6, 12
(i) Since 5 occurs the maximum number of times, mode = 5.
(ii) Since 4 and 9 both occur the maximum (and equal) number of times, the modes are 4 and
9. A distribution with two modes is called a bimodal distribution.
(iii) Here the mode does not exist, because each number occurs only once.

For grouped data:
$\text{Mode} = l + \dfrac{f_m - f_1}{2f_m - f_1 - f_2} \times h$
OR
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
l = lower class boundary of the modal class,
fm = frequency of the modal class (the maximum frequency),
f1 = frequency of the class preceding the modal class,
f2 = frequency of the class following the modal class,
h = class interval size of the modal class.
Example # 19
From the following frequency distribution find mode.
Classes       f      Class Boundaries
100 – 149     10     99.5 – 149.5
150 – 199     30     149.5 – 199.5
200 – 249     40     199.5 – 249.5
250 – 299     20     249.5 – 299.5

fm = 40, f1 = 30, f2 = 20, h = 50, l = 199.5
$\text{Mode} = l + \dfrac{f_m - f_1}{2f_m - f_1 - f_2} \times h = 199.5 + \dfrac{40 - 30}{2(40) - 30 - 20} \times 50 = 216.17$
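The grouped-mode formula can be sketched in Python as below. The function name `grouped_mode` is an assumption for this illustration, and so is the choice to treat a missing neighbouring class (when the modal class is the first or last one) as having frequency 0. The sketch also assumes a single modal class.

```python
def grouped_mode(boundaries, frequencies):
    """Mode of a frequency distribution: l + (fm - f1) / (2*fm - f1 - f2) * h."""
    i = frequencies.index(max(frequencies))      # position of the modal class
    fm = frequencies[i]
    # Assumption: a neighbour that does not exist counts as frequency 0.
    f1 = frequencies[i - 1] if i > 0 else 0
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    lower, upper = boundaries[i]
    h = upper - lower
    return lower + (fm - f1) / (2 * fm - f1 - f2) * h


# Example # 19
boundaries = [(99.5, 149.5), (149.5, 199.5), (199.5, 249.5), (249.5, 299.5)]
freqs = [10, 30, 40, 20]
print(round(grouped_mode(boundaries, freqs), 2))  # 216.17
```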
Question # 20
Calculate the mode for the following frequency distribution. Ans: 75.82

Classes 30–39 40–49 50–59 60–69 70–79 80–89 90–99


Frequency 2 3 11 20 32 25 7
Question # 21
Calculate the mode for the following frequency distribution. Ans: 18.14

Classes Frequency
11 – 13 1
13 – 15 4
15 – 17 6
17 – 19 10
19 – 21 7
21 – 23 2
Question # 22
The weights of the 40 students at a university are given in the following frequency
table.
Weight 118–126 127–135 136–144 145–153 154–162 163–171 172–180
Frequency 3 5 9 12 5 4 2
Calculate the arithmetic mean, median, mode, Q1, Q3 and IQR.
Ans: 146.975, 146.75, 147.2, 137.5, 155.3, 17.8
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - C\right)$, $Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - C\right)$
Empirical Relation between Mean, Median and Mode
For a moderately skewed distribution there exists an empirical
relationship among the mean, median and mode:
Mean – Mode = 3(Mean – Median)
which can be rearranged as
Mode = 3Median – 2Mean
Example # 20
In a symmetrical distribution, mean is 40, what is the median and mode?
Solution
In a symmetrical distribution the mean, median and mode coincide. Hence the
mean, median and mode are also 40.
If Mean = 75 and Mode = 70, use the empirical relation to find the value of the Median.

Solution
From the empirical relation between the mean, median and mode:
Mode = 3Median – 2Mean
70 = 3Median – 2(75)
3Median = 70 + 150 = 220
$\text{Median} = \dfrac{220}{3} = 73.33$
Question # 23
Find the mode from the following distribution.
Classes      0 – 5   6 – 11   12 – 17   18 – 23   24 – 29
Frequency    4       14       14        6         2
Ans: Mean = 12.7, Median = 12.36, Mode = 3Median – 2Mean = 11.68
Choosing between Mean, Median, and Mode
• The mean is essential when calculating certain (traditional) statistics.
• For descriptive purposes, the median is often the most versatile and
useful measure of central tendency, especially when a distribution is
skewed.
• If you are dealing with nominal level data, the mode is the only useful
measure of central tendency.
• For small groups (fewer than 10 values), it is better to use the median than the
mean (because an extreme score can drag the mean in its direction).
Question # 24
Calculate the mean, median, mode, Q1, Q3 and D6 for the following frequency
distribution.
Classes     Frequency
10 – 20     2
21 – 31     8
32 – 42     15
43 – 53     7
54 – 64     10
65 – 75     3
$D_6 = l + \dfrac{h}{f}\left(\dfrac{6n}{10} - C\right)$
Ans: 42.87, 40.67, 36.63, 32.62, 55.43, 45.64
Example # 21
The arithmetic mean of 20 values is 25. By adding 4 more values the mean becomes
30. Find the four values if the ratio between these values is 1 : 2 : 3 : 4.
Solution
Since the mean of 20 values is 25, their sum is $\sum X = n\bar{X} = 20 \times 25 = 500$.
On adding 4 more values, n = 24 and $\bar{X} = 30$.
Sum of 24 values: $\sum X = 24 \times 30 = 720$
Sum of the newly added values = 720 – 500 = 220
Since the values are in the ratio 1 : 2 : 3 : 4, the sum of the ratio terms is 10 and the values are:
$220 \times \dfrac{1}{10} = 22$, $220 \times \dfrac{2}{10} = 44$, $220 \times \dfrac{3}{10} = 66$, $220 \times \dfrac{4}{10} = 88$.
Hence the newly added values are 22, 44, 66, and 88.
Example # 22
Suppose Q1 = 70, Q2 = Median = 80 and Q3 = 84. Interpret the significance of
each.
Solution
Twenty five percent scored 70 or lower (or seventy five percent scored 70 or
higher); fifty percent scored 80 or lower (or fifty percent scored 80 or higher);
seventy five percent scored 84 or lower (or twenty five percent scored 84 or higher).
Example # 23
(a) The mean of 15 values is 10. If one more value is included, the mean becomes
12. Find the value which is included.
(b) The mean of n values is 13. If a new value 30 is included, the mean becomes
14. Find the value of n.
Solution
(a) $\bar{X} = \dfrac{\sum X}{n} = \dfrac{\sum X}{15} = 10$ or ΣX = 10(15) = 150
Let A denote the value which is included; then the mean of 16 values is
$\dfrac{\sum X + A}{n + 1} = \dfrac{\sum X + A}{15 + 1} = 12$ or ΣX + A = 12(16) = 192 or
A = 192 – ΣX = 192 – 150 = 42. Hence the included value is 42.
(b) $\bar{X} = \dfrac{\sum X}{n} = 13$ or ΣX = 13n
When 30 is included, the total of (n + 1) values becomes ΣX + 30.
Thus, mean of n + 1 values $= \dfrac{\sum X + 30}{n + 1} = 14$ or
ΣX + 30 = 14(n + 1) = 14n + 14 or ΣX = 14n + 14 – 30 = 14n – 16
13n = 14n – 16 (since ΣX = 13n)
14n – 13n = 16 or n = 16
Example # 24
If the sum of 15 values is 300 and by addition of two more values it becomes 360,
find the new values if the ratio between them is 1 : 4.
Solution
n = 15, ΣX = 300. If n = 17, ΣX = 360
Sum of the newly added values = Sum of 17 values – Sum of 15 values
= 360 – 300
Sum of the newly added values = 60
Since the values are in the ratio 1 : 4, the sum of the ratio terms is 5 and the values are
$60 \times \dfrac{1}{5} = 12$ and $60 \times \dfrac{4}{5} = 48$.

Example # 25

Question # 25
Which of the averages would be suitable for each of the following?
(i) Heights of students.
(ii) Dress and shoe sizes.
(iii) Half of the factory workers make more than $5.37 per hour and
half make less than $5.37 per hour.
(iv) Number of tomatoes on plants.
(v) Comparison of Intelligence.
(vi) Per capita income in Pakistan.
(vii) Marks obtained in any examination.
(viii) The distribution is open-ended.
(ix) The data is categorical.
(x) The average person cuts the lawn once a week.
(xi) The most common fear today is fear of speaking in public.
(xii) The average age of university professors is 42.3 years.

Ans: (i) Mean (ii) Mode (iii) Median (iv) Mode (v) Median or Mean (vi) Mean (vii) Mean or Median (viii) Median (ix) Mode
(x) Mode (xi) Mode (xii) Mean

Example # 26
The data shown consist of the number of games
played each year in the career of Baseball Hall of
Famer Bill Mazeroski. Check for outliers.
81 148 152 135 151 152 159 142 34 162 130
162 163 143 67 112 70.

Q1 = 96.5, Q3 = 155.5
Step 1. Find the interquartile range (IQR): IQR = Q3 – Q1 = 155.5 – 96.5 = 59
Step 2. Multiply the IQR by 1.5: 59 × 1.5 = 88.5
Step 3. Subtract the value obtained in Step 2 from Q1 and add it to Q3:
Q1 – 88.5 = 96.5 – 88.5 = 8
Q3 + 88.5 = 155.5 + 88.5 = 244
Any value less than 8 or above 244 is considered an outlier. In
this case there are no outliers.
If there are two or more outliers, then reject normality.
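The 1.5 × IQR rule used in Example # 26 can be sketched in Python as follows. The function names (`iqr_fences`, `find_outliers`) are assumptions for this illustration, and the quartiles are taken as the medians of the lower and upper halves of the ordered data, which reproduces Q1 = 96.5 and Q3 = 155.5 for this data set.

```python
def median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 == 1 else (ordered[mid - 1] + ordered[mid]) / 2


def iqr_fences(data):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + 1:] if n % 2 == 1 else ordered[half:])
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def find_outliers(data):
    """Values lying outside the two fences."""
    low, high = iqr_fences(data)
    return [x for x in data if x < low or x > high]


games = [81, 148, 152, 135, 151, 152, 159, 142, 34, 162, 130,
         162, 163, 143, 67, 112, 70]
print(iqr_fences(games))      # (8.0, 244.0)
print(find_outliers(games))   # []
```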
Question # 26
Find the outlier.
12, 6, 5, 15, 13, 22, 50, 18. Ans: Q1 = 9, Q3 = 20, Outlier = 50

Question # 27
Check each data set for outliers.
a. 16, 18, 22, 19, 3, 21, 17, 20 b. 24, 32, 54, 31, 16, 18, 19, 14, 17, 20
c. 321, 343, 350, 327, 200 d. 88, 72, 97, 84, 86, 85, 100
e. 145, 119, 122,118, 125, 116 f. 14, 16, 27, 18, 13, 19, 38, 15, 20
Ans: (a) Outlier 3 (b) 54 (c) No outlier (d) No outlier (e) 145 (f) 38

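Geometric Mean
The geometric mean G.M of a set of positive values X1, X2, …, Xn is
defined as the positive nth root of their product, i.e.
$G.M = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n} = (X_1 \times X_2 \times \dots \times X_n)^{1/n}$
or, using logarithms,
Ungrouped data: $G.M = \text{Antilog}\left(\dfrac{\sum \log X}{n}\right)$
Grouped data: $G.M = \text{Antilog}\left(\dfrac{\sum f \log X}{\sum f}\right)$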
The geometric mean applies only to positive numbers. It is often used for
a set of numbers whose values are meant to be multiplied together or are
exponential in nature, such as data on the growth of the human population
or interest rates of a financial investment. It is also used in certain financial
and stock market indexes, such as Financial Time’s Value Line Geometric
index.
Example # 27
Find the G.M of the values 2, 9, 12.
$G.M = \sqrt[3]{2 \times 9 \times 12} = \sqrt[3]{216} = (216)^{1/3} = 6$
Example # 28
Find the geometric mean of the values 7.96, 13.82, 24.14, 30.27, 37.44
X        log X
7.96     0.9009
13.82    1.1405
24.14    1.3827
30.27    1.4810
37.44    1.5733
Total    6.4784

$G.M = \text{Antilog}\left(\dfrac{\sum \log X}{n}\right) = \text{Antilog}\left(\dfrac{6.4784}{5}\right) = \text{Antilog}(1.29568) = 19.76$
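The logarithmic form of the geometric mean is easy to reproduce in Python. The sketch below is illustrative; the function name `geometric_mean` and its optional `frequencies` argument are assumptions introduced here so that one routine covers both the ungrouped and the grouped formula.

```python
import math


def geometric_mean(values, frequencies=None):
    """G.M = antilog( sum(f * log10(X)) / sum(f) ); f = 1 for ungrouped data."""
    if frequencies is None:
        frequencies = [1] * len(values)
    if any(x <= 0 for x in values):
        raise ValueError("the geometric mean needs positive values")
    total_f = sum(frequencies)
    log_sum = sum(f * math.log10(x) for f, x in zip(frequencies, values))
    return 10 ** (log_sum / total_f)


# Example # 27: approximately 6
print(geometric_mean([2, 9, 12]))
# Example # 28: approximately 19.76
print(geometric_mean([7.96, 13.82, 24.14, 30.27, 37.44]))
```

For grouped data, the class midpoints are passed as `values` together with the class frequencies.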
Example # 29
Find the geometric mean for the following distribution.
Classes      0 – 30   30 – 50   50 – 80   80 – 100
Frequency    20       30        40        10

Classes      f      X     log X     f log X
0 – 30       20     15    1.1761    23.5220
30 – 50      30     40    1.6021    48.0630
50 – 80      40     65    1.8129    72.5160
80 – 100     10     90    1.9542    19.5420
Total        100    –     –         163.6432

$G.M = \text{Antilog}\left(\dfrac{\sum f \log X}{\sum f}\right) = \text{Antilog}\left(\dfrac{163.6432}{100}\right) = 43.29$
Question # 28
Find the geometric mean. 8, 12, 16, 18, 22, 24 Ans: 15.63

Question # 29
Given the following frequency distribution of weights, calculate the
geometric mean. Ans: 117.7

Weight   65–84   85–104   105–124   125–144   145–164   165–184   185–204
f        9       10       17        10        5         4         5
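Harmonic Mean
The harmonic mean, H.M, is the reciprocal of the arithmetic mean of the reciprocals of
the observations of a set. In symbols
Ungrouped data: $H.M = \dfrac{n}{\sum \left(\dfrac{1}{X}\right)}$
Grouped data: $H.M = \dfrac{\sum f}{\sum \left(\dfrac{f}{X}\right)} = \dfrac{\sum f}{\sum f\left(\dfrac{1}{X}\right)}$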
Harmonic means are often used in averaging things like rates (e.g. the
average travel speed over several trips of equal distance). The weighted
harmonic mean is used in finance to average multiples like the price-earnings
ratio because it gives equal weight to each data point.
Example # 30
Find the H.M of the values 2, 6, and 8.
$H.M = \dfrac{3}{\dfrac{1}{2} + \dfrac{1}{6} + \dfrac{1}{8}} = \dfrac{3}{0.5 + 0.167 + 0.125} = \dfrac{3}{0.792} = 3.788$
Example # 31
Find the harmonic mean for the following distribution.
Weights      20–40   41–61   62–82   83–103   104–124
Frequency    8       7       10      6        4

Weights      f     X      1/X       f(1/X)
20 – 40      8     30     0.0333    0.2664
41 – 61      7     51     0.0196    0.1372
62 – 82      10    72     0.0139    0.1390
83 – 103     6     93     0.0108    0.0648
104 – 124    4     114    0.0088    0.0352
Total        35    –      –         0.6426

$H.M = \dfrac{\sum f}{\sum f\left(\dfrac{1}{X}\right)} = \dfrac{35}{0.6426} = 54.47$
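A direct Python version of the harmonic-mean formulas is sketched below. The function name `harmonic_mean` and the optional `frequencies` argument are assumptions made for the illustration. Because the code keeps full precision for the reciprocals, its results can differ in the last digit from hand calculations that round each reciprocal to three or four decimals (for example about 3.789 rather than 3.788 for the values 2, 6, 8).

```python
def harmonic_mean(values, frequencies=None):
    """H.M = sum(f) / sum(f / X); f = 1 for ungrouped data."""
    if frequencies is None:
        frequencies = [1] * len(values)
    if any(x <= 0 for x in values):
        raise ValueError("this sketch expects positive values")
    total_f = sum(frequencies)
    return total_f / sum(f / x for f, x in zip(frequencies, values))


# Example # 30 (ungrouped)
print(round(harmonic_mean([2, 6, 8]), 3))
# Example # 31 (grouped, using class midpoints)
print(round(harmonic_mean([30, 51, 72, 93, 114], [8, 7, 10, 6, 4]), 2))
```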
Question # 30
Find the harmonic mean. 4, 5, 7, 9, 20. Ans: 6.63

Question # 31
Given the following frequency distribution, calculate the harmonic mean.
Hourly wages (Rs)    40 – 50   50 – 60   60 – 70   70 – 80   80 – 90
Number of persons    4         8         16        8         4
Ans: 63.05
Question # 32
Calculate geometric mean and harmonic mean from the following distribution.
X   37   42   47   52   57   62   67
f   15   13   17   29   11   10   5
Ans: 49.18, 48.47
Question # 33
Find the average rate of increase of income, if the income of a worker
increases by 25% during the 1st year, 40% during the 2nd year and 50% during
the 3rd year. Ans: percentage increase 37.95%

Question # 34
Find average rate of increase in population, which in the first decade
increased 20%, in the next 25% and in the third 44%. Ans: percentage increase 29.27%

Question # 35
Find out the average rate of motion in the case of a person who rides the
first mile at the rate of 10 miles per hour, the next mile at the rate of 8 miles
per hour, and the third mile at the rate of 6 miles per hour. Ans: 7.66 m.p.h.

Question # 36
Find out the average speed of person who rides the first mile at the rate of 8
miles an hour, the next mile at the rate of 7.5 miles an hour, and the third mile
at the rate of 5.5 miles an hour. Ans: 6.8 m.p.h.

Question # 37
A man gets a rise of 20% in salary at the end of his first year of service and
further rises of 30% and 35% at the end of second and third years respectively;
the rise in each year being calculated on his salary at the beginning of the year.
To what average annual percentage increase is this equivalent? Ans: 28.18%

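Relation between Arithmetic Mean, Geometric Mean and Harmonic Mean
The general relation between AM, GM and HM for a set of positive values is:
A.M ≥ G.M ≥ H.M or H.M ≤ G.M ≤ A.M
The three means are equal only when all the values are equal; otherwise
A.M > G.M > H.M.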
Question # 38
Find the arithmetic mean, geometric mean and harmonic mean for the following
data and verify that A.M > G.M > H.M.
Marks             0 – 10   10 – 20   20 – 30   30 – 40   40 – 50
No of students    5        10        15        7         3
Ans: Mean = 23.25, GM = 19.95, HM = 15.75, so A.M > G.M > H.M.
Measures of Dispersion
Measures of dispersion describe how the
observations in a data set are scattered or spread out. There are two types:
(i) Absolute measures of dispersion
(ii) Relative measures of dispersion.

Absolute Measures of Dispersion
When the dispersion is measured in the same units as the original
data, it is called an absolute measure of dispersion. It cannot be used to
compare two sets of data measured in different units. The main measures of absolute
dispersion are:
(i) The Range
(ii) The Quartile Deviation or the Semi-Interquartile Range
(iii) The Mean Deviation or the Average Deviation
(iv) The Variance and the Standard Deviation.
Relative Measures of Dispersion
When the dispersion is expressed as a ratio, percentage or
coefficient and is free from any units of measurement, it is called a relative
measure of dispersion. Each absolute measure of dispersion can be
converted into its relative measure. Relative dispersion is defined as:
Relative dispersion = Absolute dispersion / Average. The main measures of
relative dispersion are:
(i) Coefficient of range or range coefficient of dispersion.
(ii) Coefficient of quartile deviation.
(iii) Coefficient of mean deviation or mean coefficient of dispersion.
(iv) Coefficient of variation.
Range
Range is defined as the difference between the largest and smallest
observations. Symbolically, the range is given by the relation: R = Xm – Xo
Where Xm denotes the largest observation and Xo denotes the smallest
observation in the data set.
In case of grouped data, the range is the difference between the upper
boundary of the highest class and the lower boundary of the lowest class.
Coefficient of Range
It is a relative measure of dispersion based on the value of the range; it is
defined by the following relation.
$\text{Coefficient of Range} = \dfrac{X_m - X_o}{X_m + X_o}$
Example # 32
Weights of eight students are recorded below:
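84, 87, 91, 67, 60, 74, 69, 84. Find the range and coefficient of range.
Here Xm = 91 and Xo = 60, so R = Xm – Xo = 91 – 60 = 31 and
$\text{Coefficient of Range} = \dfrac{91 - 60}{91 + 60} = \dfrac{31}{151} = 0.205 = 20.5\%$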

Example # 33
Find the range and coefficient of range from the following frequency
distribution.
Classes      2 – 4   4 – 6   6 – 8   8 – 10   10 – 12
Frequency    4       5       8       2        1

Classes      f    X
2 – 4        4    3
4 – 6        5    5
6 – 8        8    7
8 – 10       2    9
10 – 12      1    11

Xm = 12, Xo = 2
R = Xm – Xo = 12 – 2 = 10
$\text{Coefficient of Range} = \dfrac{X_m - X_o}{X_m + X_o} = \dfrac{12 - 2}{12 + 2} = 0.71 = 71\%$

Quartile Deviation or Semi-Interquartile Range
It depends on the upper quartile Q3 and the lower quartile Q1. The difference
Q3 – Q1 is called the Interquartile Range (IQR). Half the difference
between Q3 and Q1 is known as the quartile deviation (Q.D) or semi-interquartile
range (SIQR).
Symbolically, Q.D is given by the relation $Q.D = \dfrac{Q_3 - Q_1}{2}$

Coefficient of Quartile Deviation
The relative measure of dispersion that depends on the quartile deviation is
called the coefficient of quartile deviation. It is defined as:
$\text{Coefficient of Q.D} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Example # 34
Compute Quartile Deviation and coefficient of Quartile deviation from the
given data: 40, 45, 55, 60, 64, 71, 52. Also find IQR.

Data in ordered array: 40, 45, 52, 55, 60, 64, 71. Here Q1 = 45 and Q3 = 64.
$Q.D = \dfrac{Q_3 - Q_1}{2} = \dfrac{64 - 45}{2} = 9.5$
$\text{Coefficient of Quartile Deviation} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{64 - 45}{64 + 45} = 0.174 = 17.4\%$
IQR = Q3 – Q1 = 64 – 45 = 19
Example # 35
Compute Quartile Deviation and coefficient of Quartile deviation from the
following frequency distribution.
Classes      2 – 4   4 – 6   6 – 8   8 – 10   10 – 12
Frequency    4       5       8       2        1

Classes      f     C
2 – 4        4     4
4 – 6        5     9
6 – 8        8     17
8 – 10       2     19
10 – 12      1     20
Total        20    –

$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - C\right)$, with $\dfrac{n}{4} = \dfrac{20}{4} = 5$, l = 4, h = 2, f = 5, C = 4.
$Q_1 = 4 + \dfrac{2}{5}(5 - 4) = 4.4$
$Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - C\right)$, with $\dfrac{3n}{4} = \dfrac{3 \times 20}{4} = 15$, l = 6, h = 2, f = 8, C = 9.
$Q_3 = 6 + \dfrac{2}{8}(15 - 9) = 7.5$
$Q.D = \dfrac{Q_3 - Q_1}{2} = \dfrac{7.5 - 4.4}{2} = 1.55$
$\text{Coefficient of Quartile Deviation} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{7.5 - 4.4}{7.5 + 4.4} = 0.261$
Coefficient of Quartile Deviation = 26.1%
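The quartile formulas for grouped data can be checked with a short Python sketch. The names `grouped_quantile` and `quartile_deviation` are assumptions made for this illustration. For the distribution of Example # 35 the Q3 class is 6 – 8, so l = 6 and the formula gives Q3 = 6 + (2/8)(15 – 9) = 7.5, which lies inside the range of the data as a quartile must.

```python
def grouped_quantile(boundaries, frequencies, fraction):
    """l + (h / f) * (fraction * n - C) for the class containing the quantile."""
    n = sum(frequencies)
    target = fraction * n
    cumulative = 0  # C: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, frequencies):
        if f > 0 and cumulative + f >= target:
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("quantile class not found")


def quartile_deviation(boundaries, frequencies):
    """Return (Q1, Q3, Q.D, coefficient of Q.D)."""
    q1 = grouped_quantile(boundaries, frequencies, 0.25)
    q3 = grouped_quantile(boundaries, frequencies, 0.75)
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


boundaries = [(2, 4), (4, 6), (6, 8), (8, 10), (10, 12)]
freqs = [4, 5, 8, 2, 1]
q1, q3, qd, coef = quartile_deviation(boundaries, freqs)
print(round(q1, 2), round(q3, 2), round(qd, 2), round(coef, 3))
```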

Mean Deviation or Average Deviation


It is defined as the arithmetic mean of the deviations measured either
from the mean, median or mode, all deviations being counted as positive.
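The symbolic definition of the mean deviation (M.D) is:

Types of Data     Mean Deviation about Mean                                  Mean Deviation about Median
Ungrouped data    $M.D = \dfrac{\sum |X - \bar{X}|}{n}$                      $M.D = \dfrac{\sum |X - \text{Median}|}{n}$
Grouped data      $M.D = \dfrac{\sum f|X - \bar{X}|}{\sum f}$                $M.D = \dfrac{\sum f|X - \text{Median}|}{\sum f}$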

Coefficient of Mean Deviation
The relative measure of dispersion that depends on the mean deviation is called
the coefficient of mean deviation or the mean coefficient of
dispersion. It is given by
$\text{Coefficient of M.D (mean)} = \dfrac{\text{M.D from Mean}}{\text{Mean}}$
$\text{Coefficient of M.D (median)} = \dfrac{\text{M.D from Median}}{\text{Median}}$
Example # 36
Calculate the mean deviation from
(i) The mean
(ii) The Median
Also calculate the coefficient of M.D about mean and median from the
given data. 36, 48, 45, 32, 37, 46, 39, 36 and 41.

(i) $\bar{X} = \dfrac{36 + 48 + 45 + 32 + 37 + 46 + 39 + 36 + 41}{9} = \dfrac{360}{9} = 40$
(ii) Data in an array: 32, 36, 36, 37, 39, 41, 45, 46, 48. Median = 39

X        |X – X̄|    |X – Median|
36       4          3
48       8          9
45       5          6
32       8          7
37       3          2
46       6          7
39       1          0
36       4          3
41       1          2
Total    40         39

(i) $M.D = \dfrac{\sum |X - \bar{X}|}{n} = \dfrac{40}{9} = 4.44$
$\text{Coefficient of M.D (mean)} = \dfrac{\text{M.D from Mean}}{\text{Mean}} = \dfrac{4.44}{40} = 0.111$ or 11.1%
(ii) $M.D = \dfrac{\sum |X - \text{Median}|}{n} = \dfrac{39}{9} = 4.33$
$\text{Coefficient of M.D (median)} = \dfrac{\text{M.D from Median}}{\text{Median}} = \dfrac{4.33}{39} = 0.111$ or 11.1%
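The mean-deviation calculation for the nine values of Example # 36 can be sketched in Python as below. The helper name `mean_deviation` is an assumption for this illustration; `statistics.mean` and `statistics.median` come from the Python standard library. With n = 9 values the mean is 360 / 9 = 40.

```python
import statistics


def mean_deviation(data, center):
    """Average absolute deviation of the data from a chosen center."""
    return sum(abs(x - center) for x in data) / len(data)


data = [36, 48, 45, 32, 37, 46, 39, 36, 41]
mean = statistics.mean(data)       # 40
median = statistics.median(data)   # 39

md_mean = mean_deviation(data, mean)        # 40 / 9
md_median = mean_deviation(data, median)    # 39 / 9
print(round(md_mean, 2), round(md_mean / mean, 3))
print(round(md_median, 2), round(md_median / median, 3))
```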
Example # 37
Compute mean deviation from mean and median from the following
frequency distribution. Also calculate the coefficient of mean deviation
from mean and median.
Daily wages (Rs)   200 – 250   250 – 300   300 – 350   350 – 400   400 – 450
No of persons      10          20          40          20          10

$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{32500}{100} = 325$, $\text{Median} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right) = 300 + \dfrac{50}{40}(50 - 30) = 325$

Daily wages   f      X      fX       C      |X – X̄|   f|X – X̄|   |X – X̃|   f|X – X̃|
200 – 250     10     225    2250     10     100       1000       100       1000
250 – 300     20     275    5500     30     50        1000       50        1000
300 – 350     40     325    13000    70     0         0          0         0
350 – 400     20     375    7500     90     50        1000       50        1000
400 – 450     10     425    4250     100    100       1000       100       1000
Total         100    –      32500    –      –         4000       –         4000

$M.D = \dfrac{\sum f|X - \bar{X}|}{\sum f} = \dfrac{4000}{100} = 40$
$\text{Coefficient of M.D (mean)} = \dfrac{\text{M.D from Mean}}{\text{Mean}} = \dfrac{40}{325} = 0.123$ or 12.3%
$M.D = \dfrac{\sum f|X - \text{Median}|}{\sum f} = \dfrac{4000}{100} = 40$
$\text{Coefficient of M.D (median)} = \dfrac{\text{M.D from Median}}{\text{Median}} = \dfrac{40}{325} = 0.123$ or 12.3%
Question # 39
Following are the wages of 8 workers of a factory. Find the range and the
coefficient of range. Wages (in Rs) 1400, 1450, 1520, 1380, 1485, 1495,
1575, 1440. Ans: 195, 6.6%

Question # 40
The following distribution gives the height distribution of 320 students.
Calculate its range and coefficient of range.
Height      50–52   53–55   56–58   59–61   62–64   65–67   68–70
Frequency   2       20      72      65      83      70      8
Ans: 21, 17.5%
Question # 41
Compute Quartile Deviation and coefficient of Quartile deviation from the
given data: 4, 4, 3, 6, 8, 7, 10, 14. Ans: 2.5, 19.23%

Question # 42
The following table shows the distribution of marks of students. Compute the
lower and the upper quartiles, quartile deviation, and coefficient of quartile
deviation.
Marks       41–50   51–60   61–70   71–80   81–90   91–100
Frequency   30      36      43      104     73      14
Ans: 62.59, 82.17, 9.78, 13.5%
Question # 43
Find the mean deviation from the mean and median for the values
30, 36, 32, 33, 35, 39, 36.5, 35, 34. Also calculate the coefficient of mean
deviation from mean and median.
Ans: Mean = 34.5, Median = 35, M.D(mean) = 2.0, M.D(median) = 1.94, coefficient of M.D (mean) = 5.8%, coefficient of M.D (median) = 5.5%
Question # 44
Compute the mean deviation from mean and median from the data given
below. Also calculate the coefficient of mean deviation from mean and
median.
Classes 20–24 25–29 30–34 35–39 40–44 45–49 50–54
Frequency 1 4 8 11 15 9 2
Ans: Mean = 39, Median = 39.8, M.D(mean) = 5.72, M.D(median) = 5.69, coefficient of M.D (mean) = 14.7%, coefficient of M.D (median) = 14.3%

Variance & Standard Deviation


Variance is defined as the mean of the squared deviations of the observations from
their mean.
When it is calculated from the entire population, the variance is called the
population variance and is denoted by σ². If sample data are used to calculate the
variance, it is referred to as the sample variance and is denoted by S².
Variance is an important theoretical concept. The standard deviation is the positive
square root of the variance; it indicates how far, on average, values lie from the
mean of the distribution, and is denoted by S.

[Figure: measures of dispersion]
• Range
• Variance: population variance ($\sigma^2$) and sample variance ($s^2$)
• Standard deviation: population standard deviation ($\sigma$) and sample standard deviation ($s$)
• Coefficient of variation: $CV = \left(\dfrac{s}{\bar{X}}\right) \times 100\%$
• Standard error of the mean: $SE(\bar{X}) = \dfrac{s}{\sqrt{n}}$

Midrange
It is a rough estimate of the middle, defined as the sum of the lowest and
highest values in the data set divided by 2.
MR = (Lowest value + Highest value ) / 2

Variance
• Important measure of variation
• Shows variation about the mean
• For the population (ungrouped data, biased): $\sigma^2 = \dfrac{\sum (X_i - \mu)^2}{N}$
• For the sample (ungrouped data, unbiased): $s^2 = \dfrac{\sum (X_i - \bar{X})^2}{n - 1}$, or equivalently $S^2 = \dfrac{1}{n - 1}\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]$
• For the sample (grouped data, unbiased): $S^2 = \dfrac{1}{\sum f - 1}\left[\sum fX^2 - \dfrac{(\sum fX)^2}{\sum f}\right]$ and $S = \sqrt{\dfrac{1}{\sum f - 1}\left[\sum fX^2 - \dfrac{(\sum fX)^2}{\sum f}\right]}$

For the population, use N in the denominator. For the sample, use n – 1 in the denominator.
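Example # 38: Using the following data find the Range, Variance, Standard
Deviation, Mean, and CV.
11.2, 11.9, 12.0, 12.8, 13.4, 14.3

X       X²
11.2    125.44
11.9    141.61
12.0    144.00
12.8    163.84
13.4    179.56
14.3    204.49
75.6    958.94   (totals)

$R = X_{\text{largest}} - X_{\text{smallest}} = 14.3 - 11.2 = 3.1$

$S^2 = \dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2 = \dfrac{958.94}{6} - \left(\dfrac{75.6}{6}\right)^2 = 1.063$

$S = \sqrt{1.063} = 1.03$

$\bar{X} = \dfrac{\sum X}{n} = \dfrac{75.6}{6} = 12.6$

$CV = \left(\dfrac{S}{\bar{X}}\right) \times 100\% = (1.03 / 12.6) \times 100 = 8.17\%$

Note that this solution divides by n rather than n – 1.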
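The worked solution of Example # 38 divides by n. As a rough check, the sketch below computes the variance with both divisors, using the six values from the worked table (with 13.4 as the fifth value, which is what the X² column and the total of 75.6 imply). It relies only on Python's standard `statistics` module.

```python
from math import sqrt
from statistics import mean, pvariance, variance

data = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]

data_range = max(data) - min(data)   # largest value minus smallest value
x_bar = mean(data)

var_n = pvariance(data)    # divisor n, as in the worked solution
var_n1 = variance(data)    # divisor n - 1, the unbiased sample form

s = sqrt(var_n)
cv = s / x_bar * 100       # coefficient of variation, in percent

print(round(data_range, 1), round(x_bar, 1))
print(round(var_n, 3), round(var_n1, 3))
print(round(s, 2), round(cv, 2))
```

The unrounded CV comes out slightly above the 8.17% in the text because the text rounds S to 1.03 before dividing.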
Coefficient of Variation
• Always a %
• Used to Compare 2 or More Groups
• Formula (for sample): $CV = \left(\dfrac{s}{\bar{X}}\right) \times 100\%$

Comparing Coefficient of Variation

• Stock A: Average price last year = $50, standard deviation = $5
• Stock B: Average price last year = $100, standard deviation = $5

Which stock is more variable and which stock is more consistent?
Stock A: CV = (5 / 50) × 100% = 10%, so Stock A is more variable.
Stock B: CV = (5 / 100) × 100% = 5%, so Stock B is more consistent.
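The stock comparison can be expressed as a short Python sketch. The function name `coefficient_of_variation` and the dictionary layout are illustrative choices, not something taken from the notes.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

# (average price, standard deviation) for each stock
stocks = {"A": (50, 5), "B": (100, 5)}

cvs = {name: coefficient_of_variation(m, s) for name, (m, s) in stocks.items()}

more_variable = max(cvs, key=cvs.get)     # largest CV
more_consistent = min(cvs, key=cvs.get)   # smallest CV

print(cvs, more_variable, more_consistent)
```

Both stocks have the same standard deviation, so the ranking comes entirely from the difference in their means.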
Question # 45
The average price of the Panther convertible is $40000. with a standard
deviation of $4000. The average price of the Suburban station wagon is
$20000. with a standard deviation of $200. Compare the variability of the
two prices. Ans: panther

Question # 46
The average score on a Marketing final examination was 80, with a
standard deviation of 9; the average score on a Finance final exam was
120, with a standard deviation of 18. Which class was more consistent?
Ans: Marketing class consistent (11.25%)

Question # 47
The average score on an English final examination was 85, with a
standard deviation of 5; the average score on an Accounting final
examination was 110, with a standard deviation of 8. Which class was
more variable? Ans: Accounting class is more variable.

Question # 48
A student calculated mean and standard deviation of 25 values as 20 and
4 respectively. Find the value of coefficient of variation. Ans: 20%

Question # 49
If X = 5.2, 4.4, 3.1, find the variance, mean and coefficient of variation.
Ans: 0.75, 4.23, 20.44%
Question # 50
Two candidates A and B at the BBA (Hons) Examination obtained the following
marks in ten papers. Which of the candidates showed a more consistent
performance?
Paper   I    II   III   IV   V    VI   VII   VIII   IX   X
A       58   49   76    80   47   72   61    59     77   48
B       39   38   86    72   75   69   57    49     83   66
Ans: (A) $\bar{X}$ = 62.7, S = 12.72, CV = 20.29%; (B) $\bar{X}$ = 63.4, S = 17.10, CV = 26.97%. A is more consistent.
Question # 51
A tire manufacturer wants to determine the inner diameter of a certain
grade of tire. Ideally the diameter would be 570 mm. The data are follows;
572, 572, 573, 568, 569, 575, 565, 570
(i) Find the sample mean & Median
(ii) Find the sample Variance, Standard Deviation, and Range.
Ans: (i) 570.5, 571, (ii) 8.75, 2.96, 10
Question # 52
A doctor measured the heart rate of 20 people who had been placed on a
long-distance running program. A frequency distribution for these rates is
displayed below. Compute the standard deviation and coefficient of variation
of the heart rates. Ans: Mean = 70.9, S = 2.022, CV = 2.85%

Rate (X)    67   68   69   70   72   75
Frequency   1    1    3    5    8    2
Question # 53
Goals recorded by two teams A and B in a football season were as follows:
Number of goals scored in a match   0    1   2   3   4
Number of matches, Team A           24   9   8   5   4
Number of matches, Team B           17   9   6   5   3
By calculating the C.V. in each case, find which team may be considered as
more consistent.
Ans: Mean (A) = 1.12, S(A) = 1.32, CV(A) = 117.86% | Mean (B) = 1.2, S(B) = 1.31, CV(B) = 109.17% , Team B more consistent.

Question # 54
If X = 2, 3, 6, 8, 11, find the SD, mean and coefficient of variation.
Ans: 3.28, 6, 54.67%

Question # 55
An analysis of monthly wages paid to workers in two firms A and B belonging
to the same industry gives the following results.
                              Firm A   Firm B
Number of wage earners        586      648
Average monthly wages (Rs)    52.6     47.5
Variance of wages             100      121
(i) Which firm pays the larger amount in monthly wages?
(ii) In which firm, A or B, is there greater variability in individual wages?
Ans: (i) A pays larger (ii) B has greater variability
Properties of the Variance / Standard Deviation.
(i) The variance of a constant is zero i.e. if ‘a’ is a constant
then Var(a) = 0. SD(a) = 0.
(ii) Variance is not affected by change of origin, in other
words if a constant is added to or subtracted from all the
values of the variable then variance remains unchanged
i.e. Var(X ± a) = Var (X). SD(X ± a) = SD (X)
(iii) Variance is affected by the change of scale or unit, in
other words, if all the values are multiplied or divided by
a constant, their variance gets multiplied or divided by
the square of that constant. i.e.
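$\text{Var}(aX) = a^2\,\text{Var}(X)$, $\quad \text{Var}\left(\dfrac{X}{a}\right) = \dfrac{\text{Var}(X)}{a^2}$
$\text{SD}(aX) = |a|\,\text{SD}(X)$, $\quad \text{SD}\left(\dfrac{X}{a}\right) = \dfrac{\text{SD}(X)}{|a|}$

(iv) Variance of the sum or difference of two independent
random variables is equal to the sum of their respective
variances, i.e. Var(X ± Y) = Var(X) + Var(Y),
or $\text{SD}(X \pm Y) = \sqrt{\text{Var}(X) + \text{Var}(Y)}$.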
Example # 39
The mean and standard deviation of a variable X are 80 and 2
respectively. Find the mean and variance of a new variable if
(i) All the values of X are increased by 20 points.
(ii) All the values of X are increased by 20%.
(iii) All the values of X are decreased by 0.05%.

$\bar{X} = 80$, SD(X) = 2, Var(X) = 4

(i) Y = X + 20
    Mean: $\bar{Y} = \bar{X} + 20 = 80 + 20 = 100$
    Variance: Var(Y) = Var(X + 20) = Var(X) = 4
(ii) Y = X + 0.2X = 1.2X
    Mean: $\bar{Y} = 1.2\bar{X} = 1.2 \times 80 = 96$
    Variance: Var(Y) = 1.2² Var(X) = 1.44 × 4 = 5.76
(iii) Y = X – 0.0005X = 0.9995X
    Mean: $\bar{Y} = 0.9995\bar{X} = 0.9995 \times 80 = 79.96$
    Variance: Var(Y) = 0.9995² Var(X) = 0.99900025 × 4 = 3.996001
Example # 40
If Var (X) = 25, find Var (2X + 4).

Var (2X + 4) = 4 Var(X) = 4 × 25 = 100
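Properties (ii) and (iii) together say that Var(aX + b) = a² Var(X). The sketch below checks this numerically on a small data set chosen purely for illustration (the values are not from the notes); any other numbers would do.

```python
from statistics import mean, pvariance

x = [2, 4, 6, 8, 10]     # illustrative values only
a, b = 2, 4              # scale and shift, as in Var(2X + 4)

y = [a * v + b for v in x]

var_x = pvariance(x)
var_y = pvariance(y)

# Change of origin (b) leaves the variance alone; change of scale (a) multiplies it by a^2
print(var_x, var_y, a**2 * var_x)

# The mean, by contrast, is affected by both the scale and the shift
print(mean(x), mean(y), a * mean(x) + b)
```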


Question # 56
The mean and standard deviation of a variable X are 50 and
9.77 respectively. Find the mean and standard deviation of a
new variable if
(i) All the values of X are increased by 18 points.
(ii) All the values of X are increased by 18%.
(iii) All the values of X are decreased by 18%.
Ans: (i) 68, 9.77 (ii) 59, 11.53 (iii) 41, 8.01
Question # 57
The mean and standard deviation of the weekly earnings of a random sample
of women workers from a locality are 3200 and 800, respectively.
(i) What will the mean and SD be if every woman has a decrease of 0.5% of
previous earnings? Also find the coefficient of variation.
(ii) What will happen to the values of the mean and SD if every woman has
an increase of Rs. 600 per week?
(iii) What will the mean and variance be if every woman has an increase of
15% of previous earnings?
Ans: (i) 3184, 796, 25% (ii) 3800, 800 (iii) 3680, 846400
Percentiles
A percentile is a measure of the position of a data value; percentiles divide the
distribution into 100 groups.
Percentiles are not the same as percentages. That is, if a student gets
72 correct answers out of a possible 100, she obtains a percentage
score of 72. There is no indication of her position with respect to the
rest of the class. She could have scored the highest, the lowest, or
somewhere in between. On the other hand, if a raw score of 72
corresponds to the 64th percentile, then she did better than 64% of the
students in her class.

Percentile Formula
The percentile corresponding to a given value (X) is computed by using
the following formula:

$\text{Percentile} = \dfrac{(\text{number of values below } X) + 0.5}{\text{total number of values}} \times 100\%$
Procedure for finding a data value corresponding to a given percentile.
1. Arrange the data in order from lowest to highest.
2. Substitute in the formula
   $C = \dfrac{n \times p}{100}$
   where n = total number of values and p = percentile.
3. If C is not a whole number, round up to the next whole number.
   Starting at the lowest value, count over to the number that
   corresponds to the rounded-up value.
   If C is a whole number, use the value halfway between the Cth and
   (C + 1)th values when counting up from the lowest value.
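The percentile formula and the three-step procedure can be sketched in Python as follows. The function names are made up for illustration, and the second function assumes 0 < p < 100 so that the (C + 1)th value exists. The scores used are those of Examples # 41 and # 42.

```python
import math

def percentile_rank(values, x):
    """Percentile of x: (number of values below x + 0.5) / n * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) / len(values) * 100

def value_at_percentile(values, p):
    """Data value for the p-th percentile, assuming 0 < p < 100."""
    ordered = sorted(values)          # step 1: arrange from lowest to highest
    c = len(ordered) * p / 100        # step 2: C = n * p / 100
    if c == int(c):                   # step 3, whole number: halfway between
        c = int(c)                    # the C-th and (C + 1)-th values
        return (ordered[c - 1] + ordered[c]) / 2
    return ordered[math.ceil(c) - 1]  # otherwise round C up and count over

scores_41 = [88, 92, 86, 97, 78, 82]
scores_42 = [18, 15, 12, 6, 8, 2, 3, 5, 20, 10]

print(percentile_rank(scores_41, 82))
print(value_at_percentile(scores_41, 45))
print(value_at_percentile(scores_42, 60))
```

Other textbooks and software packages use different percentile conventions, so results from library functions may not match this procedure exactly.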
Example # 41: A professor gives a 100-point test to 6 students. The
scores are shown below. Find the percentile rank of a score of 82.
88, 92, 86, 97, 78, 82.
What value corresponds to the 45th percentile?

Arrange the data in order from lowest to highest: 78, 82, 86, 88, 92, 97.
Then substitute in the formula:
$\text{Percentile} = \dfrac{1 + 0.5}{6} \times 100\% = 25\text{th percentile}$
Thus, a student whose score was 82 did better than 25% of the class.
For the 45th percentile, substitute in the formula:
$C = \dfrac{6 \times 45}{100} = 2.7$, which rounds up to 3.
Start at the lowest value and count over to the third value, which is 86. Hence,
the value 86 corresponds to the 45th percentile.

Example # 42: A professor gives a 20-point test to 10 students. The
scores are shown below. Find the value that corresponds to the 60th
percentile. 18, 15, 12, 6, 8, 2, 3, 5, 20, 10.

Arrange the data in order from lowest to highest:
2, 3, 5, 6, 8, 10, 12, 15, 18, 20.
Substitute in the formula: $C = \dfrac{n \times p}{100} = \dfrac{10 \times 60}{100} = 6$
Here C is a whole number, so use the value halfway between the Cth and
(C + 1)th values when counting up from the lowest value. In this case, these
are the 6th value (10) and the 7th value (12).
The value halfway between 10 and 12 is 11. Find it by adding the two
values and dividing by 2: (10 + 12)/2 = 11.
Hence, 11 corresponds to the 60th percentile. Anyone scoring 11 would
have done better than 60% of the class.
Question # 58
The number of previous jobs held by each of six applicants is
shown here. 2, 4, 5, 6, 8, 9
(i) Find the percentile of each value.
(ii) What value corresponds to the 58th percentile?
Ans: (i) 8.3th, 25th, 41.7th, 58.3th, 75th, 91.7th percentile. (ii) X = 6

Question # 59
The number of credits in business courses that eight job applicants had is
shown here. 9, 12, 15, 27, 33, 45, 63, 72.
(i) Find the percentile for each value.
(ii) What value corresponds to the 40th percentile?
Ans: (i) 6.25th, 18.75th, 31.25th , 43.75th, 56.25th, 68.75th, 81.25th, 93.75th (ii) X = 27

Empirical Rule for Normal Distributions
Approximately 68% of the data values fall within one standard deviation of the mean.
Approximately 95% of the data values fall within two standard deviations of the mean.
Approximately 99.7% of the data values fall within three standard deviations of the mean.

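[Figure: bell-shaped distribution. About 68% of the values lie within $\bar{x} \pm 1s$, 95% within $\bar{x} \pm 2s$ and 99.7% within $\bar{x} \pm 3s$. Moving outward from the mean, the successive bands on each side hold 34%, 13.5% and 2.35% of the values, and each half of the curve holds 0.5 of the total area. The horizontal axis is marked in x from $\bar{x} - 3s$ to $\bar{x} + 3s$ and in z from –3 to +3.]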
Standard Z Scores
A variable is said to be standardized, or in standard units, if it is expressed
as a deviation from its mean divided by its standard deviation. It
is denoted by Z. Its mean is zero and its standard deviation is one. The formula is

$Z = \dfrac{X - \bar{X}}{s}$ (for samples), $\quad Z = \dfrac{X - \mu}{\sigma}$ (for populations)

The Z-values, being independent of the units of measurement, provide a basis for comparison
between individual values, even though they belong to different distributions. That is why they
are often used in psychological and educational testing, where they are known as standard
scores. Negative numbers are avoided by multiplying the Z values by 10, an arbitrary SD,
and adding 50, an arbitrary mean. The values so obtained are called standard Z
scores. They are also used to produce normally distributed random numbers in SPSS. Thus, a
standard Z score is given by the relation

$Z = 50 + 10\left(\dfrac{X - \bar{X}}{s}\right)$


Chebyshev’s Theorem
The proportion of values from a data set that will fall within k standard
deviations of the mean will be at least $1 - \dfrac{1}{k^2}$, where k is a number
greater than 1. This theorem applies to any distribution regardless
of its shape.
Example # 43.
A student scored 65 on a Calculus test that had a mean of 50 and a
standard deviation of 10; she scored 30 on a TQM test with a mean of 25
and a standard deviation of 5. Compare her relative position on the two
tests.

First, find the Z scores. For Calculus, the Z score is

$Z = \dfrac{X - \bar{X}}{s} = \dfrac{65 - 50}{10} = 1.5$

For TQM, the Z score is

$Z = \dfrac{X - \bar{X}}{s} = \dfrac{30 - 25}{5} = 1.0$

Since the Z score for Calculus is larger, her relative position in the Calculus
class is higher than her relative position in the TQM class.

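The comparison in Example # 43 can be reproduced with a small Python sketch. The function names `z_score` and `standard_score` are illustrative; the second applies the 50 + 10Z rescaling described in the section on standard Z scores.

```python
def z_score(x, mean, sd):
    """Deviation from the mean, measured in standard deviations."""
    return (x - mean) / sd

def standard_score(x, mean, sd):
    """Z score rescaled to an arbitrary mean of 50 and SD of 10."""
    return 50 + 10 * z_score(x, mean, sd)

z_calculus = z_score(65, 50, 10)
z_tqm = z_score(30, 25, 5)

# The larger Z score marks the better relative position
better = "Calculus" if z_calculus > z_tqm else "TQM"

print(z_calculus, z_tqm, better)
print(standard_score(65, 50, 10), standard_score(30, 25, 5))
```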
Example # 44.
Find the Z score for each test and state which is higher.
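Test A: X = 38, $\bar{X}$ = 40, s = 5
Test B: X = 94, $\bar{X}$ = 100, s = 10

For test A, $Z = \dfrac{X - \bar{X}}{s} = \dfrac{38 - 40}{5} = -0.4$

For test B, $Z = \dfrac{X - \bar{X}}{s} = \dfrac{94 - 100}{10} = -0.6$

The Z score for test A (–0.4) is higher than the Z score for test B (–0.6), so the relative position on test A is higher.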

Question # 60
Which of the following exam grades has a better relative position?
(i) A grade of 43 on a test with X = 40 and s = 3.
(ii) A grade of 75 on a test with X = 72 and s = 5. Ans: (i) 1 (ii) 0.6, (i) is higher

Question # 61
A student scores 60 on a mathematics test that has a mean of 54 and a
standard deviation of 3, and she scores 80 on a history test with a mean of
75 and a standard deviation of 2. On which test did she do better than the
rest of the class? Ans: Math = 2.0, History = 2.5, student did better in history.

Question # 62
Which score indicates the highest relative position?
(i) A score of 3.2 on a test with X = 4.6 and s = 1.5.
(ii) A score of 630 on a test with X = 800 and s = 200.
(iii) A score of 43 on a test with X = 50 and s = 5.
Ans: (i) - 0.93 (ii) -0.85 (iii) - 1.4, part (ii) is highest
Question # 63
Which score has the highest relative position?
(i) A score of 12 on a test with X = 10 and s = 4.
(ii) A score of 170 on a test with X = 120 and s = 32.
(iii) A score of 180 on a test with X = 60 and s = 8.
Ans: (i) 0.5 (ii) 1.6 (iii) 15, part (iii) is highest
Question # 64
A set of data is mounded, with a mean of 450 and a variance of 625.
Approximately what proportion of the observations is
(i) Greater than 425?
(ii) Less than 500?
(iii) Greater than 525?
Ans: (i) Approx 84% of the observations will be greater than 425. (ii) Approx 97.5% of the observations will be less than 500. (iii) Approx 0% of the observations will be greater than 525.
Example # 45.
The mean price of houses in a certain neighborhood is $50000, and the
standard deviation is $10000. Find the price range for which at least 75% of
the houses will sell.

According to Chebyshev's theorem,

$1 - \dfrac{1}{k^2} = 0.75$, or k = 2. Thus,

$\bar{X} \pm kS = \$50000 \pm 2(\$10000)$
$50000 − 2($10000) = $30000
$50000 + 2($10000) = $70000
Hence, at least 75% of all homes sold in the area will have a price
in the range from $30000 to $70000.

Example # 46.
A survey of local companies found that the mean amount of travel
allowance for executives was $0.25 per mile. The standard
deviation was $0.02. Using Chebyshev’s theorem, find the minimum
percentage of the data values that will fall between $0.20 and $0.30.

$k = \dfrac{X - \bar{X}}{s} = \dfrac{0.30 - 0.25}{0.02} = 2.5$

Use Chebyshev's theorem to find the percentage.

$1 - \dfrac{1}{k^2} = 1 - \dfrac{1}{(2.5)^2} = 0.84$

Hence, at least 84% of the data values will fall between $0.20 and
$0.30.

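Both directions of Chebyshev's theorem used in Examples # 45 and # 46 (from k to a minimum proportion, and from k to an interval) fit in a short Python sketch. The function names are invented for illustration.

```python
def chebyshev_min_proportion(k):
    """Lower bound 1 - 1/k^2 on the share of values within k SDs of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

def chebyshev_interval(mean, sd, k):
    """Interval mean - k*sd to mean + k*sd."""
    return mean - k * sd, mean + k * sd

# Example # 45: at least 75% corresponds to k = 2
print(chebyshev_min_proportion(2))
print(chebyshev_interval(50000, 10000, 2))

# Example # 46: k is the distance from the mean to an endpoint, in SDs
k = (0.30 - 0.25) / 0.02
print(round(k, 2), round(chebyshev_min_proportion(k), 2))
```

The rounding in the last line is there because 0.30, 0.25 and 0.02 are not exactly representable as binary floating-point numbers.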
Question # 65
Use Chebyshev's theorem to answer the following for a set of observations
with a mean of 250 and a standard deviation of 20.
Approximately what proportion of the observations is
(i) Between 190 and 310 (ii) Between 210 and 290
(iii) Between 230 and 270 (iv) Less than 215 or more than 285
Ans: (i) at least 88.89% (ii) at least 75% (iii) at least 0% (iv) at most 33%
Question # 66
The average cost of a certain type of grass seed is $4.00 per box. The
standard deviation is $0.10. Using Chebyshev’s theorem, find the minimum
percentage of data values that will fall in the range of $3.82 to $4.18.
Ans: 69%

Question # 67
Using Chebyshev's theorem, solve the following problems for a distribution
with a mean of 80 and a standard deviation of 10.
(i) At least what percentage of values will fall between 60 and 100?
(ii) At least what percentage of values will fall between 65 and 95?
Ans: (i) 75% (ii) 56%

Question # 68
The mean of a distribution is 20 and the standard deviation is 2. Answer each
using Chebyshev's theorem.
(i) At least what percentage of values will fall between 10 and 30?
(ii) At least what percentage of values will fall between 12 and 28?
Ans: (i) 96% (ii) 93.75%
Question # 69
The average delivery charge for a refrigerator is $32. The standard
deviation is $4. Find the minimum percentage of data values that will fall in
the range of $20 to $44. Use Chebyshev’s theorem. Ans: 88.89%

Question # 70
For a certain type of job, it costs a company an average of $231 to train an
employee to perform the task. The standard deviation is $5. Find the
minimum percentage of data values that will fall in the range of $219 to
$243. Use Chebyshev’s theorem. Ans: 83%

Question # 71
The average score on a special test of knowledge of wood refinishing has a
mean of 53 and a standard deviation of 6. Using Chebyshev’s theorem, find
the range of values in which at least 75% of the scores will lie.
Ans: 41 ~ 65
Question # 72
A survey of several leading brands of cereal shows that the mean content
of potassium per serving is 95 milligrams, and the standard deviation is 2
milligrams. Find the values in which at least 88.89% of the data will fall. Use
Chebyshev‘s theorem. Ans: 89 ~ 101

Question # 73
A sample of the hourly wages of employees who work in restaurants in a
large city has a mean of $5.02 and a standard deviation of $0.09. Using
Chebyshev’s theorem, find the range in which at least 75% of the data
values will fall. Ans: $4.84 ~ $5.20

Question # 74
A sample of the labour costs per hour to assemble a certain product has a
mean of $2.60 and a standard deviation of $0.15. Using Chebyshev’s
theorem, find the values in which at least 88.89% of the data will lie.
Ans: $2.15 ~ $3.05
Question # 75
In a distribution of 200 values, the mean is 50 and the standard deviation is
5. Answer each using Chebyshev's theorem.
(i) At least how many values will fall between 30 and 70?
(ii) At most how many values will be less than 40 or more than 60?
Ans: (i) 93.75% 188 (ii) 25% 50

Question # 76
In a distribution of 300 values, the mean is 50 and the standard deviation is
15. Answer each using Chebyshev's theorem.
(i) At least how many values will fall between 20 and 80?
(ii) At most how many values will be less than 30 or more than 70?
Ans: (i) 75%, 225 (ii) 56.25%, 169

Question # 77
A study of the nicotine contents of a certain brand of cigarette shows that
on the average one cigarette contains 1.52 milligrams of nicotine with a
standard deviation of 0.07 milligram. According to Chebyshev's theorem,
between what values must the nicotine content be for
(i) At least 24/25 of all cigarettes of this brand?
(ii) At least 48/49 of all cigarettes of this brand?
Ans: (i) 1.17 ~ 1.87 (ii) 1.03 ~ 2.01
Question # 78
Old Faithful is a famous geyser at Yellowstone National Park. From a
sample with n = 32, the mean duration of Old Faithful’s eruptions is 3.32
minutes, and the standard deviation is 1.09 minutes. Using Chebychev’s
Theorem, determine at least how many of the eruptions lasted between
1.14 minutes and 5.5 minutes.
Ans: k = 2, 75%, No of eruptions = 0.75 × 32 = 24

Moments
The order of a moment is the power to which the deviations are raised before
they are averaged. These moments are also called the central moments
or the mean moments, and are used to describe a set of data.
The first four moments about the mean are defined as follows.

Ungrouped data:
$\mu_1 = \dfrac{\sum (X - \bar{X})}{n} = 0$, $\quad \mu_2 = \dfrac{\sum (X - \bar{X})^2}{n} = S^2$, $\quad \mu_3 = \dfrac{\sum (X - \bar{X})^3}{n}$, $\quad \mu_4 = \dfrac{\sum (X - \bar{X})^4}{n}$

Grouped data:
$\mu_1 = \dfrac{\sum f(X - \bar{X})}{\sum f} = 0$, $\quad \mu_2 = \dfrac{\sum f(X - \bar{X})^2}{\sum f} = S^2$, $\quad \mu_3 = \dfrac{\sum f(X - \bar{X})^3}{\sum f}$, $\quad \mu_4 = \dfrac{\sum f(X - \bar{X})^4}{\sum f}$
Moment Ratios
There are some ratios in which both the numerators and the denominators are
moments. The most common of these moment ratios are β1 and β2, defined by the
relations

$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$, $\quad \beta_2 = \dfrac{\mu_4}{\mu_2^2}$

For a symmetrical distribution β1 = 0. β2 is used to explain the shape of the
curve and is a measure of peakedness. β1 is known as the moment coefficient of skewness
and β2 is known as the moment coefficient of kurtosis.

Example # 47
Find the first four moments about the mean for the values 2, 4, 6, 8, 10. Also
find β1 and β2. State whether the distribution is leptokurtic or platykurtic.

X       X – X̄    (X – X̄)²   (X – X̄)³   (X – X̄)⁴
2       –4       16         –64        256
4       –2       4          –8         16
6       0        0          0          0
8       2        4          8          16
10      4        16         64         256
30      0        40         0          544   (totals)

$\bar{X} = \dfrac{\sum X}{n} = \dfrac{30}{5} = 6$

$\mu_1 = \dfrac{\sum (X - \bar{X})}{n} = \dfrac{0}{5} = 0$

$\mu_2 = \dfrac{\sum (X - \bar{X})^2}{n} = \dfrac{40}{5} = 8$

$\mu_3 = \dfrac{\sum (X - \bar{X})^3}{n} = \dfrac{0}{5} = 0$

$\mu_4 = \dfrac{\sum (X - \bar{X})^4}{n} = \dfrac{544}{5} = 108.8$

$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{(0)^2}{(8)^3} = 0$

$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{108.8}{(8)^2} = 1.7$

Distribution is platykurtic.
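The moments of Example # 47 can be verified with a short Python sketch. The helper `central_moment` is a hypothetical name; it implements the ungrouped formula, with divisor n, for any order r. The variable names `beta1` and `beta2` stand for the two moment ratios.

```python
def central_moment(values, r):
    """r-th moment about the mean for ungrouped data (divisor n)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** r for x in values) / n

data = [2, 4, 6, 8, 10]

m1, m2, m3, m4 = (central_moment(data, r) for r in (1, 2, 3, 4))

beta1 = m3**2 / m2**3    # moment coefficient of skewness
beta2 = m4 / m2**2       # moment coefficient of kurtosis

# A value of beta2 below 3 is platykurtic, above 3 leptokurtic
shape = "platykurtic" if beta2 < 3 else "leptokurtic" if beta2 > 3 else "mesokurtic"

print(m1, m2, m3, m4)
print(beta1, round(beta2, 2), shape)
```

For grouped data the same idea applies with each term weighted by its frequency and the divisor replaced by the total frequency.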
Example # 48
Find the first four moments about the mean from the following distribution. Also
find β1 and β2. State whether the distribution is leptokurtic or platykurtic.
Classes     1–3   3–5   5–7   7–9
Frequency   40    30    20    10

Classes   f     X    fX    (X – X̄)   f(X – X̄)   f(X – X̄)²   f(X – X̄)³   f(X – X̄)⁴
1–3       40    2    80    –2        –80        160         –320        640
3–5       30    4    120   0         0          0           0           0
5–7       20    6    120   2         40         80          160         320
7–9       10    8    80    4         40         160         640         2560
Total     100   –    400   –         0          400         480         3520

$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{400}{100} = 4$

$\mu_1 = \dfrac{\sum f(X - \bar{X})}{\sum f} = \dfrac{0}{100} = 0$

$\mu_2 = \dfrac{\sum f(X - \bar{X})^2}{\sum f} = \dfrac{400}{100} = 4$

$\mu_3 = \dfrac{\sum f(X - \bar{X})^3}{\sum f} = \dfrac{480}{100} = 4.8$

$\mu_4 = \dfrac{\sum f(X - \bar{X})^4}{\sum f} = \dfrac{3520}{100} = 35.2$

$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{(4.8)^2}{(4)^3} = 0.36$

$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = \dfrac{35.2}{(4)^2} = 2.2$

Distribution is platykurtic.
Question # 79
Find the first four moments about the mean from the following distribution. Also
find β1 and β2.
X   31   32   33   34   35   36   37   38   39
f   1    2    4    8    9    10   3    2    1
Ans: (i) 0, 2.85, –0.15, 24.15 (ii) 0.00097, 2.97

Question # 80
Find the first four moments about the mean from the following distribution. Also
find β1 and β2.
X   74.5   94.5   114.5   134.5   154.5   174.5   194.5
f   9      10     17      10      5       4       5
Ans: (i) 0, 1216, 23104, 3717632 (ii) 0.297, 2.51
Question # 81
The first four moments about the arithmetic mean of a distribution are 0, 4,
6 and 48. Find β2. Ans: 3

Question # 82
The following information was obtained from a frequency distribution of patients'
weights: Σf(X − X̄) = 0, Σf(X − X̄)² = 124, Σf(X − X̄)³ = 180 and total number
of patients = 48. Find β1. Ans: 0.819

Symmetrical Distribution.
A distribution is said to be symmetrical when the data values are
evenly distributed about its mean. In a symmetrical distribution, a
deviation below the mean is equal to the corresponding deviation above
the mean. In a symmetrical distribution
(i) Mean = Median = Mode
(ii) Q3 – Median = Median – Q1
(iii) μ3 = 0
(iv) $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = 0$
[Figure: symmetrical curve with the mean, median and mode at the same central point.]
Skewness.

Skewness is the lack of symmetry in a distribution around some central


value (mean, median or mode). It is important to note that in a
symmetrical distribution the mean, median and mode coincide as shown
in Fig above, and that the two tails of the frequency curve are equal in
length from the central value. There are two types of skewness.
(a) Positive Skewness (b) Negative Skewness.

(a) Positive Skewness.
When a distribution departs from symmetry, the mean, median and mode
are pulled apart and one tail becomes longer than the other. If the
frequency curve has a longer tail to the right, as in Fig. 1(b), the distribution is
said to be positively skewed.
In a positively skewed distribution
(i) Mean > Median > Mode
(ii) Q3 – Median > Median – Q1

(b) Negative Skewness.
If the frequency curve has a longer tail to the left, as in Fig. 1(c), the distribution is
said to be negatively skewed.
In a negatively skewed distribution
(i) Mean < Median < Mode
(ii) Q3 – Median < Median – Q1

Distribution Shape: Symmetric or Skewed

[Figure: three frequency curves. Fig. 1(a) Symmetric: Mean = Median = Mode. Fig. 1(b) Positively skewed: Mean > Median > Mode. Fig. 1(c) Negatively skewed: Mean < Median < Mode.]
Box–and–Whisker Plot
Boxplots are graphical representations of a five-number summary of a
data set. The five specific values that make up a five-number summary
are:
• The smallest value of the data set
• Q1
• The median
• Q3
• The largest value of the data set

[Figure: graphical display of data using the 5-number summary. A box runs from Q1 to Q3 with a line at the median, and whiskers extend to Xsmallest and Xlargest, drawn above an axis marked 4, 6, 8, 10, 12.]
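A five-number summary can be computed with the sketch below. It assumes that Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data, with the middle value left out of both halves when n is odd; this is the convention that reproduces the quartiles quoted in Example # 49, and other conventions give slightly different values. The function name is illustrative, and the data are the real cheese sodium figures from that example.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # values below the middle
    upper = ordered[half + n % 2:]     # values above the middle
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

real_cheese = [310, 420, 45, 40, 220, 240, 180, 90]

smallest, q1, q2, q3, largest = five_number_summary(real_cheese)
print(smallest, q1, q2, q3, largest)
```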
Example # 49
Using box plots, compare the two distributions of the sodium contents of a
sample of real cheese and a cheese substitute.
Real Cheese:        310, 420, 45, 40, 220, 240, 180, 90
Cheese Substitute:  270, 180, 250, 290, 130, 260, 340, 310

Real Cheese: Q1 = 67.5, Q2 = 200, Q3 = 275, Min = 40, Max = 420.
Cheese Substitute: Q1 = 215, Q2 = 265, Q3 = 300, Min = 130, Max = 340.

[Figure: box plots for Cheese Substitute and Real Cheese drawn on a common axis from 0 to 500.]

Compare the plots. It is quite apparent that the distribution for the cheese
substitute data has a higher median than the distribution for the real cheese data.
The variation or spread of the distribution of the real cheese data is
larger than the variation of the distribution of the cheese substitute data.

Question # 83
The number of previous jobs held by each of six applicants is shown here.
2, 4, 5, 6, 8, 9. Construct a box plot and comment on the nature of the
distribution. Ans: Q1 = 4, Q2 = 5.5, Q3 = 8, Min = 2, Max = 9, positively skewed.

Question # 84
The number of credits in business courses eight job applicants had is shown
here. 9, 10, 15, 27, 44, 45, 53, 67.
Construct a boxplot and comment on the nature of the distribution.
Ans: Q1 = 12.5, Q2 = 35.5, Q3 = 49, Min = 9, Max = 67, negatively skewed.
4
Kurtosis. 2 = 2
2
It measures the peakedness of the distribution. Leptokurtic

It is of three types Mesokurtic

(i) Leptokurtic (ii) Mesokurtic (iii) Platykurtic Platykurtic

Mean

(i) Leptokurtic.
The distribution is said to be Leptokurtic when 2 is greater than 3, the
curve is more sharply peaked and has wider tails than the normal curve.
(ii) Mesokurtic.
The distribution is said to be mesokurtic or normal when 2 = 3, the curve
is neither flat nor highly peaked.
(iii) Platykurtic.
The distribution is said to be Platykurtic when 2 is less than 3, the curve
has a flatter top and relatively narrower tails than the normal curve.

84
Coefficient of Skewness
$S_k = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$ (Bowley's coefficient of skewness). Its value lies between ±1.

$S_k = \dfrac{\text{Mean} - \text{Mode}}{S}$ (Karl Pearson coefficient of skewness). Its value lies between ±1.

$S_k = \dfrac{3(\text{Mean} - \text{Median})}{S}$ (Karl Pearson coefficient of skewness). Its value lies between ±3.
Determining Normality
There are several ways of checking normality. The easiest way is to draw
a histogram for the data and check its shape. If the histogram is not
approximately bell-shaped, then the data are not normally distributed.
Skewness can be checked by using Pearson's Index of skewness (PI).
The formula is

$PI = \dfrac{3(\bar{X} - \text{Median})}{s}$

If the index is greater than or equal to +1 or less than or equal to –1, it
can be concluded that the data are significantly skewed. In addition, the
data should be checked for outliers. If there are two or more, then
reject normality.
Example # 50.
The data shown consist of the number of games played each year in the
career of Baseball Hall of Famer Bill Mazeroski. Check for normality.
81 148 152 135 151 152 159 142 34 162
130 162 163 143 67 112 70.

Mean = 127.24, Median = 143, St Dev = 38.68

Check for normality:

$PI = \dfrac{3(127.24 - 143)}{38.68} = -1.222$

Since the PI is less than –1, it can be concluded that the distribution is
significantly skewed to the left. There is no outlier in the data (already
solved in Example # 26).
Standard Error of Skewness

$\text{Std. Error of Sk} = \sqrt{\dfrac{6 \times n \times (n - 1)}{(n - 2)(n + 1)(n + 3)}}$

The ratio of skewness to its standard error can be used as a test of normality
(that is, you can reject normality if the ratio is less than –2 or greater than +2). A
large positive value for skewness indicates a long right tail; an extreme
negative value indicates a long left tail.

Standard Error of Kurtosis

$\text{Std. Error of kurtosis} = \sqrt{\dfrac{4 \times (n^2 - 1) \times \text{Var(Error of Skewness)}}{(n - 3)(n + 5)}}$

Here Var(Error of Skewness) is the square of the standard error of skewness.
The ratio of kurtosis to its standard error can be used as a test of normality
(that is, you can reject normality if the ratio is less than –2 or greater than +2). A
large positive value for kurtosis indicates that the tails of the distribution are
longer than those of a normal distribution; a negative value for kurtosis
indicates shorter tails (becoming like those of a box-shaped uniform distribution).
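The two standard errors depend only on the sample size n. The sketch below assumes the square-root form of the formulas, which is the form used in SPSS output, and the function names are invented for illustration. It evaluates them for n = 5 and applies the ±2 ratio rule to the skewness and kurtosis figures reported for the five salaries in Example # 52.

```python
from math import sqrt

def se_skewness(n):
    """Standard error of skewness for a sample of size n (n > 2)."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_kurtosis(n):
    """Standard error of kurtosis for a sample of size n (n > 3)."""
    var_skew = se_skewness(n) ** 2     # variance of the skewness error
    return sqrt(4 * (n**2 - 1) * var_skew / ((n - 3) * (n + 5)))

n = 5
skewness, kurtosis = 2.209069, 4.905059    # values reported in Example # 52

ratio_skew = skewness / se_skewness(n)
ratio_kurt = kurtosis / se_kurtosis(n)

# Reject normality when either ratio falls outside the range -2 to +2
reject_normality = abs(ratio_skew) > 2 or abs(ratio_kurt) > 2

print(round(se_skewness(n), 4), round(se_kurtosis(n), 4))
print(round(ratio_skew, 2), round(ratio_kurt, 2), reject_normality)
```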
Example # 51.
A survey of 18 high-technology firms showed the number of days’ inventory
they had on hand. Determine if the data are approximately normally
distributed. 5 29 34 44 45 63 68 74 74 81 88 91 97 98 113
118 151 158.

Construct a frequency distribution and draw a histogram for the data.

[Figure: histogram of the days' inventory data.]

Since the histogram is approximately bell-shaped, we can say that the
distribution is approximately normal.
Check for skewness.
For these data, Mean = 79.5, median = 77.5, and S = 39.31. Using the
Pearson coefficient of skewness PI = 0.148
3(79.5 – 77.5)
PI = = 0.153
39.31
In this case, the PI is not greater than +1 or less than –1, so it can be
concluded that the distribution is not significantly skewed.

Check for outliers.


In this case, Q1 = 45 and Q3 = 98; hence, IQR = Q3 – Q1 = 98 – 45 = 53.
An outlier would be a data value less than 45 – 1.5(53) = – 34.5 or a data
value larger than 98 + 1.5(53) = 177.5. In this case, there are no outliers.
Since the histogram is approximately bell-shaped, the data are not
significantly skewed, and there are no outliers, it can be concluded that the
data are approximately normally distributed.
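The skewness and outlier checks in this example can be reproduced with Python's standard library. The sketch below makes two assumptions that the slides do not state: S is computed with divisor n (which is what reproduces 39.31), and the quartiles are taken as the medians of the lower and upper halves of the sorted data.

```python
import statistics

data = [5, 29, 34, 44, 45, 63, 68, 74, 74, 81, 88, 91, 97, 98,
        113, 118, 151, 158]

mean = statistics.mean(data)
median = statistics.median(data)
s = statistics.pstdev(data)  # divisor n; assumed because it matches S = 39.31

# Pearson coefficient (index) of skewness
pi = 3 * (mean - median) / s

# Quartiles as medians of the lower and upper halves (an assumed convention)
ordered = sorted(data)
half = len(ordered) // 2
q1 = statistics.median(ordered[:half])
q3 = statistics.median(ordered[-half:])

# Outlier fences at 1.5 * IQR beyond the quartiles
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```

Other quartile conventions can give slightly different Q1 and Q3 for the same data, so the fences may shift a little depending on the software used.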

Example # 52.
Annual salaries for a sample of five employees are
$39000 $37500 $35200 $40400 $100000
Describe the central tendency and symmetry of the data.

First, we check for data accuracy, finding no error. We calculate the mean
annual salary as $50420, a value that does not seem to represent a "typical"
salary. The median salary of $39000 is the preferred measure of central
tendency. These data have no mode. Since the mean is much larger than the
median, we expect the data to be positively skewed; the skewness is
approximately 2.21, and the distribution is leptokurtic.

Annual Salaries
Mean                  50420
Median                39000
Mode                  #N/A
Standard Deviation    27782.94
Sample Variance       7.7189 × 10^8
Kurtosis              4.905059
Skewness              2.209069
Q1                    36350
Q3                    70200
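The skewness and kurtosis values in this example can be checked with a short Python sketch. It assumes the bias-adjusted sample formulas used by common spreadsheet software (the "#N/A" entry for the mode suggests a spreadsheet produced the table, but the slides do not name the formulas), so treat it as one plausible way to arrive at the same figures.

```python
import statistics

salaries = [39000, 37500, 35200, 40400, 100000]
n = len(salaries)

mean = statistics.mean(salaries)
s = statistics.stdev(salaries)  # sample standard deviation, divisor n - 1

# Standardized values
z = [(x - mean) / s for x in salaries]

# Bias-adjusted sample skewness (assumed formula)
skewness = n / ((n - 1) * (n - 2)) * sum(v**3 for v in z)

# Bias-adjusted sample excess kurtosis (assumed formula)
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v**4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
```

With this definition the kurtosis is an excess kurtosis, so a positive value points to a leptokurtic shape and a negative value to a platykurtic one.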
Question # 85
Find the coefficient of skewness for each distribution and describe the
shape of the distribution.
(i) Mean = 10, median = 8, standard deviation = 3.
(ii) Mean = 42, median = 45, standard deviation = 4.
(iii) Mean = 18.6, median = 18.6, standard deviation = 1.5.
(iv) Mean = 98, median = 97.6, standard deviation = 4.
(v) β1 = 0 and β2 = 3.9
(vi) β1 = 0 and β2 = 1.7
Ans: (i) 2, positively skewed (ii) -2.25, negatively skewed (iii) 0, symmetric (iv) 0.3, positively skewed (v) symmetric & leptokurtic (vi) symmetric & platykurtic
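Parts (i) to (iv) all use the Pearson index PI = 3(Mean - Median)/S, so they can be checked together. The following is a minimal sketch; the helper names `pearson_index` and `describe` are hypothetical.

```python
def pearson_index(mean: float, median: float, sd: float) -> float:
    """Pearson coefficient of skewness: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


def describe(pi: float) -> str:
    """Direction of skewness from the sign of the coefficient."""
    if pi > 0:
        return "positively skewed"
    if pi < 0:
        return "negatively skewed"
    return "symmetric"


# (mean, median, standard deviation) for parts (i) to (iv)
cases = {
    "i": (10, 8, 3),
    "ii": (42, 45, 4),
    "iii": (18.6, 18.6, 1.5),
    "iv": (98, 97.6, 4),
}

results = {}
for label, (mean, median, sd) in cases.items():
    pi = pearson_index(mean, median, sd)
    results[label] = (round(pi, 2), describe(pi))
```

Each entry of `results` pairs the rounded coefficient with the direction of skewness, which can be compared against the answer key.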

Question # 86
The cost per load (in cents) of 35 laundry detergents tested by a consumer
organization is shown below. Calculate the coefficient of skewness using
Karl Pearson’s Method, also interpret your result.
Class Limits   13–19   20–26   27–33   34–40   41–47   48–54   55–61   62–68
Frequency          2       7      12       5       6       1       0       2
Ans: Mean = 33.8, S = 11.6, Mode = 29.42, SK = 37.76%, positively skewed.
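The quoted answer can be reproduced from the grouped data. The sketch below makes some assumptions the slide does not spell out: class midpoints stand in for the observations, S uses divisor n - 1 (this is what rounds to 11.6), the mode comes from the usual grouped-data interpolation formula, and the coefficient is taken as (Mean - Mode)/S.

```python
import math

lower_limits = [13, 20, 27, 34, 41, 48, 55, 62]
upper_limits = [19, 26, 33, 40, 47, 54, 61, 68]
freqs = [2, 7, 12, 5, 6, 1, 0, 2]

midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
s = math.sqrt(variance)  # sample standard deviation (assumed divisor n - 1)

# Grouped mode: interpolate inside the modal class (here an interior class)
m = freqs.index(max(freqs))
width = lower_limits[1] - lower_limits[0]
lower_boundary = lower_limits[m] - 0.5
d1 = freqs[m] - freqs[m - 1]  # excess over the preceding class
d2 = freqs[m] - freqs[m + 1]  # excess over the following class
mode = lower_boundary + d1 / (d1 + d2) * width

# Karl Pearson's coefficient of skewness based on the mode
sk = (mean - mode) / s
```

Computed without intermediate rounding, `sk` comes out near 0.38; the quoted 37.76% follows when the rounded values 33.8, 29.42 and 11.6 are used.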

Question # 87
Draw an ogive and locate the median, quartiles, D4, D7, P10, P90 & IQR
graphically.
Classes   10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89  90-99  100-109  110-119
f             1      4     17     28     25     18     13      6      5        2        1

The necessary calculations are given below, and graphic locations are in
Fig. 3.1
Classes       f     Class Boundaries    Cumulative frequency (C)
10 – 19       1     9.5–19.5            1
20 – 29       4     19.5–29.5           5
30 – 39       17    29.5–39.5           22
40 – 49       28    39.5–49.5           50
50 – 59       25    49.5–59.5           75
60 – 69       18    59.5–69.5           93
70 – 79       13    69.5–79.5           106
80 – 89       6     79.5–89.5           112
90 – 99       5     89.5–99.5           117
100 – 109     2     99.5–109.5          119
110 – 119     1     109.5–119.5         120
The approximate values of the median, quartiles, D4, D7, P10, P90 & IQR can
be located from an ogive.
Fig. 3.1

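The median and the second quartile Q2 coincide, since both are located at
cumulative frequency n/2 = 60:
Median = Q2 = 53.5
The IQR is obtained as Q3 – Q1, using the values located at cumulative
frequencies 3n/4 = 90 and n/4 = 30.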
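Reading a value off the ogive amounts to linear interpolation within the class where the cumulative frequency reaches the target. A minimal sketch of that interpolation follows; the function name `grouped_quantile` and its arguments are hypothetical.

```python
lower_boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5,
                    89.5, 99.5, 109.5]
freqs = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]
width = 10


def grouped_quantile(k: int, parts: int) -> float:
    """Value below which k/parts of the observations fall (interpolated)."""
    n = sum(freqs)
    target = k * n / parts  # cumulative frequency to locate on the ogive
    cumulative = 0
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    return lower_boundaries[-1] + width


median = grouped_quantile(1, 2)
q1 = grouped_quantile(1, 4)
q3 = grouped_quantile(3, 4)
d4 = grouped_quantile(4, 10)
d7 = grouped_quantile(7, 10)
p10 = grouped_quantile(10, 100)
p90 = grouped_quantile(90, 100)
iqr = q3 - q1
```

The same function covers quartiles, deciles and percentiles because only the fraction k/parts changes; the values read from a hand-drawn ogive should agree with these up to drawing accuracy.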
Question # 88
From the following data locate mode graphically.
Classes   7-11   12-16   17-21   22-26   27-31   32-36   37-41   42-46   47-51
f            2       6      11      18      22      17      10       7       5

The necessary calculations are given below, and graphic location is in fig 3.2
Classes      f     Class Boundaries
7 – 11       2     6.5 – 11.5
12 – 16      6     11.5 – 16.5
17 – 21      11    16.5 – 21.5
22 – 26      18    21.5 – 26.5
27 – 31      22    26.5 – 31.5
32 – 36      17    31.5 – 36.5
37 – 41      10    36.5 – 41.5
42 – 46      7     41.5 – 46.5
47 – 51      5     46.5 – 51.5

Fig 3.2

Question # 89

Ans: Q1 = 1.2 Q3 = 3.7, IQR = Q3 – Q1 = 3.7 – 1.2 = 2.5

Question # 90

Ans: S = 0.831

Solution: X: 1 2 3 4; f: 5 15 12 3; Σf = 35, ΣfX = 83, ΣfX² = 221, S = 0.831
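The sums in this solution can be verified directly. The sketch assumes the shortcut formula with divisor Σf, S = sqrt(ΣfX²/Σf - (ΣfX/Σf)²), since that is what reproduces S = 0.831 from the values shown.

```python
import math

xs = [1, 2, 3, 4]
fs = [5, 15, 12, 3]

n = sum(fs)                                     # Σf
sum_fx = sum(f * x for f, x in zip(fs, xs))     # ΣfX
sum_fx2 = sum(f * x**2 for f, x in zip(fs, xs)) # ΣfX²

# Standard deviation with divisor Σf (assumed convention)
s = math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)
```

Using divisor Σf - 1 instead would give a slightly larger value, so the choice of divisor should match the one used elsewhere in the course.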
Question # 91
Circle the correct option i.e. A / B / C / D.
(1) If any value in a series is zero, then we cannot calculate the:
(A) Mean (B) Geometric mean
(C) Mode (D) Median
(2) The empirical relationship between mean, median and mode: the value of mode =
(A) 3 Mean – 2 Median (B) 2 Mean – 3 Median
(C) 3 Median – 2 Mean (D) 2 Median – 3 Mean
(3) If each value of a variable is increased by 10%, the GM of the new variable increases by:
(A) 10 (B) 0.01
(C) 10% (D) 0.11
(4) If mean of 5 values is 10, then the sum of the values will be:
(A) 2 (B) 15
(C) 25 (D) 50
(5) If the arithmetic mean of two numbers X1 and X2 is 5 and X1 = 3, then X2 is:
(A) 3 (B) 5
(C) 7 (D) 10
(6) In a moderately skewed distribution, the mean is 11 and the median is 13 then the
value of mode is:
(A) 15 (B) 13
(C) 11 (D) 17
(7) The lack of uniformity or symmetry is called:
(A) Skewness (B) Dispersion
(C) Kurtosis (D) Standard deviation
Answer: 1. B 2. C 3. C 4. D 5. C 6. D 7. A
CRITICAL THINKING PROBLEM
( No # 1 )
SKILL CHECK
An Internet site compares the strokes per round of two professional golfers.
Which golfer is more consistent: Player A with x̄ = 71.5 strokes and s = 2.3
strokes, or Player B with x̄ = 70.1 strokes and s = 1.2 strokes? Explain.
Answer: Player B
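One way to justify this answer is to compare relative variation. The sketch below uses the coefficient of variation, CV = (s / mean) × 100; choosing the CV as the yardstick is an assumption, since the slide gives only the final answer, and the function name is hypothetical.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Relative variation as a percentage of the mean."""
    return sd / mean * 100


cv_a = coefficient_of_variation(71.5, 2.3)  # Player A
cv_b = coefficient_of_variation(70.1, 1.2)  # Player B

# The smaller coefficient of variation indicates the more consistent player
more_consistent = "Player A" if cv_a < cv_b else "Player B"
```

Player B also has the smaller standard deviation outright, so both the absolute and the relative comparison point the same way.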

CRITICAL THINKING PROBLEM


( No # 2 )
DATA ANALYSIS
A consumer testing service obtained the following miles per gallon in five
test runs performed with three types of cars.
         Run 1   Run 2   Run 3   Run 4   Run 5
Car A:     28      32      28      30      34
Car B:     32      29      34      29      33
Car C:     29      32      28      32      30
(i) If the manufacturer of Car A wants to advertise that their car performed best in this test, which measure of central
tendency – mean, median, or mode – should be used for their claim? Explain your reasoning.
(ii) If the manufacturer of Car B wants to advertise that their car performed best in this test, which measure of
central tendency – mean, median, or mode – should be used for their claim? Explain your reasoning.
(iii) If the manufacturer of Car C wants to advertise that their car performed best in this test, which measure of
central tendency – mean, median, or mode – should be used for their claim? Explain your reasoning.
Answer:
• Mean should be used since Car A has the highest mean of the three.
• Median should be used since Car B has the highest median of the three.
• Mode should be used since Car C has the highest mode of the three.
Homework
EXERCISES. (Elementary Statistics, Bluman, 4th Edition)
• Examples on Page # 100 ~ 136.
• Example # 3.19 ~ 3.23, 3.25, 3.26, 3.32 ~ 3.37, 3.42.
• Exercises on page # 95, 97 ~ 98, 101 ~ 105, 114, 125, 126, 142, 143, 146.
• Problem #: 1 ~ 3, 26 ~ 33, 71 ~ 85, 88, 101 ~ 103, 116, 138 ~ 142, 144 ~ 150, 27 ~ 33.

EXERCISES. (Statistics for Business & Economics, Newbold, 6th Edition)


• Exercises on Page # 50, 51, 59.
• Problem#: 3.1 ~ 3.8, 3.12 ~ 3.15, 3.17, 3.18, 3.19.

CASE STUDY: SUNGLASS SALES IN THE UNITED STATES
