Study of Averages Final


Measures of Central Tendency

Unit 1

What are the objectives of study of Averages or Measures of Central Tendency?

There are two main objectives:

1. To get one single value that describes the characteristics of the entire data.

2. To facilitate comparison.

CHARACTERISTICS OF A GOOD AVERAGE

1. It should be easy to understand.
2. It should be simple to compute.
3. It should be based on all the observations.
4. It should be rigidly defined.
5. It should be capable of further algebraic treatment.
6. It should not be affected by extreme values.

There are five different measures of central tendency:

1. Arithmetic Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean

Depending upon the need and nature of the study, the proper measure is chosen.

1. ARITHMETIC MEAN (A.M.)

It is the most popular and widely used method for representing the entire data.

Calculation of Arithmetic Mean: Ungrouped Data

i) Direct method
ii) Short-cut method

Calculation of Arithmetic Mean: Ungrouped Data (Direct Method)

If $X_1, X_2, \dots, X_n$ denote n observations, the A.M., denoted by $\bar{X}$, is defined as:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$$

Example:
The following figures relate to the monthly output of cloth of a factory in a given year. Compute the average monthly output.

Month     Output (in '000 metres)
Jan        80
Feb        88
March      92
April      84
May        96
June       92
July       96
August    100
Sept       92
Oct        94
Nov        98
Dec        86

Short-cut Method
To simplify the manual calculations, we may sometimes use:
1. Change of origin: we add or subtract (usually subtract) a constant from each individual observation.
2. Change of scale: we multiply or divide each individual observation by a constant.
3. A combination of the above two.

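Solution by Using the Short-cut Method

The working table has three columns, where A is the assumed mean and c is the constant used for the change of scale:

X     d = X - A     d* = (X - A)/c

The X column holds the twelve monthly outputs: 80, 88, 92, 84, 96, 92, 96, 100, 92, 94, 98, 86.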

Effect of Change of Origin and Scale on A.M.

Remember that the A.M. is not independent of a change of origin. This means that if the same constant is subtracted from each observation, it must be added back to the final answer. The A.M. is also not independent of a change of scale. This means that if each observation is divided by the same constant, the final answer must be multiplied by that constant.

Theoretically, we can select any value as the assumed mean A. However, to simplify the calculation work, the selected value should be as near to the actual mean $\bar{X}$ as possible.

$$\bar{X} = A + \frac{\sum d^{*}}{n} \times c$$
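As a quick check of the direct and short-cut methods on the monthly output example, the following Python sketch computes the mean both ways. The assumed mean `A = 92` and the scale constant `c = 2` are illustrative choices, not values given in the slides; any other choice should lead to the same answer.

```python
# Monthly output in '000 metres, Jan to Dec (from the example)
output = [80, 88, 92, 84, 96, 92, 96, 100, 92, 94, 98, 86]
n = len(output)

# Direct method: sum of observations divided by their number
direct_mean = sum(output) / n

# Short-cut method: A and c are illustrative assumptions
A = 92   # assumed mean (change of origin)
c = 2    # common factor (change of scale)
d_star = [(x - A) / c for x in output]
shortcut_mean = A + (sum(d_star) / n) * c

print(direct_mean, shortcut_mean)
```

Both methods agree because the constant that was subtracted is added back and the divisor is multiplied back in the last step.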

Formula of Arithmetic Mean for Ungrouped Data with Frequencies

Suppose there are n values $X_1, X_2, \dots, X_n$, of which $X_1$ has occurred $f_1$ times, $X_2$ has occurred $f_2$ times, ..., and $X_n$ has occurred $f_n$ times. Then

$$\bar{X} = \frac{\sum f X}{\sum f}$$

Application
The following is the frequency distribution of the ages of 670 students of a school. Compute the arithmetic mean of the data.

Age in years (X)     No. of students (f)
 5                    25
 6                    45
 7                    90
 8                   165
 9                   112
10                    96
11                    81
12                    26
13                    18
14                    12

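Direct Method

Age in years (X)     No. of students (f)     fX
 5                    25                      125
 6                    45                      270
 7                    90                      630
 8                   165                     1320
 9                   112                     1008
10                    96                      960
11                    81                      891
12                    26                      312
13                    18                      234
14                    12                      168
Total                670                     5918

$\sum fX = 5918$ and $\sum f = 670$, so $\bar{X} = 5918 / 670 \approx 8.83$ years.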
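The frequency-weighted formula can be verified on the age data with a short Python sketch; the variable names are illustrative only.

```python
# Age distribution of 670 students (from the example)
ages = [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
freq = [25, 45, 90, 165, 112, 96, 81, 26, 18, 12]

total_f = sum(freq)                                # sum of frequencies
total_fx = sum(f * x for f, x in zip(freq, ages))  # sum of f times X
mean_age = total_fx / total_f

print(total_f, total_fx, round(mean_age, 2))
```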

Short-cut Method

Calculation of Arithmetic Mean of the Grouped Data

Class interval     Frequency
 0-10               3
10-20               8
20-30              12
30-40              15
40-50              18
50-60              16
60-70              11
70-80               5

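Direct method

$$\bar{X} = \frac{\sum f X}{\sum f}$$

where X is the mid-point of each class.

Class interval     Frequency (f)     Mid value (x)     fx
 0-10               3                  5                 15
10-20               8                 15                120
20-30              12                 25                300
30-40              15                 35                525
40-50              18                 45                810
50-60              16                 55                880
60-70              11                 65                715
70-80               5                 75                375
Total              88                                  3740

$\bar{X} = 3740 / 88 = 42.5$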

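Short-cut method

Class interval     Frequency (f)     Mid value (x)     d = x - 35     fd
 0-10               3                  5                -30            -90
10-20               8                 15                -20           -160
20-30              12                 25                -10           -120
30-40              15                 35                  0              0
40-50              18                 45                 10            180
50-60              16                 55                 20            320
60-70              11                 65                 30            330
70-80               5                 75                 40            200
Total              88                                                  660

$$\bar{X} = A + \frac{\sum f d}{\sum f} = 35 + \frac{660}{88} = 42.5$$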
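The grouped-data calculation can be reproduced with a minimal Python sketch that uses class mid-points for both the direct and the short-cut method. The assumed mean 35 is the one used in the working; the variable names are illustrative.

```python
# Class intervals and frequencies (from the example)
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freq = [3, 8, 12, 15, 18, 16, 11, 5]

# Mid-point of each class stands in for its observations
mid = [(lo + hi) / 2 for lo, hi in classes]

# Direct method
sum_f = sum(freq)
sum_fx = sum(f * x for f, x in zip(freq, mid))
direct_mean = sum_fx / sum_f

# Short-cut method with assumed mean A = 35
A = 35
sum_fd = sum(f * (x - A) for f, x in zip(freq, mid))
shortcut_mean = A + sum_fd / sum_f

print(sum_fx, sum_fd, direct_mean, shortcut_mean)
```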

Properties of Arithmetic Mean

1. The sum of the deviations of all the observations from their arithmetic mean is always zero, i.e. $\sum (X - \bar{X}) = 0$, where $\bar{X}$ is the actual mean.

Example

X            $X - \bar{X}$
10           -20
20           -10
30             0
40           +10
50           +20
$\sum X = 150$     $\sum (X - \bar{X}) = 0$

Here $\bar{X} = 150 / 5 = 30$.

According to this property, the arithmetic mean serves as a point of balance or centre of gravity of the distribution, since the sum of the positive deviations (i.e. deviations of observations greater than $\bar{X}$) is equal in magnitude to the sum of the negative deviations (i.e. deviations of observations less than $\bar{X}$).

2. The sum of the squares of the deviations is minimum when they are taken from the arithmetic mean, i.e. $\sum (X - A)^2$ is minimum when $A = \bar{X}$.

Example

X            $X - \bar{X}$        $(X - \bar{X})^2$
2            -2                   4
3            -1                   1
4             0                   0
5            +1                   1
6            +2                   4
$\sum X = 20$      $\sum (X - \bar{X}) = 0$      $\sum (X - \bar{X})^2 = 10$

If the deviations are taken from any value other than the actual mean, the sum of the squared deviations will be greater than 10.
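Both properties can be checked numerically on the data 2, 3, 4, 5, 6. The sketch below is only an illustration: it compares the sum of squared deviations about the mean with the sums about a few arbitrarily chosen alternative values.

```python
x = [2, 3, 4, 5, 6]
mean = sum(x) / len(x)

def sum_sq_dev(values, a):
    """Sum of squared deviations of the values about the point a."""
    return sum((v - a) ** 2 for v in values)

# Property 1: deviations about the mean sum to zero
sum_dev = sum(v - mean for v in x)

# Property 2: the sum of squares is smallest about the mean
at_mean = sum_sq_dev(x, mean)
# Alternative origins chosen arbitrarily for comparison
others = {a: sum_sq_dev(x, a) for a in (3, 3.5, 4.5, 5)}

print(sum_dev, at_mean, others)
```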

3. The arithmetic mean is capable of being treated algebraically.

If $\bar{X}_1$ and $n_1$ are the mean and number of observations of group 1, and $\bar{X}_2$ and $n_2$ are the mean and number of observations of group 2, then the mean $\bar{X}$ of the combined series of $n_1 + n_2$ observations is given by

$$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$

Illustration 1
There are two branches of a company employing 100 and 80 employees respectively. If the average monthly salaries paid by the two branches are Rs. 4570 and Rs. 6750 respectively, find the average monthly salary of the employees of the company as a whole.
Ans: Rs. 5538.89
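The combined-mean formula can be sketched in Python as below. The helper name `combined_mean` is hypothetical; it simply weights each group mean by its group size.

```python
def combined_mean(means, counts):
    """Mean of the pooled groups, given each group's mean and size."""
    total = sum(m * n for m, n in zip(means, counts))
    return total / sum(counts)

# Two branches with 100 and 80 employees
company_mean = combined_mean([4570, 6750], [100, 80])
print(round(company_mean, 2))
```

The same relation can be rearranged to recover an unknown group mean when the combined mean and the other group's mean are known.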

Illustration 2
The average marks in statistics of 100 students of a class was 72. The average marks of boys was 75, while their number was 70. Find the average marks of girls in the class.

Ans: 65

Illustration 3
The mean age of a combined group of men and women is 30 years. If the mean age of the group of men is 32 and that of the group of women is 25, find the percentage of men and women in the group.

Ans: 71.43% men and 28.57% women

Illustration 4
The average rainfall for a week, excluding Sunday, was 10 cm. Due to heavy rainfall on Sunday, the average for the week rose to 15 cm. How much rain fell on Sunday?

Ans: 45 cm

Illustration 5
There are 130 teachers and 100 non-teaching employees in a college. The respective distributions of their monthly salaries are given in the following tables.

Teachers
Monthly salary (Rs)     Frequency
4000-5000                10
5000-6000                16
6000-7000                22
7000-8000                67
8000-9000                15
Total                   130

Non-teaching employees
Monthly salary (Rs)     Frequency
1000-2000                21
2000-3000                45
3000-4000                28
4000-5000                 6
Total                   100

From the above data find:
i) Average monthly salary of a teacher.
ii) Average monthly salary of a non-teaching employee.
iii) Average monthly salary of a college employee (teaching and non-teaching).

To Find a Missing Frequency

The following is the distribution of weights (in lbs.) of 60 students of a class:

Weight       No. of students
 93-97        2
 98-102       5
103-107      12
108-112       ?
113-117      14
118-122       ?
123-127       3
128-132       1
Total        60

If the mean weight of the students is 110.917, find the missing frequencies.

Ans: 17 and 6

Example
Find the missing item (x) of the following frequency distribution, whose arithmetic mean is 11.37.

X:   5    7    x    11   13   16
f:   2    4    29   54   11   8

20

Merits and Demerits of Arithmetic Mean

Of all the averages, the arithmetic mean is the most popular in statistics because of the following merits:

1. It is the simplest average to understand and the easiest to compute.
2. The arithmetic mean is rigidly defined by a mathematical formula. As a result, everyone who computes the average from a given set of data gets the same answer.
3. The calculation of the arithmetic mean is based on all the observations and hence it can be considered representative of the given data.
4. Being determined by a rigid formula, it is capable of being treated mathematically and hence is widely used in statistical analysis.
5. The mean is typical in the sense that it is the centre of gravity, balancing the values on either side of it.

Demerits
Although the arithmetic mean satisfies most of the properties of an ideal average, it has certain drawbacks and should be used with care. Some demerits of the arithmetic mean are:

1. Since the value of the mean depends upon each and every item of the series, extreme values (i.e. very small and very large items) affect the value of the average. Hence, it cannot properly represent data containing some extreme observations.
2. It can neither be determined by inspection nor by graph.
3. The arithmetic mean cannot be computed for qualitative data, such as data on intelligence, honesty, smoking habit, etc.
4. The arithmetic mean cannot be computed when class intervals have open ends. In such cases, some assumption regarding the size (width) of the open-end classes must be made in order to compute the mean.

Median
The median by definition refers to the middle value in a distribution when the observations are arranged in ascending or descending order of magnitude.

In other words, the median of a distribution is that value which divides it into two equal parts. It is called a positional average because its value depends upon the position of an item and not on its magnitude.

Determination of Median
a) When individual observations are given

Example: The incomes of five employees are Rs. 900, 950, 1020, 1200 and 1280. Calculate the median.

Example on Calculation of Median for an Even Number of Observations

The incomes of six employees are Rs. 900, 950, 1020, 1200, 1280 and 1300. Calculate the median.

Note: Here there is no single middle value, so the median is taken to be the A.M. of the two middle-most items.

Hence, when the number of observations is even, the median is found by averaging the two middle values.

Observations:
The following steps are involved in the calculation of the median:
i. Arrange the given observations in either ascending or descending order of magnitude.
ii. Given n observations, when n is odd the median is the $\frac{n+1}{2}$th observation.
iii. When n is even, $\frac{n+1}{2}$ is not a whole number, so the median is the A.M. of the two middle values, i.e. the $\frac{n}{2}$th and $\left(\frac{n}{2} + 1\right)$th observations.
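The steps above can be sketched in Python as follows. The function name `median_of` is hypothetical; it sorts the data and then picks the middle item or averages the two middle items.

```python
def median_of(values):
    """Median of individual observations (odd or even count)."""
    data = sorted(values)        # step i: arrange in ascending order
    n = len(data)
    mid = n // 2
    if n % 2 == 1:               # step ii: odd n, the (n+1)/2 th item
        return data[mid]
    # step iii: even n, mean of the two middle items
    return (data[mid - 1] + data[mid]) / 2

five_incomes = [900, 950, 1020, 1200, 1280]
six_incomes = [900, 950, 1020, 1200, 1280, 1300]
print(median_of(five_incomes), median_of(six_incomes))
```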

Illustrations
Obtain the value of median from the following data:

391 384 591 407 672 522 777 753 2488 1490

b) Computation of Median: Discrete Series

Example: From the following data find the value of the median.

Income (Rs.)     No. of persons
1000              24
1500              26
 800              16
2000              20
2500               6
1800              30

Arranged in ascending order of income, with cumulative frequencies:

Income (Rs.)     No. of persons     Cumulative frequency (c.f.)
 800              16                  16
1000              24                  40
1500              26                  66
1800              30                  96
2000              20                 116
2500               6                 122

Steps
i. Arrange the data in ascending or descending order of magnitude.
ii. Find the cumulative frequencies.
iii. Apply the formula: Median = size of the $\frac{N+1}{2}$th item, where N is the total frequency.
iv. Now look at the cumulative frequency column and find the total which is either equal to $\frac{N+1}{2}$ or next higher to it, and determine the value of the variable corresponding to it. That gives the value of the median.
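A minimal Python sketch of these four steps, applied to the income data of the example, is given below. The function name `discrete_median` is hypothetical.

```python
from itertools import accumulate

def discrete_median(values, freq):
    """Median of a discrete series via cumulative frequencies."""
    pairs = sorted(zip(values, freq))             # step i
    cum = list(accumulate(f for _, f in pairs))   # step ii
    position = (cum[-1] + 1) / 2                  # step iii
    # step iv: first c.f. equal to or next higher than the position
    for (value, _), cf in zip(pairs, cum):
        if cf >= position:
            return value

income = [1000, 1500, 800, 2000, 2500, 1800]
persons = [24, 26, 16, 20, 6, 30]
median_income = discrete_median(income, persons)
print(median_income)
```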

c) Calculation of Median: Continuous Series

Example: Calculate the median of the following frequency distribution.

Classes     Frequency     c.f.
 0-10        5             5
10-20       12            17
20-30       14            31
30-40       18            49
40-50       13            62
50-60        8            70

Determine the median class. In grouped data use $\frac{N}{2}$, and not $\frac{N+1}{2}$, to locate the median class.

Apply the formula:

$$\text{Median} = L + \frac{N/2 - c.f.}{f} \times i$$

L = lower boundary of the median class
c.f. = cumulative frequency of the class preceding the median class
f = frequency of the median class
i = class interval of the median class

Here N/2 = 35, so the median class is 30-40.

$$\text{Median} = 30 + \frac{35 - 31}{18} \times 10 = 32.22$$
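The interpolation formula for a continuous series can be sketched in Python as below. The function name `grouped_median` is hypothetical, and class widths are read from each class separately so that unequal intervals need no adjustment.

```python
from itertools import accumulate

def grouped_median(classes, freq):
    """Median of a continuous series by interpolation in the median class."""
    half = sum(freq) / 2                      # N/2 locates the median class
    cum = list(accumulate(freq))
    for k, cf in enumerate(cum):
        if cf >= half:
            lower, upper = classes[k]
            cf_before = cum[k - 1] if k > 0 else 0
            width = upper - lower             # class interval i
            return lower + (half - cf_before) / freq[k] * width

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freq = [5, 12, 14, 18, 13, 8]
median_value = grouped_median(classes, freq)
print(round(median_value, 2))
```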

More Illustrations: Calculate the Median

Marks               5-10   10-15   15-20   20-25   25-30   30-35   35-40   40-45   45-50
No. of students     7      15      24      31      42      30      26      15      10

Illustration: Calculate the Median

Weight (in grams)   410-419   420-429   430-439   440-449   450-459   460-469   470-479
No. of apples       14        20        42        54        45        18        7

Example of Open-end Class

Class interval       Frequency     c.f.
Less than 425          2              2
425-475                8             10
475-525               33             43
525-575               80            123
575-625              170            293
625-675              213            506
675-725              213            719
725-775              145            864
775-825               91            955
825-875               45           1000

Determination of Missing Frequencies

If the frequencies of some classes are missing but the median of the distribution is known, these frequencies can be determined by using the median formula.

Example
The following table gives the distribution of daily wages of 900 workers. However, the frequencies of the classes 40-50 and 60-70 are missing. If the median of the distribution is Rs. 59.25, find the missing frequencies.

Wages (Rs.)     Frequency
30-40           120
40-50           ? (f1)
50-60           200
60-70           ? (f2)
70-80           185

Ans: f1 = 145, f2 = 250

Calculation of Median when Class Intervals are Unequal

Calculate the median from the following data:

Marks               0-10   10-30   30-60   60-80   80-90
No. of students     5      15      30      8       2

Note: When the class intervals are unequal, there is no need to make any adjustment to make them equal.

Ans: 40

Merits and Demerits of Median

Merits
1) It is easy to understand and simple to calculate.
2) The median can be determined even when class intervals have open ends (since only the position, and not the values of the items, must be known).
3) The median is recommended if the classes are not of equal width, since it is easier to compute than the mean.
4) Extreme values do not influence the median. In fact, when extreme values are present in the data, the median is a more satisfactory measure of central tendency than the mean. For example, the median of 10, 20, 30, 40 and 150 is 30, whereas the mean is 50.
5) The value of the median can be determined graphically, whereas the value of the A.M. cannot.

Demerits or Limitations of Median

1. For calculating the median, it is necessary to arrange the data in order of magnitude, which may be a cumbersome task, particularly when the number of observations is very large.
2. It is not capable of algebraic treatment. For example, the median cannot be used for determining the combined median of two or more groups, as is possible in the case of the mean.
3. It is not based on all the values.

Related Positional Averages

Besides the median, there are other measures which divide a series into equal parts. Important among these are quartiles, deciles and percentiles.

Quartiles
Quartiles are those values of the variable which divide the total frequency into four equal parts, deciles divide the total frequency into 10 equal parts, and percentiles divide the total frequency into 100 equal parts.

Computation of Quartiles
Since three values are needed to divide a distribution into four equal parts, there are three quartiles, Q1, Q2 and Q3, known as the first, second and third quartiles respectively.

In individual observations and discrete series:
Q1 = size of the $\frac{N+1}{4}$th item
Q2 = size of the $\frac{N+1}{2}$th item
Q3 = size of the $\frac{3(N+1)}{4}$th item

Example
The prices of a commodity in 8 different shops are as follows. Calculate the quartiles.
Price (Rs): 208, 205, 212, 209, 207, 210, 208, 206.

Solution: Arrange the prices in ascending order of magnitude:
205, 206, 207, 208, 208, 209, 210, 212

The first quartile is
Q1 = size of the $\frac{N+1}{4}$th value in the series
   = $\frac{8+1}{4}$th value
   = 2.25th value
   = 2nd value + 0.25 (3rd value - 2nd value)
   = 206 + 0.25 (207 - 206)
   = 206.25

Similarly, calculate Q2 and Q3.
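The positional rule with interpolation can be sketched in Python as follows. The function name `partition_value` is hypothetical, and clamping positions that fall outside the data to the first or last value is an assumption not discussed in the slides.

```python
def partition_value(values, k, parts):
    """k-th partition value (parts=4 for quartiles) using the
    k(N+1)/parts position rule with linear interpolation."""
    data = sorted(values)
    n = len(data)
    pos = k * (n + 1) / parts     # 1-based position, may be fractional
    whole = int(pos)
    frac = pos - whole
    if whole < 1:                 # assumption: clamp to the smallest value
        return data[0]
    if whole >= n:                # assumption: clamp to the largest value
        return data[-1]
    return data[whole - 1] + frac * (data[whole] - data[whole - 1])

prices = [208, 205, 212, 209, 207, 210, 208, 206]
q1 = partition_value(prices, 1, 4)
q2 = partition_value(prices, 2, 4)
q3 = partition_value(prices, 3, 4)
print(q1, q2, q3)
```

Calling the same function with `parts=10` or `parts=100` gives deciles and percentiles under the same rule.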

For Continuous Series

Q1 = size of the $\frac{N}{4}$th item
Q2 = size of the $\frac{N}{2}$th item
Q3 = size of the $\frac{3N}{4}$th item

Deciles and Percentiles

D1 = $\frac{N+1}{10}$th item (in individual and discrete series)
D1 = $\frac{N}{10}$th item (in continuous series)
...
P1 = $\frac{N+1}{100}$th item (in individual and discrete series)
P1 = $\frac{N}{100}$th item (in continuous series)

Illustration: From the following data compute the upper quartile (Q3), the lower quartile (Q1), D2, P5 and P90.

Marks        No. of students     c.f.
Below 10      8                   8
10-20        10                  18
20-40        22                  40
40-60        25                  65
60-80        10                  75
Above 80      5                  80 = N

Q1 = size of the $\frac{N}{4}$th item = 80/4 = 20th item. Hence Q1 lies in the class 20-40.

$$Q_1 = L + \frac{N/4 - c.f.}{f} \times i = 20 + \frac{20 - 18}{22} \times 20 = 21.82$$

Q3 = size of the $\frac{3N}{4}$th item = 60th item. Hence Q3 lies in the class 40-60.

$$Q_3 = L + \frac{3N/4 - c.f.}{f} \times i = 40 + \frac{60 - 40}{25} \times 20 = 56$$

D2 = size of the $\frac{2N}{10}$th item = 16th item. Hence D2 lies in the class 10-20.

$$D_2 = L + \frac{2N/10 - c.f.}{f} \times i = 10 + \frac{16 - 8}{10} \times 10 = 18$$

P5 = size of the $\frac{5N}{100}$th item = 4th item. Hence P5 lies in the class 0-10.
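The same interpolation formula serves every partition value of a continuous series once the target item kN/parts is known. The Python sketch below is one way to express it; the function name is hypothetical, "Below 10" is treated as 0-10 as in the working, and the upper limit 100 for "Above 80" is purely an assumption that none of the requested values depends on.

```python
from itertools import accumulate

def grouped_partition(classes, freq, k, parts):
    """k-th partition value of a continuous series (parts=4, 10 or 100)."""
    target = k * sum(freq) / parts        # the (kN/parts)th item
    cum = list(accumulate(freq))
    for j, cf in enumerate(cum):
        if cf >= target:
            lower, upper = classes[j]
            cf_before = cum[j - 1] if j > 0 else 0
            return lower + (target - cf_before) / freq[j] * (upper - lower)

# Open-end classes closed by assumption: 0-10 and 80-100
classes = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freq = [8, 10, 22, 25, 10, 5]

q1 = grouped_partition(classes, freq, 1, 4)
q3 = grouped_partition(classes, freq, 3, 4)
d2 = grouped_partition(classes, freq, 2, 10)
p5 = grouped_partition(classes, freq, 5, 100)
p90 = grouped_partition(classes, freq, 90, 100)
print(round(q1, 2), q3, d2, p5, p90)
```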

MODE
Mode is that value which occurs maximum number of times in a distribution. In other words, modal value is that value in a series of observations which occurs with the highest frequency.

For example, what is the mode of the following series 3,5,8,5,4,5,9,3. Ans. is 5 , since this value occurs more frequently than any of the others.

Example
Calculate the modal size of shoes from the following data:

Size of shoes     5    6    7    8    9    10   11
No. of shoes      10   20   25   40   22   15   6

Remarks:
i) When there are two or more values having the same maximum frequency, one cannot say which is the modal value, and hence the mode is said to be ill-defined. Such a series is also known as bi-modal or multi-modal. For example, observe the following data:
Income (Rs): 110, 120, 130, 120, 110, 140, 130, 120, 130, 140.
Since 120 and 130 have the same maximum frequency, i.e. 3, the modes are 120 and 130. So, in this case the mode is ill-defined.
When the mode is ill-defined, its value may be determined by the following formula, based upon the relationship between mean, median and mode: Mode = 3 Median - 2 Mean.
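The income example can be explored with Python's standard `statistics` module. The helper `empirical_mode` is a hypothetical name for the relation Mode = 3 Median - 2 Mean; it gives an approximation, not an exact mode.

```python
from statistics import mean, median, multimode

income = [110, 120, 130, 120, 110, 140, 130, 120, 130, 140]

# multimode lists every value that attains the highest frequency
modes = multimode(income)

def empirical_mode(median_value, mean_value):
    """Approximate mode from the relation Mode = 3 Median - 2 Mean."""
    return 3 * median_value - 2 * mean_value

approx_mode = empirical_mode(median(income), mean(income))
print(modes, approx_mode)
```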

Remarks:
ii) If the frequency of each possible value is the same, there is no mode. Example:

X     5    10   15   20   25   30   35   40
f     6    6    6    6    6    6    6    6

Calculation of Mode: Continuous Series

Step 1: By inspection, identify the modal class (the class having the highest frequency).
Step 2: Determine the value of the mode by applying the following formula:

$$M = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

where
L = lower boundary of the modal class
$\Delta_1$ = the difference between the frequency of the modal class and the frequency of the pre-modal class (i.e. the preceding class)
$\Delta_2$ = the difference between the frequency of the modal class and the frequency of the post-modal class (i.e. the succeeding class)
i = the class interval of the modal class

Illustration
Calculate the mode from the following data:

Marks          No. of students
Above 0         80
Above 10        77
Above 20        72
Above 30        65
Above 40        55
Above 50        43
Above 60        28
Above 70        16
Above 80        10
Above 90         8
Above 100        0

Solution: Since this is a "more than" type cumulative frequency distribution, we first convert it into a simple frequency distribution.

Marks       No. of students
 0-10        3
10-20        5
20-30        7
30-40       10
40-50       12
50-60       15
60-70       12
70-80        6
80-90        2
90-100       8

By inspection the modal class is 50-60, so L = 50, $\Delta_1 = 15 - 12 = 3$, $\Delta_2 = 15 - 12 = 3$ and i = 10.

$$M = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i = 50 + \frac{3}{3 + 3} \times 10 = 55$$

Find the mode of the data given below:

Weight (kg)         93-97   98-102   103-107   108-112   113-117   118-122   123-127   128-132
No. of students     2       5        12        17        14        6         3         1

By inspection the mode lies in the class 108-112. But the boundaries of this class are 107.5-112.5.

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

L = 107.5, $\Delta_1 = 17 - 12 = 5$, $\Delta_2 = 17 - 14 = 3$, i = 5

Mode = 107.5 + 3.125 = 110.625
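The mode formula for a continuous series can be sketched as a small Python function. The function name and parameter names are hypothetical; the modal class is assumed to have been identified already.

```python
def grouped_mode(lower, f_prev, f_modal, f_next, width):
    """Mode by interpolation inside the modal class."""
    delta1 = f_modal - f_prev    # modal minus pre-modal frequency
    delta2 = f_modal - f_next    # modal minus post-modal frequency
    return lower + delta1 / (delta1 + delta2) * width

# Weight data: modal class 108-112 with boundaries 107.5-112.5
mode_weight = grouped_mode(107.5, 12, 17, 14, 5)
print(mode_weight)
```

Note that the lower class boundary (107.5), not the stated lower limit (108), is passed in when the classes are of the inclusive type.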

Calculation of Mode by the Method of Grouping and Analysis

In a frequency distribution, the mode sometimes cannot be determined just by inspection (looking at the table), for instance when the difference between the maximum frequency and the frequency preceding or succeeding it is very small. In such cases it is desirable to prepare a grouping table and an analysis table. These tables help us in determining the mode.

Example: Discrete Series
Calculate the value of the mode for the following table:

Marks (X)     10   15   20   25   30   35   40
Frequency     8    12   36   35   28   18   9

Solution: Since it is difficult to say by inspection which is the modal value, we prepare grouping and analysis tables.

Steps for preparing a grouping table:
1. Prepare a table consisting of six columns in addition to a column for the various values of X.
2. In the first column write the frequencies against the various values of X as given in the question, and mark the highest frequency.
3. In the second column, starting from the top, take the sum of the frequencies grouped in twos, and mark the highest total.
4. In the third column, leave the first frequency, then group the remaining frequencies in twos, take their sums, and mark the highest total.
5. In column 4, group the frequencies in threes, take the sum of the frequencies of each group, and mark the highest total.
6. In column 5, leave the first frequency, then group the rest in threes. Take the sum of the frequencies of each group and mark the highest total.
7. In column 6, leave the first two frequencies, then group the rest in threes. Take the sum of the frequencies of each group and mark the highest total.

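Grouping table

X      Col 1 (f)
10      8
15     12
20     36 *
25     35
30     28
35     18
40      9

Col 2 (twos from the top): 8+12 = 20, 36+35 = 71 *, 28+18 = 46
Col 3 (twos, leaving the first): 12+36 = 48, 35+28 = 63 *, 18+9 = 27
Col 4 (threes from the top): 8+12+36 = 56, 35+28+18 = 81 *
Col 5 (threes, leaving the first): 12+36+35 = 83 *, 28+18+9 = 55
Col 6 (threes, leaving the first two): 36+35+28 = 99 *

(* marks the highest total in each column.)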

After this grouping, prepare an analysis table:

Col no.     10    15    20    25    30    35    40
1                       /
2                       /     /
3                             /     /
4                             /     /     /
5                 /     /     /
6                       /     /     /
Total       0     1     4     5     3     1     0

The values against which the totals are highest are marked in the grouping table and entered by means of a bar (/) in the analysis table. The total is maximum for the value 25. So the mode is 25.
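The grouping and analysis tables can be mimicked with a short Python sketch. The function name `grouping_votes` is hypothetical: for each of the six columns it forms the complete groups, finds the highest total, and gives one bar to every value inside the winning group. Incomplete groups at the end of a column are ignored, and if two groups in a column tie, both receive bars; that tie rule is an assumption of this sketch.

```python
def grouping_votes(freq):
    """Return the 'Total' row of the analysis table for the frequencies."""
    n = len(freq)
    votes = [0] * n
    # (group size, leading frequencies left out) for columns 1 to 6
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for size, skip in columns:
        groups = [(start, sum(freq[start:start + size]))
                  for start in range(skip, n - size + 1, size)]
        best = max(total for _, total in groups)
        for start, total in groups:
            if total == best:
                for j in range(start, start + size):
                    votes[j] += 1
    return votes

marks = [10, 15, 20, 25, 30, 35, 40]
freq = [8, 12, 36, 35, 28, 18, 9]
votes = grouping_votes(freq)
mode_value = marks[votes.index(max(votes))]
print(votes, mode_value)
```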

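Calculate the mode of the following distribution:

Class interval     Frequency
 0-10               7
10-20              15
20-30              18
30-40              30
40-50              29
50-60               4
60-70               3
70-80               1

Step 1: Identify the modal class. By inspection it is difficult to locate, because the frequencies 30 and 29 are so close. Hence, the modal class will be determined by the method of grouping and analysis.

Grouping table

X         Col 1 (f)
 0-10      7
10-20     15
20-30     18
30-40     30 *
40-50     29
50-60      4
60-70      3
70-80      1

Col 2 (twos from the top): 22, 48 *, 33, 4
Col 3 (twos, leaving the first): 33, 59 *, 7
Col 4 (threes from the top): 40, 63 *
Col 5 (threes, leaving the first): 63 *, 36
Col 6 (threes, leaving the first two): 77 *, 8

Analysis table

Col no.     0-10   10-20   20-30   30-40   40-50   50-60   60-70   70-80
1                                  /
2                          /       /
3                                  /       /
4                                  /       /       /
5                  /       /       /
6                          /       /       /
Total       0      1       3       6       3       1       0       0

From the analysis table, the modal class is 30-40.

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

L = 30, $\Delta_1 = 30 - 18 = 12$, $\Delta_2 = 30 - 29 = 1$, i = 10

$$\text{Mode} = 30 + \frac{12}{12 + 1} \times 10 \approx 39.23$$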

Merits and Demerits of Mode

Merits
1. It is easy to understand and easy to calculate. In many cases it can be located just by inspection.
2. It is not affected by extreme values. For example, the mode of 10, 2, 5, 10, 5, 60, 5, 10, 60, 10 is 10, as it has occurred most often in the data.
3. It can be determined even if the distribution has open-end classes.
4. It is the value around which the observations are most concentrated, and hence it is the best representative of the data.

Demerits
1. It is not based on all the observations.
2. It is not capable of further mathematical treatment.
3. The value of the mode cannot always be determined. In some cases we may have a bimodal or multimodal series.

1.a) For a distribution, mode and mean are 32.1 and 35.4 respectively. Find the value of median.

Ans 34.3

b) Given median = 20.6 mode = 26, find mean

Ans. 17.9

Geometric Mean (G.M.)

The geometric mean of a series of n positive numbers is defined as the nth root of their product:

$$G.M. = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n} = (X_1 X_2 \dots X_n)^{1/n}$$

For example, the G.M. of the 3 values 2, 3 and 4 would be:

$$G.M. = \sqrt[3]{2 \times 3 \times 4} = \sqrt[3]{24} \approx 2.884$$

When the number of items is large, the task of multiplying the numbers and extracting the root becomes excessively difficult. To simplify the calculations, logarithms are used. The geometric mean is then calculated as follows:

$$\log G.M. = \frac{\log X_1 + \log X_2 + \dots + \log X_n}{n} = \frac{\sum \log X}{n}$$

$$G.M. = \text{Antilog}\left(\frac{\sum \log X}{n}\right)$$
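The logarithmic route to the geometric mean can be sketched in Python as below. The function name `geometric_mean_logs` is hypothetical, and base-10 logarithms are used only to mirror log-table working; any base gives the same result.

```python
import math

def geometric_mean_logs(values):
    """G.M. as the antilog of the mean of the logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs positive numbers")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log        # antilog

gm = geometric_mean_logs([2, 3, 4])
print(round(gm, 3))
```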

Weighted Geometric Mean

If the various observations are not of equal importance in the data, we calculate the weighted geometric mean. The weighted geometric mean of the observations $X_1, X_2, \dots, X_n$ with respective weights $w_1, w_2, \dots, w_n$ is

$$G.M. = \text{Antilog}\left(\frac{\sum w \log X}{\sum w}\right)$$

Harmonic Mean
The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the data (none of which is zero). If there are n observations $X_1, X_2, \dots, X_n$, their harmonic mean is defined as:

$$H.M. = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$$

Example
1. Calculate the harmonic mean of 8 and 10.

2. Ram goes from his house to his office by car at a speed of 60 km/hour and returns at a speed of 40 km/hour. Find his average speed.
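Both examples can be worked with a small Python sketch of the definition. The function name `harmonic_mean_of` is hypothetical. For the speed problem the harmonic mean is appropriate because the same distance is covered at each speed.

```python
def harmonic_mean_of(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v == 0 for v in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / v for v in values)

hm_8_10 = harmonic_mean_of([8, 10])
average_speed = harmonic_mean_of([60, 40])   # km/hour over equal distances
print(round(hm_8_10, 2), round(average_speed, 2))
```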

Relation between A.M., G.M. and H.M.
For two positive numbers: $(G.M.)^2 = A.M. \times H.M.$

Application: If the arithmetic mean of two positive numbers is 15 and their geometric mean is 9, find their harmonic mean.
