Study of Averages Final
Unit 1
1. To get a single value that describes the characteristics of the entire data.
2. To facilitate comparison.
An ideal average should have the following properties:
1. It should be easy to understand.
2. It should be simple to compute.
3. It should be based on all the observations.
4. It should be rigidly defined.
5. It should be capable of further algebraic treatment.
6. It should not be affected by extreme values.
The arithmetic mean (A.M.) is the most popular and widely used measure for representing the entire data.
Example:
The following figures relate to the monthly output of cloth of a factory in a given year. Compute the average monthly output.
Month     Output (in '000 metres)
Jan       80
Feb       88
March     92
April     84
May       96
June      92
July      96
August    100
Sept      92
Oct       94
Nov       98
Dec       86
Short-cut Method
To simplify the manual calculations, we may sometimes use:
1. Change of origin: we add or subtract (usually subtract) a constant A to each individual observation: $d = X - A$
2. Change of scale: we multiply or divide each individual observation by a constant c.
3. A combination of the two: $d^{*} = \frac{X - A}{c}$
X: 80 88 92 84 96 92 96 100 92 94 98 86
Remember that the A.M. is not independent of a change of origin: if the same constant is subtracted from each observation, it must be added back to the final answer. Likewise, the A.M. is not independent of a change of scale: if each observation is divided by the same constant, the final answer must be multiplied by that constant.
Theoretically, any value can be selected as the assumed mean A. However, to simplify the calculation work, the selected value should be as close as possible to the values of X.
$\bar{X} = A + \frac{\sum d^{*}}{n} \times c$
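The direct and short-cut calculations can be compared on the monthly output figures given earlier. This is a minimal sketch: the assumed mean A = 90 and the scale c = 2 are illustrative choices, not values stated in the notes, and the function names are hypothetical.

```python
# Monthly cloth output in '000 metres (Jan..Dec), from the example above.
output = [80, 88, 92, 84, 96, 92, 96, 100, 92, 94, 98, 86]


def direct_mean(xs):
    """Arithmetic mean: sum of observations divided by their number."""
    return sum(xs) / len(xs)


def shortcut_mean(xs, assumed_mean, scale=1):
    """Mean via change of origin and scale: A + (sum of d* / n) * c."""
    d_star = [(x - assumed_mean) / scale for x in xs]  # d* = (X - A) / c
    return assumed_mean + sum(d_star) / len(xs) * scale


print(direct_mean(output))           # 91.5
print(shortcut_mean(output, 90, 2))  # 91.5
```

Both routes give 91.5 thousand metres, because the constant that was subtracted is added back and the scale factor is multiplied back in.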
In a discrete series, suppose there are observations out of which X1 has occurred f1 times, X2 has occurred f2 times, ..., Xn has occurred fn times. Then
$\bar{X} = \frac{\sum fX}{\sum f}$
Application
The following is the frequency distribution of the ages of 670 students of a school. Compute the arithmetic mean of the data.
Age (in years) X   Frequency f
5
6
7
8                  165
9                  112
10                 96
11                 81
12                 26
13                 18
14                 12
Direct Method
$\sum f = 670$, $\sum fX = 5918$
$\bar{X} = \frac{\sum fX}{\sum f} = \frac{5918}{670} \approx 8.83$ years
Short-Cut Method
Class Intervals   Frequency
0-10              3
10-20             8
20-30             12
30-40             15
40-50             18
50-60             16
60-70             11
70-80             5
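Direct method
$\bar{X} = \frac{\sum fX}{\sum f}$, where X = mid-point of each class
Class Intervals   Frequency (f)   Mid value (X)   fX
0-10              3               5               15
10-20             8               15              120
20-30             12              25              300
30-40             15              35              525
40-50             18              45              810
50-60             16              55              880
60-70             11              65              715
70-80             5               75              375
Total             88                              3740
$\bar{X} = \frac{3740}{88} = 42.5$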
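Short-cut method
Class Intervals   Frequency (f)
0-10              3
10-20             8
20-30             12
30-40             15
40-50             18
50-60             16
60-70             11
70-80             5
$\bar{X} = A + \frac{\sum fd}{\sum f}$, where d = X - A, X is the mid-point of each class and A is the assumed mean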
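The two methods for a continuous series can be checked against each other on the class intervals and frequencies tabulated above. This is a sketch only: the assumed mean A = 35 is an arbitrary illustrative choice, not one taken from the notes.

```python
# Class intervals and frequencies from the continuous-series example above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [3, 8, 12, 15, 18, 16, 11, 5]

# Mid-point of each class stands in for every observation in that class.
mids = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)  # total frequency

# Direct method: sum(f * X) / sum(f)
direct = sum(f * x for f, x in zip(freqs, mids)) / n

# Short-cut method: A + sum(f * d) / sum(f), with d = X - A
A = 35  # assumed mean (illustrative choice)
shortcut = A + sum(f * (x - A) for f, x in zip(freqs, mids)) / n

print(n, direct, shortcut)  # 88 42.5 42.5
```

Any other assumed mean gives the same result; a value near the centre of the distribution simply keeps the deviations small for hand calculation.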
Example
X:               10    20    30    40    50     $\sum X = 150$
$X - \bar{X}$:   -20   -10   0     +10   +20    $\sum (X - \bar{X}) = 0$
According to this property, the arithmetic mean serves as a point of balance or centre of gravity of the distribution, since the sum of the positive deviations (i.e. deviations of observations which are greater than $\bar{X}$) is equal to the sum of the negative deviations (i.e. deviations of observations which are less than $\bar{X}$).
2. The sum of the squares of the deviations is minimum when they are taken from the arithmetic mean: $\sum (X - A)^2$ is minimum if $A = \bar{X}$.
Example
X     $X - \bar{X}$   $(X - \bar{X})^2$
2     -2              4
3     -1              1
4     0               0
5     +1              1
6     +2              4
$\sum X = 20$, $\sum (X - \bar{X}) = 0$, $\sum (X - \bar{X})^2 = 10$
If the deviation is taken from any other value (except actual mean), the sum of the squared deviation would be greater than 10.
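Both properties can be verified numerically on the values 2, 3, 4, 5, 6 used in the example. The helper name below is hypothetical, and the alternative origins 3 and 5 are chosen only for illustration.

```python
xs = [2, 3, 4, 5, 6]
mean = sum(xs) / len(xs)  # 4.0


def sum_sq_dev(values, origin):
    """Sum of squared deviations of the values from a chosen origin."""
    return sum((x - origin) ** 2 for x in values)


# Property 1: deviations from the mean add up to zero.
print(sum(x - mean for x in xs))  # 0.0

# Property 2: squared deviations are smallest when taken from the mean.
print(sum_sq_dev(xs, mean))  # 10.0
print(sum_sq_dev(xs, 3))     # 15
print(sum_sq_dev(xs, 5))     # 15
```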
If $\bar{X}_1$ and $n_1$ are the mean and number of observations of group 1, and $\bar{X}_2$ and $n_2$ are the mean and number of observations of group 2, then the mean $\bar{X}$ of the combined series of $n_1 + n_2$ observations is given by
$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$
Illustration 1
There are two branches of a company employing 100 and 80 employees respectively. If the average monthly salaries paid by the two branches are Rs. 4570 and Rs. 6750 respectively, find the average monthly salary of the employees of the company as a whole.
Ans:5538.89
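The combined-mean formula can be sketched as a small function and checked against Illustration 1. The function name and the (size, mean) tuple layout are assumptions made for this example.

```python
def combined_mean(groups):
    """Combined mean of several groups given as (size, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# Illustration 1: branches with 100 and 80 employees.
print(round(combined_mean([(100, 4570), (80, 6750)]), 2))  # 5538.89
```

The same relation can be rearranged when one of the group means is the unknown, as in Illustration 2.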
Illustration 2
The average marks in statistics of 100 students of a class was 72. The average marks of boys was 75, while their number was 70. Find the average marks of girls in the class.
Ans: 65
Illustration 3
The mean age of a combined group of men and women is 30 years. If the mean age of the group of men is 32 and that of the group of women is 25, find the percentage of men and women in the group.
Illustration 4
The average rainfall for a week, excluding Sunday, was 10 cms. Due to heavy rainfall on Sunday, the average for the week rose to 15 cms. How much rainfall was on Sunday?
Ans:45cms.
Illustration 5
There are 130 teachers and 100 non-teaching employees in a college. The respective distributions of their monthly salaries are given in the following table.
Teachers
Monthly Salary (Rs)   Frequency
4000-5000             10
5000-6000             16
6000-7000             22
7000-8000             67
8000-9000             15
Total                 130
Non-teaching Employees
Monthly Salary (Rs)   Frequency
1000-2000             21
2000-3000             45
3000-4000             28
4000-5000             6
Total                 100
From the above data find:
i) Average monthly salary of a teacher.
ii) Average monthly salary of a non-teaching employee.
Weight class   Frequency
123-127        3
128-132        1
Total          60
If the mean weight of the students is 110.917, find the missing frequencies.
Ans: 17 and 6
Example
Find out the missing item (x) of the following frequency distribution whose arithmetic mean is 11.37.
X
5 7 x 11 13 16
f
2 4 29 54 11 8
20
Merits
1. It is the simplest average to understand and the easiest to compute.
2. The arithmetic mean is rigidly defined by a mathematical formula. As a result, everyone who computes the average from a given set of data gets the same answer.
3. Calculation of the arithmetic mean is based on all the observations and hence it can be considered representative of the given data.
4. Being determined by a rigid formula, it is capable of being treated mathematically and hence is widely used in statistical analysis.
5. The mean is typical in the sense that it is the centre of gravity, balancing the values on either side of it.
Demerits
Although the arithmetic mean satisfies most of the properties of an ideal average, it has certain drawbacks and should be used with care. Some demerits of the arithmetic mean are:
1. Since the value of the mean depends upon each and every item of the series, extreme values (i.e. very small and very large items) affect the value of the average. Hence, it cannot represent data containing some extreme observations.
2. It can neither be determined by inspection nor by graph.
3. The arithmetic mean cannot be computed for qualitative data, such as data on intelligence, honesty, smoking habit etc.
4. The arithmetic mean cannot be computed when class intervals have open ends. In such cases, some assumption regarding the size (width) of the open-end classes must be made in order to compute the mean.
Median
The median, by definition, is the middle value in a distribution when the observations are arranged in ascending or descending order of magnitude.
In other words, the median of a distribution is the value which divides it into two equal parts. It is called a positional average because its value depends upon the position of an item and not on its magnitude.
Determination of Median
a) When individual observations are given:
Example: The incomes of five employees are Rs. 900, 950, 1020, 1200 and 1280. Calculate the median.
Note: When the number of observations is even, there is no single middle position value, and the median is taken to be the A.M. of the two middle-most items.
The following steps are involved in the calculation of the median:
i. The given observations are arranged in either ascending or descending order of magnitude.
ii. Given that there are n observations, the median is the $\frac{n+1}{2}$th observation when n is odd.
iii. When n is even, the $\frac{n+1}{2}$th position falls between two observations, so the median is the A.M. of the two middle values.
Illustrations
Obtain the value of median from the following data:
391 384 591 407 672 522 777 753 2488 1490
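The steps for individual observations can be sketched as follows, using the five incomes from the earlier example and the ten values listed above. The function name is hypothetical.

```python
def median(values):
    """Median of individual observations (positional average)."""
    ordered = sorted(values)   # step i: arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # n odd: the (n+1)/2-th observation
        return ordered[mid]
    # n even: A.M. of the two middle-most observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([900, 950, 1020, 1200, 1280]))  # 1020
print(median([391, 384, 591, 407, 672, 522, 777, 753, 2488, 1490]))  # 631.5
```

In the second data set the two middle values after sorting are 591 and 672, so the extreme values 1490 and 2488 have no effect on the result.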
2000
2500 1800
20
6 30
No. of persons 16 24 26 30 20 6
Cumulative Frequency(c.f.)
Steps
i.
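iv. Now look at the cumulative frequency (c.f.) column and find the total which is either equal to $\frac{n+1}{2}$ or the next higher value, and determine the value of the variable corresponding to it. That gives the value of the median.
Determine the median class. In grouped data, use $\frac{n}{2}$, and not $\frac{n+1}{2}$, to locate the median class.
$\text{Median} = L + \frac{\frac{n}{2} - c.f.}{f} \times i$
where L = lower boundary of the median class, c.f. = cumulative frequency of the class preceding the median class, f = frequency of the median class and i = width of the median class.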
14 20 42 54 45 18 7
frequency
2 8 33 80 170 213 213 145 91 45
c.f.
425-475
475-525 525-575
575-625
625-675 675-725
725-775
775-825 825-875
Example
The following table gives the distribution of daily wages of 900 workers. However, the frequencies of the classes 40-50 and 60-70 are missing. If the median of the distribution is Rs. 59.25, find the missing frequencies.
Wages (Rs.)   Frequency
30-40         120
40-50         ?
50-60         200
60-70         ?
70-80         185
Ans: f1 = 145, f2 = 250
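The stated answer can be checked with the usual grouped-data interpolation, Median = L + ((n/2 - c.f.) / f) x i, where L is the lower boundary of the median class, c.f. the cumulative frequency before it, f its frequency and i its width. The sketch below assumes equal class widths and positive frequencies; the function name is hypothetical.

```python
def grouped_median(lower_bounds, width, freqs):
    """Median of a continuous series by interpolation in the median class."""
    half = sum(freqs) / 2      # n/2 locates the median class in grouped data
    cum = 0                    # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cum + f >= half:
            return lower + (half - cum) * width / f
        cum += f
    raise ValueError("no median class found")


# Wage classes 30-40 ... 70-80 with the answer f1 = 145, f2 = 250 filled in.
lowers = [30, 40, 50, 60, 70]
freqs = [120, 145, 200, 250, 185]
print(sum(freqs), grouped_median(lowers, 10, freqs))  # 900 59.25
```

The frequencies add up to 900 and the median comes out at Rs. 59.25, as the problem requires.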
Note: when the class intervals are unequal, there is no need to make any adjustment to make it equal.
Ans:40
1) The median can be determined even when class intervals have open ends (since only the position, not the values, of the items must be known).
2) The median is recommended if the classes are not of equal width, since it is easier to compute than the mean.
3) Extreme values do not influence the median. In fact, when extreme values are present in the data, the median is a more satisfactory measure of central tendency than the mean. For example, the median of 10, 20, 30, 40 and 150 is 30 whereas the mean is 50.
4) The value of the median can be determined graphically whereas the value of the A.M. cannot be determined graphically.
Demerits
1. For calculating the median, it is necessary to arrange the data in order of magnitude, which may be a cumbersome task, particularly when the number of observations is very large.
2. It is not capable of algebraic treatment. For example, the medians of two or more groups cannot be used to determine a combined median, as is possible in the case of the mean.
3. It is not based on all the values.
Quartiles
Quartiles are those values of the variable which divide the total frequency into four equal parts, deciles divide the total frequency into 10 equal parts and the percentiles divide the total frequency into 100 equal parts .
Computation of Quartiles
Since three values are needed to divide a distribution into four equal parts, there are three quartiles: Q1 (lower quartile), Q2 (median) and Q3 (upper quartile).
In individual observations and discrete series:
$Q_1$ = size of the $\frac{N+1}{4}$th item
$Q_2$ = size of the $\frac{N+1}{2}$th item
$Q_3$ = size of the $\frac{3(N+1)}{4}$th item
Example
Prices of a commodity in 8 different shops are as follows. Calculate the quartiles.
Price (Rs): 208, 205, 212, 209, 207, 210, 208, 206.
Solution: Arrange the values in ascending order of magnitude: 205, 206, 207, 208, 208, 209, 210, 212.
The first quartile is
$Q_1$ = size of the $\frac{N+1}{4}$th value in the series
= $\frac{8+1}{4}$th value = 2.25th value
= 2nd value + 0.25 (3rd value - 2nd value)
= 206 + 0.25 (207 - 206) = 206.25
Similarly, calculate Q2 and Q3.
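The (N+1)-based positions and the interpolation between neighbouring values can be sketched as below for the eight shop prices. Function names are hypothetical; other software packages may use different quartile conventions and give slightly different values.

```python
def value_at_position(ordered, pos):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(pos)
    frac = pos - whole
    if whole < 1:
        return ordered[0]
    if whole >= len(ordered):
        return ordered[-1]
    # e.g. 2.25th value = 2nd value + 0.25 * (3rd value - 2nd value)
    return ordered[whole - 1] + frac * (ordered[whole] - ordered[whole - 1])


def quartiles(values):
    """Q1, Q2, Q3 as the k(N+1)/4-th items, k = 1, 2, 3."""
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


prices = [208, 205, 212, 209, 207, 210, 208, 206]
print(quartiles(prices))  # (206.25, 208.0, 209.75)
```

Note that 208 occurs twice in the data, so the sorted series has eight values and Q2 falls between the two 208s.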
Illustration
From the following data compute the values of the upper quartile (Q3), the lower quartile (Q1), D2, P5 and P90.
Marks       No. of Students   C.f.
Below 10    8                 8
10-20       10                18
20-40       22                40
40-60       25                65
60-80       10                75
Above 80    5                 80 = N
$Q_1$ = size of the $\frac{N}{4}$th item = 80/4 = 20th item. Hence $Q_1$ lies in the class 20-40.
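Once the class containing the required item is located, the value can be interpolated within that class in the same way as for the median. The sketch below assumes that interpolation rule, and it closes the open-ended classes as 0-10 and 80-100 purely so the table can be stored; those two boundaries are assumptions, and P5 is left out because it falls in the open "Below 10" class and would depend on them.

```python
def grouped_quantile(bounds, freqs, k, parts):
    """k-th of `parts` partition values (quartile, decile, percentile)."""
    target = k * sum(freqs) / parts   # position k*N/parts in the c.f. column
    cum = 0
    for (lower, upper), f in zip(bounds, freqs):
        if f > 0 and cum + f >= target:
            return lower + (target - cum) * (upper - lower) / f
        cum += f
    raise ValueError("position lies outside the distribution")


# First and last boundaries (0 and 100) are assumed for the open-ended classes.
bounds = [(0, 10), (10, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [8, 10, 22, 25, 10, 5]

q1 = grouped_quantile(bounds, freqs, 1, 4)       # 20th item, class 20-40
q3 = grouped_quantile(bounds, freqs, 3, 4)       # 60th item, class 40-60
d2 = grouped_quantile(bounds, freqs, 2, 10)      # 16th item, class 10-20
p90 = grouped_quantile(bounds, freqs, 90, 100)   # 72nd item, class 60-80
print(round(q1, 2), q3, d2, p90)  # 21.82 56.0 18.0 74.0
```

Because the class widths are unequal here, the width of the class actually containing the item is used in each interpolation.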
MODE
Mode is that value which occurs maximum number of times in a distribution. In other words, modal value is that value in a series of observations which occurs with the highest frequency.
For example, what is the mode of the following series 3,5,8,5,4,5,9,3. Ans. is 5 , since this value occurs more frequently than any of the others.
Example
Calculate the modal size of shoes from the following data:
Size of shoes   No. of shoes
5               10
6               20
7               25
8               40
9               22
10              15
11              6
Remarks:
i) When there are two or more values having the same maximum frequency, one cannot say which is the modal value and hence the mode is said to be ill-defined. Such a series is also known as bi-modal or multi-modal. For example, observe the following data:
Income (Rs): 110, 120, 130, 120, 110, 140, 130, 120, 130, 140.
Since 120 and 130 have the same maximum frequency, i.e. 3, the mode will be 120 and 130. So, in this case the mode is ill-defined.
When the mode is ill-defined, its value may be determined by the following formula, based upon the relationship between mean, median and mode: Mode = 3 Median - 2 Mean.
ii) $\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
where L = lower boundary of the modal class,
$\Delta_1$ = the difference between the frequency of the modal class and the frequency of the pre-modal class (i.e. the preceding class),
$\Delta_2$ = the difference between the frequency of the modal class and the frequency of the post-modal class (i.e. the succeeding class),
i = width of the modal class.
Illustration
Calculate the mode from the following data:
Marks        No. of students
Above 0      80
Above 10     77
Above 20     72
Above 30     65
Above 40     55
Above 50     43
Above 60     28
Above 70     16
Above 80     10
Above 90     8
Above 100    0
Solution: Since it is a "more than" type cumulative frequency distribution, we first convert it into a simple frequency distribution.
Marks     No. of students
0-10      3
10-20     5
20-30     7
30-40     10
40-50     12
50-60     15
60-70     12
70-80     6
80-90     2
90-100    8
By inspection, the mode lies in the class 108-112. But the boundaries of this class are 107.5-112.5.
$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
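The conversion of a "more than" cumulative distribution into simple frequencies, followed by the mode formula, can be sketched as below using the marks data from the Illustration above. It assumes a single, clearly highest class frequency (so the modal class can be found by inspection) and equal class widths; the function name is hypothetical.

```python
# "More than" cumulative frequencies: marks above 0, 10, ..., 100.
more_than = {0: 80, 10: 77, 20: 72, 30: 65, 40: 55, 50: 43,
             60: 28, 70: 16, 80: 10, 90: 8, 100: 0}

limits = sorted(more_than)
# Frequency of class lo-hi = (number above lo) - (number above hi)
freqs = [more_than[lo] - more_than[hi] for lo, hi in zip(limits, limits[1:])]
print(freqs)  # [3, 5, 7, 10, 12, 15, 12, 6, 2, 8]


def grouped_mode(lower_bounds, width, freqs):
    """Mode = L + d1 / (d1 + d2) * i for the class with the highest frequency."""
    k = max(range(len(freqs)), key=freqs.__getitem__)       # modal class index
    d1 = freqs[k] - (freqs[k - 1] if k > 0 else 0)           # vs. pre-modal class
    d2 = freqs[k] - (freqs[k + 1] if k < len(freqs) - 1 else 0)  # vs. post-modal
    return lower_bounds[k] + d1 * width / (d1 + d2)


print(grouped_mode(limits[:-1], 10, freqs))  # 55.0
```

Here the modal class is 50-60 with frequency 15, and both neighbouring classes have frequency 12, so the mode falls exactly at the middle of the class.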
Ex-Discrete Series
Calculate the value of mode for the following table:
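Marks (X):   10   15   20   25   30   35   40
Frequency:   8    12   36   35   28   18   9
Grouping table
X     Col 1 (f)   Col 2   Col 3   Col 4   Col 5   Col 6
10    8           20              56
15    12                  48              83*
20    36*         71*                             99*
25    35                  63*     81*
30    28          46                      55
35    18                  27
40    9
Col 2 and Col 3 group the frequencies in twos (starting from the first and second value respectively); Col 4, Col 5 and Col 6 group them in threes (starting from the first, second and third value respectively).
The highest total in each column is marked (*) in the grouping table, and the values it covers are entered by means of a bar (/) in the analysis table. The total number of bars is maximum for the value 25. So the mode is 25.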
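The grouping and analysis tables can be reproduced programmatically. The sketch below assumes the usual six-column layout (single frequencies, pairs starting at the first and second value, triples starting at the first, second and third value); the function name and return format are hypothetical.

```python
from collections import Counter


def grouping_mode(values, freqs):
    """Mode by the method of grouping; returns (modal values, analysis tally)."""
    n = len(freqs)
    tally = Counter()  # analysis table: one "bar" per column maximum
    # (group size, starting index) for columns 1 to 6
    for size, start in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        groups = [range(i, i + size) for i in range(start, n - size + 1, size)]
        if not groups:
            continue
        totals = [sum(freqs[j] for j in g) for g in groups]
        best = max(totals)
        for g, t in zip(groups, totals):
            if t == best:  # mark every value covered by the column maximum
                tally.update(values[j] for j in g)
    top = max(tally.values())
    return [v for v in values if tally[v] == top], tally


marks = [10, 15, 20, 25, 30, 35, 40]
freqs = [8, 12, 36, 35, 28, 18, 9]
modes, tally = grouping_mode(marks, freqs)
print(modes)        # [25]
print(dict(tally))  # bars per value: 20 -> 4, 25 -> 5, 30 -> 3, 15 -> 1, 35 -> 1
```

Although 20 has the highest single frequency (36), the value 25 collects the most bars across the six columns, which is why the grouping method settles on 25.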
40-50 50-60
60-70 70-80
29 4
3 1
Step 1: Identify the modal class. By inspection, it is difficult to locate. Hence, modal class will be determined by method of grouping and analysis.
Col 3
Col 4
Col 5
Col 6
22
33 40 63 77 59 63 36
7 4
8
3
4 5 / /
/
/ /
/
/ /
6
Total 1 3
/
6
/
3
/
1
L= 30 , 1 =12 , 2 = 1 =
Merits
1. It is easy to understand and easy to calculate. In many cases it can be located just by inspection.
2. It is not affected by extreme values. For example, the mode of 10, 2, 5, 10, 5, 60, 5, 10, 60, 10 is 10, as it has occurred most often in the data.
3. It can be determined even if the distribution has open-end classes.
4. It is the value around which the observations are most concentrated.
Demerits
1. It is not based on all the observations.
2. It is not capable of further mathematical treatment.
3. The value of the mode cannot always be determined. In some cases we may have a bimodal or multimodal series.
1.a) For a distribution, mode and mean are 32.1 and 35.4 respectively. Find the value of median.
Ans 34.3
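The answer follows from rearranging the empirical relationship Mode = 3 Median - 2 Mean. A minimal sketch, with hypothetical function names:

```python
def mode_from(median, mean):
    """Empirical relationship: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def median_from(mode, mean):
    """Same relationship solved for the median: (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


print(round(median_from(32.1, 35.4), 1))  # 34.3
```

The relationship is only approximate and is meant for moderately skewed distributions, so results obtained from it should be read as estimates.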
Ans. 17.9
For example, the G.M. of the 3 values 2, 3, 4 would be: $G.M. = \sqrt[3]{2 \times 3 \times 4} = \sqrt[3]{24} \approx 2.884$
When the number of items is large, the task of multiplying the numbers and extracting the root becomes excessively difficult. To simplify the calculations, logarithms are used. The geometric mean is then calculated as follows:
$\log G.M. = \frac{\log X_1 + \log X_2 + \dots + \log X_n}{n} = \frac{\sum \log X}{n}$
$G.M. = \text{Antilog}\left(\frac{\sum \log X}{n}\right)$
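The logarithmic route can be sketched in a few lines: average the logs, then take the antilog. Base-10 logarithms are used here to mirror log tables; the function name is hypothetical and all values are assumed to be positive.

```python
import math


def geometric_mean(values):
    """G.M. = Antilog(sum(log X) / n), using base-10 logarithms."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log  # antilog


print(round(geometric_mean([2, 3, 4]), 3))  # 2.884
```

Working with logs avoids forming the product of all items, which can become very large when there are many observations.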
Harmonic Mean
The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the data (none of which is zero). If there are n observations $X_1, X_2, \dots, X_n$, their harmonic mean is defined as:
$H.M. = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$
Example
1. Calculate the harmonic mean of 8 and 10.
2. Ram goes from his house to his office by car at a speed of 60 km/hour and returns at a speed of 40 km/hour. Find his average speed.
Application: If the arithmetic mean of two positive numbers is 15 and their geometric mean is 9, find their harmonic mean.
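The three exercises above can be worked through with a short sketch. The function name is hypothetical. The last line relies on the identity G.M.^2 = A.M. x H.M., which holds for two positive numbers; that identity is a standard result and is not stated in these notes.

```python
def harmonic_mean(values):
    """H.M. = n / (1/X1 + 1/X2 + ... + 1/Xn); no value may be zero."""
    return len(values) / sum(1 / x for x in values)


# 1. Harmonic mean of 8 and 10
print(round(harmonic_mean([8, 10]), 2))   # 8.89

# 2. Average speed over equal distances at 60 km/h and 40 km/h
print(round(harmonic_mean([60, 40]), 2))  # 48.0

# Application: for two positive numbers, G.M.^2 = A.M. * H.M.
am, gm = 15, 9
print(gm ** 2 / am)                       # 5.4
```

In the speed example the harmonic mean is the appropriate average because the distance is the same in both directions while the time taken differs.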