Lecture 3 Notes

Uploaded by

sirajamalif
Central Tendency

Numerical Measures of Data

• A data set is usually either a sample or a population.

• Our ultimate goal is to use numerical descriptive measures to make inferences.

• Most methods measure one of two characteristics of the data:

  • Central tendency: the extent to which the values are grouped around a typical or central value.

  • Variation or dispersion: the amount of scattering of the values away from a central value.
Measures of Central Tendency

• In most data sets, whether population or sample, the values show a distinct tendency to group or cluster around a central point.

• This tendency of the values to cluster around the center of the series is usually called central tendency.

• The numerical measure of this tendency of concentration is variously known as
• The measure of central tendency
• The measure of location
• The measure of average.
Necessity of measuring the central tendency

A measure of central tendency is needed for the following reasons:

I. It gives an idea of the concentration of the values in the central part of the distribution.
II. It is a value of the variable that is typical of the whole set.
III. It represents the relevant information contained in the data in as few numbers as possible.
IV. It gives precise information rather than information of a vague, general type.
Characteristics of a good measure of central tendency

i. It should be easy to understand.
ii. It should be easy to calculate.
iii. It should be based upon all observations.
iv. It should be rigidly defined.
v. It should not be unduly affected by extreme values.
vi. It should be suitable for further algebraic treatment.
vii. It should be less affected by sampling fluctuation.
Different measures of central tendency
Arithmetic Mean

• The arithmetic mean of a series of observations is obtained by adding the values of the observations and then dividing the sum by the number of observations.

• Notation for the arithmetic mean (AM):
  • The sample mean is denoted by x̄.
  • The population mean is denoted by µ.

• If there are n values x1, x2, …, xn of a variable X, then the AM (denoted by x̄) is defined as

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$
Example

Banglatel is studying the number of minutes used by clients in a particular cell phone rate plan. A random sample of 12 clients showed the following number of minutes used last month.

90    77    94    89    119   112
91    110   92    100   113   83

What is the mean (arithmetic mean) number of minutes used?

Solution

Average use of the rate plan:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{90 + 77 + \dots + 112 + 91 + \dots + 113 + 83}{12} = \frac{1170}{12} = 97.5$$

Thus the arithmetic mean number of minutes used last month by the sample of cell phone users is 97.5 minutes.
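
The same calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `arithmetic_mean` and the variable names are illustrative choices, not part of the lecture, and the list is assumed to be non-empty.

```python
# Minutes used last month by the 12 sampled clients (data from the example).
minutes = [90, 77, 94, 89, 119, 112, 91, 110, 92, 100, 113, 83]


def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


mean_minutes = arithmetic_mean(minutes)
print(mean_minutes)  # 97.5
```

Python's standard library offers `statistics.mean`, which should give the same value for this data.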
Grouped Data With Frequencies

For grouped data as given in the following table

Values:        x1   x2   …   xk
Frequencies:   f1   f2   …   fk

such that f1 + f2 + f3 + … + fk = n, the AM (denoted by x̄) is defined as

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_k x_k}{n}$$
Example

Calculate the mean for the following frequency distribution with n = 100:

Class Interval   Frequency
0-10             10
10-20            20
20-30            40
30-40            20
40-50            10
Solution

Class Interval   Frequency (fi)   Mid value (xi)   fi*xi
0-10             10               5                50
10-20            20               15               300
20-30            40               25               1000
30-40            20               35               700
40-50            10               45               450
Total            100                               2500

Arithmetic mean,

$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + f_4 x_4 + f_5 x_5}{n} = \frac{2500}{100} = 25$$
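
As a rough sketch, the grouped calculation can be written in Python as a frequency-weighted mean of the class mid values. The function name `grouped_mean` and the tuple layout (lower limit, upper limit, frequency) are assumptions made for this illustration.

```python
# Classes from the example as (lower limit, upper limit, frequency).
classes = [(0, 10, 10), (10, 20, 20), (20, 30, 40), (30, 40, 20), (40, 50, 10)]


def grouped_mean(classes):
    """Weighted mean of the class mid values, using the frequencies as weights."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return weighted_sum / total_frequency


print(grouped_mean(classes))  # 25.0
```

Because every observation in a class is represented by the class mid value, the result is an approximation of the mean of the underlying raw data.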
Test Yourself

The following data represent the age distribution of employees in two different divisions of a publishing company. Determine which division has the relatively older group of employees.

Age of employees   Number of employees of division
                   X      Y
20-30              6      13
30-40              19     30
40-50              9      24
50-60              10     0
60-70              2      4
Solution
Age of employees   Mid value (xi)   Frequency (fxi)   Frequency (fyi)   fxi*xi   fyi*xi
20-30              25               6                 13                150      325
30-40              35               19                30                665      1050
40-50              45               9                 24                405      1080
50-60              55               10                0                 550      0
60-70              65               2                 4                 130      260
Total                               46                71                1900     2715

Arithmetic mean age of employees in division X:

$$\frac{\sum f_{xi} x_i}{\sum f_{xi}} = \frac{1900}{46} \approx 41.3$$

Arithmetic mean age of employees in division Y:

$$\frac{\sum f_{yi} x_i}{\sum f_{yi}} = \frac{2715}{71} \approx 38.2$$

Since the AM of division X is greater than the AM of division Y, the employees of division X are relatively older.
Arithmetic Mean

Merits

1. Rigidly defined.
2. Easy to understand and calculate.
3. Based upon all observations.
4. Most amenable to algebraic treatment.
5. Not based on position in the series.

Demerits

1. Cannot be determined graphically.
2. Cannot be used in case of qualitative data.
3. Affected very much by extreme values.
4. May not occur in the series.
5. Difficult to calculate in the case of data with open-end classes.
When NOT to use Arithmetic Mean

1. In highly skewed distributions.

2. When the distribution is unevenly spread, with the concentration being small or large at irregular points.

3. When an average rate of growth or change over a period of time is required.

4. When the observations form a geometric progression.

5. When averaging rates (for example speeds or fluctuations in the prices of articles).

6. When there are very large and very small values among the observations.
Median

• If the values of a series are arranged in ascending or descending order of magnitude, the middle-most value in this arrangement is called the median of the series.

• The median is usually denoted by Me.

• The median is generally the best average for open-end grouped distributions, especially when the frequency curve is J-shaped or reverse J-shaped.
Determination of Median: Ungrouped Data

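Let n be the number of observations, arranged in ascending or descending order.

a. When n is odd, the value of the $\left(\frac{n+1}{2}\right)$th observation is the median.
b. When n is even, the median is the AM of the values of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th observations in the series.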
Example: n is odd
The ages of a family of seven members are given as 12, 7, 2, 34, 17, 21 and 19. Find the median age.

Step 1  Count the total number of elements: n = 7, an odd number.

Step 2  Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34

Step 3  Median: Me = value of the $\left(\frac{n+1}{2}\right)$th observation = value of the $\left(\frac{7+1}{2}\right)$th observation = value of the 4th observation = 17

Step 4  The median age of the family is 17 years.


Example: n is even
The ages of a family of eight members are given as 12, 7, 2, 34, 17, 40, 21 and 19. Find the median age.

Step 1  Count the total number of elements: n = 8, an even number.

Step 2  Arrange the values in ascending order: 2, 7, 12, 17, 19, 21, 34, 40

Step 3  Median: Me = AM of the values of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th observations = AM of the values of the 4th and 5th observations = $\frac{17 + 19}{2}$ = 18

Step 4  The median age of the family is 18 years.
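
The odd and even cases worked through in the two examples can be combined in one short Python function. This is a sketch under the assumption that the list is non-empty; the name `median` and the variable names are chosen only for this illustration.

```python
def median(values):
    """Median by the odd/even rule: the middle value, or the AM of the two middle values."""
    ordered = sorted(values)  # arrange the values in ascending order
    n = len(ordered)
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)th observation, which is index (n - 1) // 2 when counting from 0
        return ordered[(n - 1) // 2]
    # n even: AM of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


ages_seven = [12, 7, 2, 34, 17, 21, 19]
ages_eight = [12, 7, 2, 34, 17, 40, 21, 19]
print(median(ages_seven))  # 17
print(median(ages_eight))  # 18.0
```

The standard library function `statistics.median` follows the same rule and can be used to cross-check the results.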


Determination of Median: Grouped Data

Formula:

$$Me = L_0 + \frac{\frac{n}{2} - F_{-Me}}{f_{Me}} \times W_{Me}$$

• Me = Median
• L0 = Lower limit of the median class.
• F-Me = Cumulative frequency of the pre-median class.
• fMe = Frequency of the median class.
• WMe = Width of the median class.
• n = Total number of observations.

MEDIAN CLASS: the class that contains the $\frac{n}{2}$th observation of the given data.
Example

The table summarizes the income of the parents of 50 students. Compute the median income of the parents.

Income of parent (in thousand taka)   Frequency
Below 20                              3
20-40                                 4
40-60                                 6
60-80                                 8
80-100                                12
100-120                               10
120 and over                          7
Total                                 50

• Step 1: Compute the cumulative frequencies.
• Step 2: Determine one half of the total number of cases, n/2.
• Step 3: Locate the median class.
• Step 4: Determine the lower limit (L0) of the median class.
• Step 5: Sum the frequencies of all the classes prior to the median class. This is F-Me.
• Step 6: Determine the frequency of the median class, fMe.
• Step 7: Determine the width of the median class, WMe.
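
The steps above can be sketched in Python for the parents' income table. The sketch assumes that the median class is the first class whose cumulative frequency reaches n/2 and that this class has both limits (it would fail if the median fell in an open-end class). The function name `grouped_median` and the use of `None` for an open end are choices made for this illustration.

```python
# Classes as (lower limit, upper limit, frequency); None marks an open end.
income_classes = [
    (None, 20, 3), (20, 40, 4), (40, 60, 6), (60, 80, 8),
    (80, 100, 12), (100, 120, 10), (120, None, 7),
]


def grouped_median(classes):
    """Median of grouped data: L0 + (n/2 - F-Me) / fMe * WMe."""
    n = sum(f for _, _, f in classes)   # total number of observations
    half = n / 2                        # Step 2: one half of the cases
    cumulative = 0                      # Step 1: running cumulative frequency
    for lower, upper, f in classes:
        if cumulative + f >= half:      # Step 3: median class located
            # Steps 4-7: L0 = lower, F-Me = cumulative, fMe = f, WMe = upper - lower
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("at least one class is required")


median_income = grouped_median(income_classes)
print(round(median_income, 2))  # 86.67
```

Here the cumulative frequencies are 3, 7, 13, 21, 33, 43 and 50, so with n/2 = 25 the median class is 80-100, giving 80 + (25 - 21)/12 × 20 ≈ 86.67 thousand taka.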
Test Yourself

The following data represent the loan requirements (in thousand taka) of the people of two different upazillas. Using the median, comment on which upazilla has the greater average demand for loans.

Upazilla 1   42   12   26   18   9    35   28   39   8
Upazilla 2   8    15   10   18   22   20   26   42   35

Solution

Here, n = 9 (odd).

Arranging the Upazilla 1 observations in ascending order:
8, 9, 12, 18, 26, 28, 35, 39, 42
Therefore, median of Upazilla 1 = value of the $\left(\frac{9+1}{2}\right)$th = 5th observation = 26

Arranging the Upazilla 2 observations in ascending order:
8, 10, 15, 18, 20, 22, 26, 35, 42
Therefore, median of Upazilla 2 = value of the $\left(\frac{9+1}{2}\right)$th = 5th observation = 20

Since the median of Upazilla 1 is greater than the median of Upazilla 2, Upazilla 1 has the greater average demand for loans.
Test Yourself

The following table gives data on the kilowatt hours of electricity consumed by 100 randomly selected flat owners of Japan Garden City.

Consumption (in kilowatt hours)   0-100   100-200   200-300   300-400   400-500
No. of users                      6       25        36        20        13

Calculate
i. Mean consumption of electricity
ii. Median use of electricity

Solution

Consumption (in kilowatt hours)   Mid value (xi)   No. of users (fi)   fi*xi   Cumulative frequency
0-100                             50               6                   300     6
100-200                           150              25                  3750    31
200-300                           250              36                  9000    67
300-400                           350              20                  7000    87
400-500                           450              13                  5850    100
Total                                              100                 25900

i) Mean consumption of electricity:

$$\frac{\sum f_i x_i}{\sum f_i} = \frac{25900}{100} = 259$$

ii) Median = value of the $\frac{n}{2}$th = $\frac{100}{2}$th = 50th observation

Median class = (200-300)

Lower limit of the median class (L0) = 200

Sum of the frequencies of all classes prior to the median class (F-Me) = 31

Frequency of the median class (fMe) = 36

Width of the median class (WMe) = 300 - 200 = 100

$$Me = L_0 + \frac{\frac{n}{2} - F_{-Me}}{f_{Me}} \times W_{Me} = 200 + \frac{50 - 31}{36} \times 100 = 252.78 \text{ (Answer)}$$
Median

Merits

1. Rigidly defined.
2. Easy to understand and calculate.
3. Not affected very much by extreme values.
4. Can be calculated in the case of data with open-end classes.
5. Can be determined graphically.

Demerits

1. In case of an even number of observations, it is not defined exactly.
2. Not based on all observations.
3. Not easy for algebraic treatment.
4. For calculating the median, it is necessary to arrange the data in either ascending or descending order.
Mode

• Mode: the value of the variable that occurs most frequently, that is, the value for which the frequency is a maximum.

• Generally speaking, the mode can be used to describe qualitative data.

• The mode is a particularly useful average for discrete data.

• For ungrouped data or a categorical variable, the mode is the value of the variable for which the frequency is highest.
Mode: Ungrouped Data

For the data sets:

i. 7, 8, 6, 7, 9, 7 and 4: here ‘7’ occurs most often (3 times), hence the mode is ‘7’ and the data are unimodal.

ii. 6, 4, 8, 5, 8, 1, 2, 5, 4, 7, 5, 2, 4 and 3: here ‘5’ and ‘4’ both occur most often (3 times each), hence the modes are ‘5’ and ‘4’ and the data are bimodal.

iii. 1, 5, 7, 2, 6, 9 and 4: every value occurs only once, so there is no mode.

iv. Consider the following table representing the frequency distribution of religion:

Religion    Muslim   Hindu   Buddhist   Christian   Others
Frequency   18       75      12         4           2

Here the highest frequency, ‘75’, occurs for the category ‘Hindu’. Hence the mode for the given data is Hindu.
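
The four cases above can be reproduced with a small Python sketch built on `collections.Counter`. The function name `modes` is illustrative, the input is assumed to be non-empty, and returning an empty list when no value repeats is a convention adopted here to match case iii.

```python
from collections import Counter


def modes(values):
    """All values with the highest frequency; an empty list if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([7, 8, 6, 7, 9, 7, 4]))                       # [7]
print(modes([6, 4, 8, 5, 8, 1, 2, 5, 4, 7, 5, 2, 4, 3]))  # [4, 5]
print(modes([1, 5, 7, 2, 6, 9, 4]))                       # []

# For a frequency table of categories, the mode is the category with the largest count.
religion = {"Muslim": 18, "Hindu": 75, "Buddhist": 12, "Christian": 4, "Others": 2}
modal_category = max(religion, key=religion.get)
print(modal_category)  # Hindu
```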
Determination of Mode: Grouped Data

Formula:

$$Mo = L_0 + \frac{f_0 - f_{-1}}{(f_0 - f_{-1}) + (f_0 - f_1)} \times W$$

• Mo = Mode
• L0 = Lower limit of the modal class.
• f0 = Frequency of the modal class.
• f-1 = Frequency of the pre-modal class.
• f1 = Frequency of the post-modal class.
• W = Width of the modal class.
Example

The table summarizes the income of the parents of 50 students. Compute the mode of the parents’ income.

Income of parent (in thousand taka)   Frequency
Below 20                              3
20-40                                 4
40-60                                 6
60-80                                 8
80-100                                12
100-120                               10
120 and over                          7
Total                                 50

• Step 1: Locate the modal class.
• Step 2: Determine the lower limit (L0) of the modal class.
• Step 3: Determine the frequency (f0) of the modal class.
• Step 4: Determine the frequency (f-1) of the pre-modal class.
• Step 5: Determine the frequency (f1) of the post-modal class.
• Step 6: Determine the width of the modal class, W.
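
A possible Python sketch of these steps follows, using the usual grouped-data formula Mo = L0 + (f0 - f-1) / ((f0 - f-1) + (f0 - f1)) × W. Several points are assumptions of this illustration rather than statements from the lecture: the name `grouped_mode`, taking the first class when several share the highest frequency, and treating a missing neighbouring class as having frequency 0. The modal class must have both limits, and the formula is undefined when f0 equals both neighbouring frequencies.

```python
# Classes as (lower limit, upper limit, frequency); None marks an open end.
income_classes = [
    (None, 20, 3), (20, 40, 4), (40, 60, 6), (60, 80, 8),
    (80, 100, 12), (100, 120, 10), (120, None, 7),
]


def grouped_mode(classes):
    """Mode of grouped data from the modal class and its two neighbouring classes."""
    frequencies = [f for _, _, f in classes]
    i = frequencies.index(max(frequencies))      # Step 1: modal class (first one if tied)
    lower, upper, f0 = classes[i]                # Steps 2-3: L0 and f0
    f_prev = frequencies[i - 1] if i > 0 else 0  # Step 4: f-1 (0 if there is no earlier class)
    f_next = frequencies[i + 1] if i + 1 < len(frequencies) else 0  # Step 5: f1
    width = upper - lower                        # Step 6: W
    return lower + (f0 - f_prev) / ((f0 - f_prev) + (f0 - f_next)) * width


mode_income = grouped_mode(income_classes)
print(round(mode_income, 2))  # 93.33
```

For this table the modal class is 80-100 with f0 = 12, f-1 = 8 and f1 = 10, so Mo = 80 + 4/(4 + 2) × 20 ≈ 93.33 thousand taka.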
Mode

Merits

1. Most typical and representative value of a distribution.
2. Not at all affected by extreme values.
3. Can be calculated in the case of data with open-end classes.
4. Easy to understand and calculate.
5. Can be determined graphically.

Demerits

1. Not clearly defined in case of a bimodal or multimodal distribution.
2. Not based on all observations.
3. Not suitable for further algebraic treatment.
4. Affected by sampling fluctuations.
