
Measures of Central Tendency


Quality Management

Dr. S.M. Khan
Scientist, RDSO
Talk Includes
• Organizing, presenting and summarizing data
• Measures of central tendency and dispersion
• Concept of distributions
• Concept of sampling distributions
• Hypothesis testing
• Acceptance sampling – attribute characteristics
• Single and double sampling plans and OC curves
• Correlation and regression analyses
• Statistical designs of experiments for product quality improvement
Talk Contd.
• Introduction to quality
• Quality of design vs conformance
• Cost of quality
• Investigational methods
• Quality assurance functions and their evaluation
• Process and product check: inspections, quality control and testing schemes
• Organization of quality control, quality audit, quality circles
• ISO 9000
Measures of Central Tendency
Today's Questions
• How can we summarize a distribution of scores efficiently using quantitative (as opposed to graphical) methods?
Summary Measures
• Central Tendency: Mean, Median, Mode
• Quartile
• Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Mean: The arithmetic mean of a sample (or simply the sample mean) of n observations is computed as:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

x_i = an individual score
n = the number of scores (sample size)
Σ (sigma) = take the sum
Mean: The population mean is defined by the formula:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{\text{Sum of the values of all observations in the population}}{\text{Total number of observations in the population}}$$

x_i = an individual score
N = the number of scores (population size)
Σ (sigma) = take the sum
(Contd..)
• The most common measure of central tendency
• Can be affected by extreme values (outliers) (try -10, 3, 5, 7, 20)

[Figure: two dot plots of scores. The first, on an axis from 0 to 10, has Mean = 5; the second, on an axis extended to 14, has Mean = 6.]
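Mean: Grouped Data

$$\text{Mean} = A + \frac{1}{N}\sum f d, \qquad d = x - A$$

A = assumed mean
x = mid-value of the class
d = deviation of the class mid-value from the assumed mean
f = frequency of the class
N = the number of scores (Σ f)
Σ (sigma) = take the sum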
Mean
• In the stress example, the sum of all the scores is 975.
• 975 / 157 = 6.2
• Thus, the average score is 6.2, on a 0 to 10 scale.
Other Means
• Geometric Mean
• Harmonic Mean
• Trimmed Mean
• Winsorized Mean
• Walsh and Wilcoxon Statistic W
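As a rough illustration of how some of these alternatives differ from the arithmetic mean, the sketch below computes four of them for the seven observations used later in this talk. The helper names `trimmed_mean` and `winsorized_mean` and the choice of trimming one value from each end (k = 1) are assumptions made for the example, not part of the slides; the geometric and harmonic means come from Python's standard `statistics` module.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Seven observations used elsewhere in this talk.
data = [4.2, 4.3, 4.7, 4.8, 5.0, 5.1, 9.0]


def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values (hypothetical helper)."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("k is too large for this data set")
    return mean(ordered[k:len(ordered) - k])


def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest remaining value (hypothetical helper)."""
    ordered = sorted(values)
    n = len(ordered)
    if 2 * k >= n:
        raise ValueError("k is too large for this data set")
    low, high = ordered[k], ordered[n - k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return mean(clipped)


print("arithmetic:", round(mean(data), 3))
print("geometric: ", round(geometric_mean(data), 3))
print("harmonic:  ", round(harmonic_mean(data), 3))
print("trimmed:   ", round(trimmed_mean(data, 1), 3))
print("winsorized:", round(winsorized_mean(data, 1), 3))
```

For positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean; the trimmed and Winsorized means reduce the pull of the single large value 9.0.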
Advantages of the Mean
• It is based on all observations
• It is a measure that can be calculated and is unique
• Least affected by fluctuations of sampling
• It is useful for performing statistical procedures such as comparing the means from several data sets
• Suitable for algebraic treatment
Disadvantages of the Mean
• It cannot be calculated for qualitative data
• It is affected by extreme values that are not representative of the rest of the data.
• Consider 7 observations: 4.2, 4.3, 4.7, 4.8, 5.0, 5.1, 9.0. Their mean is 37.1 / 7 = 5.3.
• If we compute the mean of the first 6 numbers and exclude the 9.0 value, the mean is 4.7. The one extreme value 9.0 distorts the value we get for the mean. It would be more representative to calculate the mean without including such an extreme value.
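The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are chosen for the illustration.

```python
from statistics import mean

observations = [4.2, 4.3, 4.7, 4.8, 5.0, 5.1, 9.0]

# Mean of all seven observations, including the extreme value 9.0.
mean_all = mean(observations)

# Mean of the first six observations, excluding the extreme value.
mean_without_extreme = mean(observations[:-1])

print("with 9.0:   ", round(mean_all, 1))
print("without 9.0:", round(mean_without_extreme, 1))
```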
Measures of Central Tendency
Central Tendency: Mean, Median, Mode

Sample mean:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Population mean:
$$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$$
Median
The median m of a sample of n observations: set all the terms in ascending or descending order of magnitude. The median is the middle number that divides the data set into two equal halves: one half of the items lie above this point, and the other half lie below it.

$$m = \text{Median} = \begin{cases} x_k & \text{if } n = 2k - 1 \ (n \text{ is odd}) \\ \tfrac{1}{2}\,(x_k + x_{k+1}) & \text{if } n = 2k \ (n \text{ is even}) \end{cases}$$
Exercise: Find the median of the data set consisting of the observations 7, 4, 3, 5, 6, 8, 10.

Solution:
• First, we arrange the data set in ascending order:
3 4 5 6 7 8 10
• Since the number of observations is odd, n = 2 × 4 − 1 = 7, the median is m = x4 = 6. Half of the remaining observations, namely 3, 4 and 5, lie below the value 6, and the other half, namely 7, 8 and 10, lie above it.
Example: Suppose we have an even number of observations: 7, 4, 3, 5, 6, 8, 10, 1. Find the median of this data set.

Solution:
• First, we arrange the data set in ascending order:
1 3 4 5 6 7 8 10
• Since the number of observations is n = 2 × 4 = 8, by the definition:
• Median = (x4 + x5)/2 = (5 + 6)/2 = 5.5
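The odd and even cases of the definition can be sketched in Python as follows. The function name `sample_median` is an assumption made for this illustration; Python's own `statistics.median` gives the same results.

```python
def sample_median(values):
    """Median following the slide definition:
    x_k for n = 2k - 1, and (x_k + x_{k+1}) / 2 for n = 2k."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    k = (n + 1) // 2          # k as in n = 2k - 1 (odd) or n = 2k (even)
    if n % 2 == 1:
        return ordered[k - 1]  # x_k, shifted to 0-based indexing
    return (ordered[k - 1] + ordered[k]) / 2  # (x_k + x_{k+1}) / 2


print(sample_median([7, 4, 3, 5, 6, 8, 10]))      # odd case from the exercise
print(sample_median([7, 4, 3, 5, 6, 8, 10, 1]))   # even case from the example
```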
Median
The midpoint, or the 50th percentile, of a distribution.

Odd number of measurements
x = [1 2 3 4 5 6 7]
Median = 4

Even number of measurements
x = [1 2 3 4 5 6 7 8]
Median = (4+5)/2 = 4.5
2. Median: the value at which 1/2 of the ordered scores fall above and 1/2 of the scores fall below

1 2 3 4 5: Median = 3
1 2 3 4: Median = 2.5

Median: For Grouped Data

$$\text{Median} = l + \frac{\frac{N}{2} - F}{f} \times h$$

Where
N = Total cases
l = lower limit of the median class
f = frequency of the median class
h = size of the class interval
F = Cumulative frequency before the median class
Advantages of the Median
• It is easy to understand and readily calculated
• It gives best results in qualitative measurements
• It eliminates the effects of extreme values
• In most cases it can be found tentatively
Disadvantages of the Median
• It requires the data to be arranged in order first, which is considerable work for a large data set
• It cannot give the correct total when multiplied by the number of items
• Not suitable for algebraic treatment
Advantage of the median over the mean:
• Extreme values in a data set do not affect the median as strongly as they do the mean.
• Indeed, in the series 4.2, 4.3, 4.7, 4.8, 5.0, 5.1, 9.0:
• Mean = 5.3, Median = 4.8.
• The extreme value of 9.0 does not affect the median.
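One way to see this is to make the extreme value even more extreme and compare both measures. In the sketch below the replacement value 90.0 is hypothetical, chosen only for illustration; everything else is the series from the slide.

```python
from statistics import mean, median

series = [4.2, 4.3, 4.7, 4.8, 5.0, 5.1, 9.0]

# Hypothetical variant: the extreme value 9.0 is replaced by 90.0.
inflated = series[:-1] + [90.0]

for label, data in (("original", series), ("inflated", inflated)):
    print(label, "mean =", round(mean(data), 1), "median =", median(data))
```

The median stays at the middle observation in both cases, while the mean moves with the extreme value.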
Mode
• The mode of a data set is the value that occurs with the greatest frequency, i.e., is repeated most often in the data set.
Example: Find the mode of the data set in the following table.

Quantity of glucose (mg%) in blood of 25 students:

70   88   95   101  106
79   93   96   101  107
83   93   97   103  108
86   93   97   103  112
87   95   98   106  115
Solution: First we arrange this data set in ascending order (reading down the columns):

70   88   95   101  106
79   93   96   101  107
83   93   97   103  108
86   93   97   103  112
87   95   98   106  115

This data set contains 25 numbers. We see that the value 93 is repeated most often. Therefore, the mode of the data set is 93.
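The same count can be done programmatically. The sketch below is one possible approach, using `collections.Counter` and a hypothetical helper `modes` that returns every value tied for the highest frequency, so that data sets with several modes are also handled.

```python
from collections import Counter

# Glucose readings (mg%) of the 25 students, in ascending order.
glucose = [70, 79, 83, 86, 87, 88, 93, 93, 93, 95, 95, 96, 97,
           97, 98, 101, 101, 103, 103, 106, 106, 107, 108, 112, 115]


def modes(values):
    """Return all values that occur with the greatest frequency (hypothetical helper)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


print(len(glucose), "observations; mode(s):", modes(glucose))
```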
Mode
Most frequently occurring score or group of scores in a set.

[Figure: distribution of scores with Mode = 7.]
Mode: For Grouped Data

$$\text{Mode} = l_1 + \frac{f - f_{-1}}{2f - f_{-1} - f_1} \times h$$

Where:
f = Maximum frequency
l1 = lower limit of the corresponding modal class
f-1 = Preceding frequency
f1 = Succeeding frequency
h = Class interval
Advantages of the Mode:
• Easily located, merely by inspection
• Based on all values
• Like the median, the mode is not unduly affected by extreme values. Even if the high values are very high and the low values are very low, we choose the most frequent value of the data set to be the mode.
• We can use the mode no matter how large, how small, or how spread out the values in the data set happen to be.
Disadvantages of the Mode:
• The mode is not used as often to measure central tendency as are the mean and the median. Too often, there is no modal value because the data set contains no values that occur more than once.
• Other times, every value is the mode because every value occurs the same number of times. Clearly, the mode is a useless measure in these cases.
• When data sets contain two, three, or many modes, they are difficult to interpret and compare.
When the distribution of scores is normal, mode = median = mean.

[Figure: example distribution with Mode = 2, Median = 2.5, Mean = 2.7.]

When scores are positively skewed, the mean is dragged in the direction of the skew and mode < median < mean.
When scores are negatively skewed, the mean is dragged in the direction of the skew and mode > median > mean.
Empirical Relationship

Mean – Median = 1/3 (Mean – Mode) or

Mode = 3 Median – 2 Mean
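A quick numerical check of the empirical relationship, as a sketch only: the helper name `empirical_mode` is an assumption, and the inputs are the Median = 2.5 and Mean = 2.7 quoted earlier in this talk, for which the stated mode is 2.

```python
def empirical_mode(mean_value, median_value):
    """Approximate the mode from the empirical relation Mode = 3 Median - 2 Mean."""
    return 3 * median_value - 2 * mean_value


estimate = empirical_mode(2.7, 2.5)
print("estimated mode:", round(estimate, 2))
```

The estimate is close to, but not exactly, the observed mode, which is what one would expect from an empirical rule intended for moderately skewed distributions.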


Which is best?

MCT      Advantages                        Disadvantages
Mean     Takes all numbers into            Can be affected by outliers
         account.                          (very small or large numbers).
Median   Fairly easy to calculate.         Tedious to find for a large set
         Half of the scores lie            of numbers or for a set that is
         above the median.                 not in order.
Mode     Quick and easy to                 May not be representative of
         calculate.                        the whole sample.
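Quartiles
Split ordered data into 4 quarters, each holding 25% of the observations; Q1, Q2 and Q3 are the dividing points.

Data in ordered array: 11 12 13 16 16 17 18 21 22

Position of the i-th quartile in the ordered array:

$$Q_i = \frac{i\,(n+1)}{4}$$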
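The position formula can be applied to the ordered array above as in the following sketch. The slide gives only the position i(n+1)/4; how to handle a fractional position is not stated, so the linear interpolation between the two neighbouring observations used here is an assumption, as is the function name `quartile`.

```python
def quartile(ordered, i):
    """Value at position i * (n + 1) / 4 (1-based) in an ordered array.

    Fractional positions are resolved by linear interpolation between
    the two neighbouring observations (an assumption of this sketch).
    """
    n = len(ordered)
    position = i * (n + 1) / 4
    lower = int(position)            # whole part of the 1-based position
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
for i in (1, 2, 3):
    print(f"Q{i} =", quartile(data, i))
```

For these nine observations the positions are 2.5, 5 and 7.5, so Q1 lies halfway between 12 and 13, Q2 is the fifth observation, and Q3 lies halfway between 18 and 21.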
EX. Evaluate the values of mean, mode and median for the following grouped data:

No. of days absent (less than)   No. of students (cumulative)
5                                29
10                               124
15                               349
20                               442
25                               478
30                               487
35                               493
40                               497
45                               500
Contd..

No. of days   Mid value x   f     Cum f   d = x - a (a = 22.5)   f·d
0-5           2.5           29    29      -20                    -580
5-10          7.5           95    124     -15                    -1425
10-15         12.5          225   349     -10                    -2250
15-20         17.5          93    442     -5                     -465
20-25         22.5          36    478     0                      0
25-30         27.5          9     487     5                      45
30-35         32.5          6     493     10                     60
35-40         37.5          4     497     15                     60
40-45         42.5          3     500     20                     60
Total                       ∑f = 500                             ∑fd = -4495

Mean: Grouped Data

$$\text{Mean} = A + \frac{1}{N}\sum f d$$

Mean = 22.5 + (-4495)/500 = 22.5 - 8.99 = 13.51
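The assumed-mean calculation can be sketched in Python as below. The function name `grouped_mean` is an assumption made for the illustration; the sketch takes the class mid-values 2.5, 7.5, ..., 42.5 for the nine classes 0-5 through 40-45 and the assumed mean A = 22.5, which yields a mean of about 13.51.

```python
def grouped_mean(midpoints, frequencies, assumed_mean):
    """Mean = A + (1/N) * sum(f * d), where d = x - A is the deviation
    of each class mid-value x from the assumed mean A."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(midpoints, frequencies))
    return assumed_mean + sum_fd / total


# Class mid-values for 0-5, 5-10, ..., 40-45 and their frequencies.
midpoints = [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5]
frequencies = [29, 95, 225, 93, 36, 9, 6, 4, 3]

print("grouped mean:", round(grouped_mean(midpoints, frequencies, 22.5), 2))
```

The result does not depend on the choice of A; the assumed mean only keeps the hand arithmetic small.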
Median: For Grouped Data

$$\text{Median} = l + \frac{\frac{N}{2} - F}{f} \times h$$

Where
N = Total number
l = lower limit of the median class
f = frequency of the median class
h = size of the class interval
F = Cumulative frequency before the median class
Contd…

Where,
N = 500
l = 10
f = 225
h = 5
F = 124

Median = 10 + ((500/2 - 124) / 225) × 5 = 10 + (126/225) × 5 = 12.8
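As a sketch of how the median class is located and the formula applied, the function below walks through the cumulative frequencies until it reaches N/2. The name `grouped_median` and the list-based inputs are assumptions made for this illustration; the frequencies are those of the nine classes 0-5 through 40-45.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median = l + ((N/2 - F) / f) * h for the class containing the N/2-th case."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    half = total / 2
    cumulative = 0                     # F: cumulative frequency before the class
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")


lower_limits = [0, 5, 10, 15, 20, 25, 30, 35, 40]
frequencies = [29, 95, 225, 93, 36, 9, 6, 4, 3]

print("grouped median:", round(grouped_median(lower_limits, frequencies, 5), 2))
```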
Mode: For Grouped Data

$$\text{Mode} = l_1 + \frac{f - f_{-1}}{2f - f_{-1} - f_1} \times h$$

Where:
f = Maximum frequency
l1 = lower limit of the corresponding modal class
f-1 = Preceding frequency
f1 = Succeeding frequency
h = Class interval
Mode: For Grouped Data

Where:
f = 225
l1 = 10
f-1 = 95
f1 = 93
h = 5

Mode = 10 + ((225 - 95) / (2 × 225 - 95 - 93)) × 5 = 10 + (130/262) × 5 = 12.48
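The grouped-mode formula can be sketched the same way. The function name `grouped_mode` is an assumption for this illustration, as is the convention of treating a missing neighbouring class (when the modal class is first or last) as having frequency 0.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode = l1 + ((f - f_prev) / (2f - f_prev - f_next)) * h for the modal class."""
    m = max(range(len(frequencies)), key=lambda i: frequencies[i])  # modal class index
    f = frequencies[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    denominator = 2 * f - f_prev - f_next
    if denominator == 0:
        raise ValueError("modal class is not distinct from its neighbours")
    return lower_limits[m] + (f - f_prev) / denominator * width


lower_limits = [0, 5, 10, 15, 20, 25, 30, 35, 40]
frequencies = [29, 95, 225, 93, 36, 9, 6, 4, 3]

print("grouped mode:", round(grouped_mode(lower_limits, frequencies, 5), 2))
```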
