Measures of Central Tendency, Location, and Dispersion in Salary Survey Research
Compensation
Robert M. Halley
Compensation Consultant
Swanson Consulting

Proper application of salary data requires a clear understanding of the methods and measures used.
To obtain the maximum utility from salary survey results, it is essential to know the mathematical attributes and proper applications of the various statistical measures used to summarize survey findings. This article outlines the use and interpretation of the descriptive statistics commonly found in commercial salary survey reports and suggests some general guidelines for their implementation.

Keywords: statistics; salary survey; compensation; measures of central tendency; location and dispersion; salary research
DOI: 10.1177/0886368704268598

Probability and Nonprobability Sampling
Sample surveys may be classified as either probability or nonprobability samples depending on the method of data selection. In a probability sample, the data elements are selected in a random fashion from a sample population such that each item has an equal chance of inclusion and each sample of a particular size also has an equal chance of selection. For this reason, a probability sample permits generalizations about the larger population universe from which it is drawn. In addition, the accuracy of an estimate from a probability sample may be specified within definable limits. Finally, a probability sample permits the use of both descriptive1 and inferential statistics.2 An example of a probability sample is the Washington State Legislature Salary Survey (see Exhibit 1).

EXHIBIT 1
Survey Sampling Techniques

In a nonprobability sample, the data elements are selected in a nonrandom or biased manner. Factors other than chance such as subjective judgment, self-selection, or availability adversely
affect the probability of inclusion of survey elements. This means that some sample units have a high probability of selection, whereas others have virtually no chance of inclusion (with no way of determining the probability of either their inclusion or exclusion). For this reason, the results of a nonprobability sample may not be generalized beyond the actual sample group to a larger population universe. The precision of a nonprobability sample estimate cannot be determined. Although a nonprobability sample allows the use of descriptive statistics to summarize the important features of a sample distribution, it does not permit the use of inferential statistical analysis (see Exhibit 1).

The majority of commercial salary surveys are nonprobability convenience surveys. Data elements are included because of the “voluntary response” of survey subscribers who are willing and able to provide input. Even if participation in a convenience survey is extensive, its results may be generalized only to the immediate sample itself and no farther. However, if a convenience sample is representative, it is just as useful as a random sample for making management decisions, although statistical inferences may not be made from the data. The Management and Professional Salary Survey published by Milliman USA is an example of a convenience survey (see Exhibit 1).

Measures of Central Tendency and Dispersion
The results of salary surveys are reduced and summarized by various measures of central tendency, location, and/or dispersion. Data obtained from salary surveys are commonly organized in a frequency distribution that arrays pay rates (by category) in terms of their frequency of occurrence. A frequency distribution is described by the following three principal features: (a) its shape or form, (b) its cluster around the central value, and (c) its degree of variability.

Descriptive statistics are employed to describe various aspects of frequency distributions. Data are condensed and summarized via descriptive statistics to better understand their meaning and to prepare them for further analysis. The intent of descriptive measures is to represent an entire data set with a single number. Because the resulting data reduction causes a loss of information, several different types of statistics may be necessary to fully represent the various features of a distribution of salary rates. Once pay distributions have been fully described, various comparisons can be made between them.

Measures of Central Tendency
Measures of central tendency are used to summarize (with one or a few statistics) a distribution of pay rates in terms of its center. They represent the pay rate that is typical, expected, or representative in a set of salary data. Because the center of a distribution may be defined in several ways (e.g., balance point, geographical center, or highest point), there are several different measures of central tendency, including the mean (simple, weighted, or trimmed), median, and the mode.

EXHIBIT 2
Measures of Central Tendency

EXHIBIT 3
Measures of Dispersion

There is no one right way to measure central tendency. Depending on the nature of a set of data (e.g., its quality, quantity, distribution, dispersion, and representativeness), certain measures of central tendency (and location) are more suitable than others. The task of the compensation professional is to evaluate distributions of
pay and to choose the best measure of central tendency to describe a set of pay rates. However, once a measure is chosen, it should be consistently maintained throughout and should not be deviated from without a good reason and without noting it (see Exhibit 2).

Among the major measures of central tendency, the mean is the most stable, followed by the median, and lastly the mode.

Measures of Dispersion
A salary rate assumes meaning only when it is contrasted with other pay rates or other statistics. To more fully describe a distribution of pay rates or to more accurately interpret a pay rate, additional information is required concerning the dispersion of salary rates about the measure of central tendency.

Measures of dispersion indicate how scattered or spread out the salary data are around a measure of central tendency. The purpose of this type of measure is to assess how well an average pay rate represents a set of salary data. These measures help determine the goodness, quality, or reliability of generalizations made from the pay data. The greater the dispersion of the pay data, the less reliable the average salary reported (see Exhibit 3).

Simple Mean
Definition. “The simple mean is the sum of scores in a distribution divided by the number of scores.”3

Calculation. The simple mean is calculated by totaling the average salary paid in each organization and then dividing by the total number of organizations (see Exhibit 4).

Mathematical characteristics. The simple (or arithmetic) mean “gives equal weight to the salary paid by each organization regardless of the number of incumbents.”4 In other words, salary data from a company with a single incumbent
EXHIBIT 6
The Mean Is Drawn in the Direction of Extreme Pay Rates
[Figure: frequency plotted against pay rate for a skewed distribution, with the mode, median, and mean marked. The mean is drawn in the direction of skewness (or toward the tail).]

EXHIBIT 7
The Normal Distribution
[Figure: frequency curve of a normal distribution, in which the mean, median, and mode are all equal.]
amount rapidly and spread upward toward the top of the range due to the impact of pay-for-performance and seniority systems as well as the performance of additional (yet unidentified) duties.

Research has shown that the mean of a distribution of pay rates may exceed the median (or midpoint of a distribution) by 3% to 5% because of the presence of very high rates of pay. Richard Henderson warns, “If an organization . . . uses the mean as the value setting pay practices it may be overpaying by 3 to 4 percent.”7

Finally, the simple mean is less volatile than the weighted mean and thus better suited for making year-to-year comparisons.

Weighted Mean
Definition. “The weighted mean is the sum of the mean of each group multiplied by its respective weight (the n in each group), divided by the sum of the weights (total n).”8

Calculation. The weighted mean is calculated by totaling the pay rate of each employee and then dividing by the total number of incumbents or by weighting the mean pay rate of each company surveyed by the number of pay rates that went into its calculation in the first place (see Exhibit 9).

Mathematical characteristics. The weighted mean possesses most of the same properties as the simple mean, including that it (a) serves as the balance point or center of gravity in a data set, (b) reflects all the pay rates in a distribution of salaries, (c) is impacted by the exact numerical value of each pay rate in a data set, (d) is affected by extreme pay rates, and (e) plays an important role in higher statistical applications such as standard deviation, variance, correlation, and regression as well as in inferential statistics.

The principal difference between the simple mean and the weighted mean is that the latter assigns equal weight to the pay rate of every employee (rather than each employer) represented in the survey sample. For this reason, the weighted mean could be thought of as the “employee mean” (see Exhibit 10).

Because the pay data of each company are weighted by the number of incumbents in the job, the data from companies with many incumbents will have greater influence on the weighted mean than the data from employers with few incumbents.

The weighted mean is the most sensitive measure of central tendency to extreme pay rates in a data set. For instance, when the mean pay rate with the greatest weight (number of cases) is situated on either the high or low side of a distribution of means, then it will pull the weighted mean in that direction and away from the simple (unweighted) mean.

When the survey sample participants are highly representative of the market for a job, the weighted mean (which reflects the pay rate of each job incumbent) will represent the market value of the position. In effect, the number of incumbents represents the number of potential openings for that job.

Note: Because they share the same mathematical properties, the simple mean and weighted mean may be combined to reflect the influence of both employers and employees on the market value of a job.
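
The contrast between the simple ("employer") mean and the weighted ("employee") mean can be illustrated with a short Python sketch. The three organizations, their average pay rates, and their incumbent counts are hypothetical values chosen for illustration only, and the function names are not drawn from any survey product.

```python
from statistics import mean

# Hypothetical survey rows: (organization, average pay rate, number of incumbents).
survey = [
    ("Org A", 52_000, 1),
    ("Org B", 48_000, 4),
    ("Org C", 40_000, 20),
]


def simple_mean(rows):
    """Average of the organization averages: each employer counts once."""
    return mean(avg for _, avg, _ in rows)


def weighted_mean(rows):
    """Organization averages weighted by incumbents: each employee counts once."""
    total_pay = sum(avg * n for _, avg, n in rows)
    total_incumbents = sum(n for _, _, n in rows)
    return total_pay / total_incumbents


print(round(simple_mean(survey), 2))  # 46666.67
print(weighted_mean(survey))          # 41760.0
```

In this made-up sample the organization with the most incumbents sits at the low end of the distribution of means, so it pulls the weighted mean well below the simple mean.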
SALARY SURVEYS
EXHIBIT 8
Applications of the Simple Mean
EXHIBIT 10
Applications of the Weighted Mean
EXHIBIT 11
Calculating the Trimmed Mean

All pay rates:
$40,000
$39,000
$39,000
$39,000
$37,000
$36,000
$34,000
$30,000
$28,000
$26,000
$25,000
Simple Mean = $33,909
N = 11

Top and bottom pay rates trimmed:
($40,000)
$39,000
$39,000
$39,000
$37,000
$36,000
$34,000
$30,000
$28,000
$26,000
($25,000)
Trimmed Mean = $34,222
N = 9

Median
Definition. “The median is the value of the middle item when the data are arranged in ascending or descending order.”10

Calculation. The median is calculated by first ranking a set of pay rates. If there is an odd number of rates, then the median is the middle pay rate (see Exhibit 13).

Mathematical characteristics. The median is a measure of location that indicates the geographical center of a distribution of pay rates. It is the “middle data point in an ordered array of data.”11 The median cleaves a data set in two so that half of the pay rates are less than (or equal to) its value and half the pay rates are greater than (or equal to) its value. It is the value at the 50th percentile, the fifth decile, and the second quartile (see Exhibit 14).

Although the mean takes into account the actual value of each pay rate in its calculation, the median (once pay rates have been rank ordered) treats each pay rate equally regardless of its actual value. The value of the median depends on the order among pay rates rather than their numerical value.

The computation of the median involves a simple count of pay rates with each rate counting as only one case (irrespective of its value). When
there is an odd number of pay rates, it is simply the middle rate. When there is an even number of pay rates, it is the simple average of the two center-most rates. When more than two scores share the middle value, calculation of the median becomes a somewhat more complicated proposition12 (see footnote 16).

The median is easier to obtain than the mean. In fact, the median can even be computed when some of the pay rates are missing from the top or bottom of a distribution.

The median provides the “typical” pay rate in a distribution of pay rates. It indicates the pay rate

EXHIBIT 12
Applications of the Trimmed Mean

EXHIBIT 13
Calculating the Median

Median position = (n + 1) / 2

Raw Data    Ordered Data (rank)
32          39 (7)
37          39 (6)
38          38 (5)
36          37 (4)
39          37 (3)
39          36 (2)
37          32 (1)

n = 7
Median position = (7 + 1) / 2, or the 4th pay rate when the salary data are ordered. The median is 37.

When there is an even number of salary rates, the median is the average of the two middle pay rates.

Raw Data    Ordered Data (rank)
13          14 (8)
10          13 (7)
14          13 (6)
7           12 (5)
8           10 (4)
7           8 (3)
12          7 (2)
13          7 (1)

n = 8
Median position = (8 + 1) / 2, or the 4.5th pay rate when rates are ordered. The median is (10 + 12) / 2 = 11.

EXHIBIT 14
The Relation of the Median to the Mean
[Figure: frequency plotted against pay rate. The median divides the distribution so that 50% of the pay rates lie on each side, whereas the mean divides it unequally (less than 50% of the pay rates on one side, more than 50% on the other). If the balance point is placed at the median, the lever falls to the right; if it is placed at the mean, the lever balances.]
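
The arithmetic in Exhibits 11 and 13 can be checked with a few lines of Python. This is only a sketch: `trimmed_mean` is a hypothetical helper that drops the k highest and k lowest rates (k = 1 matches Exhibit 11), and the median comes from the standard library's `statistics.median`, which averages the two middle values when the count is even.

```python
from statistics import mean, median


def trimmed_mean(rates, k=1):
    """Mean after dropping the k highest and k lowest pay rates."""
    if len(rates) <= 2 * k:
        raise ValueError("not enough pay rates to trim")
    ordered = sorted(rates)
    return mean(ordered[k:len(ordered) - k])


# Pay rates from Exhibit 11.
exhibit_11 = [40_000, 39_000, 39_000, 39_000, 37_000, 36_000,
              34_000, 30_000, 28_000, 26_000, 25_000]
print(round(mean(exhibit_11)))          # 33909
print(round(trimmed_mean(exhibit_11)))  # 34222

# Raw data from Exhibit 13 (odd and even counts).
odd_rates = [32, 37, 38, 36, 39, 39, 37]
even_rates = [13, 10, 14, 7, 8, 7, 12, 13]
print(median(odd_rates))   # 37
print(median(even_rates))  # 11.0
```

Trimming one rate from each end removes $40,000 and $25,000, which is why the trimmed mean rises slightly above the simple mean in this data set.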
EXHIBIT 15
Applications of the Median
EXHIBIT 18
Applications of the Mode
formula p(n + 1), where p = the percentage of the percentile and n = the number of pay rates. If the p(n + 1) observations fall between two pay rates, then calculate a value proportionally between them using linear interpolation. Note: The p(n + 1) formula presented is the most widely used. Not all texts use (n + 1). This can make a difference with small samples (see Exhibit 19).

Mathematical characteristics. The percentile is a measure of position that identifies the location of a pay rate in a distribution of rates. The percentile specifies the percentage of rates that are less than (or equal to) a given pay rate in a distribution (e.g., the 70th percentile is that point at which 70% of the pay data fall under when ranked by size). It also indicates the position of pay rates that are larger than a proportion of rates.

The percentile is not necessarily a measure of central tendency or typicality but rather more like the median. In fact, the 50th percentile of a distribution is the median. However, unlike the median, the percentile may reflect all areas of a distribution (of pay rates), not merely its middle section.

The percentile is often used to compare an organization’s pay philosophy against the market. For example, an organization might target its base pay to the 60th percentile of the market rate (see Exhibit 20).

A pay rate has little meaning or utility in isolation. To provide a comprehensible basis for interpreting and comparing pay rates, they are transformed into percentile ranks. For the percentile rank of a pay rate to be meaningful, it must also be made in relation to some reference group (e.g., the market or major competitors).

Percentiles permit comparisons only at the ordinal level. This means that such comparisons indicate only that one pay rate is larger or smaller than another rate but do not specify by how much. An appropriate analogy for ordinal-level comparisons is a horse race (where the horses finish in first, second, and third place).

For a percentile to make sense, there must be a minimum number of two data points lying outside the percentile.16

Range
Definition. “The range is the (actual numerical) difference between the largest and smallest pay rates in a distribution.”17

Calculation. To calculate the range, you subtract the lowest pay rate in a distribution from the highest pay rate (see Exhibit 21).

Mathematical characteristics. The range is a measure of dispersion that indicates the spread

EXHIBIT 19
Calculating the Percentile
= (.70)(10)
= 7th data point
= 422

EXHIBIT 20
Applications of the Percentile

EXHIBIT 21
Calculating the Range
For example, if the highest pay rate in a distribution of salaries is $68,000 and the lowest pay rate $56,000, then the range would be $12,000 ($68,000 – $56,000 = $12,000).
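
The p(n + 1) percentile rule with linear interpolation might be sketched in Python as follows. The function name `percentile_rate` is hypothetical, p is assumed to be expressed as a proportion (0.70 for the 70th percentile), and the guard against p or (1 – p) exceeding n/(n + 1) follows the check quoted in footnote 16. The sample data are the raw values from Exhibit 13, not the data behind Exhibit 19.

```python
import math


def percentile_rate(rates, p):
    """Pay rate at percentile p (a proportion) using position p * (n + 1)."""
    ordered = sorted(rates)
    n = len(ordered)
    limit = n / (n + 1)
    if p > limit or (1 - p) > limit:
        # The position would fall outside the range of the data.
        raise ValueError("too few pay rates for this percentile")
    position = p * (n + 1)          # 1-based position in the ordered data
    lower = math.floor(position)    # rank of the pay rate just below the position
    fraction = position - lower     # how far toward the next pay rate
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    low, high = ordered[lower - 1], ordered[lower]
    return low + fraction * (high - low)  # linear interpolation


odd_rates = [32, 37, 38, 36, 39, 39, 37]
even_rates = [13, 10, 14, 7, 8, 7, 12, 13]
print(percentile_rate(odd_rates, 0.50))   # 37.0 (the median)
print(percentile_rate(even_rates, 0.50))  # 11.0 (the median)
print(percentile_rate(even_rates, 0.25))  # 7.25
```

As the text notes, other texts use positions other than p(n + 1), so results from spreadsheet or statistical software may differ for small samples.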
EXHIBIT 22
Applications of the Range
EXHIBIT 24
Applications of the Interquartile Range
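
As a rough illustration of the range and the interquartile range (IQR), the sketch below summarizes the spread of a small set of pay rates. The seven rates are hypothetical (only the high and low values echo Exhibit 21), and `spread_summary` is an illustrative helper. Quartiles come from Python's `statistics.quantiles`, whose default "exclusive" method is assumed here to correspond to p(n + 1) positions with linear interpolation.

```python
from statistics import quantiles

# Hypothetical pay rates; the highest and lowest values echo Exhibit 21.
rates = [56_000, 58_000, 59_000, 60_000, 61_000, 63_000, 68_000]


def spread_summary(pay_rates):
    """Range, quartiles, IQR, and the direction of skew suggested by the quartiles."""
    ordered = sorted(pay_rates)
    q1, q2, q3 = quantiles(ordered, n=4)  # Q1, median, Q3
    upper_gap, lower_gap = q3 - q2, q2 - q1
    if upper_gap > lower_gap:
        skew = "high"   # Q3 is farther from the median than Q1
    elif lower_gap > upper_gap:
        skew = "low"    # Q1 is farther from the median than Q3
    else:
        skew = "none"
    return {
        "range": ordered[-1] - ordered[0],
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": q3 - q1,
        "skew": skew,
    }


summary = spread_summary(rates)
print(summary["range"])  # 12000
print(summary["iqr"])    # 5000.0
print(summary["skew"])   # high
```

Replacing the top rate with a much larger value would change the range but leave the IQR untouched in this sample, which is one way to see why quartile-based measures vary less from sample to sample.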
survey matches (i.e., whether survey participants were matching the same job).

The quartiles (Q1, Q3) in conjunction with the median provide an indication of the center, spread, and shape of a distribution of pay rates. When either Q1 or Q3 is farther away from the median than the other quartile, the distribution is said to be skewed in that direction.

The quartiles typically provide a more stable measure of pay rate variation than the range because (being less impacted by extreme rates) they vary less from sample to sample. The quartiles are also a more informative measure of dispersion with markedly skewed data than the range because they portray the unequal spread of pay data in a distribution.

The IQR, similar to the median, does not enter into any of the higher mathematical relationships that are basic to inferential statistics.

in the behavioral sciences. St. Paul, MN: West Publishing Company. (p. 60). Elifson, K. W. (1982). Fundamentals of social statistics. Reading, MA: Addison-Wesley. (p. 95).
4. Daniels, L. (Ed.). (1994). Puget Sound regional salary survey summary report. Seattle, WA: Milliman & Robertson, Inc. (p. 18). Clark, L. J. (1993). Business statistics I. Piscataway, NJ: Research & Education Association. (p. 20).
5. Rees, D. G. (1989). Essential statistics. London: Chapman & Hall. (pp. 27, 42). Note: To determine the skewness of a distribution of pay rates, use the following equation. Marked skewness means that the measure of skewness is greater than +1 or less than –1, a rough guide.
   Skewness = (Sample mean – sample median) / Sample standard deviation
11. Weiss, N. (1989). Elementary statistics. Reading, MA: Addison-Wesley. (p. 59).
12. Schmidt, M. J. (1974). Understanding and using statistics: Basic concepts. Lexington, MA: D. C. Heath. (“When more than two scores share the middle value, and there are relatively few observations involved, computing the median becomes a slightly more complicated matter. Consider the distribution: 2, 3, 3, 4, 7, 7, 7, 7, 7, 8, 9. The middle score is 7, but four other scores also share the value. In such cases, one has to determine where the median lies between the upper and lower true limits of the value 7, that is, where exactly the median falls between 6.5 and 7.5. Here, the middle score stands second among five scores that are equal. The median is thus two-fifths the difference between 6.5 and 7.5 above the lower true limit of 6.5. The median is 6.5 + 0.4 = 6.9,” p. 78.)
13. Pagano (1990, p. 67).
14. Davis et al. (1990, p. R2.28).
15. Davis et al. (1990, p. R2.31).
16. “In order to determine whether there are a sufficient number of cases to compute a stable percentile for very small n, very small p, or very large p the following formula can be used. Compare p and (1 – p) with the value of n/(n + 1). If p or (1 – p), is larger than n/(n + 1), the formula will not work, as it will generate a value outside the range of the data.” (Davis et al., 1990, p. R2.33)

    Number of Cases Needed to Ensure Two Data
    Points Outside of the Percentile(s)          Percentile(s)
    4                                            50th
    8                                            25th, 75th
    20                                           10th, 90th

17. Herzberg, P. A. (1983). Principles of statistics. New York: John Wiley. (p. 60).
18. Davis et al. (1990, p. R2.34).
19. Bjorndal, J. A., & Ison, L. K. (1991). Mastering Market Data. Scottsdale, Arizona: American Compensation Association. (p. 8).
20. McMahon, J. R., & Hand, J. S. (1991). Measuring the Marketplace. Scottsdale, Arizona: American Compensation Association. (pp. 14-15).
21. Lutz, Carl F. How to develop, conduct, and use a pay/fringe benefit survey. Crete, Illinois: Abbott, Langer & Associates (p. 38).
22. Lutz, Carl F. (pp. 37-38).
Robert M. Halley, CCP/CBP, is a compensation consultant associated with Swanson Consulting, a labor relations consulting firm, in Seattle, Washington. He has more than 15 years of experience in the field of compensation and has previously served as manager of compensation services for Associated Grocers, Inc. and compensation administrator for the Port of Seattle. He formerly edited and published the Comparative Salary Survey Report, a major regional salary survey. Originally trained as a behavioral scientist, he holds advanced degrees in the social sciences and education. His professional specialties include salary survey research, compensation program development, and salary administration training.