CHAP 4 Multiple Choice
1. A data set with two values that are tied for the highest number of occurrences is called bimodal.
True False
True False
True False
4. A trimmed mean may be preferable to a mean when a data set has extreme values.
True False
5. One benefit of the box plot is that it clearly displays the standard deviation.
True False
True False
True False
True False
9. When data are right-skewed, we expect the median to be greater than the mean.
True False
10. The sum of the deviations around the mean is always zero.
True False
11. The midhinge is a robust measure of center when there are outliers.
True False
12. Chebyshev's Theorem says that at most 50 percent of the data lie within 2 standard deviations of the mean.
True False
13. Chebyshev's Theorem says that at least 95 percent of the data lie within 2 standard deviations of the mean.
True False
14. If there are 19 data values, the median will have 10 values above it and 9 below it since n is odd.
True False
15. If there are 20 data values, the median will be halfway between two data values.
True False
16. In a left-skewed distribution, we expect that the median will be greater than the mean.
True False
17. If the standard deviations of two samples are the same, so are their coefficients of variation.
True False
18. A certain health maintenance organization (HMO) examined the number of office visits by its members in the last year. This
data set would probably be skewed to the left due to low outliers.
True False
19. A certain health maintenance organization (HMO) examined the number of office visits by its members in the last year. For this
data set, the mean is probably not a very good measure of a "typical" person's office visits.
True False
20. Referring to this box plot of ice cream fat content, the median seems more "typical" of fat content than the midrange as a
measure of center.
True False
21. Referring to this box plot of ice cream fat content, the mean would exceed the median.
True False
22. Referring to this box plot of ice cream fat content, the skewness would be negative.
True False
23. Referring to this graph of ice cream fat content, the second quartile is about 61.
True False
24. The range as a measure of variability is very sensitive to extreme data values.
True False
25. In calculating the sample variance, the sum of the squared deviations around the mean is divided by n - 1 to avoid
underestimating the unknown population variance.
True False
26. Outliers are data values that fall beyond ±2 standard deviations from the mean.
True False
27. The Empirical Rule assumes that the distribution of data follows a normal curve.
True False
28. The Empirical Rule can be applied to any distribution, unlike Chebyshev's theorem.
True False
29. When applying the Empirical Rule to a distribution of grades, if a student scored one standard deviation below the mean, then
she would be at the 25th percentile of the distribution.
True False
True False
31. A platykurtic distribution is more sharply peaked (i.e., thinner tails) than a normal distribution.
True False
32. A leptokurtic distribution is more sharply peaked (i.e., thinner tails) than a normal distribution.
True False
True False
34. A sample consists of the following data: 7, 11, 12, 18, 20, 22, 43. Using the "three standard deviation" criterion, the last
observation (X = 43) would be considered an outlier.
True False
Multiple Choice Questions
B. a unit-free statistic.
36. Which is not an advantage of the method of medians to find Q1 and Q3?
B. It is less reliable than the mode when the data are continuous.
B. n/2 if n is even.
C. n/2 if n is odd.
A. It is similar to the mean if there are offsetting high and low extremes.
B. It is especially helpful in a small sample.
45. In a sample of 10,000 observations from a normal population, how many would you expect to lie beyond three standard
deviations of the mean?
A. None of them
B. About 27
C. About 100
D. About 127
46. The Excel formula for the standard deviation of a sample array named Data is:
A. =STDEV.S(Data).
B. =STANDEV(Data).
C. =STDEV.P(Data).
D. =SUM(Data)/(COUNT(Data)-1).
48. Estimating the mean from grouped data will tend to be most accurate when:
A. A distribution that is flatter than a normal distribution (i.e., thicker tails) is mesokurtic.
B. A distribution that is more peaked than a normal distribution (i.e., thinner tails) is platykurtic.
A. It applies to any distribution.
A. In a left-skewed distribution, we expect that the median will exceed the mean.
54. Exam scores in a small class were 10, 10, 20, 20, 40, 60, 80, 80, 90, 100, 100. For this data set, which statement is incorrect
concerning measures of center?
55. Exam scores in a small class were 0, 50, 50, 70, 70, 80, 90, 90, 100, 100. For this data set, which statement is incorrect
concerning measures of center?
56. Exam scores in a random sample of students were 0, 50, 50, 70, 70, 80, 90, 90, 90, 100. Which statement is incorrect?
57. For U.S. adult males, the mean height is 178 cm with a standard deviation of 8 cm and the mean weight is 84 kg with a
standard deviation of 8 kg. Elmer is 170 cm tall and weighs 70 kg. It is most nearly correct to say that:
B. John is an outlier.
59. John scored 35 on Prof. Johnson's exam (Q1 = 70 and Q3 = 80). Based on the fences, which is correct?
B. John is an outlier.
60. A population consists of the following data: 7, 11, 12, 18, 20, 22, 25. The population variance is:
A. 6.07.
B. 36.82.
C. 5.16.
D. 22.86.
61. Consider the following data: 6, 7, 17, 51, 3, 17, 23, and 69. The range and the median are:
A. 69 and 17.5.
B. 66 and 17.5.
C. 66 and 17.
D. 69 and 17.
62. When a sample has an odd number of observations, the median is the:
63. As a measure of variability, compared to the range, an advantage of the standard deviation is:
B. considering only the data values in the middle of the data array.
64. Which two statistics offer robust measures of center when outliers are present?
65. Which Excel function is designed to calculate z = (x - μ)/σ for a column of data?
A. =STANDARDIZE
B. =NORM.DIST
C. =STDEV.P
D. =AVEDEV
66. Which Excel function would be least useful to calculate the quartiles for a column of data?
A. =STANDARDIZE
B. =PERCENTILE.EXC
C. =QUARTILE.EXC
D. =RANK
67. A sample of 50 breakfast customers of McDonald's showed the spending below. Which statement is least likely to be correct?
68. VenalCo Market Research surveyed 50 individuals who recently purchased a certain CD, revealing the age distribution shown
below. Which statement is least defensible?
70. A sample of customers from Barnsboro National Bank shows an average account balance of $315 with a standard deviation of
$87. A sample of customers from Wellington Savings and Loan shows an average account balance of $8350 with a standard
deviation of $1800. Which statement about account balances is correct?
A. box plot
B. bar chart
C. histogram
D. scatter plot
73. If the mean and median of a population are the same, then its distribution is:
A. normal.
B. skewed.
C. symmetric.
D. uniform.
74. In the following data set {7, 5, 0, 2, 7, 15, 5, 2, 7, 18, 7, 3, 0}, the value 7 is:
A. the mean.
B. the mode.
A. 800.
B. 1000.
C. 900.
D. 950.
76. The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is 31 minutes. The interquartile range
is:
A. 12 minutes.
B. 16 minutes.
C. 22 minutes.
77. The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is 31 minutes. Which is incorrect
regarding the fences?
78. When using Chebyshev's Theorem, the minimum percentage of sample observations that will fall within two standard
deviations of the mean will be __________ the percentage within two standard deviations if a normal distribution is assumed
(Empirical Rule).
A. smaller than
B. greater than
C. the same as
79. Which distribution is least likely to be skewed to the right by high values?
80. Based on daily measurements, Bob's weight has a mean of 200 pounds with a standard deviation of 16 pounds, while Mary's
weight has a mean of 125 pounds with a standard deviation of 15 pounds. Who has the smaller relative variation?
A. Bob
B. Mary
81. Frieda is 67 inches tall and weighs 135 pounds. Women her age have a mean height of 65 inches with a standard deviation of
2.5 inches and a mean weight of 125 pounds with a standard deviation of 10 pounds. In relative terms, it is correct to say that:
B. for this group of women, weight has greater variation than height.
D. the variation coefficient exceeds 10 percent for both height and weight.
B. The standard deviation is in the same units as the mean (e.g., kilograms).
C. The mean from a frequency tabulation may differ from the mean from raw data.
83. The values of xmin and xmax can be inferred accurately except in a:
A. box plot.
B. dot plot.
C. histogram.
D. scatter plot.
A. The median personal income of California taxpayers would probably be near the mean.
B. The interquartile range offers a measure of income inequality among California residents.
C. For income, the sum of squared deviations about the mean is negative about half the time.
D. For personal incomes in California, outliers in either tail would be equally likely.
D. about 32 percent of the data are beyond one standard deviation from the mean.
87. Three randomly chosen Seattle students were asked how many round trips they made to Canada last year. Their replies were
3, 4, 5. The geometric mean is:
A. 3.877.
B. 4.000.
C. 3.915.
D. 4.422.
88. Three randomly chosen California students were asked how many times they drove to Mexico last year. Their replies were 4, 5,
6. The geometric mean is:
A. 3.87.
B. 5.00.
C. 5.42.
D. 4.93.
89. Three randomly chosen Colorado students were asked how many times they went rock climbing last month. Their replies were
5, 6, 7. The standard deviation is:
A. 1.212.
B. 0.816.
C. 1.000.
D. 1.056.
90. Patient survival times after a certain type of surgery have a very right-skewed distribution due to a few high outliers.
Consequently, which statement is most likely to be correct?
C. Mean > Midrange
91. So far this year, stock A has had a mean price of $6.58 per share with a standard deviation of $1.88, while stock B has had a
mean price of $10.57 per share with a standard deviation of $3.02. Which stock is more volatile?
A. Stock A
B. Stock B
A. box plot.
B. dot plot.
C. histogram.
D. Pareto chart.
B. Range
C. Coefficient of variation
D. Trimmed mean
94. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 3, 2, 1, 2, 1, 5, 9, 1, 2, 3, 3, 10. The geometric mean is:
A.
B. 2.604
C. 1.517
D.
95. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 3, 2, 1, 2, 1, 5, 9, 1, 2, 3, 3, 10. The median is:
A. 7.0.
B. 3.0.
C. 3.5.
D. 2.5.
98. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, the geometric mean is:
A. 2.158.
B. 1.545.
C. 2.376.
D. 3.017.
99. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, the median is:
A. 2.
B. 3.
C. 3.5.
D. 2.5.
100. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, which measure of center is least representative of the "typical" student?
A. Mean
B. Median
C. Mode
D. Midrange
101. Here are statistics on order sizes of Megalith Construction Supply's shipments of two kinds of construction materials last year.
A. Girders
B. Rivets
102. The quartiles of a distribution are most clearly revealed in which display?
A. Box plot
B. Scatter plot
C. Histogram
D. Dot plot
B. smaller when the units are smaller (e.g., milligrams versus kilograms).
C. always zero.
104. What does the graph below (profit/sales ratios for 25 Fortune 500 companies) reveal?
105. Find the sample correlation coefficient for the following data.
A. .8911
B. .9132
C. .9822
D. .9556
106. Heights of male students in a certain statistics class range from Xmin = 61 to Xmax = 79. Applying the Empirical Rule, a
reasonable estimate of σ would be:
A. 2.75.
B. 3.00.
C. 3.25.
D. 3.50.
107. A reporter for the campus paper asked five randomly chosen students how many occupants, including the driver, ride to school
in their cars. The responses were 1, 1, 1, 1, 6. The coefficient of variation is:
A. 25 percent.
B. 250 percent.
C. 112 percent.
D. 100 percent.
108. A smooth distribution with one mode is negatively skewed (skewed to the left). The median of the distribution is $65. Which of
the following is a reasonable value for the distribution mean?
A. $76
B. $54
C. $81
D. $65
109. In a positively skewed distribution, the percentage of observations that fall below the median is:
A. about 50 percent.
C. more than 50 percent.
A. continuous data.
B. categorical data.
C. discrete data.
113. Estimate the mean exam score for the 50 students in Prof. Axolotl's class.
A. 59.2
B. 62.0
C. 63.5
D. 64.1
114. A survey of salary increases received during a recent year by 44 working MBA students is shown. Find the approximate mean
percent raise.
A. 6.56
B. 6.74
C. 5.90
D. 6.39
115. The following frequency distribution shows the amount earned yesterday by employees of a large Las Vegas casino. Estimate
the mean daily earnings.
A. $112.50
B. $125.01
C. $105.47
D. $117.13
116. The following table is the frequency distribution of parking fees for a day:
A. $7.07.
B. $6.95.
C. $7.00.
D. $7.25.
A. 4.550
B. 3.798
C. 4.278
D. 2.997
118. The 25th percentile for waiting time in a doctor's office is 10 minutes. The 75th percentile is 30 minutes. Which is incorrect
regarding the fences?
119. Five homes were recently sold in Oxnard Acres. Four of the homes sold for $400,000, while the fifth home sold for $2.5 million.
Which measure of central tendency best represents a typical home price in Oxnard Acres?
A. The mean or median.
B. The median or mode.
120. In Tokyo, construction workers earn an average of ¥420,000 (yen) per month with a standard deviation of ¥20,000, while in
Hamburg, Germany, construction workers earn an average of €3,200 (euros) per month with a standard deviation of €57. Who
is earning relatively more, a worker making ¥460,000 per month in Tokyo or one earning €3,300 per month in Hamburg?
B. If the data are from a normal population, about 68 percent of the values will be within μ ± σ.
B. Standard deviation
C. Midhinge
D. Interquartile range
123. If Q1 = 150 and Q3 = 250, the upper fences (inner and outer) are:
A. 450 and 600.
B. 350 and 450.
C. 400 and 550.
124. Variables X and Y have the strongest correlation in which scatter plot?
A. Figure A.
B. Figure B.
125. Which of the following statements is likely to apply to the incomes of 50 randomly chosen taxpayers in California?
126. A certain health maintenance organization (HMO) examined the number of office visits by each of its members in the last year.
For this data set, we would anticipate that the geometric mean would be
127. Three randomly chosen Colorado students were asked how many times they went rock climbing last month. Their replies were
5, 6, 7. The coefficient of variation is:
A. 16.7 percent.
B. 13.6 percent.
C. 20.0 percent.
D. 35.7 percent.
128. The mean of a population is 50 and the median is 40. Which histogram is most likely for samples from this population?
A. Sample A.
B. Sample B.
C. Sample C.
D. we should consult a table of percentiles that takes sample size into consideration.
130. If Excel's sample kurtosis coefficient is negative, we conclude that
D. we should consult a table of percentiles that takes sample size into consideration.
131. In Osaka, Japan, stock brokers earn ¥6000 per hour on the average, with a standard deviation of ¥1200. In Stuttgart,
Germany, stock brokers earn an average of €18 per hour with a standard deviation of €6. In which country is the variation in
wages greatest?
132. Find the coefficient of variation of these numbers: 14, 17, 17, 19, 26. Would the variability of those numbers be greater than,
less than, or the same as the variability of 24, 27, 27, 29, 36? Defend your answer.
133. Ten randomly chosen students at a certain university were asked how many times they smoked marijuana during the preceding
week. Their answers were 0, 8, 0, 0, 2, 4, 0, 0, 6, 0. A campus newspaper article appeared, with the headline "Average Student
Uses No Pot." Is this a fair assessment of central tendency? Discuss the alternatives.
134. Twelve students were asked how many credit cards they owned. The responses were 0, 0, 1, 2, 2, 3, 3, 4, 4, 5, 5, 11. (a) Find
the mean, median, and mode. (b) Which measure of center seems best in this case? (c) Find the first and third quartiles. What
do they tell you?
135. Eleven students were asked how many siblings they had. The responses were 0, 1, 2, 2, 2, 2, 2, 3, 3, 4, 5. Find the mean,
median, mode, and geometric mean. Which would you prefer in this case, and why not the others?
136. Patient waiting times in the Tardis Orthopedic Clinic have a mean of 50 minutes with a standard deviation of 25 minutes. Within
what range would approximately 95 percent of the waiting times lie if we were sampling a normal distribution? Do you think the
distribution is likely to be normal? Explain.
137. The athletic departments at 10 randomly selected U.S. universities were asked by the Equal Employment Opportunity
Commission to state what percentage of their nursing scholarships were presently held by women. The responses were 5, 4, 2,
1, 1, 2, 10, 5, 5, 5. Find the mean, median, mode, and geometric mean. Which is the most appropriate measure of central
tendency? The least appropriate? Explain your answer. Is there an outlier?
138. A survey of 10 randomly chosen drivers showed the following number of persons per car, including the driver: 1, 5, 1, 5, 2, 1, 1,
1, 2, 1. Describe the center, variability, and skewness for this sample.
139. A national survey showed that most commuter cars contain only the driver. Hungry for a story, a campus newspaper reporter
asked five randomly chosen commuter students how many occupants, including the driver, rode to school in their cars. Their
responses were 1, 1, 1, 1, and 6. The next day a story appeared in the paper headlined "University Commuters Double
National Average Ridership." Is this a reasonable assessment of central tendency? How would you characterize the variability
of the sample?
140. A 10-point quiz was given by Professor Ennuyeaux. Of the 10 students in the class, half got zero and the others got perfect
scores. List the students' scores. Then find the mean, median, mode, and geometric mean of their scores. Which is the most
appropriate measure of center? The least appropriate?
141. The owner of a chicken farm kept track of each hen's eating and egg production for many months, with the results below.
Which has more variation, feed consumption or egg output?
142. Below are the ages of 21 CEOs. Find the mean, median, and mode. Are there any outliers? Explain.
46, 48, 49, 49, 50, 52, 54, 55, 57, 57, 58, 59, 60, 61, 62, 62, 63, 63, 65, 67, 75
143. Bob's sample of freshman GPAs showed a mean of 2.72 with a standard deviation of 0.31. (a) What range would you predict
for all the grades? For the middle 95 percent? Explain. (b) Why might your estimates be inaccurate?
144. A team of introductory statistics students went to a grocery store and recorded the total calories and fat calories for various
kinds of soup. They produced a table of statistics and two dot plots. Write a succinct summary of the center, variability, and
shape for each data set. Note: TrimMean is the 5 percent trimmed mean (removing the smallest 5 percent and the largest 5
percent of the values, rounded to the nearest integer).
145. Here are descriptive statistics from Excel for annual per-pupil expenditures in 94 Ohio cities and home sizes in a certain
neighborhood. Very briefly compare the variability and shape of the two data sets.
146. Below are shown a dot plot and summary statistics for a random sample of 34 shower heads. The measurements are
maximum flow rates (in gallons per minute) at pressure of 80 pounds per square inch. Use the data to illustrate the difference
between the two alternative definitions of "outlier," and make any other comments you feel are relevant. Note: TrimMean
removes the smallest 5 percent and the largest 5 percent of the values.
147. Briefly describe these data. Sketch its box plot and describe the sample succinctly.
148. Craig operates a part-time snow-plowing business using a 2002 GMC 2500 HD extended cab short box truck. Describe Craig's
gasoline mileage based on this histogram of 195 tanks of gas.
149. Craig operates a part-time snow-plowing business using a 2002 GMC 2500 HD extended cab short box truck. Describe Craig's
gasoline mileage based on this box plot of 195 tanks of gas.
150. Here are advertised prices of 21 used Chevy Blazers. Describe the distribution (center, variability, shape).
151. Briefly describe this sample of departure delays on American Airlines flights out of Denver over a seven-day period, March 3-9
(n = 149 flights).
152. Six graduates from Fulsome University's Master's of Waste Management program were hired by a Saudi Arabian firm at
$110,000 each, while the other four graduates were unemployed. The university placement office bragged, "Our MWM
graduates enjoyed a median starting salary of $110,000." Is this a reasonable assessment of central tendency? What are the
alternatives?
Answer Key
1. A data set with two values that are tied for the highest number of occurrences is called bimodal.
TRUE
FALSE
Extremes distort the midrange (average of highest and lowest data values).
MIDRANGE = (1+1000)/2
TRUE
The second quartile, the median, and the 50th percentile are the same thing.
4. A trimmed mean may be preferable to a mean when a data set has extreme values.
TRUE
5. One benefit of the box plot is that it clearly displays the standard deviation.
FALSE
TRUE
TRUE
Median = 5
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
FALSE
9. When data are right-skewed, we expect the median to be greater than the mean.
FALSE
It's the other way around, as the mean will be pulled up by extremes.
10. The sum of the deviations around the mean is always zero.
TRUE
The mean is the fulcrum (balancing point), so deviations must sum to zero.
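The fulcrum property can be checked numerically. The sketch below is illustrative only: it borrows the sample from question 34, and the function name is an assumption, not part of the test bank.

```python
from statistics import mean

def deviations_about_mean(data):
    """Return each value's deviation from the arithmetic mean."""
    m = mean(data)
    return [x - m for x in data]

sample = [7, 11, 12, 18, 20, 22, 43]   # data from question 34; the mean is 19
deviations = deviations_about_mean(sample)
total = sum(deviations)                # negative and positive deviations cancel
print(deviations, total)
```

With non-integer data the sum may differ from zero by a tiny floating-point rounding error.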
11. The midhinge is a robust measure of center when there are outliers.
TRUE
Outliers have little effect on the midhinge (average of the 25th and 75th percentiles).
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 04-07 Calculate quartiles and other percentiles.
Topic: Percentiles, Quartiles, and Box Plots
12. Chebyshev's Theorem says that at most 50 percent of the data lie within 2 standard deviations of the mean.
FALSE
13. Chebyshev's Theorem says that at least 95 percent of the data lie within 2 standard deviations of the mean.
FALSE
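Questions 12 and 13 both turn on the Chebyshev bound 1 - 1/k^2, which is a minimum and applies to any distribution. A minimal sketch, with hypothetical function names, compares that bound for k = 2 with the share a normal distribution would give:

```python
from statistics import NormalDist

def chebyshev_min_fraction(k):
    """Chebyshev's lower bound on the fraction of data within k standard deviations."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()                 # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

cheb_2 = chebyshev_min_fraction(2)   # at least 75 percent, for any distribution
norm_2 = normal_fraction_within(2)   # roughly 95 percent, normal data only
print(cheb_2, round(norm_2, 4))
```

The bound is "at least 75 percent", so neither "at most 50 percent" nor "at least 95 percent" states the theorem correctly.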
14. If there are 19 data values, the median will have 10 values above it and 9 below it since n is odd.
FALSE
When n is odd, the median is the middle member of the sorted data set. In this case, the median is x10 and there will be 9
below x10 (x1,..., x9) and 9 above x10 (x11,..., x19).
15. If there are 20 data values, the median will be halfway between two data values.
TRUE
16. In a left-skewed distribution, we expect that the median will be greater than the mean.
TRUE
17. If the standard deviations of two samples are the same, so are their coefficients of variation.
FALSE
18. A certain health maintenance organization (HMO) examined the number of office visits by its members in the last year. This
data set would probably be skewed to the left due to low outliers.
FALSE
Lower bound is zero, but high extremes are likely for sicker individuals.
19. A certain health maintenance organization (HMO) examined the number of office visits by its members in the last year. For
this data set, the mean is probably not a very good measure of a "typical" person's office visits.
TRUE
Lower bound is zero, but high extremes are likely for sicker individuals.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Evaluate
Difficulty: 3 Hard
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
20. Referring to this box plot of ice cream fat content, the median seems more "typical" of fat content than the midrange as a
measure of center.
TRUE
Midrange (average of low and high) will be pulled down by left-tail minimum.
EXPLANATION:
When data are skewed or contain outliers, the median is a more robust measure of center
than the midrange.
The midrange is the average of the minimum and maximum values in the data set, so a
single extreme value can shift it substantially.
The median separates the lower 50% of the data from the upper 50% and is not affected
by extreme values in the same way.
In the given box plot, the data are heavily left-skewed and the distribution is clearly
asymmetric, so the median is a better measure of the "typical" value than the midrange.
Hence the statement is true.
AACSB: Analytical Thinking
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-08 Make and interpret box plots.
Topic: Percentiles, Quartiles, and Box Plots
21. Referring to this box plot of ice cream fat content, the mean would exceed the median.
FALSE
22. Referring to this box plot of ice cream fat content, the skewness would be negative.
TRUE
Data are skewed left (negative skewness) as indicated by long left tail.
TRUE
24. The range as a measure of variability is very sensitive to extreme data values.
TRUE
Range depends only on highest and lowest data values, so it is easily distorted.
25. In calculating the sample variance, the sum of the squared deviations around the mean is divided by n - 1 to avoid
underestimating the unknown population variance.
TRUE
Check the definition. You lose one piece of information because the mean is estimated.
26. Outliers are data values that fall beyond ±2 standard deviations from the mean.
FALSE
27. The Empirical Rule assumes that the distribution of data follows a normal curve.
TRUE
28. The Empirical Rule can be applied to any distribution, unlike Chebyshev's theorem.
FALSE
The E.R. assumes a normal population, while Chebyshev applies to any population.
FALSE
In a normal distribution, about 15.87 percent (not 25 percent) of values lie more than one standard deviation below the mean.
(100% - 68%)/2 = 16%
A score one standard deviation below the mean is therefore at about the 16th percentile. At the 25th percentile, 25 percent of
the scores would lie below hers (she would be in the top 75 percent).
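The 15.87 percent figure can be reproduced with the standard normal distribution. This is a sketch that assumes the grades are exactly normal, as the Empirical Rule does; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of a normal population lying more than one standard deviation below the mean.
share_below_minus_1sd = standard_normal.cdf(-1)

# z-score that actually corresponds to the 25th percentile.
z_at_25th_percentile = standard_normal.inv_cdf(0.25)

print(round(share_below_minus_1sd, 4), round(z_at_25th_percentile, 3))
```

The 25th percentile sits only about 0.67 standard deviations below the mean, not a full standard deviation.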
TRUE
FALSE
32. A leptokurtic distribution is more sharply peaked (i.e., thinner tails) than a normal distribution.
TRUE
TRUE
The sign of Excel's kurtosis coefficient indicates the kurtosis direction relative to a normal distribution.
FALSE
43 is not more than three standard deviations above the mean for this data set.
x̄ = 19 (sample mean)
s = 11.9 (sample standard deviation)
x̄ ± 3s gives the interval (-16.7, 54.7).
43 lies within that interval, so it is within three standard deviations of the mean and is not an outlier.
NOTE: If the question had instead asked whether 56 is an outlier, the answer would be yes, because 56 > 54.7.
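The same check can be scripted. The sketch below uses hypothetical helper names and the unrounded sample standard deviation, so its limits (about -16.6 and 54.6) differ slightly from the rounded interval above; the conclusions are unchanged.

```python
from statistics import mean, stdev

def three_sigma_limits(data):
    """Return (lower, upper) limits at three sample standard deviations from the mean."""
    center = mean(data)
    spread = stdev(data)             # sample standard deviation (n - 1 divisor)
    return center - 3 * spread, center + 3 * spread

sample = [7, 11, 12, 18, 20, 22, 43]
lower, upper = three_sigma_limits(sample)

def beyond_limits(x):
    """True if x falls outside the three-standard-deviation limits of the sample."""
    return x < lower or x > upper

print(round(lower, 2), round(upper, 2), beyond_limits(43), beyond_limits(56))
```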
B. a unit-free statistic.
C. helpful when the sample means are zero.
The C.V. is unit free. It is the standard deviation as a percentage of the mean.
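A small sketch illustrates why the coefficient of variation is unit free. The weights are hypothetical values invented for the example; the last line reuses the rock-climbing data from question 127.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100 * stdev(data) / mean(data)

weights_kg = [2.0, 2.5, 3.0, 3.5, 4.0]       # hypothetical measurements in kilograms
weights_g = [1000 * w for w in weights_kg]   # the same measurements in grams

cv_kg = coefficient_of_variation(weights_kg)
cv_g = coefficient_of_variation(weights_g)   # equals cv_kg up to rounding error

cv_climbing = coefficient_of_variation([5, 6, 7])
print(round(cv_kg, 2), round(cv_g, 2), round(cv_climbing, 1))
```

Changing units rescales the mean and the standard deviation by the same factor, so their ratio does not change.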
36. Which is not an advantage of the method of medians to find Q1 and Q3?
When the quartiles lie between two data values, the method of medians goes halfway between the values (very simple),
while Excel interpolates between them in a more complex way.
B. It is less reliable than the mode when the data are continuous.
The mean utilizes all n data values. Deviations always sum to zero around the mean. The mean works for continuous data
(unlike the mode). The mean often differs from the median in business data.
B. n/2 if n is even.
C. n/2 if n is odd.
This formula always works for the median position. For example, if n = 10 (even) the median is at position (10 + 1)/2 = 5.5,
or halfway between x5 and x6. But if n = 11 (odd) the median is at position (11 + 1)/2 = 6, which is observation x6.
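The position rule can be sketched in code. The helper names are assumptions; the result is compared with Python's built-in `statistics.median` on the data from question 61.

```python
from statistics import median

def median_position(n):
    """1-based position of the median in a sorted array of n values."""
    return (n + 1) / 2

def median_by_position(data):
    """Median found by locating position (n + 1)/2 in the sorted data."""
    xs = sorted(data)
    pos = median_position(len(xs))
    lower = int(pos)                 # whole part of the position
    if pos == lower:                 # odd n: the position is a single observation
        return xs[lower - 1]
    # even n: halfway between the two observations on either side of the position
    return (xs[lower - 1] + xs[lower]) / 2

data = [6, 7, 17, 51, 3, 17, 23, 69]
print(median_position(10), median_position(11), median_by_position(data))
```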
A. It is similar to the mean if there are offsetting high and low extremes.
Although both the mean and the geometric mean are affected by high extremes in skewed data, the geometric mean tends
to reduce their influence.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Understand
Difficulty: 2 Medium
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
The standard deviation applies to any data measured on a ratio or interval scale. Because it is a square root, its visual
interpretation may be less clear than the MAD.
The strength of Chebyshev's Theorem is that it makes no assumption about normality, while the E.R. only works for normal
populations.
44. If samples are from a normal distribution with μ = 100 and σ = 10, we expect:
45. In a sample of 10,000 observations from a normal population, how many would you expect to lie beyond three standard
deviations of the mean?
A. None of them
B. About 27
C. About 100
D. About 127
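One way to approach question 45 is to multiply the sample size by the two-tail normal probability beyond three standard deviations. The sketch below assumes an exactly normal population and uses a hypothetical function name.

```python
from statistics import NormalDist

def expected_count_beyond(k, n):
    """Expected number of n normal observations more than k standard deviations from the mean."""
    one_tail = 1 - NormalDist().cdf(k)   # area above +k standard deviations
    return n * 2 * one_tail              # both tails, by symmetry

expected = expected_count_beyond(3, 10_000)
print(round(expected, 1))
```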
46. The Excel formula for the standard deviation of a sample array named Data is:
A. =STDEV.S(Data).
B. =STANDEV(Data).
C. =STDEV.P(Data).
D. =SUM(Data)/(COUNT(Data)-1).
AACSB: Technology
Accessibility: Keyboard Navigation
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 04-03 Calculate and interpret common measures of variability.
Topic: Measures of Variability
48. Estimating the mean from grouped data will tend to be most accurate when:
Many bins and uniform data distribution within bins would give a result closest to the ungrouped mean μ.
A. A distribution that is flatter than a normal distribution (i.e., thicker tails) is mesokurtic.
B. A distribution that is more peaked than a normal distribution (i.e., thinner tails) is platykurtic.
Shape is hard to judge in small samples. The 50 is just a rule of thumb. Excel computes kurtosis for samples of any size,
but tables of critical values may not go down below 50.
Skewness due to extreme data values is common in business data. Right skewness is common, which increases the mean
relative to the median.
A. It applies to any distribution.
The E.R. applies only to normal populations, while Chebyshev's Theorem is general.
A. In a left-skewed distribution, we expect that the median will exceed the mean.
The mean is pulled down in left-skewed data, but deviations around it sum to zero in any data set. The median may be
between two data values and may not be in the middle of the box plot.
54. Exam scores in a small class were 10, 10, 20, 20, 40, 60, 80, 80, 90, 100, 100. For this data set, which statement is
incorrect concerning measures of center?
To find the geometric mean, multiply the data values and take the 11th root to get G = 41.02. Outliers affect both the mean
and the standard deviation. There are multiple modes in this example.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
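The measures of center for the exam scores in question 54 can be checked as follows. This is a sketch using Python's `statistics` module; the "by hand" line mirrors the multiply-then-take-the-11th-root description.

```python
from math import prod
from statistics import geometric_mean, mean, median, multimode

scores = [10, 10, 20, 20, 40, 60, 80, 80, 90, 100, 100]

g = geometric_mean(scores)                      # library computation
g_by_hand = prod(scores) ** (1 / len(scores))   # product of the values, then the 11th root

modes = multimode(scores)                       # every value tied for the highest frequency
print(round(g, 2), round(mean(scores), 2), median(scores), modes)
```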
55. Exam scores in a small class were 0, 50, 50, 70, 70, 80, 90, 90, 100, 100. For this data set, which statement is incorrect
concerning measures of center?
The median is 75 (halfway between x5 = 70 and x6 = 80 in the sorted array). The zeros render the geometric mean useless.
The modes in this case are not unique.
56. Exam scores in a random sample of students were 0, 50, 50, 70, 70, 80, 90, 90, 90, 100. Which statement is incorrect?
57. For U.S. adult males, the mean height is 178 cm with a standard deviation of 8 cm and the mean weight is 84 kg with a
standard deviation of 8 kg. Elmer is 170 cm tall and weighs 70 kg. It is most nearly correct to say that:
58. John scored 85 on Prof. Hardtack's exam (Q1 = 40 and Q3 = 60). Based on the fences, which is correct?
B. John is an outlier.
59. John scored 35 on Prof. Johnson's exam (Q1 = 70 and Q3 = 80). Based on the fences, which is correct?
B. John is an outlier.
The lower inner fence is 70 - 1.5(80 - 70) = 55 so John is an outlier. Actually, John is an extreme outlier because the lower
outer fence is 70 - 3.0(80 - 70) = 40.
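The fence arithmetic above can be checked with a short Python sketch. It is not part of the original answer key; the helper names `fences` and `classify` are illustrative assumptions, and the quartiles are taken as given in question 59.

```python
def fences(q1, q3):
    """Return (lower outer, lower inner, upper inner, upper outer) fences."""
    iqr = q3 - q1
    return (q1 - 3.0 * iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr, q3 + 3.0 * iqr)


def classify(x, q1, q3):
    """Label a value using the inner and outer fences."""
    lower_outer, lower_inner, upper_inner, upper_outer = fences(q1, q3)
    if x < lower_outer or x > upper_outer:
        return "extreme outlier"
    if x < lower_inner or x > upper_inner:
        return "outlier"
    return "not an outlier"


# Question 59: Q1 = 70, Q3 = 80, John scored 35.
print(fences(70, 80))        # (40.0, 55.0, 95.0, 110.0)
print(classify(35, 70, 80))  # extreme outlier
```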
60. A population consists of the following data: 7, 11, 12, 18, 20, 22, 25. The population variance is:
A. 6.07.
B. 36.82.
C. 5.16.
D. 22.86.
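As a hedged check on question 60 (not part of the original answer key), the sketch below uses Python's standard `statistics` module. The population variance divides by N, while the sample variance divides by n - 1, which is why the two results differ.

```python
import statistics

data = [7, 11, 12, 18, 20, 22, 25]

pop_var = statistics.pvariance(data)   # divides by N (population)
pop_sd = statistics.pstdev(data)       # square root of the population variance
samp_var = statistics.variance(data)   # divides by n - 1 (sample)

print(round(pop_var, 2))   # 36.82
print(round(pop_sd, 2))    # 6.07
print(round(samp_var, 2))  # 42.95
```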
61. Consider the following data: 6, 7, 17, 51, 3, 17, 23, and 69. The range and the median are:
A. 69 and 17.5.
B. 66 and 17.5.
C. 66 and 17.
D. 69 and 17.
62. When a sample has an odd number of observations, the median is the:
Median position is always (n + 1)/2. It need not be halfway between the quartiles.
B. considering only the data values in the middle of the data array.
The range is easy to calculate but utilizes only two data values, which may be unusual.
64. Which two statistics offer robust measures of center when outliers are present?
Extremes are excluded from the trimmed mean and do not affect the median.
65. Which Excel function is designed to calculate z = (x - μ)/σ for a column of data?
A. =STANDARDIZE
B. =NORM.DIST
C. =STDEV.P
D. =AVEDEV
You need the sample mean and sample standard deviation to find the z-score.
AACSB: Technology
Accessibility: Keyboard Navigation
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 04-06 Transform a data set into standardized values.
Topic: Standardized Data
66. Which Excel function would be least useful to calculate the quartiles for a column of data?
A. =STANDARDIZE
B. =PERCENTILE.EXC
C. =QUARTILE.EXC
D. =RANK
AACSB: Technology
Accessibility: Keyboard Navigation
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 04-07 Calculate quartiles and other percentiles.
Topic: Percentiles, Quartiles, and Box Plots
67. A sample of 50 breakfast customers of McDonald's showed the spending below. Which statement is least likely to be
correct?
EXPLANATION:
Multiply and take the 3rd root to get the geometric mean of 4.932. With only three data values, the quartiles cannot be
calculated (we can't divide three items into four groups).
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
70. A sample of customers from Barnsboro National Bank shows an average account balance of $315 with a standard
deviation of $87. A sample of customers from Wellington Savings and Loan shows an average account balance of $8350
with a standard deviation of $1800. Which statement about account balances is correct?
Calculate the coefficient of variation for each bank. For Barnsboro, CV = 100 × s/x̄ = 100 × 87/315 = 27.62%, while for
Wellington CV = 100 × s/x̄ = 100 × 1800/8350 = 21.56%.
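The comparison can be reproduced with a minimal Python sketch (an illustration, not part of the original key; the function name `cv` is an assumption).

```python
def cv(std_dev, mean):
    """Coefficient of variation, expressed as a percentage."""
    return 100 * std_dev / mean


barnsboro = cv(87, 315)
wellington = cv(1800, 8350)

print(round(barnsboro, 2), round(wellington, 2))  # 27.62 21.56
# Barnsboro has the larger relative variation despite the smaller
# standard deviation, because its mean is much smaller.
```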
A. box plot
B. bar chart
C. histogram
D. scatter plot
73. If the mean and median of a population are the same, then its distribution is:
A. normal.
B. skewed.
C. symmetric.
D. uniform.
74. In the following data set {7, 5, 0, 2, 7, 15, 5, 2, 7, 18, 7, 3, 0}, the value 7 is:
A. the mean.
B. the mode.
A. 800.
B. 1000.
C. 900.
D. 950.
76. The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is 31 minutes. The interquartile
range is:
A. 12 minutes.
B. 16 minutes.
C. 22 minutes.
77. The 25th percentile for waiting time in a doctor's office is 19 minutes. The 75th percentile is 31 minutes. Which is incorrect
regarding the fences?
Apply definitions of fences. For example, the upper inner fence is 31 + 1.5(31 - 19) = 49.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Apply
Difficulty: 3 Hard
Learning Objective: 04-07 Calculate quartiles and other percentiles.
Topic: Percentiles, Quartiles, and Box Plots
78. When using Chebyshev's Theorem, the minimum percentage of sample observations that will fall within two standard
deviations of the mean will be __________ the percentage within two standard deviations if a normal distribution is
assumed (Empirical Rule).
A. smaller than
B. greater than
C. the same as
Chebyshev guarantees fewer observations within two standard deviations than the E.R.
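To make the comparison concrete, here is a small Python sketch of the Chebyshev bound 1 - 1/k² (not part of the original key; the function name is an assumption). For k = 2 the guaranteed minimum is 75 percent, compared with roughly 95 percent under the Empirical Rule.

```python
def chebyshev_min_pct(k):
    """Minimum percent of data within k standard deviations (requires k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)


print(chebyshev_min_pct(2))            # 75.0
print(round(chebyshev_min_pct(3), 1))  # 88.9
```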
79. Which distribution is least likely to be skewed to the right by high values?
A few high values would skew the data badly in all but the hamburger example, because a McDonald's hamburger is a
standard menu item.
80. Based on daily measurements, Bob's weight has a mean of 200 pounds with a standard deviation of 16 pounds, while
Mary's weight has a mean of 125 pounds with a standard deviation of 15 pounds. Who has the smaller relative variation?
A. Bob
B. Mary
81. Frieda is 67 inches tall and weighs 135 pounds. Women her age have a mean height of 65 inches with a standard
deviation of 2.5 inches and a mean weight of 125 pounds with a standard deviation of 10 pounds. In relative terms, it is
correct to say that:
B. for this group of women, weight has greater variation than height.
D. the variation coefficient exceeds 10 percent for both height and weight.
Calculate the z-scores for Frieda's weight and Frieda's height. For Frieda's height, z = (x - μ)/σ = (67 - 65)/(2.5) = 0.80,
while for Frieda's weight, z = (x - μ)/σ = (135 - 125)/10 = 1.00. Therefore, Frieda's weight is farther from the mean than her
height. For heights, the CV = 100 × σ/μ = 100 × (2.5)/(65) = 3.8%, while for weights, CV = 100 × σ/μ = 100 × 10/125 =
8.0% (both CVs are below 10%).
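The z-scores and CVs in this explanation can be verified with a brief Python sketch (illustrative only; the helper name `z_score` is an assumption, not something from the answer key).

```python
def z_score(x, mu, sigma):
    """Standardized value z = (x - mu) / sigma."""
    return (x - mu) / sigma


z_height = z_score(67, 65, 2.5)
z_weight = z_score(135, 125, 10)
cv_height = 100 * 2.5 / 65
cv_weight = 100 * 10 / 125

print(round(z_height, 2), round(z_weight, 2))    # 0.8 1.0
print(round(cv_height, 1), round(cv_weight, 1))  # 3.8 8.0
```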
B. The standard deviation is in the same units as the mean (e.g., kilograms).
C. The mean from a frequency tabulation may differ from the mean from raw data.
Normal populations are symmetric, but a sample may differ from the population.
A. box plot.
B. dot plot.
C. histogram.
D. scatter plot.
The bin limits in a histogram may be rounded, so the values of xmin and xmax may be unclear.
A. The median personal income of California taxpayers would probably be near the mean.
B. The interquartile range offers a measure of income inequality among California residents.
C. For income, the sum of squared deviations about the mean is negative about half the time.
D. For personal incomes in California, outliers in either tail would be equally likely.
Incomes are likely to be skewed due to high extremes, while income is bounded on the low end by zero. A wider IQR
would suggest greater inequality of incomes.
Any measure of center using the mean is subject to the influence of outliers.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Remember
Difficulty: 2 Medium
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
D. about 32 percent of the data are beyond one standard deviation from the mean.
The E.R. says that about 68 percent of the observations are within one standard deviation of the mean. Business data
often are skewed.
87. Three randomly chosen Seattle students were asked how many round trips they made to Canada last year. Their replies
were 3, 4, 5. The geometric mean is:
A. 3.877.
B. 4.000.
C. 3.915.
D. 4.422.
Multiply the three numbers and take the 3rd root of 60 to get 3.915.
88. Three randomly chosen California students were asked how many times they drove to Mexico last year. Their replies were
4, 5, 6. The geometric mean is:
A. 3.87.
B. 5.00.
C. 5.42.
D. 4.93.
Multiply the three numbers and take the 3rd root of 120 to get 4.932.
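The geometric mean calculations in questions 87 and 88 can be sketched in Python as follows. This is a minimal illustration, not part of the original key; the function name `geometric_mean` is an assumption, and it simply takes the nth root of the product.

```python
import math


def geometric_mean(values):
    """nth root of the product of n values (zero if any value is zero)."""
    return math.prod(values) ** (1 / len(values))


print(round(geometric_mean([3, 4, 5]), 3))  # 3.915
print(round(geometric_mean([4, 5, 6]), 3))  # 4.932
print(geometric_mean([0, 50, 50]))          # 0.0
```

The last line shows why a single zero in the data makes the geometric mean uninformative.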
89. Three randomly chosen Colorado students were asked how many times they went rock climbing last month. Their replies
were 5, 6, 7. The standard deviation is:
A. 1.212.
B. 0.816.
C. 1.000.
D. 1.056.
90. Patient survival times after a certain type of surgery have a very right-skewed distribution due to a few high outliers.
Consequently, which statement is most likely to be correct?
C. Mean > Midrange
91. So far this year, stock A has had a mean price of $6.58 per share with a standard deviation of $1.88, while stock B has had
a mean price of $10.57 per share with a standard deviation of $3.02. Which stock is more volatile?
A. Stock A
B. Stock B
C. They are the same.
A. box plot.
B. dot plot.
C. histogram.
D. Pareto chart.
On a box plot, outliers are identified by their distance from the quartiles. Data values outside the inner fences (Q1 - 1.5
IQR and Q3 + 1.5 IQR) are outliers. Data values beyond the outer fences (Q1 - 3.0 IQR and Q3 + 3.0 IQR) are extreme
outliers. This definition of "outlier" is not the same as the Empirical Rule, which is based on the distance from the mean.
B. Range
C. Coefficient of variation
D. Trimmed mean
A.
B. 2.604
C. 1.517
D.
95. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 3, 2, 1, 2, 1, 5, 9, 1, 2, 3, 3, 10. The median is:
A. 7.0.
B. 3.0.
C. 3.5.
D. 2.5.
Although we square the deviations around the mean, we take the square root of the variance to get back to the original
units of X. However, the standard deviation is affected by outliers and its interpretation may be nonintuitive.
98. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, the geometric mean is:
A. 2.158.
B. 1.545.
C. 2.376.
D. 3.017.
99. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, the median is:
A. 2.
B. 3.
C. 3.5.
D. 2.5.
Sort and look halfway between the two middle data values.
100. Twelve randomly chosen students were asked how many times they had missed class during a certain semester, with this
result: 2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18. For this sample, which measure of center is least representative of the "typical"
student?
A. Mean
B. Median
C. Mode
D. Midrange
The unusual data value pulls up the mean (3.75) but affects the midrange (1 + 18)/2 = 9.5 even more noticeably.
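The measures of center for this sample (questions 98 to 100) can be checked with a short Python sketch that uses only the standard library. It is an illustration rather than part of the original key.

```python
import math
import statistics

absences = [2, 1, 5, 1, 1, 3, 4, 3, 1, 1, 5, 18]

mean = statistics.mean(absences)
median = statistics.median(absences)   # halfway between the two middle values
mode = statistics.mode(absences)
midrange = (min(absences) + max(absences)) / 2
geo_mean = math.prod(absences) ** (1 / len(absences))

print(mean, median, mode, midrange)  # 3.75 2.5 1 9.5
print(round(geo_mean, 3))            # 2.376
```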
101. Here are statistics on order sizes of Megalith Construction Supply's shipments of two kinds of construction materials last
year.
A. Girders
B. Rivets
Calculate the coefficient of variation. For girders, the CV = 100 × s/x̄ = 100 × (48)/(160) = 30.00%, while for rivets, CV =
100 × s/x̄ = 100 × 702/2800 = 25.07%.
AACSB: Analytical Thinking
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-03 Calculate and interpret common measures of variability.
Topic: Measures of Variability
102. The quartiles of a distribution are most clearly revealed in which display?
A. Box plot
B. Scatter plot
C. Histogram
D. Dot plot
The histogram, scatter plot, or dot plot will not directly show quartiles.
B. smaller when the units are smaller (e.g., milligrams versus kilograms).
C. always zero.
Box is skewed right, so mean probably exceeds the median. The IQR is about 12 - 4 = 8.
105. Find the sample correlation coefficient for the following data.
A. .8911
B. .9132
C. .9822
D. .9556
106. Heights of male students in a certain statistics class range from Xmin = 61 to Xmax = 79. Applying the Empirical Rule, a
reasonable estimate of σ would be:
A. 2.75.
B. 3.00.
C. 3.25.
D. 3.50.
107. A reporter for the campus paper asked five randomly chosen students how many occupants, including the driver, ride to
school in their cars. The responses were 1, 1, 1, 1, 6. The coefficient of variation is:
A. 25 percent.
B. 250 percent.
C. 112 percent.
D. 100 percent.
108. A smooth distribution with one mode is negatively skewed (skewed to the left). The median of the distribution is $65. Which
of the following is a reasonable value for the distribution mean?
A. $76
B. $54
C. $81
D. $65
109. In a positively skewed distribution, the percentage of observations that fall below the median is:
A. about 50 percent.
C. more than 50 percent.
Mode is helpful for categorical data and is easy to calculate in small samples, but requires sorting the sample. Continuous
(decimal) data generally have no mode, or, if a mode exists, it is often not near the center.
AACSB: Analytical Thinking
Accessibility: Keyboard Navigation
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
A. continuous data.
B. categorical data.
C. discrete data.
Mode is good for discrete or categorical data but fails for continuous data.
112. Craig operates a part-time snow-plowing business using a 2002 GMC 2500 HD extended cab short box truck. This box plot of
Craig's MPG on 195 tanks of gas does not support which statement?
Narrow box. With outliers in both tails, it's unclear which way skewness would be.
N = 195
Group           % data
Xmin to Q1      25%
Q1 to Q2        25%
Q2 to Q3        25%
Q3 to Xmax      25%
113. Estimate the mean exam score for the 50 students in Prof. Axolotl's class.
A. 59.2
B. 62.0
C. 63.5
D. 64.1
Apply the formulas for weighted average using interval midpoint multiplied by frequency.
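The weighted-average approach can be sketched in Python as below. The frequency table for this question is not reproduced here, so the bins and frequencies in the sketch are hypothetical values chosen only for illustration, as is the function name `grouped_mean_sd`.

```python
import math


def grouped_mean_sd(bins):
    """Estimate mean and sample std. dev. from (lower, upper, frequency) bins."""
    n = sum(f for _, _, f in bins)
    midpoints = [((lo + hi) / 2, f) for lo, hi, f in bins]
    mean = sum(m * f for m, f in midpoints) / n
    ss = sum(f * (m - mean) ** 2 for m, f in midpoints)
    return mean, math.sqrt(ss / (n - 1))


# Hypothetical exam-score table (not the data from the original question).
hypothetical_bins = [(40, 50, 5), (50, 60, 10), (60, 70, 20),
                     (70, 80, 10), (80, 90, 5)]
mean, sd = grouped_mean_sd(hypothetical_bins)
print(mean)          # 65.0
print(round(sd, 2))  # 11.07
```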
AACSB: Analytical Thinking
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-10 Calculate the mean and standard deviation from grouped data.
Topic: Grouped Data
114. A survey of salary increases received during a recent year by 44 working MBA students is shown. Find the approximate
mean percent raise.
A. 6.56
B. 6.74
C. 5.90
D. 6.39
Apply the formulas for weighted average using interval midpoint multiplied by frequency.
115. The following frequency distribution shows the amount earned yesterday by employees of a large Las Vegas casino.
Estimate the mean daily earnings.
A. $112.50
B. $125.01
C. $105.47
D. $117.13
Apply the formulas for weighted average using interval midpoint multiplied by frequency.
AACSB: Analytical Thinking
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-10 Calculate the mean and standard deviation from grouped data.
Topic: Grouped Data
116. The following table is the frequency distribution of parking fees for a day:
A. $7.07.
B. $6.95.
C. $7.00.
D. $7.25.
Apply the formulas for weighted average using interval midpoint multiplied by frequency.
AACSB: Analytical Thinking
Blooms: Apply
Difficulty: 2 Medium
Learning Objective: 04-10 Calculate the mean and standard deviation from grouped data.
Topic: Grouped Data
A. 4.550
B. 3.798
C. 4.278
D. 2.997
118. The 25th percentile for waiting time in a doctor's office is 10 minutes. The 75th percentile is 30 minutes. Which is incorrect
regarding the fences?
119. Five homes were recently sold in Oxnard Acres. Four of the homes sold for $400,000, while the fifth home sold for $2.5
million. Which measure of central tendency best represents a typical home price in Oxnard Acres?
A. The mean or median.
B. The median or mode.
Summary:
5 homes
120. In Tokyo, construction workers earn an average of ¥420,000 (yen) per month with a standard deviation of ¥20,000, while in
Hamburg, Germany, construction workers earn an average of €3,200 (euros) per month with a standard deviation of €57.
Who is earning relatively more, a worker making ¥460,000 per month in Tokyo or one earning €3,300 per month in
Hamburg?
Summary:
● a worker making ¥460,000 per month in Tokyo (x1 = 460,000 yen) -> z1 = (460,000 - 420,000)/20,000 = 2.00
● a worker making €3,300 per month in Hamburg (x2 = 3,300 euros) -> z2 = (3,300 - 3,200)/57 = 1.75
B. If the data are from a normal population, about 68 percent of the values will be within μ ± σ.
Calculate the z-score to detect outliers: z = (x - μ)/σ = (81 - 52)/(15) = 1.93, which is not an outlier, while the CV is 100 ×
σ/μ = 100 × 128/640 = 20%.
B. Standard deviation
C. Midhinge
D. Interquartile range
123. If Q1 = 150 and Q3 = 250, the upper fences (inner and outer) are:
A. 450 and 600.
B. 350 and 450.
C. 400 and 550.
Add 1.5 times the interquartile range to the third quartile to get the upper inner fence. Add 3.0 times the interquartile range
to the third quartile to get the upper outer fence.
A. Figure A.
B. Figure B.
125. Which of the following statements is likely to apply to the incomes of 50 randomly chosen taxpayers in California?
126. A certain health maintenance organization (HMO) examined the number of office visits by each of its members in the last
year. For this data set, we would anticipate that the geometric mean would be
B. zero because some HMO members would not have an office visit.
Zeros would exist for those who had no office visits, so the geometric mean would be zero.
127. Three randomly chosen Colorado students were asked how many times they went rock climbing last month. Their replies
were 5, 6, 7. The coefficient of variation is:
A. 16.7 percent.
B. 13.6 percent.
C. 20.0 percent.
D. 35.7 percent.
128. The mean of a population is 50 and the median is 40. Which histogram is most likely for samples from this population?
A. Sample A.
B. Sample B.
C. Sample C.
D. we should consult a table of percentiles that takes sample size into consideration.
We have tables that show the expected range of variation for a sample skewness coefficient for various sample
sizes from a symmetric, normal population.
D. we should consult a table of percentiles that takes sample size into consideration.
We have tables that show the expected range of variation for a sample kurtosis coefficient for various sample
sizes from a normal population.
131. In Osaka, Japan, stock brokers earn ¥6000 per hour on the average, with a standard deviation of ¥1200. In Stuttgart,
Germany, stock brokers earn an average of €18 per hour with a standard deviation of €6. In which country is the variation
in wages greatest?
Feedback: Osaka CV = 20 percent, Stuttgart CV = 33.3 percent, so variation is greater in Stuttgart. The point is to show
that you cannot assess relative variation based solely on the standard deviation when the units of measurement differ. (You
have to look also at the mean.)
Feedback: First sample: mean = 8.6, standard deviation = 4.5055, CV = 100 × 4.5055/8.6 = 52.39 percent. Second sample:
mean = 28.6, standard deviation = 4.5055, CV = 100 × 4.5055/28.6 = 15.75 percent. The standard deviations are the same,
but the relative variation is greater in the first sample because the mean is smaller.
133. Ten randomly chosen students at a certain university were asked how many times they smoked marijuana during the
preceding week. Their answers were 0, 8, 0, 0, 2, 4, 0, 0, 6, 0. A campus newspaper article appeared, with the headline
"Average Student Uses No Pot." Is this a fair assessment of central tendency? Discuss the alternatives.
Mode and median are 0, but the mean is 2. Geometric mean is zero due to zeros.
Feedback: Mode and median are 0, but the mean is 2. It is correct that 6 out of 10 students used no marijuana, but to say
that the "average" is zero ignores the four users who bring up the mean. The geometric mean is useless since it is zero
whenever the data set contains zero.
134. Twelve students were asked how many credit cards they owned. The responses were 0, 0, 1, 2, 2, 3, 3, 4, 4, 5, 5, 11. (a)
Find the mean, median, and mode. (b) Which measure of center seems best in this case? (c) Find the first and third
quartiles. What do they tell you?
(a) Mean is 3.33, median is 3, mode is not unique; (b) The mean is slightly influenced by the highest data value, but is not
greatly different than the median. (c) Quartiles depend on which method is used (e.g., Minitab gives 1.25 and 4.75).
Feedback: Mean is 3.33, median is 3. The mode is useless because 0, 2, 3, 4, and 5 each occur twice. In this case the
mean or median gives a reasonable indication of what is "typical." Using the method of medians, Q1 = 1.5 and Q3 = 4.5.
The method of medians only requires sorting the data, finding the median, and then finding the median of the observations
below the median and the median of the observations above the median. Excel and Minitab may use different methods of
calculating quartiles. Excel's =QUARTILE.INC would give 1.75 and 4.25; Minitab would give 1.25 and 4.75, while Excel's
=QUARTILE.EXC will agree with Minitab.
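The three sets of quartiles quoted above can be reproduced with the Python sketch below (not part of the original key). The helper `method_of_medians` is an illustrative assumption. The `exclusive` and `inclusive` options of `statistics.quantiles` happen to reproduce the values attributed above to =QUARTILE.EXC/Minitab and =QUARTILE.INC, respectively.

```python
import statistics

cards = sorted([0, 0, 1, 2, 2, 3, 3, 4, 4, 5, 5, 11])


def method_of_medians(sorted_data):
    """Q1 and Q3 are the medians of the values below and above the median."""
    n = len(sorted_data)
    half = n // 2
    lower = sorted_data[:half]
    # For odd n, the middle value is excluded from both halves.
    upper = sorted_data[half:] if n % 2 == 0 else sorted_data[half + 1:]
    return (statistics.median(lower),
            statistics.median(sorted_data),
            statistics.median(upper))


print(method_of_medians(cards))                              # (1.5, 3.0, 4.5)
print(statistics.quantiles(cards, n=4, method="exclusive"))  # [1.25, 3.0, 4.75]
print(statistics.quantiles(cards, n=4, method="inclusive"))  # [1.75, 3.0, 4.25]
```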
Feedback: Mean is 2.364, median is 2, mode is 2. Any of these conveys a reasonable idea of the "typical" student. The
median is representative of the data, but a good case can also be made for the mode (5 of 11 students had 2 siblings).
There are no outliers, so the mean is not badly distorted (but 7 are below it and 4 above it). Only the mean reflects the fact
that an "average" family has more than two children. The geometric mean is unhelpful because of the zero in the data set.
136. Patient waiting times in the Tardis Orthopedic Clinic have a mean of 50 minutes with a standard deviation of 25 minutes.
Within what range would approximately 95 percent of the waiting times lie if we were sampling a normal distribution? Do
you think the distribution is likely to be normal? Explain.
By the Empirical Rule, range is 0 to 100 minutes, but waiting times may be skewed by a few long waits (nonnormal).
Feedback: By the Empirical Rule, 50 ± (2)(25) gives a range of 0 to 100 minutes. However, the E.R. assumes normality,
which is unlikely for waiting times (probably right-skewed by a few unusually long waits). The large standard deviation likely
is due to outliers.
137. The athletic departments at 10 randomly selected U.S. universities were asked by the Equal Employment Opportunity
Commission to state what percentage of their nursing scholarships were presently held by women. The responses were 5,
4, 2, 1, 1, 2, 10, 5, 5, 5. Find the mean, median, mode, and geometric mean. Which is the most appropriate measure of
central tendency? The least appropriate? Explain your answer. Is there an outlier?
Mean is 4, median is 4.5, mode is 5, geometric mean is 3.1623. The boxplot shows that 10 is an outlier but not an extreme
outlier (based on the fences criterion for outliers).
Feedback: Mean is 4, median is 4.5, mode is 5, geometric mean is 3.1623. For this data set, an argument can be made for
each of these measures of central tendency. The mean or median would probably be most "typical," although the mode
does represent 4 of the 10 observations. The geometric mean downplays the outlier (10) but is not really "typical" of any
university. The boxplot shows that 10 is an outlier but not an extreme outlier (based on the fences criterion for outliers).
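As a hedged check on these figures (not part of the original key), the sketch below recomputes the measures of center and the upper fences in Python. It assumes the method of medians for the quartiles, which the feedback does not state explicitly.

```python
import math
import statistics

pct = sorted([5, 4, 2, 1, 1, 2, 10, 5, 5, 5])

mean = statistics.mean(pct)
median = statistics.median(pct)
mode = statistics.mode(pct)
geo_mean = math.prod(pct) ** (1 / len(pct))

# Method of medians (assumed): n is even, so split the sorted data in half.
half = len(pct) // 2
q1 = statistics.median(pct[:half])
q3 = statistics.median(pct[half:])
upper_inner = q3 + 1.5 * (q3 - q1)
upper_outer = q3 + 3.0 * (q3 - q1)

print(mean, median, mode, round(geo_mean, 4))  # 4 4.5 5 3.1623
print(q1, q3, upper_inner, upper_outer)        # 2 5 9.5 14.0
print(upper_inner < 10 <= upper_outer)         # True: outlier, not extreme
```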
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 2 Medium
Learning Objective: 04-02 Calculate and interpret common measures of center.
Topic: Measures of Center
138. A survey of 10 randomly chosen drivers showed the following number of persons per car, including the driver: 1, 5, 1, 5, 2,
1, 1, 1, 2, 1. Describe the center, variability, and skewness for this sample.
Feedback: Mean is 2, median is 1, mode is 1. For this sample, the mode (6 of 10) most clearly characterizes the "typical"
car occupancy, which is also true of the median. However, only the mean would indicate that more than one person is
actually traveling, on average. The geometric mean is 1.585, which is not especially helpful but does downplay the two 5's.
Data are right-skewed.
139. A national survey showed that most commuter cars contain only the driver. Hungry for a story, a campus newspaper
reporter asked five randomly chosen commuter students how many occupants, including the driver, rode to school in their
cars. Their responses were 1, 1, 1, 1, and 6. The next day a story appeared in the paper headlined "University Commuters
Double National Average Ridership." Is this a reasonable assessment of central tendency? How would you characterize
the variability of the sample?
The mean is 2, median is 1, and mode is 1. Coefficient of variation (112 percent) indicates high dispersion (standard
deviation exceeds the mean).
Feedback: The mean is 2, median is 1, and mode is 1. While technically correct, the paper's story is misleading since 80
percent of the cars contained only one occupant. Data are extremely right-skewed. The standard deviation is 2.236, so the
coefficient of variation (112 percent) indicates very high dispersion (standard deviation exceeds the mean).
Feedback: 0, 0, 0, 0, 0, 10, 10, 10, 10, 10. Mean is 5, median is 5, bimodal (0, 10). Geometric mean is zero (useless due to
zeros in the data set). There is no "typical" or correct description of central tendency since there is no centrality in the data.
In such cases, stick with the mean and median but add a verbal caveat about the extremely bimodal nature of the data.
141. The owner of a chicken farm kept track of each hen's eating and egg production for many months, with the results below.
Which has more variation, feed consumption or egg output?
Feed CV = 14.3 percent, egg CV = 25.0 percent. Egg production is more variable.
Feedback: Feed CV = 14.3 percent, egg CV = 25.0 percent. Egg production is more variable. Problem illustrates that when
units of measurement or means differ, you cannot use the standard deviation to compare variation.
142. Below are the ages of 21 CEOs. Find the mean, median, and mode. Are there any outliers? Explain.
46, 48, 49, 49, 50, 52, 54, 55, 57, 57, 58, 59, 60, 61, 62, 62, 63, 63, 65, 67, 75
Mean is 57.714, median is 58, four modes (49, 57, 62, 63). Standard deviation is s = 7.233. No outliers, but there is one
unusual data value at 75.
Feedback: Mean is 57.714, median is 58, four modes (49, 57, 62, 63). Standard deviation is s = 7.233. No outliers, but
there is one unusual data value at 75. Its standardized value is z = (75 - 57.714)/7.233 = 2.39. Using the method of
medians, Q1 = 51, Q2 = 58, and Q3 = 62.5. Students could also construct fences.
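The summary statistics for the CEO ages can be confirmed with a short Python sketch that uses the standard `statistics` module (an illustration, not part of the original key).

```python
import statistics

ages = [46, 48, 49, 49, 50, 52, 54, 55, 57, 57, 58,
        59, 60, 61, 62, 62, 63, 63, 65, 67, 75]

mean = statistics.mean(ages)
median = statistics.median(ages)
modes = statistics.multimode(ages)   # every value tied for most frequent
s = statistics.stdev(ages)           # sample standard deviation (n - 1)
z_75 = (75 - mean) / s

print(round(mean, 3), median, modes)  # 57.714 58 [49, 57, 62, 63]
print(round(s, 3), round(z_75, 2))    # 7.233 2.39
```

Because z is between 2 and 3, the value 75 is unusual but not an outlier under the standardized-value criterion.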
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 1 Easy
Learning Objective: 04-05 Apply the Empirical Rule and recognize outliers.
Topic: Standardized Data
143. Bob's sample of freshman GPAs showed a mean of 2.72 with a standard deviation of 0.31. (a) What range would you
predict for all the grades? For the middle 95 percent? Explain. (b) Why might your estimates be inaccurate?
By the Empirical Rule, we expect the middle 95 percent between μ - 2σ and μ + 2σ (2.10 and 3.34) and almost all the GPAs
between μ - 3σ and μ + 3σ (1.79 and 3.65). The E.R. is based on the normal distribution, so could be inaccurate if grades
are skewed.
Feedback: By the Empirical Rule, we expect the middle 95 percent between μ - 2σ and μ + 2σ (2.10 and 3.34) and almost all the
GPAs between μ - 3σ and μ + 3σ (1.79 and 3.65). The E.R. is based on the normal distribution, so could be inaccurate if
grades are skewed. If there is skewness, it is more likely to be to the left since many hard-working students will earn GPAs
in the range 3.00 to 4.00, while very few will be below 2.00 (but a few really poor performers could pull the mean down,
since GPA could even be 0.00).
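The Empirical Rule intervals quoted in this answer can be reproduced with a minimal Python sketch (illustrative only; the function name is an assumption, and the sample mean and standard deviation stand in for μ and σ).

```python
def empirical_rule_interval(mean, sd, k):
    """Interval mean ± k standard deviations."""
    return mean - k * sd, mean + k * sd


low_95, high_95 = empirical_rule_interval(2.72, 0.31, 2)
low_all, high_all = empirical_rule_interval(2.72, 0.31, 3)

print(round(low_95, 2), round(high_95, 2))    # 2.1 3.34
print(round(low_all, 2), round(high_all, 2))  # 1.79 3.65
```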
Both are right-skewed (mean > median) though not greatly so, judging from the dot plots. Trimmed mean is only slightly
less than the mean, suggesting that we don't have too many extreme values. However, on the Calories dot plot there is
one outlier because z = (180 - 96.63)/26.91 = 3.10.
Feedback: Both are right-skewed (mean > median) though not greatly so, judging from the dot plots. In each case, the
trimmed mean is only slightly less than the mean, suggesting that we don't have too many extreme values. However, on
the Calories dot plot there is one extreme value, which turns out to be an outlier since its standardized score is z = (180 -
96.63)/26.91 = 3.10. Better students will notice more details and aspects of the data and discuss them.
Expenditure per pupil is right-skewed (mean > median), skewness coefficient is also high; home size is practically
symmetric (mean ≅ median) and has skewness near zero. Expenditure per pupil has at least one severe outlier z = 7.76,
while home size has no outliers but one unusual value at z = 2.71.
Feedback: Expenditure per pupil is right-skewed (mean > median), and the skewness coefficient is also high. Home size is
practically symmetric (mean ≅ median) and has skewness near zero, though many students will say it's right-skewed. (It is
important to realize that skewness is a matter of degree, not a "yes-no" decision.) The modes are unhelpful since both data
sets are continuous measurements. The CVs indicate that expenditure per pupil has much greater dispersion (40.2
percent) than home size (11.2 percent). Expenditure per pupil has at least one severe outlier at z = (11,226 -
2724.61)/1095.22 = 7.76, while home size has no outliers but one possibly unusual value at z = (2908 - 2231.41)/249.32 =
2.71. Better student answers will notice and discuss more of the data features, perhaps attempting to draw a histogram.
Upper inner fence is 3.5, upper outer fence is 4.1, so by these definitions, three (maybe four) data points are "unusual"
(above the upper inner fence) and three are outliers (beyond the upper outer fence).
Feedback: Requires definitions of fences. The upper inner fence is Q3 + 1.5(Q3 - Q1) = 2.9 + 1.5(2.9 - 2.5) = 3.5, while the
upper outer fence is Q3 + 3.0(Q3 - Q1) = 2.9 + 3.0(2.9 - 2.5) = 4.1. By these definitions, three (maybe four) data points are
"unusual" (above the upper inner fence) and three are outliers (beyond the upper outer fence). Using the standardized
variable definition, the cutoff for an "unusual" data point is x̄ + 2s = 2.882 + 2(0.750) = 4.382 (which includes 3 data
points), while the cutoff for an "outlier" is x̄ + 3s = 2.882 + 3(0.750) = 5.132 (which includes 1 data point). Therefore, the
definitions generally agree on what is "unusual" but not on what constitutes an "outlier."
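The two sets of cutoffs compared in this feedback can be computed side by side with the Python sketch below. It is not part of the original key and uses only the quartiles, mean, and standard deviation quoted above.

```python
q1, q3 = 2.5, 2.9          # quartiles quoted in the feedback
mean, s = 2.882, 0.750     # sample mean and standard deviation quoted above

iqr = q3 - q1
upper_inner = q3 + 1.5 * iqr      # fence-based "unusual" cutoff
upper_outer = q3 + 3.0 * iqr      # fence-based "outlier" cutoff
unusual_cutoff = mean + 2 * s     # z = 2 cutoff
outlier_cutoff = mean + 3 * s     # z = 3 cutoff

print(round(upper_inner, 3), round(upper_outer, 3))        # 3.5 4.1
print(round(unusual_cutoff, 3), round(outlier_cutoff, 3))  # 4.382 5.132
```

The standardized cutoffs sit well above the fences here, which is why the two definitions flag different numbers of points.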
Skewed right (mean > median), at least one outlier at z = 3.22, box plot will be skewed right and asymmetric.
Feedback: Skewed right (mean > median) as reflected also in the trimmed mean (below the mean). There is at least one
outlier, whose standardized score is z = (49 - 12.89)/11.23 = 3.22. Box plot will be skewed right (long right whisker) and has
asymmetric "box" whose upper half (Q2 to Q3) is wider than its lower half (Q1 to Q2). The picture is that in most Rose Bowl
games, the winning margin tends to be small, but in a few games there was a "blowout" that raises the mean. Astute
students may notice the 0 and ask how the winning margin can be zero. (In 1922, Washington and Jefferson played
California to a scoreless tie, this being before the "sudden death" overtime had been established.)
Fairly symmetric, yet a few high values will draw up the mean.
Feedback: Fairly symmetric. A few high values exist (they could be outliers, but we would need standard deviation or
quartiles to say for sure). Astute students could apply the Empirical Rule to estimate σ = (XMax - XMin)/6, or σ = (XMax - XMin)/4
and try to check for outliers, but this would not be expected. Some will suggest that the data are normal but there were
data measurement errors (e.g., three tanks erred on the high side, one on the low side).
149. Craig operates a part-time snow-plowing business using a 2002 GMC 2500 HD extended cab short box truck. Describe
Craig's gasoline mileage based on this box plot of 195 tanks of gas.
Range is from just under 9.0 to just over 21.0; typical gas mileage is concentrated near 13 mpg, with the middle 50 percent
between about 12.5 and 13.5 (middle of the "box"); two unusual data values on low end and three on high end (beyond
inner fences).
Feedback: Range is from just under 9.0 to just over 21.0. Typical gas mileage is concentrated near 13 mpg, with the middle
50 percent between about 12.5 and 13.5 (middle of the "box"). Symmetric except for one data point in right tail. Two
unusual data values on low end and three on high end (beyond inner fences). On the high end, two are outliers (beyond
outer fence). Requires knowing definitions of fences.
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 2 Medium
Learning Objective: 04-08 Make and interpret box plots.
Topic: Percentiles, Quartiles, and Box Plots
150. Here are advertised prices of 21 used Chevy Blazers. Describe the distribution (center, variability, shape).
Range is from 7,000 to almost 18,000; median is around 11,500; the middle 50 percent (Q1 to Q3) runs from about 11,000 to
14,000, with right-skewness.
Feedback: Range is from 7,000 to almost 18,000. Median is around 11,500, with the middle 50 percent (Q1 to Q3) from about
11,000 to 14,000, so the interquartile range is about 3,000. Right-skewed, based on the extremely asymmetric box, although the
whiskers are roughly symmetric. Mean would probably be well above the median, based on skewness. Requires knowing how to
read quartiles from a box plot.
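The asymmetry of the box can be put into numbers using the approximate quartiles read from the plot. The sketch below compares the two halves of the box and also computes a quartile-based skewness coefficient (often attributed to Bowley). That coefficient is an added illustration, not a measure the question asks for, and the function name is an assumption.

```python
def quartile_skewness(q1, q2, q3):
    """Quartile skewness: positive when the upper half of the box is wider."""
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)

# Approximate quartiles read from the box plot.
q1, q2, q3 = 11_000, 11_500, 14_000

lower_half = q2 - q1   # width of the box from Q1 to the median
upper_half = q3 - q2   # width of the box from the median to Q3
skew = quartile_skewness(q1, q2, q3)

print(f"lower half: {lower_half}, upper half: {upper_half}, skewness: {skew:.2f}")
```

A coefficient near 0 would indicate a symmetric box; a value well above 0, as here, agrees with the right-skewed reading.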
151. Briefly describe this sample of departure delays on American Airlines flights out of Denver over a seven-day period, March
3-9 (n = 149 flights).
Short left whisker, skewed right. Most data are packed into a very narrow range, but there are 14 outliers (above the upper
outer fence) and 3 or 4 more above the upper inner fence.
Feedback: An early departure ("pushback after doors closed") can occur once a plane is fully loaded. In this data set,
flights departed up to 6 minutes early. The short left whisker and narrow box show that most data values are packed into a
very narrow range. The quartiles Q1, Q2, and Q3 are near -5 (i.e., flights typically push back about 5 minutes early). Only 9
flights departed more than 20 minutes late. There are 14 outliers (above the upper outer fence) and 3 or 4 more above the
upper inner fence. Data are extremely right-skewed. Factors such as weather can cause long departure delays, but for
most flights an early or on-time departure is the norm.
AACSB: Reflective Thinking
Blooms: Evaluate
Difficulty: 2 Medium
Learning Objective: 04-08 Make and interpret box plots.
Topic: Percentiles, Quartiles, and Box Plots
152. Six graduates from Fulsome University's Master's of Waste Management program were hired by a Saudi Arabian firm at
$110,000 each, while the other four graduates were unemployed. The university placement office bragged, "Our MWM
graduates enjoyed a median starting salary of $110,000." Is this a reasonable assessment of central tendency? What are
the alternatives?
Can't use geometric mean due to zeros, but none of the measures is typical of anyone.
Feedback: The median and mode are both $110,000, but the mean is only $66,000 (six salaries of $110,000 and four of $0,
divided by 10). We can't use the geometric mean due to zeros. The sample is small, so no measure is very reliable, but an honest
placement service would note that 40 percent of the graduates were unemployed and that the salary was only for those who
actually found jobs.
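The figures in the feedback can be checked directly. This minimal sketch works in thousands of dollars and assumes the four unemployed graduates are recorded as salaries of 0, as the feedback implies. The geometric mean is computed by hand as the n-th root of the product, which shows why it is uninformative here: a single zero makes the whole product zero.

```python
import math
import statistics

# Salaries in thousands of dollars: six hired at 110, four unemployed recorded as 0.
salaries = [110] * 6 + [0] * 4

mean = statistics.mean(salaries)
median = statistics.median(salaries)
mode = statistics.mode(salaries)

# Geometric mean as the n-th root of the product; any zero forces the result to 0.
geo_mean = math.prod(salaries) ** (1 / len(salaries))

share_unemployed = salaries.count(0) / len(salaries)

print(f"mean = {mean}, median = {median}, mode = {mode}")
print(f"geometric mean = {geo_mean}, unemployed share = {share_unemployed:.0%}")
```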