C/a/d Expressing Dollars and Employees in Thousands, The Weighted Mean Expenditure Per Employee Is
SECTION EXERCISES
3.1 c/a/m Mean = 61.664/7 = $8.809 trillion. Median = $8.782 trillion, the middle value in the data array.
3.2 c/a/m Mean = 894/20 = 44.70 goals per season. Median = 44.50, the average of the 10th and 11th values in the data array.
3.3 c/a/m Mean = 1141/20 = 57.05 visitors. Median = 57.50, the average of the 10th and 11th values in the data array. The mode is 63. There were three different days with 63 visitors.
3.4 c/a/m Mean = 198/10 = 19.8. Median = 18.50, the average of the 5th and 6th values in the data array. The mode is 30. There were two cartoons that had 30 incidents.
3.5 c/a/m Mean = 167.07/20 = 8.35. Median = 8.60, the average of the 10th and 11th values in the data array.
3.6 c/a/m Mean = 972.21/15 = $64.81. Median = $65.50, the middle value in the data array.
3.7 c/a/m Mean = 1167/30 = 38.9 yrs. Median = 26.5 yrs., the average of the 15th and 16th values in the data array.
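The ranked positions quoted for the medians in Exercises 3.1 through 3.7 depend only on the number of observations. The following is a minimal Python sketch of that rule; the function name median_positions is an illustrative assumption, not something taken from the text.

```python
def median_positions(n):
    """Return the 1-based ranked position(s) that determine the median."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n % 2 == 1:
        # Odd count: the single middle value is the median.
        return ((n + 1) // 2,)
    # Even count: the median is the average of the two middle values.
    return (n // 2, n // 2 + 1)


# Counts taken from the exercises above (7, 20, 10, 15, and 30 observations).
for n in (7, 20, 10, 15, 30):
    print(n, median_positions(n))
```

For n = 20 this gives positions 10 and 11, and for n = 30 positions 15 and 16, matching the positions cited in the answers.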
3.8 c/a/d Expressing dollars and employees in thousands, the weighted mean expenditure per employee is $49,370.70.
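A weighted mean of this kind can be sketched in Python as follows. The function name and the two-group figures are hypothetical placeholders chosen only to show the arithmetic; they are not the data from Exercise 3.8.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of (weight * value) divided by the sum of weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical illustration (NOT the exercise data):
# expenditure per employee in each group, weighted by employees (thousands).
per_employee = [40.0, 60.0]
employees = [3.0, 1.0]
print(weighted_mean(per_employee, employees))  # (120 + 60) / 4
```

Weighting by the number of employees keeps a small group with a high per-employee figure from dominating the result.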
3.10 d/p/d
a. Motorcyclists usually ride 1 to a motorcycle, so this would be the most frequent value.
b. Mean will be greater, because there is sometimes more than one rider, but always at least one.
c. Mean will be greater, because the distribution is skewed to the right.
3.11 d/p/d
a. Mean will be higher since salaries are usually skewed to the right. Management will emphasize
the mean to make the present situation look brighter.
b. The union representative will wish to make the present situation look worse and therefore will
emphasize the median.
In each set of ratings, the mean exceeds the median. Each distribution is positively skewed.
3.13 c/c/m The Minitab and Excel printouts are shown below.
Descriptive Statistics: PSI
E F
1 PSI
2
3 Mean 398.86
4 Standard Error 1.96
5 Median 396.75
6 Mode 403.90
7 Standard Deviation 19.58
8 Sample Variance 383.50
9 Kurtosis -0.24
10 Skewness 0.23
11 Range 102.80
12 Minimum 351.70
13 Maximum 454.50
14 Sum 39885.60
15 Count 100
The mean exceeds the median. The distribution is positively skewed.
3.14 c/c/m
Descriptive Statistics: absent by gender
The mean number of absences for female employees is less than that for males. The median for the
female employees is also lower. For each gender, the mean exceeds the median and the distribution is
positively skewed.
3.15 c/c/m
Descriptive Statistics: age by gender
The mean age for female employees is less than that for males. The median for the female employees is
also lower. For females, the mean age exceeds the median and the distribution is positively skewed.
3.16 d/p/d One example is an ad claim such as "Get up to 70% more miles per gallon by using product x." Most cars tested may have obtained little or no increase in mpg.
3.17 c/a/m Range = 75 - 36 = 39 visitors. MAD = 207.00/20 = 10.35 visitors.
s² = 2922.95/19 = 153.84, and s = √153.84 = 12.40 visitors.
3.19 c/a/m
a. Mean = 38.4/7 = $5.486 billion. Median = $2.40 billion. Range = 18.1 - 1.3 = $16.8 billion.
Midrange = (18.1 + 1.3)/2 = $9.7 billion.
b. MAD = 30.657/7 = $4.38 billion.
c. σ² = 222.6086/7 = 31.80, and σ = √31.80 = $5.64 billion.
3.20 c/a/m
a. Mean = 229/11 = 20.82 cents. Median = 18 cents, the 6th value in the data array. Range = 55 - 2 = 53 cents.
Midrange = (2 + 55)/2 = 28.5 cents.
b. MAD = 133.454/11 = 12.13 cents.
c. s² = 2553.6/10 = 255.36, and s = √255.36 = 15.98 cents.
3.21 c/a/m
a. Mean = 272/10 = 27.2 mpg. Median = (27 + 29)/2 = 28 mpg. Range = 40 - 10 = 30 mpg.
Midrange = (10 + 40)/2 = 25 mpg.
b. MAD = 56/10 = 5.6 mpg.
c. s² = 583.6/9 = 64.84, and s = √64.84 = 8.052 mpg.
3.22 c/a/m
a. Mean = 10,550/8 = 1318.75 acres. Median = (300 + 500)/2 = 400 acres.
Range = 7050 - 200 = 6850 acres. Midrange = (7050 + 200)/2 = 3625 acres.
b. MAD = 11,462.5/8 = 1432.8 acres.
c. s² = 5,486,383.93, and s = √5,486,383.93 = 2342.30 acres.
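The variance and standard deviation steps in Exercises 3.17 through 3.22 all divide a sum of squared deviations by either N (population) or n - 1 (sample). A minimal Python sketch of that arithmetic, using the sums quoted in the answers, might look like this; the function names are illustrative assumptions.

```python
import math


def variance_and_sd(sum_sq_dev, n, sample=True):
    """Return (variance, standard deviation) from a sum of squared deviations.

    sample=True divides by n - 1; sample=False divides by n (population).
    """
    divisor = n - 1 if sample else n
    if divisor <= 0:
        raise ValueError("not enough observations")
    variance = sum_sq_dev / divisor
    return variance, math.sqrt(variance)


def midrange(smallest, largest):
    """Midrange: the average of the smallest and largest values."""
    return (smallest + largest) / 2


# Sums of squared deviations as quoted in the answers.
print(variance_and_sd(2922.95, 20))                # Exercise 3.17, sample
print(variance_and_sd(222.6086, 7, sample=False))  # Exercise 3.19, population
print(variance_and_sd(2553.6, 11))                 # Exercise 3.20, sample
print(midrange(2, 55))                             # Exercise 3.20 midrange
```

Small differences in the last decimal place can appear because the printed answers round the variance before taking the square root.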
3.23 c/a/m First quartile is in ranked position (11 + 1)/4 = 3; Q1 = first quartile = 7
Second quartile is in ranked position 2(11 + 1)/4 = 6; Q2 = second quartile = 18
Third quartile is in ranked position 3(11 + 1)/4 = 9; Q3 = third quartile = 30
Interquartile range = 30 - 7 = 23; quartile deviation = 23/2 = 11.5
3.24 c/a/m First quartile is in ranked position (10 + 1)/4 = 2.75; Q1 = 21(0.25) + 23(0.75) = 22.5
Second quartile is in ranked position 2(10 + 1)/4 = 5.5; Q2 = 27(0.5) + 29(0.5) = 28
Third quartile is in ranked position 3(10 + 1)/4 = 8.25; Q3 = 32(0.75) + 33(0.25) = 32.25
Interquartile range = 32.25 - 22.5 = 9.75; quartile deviation = 9.75/2 = 4.875
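The ranked-position method used in Exercises 3.23 and 3.24 can be sketched in Python as below. The function names are illustrative assumptions, and the eleven-value list is a hypothetical data set, not the exercise data; only the interpolation inputs (21 and 23, 27 and 29, 32 and 33) come from the answers.

```python
def interpolate(lower_value, upper_value, fraction):
    """Move the given fraction of the way from lower_value to upper_value."""
    return lower_value + fraction * (upper_value - lower_value)


def quartile(sorted_values, k):
    """k-th quartile (k = 1, 2, 3) at ranked position k(n + 1)/4."""
    n = len(sorted_values)
    if n == 0:
        raise ValueError("need at least one value")
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2, or 3")
    position = k * (n + 1) / 4      # 1-based ranked position
    whole = int(position)           # rank of the lower neighbor
    fraction = position - whole     # distance toward the upper neighbor
    if whole < 1:
        return sorted_values[0]
    if whole >= n:
        return sorted_values[-1]
    return interpolate(sorted_values[whole - 1], sorted_values[whole], fraction)


# Interpolation steps as in Exercise 3.24 (positions 2.75, 5.5, and 8.25).
print(interpolate(21, 23, 0.75), interpolate(27, 29, 0.5), interpolate(32, 33, 0.25))

# Hypothetical eleven-value array: positions 3, 6, and 9 are used directly.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
print(quartile(data, 1), quartile(data, 2), quartile(data, 3))
```

Other software conventions for quartile positions exist, so results from a statistics package may differ slightly from this method.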
3.25 c/c/m a. and c. The Excel and Minitab descriptive statistics are shown below.
C D
1 Seconds
2
3 Mean 23.3498
4 Standard Error 0.7764
5 Median 22.8600
6 Mode 22.7400
7 Standard Deviation 5.4897
8 Sample Variance 30.1372
9 Kurtosis 0.6938
10 Skewness 0.6721
11 Range 26.13
12 Minimum 13.40
13 Maximum 39.53
14 Sum 1167.49
15 Count 50
b. The mean absolute deviation must be calculated separately. It is 215.809/50 = 4.316 seconds.
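Because the mean absolute deviation is not part of the printout, it has to be computed from the raw data. A minimal Python sketch is shown below; the function name and the four-value list are hypothetical, since the raw data are not reproduced here.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n


# Hypothetical data (NOT the exercise data): mean is 5, deviations are 3, 1, 1, 3.
print(mean_absolute_deviation([2, 4, 6, 8]))

# Final division as quoted in the answer: sum of absolute deviations over n.
print(round(215.809 / 50, 3))
```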
3.26 c/c/m a. and c. The Excel and Minitab descriptive statistics are shown below.
E F
1 absent
2
3 Mean 9.2800
4 Standard Error 0.3216
5 Median 9
6 Mode 8
7 Standard Deviation 3.2164
8 Sample Variance 10.3451
9 Kurtosis -0.0461
10 Skewness -0.2065
11 Range 16
12 Minimum 1
13 Maximum 17
14 Sum 928
15 Count 100
b. The mean absolute deviation must be calculated separately. It is 255.12/100 = 2.55 absences.
3.27 c/c/m a. and c. The Excel and Minitab descriptive statistics are shown below.
C D
1 meters
2
3 Mean 90.7713
4 Standard Error 0.9347
5 Median 91.4
6 Mode 85.6
7 Standard Deviation 8.3606
8 Sample Variance 69.9000
9 Kurtosis -0.1468
10 Skewness 0.0717
11 Range 40.4
12 Minimum 71.8
13 Maximum 112.2
14 Sum 7261.7
15 Count 80
b. The mean absolute deviation must be calculated separately. It is 536.3575/80 = 6.70 meters.
3.28 c/a/m
a. The median is approximately 37.5 defects per day. The first quartile is approximately 37 defects per
day. The third quartile is approximately 39 defects per day.
b. The asterisks at the right are outliers, indicating two days on which unusually large numbers of defects
were produced. The production supervisor should try to determine if anything out of the ordinary was
happening at the plant on those days.
c. The distribution is positively skewed.
3.29 c/a/e
a. At least (1 - 1/2.5²)*100 = 84%
b. At least (1 - 1/3²)*100 = 88.89%
c. At least (1 - 1/5²)*100 = 96%
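The three bounds above all come from the same expression, 1 - 1/k², where k is the number of standard deviations. A short Python sketch, with an illustrative function name, reproduces them.

```python
def chebyshev_min_percent(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return (1 - 1 / k**2) * 100


for k in (2.5, 3, 5):
    print(k, round(chebyshev_min_percent(k), 2))
```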
3.30 c/a/m Standardized data values: -1.18, -0.99, -0.87, -0.55, -0.36, -0.18, -0.05, 0.26, 0.57, 1.20, and 2.14; 90.9% of them are within 1.5 standard deviation units of the mean. Chebyshev's Theorem states that at least (1 - 1/1.5²)*100 = 55.6% should fall within that interval, and these results support the theorem.
3.31 c/a/m Standardized data values: -2.14, -0.77, -0.52, -0.02, -0.02, 0.22, 0.35, 0.60, 0.72, and 1.59; 90% of them are within 2.0 standard deviations of the mean. Chebyshev's Theorem states that at least (1 - 1/2²)*100 = 75% should fall within that interval, and these results support the theorem.
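The observed percentages in Exercises 3.30 and 3.31 can be checked by counting the standardized values that fall inside the interval. The sketch below uses the standardized values listed in the answers; the function and variable names are illustrative assumptions.

```python
# Standardized values as listed in the answers to Exercises 3.30 and 3.31.
z_330 = [-1.18, -0.99, -0.87, -0.55, -0.36, -0.18, -0.05, 0.26, 0.57, 1.20, 2.14]
z_331 = [-2.14, -0.77, -0.52, -0.02, -0.02, 0.22, 0.35, 0.60, 0.72, 1.59]


def percent_within(z_scores, k):
    """Percentage of standardized values within k standard deviations."""
    inside = sum(1 for z in z_scores if abs(z) <= k)
    return 100 * inside / len(z_scores)


for label, scores, k in (("3.30", z_330, 1.5), ("3.31", z_331, 2.0)):
    observed = percent_within(scores, k)
    bound = (1 - 1 / k**2) * 100   # Chebyshev minimum for comparison
    print(label, round(observed, 1), round(bound, 1), observed >= bound)
```

In both cases the observed share exceeds the Chebyshev minimum, which is what the answers mean by the results supporting the theorem.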
3.32 c/a/m Using the empirical rule:
a. 95%. This is the percentage of values that are within 2 standard deviations of the mean.
b. 16%, or 50% - 34%. Recall that 68% of the values are within 1 standard deviation of the mean.
c. 2.5%, or 50% - 47.5%; 95% of the values are within 2 standard deviations of the mean.
d. 81.5%, obtained by 34% (the area between the mean and 11,500) plus 47.5% (the area from the mean
to 13,000).
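The percentages in parts (a) through (d) are all built from the two empirical-rule figures quoted above, 68% within one standard deviation and 95% within two. A small Python sketch of that bookkeeping follows; the names are illustrative assumptions.

```python
# Approximate percentage of values within k standard deviations of the mean,
# as quoted in the answers (empirical rule for bell-shaped distributions).
WITHIN = {1: 68.0, 2: 95.0}


def between_mean_and(k):
    """Area between the mean and a point k standard deviations away."""
    return WITHIN[k] / 2


def tail_beyond(k):
    """Area beyond a point k standard deviations from the mean, on one side."""
    return 50.0 - between_mean_and(k)


print(tail_beyond(1))                             # part (b)
print(tail_beyond(2))                             # part (c)
print(between_mean_and(1) + between_mean_and(2))  # part (d)
```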
3.34 c/a/m Coefficient of variation = (s/mean)*100 = (140/1235)*100 = 11.34% for data set A. Coefficient of variation = (s/mean)*100 = (1.87/15.7)*100 = 11.91% for data set B. Set B has greater relative dispersion.
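The comparison in Exercise 3.34 can be reproduced with a short Python sketch that uses the standard deviations and means quoted in the answer. The function name is an illustrative assumption.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be nonzero")
    return std_dev / mean * 100


cv_a = coefficient_of_variation(140, 1235)   # data set A
cv_b = coefficient_of_variation(1.87, 15.7)  # data set B
print(round(cv_a, 2), round(cv_b, 2), "B" if cv_b > cv_a else "A")
```

Because the coefficient of variation is unit-free, it allows the two data sets to be compared even though their scales differ widely.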
3.36 c/c/m
a. Box-and-whisker plot and listing of key descriptors. The distribution is positively skewed.
Box Plot: Income
Smallest = 23117
Q1 = 36655
Median = 54826
Q3 = 78794.75
Largest = 242575
IQR = 42139.75
Outliers: 242575, 192724, 189017, 179145, 178007, 172763, 149147,
[Box plot of Income; horizontal axis from 0 to 300000]
b. A portion of the data and standardized data, and descriptive statistics for the 100 standardized values.
A B C D E
1 Income StdInc StdInc
2 80329 0.33538
3 39459 -0.62387 Mean 0.00000
4 149147 1.95054 Standard Error 0.10000
5 55058 -0.25775 Median -0.26320
6 172763 2.50483 Mode #N/A
7 49005 -0.39983 Standard Deviation 1.00000
8 49968 -0.37723 Sample Variance 1.00000
9 27168 -0.91234 Kurtosis 3.6763
10 65544 -0.01165 Skewness 1.8345
11 47740 -0.42951 Range 5.1508
12 27370 -0.90760 Minimum -1.0074
13 67870 0.04296 Maximum 4.1433
14 69140 0.07275 Sum 0.0000
15 86130 0.47151 Count 100
3.37 c/c/m
a. Box-and-whisker plot and listing of key descriptors. The distribution is positively skewed.
Box Plot: absent
Smallest = 1
Q1 = 8
Median = 9
Q3 = 12
Largest = 17
IQR = 4
Outliers: 1, 1,
[Box plot of absent; horizontal axis from 0 to 20]
b. A portion of the data and standardized data, and descriptive statistics for the 100 standardized values.
D E F G H
1 absent StdAbsent StdAbsent
2 8 -0.3980
3 10 0.2239 Mean 0.0000
4 13 1.1566 Standard Error 0.1000
5 8 -0.3980 Median -0.0871
6 13 1.1566 Mode -0.3980
7 10 0.2239 Standard Deviation 1.0000
8 11 0.5348 Sample Variance 1.0000
9 7 -0.7089 Kurtosis -0.0461
10 1 -2.5743 Skewness -0.2065
11 11 0.5348 Range 4.9745
12 4 -1.6416 Minimum -2.5743
13 8 -0.3980 Maximum 2.4002
14 13 1.1566 Sum 0.0000
15 8 -0.3980 Count 100
16 11 0.5348
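The standardized values in this table follow from z = (x - mean)/s. The sketch below uses the mean (9.28) and standard deviation (3.2164) reported for the absences data in Exercise 3.26; the function name is an illustrative assumption.

```python
MEAN_ABSENT = 9.28   # mean from the Exercise 3.26 printout
SD_ABSENT = 3.2164   # standard deviation from the same printout


def standardize(x, mean=MEAN_ABSENT, std_dev=SD_ABSENT):
    """Number of standard deviations by which x lies above or below the mean."""
    return (x - mean) / std_dev


# A few of the data values shown in the table above.
for x in (8, 10, 13, 1):
    print(x, round(standardize(x), 4))
```

Standardizing changes only the location and scale, which is why the skewness and kurtosis of the standardized values match those of the original data.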
3.38 c/c/m
a. Box-and-whisker plot and listing of key descriptors. The distribution is positively skewed.
Box Plot: Seconds
Smallest = 13.4
Q1 = 19.095
Median = 22.86
Q3 = 26.7175
Largest = 39.53
IQR = 7.6225
Outliers: 39.53,
[Box plot of Seconds; horizontal axis from 0.00 to 50.00]
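The outlier listings in Exercises 3.36 through 3.38 are consistent with the common rule that flags values more than 1.5 interquartile ranges beyond the quartiles. That rule is an assumption here, since the printouts do not state the criterion; the Python sketch below applies it to the quartiles shown above, and the function name is illustrative.

```python
def outlier_fences(q1, q3, multiplier=1.5):
    """Lower and upper fences; values outside them are flagged as outliers.

    The 1.5 * IQR multiplier is an assumed convention, not taken from the text.
    """
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


print(outlier_fences(36655, 78794.75))  # Income, Exercise 3.36
print(outlier_fences(8, 12))            # absent, Exercise 3.37
print(outlier_fences(19.095, 26.7175))  # Seconds, Exercise 3.38
```

Under this assumption, the two values of 1 in Exercise 3.37 fall below the lower fence of 2, and 39.53 in Exercise 3.38 falls above the upper fence.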
b. A portion of the data and standardized data, and descriptive statistics for the 50 standardized values.
A B C D E
1 Seconds StdSecs StdSecs
2 19.11 -0.7723
3 13.56 -1.7833 Mean 0.0000
4 22.98 -0.0674 Standard Error 0.1414
5 32.46 1.6595 Median -0.0892
6 19.05 -0.7832 Mode -0.1111
7 27.19 0.6995 Standard Deviation 1.0000
8 19.39 -0.7213 Sample Variance 1.0000
9 23.96 0.1112 Kurtosis 0.6938
10 27.70 0.7924 Skewness 0.6721
11 19.02 -0.7887 Range 4.7598
12 22.60 -0.1366 Minimum -1.8124
13 20.44 -0.5300 Maximum 2.9474
14 28.59 0.9545 Sum 0.0000
15 24.13 0.1421 Count 50
3.39 c/a/d
a. Frequency distribution with classes having widths of 1:
class          m_i    f_i    f_i*m_i    f_i*m_i²
6 - under 7    6.5     1       6.5        42.25
7 - under 8    7.5     6      45.0       337.50
8 - under 9    8.5     7      59.5       505.75
9 - under 10   9.5     6      57.0       541.50
                           sum = 168.0  sum = 1427.0
b. The mean and standard deviation for the actual data were 8.353 and 0.868, respectively.
d. If each data value were the midpoint of its own class, the approximate values would be identical to the
exact values.
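The approximate mean and standard deviation for grouped data come from the column sums of f_i*m_i and f_i*m_i². A minimal Python sketch using the class midpoints and frequencies from Exercise 3.39 is shown below. It assumes the sample (n - 1) form of the variance, and the function name is illustrative.

```python
import math


def grouped_mean_sd(midpoints, freqs):
    """Approximate mean and sample standard deviation from grouped data."""
    n = sum(freqs)
    if n < 2:
        raise ValueError("need at least two observations")
    sum_fm = sum(f * m for m, f in zip(midpoints, freqs))
    sum_fm2 = sum(f * m * m for m, f in zip(midpoints, freqs))
    mean = sum_fm / n
    # Assumed sample form: (sum of f*m^2 - n * mean^2) / (n - 1).
    variance = (sum_fm2 - n * mean**2) / (n - 1)
    return mean, math.sqrt(variance)


# Midpoints and frequencies from the Exercise 3.39 table.
mean, sd = grouped_mean_sd([6.5, 7.5, 8.5, 9.5], [1, 6, 7, 6])
print(round(mean, 3), round(sd, 3))
```

The grouped results differ somewhat from the exact values of 8.353 and 0.868 because every observation is treated as if it sat at its class midpoint.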
3.40 c/a/d
m_i    f_i    f_i*m_i    f_i*m_i²
 10      7       70          700
 20      9      180        3,600
 30     12      360       10,800
 40     14      560       22,400
 50     13      650       32,500
 60      9      540       32,400
 70      8      560       39,200
 80     11      880       70,400
 90     10      900       81,000
100      7      700       70,000
sum = 100   sum = 5400   sum = 363,000
3.41 c/a/d
m_i    f_i    f_i*m_i    f_i*m_i²
  5     25      125          625
 15     17      255        3,825
 25     15      375        9,375
 35      9      315       11,025
 45     10      450       20,250
 55      4      220       12,100
sum = 80   sum = 1740   sum = 57,200
3.42 d/p/e The coefficient of determination is the proportion of the variation in y that is explained by the
best-fit linear equation. It is a measure of the strength of the relationship between the variables.
3.43 c/a/e Because the variables are inversely related, r will be negative. Thus, r will be the negative
square root of 0.64, or r = -0.8.
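The step from the coefficient of determination to the coefficient of correlation in Exercise 3.43 is a square root whose sign matches the direction of the relationship. A short Python sketch of that rule follows; the function name and the slope_sign argument are illustrative assumptions.

```python
import math


def correlation_from_r_squared(r_squared, slope_sign):
    """Return r given r-squared and the sign of the relationship (+1 or -1)."""
    if not 0 <= r_squared <= 1:
        raise ValueError("r_squared must be between 0 and 1")
    if slope_sign == 0:
        raise ValueError("slope_sign must be positive or negative")
    r = math.sqrt(r_squared)
    return r if slope_sign > 0 else -r


print(round(correlation_from_r_squared(0.64, -1), 3))  # inverse relationship
```

The same function applies to a direct relationship by passing a positive slope_sign.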
3.44 c/c/m
Fitted Line Plot: absent = 5.799 + 0.08523 age
S = 3.10622, R-Sq = 7.7%, R-Sq(adj) = 6.7%
[Fitted line plot: absent (vertical axis) versus age (horizontal axis)]
The equation explains 7.7% of the variation in the number of absences. The coefficient of correlation is
the positive (since the slope is positive) square root of 0.077, or r = 0.277.
3.45 c/c/m
[Excel scatter plot with trend line: Lawyers/Judges (vertical axis) versus Academicians (horizontal axis)]
y = 0.9x + 0.3805, R² = 0.9454
Ratings from the academicians explain 94.54% of the variation in the ratings of the lawyers/judges.
The coefficient of correlation is the positive (since the slope is positive) square root of 0.9454,
or r = 0.972
3.46 c/c/m
Fitted Line Plot: CancerRate = 63.71 + 0.4796 HeartRate
S = 8.10447, R-Sq = 61.8%, R-Sq(adj) = 61.0%
[Fitted line plot: CancerRate (vertical axis) versus HeartRate (horizontal axis)]
The equation explains 61.8% of the variation in the cancer rates. The coefficient of correlation is the
positive (since the slope is positive) square root of 0.618, or r = 0.786.
3.47 c/c/m
[Excel scatter plot with trend line: Generic Price ($) (vertical axis) versus Brand-Name Price ($) (horizontal axis)]
y = 0.377x - 3.5238, R² = 0.7447
The equation explains 74.47% of the variation in the generic prices. The coefficient of correlation is the
positive (since the slope is positive) square root of 0.7447, or r = 0.863.
CHAPTER EXERCISES
3.48 c/a/m Mean = (1.25 + 2.36 + 2.50 + 2.15 + 4.55 + 1.10 + 0.95)/7 = $2.12. Yes, the service to the first seven customers was profitable.
3.50 c/a/m
a. ; Median = (0.7 + 1.1)/2 = 0.9; Modes are 0.2 and 0.7.
b. The mode is not a good measure since 0.2 and 0.7 are very small relative to the other values.
3.51 c/a/m Median = (116 + 121)/2 = 118.5; There is no mode.
3.52 c/a/m
a. mph. Median = (30 + 30)/2 = 30 mph. b. Mode = 30 mph.
3.54 c/p/d
a. The mean exceeds the median and, based on the rough character-graph boxplot shown below, the
distribution appears to be very slightly positively skewed.
---------
----------I + I------------
---------
-----+---------+---------+---------+-----
50 100 150 200
b. Approximately 2.5%, obtained by 50% (the area to the left of the mean) minus 47.5% (the area
between 64 cups and the mean). According to the empirical rule, approximately 95% of the data
values will lie within 2 standard deviations of the mean; 64 cups is about two standard deviations less
than the mean.
3.55 d/p/m
a. Since all values should be increased by 0.1, the sample mean will increase by 0.1 to 3.1 lbs. Adding the same constant to every value leaves each deviation from the mean unchanged, so the sample standard deviation will still be 0.5 lbs.
b. Using the empirical rule, this would be 4.1 lbs., obtained by 3.1 + 2(0.5). Approximately 95% of the
data values will lie within 2 standard deviations of the mean.
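The effect described in part (a) can be checked numerically: adding a constant to every value shifts the mean by that constant and leaves the standard deviation alone. The sketch below uses Python's standard statistics module; the four weights are hypothetical values chosen for illustration, not the exercise data.

```python
import statistics


def shift_summary(values, constant):
    """Mean and sample standard deviation after adding a constant to each value."""
    shifted = [v + constant for v in values]
    return statistics.mean(shifted), statistics.stdev(shifted)


# Hypothetical weights in pounds (NOT the exercise data).
weights = [2.4, 2.9, 3.0, 3.6]
print(statistics.mean(weights), statistics.stdev(weights))
print(shift_summary(weights, 0.1))
```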
3.56 c/a/m
a. stoppages. Median = 235 stoppages (3rd value in data array).
Range = 424 – 44 = 380 stoppages
Midrange = (44 + 424)/2 = 234.0 stoppages
b.
c.
3.57 c/a/m
a. , Median = (2.08 + 2.15)/2 = 2.115 tons.
Range = 2.31 - 1.85 = 0.46 tons Midrange = (1.85 + 2.31)/2 = 2.08 tons.
b.
c.
3.58 c/a/m The median is approximately 99 gallons. The first quartile is approximately 92 gallons.
The third quartile is approximately 104 gallons. The range is approximately 120 - 80 = 40 gallons.
The distribution appears to be slightly negatively skewed.
3.59 c/a/m The median is approximately 120 watts. The first quartile is approximately 116 watts.
The third quartile is approximately 124 watts. The range is approximately 130 - 110 = 20 watts.
The distribution appears to be symmetrical.
3.60 c/a/m
a.
b. Chebyshev's Theorem states that at least (1 - 1/1.5²)*100 = 55.6% should fall within 1.5 standard deviation units. For these data, all except the largest three values, or 88% of the data set, fall within 1.5 standard deviation units.
c. Coefficient of variation = (s/mean)*100% = (0.0684/0.0736)*100% = 92.9%
3.61 c/a/m Exercise 3.57: coefficient of variation = (s/mean)*100% = (0.156/2.10)*100% = 7.43%
Exercise 3.60: coefficient of variation = (s/mean)*100% = (0.0684/0.0736)*100% = 92.9%
There is greater relative variation for the data in Exercise 3.60.
3.62 c/a/m
 m_i    f_i    f_i*m_i    f_i*m_i²
  50     27     1350         67,500
 150     11     1650        247,500
 250      4     1000        250,000
 350      1      350        122,500
 450      2      900        405,000
 550      1      550        302,500
 650      0        0              0
 750      1      750        562,500
 850      1      850        722,500
 950      0        0              0
1050      1     1050      1,102,500
1150      1     1150      1,322,500
sum = 50   sum = 9600   sum = 5,105,000
Approximate values:
3.63 c/a/m Median = (24 + 25)/2 = 24.5 pages. First Quartile = 22(0.75) + 22(0.25) = 22 pages.
Third Quartile = 29(0.25) + 35(0.75) = 33.5 pages.
Variable N Mean Median TrMean StDev SE Mean
pages 20 25.65 24.50 25.72 8.01 1.79
3.64 c/a/m
Class           m_i    f_i    f_i*m_i    f_i*m_i²
10 - under 20    15      4       60          900
20 - under 30    25     11      275        6,875
30 - under 40    35      5      175        6,125
sum = 20   sum = 510   sum = 13,900
3.65 c/c/m
a. Descriptive statistics.
C D
1 Utility
2
3 Mean 1644.000
4 Standard Error 13.953
5 Median 1651.000
6 Mode 1765.000
7 Standard Deviation 220.624
8 Sample Variance 48674.916
9 Kurtosis 1.495
10 Skewness 0.113
11 Range 1635
12 Minimum 1016
13 Maximum 2651
14 Sum 411000
15 Count 250
c. As shown in part (b), there are two outlier households ($1057 and $1016) at the low end and one
($2651) at the high end of utility expenditures. Energy-conservation officials may wish to examine
these households for habits or characteristics that should either be emulated or avoided.
3.66 c/c/m
a. Descriptive statistics.
C D
1 $cost
2
3 Mean 3657.00
4 Standard Error 46.55
5 Median 3647.00
6 Mode 3028.00
7 Standard Deviation 806.29
8 Sample Variance 650100.60
9 Kurtosis 0.39
10 Skewness 0.53
11 Range 4455
12 Minimum 2026
13 Maximum 6481
14 Sum 1097100
15 Count 300
c. As shown in part (b), there are three outlier couples ($6481, $6305, and $5990) at the high end of
honeymoon expenditures. Cruise lines, resort areas, and various governmental tourism-promotion
agencies could be interested in finding out more about the age, media habits, and other characteristics
of these people so as to be able to reach and persuade others like them to spend their honeymoons or
vacations at their venues.
3.67 c/c/m
a. Descriptive statistics.
C D
1 SAT
2
3 Mean 517.96
4 Standard Error 5.51
5 Median 519.50
6 Mode 437.00
7 Standard Deviation 110.26
8 Sample Variance 12158.00
9 Kurtosis 0.43
10 Skewness -0.12
11 Range 673
12 Minimum 159
13 Maximum 832
14 Sum 207182
15 Count 400
c. A test-taker would have to score 589 on the math portion to be higher than 75% of the sample
members. He or she would have to score 449 (448.25, rounded up) to be higher than 25% of the
sample members. These correspond to the third and first quartiles, respectively.
3.68 c/c/m
Fitted Line Plot: Seconds = 3.406 + 0.005654 Weight
S = 0.182466, R-Sq = 66.8%, R-Sq(adj) = 66.6%
[Fitted line plot: Seconds (vertical axis) versus Weight (horizontal axis)]
With the linear estimation equation, player weight explains 66.8% of the variation in 40-yard times.
Since the slope is positive, the coefficient of correlation is the positive square root of 0.668, or r = 0.82.
3.69 c/c/m
Fitted Line Plot: $Fines = 218989 + 2458 Actions
S = 450817, R-Sq = 69.8%, R-Sq(adj) = 68.4%
[Fitted line plot: $Fines (vertical axis) versus Actions (horizontal axis)]
Through the linear estimation equation, the number of actions explains 69.8% of the variation in fine
amounts. Because the slope is positive, the coefficient of correlation is the positive square root of 0.698,
or r = 0.84.
INTEGRATED CASES
THORNDIKE SPORTS EQUIPMENT
1. Measures of central tendency and dispersion for the new golf balls, using Minitab:
Descriptive Statistics: NewBall
The mean is 251.53 and the median is 252.80. Both are good measurements to reflect central
tendency. The standard deviation is 17.86, measuring the dispersion of the data.
2. Measures of central tendency and dispersion for the conventional golf balls:
Descriptive Statistics: ConBall
The mean is 238.04 and the median is 240.30. The standard deviation is 19.29.
3. The mean and median distances traveled by the new ball are considerably larger than the
corresponding values for the old ball. This indicates that the new ball is “more lively” than the old
ball and, on average, travels farther. The ranges point the same way: distances for the new ball run
from 223.70 to 294.10, whereas those for the old ball run from 201.00 to 267.90. The standard
deviations of the two samples are similar, with somewhat larger dispersion among the distances of
the old ball than the new one.
This exercise is based on SHOPPING, the Springdale shopping survey database. There are 30 variables
and 150 cases (respondents) in this database. Using Minitab and SHOPPING.MTW:
Variable Maximum
IMPEXCH 7.000
IMPQUALI 7.000
IMPPRICE 7.0000
IMPVARIE 7.000
IMPHELP 7.000
IMPHOURS 7.000
IMPCLEAN 7.000
IMPBARGN 7.000
1b. In part (a), for all 8 variables, the median exceeds the mean, indicating negative skewness.
The corresponding boxplots, shown below, support this conclusion.
Boxplot of IMPEXCH, IMPQUALI, IMPPRICE, IMPVARIE, IMPHELP, ...
[Boxplot panels, one per variable; axis ticks at 2, 4, and 6]
2. Quality and price seem to be the most important attributes in respondents’ choice of a shopping
area. Helpful staff, clean store, and convenient hours are the least important attributes.
With r = -0.099, (-0.099)²*100 is just 0.98%. Slightly less than 1% of the variation in the number
of persons in the respondent’s household is explained by the respondent’s age.
BUSINESS CASE
1. The mean score on the screening test is higher for those who did not default, shown in the Minitab
printout below as 63.439 versus 56.65.
2. The third quartile for those who did not default was 72.00 -- for this group, 75% scored 72.00 or
lower on the screening test. If a score of 72.00 had been established as a cutoff for receiving
a computer loan, 25% of those who repaid would have been denied a loan in the first place. Granting a
loan solely on the basis of a screening test score of 72.00 or above would seem to be rather unfair to
those students who end up repaying the loan, as 25% of them would not have received the loan they
ended up repaying.
3. The Minitab dotplots below visually compare the screening test scores of students who did not default
on their computer loan to the scores of those who defaulted. The distribution of screening test scores
for those who did not default is most definitely shifted to the right of the distribution of scores for
those who did default.
Dotplot of Score vs Default
[Dotplots of Score, grouped by Default; horizontal axis: Score, from 27 to 90]
4. Based on the preceding results, the screening test does appear to be potentially useful as one of the
factors in helping Baldwin predict whether a given applicant will end up defaulting on his or her
computer loan. However, Baldwin might benefit from considering other factors as well -- note that
four students with screening test scores well over 72.00 (ranging from the low 80s to the high 80s)
ended up defaulting on their computer loans. Also, one of the students who did not default had the
lowest screening score of all, shown in the dotplots above as slightly above 27.