Solution Chapter 2 Mandenhall
Descriptive Statistics
2.2 The data were entered into Excel and the following pie chart was created:
[Pie chart of the responses; legible labels: Unemployed, Close Match]
The responses were pretty evenly divided between close match, not a match and unemployed.
Chapter 2
2.4
The root causes of the 83 incidences are pretty evenly split between the causes Engineering & Design, Procedures & Practices, and Management & Oversight.
2.6
The countries for all the European PWB manufacturers were combined and the following bar graph was produced:
While Europe is not a leading location of PWB manufacturers, we do not see a major cause for concern about the viability of the PWB industry in Europe.
2.8 A pie chart for the data appears below:
[Pie chart "Analysis of Software Code": "True" 10%, "False" 90%]
It appears that a very high percentage of the software code is defect free.
2.10
[Bar graph "Aquifer": categories Bedrock and Unconsolidated]
It appears that most of the aquifer types are bedrock.
A pie chart was used to describe the detectable levels of MTBE below:
[Pie chart "Detectable MTBE": Detect 31%]
It appears that about twice as many wells are below the detectable level as are above it.
A bar graph was used to describe the well class variable below:
[Bar graph "Well Class": categories Private and Public]
There are more public than private wells in the data set.
2.12 a. We will allow each number to represent the stems and mark the leaves with an asterisk (*). The stem-and-leaf display is:
Stems: 1 2 3 4 5 6
Leaves: ********** ************* *********** **** **
b. 10 of the 40, or 10/40 = .25, of the asteroid observations resulted in exactly 1 spectral image exposure.
2.14 a. To construct a frequency histogram, first calculate the range by subtracting the smallest endpoint of the histogram from the largest. To include the largest and smallest values, we will start the histogram at 7.95 and end at 10.7625. Range = 10.7625 - 7.95 = 2.8125. Next, find the class width for 9 classes by dividing the range by the number of classes: Class width = 2.8125/9 = .3125
The first class will begin at 7.95, below the smallest voltage reading. The classes are shown below:

Class   Class Interval        Frequency
1        7.9500 -  8.2625      1
2        8.2625 -  8.5750      0
3        8.5750 -  8.8875      3
4        8.8875 -  9.2000      0
5        9.2000 -  9.5125      0
6        9.5125 -  9.8250      5
7        9.8250 - 10.1375     15
8       10.1375 - 10.4500      5
9       10.4500 - 10.7625      1
Totals                        n = 30
To obtain the class frequency, count the number of observations that fall within each class interval. The class relative frequency is the class frequency divided by the total number of observations (30).
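The counting step can be sketched in Python. This is only an illustration and not part of the original solution: the 30 readings are taken from the stem-and-leaf display in part b, and the names `readings_hundredths`, `START`, and `WIDTH` are assumptions introduced here. Values are held as integers (ten-thousandths of a volt) so that comparisons against the class boundaries are exact.

```python
# Voltage readings in hundredths of a volt, read from the stem-and-leaf
# display in part b (stems 8, 9 and 10 in that order).
readings_hundredths = (
    [805, 872, 872, 880]
    + [998, 997, 987, 980, 987, 955, 995, 970, 984, 980, 973, 998, 984]
    + [1026, 1005, 1029, 1003, 1055, 1026, 1012, 1005, 1015, 1000, 1015, 1002, 1001]
)

NUM_CLASSES = 9
START = 79500  # 7.9500 volts, expressed in ten-thousandths of a volt
WIDTH = 3125   # class width 0.3125 volts, in ten-thousandths of a volt

# Count how many readings fall in each class interval.
frequencies = [0] * NUM_CLASSES
for reading in readings_hundredths:
    index = (reading * 100 - START) // WIDTH
    frequencies[index] += 1

# Relative frequency = class frequency / total number of observations.
n = len(readings_hundredths)
relative_frequencies = [f / n for f in frequencies]

for k, (f, rf) in enumerate(zip(frequencies, relative_frequencies), start=1):
    lower = (START + (k - 1) * WIDTH) / 10000
    upper = (START + k * WIDTH) / 10000
    print(f"{k}  {lower:.4f} - {upper:.4f}  {f:2d}  {rf:.3f}")
```

Integer arithmetic is a design choice here: a reading that sat exactly on a boundary such as 9.2000 would otherwise be at the mercy of floating-point rounding.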
b.
Stems   Leaves
 8.     05 72 72 80
 9.     98 97 87 80 87 55 95 70 84 80 73 98 84
10.     26 05 29 03 55 26 12 05 15 00 15 02 01
The histogram in part a more effectively describes how the data falls.
c.
d. The new process appears to be worse than the old process. More voltage readings are less than 9.2 volts with the new process than with the old process.
2.16 Statistix was used to generate the following stem-and-leaf plot for the surface roughness data:
Stem and Leaf Plot of ROUGH
Leaf Digit Unit = 0.1
1 0 represents 1.0

       Stem  Leaves
   3    1    001
   5    1    22
   7    1    45
   8    1    7
   9    1    9
  (5)   2    00111
   6    2    23
   4    2    455
   1    2    6

Minimum 1.0600   Median 2.0400   Maximum 2.6400
The data appear to be pretty evenly spread out across the entire range of the data.
2.18 a. The stem will be the first decimal value (tenths) and the leaves will be the second and third decimal values (hundredths and thousandths). The stem-and-leaf display is:
Stems .1 .2 .3 .4 .5 .6 Leaves 12 70 41 05 70 25 39 30 75
23 91 18
b. The three oxon/thion ratios for the clear air days have been marked in boxes above. The ratios for the fog air days have been circled above. The clear air days do seem to produce oxon/thion ratios that are larger than those for the fog air days. Keep in mind, however, that such a statement cannot be made with any measure of reliability (yet).
2.20
Only two of the 26 till ratios exceed the value of 4.5, so we would estimate this proportion to be 2/26 = 0.0769.
2.22 a. The sample mean radioactivity level is:
$$\bar{y} = \frac{\sum y}{n} = \frac{43.75}{9} = 4.861$$
The median is the middle observation once they have been ordered. The 5th observation is 4.85. Thus the median is 4.85. The mode is 5.00.
b. The average radioactivity level is 4.861. Half of the radioactivity levels are less than 4.85 and half are greater. The modal observation occurred 2 times.
2.24 a. The mean of the data is
$$\bar{y} = \frac{\sum y}{n} = \frac{11.77}{8} = 1.4713$$
b. The median is the average of the middle two numbers once the data are arranged in order. The data arranged in order are: 1.37, 1.41, 1.42, 1.48, 1.50, 1.51, 1.53, 1.55. The middle two numbers are 1.48 and 1.50. The median is
$$\frac{1.48 + 1.50}{2} = 1.49$$
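The mean and median for 2.24 can be checked with Python's standard `statistics` module. This is a minimal sketch rather than the method used in the original solution; the variable names are assumptions, and `Decimal` is used only so that the two-decimal data are represented exactly.

```python
from decimal import Decimal
from statistics import mean, median

# The eight observations from 2.24, entered as exact decimals.
values = [Decimal(v) for v in
          ("1.37", "1.41", "1.42", "1.48", "1.50", "1.51", "1.53", "1.55")]

y_bar = mean(values)   # sum of the observations divided by n
med = median(values)   # average of the two middle ordered values when n is even

print("sum    =", sum(values))
print("mean   =", y_bar)
print("median =", med)
print("mean < median:", y_bar < med)
```

A mean that falls below the median is the comparison used to judge the direction of skewness.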
c. Since the mean is less than the median, the data are somewhat skewed to the left.
2.26 The following measures of central tendency were calculated for the 174 ship sanitation scores:
Descriptive Statistics
Variable    N      Mean      Median    Mode
Rating      174    94.420    95.000    97.000
The average sanitation score of the 174 ships sampled was 94.420. Half of the 174 ships sampled had a sanitation score that was below 95 and half had a sanitation score that was above 95. The most frequently occurring sanitation score among the 174 ships sampled was the score 97.
2.28 a. The mean number of ant species discovered is:
$$\bar{y} = \frac{\sum y}{n} = \frac{3 + 3 + \cdots + 4}{11} = \frac{141}{11} = 12.82$$
The median is the middle number once the data have been arranged in order: 3, 3, 4, 4, 4, 5, 5, 5, 7, 49, 52. The median is 5. The mode is the value with the highest frequency. Since both 4 and 5 occur 3 times, both 4 and 5 are modes.
b. For this case, we would recommend the median as a better measure of central tendency than the mean. There are 2 very large numbers compared to the rest. The mean is greatly affected by these 2 numbers, while the median is not.
c. The mean total plant cover percentage for the Dry Steppe region is:
$$\bar{y} = \frac{\sum y}{n} = \frac{40 + 52 + \cdots + 27}{5} = \frac{202}{5} = 40.4$$
The median is the middle number once the data have been arranged in order: 27, 40, 40, 43, 52. The median is 40. The mode is the value with the highest frequency. Since 40 occurs 2 times, 40 is the mode.
d. The mean total plant cover percentage for the Gobi Desert region is:
$$\bar{y} = \frac{\sum y}{n} = \frac{30 + 16 + \cdots + 14}{6} = \frac{168}{6} = 28$$
The median is the mean of the middle 2 numbers once the data have been arranged in order: 14, 16, 22, 30, 30, 56. The median is
$$\frac{22 + 30}{2} = \frac{52}{2} = 26$$
The mode is the value with the highest frequency. Since 30 occurs 2 times, 30 is the mode.
e. Yes, the total plant cover percentage distributions appear to be different for the 2 regions. The percentage of plant coverage in the Dry Steppe region is much greater than that in the Gobi Desert region.
2.30 a. The mean number of power plants is:
$$\bar{y} = \frac{\sum y}{n} = \frac{5 + 3 + \cdots + 3}{20} = \frac{80}{20} = 4$$
The median is the mean of the middle 2 numbers once the data have been arranged in order: 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9, 13. The median is
$$\frac{3 + 4}{2} = \frac{7}{2} = 3.5$$
The mode is the value with the highest frequency. Since 1 occurs 5 times, 1 is the mode.
b. Deleting the largest number, 13, the new mean is:
$$\bar{y} = \frac{\sum y}{n} = \frac{5 + 3 + \cdots + 3}{19} = \frac{67}{19} = 3.526$$
The median is the middle number once the data have been arranged in order: 1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9. The median is 3. The mode is the value with the highest frequency. Since 1 occurs 5 times, 1 is the mode. By dropping the largest measurement from the data set, the mean drops from 4 to 3.526. The median drops from 3.5 to 3. There is no effect on the mode.
c. Deleting the lowest 2 and highest 2 measurements leaves the following: 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7. The new mean is:
$$\bar{y} = \frac{\sum y}{n} = \frac{5 + 3 + \cdots + 3}{16} = \frac{56}{16} = 3.5$$
The trimmed mean has the advantage that any possible outliers have been eliminated.
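The effect of dropping extreme values on the mean, median and mode can be reproduced with a short Python sketch. It is an illustration only: the helper `summarize` and the variable names are assumptions introduced here, and the data are the 20 ordered power plant counts listed in 2.30.

```python
from statistics import mean, median, multimode

# Number of power plants, in sorted order, as listed in 2.30.
plants = [1, 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 9, 13]

def summarize(data):
    """Return (mean, median, list of modes) for a list of numbers."""
    return mean(data), median(data), multimode(data)

full = summarize(plants)                # all 20 measurements
without_max = summarize(plants[:-1])    # largest value (13) deleted
trimmed = summarize(plants[2:-2])       # lowest 2 and highest 2 deleted

for label, result in [("full", full),
                      ("without max", without_max),
                      ("trimmed", trimmed)]:
    print(label, result)
```

Slicing works here only because the list is already sorted; unsorted data would need `sorted(plants)` first.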
2.32
a. Range = 13 - 1 = 12
$$s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{498 - \frac{80^2}{20}}{20 - 1} = \frac{178}{19} = 9.3684$$
$$s = \sqrt{s^2} = \sqrt{9.3684} = 3.061$$
b. Dropping the largest measurement: Range = 9 - 1 = 8
$$s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{329 - \frac{67^2}{19}}{19 - 1} = \frac{92.7368421}{18} = 5.1520$$
$$s = \sqrt{s^2} = \sqrt{5.1520} = 2.270$$
By dropping the largest observation from the data set, the range decreased from 12 to 8, the variance decreased from 9.3684 to 5.1520 and the standard deviation decreased from 3.061 to 2.270.
c. Dropping the largest and smallest measurements: Range = 9 - 1 = 8
$$s^2 = \frac{\sum y^2 - \frac{(\sum y)^2}{n}}{n - 1} = \frac{328 - \frac{66^2}{18}}{18 - 1} = \frac{86}{17} = 5.0588$$
$$s = \sqrt{s^2} = \sqrt{5.0588} = 2.249$$
By dropping the largest and smallest observations from the data set, the range decreased from 12 to 8, the variance decreased from 9.3684 to 5.0588 and the standard deviation decreased from 3.061 to 2.249.
2.34 Comparing the means of the two distributions, we see that the center of the true distribution will be located to the right of the false distribution when viewed on a number line. Comparing the standard deviations of the two distributions, we see that the spread of the true distribution will be much greater than the spread of the false distribution. Graphically, this would be represented by a true distribution that looked shorter and more spread out than the false distribution.
2.36 In mound-shaped symmetric distributions, the Empirical Rule tells us to expect approximately 95% of the data to fall within two standard deviations of the mean.
We expect approximately 95% of the SNR values in the population to fall between 10.1 and 28.9. We would not expect to see an SNR value outside of this interval. Therefore, a value of 30 would not be expected.
2.38 a. Since no information is given about the distribution of the velocities of the Winchester bullets, we can only use Chebyshev's Rule to describe the data. We know that at least 3/4 of the velocities will fall within the interval:
$$\bar{y} \pm 2s \Rightarrow 936 \pm 2(10) \Rightarrow 936 \pm 20 \Rightarrow (916, 956)$$
Also, at least 8/9 of the velocities will fall within the interval:
$$\bar{y} \pm 3s \Rightarrow 936 \pm 3(10) \Rightarrow 936 \pm 30 \Rightarrow (906, 966)$$
b. Since a velocity of 1,000 is much larger than the largest value in the second interval in part a, it is very unlikely that the bullet was manufactured by Winchester.
2.40 a. From the printout, $\bar{y} = 2.425$ and $s = 1.259$.
b.
$$\bar{y} \pm s \Rightarrow 2.425 \pm 1.259 \Rightarrow (1.166, 3.684)$$
$$\bar{y} \pm 2s \Rightarrow 2.425 \pm 2(1.259) \Rightarrow 2.425 \pm 2.518 \Rightarrow (-0.093, 4.943)$$
$$\bar{y} \pm 3s \Rightarrow 2.425 \pm 3(1.259) \Rightarrow 2.425 \pm 3.777 \Rightarrow (-1.352, 6.202)$$
c. 24 observations fall in the interval $\bar{y} \pm s$, or 24/40 = .60. The Empirical Rule says there should be approximately .68 of the measurements within 1 standard deviation of the mean. This is fairly close. 38 observations fall in the interval $\bar{y} \pm 2s$, or 38/40 = .95. The Empirical Rule says there should be approximately .95 of the measurements within 2 standard deviations of the mean. This agrees with the Empirical Rule. 40 observations fall in the interval $\bar{y} \pm 3s$, or 40/40 = 1.00. The Empirical Rule says approximately all of the measurements should fall within 3 standard deviations of the mean. Again, this agrees with the Empirical Rule.
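The interval arithmetic in 2.38 and 2.40 can be sketched in Python as follows. This is an illustration under stated assumptions, not the original solution's method: the function names `interval` and `chebyshev_bound` are introduced here, and the observation counts (24, 38, 40 out of 40) are copied from part c because the raw data are not reproduced in this section.

```python
def interval(y_bar, s, k):
    """Return the interval y_bar +/- k*s as a (lower, upper) tuple."""
    return (y_bar - k * s, y_bar + k * s)

def chebyshev_bound(k):
    """Chebyshev's Rule: at least 1 - 1/k^2 of the data lie within k
    standard deviations of the mean (informative only for k > 1)."""
    return 1 - 1 / k**2

# 2.38: Winchester bullet velocities, mean 936 and standard deviation 10.
for k in (2, 3):
    print(k, interval(936, 10, k), "at least", chebyshev_bound(k))

# 2.40: summary statistics from the printout and the counts from part c.
y_bar, s, n = 2.425, 1.259, 40
counts_within = {1: 24, 2: 38, 3: 40}
for k, count in counts_within.items():
    lower, upper = interval(y_bar, s, k)
    print(f"k={k}: ({lower:.3f}, {upper:.3f})  observed proportion = {count / n:.2f}")
```

Chebyshev's Rule applies to any distribution, which is why it gives only a lower bound; the Empirical Rule percentages quoted in part c assume a mound-shaped, symmetric distribution.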
2.42
a.