Assignment 1 - PART1 - ANSWER
Question 1:
The Woodmill Company makes windows and door trim products. The first step in the process is to rip
dimension (2 x 8, 2 x 10, etc.) lumber into narrower pieces. Currently, the company uses a manual
process in which an experienced operator quickly looks at a board and determines what rip widths to use.
The decision is based on the knots and defects in the wood.
A company in Oregon has developed an optical scanner that can be used to determine the rip widths. The
scanner is programmed to recognize defects and to determine the rip widths that will optimize the value
of the board. A test run of 100 boards was put through the scanner and the rip widths were identified.
However, the boards were not actually ripped. A lumber grader determined the resulting values for each
of the 100 boards assuming that the rips determined by the scanner had been made. Next, the same 100
boards were manually ripped using the normal process. The grader then determined the value for each
board after the manual rip process was completed. The resulting data, in the file WOODMILL, consist of
manual rip values and scanner rip values for each of the 100 boards.
a. Develop a frequency distribution for the board values for the scanner and the manual process.
ANSWER: The frequency table should group the board values into classes. Use no more than 10 classes.
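One way to build such a table is sketched below. It is only an illustration: the function name, the use of equal-width classes, and the sample values are assumptions, and the values are not taken from the WOODMILL file.

```python
def frequency_distribution(values, num_classes):
    """Count values in equal-width classes spanning min(values)..max(values)."""
    low, high = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical.
    width = (high - low) / num_classes or 1.0
    counts = [0] * num_classes
    for v in values:
        # The maximum value is placed in the last class, not in a new one.
        idx = min(int((v - low) / width), num_classes - 1)
        counts[idx] += 1
    classes = [(low + i * width, low + (i + 1) * width) for i in range(num_classes)]
    return list(zip(classes, counts))

# Hypothetical board values, for illustration only.
sample_values = [12.0, 15.5, 18.0, 21.0, 22.5, 25.0, 30.0, 31.5, 40.0, 42.0]
table = frequency_distribution(sample_values, 5)
for (lower, upper), count in table:
    print(f"{lower:6.1f} to {upper:6.1f}: {count}")
```

The same function can be run once on the manual values and once on the scanner values so that both tables use the same number of classes.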
b. Compute appropriate descriptive statistics for both manual and scanner values. Use these data
along with the frequency distribution developed in part (a).
ANSWER
Summary: The manual process had a lower mean and wider variation, and its distribution was slightly skewed to
the right. The scanner process had a higher mean and narrower variation, and was approximately normally distributed.
Relative variability
Manual Process 37%
Scanner Process 25%
The scanner process has the lower relative variability.
Question 2:
The commercial banking industry is undergoing rapid changes due to advances in technology and
competitive pressures in the financial services sector. The data file BANKS contains selected information
tabulated by Fortune concerning the revenues, profitability, and number of employees for the 51 largest
US Commercial Banks in terms of revenues. Use the information in this file to complete the following:
a. Compute the mean, median and standard deviation for the three variables: revenues, profits
and number of employees.
ANSWER:
                     Revenues    Profits     Employees
Mean                 6354.706    803.4314    21530.27
Median               3428        401         11000
Standard Deviation   7457.657    881.8123    21269.35
b. Convert the data for each variable to z values. Consider Mellon Bank Corporation, headquartered in
Pittsburgh. How does it compare to the average bank in the study on the three variables?
Discuss.
ANSWER:
Name                Revenues z value   Profits z value   Employees z value
MELLON BANK CORP.   -0.163684908       -0.03677809       0.280672628
Mellon Bank Corp. has slightly lower revenues, slightly lower profits and more employees than the average
bank. All three values lie within about 0.3 standard deviations of the mean.
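The z-value conversion can be sketched as follows. The function names and the sample list are assumptions for illustration, and the sketch assumes the sample standard deviation (n - 1 in the denominator); the original answer does not say which form was used.

```python
import statistics

def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

def z_scores(values):
    """Convert every observation to a z value using the sample standard deviation."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # assumption: sample (n - 1) standard deviation
    return [z_score(v, mean, std_dev) for v in values]

# Hypothetical data, not taken from the BANKS file.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for value, z in zip(sample, z_scores(sample)):
    print(f"{value}: z = {z:+.3f}")
```

A negative z value means the observation is below the mean and a positive z value means it is above the mean, which is how the Mellon Bank figures are read.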
c. As you can see by examining the data and by looking at statistics computed in part (a), not all
banks had the same revenue, same profit or the same number of employees. Which variable
had the greatest relative variation among the banks in the study?
ANSWER:
                       Revenues   Profits   Employees
Relative variability   117%       110%      99%
Revenues has the greatest relative variation among the banks.
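The percentages above are consistent with the coefficient of variation, that is, the standard deviation divided by the mean. A short check using the means and standard deviations from part (a) is sketched below; the function and variable names are chosen for illustration.

```python
def coefficient_of_variation(mean, std_dev):
    """Relative variability as a percentage: standard deviation / mean * 100."""
    return std_dev / mean * 100

# (mean, standard deviation) pairs copied from the part (a) answer.
summary = {
    "Revenues": (6354.706, 7457.657),
    "Profits": (803.4314, 881.8123),
    "Employees": (21530.27, 21269.35),
}

cv = {name: coefficient_of_variation(m, s) for name, (m, s) in summary.items()}
for name, value in cv.items():
    print(f"{name}: {value:.0f}%")

# Variable with the largest relative variability.
print(max(cv, key=cv.get))
```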
d. Calculate a new variable: profits per employee. Develop a frequency distribution and histogram
for this new variable. Also compute the mean, median and standard deviation for the new
variable. Write a short report that describes the profits per employee for the banks.
ANSWER:
The frequency table for the new variable is as below:
       Frequency
60     5
75     1
90     2
More   0
[Histogram: Frequency by Profit (in thousands) per Employee]
e. Referring to part (d), how many banks had a profit-per-employee ratio that exceeded 2 standard
deviations from the mean?
ANSWER:
Three banks had a profit-per-employee ratio that was more than 2 standard deviations from the mean.
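A count like this can be obtained as sketched below. The sketch assumes that "exceeded 2 standard deviations from the mean" covers both tails and that the sample standard deviation is used; the function name and the data are hypothetical.

```python
import statistics

def count_beyond(values, num_sd=2):
    """Count observations lying more than num_sd standard deviations from the mean."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # assumption: sample standard deviation
    # Both tails are counted (assumption about how the question is read).
    return sum(1 for v in values if abs(v - mean) > num_sd * std_dev)

# Hypothetical profit-per-employee values, not the real bank data.
ratios = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]
print(count_beyond(ratios))
```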
Question 3:
Zepolle’s Bakery makes a variety of bread types that it sells to supermarket chains in the area. One of
Zepolle’s problems is that the number of loaves of each type of bread sold each day by the chain stores
varies considerably, making it difficult to know how many loaves to bake. A sample of daily demand data
is contained in the file called BAKERY.
a. Which bread type has the highest average daily demand?
ANSWER:
d. Which bread type has the greatest relative variability? Which type has the lowest relative
variability?
ANSWER:
Bread Relative Variability
White 25%
Wheat 20%
Multigrain 23%
Black 21%
Cinnamon Raisin 24%
Sour Dough French 23%
Light Oat 22%
White bread has the greatest relative variability and Wheat bread has the lowest relative variability.
e. Assuming that these sample data are representative of demand during the year, determine how
many loaves of each type of bread should be made such that demand would be met on at least
75% of the days during the year.
ANSWER:
Bread               Loaves
White               703
Wheat               635
Multigrain          569
Black               435
Cinnamon Raisin     162
Sour Dough French   147
Light Oat           307
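Meeting demand on at least 75% of days amounts to baking at the 75th percentile of daily demand. A minimal sketch is given below, with a hypothetical function name and made-up demand values. It picks the smallest observed demand that covers the required share of days; spreadsheet percentile functions interpolate between observations, so they can return slightly different figures.

```python
import math

def loaves_to_cover(demands, coverage=0.75):
    """Smallest observed demand that is not exceeded on at least `coverage` of the days."""
    ordered = sorted(demands)
    # Number of days that must be covered, rounded up to a whole day.
    k = math.ceil(coverage * len(ordered))
    return ordered[k - 1]

# Hypothetical daily demand for one bread type, not the BAKERY data.
daily_demand = [400, 520, 610, 450, 700, 580, 630, 490]
print(loaves_to_cover(daily_demand))
```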
f. Create a new variable called Total Loaves Sold. On which day of the week is the average for total
loaves sold the highest?
ANSWER:
Day Day of the week Average of Total Loaves Sold
Sunday 1 2,196
Monday 2 2,947
Tuesday 3 2,388
Wednesday 4 2,336
Thursday 5 2,337
Friday 6 3,117
Saturday 7 1,772
Friday had the highest average total loaves sold.
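The Total Loaves Sold variable and its average by day can be sketched as below. The record layout (a day name plus a list of loaves per bread type), the function name, and the numbers are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def average_total_by_day(records):
    """Sum loaves across bread types per record, then average the totals by day."""
    totals = defaultdict(list)
    for day, loaves_by_type in records:
        totals[day].append(sum(loaves_by_type))  # Total Loaves Sold for that record
    return {day: mean(day_totals) for day, day_totals in totals.items()}

# Hypothetical records: (day of week, loaves sold per bread type).
records = [
    ("Monday", [700, 600, 500]),
    ("Friday", [750, 650, 550]),
    ("Monday", [680, 620, 480]),
    ("Friday", [800, 640, 560]),
]
averages = average_total_by_day(records)
best_day = max(averages, key=averages.get)
print(averages, best_day)
```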
Question 4:
The Franklin Tire Company is interested in demonstrating the durability of its steel-belted radial tires. To
do this, the managers have decided to put four tires on 100 different sport utility vehicles and drive them
throughout Alaska. The data collected indicate the number of miles (rounded to the nearest 1,000 miles)
that each of the SUVs traveled before one of the tires on the vehicle did not meet minimum federal
standards for tread thickness. The data file is called Franklin.
a) Construct a frequency distribution and histogram using eight classes. Use 51 as the lower limit of
the first class.
ANSWER:
b) The marketing department wishes to know the tread life of at least 50% of the tires, the 10% that
had the longest tread life, and the longest tread life of these tires. Provide this information to the
marketing department. Also provide any other significant items that point out the desirability of
this line of steel-belted tires.
ANSWER:
c) Construct a frequency distribution and histogram using 12 classes, using 51 as the lower limit of
the first class. Compare your results with those in part (a). Which distribution gives the best
information about the desirability of this line of steel-belted tires?
ANSWER:
The frequency distribution and histogram using 12 classes give the best information about the
desirability of this line of steel-belted tires.
Question 5:
Orlando, Florida, is a well-known, popular vacation destination visited by tourists from around the world.
Consequently, the Orlando International Airport is busy throughout the year. Among the variety of data
collected by the Greater Orlando Airport Authority is the number of passengers by airline. The file Orlando
Airport contains passenger data for July 2008. Suppose the airport manager is interested in analyzing the
column labeled “Total” for this data.
a) Using the 2^k ≥ n guideline, what is the minimum number of classes that should be used to display
the data in the “Total” column in a grouped data frequency distribution?
ANSWER:
The minimum number of classes is 5.
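The guideline asks for the smallest k such that 2^k is at least n, the number of observations. A small sketch follows; the function name is chosen for illustration. Under this rule, an answer of 5 classes corresponds to any n from 17 to 32.

```python
def min_classes(n):
    """Smallest k such that 2**k >= n."""
    k = 0
    while 2 ** k < n:
        k += 1
    return k

# Illustrative values of n, not the actual row count of the Orlando Airport file.
for n in (17, 32, 33, 100):
    print(n, min_classes(n))
```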
c) Based on your answer to part (b), construct and interpret a frequency histogram for the data.
ANSWER:
For most airlines, the total of enplaned and deplaned passengers (domestic and international, revenue and
non-revenue) is between 1,000 and 121,341.
Question 6:
The data in the file named Fast100 was collected by D. L. Green & Associates, a regional investment
management company that specializes in working with clients who wish to invest in smaller companies
with high growth potential. To aid the investment firm in locating appropriate investments for its clients,
Sandra Williams, an assistant client manager, put together a database on 100 fast-growing companies.
The database consists of data on eight variables for each of the 100 companies. Note that in some cases
data are not available. A code of 99 has been used to signify missing data. These data will have to be
omitted from any calculations.
a) Select the variable Sales. Develop a frequency distribution and histogram for Sales.
ANSWER:
Frequency Distribution
Histogram graph:
b) Compute the mean, median, and standard deviation for the Sales variable.
ANSWER:
c) Determine the interquartile range for the Sales variable.
ANSWER:
IQR for the sales variable = $270.9
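The interquartile range is the third quartile minus the first quartile. The sketch below uses Python's statistics.quantiles with the "inclusive" method, which is an assumption: quartile conventions differ between tools, so the result may not match the figure above exactly. The data are hypothetical.

```python
import statistics

def interquartile_range(values, method="inclusive"):
    """IQR = Q3 - Q1, using the quartile cut points from statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method=method)
    return q3 - q1

# Hypothetical sales figures, not the Fast100 data.
sales = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(interquartile_range(sales))
```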
d) Construct a box and whisker plot for the Sales variable. Identify any outliers. Discard the outliers
and recalculate the measures in part b.
ANSWER:
The outlier is identified below.
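A common box-and-whisker convention flags values more than 1.5 times the IQR below Q1 or above Q3 as outliers. The sketch below assumes that convention, the "inclusive" quartile method, and the sample standard deviation; the function names and data are hypothetical. Values equal to the missing-data code 99 are dropped first, as the question requires.

```python
import statistics

MISSING_CODE = 99  # missing-data code given in the question

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed 1.5 * IQR rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

def summarize(values):
    """Mean, median and sample standard deviation, as in part (b)."""
    return statistics.mean(values), statistics.median(values), statistics.stdev(values)

# Hypothetical sales figures, including one missing-data code.
raw_sales = [10, 20, 99, 30, 40, 50, 60, 70, 80, 500]
cleaned = [v for v in raw_sales if v != MISSING_CODE]
outliers = iqr_outliers(cleaned)
trimmed = [v for v in cleaned if v not in outliers]
print(outliers, summarize(trimmed))
```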