NAME: ____________________
Assignment 3
Data Files needed for these problems are in the Attached Files.
Problems:
2.25
a. Construct a bar chart, a pie or doughnut chart, and a Pareto chart.
The Bar Chart
[Figure: horizontal bar chart. Y-axis: Activity (Eating and Drinking, Educational Activities, Grooming, Leisure and Sports, Sleeping, Traveling, Working and Related Activities, Other). X-axis: Percentage, 0 to 40.]
The Pie Chart
[Figure: pie chart. Legend: Eating and Drinking, Educational Activities, Grooming, Leisure and Sports, Sleeping, Traveling, Working and Related Activities, Other.]
b. Which graphical method do you think is best for portraying these data?
I find the pie chart most useful because, visually, it is easier for me to see the area
of each section relative to the others when they meet at a central point.
However, I do find the automatic ranking you get from the Pareto chart useful.
c. What conclusions can you reach concerning how college students spend their day?
I’ve learned that my personal habits are quite in line with those of my peers.
2.31
a. Construct a side-by-side bar chart and a doughnut chart of project outcome and
category.
Side-by-side bar chart
[Figure: side-by-side bar chart. X-axis: Category (Film & Video, Games, Music, Technology). Y-axis: Targeted amount, 0 to 40,000. Legend: Successful, Not Successful.]
Doughnut Chart
[Figure: doughnut chart. Categories: Film & Video, Games, Music, Technology.]
b. What conclusions concerning the pattern of successful Kickstarter projects can you
reach?
Music is the most successful category, though it still fails about half the time. Technology appears to
be the least successful.
2.33
Stem | Leaf
5 | 34
6 | 9
7 | 4
8 |
9 | 38
2.35
a. Construct an ordered array.
91, 94, 97, 100, 102, 102, 103, 108, 111, 112, 115, 115, 116, 116, 117, 117, 117, 122, 122, 123, 124,
128, 129, 130, 132
b. Which of these two displays seems to provide more information? Discuss.
The stem-and-leaf display is nicer in that it is more compact and shows at a glance
which ranges, or bins, the numbers fall into.
c. How many gallons are most likely to be purchased?
Purchases are most likely in the 11-gallon range, which is the center of the distribution.
d. Is there a concentration of the purchase amounts in the center of the distribution?
Yes. The counts fall off on either side of the 11-gallon range, so the purchases are concentrated around 11 gallons.
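The answers to (b) through (d) can be checked by rebuilding the stem-and-leaf display from the ordered array in (a). The sketch below is only an illustration: the function name `stem_and_leaf` is made up for this example, and it assumes the values are recorded in tenths of a gallon, so the stem is the whole number of gallons and the leaf is the tenths digit.

```python
from collections import defaultdict

# Ordered array from part (a). Assumption: values are in tenths of a gallon,
# so 117 means 11.7 gallons (inferred from the "11-gallon range" wording).
purchases = [91, 94, 97, 100, 102, 102, 103, 108, 111, 112, 115, 115, 116,
             116, 117, 117, 117, 122, 122, 123, 124, 128, 129, 130, 132]

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted list of leaves."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        display[stem].append(leaf)
    return dict(display)

display = stem_and_leaf(purchases)
for stem in sorted(display):
    leaves = "".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

Under that assumption, the 11 stem holds 9 of the 25 values, with fewer values on the stems to either side, which matches the concentration described in (d).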
2.41
a. Construct a percentage histogram.
Percentage Histogram
[Figure: percentage histogram. Y-axis: percentage, 0 to 35. X-axis: bin boundaries from 207 to 360.9 in steps of 17.1.]
b. Construct a cumulative percentage polygon.
Cumulative Percentage Polygon
[Figure: cumulative percentage polygon. Y-axis: cumulative percentage, 0 to 120. X-axis: bin boundaries from 207 to 360.9 in steps of 17.1.]
c. What conclusions can you reach concerning the time Americans living in cities spend
commuting to work each week?
The most common commuting times fall in the 224.1 to 241.2 minute range, which accounts for 30% of the recorded average
commutes.
2.45
a. Construct a percentage histogram and a percentage polygon.
Percentage Histogram
[Figure: percentage histogram of call duration. Y-axis: percentage, 0 to 35. X-axis: bin boundaries (rotated labels not recoverable from the extraction).]
Using midpoints for the percentage polygon.
Percentage Polygon
[Figure: percentage polygon of call duration. Y-axis: percentage, 0 to 35. X-axis: 0 to 1200.]
b. Construct a cumulative percentage polygon.
Cumulative Percentage Polygon
[Figure: cumulative percentage polygon of call duration. Y-axis: cumulative percentage, 0 to 120. X-axis: 0 to 1200.]
c. What can you conclude about call center performance if a call duration target of less
than 240 seconds is set?
Reading the histogram and polygons leads to an overestimate of how often the call center meets its goal.
From the raw data, calls meet the goal of <240 seconds exactly 30 out of 50 times, or 60%.
The charts, however, put the figure in the high 70s, because the bin containing 240 seconds also includes a few data points above it.
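The gap between the exact figure and the chart reading can be reproduced with a small sketch. The durations and bin boundaries below are hypothetical, not the assignment's data, and the helper names `pct_below` and `next_edge` are made up for this example. The point is only that a bin straddling the target pulls in values above the target.

```python
# Hypothetical call durations in seconds (NOT the assignment's data).
durations = [120, 150, 180, 210, 230, 250, 270, 290, 330, 380]
# Hypothetical bin boundaries; the 240-second target falls inside [200, 300).
edges = [100, 200, 300, 400]
target = 240

def pct_below(values, cutoff):
    """Exact percentage of values strictly below the cutoff."""
    return 100 * sum(v < cutoff for v in values) / len(values)

def next_edge(cutoff, edges):
    """First bin boundary at or above the cutoff, where a chart reader would look."""
    return next(e for e in edges if e >= cutoff)

exact = pct_below(durations, target)
from_chart = pct_below(durations, next_edge(target, edges))
print(f"exact: {exact:.0f}%, read off the binned chart: {from_chart:.0f}%")
```

With these made-up numbers the exact share is 50%, while the cumulative percentage at the next bin boundary is 80%. Choosing bin boundaries so that one lands on the target would remove the discrepancy.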
2.47
a. Construct a histogram and a percentage polygon.
Histogram
[Figure: histogram. Y-axis: Count, 0 to 10. X-axis: Volume, 1.89 to 2.09.]
Using midpoints for the percentage polygon.
Percentage Polygon
[Figure: percentage polygon. Y-axis: Percent, 0 to 20. X-axis: Volume, 1.85 to 2.15.]
b. Construct a cumulative percentage polygon.
Cumulative Percentage Polygon
[Figure: cumulative percentage polygon. Y-axis: Percent, 0 to 120. X-axis: Volume, 1.85 to 2.15.]
c. On the basis of the results in (a) and (b), does the amount of soft drink filled in the
bottles concentrate around specific values?
Yes, there seems to be a concentration of samples very close to and just over 2.0.
2.49
a. Construct a time-series plot.
Time series
[Figure: time-series plot. Y-axis: Sales, 0 to 25. X-axis: Year, 2008 to 2018.]
b. Does there appear to be any change in annual sales over time? Explain.
Yes, sales increase slowly until 2013, then slowly fall
back until they are the same as they were in 2008.
2.51
a. Construct a scatter plot with Bundle score on the X axis and typical cost on the Y axis.
Scatter plot
[Figure: scatter plot. X-axis: Bundle Score, 0 to 100. Y-axis: Cost, 0 to 120.]
b. What conclusions can you reach about the relationship between Bundle score and
typical cost?
At the lower end of the cost range, the bundle scores are all over the place. Once you
start spending at least 40, the bundle scores tend to be 60 or higher.
2.73
a. Describe at least one good feature of this visual display.
Like a pie chart, you can easily see the breakdown of relative sizes of the categories.
b. Describe at least one bad feature of this visual display.
They picked a strange font for the breakdown on the right side. More importantly, their
claim is not supported by the graph, since no bin boundary falls at the 5-year mark.
Furthermore, some of the bins overlap and others do not.
c. Redraw the graph, by using the Exhibit 2.1 guidelines.
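[Figure: redrawn chart titled "Years until managers believe they need to make IT changes." Categories: Already done, 0-2, 2-4, 4-8, 9-12, 12+. Data labels: 8, 18, 14, 23, 25, 12 (which label belongs to which category is not recoverable from the extraction).]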
2.77
How do histograms and polygons differ in construction and use?
The polygon and the histogram show exactly the same data. They differ only in presentation, and in that the
polygon allows more than one data set in a single graphic.
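A minimal sketch of that relationship, reusing the ordered array from problem 2.35: both displays start from the same bin percentages, and only the plotting differs (bars across each bin versus points at each bin midpoint). The bin boundaries and the helper names `bin_percentages` and `midpoints` are assumptions chosen for this illustration.

```python
# Ordered array from problem 2.35 (25 values).
values = [91, 94, 97, 100, 102, 102, 103, 108, 111, 112, 115, 115, 116,
          116, 117, 117, 117, 122, 122, 123, 124, 128, 129, 130, 132]
# Assumed bin boundaries of width 10; any consistent choice would do.
edges = [90, 100, 110, 120, 130, 140]

def bin_percentages(values, edges):
    """Percentage of values in each half-open bin [lo, hi)."""
    pairs = zip(edges, edges[1:])
    counts = [sum(lo <= v < hi for v in values) for lo, hi in pairs]
    return [100 * c / len(values) for c in counts]

def midpoints(edges):
    """Midpoint of each bin, used as the x-coordinate of the polygon."""
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

pcts = bin_percentages(values, edges)

# Histogram: one bar per bin, spanning the bin's boundaries.
histogram_bars = list(zip(edges, edges[1:], pcts))
# Polygon: one point per bin, placed at the midpoint, joined by lines.
polygon_points = list(zip(midpoints(edges), pcts))

print(histogram_bars)
print(polygon_points)
```

Because the polygon reduces each bin to a single point, several polygons can share one set of axes, whereas overlapping bars from several data sets would hide each other.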
2.79
What are the advantages and disadvantages of using a bar chart, a pie chart, a doughnut chart,
and a Pareto chart?
Bar chart: + Can compare multiple data sets; shows full values, not just ratios.
- Works best with discrete data.
Pie chart: + Very easy to read at a quick glance and to compare ratios between groups.
- Only allows one data set; does not work well with many categories.
Pareto chart: + Combines a sorted bar chart with a cumulative percentage line.
- Requires familiarity with the chart to understand what’s going on; less visually appealing.
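The two ingredients of a Pareto chart, the descending sort and the running cumulative percentage, can be sketched as below. The category counts are hypothetical and the function name `pareto_table` is made up for this example; it is an illustration of the idea rather than a full charting routine.

```python
def pareto_table(counts):
    """Return (category, percent, cumulative percent) rows, largest first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, 100 * count / total, 100 * running / total))
    return rows

# Hypothetical category counts, used only to show the output shape.
counts = {"A": 10, "B": 50, "C": 25, "D": 15}

for category, pct, cumulative in pareto_table(counts):
    print(f"{category}: {pct:5.1f}%  cumulative {cumulative:5.1f}%")
```

The bars would be drawn from the percent column and the line from the cumulative column, which is what gives the chart its automatic ranking.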
2.81
What is the difference between a time-series plot and a scatter plot?
A time-series plot has time on the X axis, and the points are in sequential order. The points in a scatter plot do not have to be
in a specific order.
2.83
What are the three different ways to break down the percentages in a contingency table?
You can break it down by row, column or overall totals.
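The three breakdowns differ only in the denominator: the row total, the column total, or the grand total. A minimal sketch with a hypothetical 2x2 table follows; the row and column labels, the counts, and the function names are all made up for this example.

```python
# Hypothetical contingency table of counts: table[row][column].
table = {
    "Row 1": {"Col 1": 30, "Col 2": 10},
    "Row 2": {"Col 1": 20, "Col 2": 40},
}

def row_pct(table):
    """Each cell as a percentage of its row total."""
    return {r: {c: 100 * n / sum(cells.values()) for c, n in cells.items()}
            for r, cells in table.items()}

def column_pct(table):
    """Each cell as a percentage of its column total."""
    totals = {}
    for cells in table.values():
        for c, n in cells.items():
            totals[c] = totals.get(c, 0) + n
    return {r: {c: 100 * n / totals[c] for c, n in cells.items()}
            for r, cells in table.items()}

def overall_pct(table):
    """Each cell as a percentage of the grand total."""
    grand = sum(sum(cells.values()) for cells in table.values())
    return {r: {c: 100 * n / grand for c, n in cells.items()}
            for r, cells in table.items()}

print(row_pct(table)["Row 1"])      # shares within Row 1
print(column_pct(table)["Row 1"])   # Row 1's share of each column
print(overall_pct(table)["Row 1"])  # shares of the grand total
```

The same cell (30 in this made-up table) comes out as 75% of its row, 60% of its column, and 30% of the overall total, so the choice of breakdown depends on which comparison the question calls for.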
2.85
What type of insights can you gain from a contingency table that contains three variables that
you cannot gain from a contingency table that contains two variables?
The table with 3 variables can show you the combinations of any 2 of the 3, as well as the combined
effect of all 3 on the resulting measurement.
2.87
What is the difference between a time-series plot and sparklines?
Sparklines show overall trends for multiple sources at a time, often with lower resolution and detail
than you would expect in a time-series plot, which typically shows only one data set.