Creating More Effective Graphs
Summary of Keynote:
   o Limitations of some common graph forms.
   o Human perception and our ability to decode graphs.
   o Newer and more effective graph forms.
   o Trellis graphics and other innovative methods to present more than two variables.
   o General principles for creating effective graphs.

Figure 1. Pie Chart. This pie chart has five wedges. Please order them by size from largest to smallest.

A dot plot is more effective than a pie chart for ordering the sizes of A through E above. (5)
A table is often very effective for small data sets. (9)
Numbers in parentheses at the end of key points provide page numbers from Creating More Effective Graphs for those readers who want additional information or wish to see examples of the points made.
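The point that a dot plot makes ordering easier than a pie chart can be illustrated with a minimal plain-text sketch. The wedge sizes below are hypothetical (they are not read from Figure 1), and the function name `text_dot_plot` is an assumption made for this example. Each value becomes a position along one common horizontal scale, and the rows are sorted so the order can be read directly.

```python
def text_dot_plot(values, width=40):
    """Return the lines of a plain-text dot plot, largest value first."""
    top = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    # Sorting the rows puts the size order directly into the reading order.
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        # Every row shares one horizontal scale: 0 at the left, `top` at the right.
        column = round(value / top * width)
        lines.append(f"{name.rjust(label_width)} |{'.' * column}o  {value}")
    return lines


# Hypothetical wedge sizes (percent of the whole); NOT taken from Figure 1.
wedges = {"A": 23, "B": 20, "C": 21, "D": 19, "E": 17}
for line in text_dot_plot(wedges):
    print(line)
```

Even with values this close together, the dots land in different columns, which is much harder to see when the same numbers are drawn as wedge angles.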
Figure 2. Excel 3-D Bar Chart. Avoid putting extra dimensions in your charts. The pseudo three-dimensional charts are difficult to read. A two-dimensional chart is clearer than a pseudo three-dimensional one.

5. Avoid putting extra dimensions in your charts. The pseudo three-dimensional charts are difficult to read. If you know categories and values for each category, a two-dimensional chart is clearer than a pseudo three-dimensional one. (22-27)
6. The way to read pseudo three-dimensional bar charts depends on the software used to create them. However, we're rarely told what software was used. (26-27)
7. Data labels don't help; they confuse the reader even more. (358-359)
8. Choose options carefully when the new version of Excel is released. It will offer even more ways to confuse, mislead, and overwhelm the audience.
9. It is difficult to determine trends from stacked bar charts unless we are looking at the bottom category, since lengths without a common baseline are difficult to compare. (28-31)
10. Management, clients, or colleagues often request graphs with perceptual problems. Several suggestions for handling this problem follow:
   o Make an analogy with words. Point out the parallels between using fancy fonts and unnecessary dimensions. Show how annoying unnecessary words are in a sentence.
   o Offer options. Give them what they asked for but also the way you think it should be drawn. Often, they will see the benefit of the better version.
   o Show the poor graph and ask questions about the data. Then show a better figure and ask the same questions. It will be obvious that the better figure communicates the information more clearly.
   o Give them a book or article that points out the limitations of the figure they are asking for.
Robbins
Swiss Society of Statistics

   o Color saturation (52-53)
   o Density or amount of black (52-53)
   o Length or distance (54-55)
   o Position along a common scale (56-57)
   o Position along identical, nonaligned scales (58-59)
   o Slope (60-61)
   o Volume (50-51)

12. Dot plots allow us to decode the data by making judgments of positions along the common horizontal scale. Experiments have shown that this is the most accurate of the elementary graphical tasks. (57)
13. We judge position along identical nonaligned scales almost as accurately as position along a common scale. (58)
14. Cleveland's hierarchy of tasks, ordered by our ability to perform accurate judgments: (61)
   1. Position along a common scale
   2. Position along identical, nonaligned scales
   3. Length
   4. Angle - Slope
   5. Area
   6. Volume
   7. Color hue - Color saturation - Density
15. Creating a more effective graph involves choosing a graphical construction where the visual decoding uses tasks as high as possible on the ordered list of elementary graphical tasks while balancing this ordering with distance and detection. (62-63)
   o Distance: The closer together objects are, the easier it is to compare them. As distance between the objects increases, accuracy of judgments decreases.
   o Detection: Before we can perform any of the elementary tasks, we must be able to detect the data. We often cannot if data points overlap or are hidden in the axes or tick marks.
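The idea of choosing the graph form whose decoding task sits highest in the hierarchy can be written down as a small lookup. This is only a sketch: the ranks follow the ordered list above, but the mapping from graph form to the task it mainly relies on (`GRAPH_FORM_TASK`) is an assumption made for illustration, and it ignores the distance and detection considerations that must be balanced against the ordering.

```python
# Ranks follow the ordered list above; tasks that share a line share a rank.
TASK_RANK = {
    "position along a common scale": 1,
    "position along identical, nonaligned scales": 2,
    "length": 3,
    "angle": 4,
    "slope": 4,
    "area": 5,
    "volume": 6,
    "color hue": 7,
    "color saturation": 7,
    "density": 7,
}

# ASSUMPTION: the main decoding task for each form. Illustrative only.
GRAPH_FORM_TASK = {
    "pie chart": "angle",
    "stacked bar (upper segments)": "length",
    "dot plot": "position along a common scale",
    "multipanel dot plot (across panels)": "position along identical, nonaligned scales",
}


def rank_graph_forms(form_to_task):
    """Return graph forms ordered from most to least accurately decoded."""
    return sorted(form_to_task, key=lambda form: TASK_RANK[form_to_task[form]])


for form in rank_graph_forms(GRAPH_FORM_TASK):
    task = GRAPH_FORM_TASK[form]
    print(f"{TASK_RANK[task]}  {form}  ({task})")
```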
An italicized statement signifies that the statement is a direct quote from Cleveland's The Elements of Graphing Data.
It is helpful to use two sets of labels for the pair of scale lines when using logarithmic axes. (74-75)
Send an email to [email protected] for a macro to draw dot plots with Excel.
These useful web sites describe how to draw dot plots with Excel:
http://www.processtrends.com/pg_charts_dot_plots.htm
http://www.exceluser.com/dash/dotplot.htm
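For the advice on logarithmic axes, one reading (an assumption here) is that one scale line carries the exponents and the opposite scale line carries the values in original units. The hypothetical helper `dual_log_labels` below sketches how the two sets of labels could be generated for the same tick positions.

```python
def dual_log_labels(min_exponent, max_exponent, base=10):
    """Return (log-unit label, original-unit label) pairs, one per tick.

    The first label suits one scale line (log units), the second suits
    the opposite scale line (original units). Both describe the same tick.
    """
    pairs = []
    for exponent in range(min_exponent, max_exponent + 1):
        value = base ** exponent
        pairs.append((str(exponent), f"{value:,}"))
    return pairs


# Ticks at 10^0 through 10^3, labelled both ways.
for log_label, original_label in dual_log_labels(0, 3):
    print(f"log10 = {log_label:>2}   value = {original_label:>6}")
```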
Multiple line plots are often used to show a day-of-the-week or month-of-the-year effect. However, it is difficult to follow the trend of a given week or month. It is also difficult to examine the day-of-the-week or month-of-the-year effect with a time series plot.
Month plots are useful for showing a day-of-the-week or month-of-the-year effect. (102-103, 106-107)
Month plots are also called cycle plots. (103)
Figure 3. Month plot of items sold. First, all Monday values are plotted, then all Tuesday values, and so forth. Each dot represents a week (from left to right, the eight-week period). The trend for each day is shown clearly, yet we still see daily effects, such as sales being highest on Wednesdays. We also see that sales are increasing on Mondays and Wednesdays over the eight-week period but decreasing on Tuesdays. The horizontal lines represent the mean for each day.
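The arrangement described in the caption can be sketched as a data-layout step, independent of any plotting package. The function name `month_plot_layout` and the sales figures are hypothetical, not the data behind Figure 3. Weekly rows are regrouped so that all values for one day sit side by side in week order, each group gets its own run of x positions, and the group mean gives the height of the horizontal line.

```python
from statistics import mean


def month_plot_layout(weekly_rows, day_names):
    """Regroup week-by-week rows into one series per day for a month (cycle) plot.

    weekly_rows: list of rows in time order, one value per day in each row.
    Returns one dict per day with x positions, y values, and the day's mean.
    """
    layout = []
    next_x = 0
    # zip(*weekly_rows) transposes the rows: first all Monday values, and so on.
    for day, series in zip(day_names, zip(*weekly_rows)):
        xs = list(range(next_x, next_x + len(series)))
        layout.append({"day": day, "x": xs, "y": list(series), "mean": mean(series)})
        next_x += len(series) + 1  # leave a gap between day groups
    return layout


# Hypothetical items sold, three weeks by three days (NOT the Figure 3 data).
rows = [[10, 14, 20], [12, 13, 22], [14, 12, 24]]
for group in month_plot_layout(rows, ["Mon", "Tue", "Wed"]):
    print(group["day"], group["x"], group["y"], "mean:", group["mean"])
```

Plotting each group's `y` against its `x`, with a horizontal line at `mean`, gives the month plot: the within-group slope shows the trend for that day, and the heights of the lines show the day-of-the-week effect.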
Trellis Graphics and Other Methods for Showing More than Two Variables
29. Trellis displays provide a framework for multivariate data. They are often extremely useful. (118-125, 327)
30. A major feature of Trellis displays is multipanel conditioning. (119)
31. Multipanel plots often avoid the need for color. (321)
32. Terminology for multipanel plots and Trellis:
   Multipanel Plot: any plot with more than one panel.
   Small Multiples: a series of graphics, showing the same combination of variables, indexed by changes in another variable. Tufte (2001, p. 170)
   Trellis Display: Small multiples with structure imposed; e.g., ordering of panels.
      o Trellis trademarked by Insightful Corporation
      o Often called lattice displays
      o Term trellis also used for a graphics system in S-Plus
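Multipanel conditioning and "structure imposed" can be sketched without a graphics system. In the example below, records are split into one panel per level of a conditioning variable, the panels are ordered by their median (one possible structure, chosen here as an assumption), and a single set of axis limits is computed so that every panel uses an identical scale. The field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def trellis_panels(records, condition_on, value):
    """Split records into panels by `condition_on`, ordered by panel median of `value`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[condition_on]].append(record)
    return sorted(groups.items(), key=lambda item: median(r[value] for r in item[1]))


def shared_limits(records, value):
    """One (low, high) pair for all panels, so the scales are identical."""
    values = [record[value] for record in records]
    return min(values), max(values)


# Hypothetical measurements; field names and numbers are illustrative only.
data = [
    {"site": "North", "variety": "V1", "y": 31},
    {"site": "North", "variety": "V2", "y": 35},
    {"site": "South", "variety": "V1", "y": 48},
    {"site": "South", "variety": "V2", "y": 44},
    {"site": "East", "variety": "V1", "y": 39},
    {"site": "East", "variety": "V2", "y": 41},
]

low, high = shared_limits(data, "y")
for site, rows in trellis_panels(data, "site", "y"):
    print(f"panel {site}: scale {low}-{high}, values {[r['y'] for r in rows]}")
```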
33. The Trellis plot of the barley data clearly shows an anomaly in the data that statistical analyses missed. (120-123)
[Figure 4: Trellis dot plot of the barley data. One panel per site: Waseca, Crookston, Morris, University Farm, Duluth, Grand Rapids. Varieties listed in each panel: Trebi, Wisconsin No. 38, No. 457, Glabron, Peatland, Velvet, No. 475, Manchuria, No. 462, Svansota. Plotting symbols distinguish 1931 and 1932. Horizontal scale from 20 to 60.]
Figure 4. Barley Example. This figure shows the power of visualization and of Trellis displays by showing the anomaly at the Morris site in the barley data set, which was not discovered in 60 years of conventional statistical analyses.

34. Bar charts get cluttered more quickly than dot plots. (124-125)
Terminology (156-157)
Make the data stand out. Deemphasize non-data elements. (158-161)
Look at the graph and notice what you see first. The answer should be the data (or model) and not grid lines, long labels, or other graphical elements. (158-161)
Eliminate unnecessary clutter. (Chapter 6)
Use visually prominent graphical elements to show the data. (162-163)
Do not clutter the interior of the scale line rectangle. (174-177)
Some ways to reduce clutter: (174-177)
   o Show axis labels in thousands, millions, or billions instead of including strings of zeros.
   o Label an axis as percent or dollars rather than including a percent sign or dollar symbol at each tick mark label.
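The two clutter-reducing suggestions above can be sketched as a small labelling helper. The name `scaled_tick_labels` and the sample tick values are assumptions for illustration. The unit moves into the axis title once, and each tick label keeps only the short number.

```python
def scaled_tick_labels(ticks):
    """Return (unit, labels): tick labels with strings of zeros moved into a unit name."""
    largest = max(abs(tick) for tick in ticks)
    scales = ((1_000_000_000, "billions"), (1_000_000, "millions"), (1_000, "thousands"))
    for divisor, unit in scales:
        if largest >= divisor:
            return unit, [f"{tick / divisor:g}" for tick in ticks]
    return "", [f"{tick:g}" for tick in ticks]


def axis_title(quantity, unit):
    """State the unit once in the title instead of at every tick mark."""
    return f"{quantity} ({unit})" if unit else quantity


# Hypothetical dollar amounts on an axis.
unit, labels = scaled_tick_labels([0, 500_000, 1_000_000, 1_500_000])
print(axis_title("Revenue, dollars", unit))
print(labels)
```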
43. Deemphasize grid lines and distinguish grid lines from data. (158-163, 184-185)
44. Visual clarity must be preserved under reduction and reproduction. (190-192)
Scales
45. Whether zero needs to be included on scales is a controversial topic. (232-237)
46. A bar graph without zero is misleading. (238-239)
47. Do not use evenly spaced tick marks for uneven intervals on an arithmetic scale. (286-291, 334-335)
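The last point about uneven intervals can be made concrete with a short sketch. On an arithmetic scale, a tick's position should be proportional to its value, so unevenly spaced values get unevenly spaced ticks. The function name `tick_positions`, the axis length, and the sample years are hypothetical.

```python
def tick_positions(values, axis_length=100.0):
    """Place each tick at a distance proportional to its value on an arithmetic scale."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("need at least two distinct values to define a scale")
    return [(value - low) / (high - low) * axis_length for value in values]


# Hypothetical observation years with gaps of 10, 5, and 1.
years = [1990, 2000, 2005, 2006]
for year, position in zip(years, tick_positions(years, axis_length=160)):
    print(year, position)
```

Spacing these four ticks evenly would make the one-year step from 2005 to 2006 look as long as the ten-year step from 1990 to 2000, which distorts any trend read from the graph.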
Software
Most graphs in this presentation were drawn using S-Plus or R.
R is freely downloadable from www.r-project.org.
Send email to [email protected] for an Excel macro to create dot plots with Excel.
References
Cleveland, William S. 1994. The Elements of Graphing Data (Revised Edition), Hobart Press, Summit, NJ. (1st Edition, Wadsworth, Inc., Monterey, CA, 1985)
Cleveland, William S. 1993. Visualizing Data, Hobart Press, Summit, NJ.
Robbins, Naomi B. 2005. Creating More Effective Graphs, Wiley, Hoboken, NJ.
Robbins, Naomi B. 2006. Dot Plots: A Useful Alternative to Bar Charts, http://www.b-eye-network.com/view/2468
Tufte, Edward. 2001. The Visual Display of Quantitative Information, 2nd edition, Graphics Press, Cheshire, CT. (First edition 1983)
Walkenbach, John. 2003. Excel Charts, Wiley, Hoboken, NJ.