Statistics Chapter 4: Data Visualization
Eric Zhang Lu
Acknowledgements:
Lan Liang, School of Communication, HKBU
Katy Börner, School of Informatics and Computing, Indiana University, USA
Robert Putnam, Research Computing, IS&T, Boston University, USA
Outline
Concepts of Data Visualization
– Value, goals
– History
Elements of Visualization
– Choosing the Right Chart
– Color, Size, Text, Titles, Labels
Design of Visualization
– A Visualization Workflow
Case study: COVID-19
– Mask wearing trends
– People's concerns about the impact of COVID-19
Value of Information Visualization
With minimal effort, the human visual system can process a large amount
of information in parallel.
Goals for Information Visualization
Provide insight
– Explain data to solve specific problems
– Support the analytical task, showing the comparison or causality
– Explore large data sets for better understanding
Brief History
• A graph by Playfair (1821) shows the price of wheat, weekly wages, and the
reigning monarch over a two-hundred-fifty-year span from 1565 to 1820.
• It integrates bar charts and a line graph.
Single number
– Showing a count, frequency, mean, median, etc.
Icon Array
Bar
Simple text with a single large number
Having some numbers does not mean that you need a graph!
A simple sentence would suffice in this particular example.
Concern: Overload. Use simple text if you have a single large number you want to convey.
Icon Array
Icon arrays
– One icon (e.g. square, circle) is repeated 10, 100, or 1,000 times, and then some of the
icons are colored to represent a percentage or proportion.
– Icon arrays have been shown to be especially effective for people with low numeracy
skills.
Combining an icon array with a single large number
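The icon-array idea above can be sketched in a few lines of Python. This is only a text-based illustration: the function name `icon_array`, the default of 100 icons in rows of 10, and the 27% figure are assumptions made for the example, not values from the slides.

```python
def icon_array(proportion, total=100, per_row=10, filled="●", empty="○"):
    """Return a text icon array: `total` icons, with the first ones filled.

    The number of filled icons is the proportion rounded to the nearest icon.
    """
    if not 0 <= proportion <= 1:
        raise ValueError("proportion must be between 0 and 1")
    n_filled = round(proportion * total)
    icons = filled * n_filled + empty * (total - n_filled)
    # Break the long run of icons into rows of `per_row` icons each.
    rows = [icons[i:i + per_row] for i in range(0, total, per_row)]
    return "\n".join(rows)


if __name__ == "__main__":
    share = 0.27  # hypothetical proportion, not from the slides
    print(f"{share:.0%}")      # the single large number
    print(icon_array(share))   # the icon array underneath it
```

Printing the number first and the grid second mirrors the combination of a single large number with an icon array.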
Donut or Pie Chart
Column charts and bar charts are similar. Their difference lies in orientation.
– A bar chart is horizontally oriented, whereas a column chart is vertically oriented.
Side-by-side column charts (or bar charts) are possibly the most ubiquitous
way to show how two or more numbers compare.
– Side-by-side column charts (or bar charts) are also referred to as clustered columns or
bars.
Limitation of side-by-side column charts (or bar charts)
Effectiveness really stops at two comparisons
– It is difficult for people to compare nonadjacent bars.
Once we start slicing a third, fourth, or fifth column into each group, we
are asking viewers' brains to do too much.
Slopegraphs are perfect for highlighting the story of how just one category
decreased while the other categories increased,
or for showing that one category changed at a much faster rate than the others.
Stories
– Our key indicators met the pre-established targets in three out of seven areas
– Regions A and B did not meet quarterly benchmarks
– Here is how our groups compared to the national norm
– Students in the Chemistry Department are above average on final exams this year
– We did not meet our fundraising goal, but we got very close
– Our click rate was 25.6% while the industry standard is 4.3%
– …
Displaying Relative Performance
Benchmark Line
Combo chart
Bullet Chart
Indicator Dots
Benchmark Line
Adding a benchmark line to a graph gives loads of context for the viewer.
This simple line packs so much power.
Combo Chart
Bullet Graph
Indicator dots are little colored markers that show up next to a target value
listed in a table.
They indicate whether or not the target has been met.
– Show students who are failing (score under 50)
– Show students who are on the borderline (score between 50 and 60)
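The two score thresholds above can be turned into a small rule that decides which dot, if any, appears next to each row of a table. The sketch below is an illustration only: the color names, the student names and scores, and the choice to treat exactly 50 as borderline and exactly 60 as passing are all assumptions.

```python
def indicator(score, fail_below=50, borderline_below=60):
    """Return the indicator dot for a score, or an empty string for no dot.

    Assumed boundaries: score < 50 is failing, 50 <= score < 60 is borderline.
    """
    if score < fail_below:
        return "red"
    if score < borderline_below:
        return "amber"
    return ""


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    scores = {"Student A": 42, "Student B": 55, "Student C": 78}
    for name, score in scores.items():
        print(f"{name:<10} {score:>3}  {indicator(score)}")
```

Rows that meet the target get no marker, so the dots draw the eye only to the students who need attention.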
Choosing the Right Chart
Pie/Donut
Stacked Bar
Histogram
Tree Map
Map
Pie Chart
Swap the pie for a bar chart if there are more than four categories in your
dataset.
A histogram looks like a standard bar or column chart with absolutely no space
between the bars or columns. Each bar counts the values that fall into one
interval (bin).
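As a rough sketch of what sits behind a histogram, the function below counts values into equal-width bins and prints one bar per bin. The function name `histogram_counts`, the bin width of 10, and the exam scores are assumptions for illustration. Empty bins are simply omitted in this simplified version.

```python
def histogram_counts(values, bin_width, start=None):
    """Count values per equal-width bin; returns {bin_left_edge: count}.

    Each bin covers left_edge <= value < left_edge + bin_width.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if start is None:
        start = min(values)
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)  # which bin the value falls in
        left = start + index * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical exam scores, for illustration only.
    scores = [52, 58, 61, 67, 67, 70, 74, 75, 79, 88]
    for left, count in histogram_counts(scores, bin_width=10, start=50).items():
        print(f"{left}-{left + 10}: " + "█" * count)
```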
Treemaps are square or rectangle shapes that represent parts, all positioned
inside a larger square or rectangle that represents a whole.
Treemaps can visualize hierarchy
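One simple way to place the parts inside the whole is a "slice" layout, which cuts the enclosing rectangle into strips whose areas are proportional to the values. The sketch below uses that approach as an assumption. Real treemap tools often use other layouts, and the names and sizes here are hypothetical. Calling the function again on one of the resulting rectangles, with the split direction flipped, gives one level of hierarchy.

```python
def slice_layout(sizes, x, y, width, height, vertical=True):
    """Split one rectangle into strips with areas proportional to `sizes`.

    Returns {name: (x, y, width, height)} for each part.
    """
    total = sum(sizes.values())
    rects = {}
    offset = 0.0
    for name, size in sizes.items():
        if vertical:   # strips placed side by side along the x axis
            w = width * size / total
            rects[name] = (x + offset, y, w, height)
            offset += w
        else:          # strips stacked along the y axis
            h = height * size / total
            rects[name] = (x, y + offset, width, h)
            offset += h
    return rects


if __name__ == "__main__":
    # Hypothetical parts of a whole drawn in a 100 x 60 rectangle.
    top = slice_layout({"A": 50, "B": 30, "C": 20}, 0, 0, 100, 60)
    print(top)
    # One level of hierarchy: split part A into its own children.
    print(slice_layout({"A1": 3, "A2": 1}, *top["A"], vertical=False))
```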
Stories:
– Things changed/didn’t change
– After is so much better/worse/the same as before
– We started this intervention and the outcome improved as a result
– When the legislation took effect, we saw peaks in services for several years
– Sales increased 0.5% over the last quarter
– Maternal services have steadily increased their proportion of hospital use
– Here’s how much change occurred on this measure in the last decade
Visualize How Things Changed Over Time
Line Graph
Area Graph
Stacked Column
Deviation Bar
Slopegraph
Dot Plot
Area
Line Graph
Line Graphs are the most common way to depict trends over time
Area Graph
Area Graph
– A line graph with each segment stacked on
top of one another.
– Can show both parts of whole and trends.
Stacked Columns
Good: While these visualizations are accurate, the color red represents cooler
temperatures and doesn’t resonate with the information the data is trying to
portray.
Great: Swapping semantically resonant colors into the same dashboard really heats
things up. With a little extra thought about color choices and palettes, the data
points now tell the story they were meant to tell, and faster.
Color Blind Friendly - Accessibility
1. Avoid problematic color combinations, e.g., red
& green / green & brown / green & blue…
https://venngage.com/blog/color-blind-friendly-palette/
Size
Text
• Readability is essential.
• Make the most important information stand out.
Titles
Find the sweet spot. Too many mark labels can be very
distracting.
Try labeling the most recent mark, or min/max. Save
additional and more detailed information for tooltips.
A Visualization Workflow
A needs-driven workflow
• How to select visualization types?
• How to visually encode data?
• How to visualize dynamically? – Animation and Interaction
Types and Levels of Analysis
Levels
Types
Needs‐Driven Workflow Börner (2014) – Visual Insights
Line graph Map visualization
Needs‐Driven Workflow – Select Visualization Type
Börner (2014)
Visualization Types
Visualization Type Selection: Temporal
Temporal data analysis and visualization answer the “WHEN” question and can
help to
– Understand the temporal distribution of datasets
– Identify growth rates, latency to peak times, or decay rates
– See patterns in time-series data, such as trends, seasonality, or bursts.
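Two of the quantities listed above, growth rate and latency to peak, can be computed directly from a time series before anything is drawn. The sketch below makes two assumptions: growth rate is taken as period-over-period relative change, and latency to peak is taken as the number of periods from the first observation to the maximum. The weekly counts are hypothetical.

```python
def growth_rates(series):
    """Period-over-period growth: (current - previous) / previous.

    Assumes no value in the series is zero.
    """
    return [(cur - prev) / prev for prev, cur in zip(series, series[1:])]


def latency_to_peak(series):
    """Number of periods from the first observation to the maximum value."""
    return series.index(max(series))


if __name__ == "__main__":
    weekly_counts = [10, 20, 40, 30, 15]  # hypothetical data
    print(growth_rates(weekly_counts))
    print(latency_to_peak(weekly_counts))
```

Positive rates before the peak and negative rates after it correspond to the growth and decay phases of the series.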
Visualization Type Selection: Temporal
Temporal trends
– Increasing
– Decreasing
– Stable
– Cyclic
Visualization Type Selection: Geospatial
Visualization Type Selection: Topical
Concept maps are network graphs that show the relationships among concepts.
Visualization Type Selection: Network
Network overlays on geospatial maps
– Use a geospatial reference system to place nodes
– E.g., airport traffic
Needs‐Driven Workflow – Visually Encode data
Börner (2014)
Data Scale Types
Categorical (nominal): A categorical scale, also called a nominal or
category scale, is qualitative. Categories are assumed to be non-overlapping.
Graphic Variable Types
Quantitative
– Position: x, y; possibly z
– Form: Size
– Color: Value (Lightness, Brightness)
Qualitative
– Form: Shape, Orientation (Rotation)
– Color: Hue (tint)
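As a minimal sketch of visual encoding, the code below maps a quantitative attribute to a quantitative graphic variable (size) and a categorical attribute to a qualitative one (shape). The field names, the shape list, and the size range of 4 to 20 are assumptions chosen for the example.

```python
SHAPES = ["circle", "square", "triangle"]  # qualitative variable: form (shape)


def linear_scale(value, domain, out_range):
    """Map value from domain (d0, d1) linearly onto out_range (r0, r1)."""
    d0, d1 = domain
    r0, r1 = out_range
    if d1 == d0:
        raise ValueError("domain must span more than one value")
    t = (value - d0) / (d1 - d0)
    return r0 + t * (r1 - r0)


def encode(records, size_range=(4, 20)):
    """Assign a shape per category and a size per quantitative value."""
    values = [r["value"] for r in records]
    categories = sorted({r["category"] for r in records})
    shape_for = {c: SHAPES[i % len(SHAPES)] for i, c in enumerate(categories)}
    domain = (min(values), max(values))
    return [
        {
            "label": r["label"],
            "shape": shape_for[r["category"]],
            "size": linear_scale(r["value"], domain, size_range),
        }
        for r in records
    ]


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    data = [
        {"label": "a", "category": "north", "value": 0},
        {"label": "b", "category": "south", "value": 10},
        {"label": "c", "category": "north", "value": 5},
    ]
    for mark in encode(data):
        print(mark)
```

With more than three categories the shapes repeat, which is a reminder that qualitative variables only distinguish a limited number of classes.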
Dynamic Visualization
Shneiderman's Mantra
User-Interface Interaction
– Immediate interaction not only allows direct manipulation of the displayed
visual objects but also lets users select what is displayed (Card et al.,
1999)
– Shneiderman (1996) summarizes six types of interface functionality
Overview
Zoom
Filtering
Details on demand
Relate
History
– “Overview first, zoom and filter, then details-on-demand.”
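The mantra can be read as three successive operations on a dataset. The sketch below illustrates that reading on a toy table. The function names, the record fields, and the sales figures are all hypothetical and stand in for whatever an interactive tool would do behind its controls.

```python
from collections import Counter

# Hypothetical records, for illustration only.
RECORDS = [
    {"id": 1, "region": "A", "year": 2019, "sales": 120},
    {"id": 2, "region": "A", "year": 2020, "sales": 135},
    {"id": 3, "region": "B", "year": 2019, "sales": 80},
    {"id": 4, "region": "B", "year": 2020, "sales": 95},
    {"id": 5, "region": "B", "year": 2021, "sales": 110},
]


def overview(records):
    """Overview first: summarize the whole dataset (record count per region)."""
    return Counter(r["region"] for r in records)


def zoom_and_filter(records, region=None, years=None):
    """Zoom and filter: keep one region and/or an inclusive year range."""
    selected = records
    if region is not None:
        selected = [r for r in selected if r["region"] == region]
    if years is not None:
        first, last = years
        selected = [r for r in selected if first <= r["year"] <= last]
    return selected


def details(records, record_id):
    """Details on demand: return the full record for one selected item."""
    for r in records:
        if r["id"] == record_id:
            return r
    return None


if __name__ == "__main__":
    print(overview(RECORDS))
    subset = zoom_and_filter(RECORDS, region="B", years=(2020, 2021))
    print([r["id"] for r in subset])
    print(details(subset, 4))
```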
Two Interaction Approaches
User-Interface Interaction
– Overview + detail
First, an overview provides overall patterns to users; then details about the part of interest to the
user can be displayed. (Card et al., 1999)
Spatial zooming & semantic zooming are usually used
– Focus + context
Details (focus) and overview (context) are shown dynamically in the same view. Users can change the
region of focus dynamically.
Information Landscape (Andrews, 1995)
Cone Tree (Robertson et al., 1991)
Fish-eye (Furnas, 1986)
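To make focus + context concrete, the sketch below applies a one-dimensional fisheye distortion: positions near the focus are spread apart and positions far from it are compressed, while both ends of the view stay in place. The distortion formula g(t) = (d + 1)t / (dt + 1) is one commonly used for graphical fisheye views. It is an assumption here and is not taken from the slides or from Furnas (1986).

```python
def fisheye(x, focus, distortion=3.0):
    """Map a position x in [0, 1] to its distorted position.

    Positions near `focus` are magnified; distortion=0 leaves x unchanged.
    """
    if not 0 <= x <= 1 or not 0 <= focus <= 1:
        raise ValueError("x and focus must lie in [0, 1]")
    if x == focus:
        return x
    # Distance from the focus to the view boundary on the side where x lies.
    boundary = 1.0 - focus if x > focus else focus
    t = abs(x - focus) / boundary  # normalized distance from focus, in (0, 1]
    g = (distortion + 1) * t / (distortion * t + 1)
    return focus + g * boundary if x > focus else focus - g * boundary


if __name__ == "__main__":
    # Eleven evenly spaced positions, with the focus in the middle of the view.
    for i in range(11):
        x = i / 10
        print(f"{x:.1f} -> {fisheye(x, focus=0.5):.3f}")
```

Changing `focus` and recomputing the positions corresponds to the user moving the region of focus.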
Interactive Vis. Example: Gapminder World
http://www.gapminder.org/world/
https://www.youtube.com/watch?v=BPt8ElTQMIg (Hans Rosling telling the story)
Case study: COVID-19
Your dashboard’s purpose is to help guide the reader’s eye through more than
one visualization, tell the story of each insight, and reveal how they’re
connected.
The better your dashboard design, the more easily your users will discover
what’s happening, why, and what’s most important. Take into account how
you’re guiding their eyes across the dashboard.
Guide the user
Don’t leave people high and dry without guidance on how to use a visualization.
Try swapping a filter title with explicit directions about how to navigate.
Rule of three