Data Visualization Literacy


COLLOQUIUM

PAPER
Data visualization literacy: Definitions, conceptual
frameworks, exercises, and assessments
Katy Börner(a,b,1), Andreas Bueckle(a), and Michael Ginda(a)
(a) School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408; and (b) Educational Technology/Media Centre, Dresden University of Technology, 01062 Dresden, Germany

Edited by Ben Shneiderman, University of Maryland, College Park, MD, and accepted by Editorial Board Member Eva Tardos December 6, 2018 (received for
review May 20, 2018)

SOCIAL SCIENCES

In the information age, the ability to read and construct data visualizations becomes as important as the ability to read and write text. However, while standard definitions and theoretical frameworks to teach and assess textual, mathematical, and visual literacy exist, current data visualization literacy (DVL) definitions and frameworks are not comprehensive enough to guide the design of DVL teaching and assessment. This paper introduces a data visualization literacy framework (DVL-FW) that was specifically developed to define, teach, and assess DVL. The holistic DVL-FW promotes both the reading and construction of data visualizations, a pairing analogous to that of both reading and writing in textual literacy and understanding and applying in mathematical literacy. Specifically, the DVL-FW defines a hierarchical typology of core concepts and details the process steps that are required to extract insights from data. Advancing the state of the art, the DVL-FW interlinks theoretical and procedural knowledge and showcases how both can be combined to design curricula and assessment measures for DVL. Earlier versions of the DVL-FW have been used to teach DVL to more than 8,500 residential and online students, and results from this effort have helped revise and validate the DVL-FW presented here.

data visualization | information visualization | literacy | assessment | learning sciences

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Creativity and Collaboration: Revisiting Cybernetic Serendipity,” held March 13–14, 2018, at the National Academy of Sciences in Washington, DC. The complete program and video recordings of most presentations are available on the NAS website at www.nasonline.org/Cybernetic_Serendipity.
Author contributions: K.B. designed research; K.B., A.B., and M.G. performed research; and K.B., A.B., and M.G. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. B.S. is a guest editor invited by the Editorial Board.
Published under the PNAS license.
(1) To whom correspondence should be addressed. Email: [email protected].
Published online February 4, 2019.
www.pnas.org/cgi/doi/10.1073/pnas.1807180116 PNAS | February 5, 2019 | vol. 116 | no. 6 | 1857–1864

The invention of the printing press created a mandate for universal textual literacy; the need to manipulate many large numbers created a need for mathematical literacy; and the ubiquity and importance of photography, film, and digital drawing tools posed a need for visual literacy. Analogously, the increasing availability of large datasets, the importance of understanding them, and the utility of data visualizations to inform data-driven decision making pose a need for universal data visualization literacy (DVL). Like other literacies, DVL aims to promote better communication and collaboration, empower users to understand their world, build individual self-efficacy, and improve decision making in businesses and governments.

Pursuit of Universal Literacy
In what follows, we review definitions and assessments of textual, mathematical, and visual literacy and discuss an emerging consensus around the definition and assessment of DVL.

Textual literacy, according to the Organisation for Economic Co-operation and Development’s (OECD’s) Programme for International Student Assessment (PISA), is the process of “understanding, using, reflecting on and engaging with written texts, in order to achieve one’s goals, to develop one’s knowledge and potential, and to participate in society” (1). Major tests for textual literacy are issued by PISA (2) and are regularly administered in over 70 countries to measure how effectively they are preparing students to read and write text. Another major international test, Progress in International Reading Literacy Study (PIRLS), has measured reading aptitude for fourth graders every 5 years since 2001. For advanced students, the Graduate Record Examination Subject Tests are widely used to assess verbal reasoning and analytical writing skills for people applying to graduate schools (3, 4). A review of major international education surveys with varying degrees of global coverage and diverse intended age groups can be found in ref. 5.

Mathematical literacy (also referred to as “numeracy”) has been defined as an “understanding of the real number line, time, measurement, and estimation” as well as an “understanding of ratio concepts, notably fractions, proportions, percentages, and probabilities” (6). PISA defines it as “an individual’s capacity to formulate, employ, and interpret mathematics in a variety of contexts,” including “reasoning mathematically and using mathematical concepts, procedures, facts and tools to describe, explain and predict phenomena.” PISA administers standardized tests for math, problem solving, and financial literacy (7). The PISA 2015 Draft Mathematics Framework (8) explains the theoretical underpinnings of the assessment, the formal definition of mathematical literacy, the mathematical processes that students undertake when using mathematical literacy, and the fundamental mathematical capabilities that underlie those processes.

Visual literacy was initially defined as a person’s ability to “discriminate and interpret the visible actions, objects, and symbols natural or man-made, that he encounters in his environment” (9). In 1978, it was defined “as a group of skills which enable an individual to understand and use visuals for intentionally communicating with others” (10). More recently, the Association of College and Research Libraries defined standards, performance indicators, and learning outcomes for visual literacy (11, 12). In the academic setting, Avgerinou (13) developed and validated a visual literacy index by running focus groups of visual literacy experts, and Taylor (14) reviewed visual, media, and information literacy, arguing for the design of a visual language and coining the term “visual information literacy.”

DVL, also called “visualization literacy,” has been defined as “the ability to confidently use a given data visualization to translate questions specified in the data domain into visual queries in the visual domain, as well as interpreting visual patterns in the visual domain as properties in the data domain” (15); “the ability and skill to read and interpret visually represented data in and to extract information from data visualizations” (16); and “the ability to make meaning from and interpret patterns, trends, and correlations in visual representations of data” (17).

Other works have sought to advance the assessment of DVL. Boy et al. (15) applied item response theory (IRT) to assess
visualization literacy for bar charts and scatterplot graphs using 12 static visualization types as prompts with six tasks (minimum, maximum, variation, intersection, average, comparison). Börner et al. (17) used 20 graph, map, and network visualizations from newspapers, textbooks, and magazines to assess the basic DVL of 273 science museum visitors. Results show that participants had significant limitations in naming and interpreting visualizations and particular difficulties when reading network layouts (18). Maltese et al. (19) showed significant differences between novices and experts when reading and interpreting visualizations commonly found in textbooks and school curricula. Lee et al. (16) developed a visualization literacy assessment test (VLAT) that consists of 12 data visualizations (e.g., line chart, histogram, scatterplot graphs) and 53 multiple-choice test items that cover eight data visualization tasks (e.g., retrieve value, characterize distribution, make comparisons). The VLAT demonstrated high reliability when administered to 191 individuals.

Interested in understanding how users with no specific training create visualizations from scratch, Huron et al. (20) identified simplicity, expressivity, and dynamicity as the main challenges. In subsequent work, Huron et al. (21) identified 11 different subtasks and a variety of paths that subjects used to navigate to a final visualization; they then grouped these tasks into four categories: load data, build constructs, combine constructs, and correct. Building on this work, Alper et al. (22) developed and tested an interactive visualization creation tool for elementary school children, focusing on the abstraction from individual pictographs to abstract visuals. They observed that touch interactivity, verbal activity, and class dynamics were significant factors in how students used the application. Chevalier et al. (23) observed that, while visualizations are omnipresent in grades K–4, students are rarely taught how to read and create them and argued that visualization creation should be added to the concept of visualization literacy.

Interestingly, not one of the existing approaches explicitly covers the crucial question of how to assess the construction of data visualizations. Furthermore, there is no agreed-upon standardized terminology, typology, or classification system for core DVL concepts. In fact, most experts do not agree on names for some of the most widely used visualizations (e.g., the radar graph is also called a “polygon graph” or “star chart”). Moreover, there is little agreement on how to classify visualizations (e.g., what is a chart, graph, or diagram?) or on the general process steps of how to render data into actionable insights. Finally, there is little agreement about how to teach visualization design most effectively.

Given the rather low level of DVL (17) and the high demand for it in the workforce (24), there is an urgent need for basic and applied research that defines and measures DVL and develops effective interventions that measurably increase DVL. In what follows, we present a data visualization literacy framework (DVL-FW) that provides the theoretical underpinnings required to develop teaching exercises and assessments for data visualization construction and interpretation.

DVL-FW
Analogous to the PISA mathematics and literacy frameworks (1, 8), we present here a DVL-FW that covers a typology of core concepts and terminology together with a process model for constructing and interpreting data visualizations. The initial DVL-FW was developed via an extensive review of more than 600 publications documenting 50+ years of work by statisticians, cartographers, cognitive scientists, visualization experts, and others. These publications were selected using a combination of expert surveys and cited reference searches for key publications, such as refs. 25–32. An extended review of prior work and an earlier version of the DVL-FW were presented in Börner’s Atlas of Knowledge: Anyone Can Map (33). The DVL-FW has been applied and systematically revised over more than 10 years of developing exercises and assessments for residential and online courses at Indiana University. More than 8,500 students applied the DVL-FW to solve 100+ real-world client projects; student performance and feedback were used to expand the coverage, internal consistency, utility, and usability of the framework.

In what follows, we present the revised DVL-FW that connects DVL core concepts and process steps. We also showcase how the framework can be used to design exercises and associated assessments. While DVL requires textual, mathematical, and visual literacy skills, the DVL-FW focuses on core DVL concepts and procedural knowledge.

DVL-FW Typology. Visualizations have been classified by insight needs, types of data to be visualized, data transformations used, visual mapping transformations, or interaction techniques among others. Over the past five decades, many studies have proposed diverse visualization taxonomies and frameworks (Table 1). Among the most notable are Bertin’s Semiology of Graphics: Diagrams, Networks, Maps (25), Harris’ Information Graphics: A Comprehensive Illustrated Reference (28), Shneiderman’s “The eyes have it: A task by data-type taxonomy for information visualizations” (32), MacEachren’s How Maps Work: Representation, Visualization, and Design (29), Few’s Show Me the Numbers: Designing Tables and Graphs to Enlighten (27), Wilkinson’s The Grammar of Graphics (Statistics and Computing) (31), Cairo’s The Functional Art (26), Interactive Data Visualization: Foundations, Techniques, and Applications by Ward et al. (34), and Munzner’s Visualization Analysis and Design (30). Several of these frameworks have guided subsequent tool development. For example, The Grammar of Graphics (Statistics and Computing) (31) informed the statistical software Stata (35) and Wickham’s ggplot2 (36).

Table 1. Typology of the DVL-FW

Insight needs                  Data scales  Analyses     Visualizations  Graphic symbols       Graphic variables  Interactions

Categorize/cluster             Nominal      Statistical  Table           Geometric symbols     Spatial            Zoom
Order, rank, sort              Ordinal      Temporal     Chart             Point                 Position         Search and locate
Distributions (also outliers)  Interval     Geospatial   Graph             Line                Retinal            Filter
Comparisons                    Ratio        Topical      Map               Area                  Form             Details on demand
Trends (process and time)                   Relational   Tree              Surface               Color            History
Geospatial                                               Network           Volume                Optics           Extract
Compositions (also of text)                                              Linguistic symbols      Motion           Link and brush
Correlations/relationships                                                 Text                                   Projection
                                                                           Numerals                               Distortion
                                                                           Punctuation marks
                                                                         Pictorial symbols
                                                                           Images
                                                                           Icons
                                                                           Statistical glyphs
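
As an illustration of how the seven types in Table 1 could be used in an exercise or assessment tool, the following minimal Python sketch stores the typology as a nested mapping and checks whether a description of a visualization uses only terms from the table. The names `DVL_FW_TYPOLOGY`, `members`, and `unknown_terms`, the shortened member labels, and the example description are assumptions made for illustration; they are not part of the DVL-FW.

```python
# Table 1 as a nested mapping. Key names and shortened labels are
# illustrative assumptions; the members themselves come from Table 1.
DVL_FW_TYPOLOGY = {
    "insight_needs": {
        "categorize/cluster", "order, rank, sort", "distributions",
        "comparisons", "trends", "geospatial", "compositions",
        "correlations/relationships",
    },
    "data_scales": {"nominal", "ordinal", "interval", "ratio"},
    "analyses": {"statistical", "temporal", "geospatial", "topical", "relational"},
    "visualizations": {"table", "chart", "graph", "map", "tree", "network"},
    "graphic_symbols": {
        "geometric": {"point", "line", "area", "surface", "volume"},
        "linguistic": {"text", "numerals", "punctuation marks"},
        "pictorial": {"images", "icons", "statistical glyphs"},
    },
    "graphic_variables": {
        "spatial": {"position"},
        "retinal": {"form", "color", "optics", "motion"},
    },
    "interactions": {
        "zoom", "search and locate", "filter", "details on demand",
        "history", "extract", "link and brush", "projection", "distortion",
    },
}


def members(type_name):
    """Flatten one typology type into a single set of member names."""
    entry = DVL_FW_TYPOLOGY[type_name]
    if isinstance(entry, dict):
        return set().union(*entry.values())
    return set(entry)


def unknown_terms(description):
    """Return sorted (type, term) pairs that Table 1 does not list."""
    return sorted(
        (type_name, term)
        for type_name, terms in description.items()
        for term in terms
        if term not in members(type_name)
    )


# Hypothetical description of a line graph showing a trend over time.
example = {
    "insight_needs": ["trends"],
    "data_scales": ["ratio"],
    "analyses": ["temporal"],
    "visualizations": ["graph"],
    "graphic_symbols": ["line"],
    "graphic_variables": ["position", "color"],
    "interactions": ["zoom", "lasso"],
}

print(unknown_terms(example))
```

In this sketch, "lasso" is flagged because it is not among the interaction types listed in Table 1; a real tool would need a richer vocabulary and synonym handling.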

Table 1 shows the seven core types of the revised DVL-FW theory. The members for each type were derived from an extensive literature review and refined using feedback gained from constructing and interpreting data visualizations for the 100+ client projects in the Information Visualization massive open online course (IVMOOC) discussed in the next section. Subsequently, we will detail each of the seven types.

Insight needs. Different stakeholders have different insight needs (also called “basic task types”) that must be understood in detail to design effective visualizations for communication and/or exploration. The DVL-FW builds on and extends prior definitions of insight needs and task types. For example, Bertin (25) identifies four task types: selection, order, association (or similarity), and quantity. The graph selection matrix by Few (27) distinguishes ranking, distribution, nominal comparison and deviation, time series, geospatial, part to whole, and correlation. Yau (37) distinguishes patterns over time, proportions, relationships, differences, and spatial relations. Ward et al. (34) propose tasks, such as identify, locate, distinguish, categorize, cluster, rank, compare, associate, and correlate. Munzner (30) identifies three actions (analyze, search, query), listing a variety of tasks, such as discover, annotate, identify, and compare. Fisher and Meyer (38) describe tasks, such as finding and reading values, characterizing distributions, and identifying trends. Table 1, column 1 shows a superset of core types covered by the DVL-FW proposed here. Any alignment of previously proposed needs and tasks will be imperfect, as detailed definitions of terms do not always exist. An extended discussion of additional prior works and their tabular alignment can be found in ref. 33.

Data scales. Data variables may have different scales (e.g., qualitative or quantitative), influencing which analyses and visual encodings can be used. Building on the work of Stevens (39), the DVL-FW distinguishes nominal, ordinal, interval, and ratio data based on the type of logical mathematical operations that are permissible (Fig. 1). The approach subsumes Bertin’s (25) three data-scale types: qualitative, ordered, and quantitative, which roughly correspond to nominal, ordinal, and quantitative (also called “numerical”; includes interval and ratio). Bertin’s terminology was later adopted by MacEachren (29) and many other cartographers and information visualization researchers (27, 30, 38). Atlas of Knowledge: Anyone Can Map (33) has a more detailed discussion of different approaches and their interrelations.

Nominal data (e.g., job type) have no ranking but support equality checks. Ordinal data assume some intrinsic ranking but not at measurable intervals (e.g., chapters in a book). Interval- and ratio-scale data assume that the distance between any two adjacent values is equal. For interval data, the zero point is arbitrary (e.g., Celsius or Fahrenheit temperature scales), while for ratio data, there exists a unique and nonarbitrary zero point (e.g., length or weight). Logical mathematical operations permissible for the different data-scale types are given in Fig. 1.

Note that quantitative data can be converted into qualitative data (e.g., one may use thresholds to convert interval data into ordinal data). Ordinal rankings can be converted to yes/no categorical decisions (e.g., to make funding decisions). The reverse is possible as well [e.g., multidimensional scaling (40) converts ordinal into ratio data]; other examples are in ref. 33.

Prior work has used data-scale types as independent variables in user studies (41) with regard to selecting appropriate visualization types (27, 38), tasks (32), and how different types relate to datasets (30, 34). Results provide valuable guidance for the design of visualizations, exercises, and assessments.

Analyses. Most datasets need to be analyzed before they can be visualized. While focusing on visualization construction and interpretation, the DVL-FW does cover different types of analyses that are commonly used to preprocess, analyze, or model data before they are visualized (Table 1, column 3). Five general types are distinguished: statistical analysis (e.g., to order, rank, or sort); temporal analysis answering “when” questions (e.g., to discover trends); geospatial analysis answering “where” questions (e.g., to identify distributions over space); topical analysis answering “what” questions (e.g., to examine the composition of text); and relational analysis answering “with whom” questions (e.g., to examine relationships; also called network analysis). Algorithms for the different types of analyses come from statistics, geography, linguistics, network science, and other areas of research. The tools used in the IVMOOC (see below) support more than 100 different temporal, geospatial, topical, and network analyses (42).

Visualizations. Any comprehensive and effective DVL-FW must contend with the many existing proposals for visualization naming and classification (33). For example, Harris (28) details hundreds of visualizations and distinguishes tables, charts (e.g., pie charts), graphs (e.g., scatterplots), maps, and diagrams (e.g., block diagrams, networks, Voronoi diagrams). Bertin (25) distinguishes diagrams, maps, and networks. Based on an extensive literature and tool review and with the goal of providing a universal set of visualization types, the DVL-FW identifies six general types: table, chart, graph, map, tree, and network visualizations (Table 1, column 4) (definitions and examples are in ref. 33).

Fig. 2. Typical reference systems used in a 2D table, graph, map, and network visualization. Horizontal dimension is colored red, and vertical is in green to highlight commonalities and differences. Force-directed layout algorithms can be used to assign x–y values to each node in a network based on node similarity.

In addition, the DVL-FW distinguishes between the reference system (or base map) and data overlays. Fig. 2 exemplifies typical reference systems for four visualization types. All four support the placement of data records (data records can be connected via linkages); color coding of table cells, graph areas, geospatial areas (e.g., in choropleth maps), or subnetworks; and the design of animations (e.g., changes in the number of data records over time). Some visualizations use a grid reference system (e.g., tables), while others use a continuous reference system (e.g., scatterplot graph or geospatial map). Some visualizations use lookup tables to position data [e.g., lookup tables for US zip codes to latitude/longitude values or journals to the position of scientific disciplines in science maps (43)]. One visualization can be transformed into another. For instance, changing the quantitative axes of a graph into categorical axes results in a table. Similarly, interpolation applied to discrete area geospatial (or topic) maps results in continuous, smooth surface elevation maps.
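
Two of the conversions described above can be sketched together: using thresholds to turn quantitative values into ordinal categories, and changing both quantitative axes of a graph into categorical axes, which yields a table of counts. The following minimal Python sketch uses only the standard library; the function names, thresholds, labels, and data points are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import Counter


def to_ordinal(value, thresholds, labels):
    """Map a quantitative value to an ordered category.

    `thresholds` must be sorted, and `labels` needs one more entry than
    `thresholds`; a value equal to a threshold falls into the upper bin.
    """
    return labels[bisect_right(thresholds, value)]


def graph_to_table(points, x_thresholds, x_labels, y_thresholds, y_labels):
    """Bin both axes of (x, y) points and count records per table cell."""
    counts = Counter(
        (to_ordinal(y, y_thresholds, y_labels), to_ordinal(x, x_thresholds, x_labels))
        for x, y in points
    )
    # Rows follow the y categories, columns follow the x categories.
    return [[counts[(row, col)] for col in x_labels] for row in y_labels]


# Hypothetical records: (temperature in degrees Celsius, daily visitors).
points = [(5, 120), (12, 300), (18, 450), (25, 800), (31, 760), (9, 90)]
temp_labels = ["cold", "mild", "hot"]   # thresholds at 10 and 20
visitor_labels = ["low", "high"]        # threshold at 400

table = graph_to_table(points, [10, 20], temp_labels, [400], visitor_labels)
for label, row in zip(visitor_labels, table):
    print(label, row)
```

The continuous reference system of the scatterplot becomes a grid reference system: each record is now placed in a cell rather than at an x–y position, and the ordering of the categories preserves the ordinal information.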
Fig. 1. Logical mathematical operations permissible, measure of central tendency, and examples for different data scale types.

Prior research on DVL shows that people have difficulties reading most visualization types, especially networks (17, 18). Controlled laboratory studies examining the recall accuracy of relational data using map and network visualizations have found that map visualizations are easier to read and increase memorability (44).

In psychology and cognitive science, research has aimed to identify the cognitive processes required for reading visualizations and confounding variables. Pinker (45) reviewed general cognitive and perceptual mechanisms to develop a theory of graph comprehension in which “graph schemas” provide template-like information on how to create or read certain visualizations. His “graph difficulty principle” helps explain why some graph-type visualizations are easier or harder to comprehend. Kosslyn et al. (46) demonstrated that the time required to read a visual image increases systematically with the distance between initial focus point and the target, independent of the “amount of material” between both points. Shah (47) showed that line graphs facilitate the extraction of information for x–y relationships and that bar graphs ease the comparison of graphical elements in close proximity to each other. However, when subjects had to perform computation while reading a visualization, comprehension became more difficult, showing that the interpretation of graphs is “serial and incremental, rather than automatic and holistic” (47). Building on the study by Shah (47) of the iterative nature of graph comprehension, Trickett and Trafton (48) emphasize the importance of spatial processes (e.g., the temporal storage and retrieval of an object’s location in memory, allowing for mental transformations, such as creating and transforming a mental image) for graph comprehension. Results from these studies can guide the design of visualization exercises and assessments.

When teaching DVL, a decision must be made about which visualizations should be taught and what subset of core types and process steps should be prioritized. The practical value of a visualization (e.g., determined by the frequency with which different populations, such as journalists, doctors, or high school students, encounter it in work contexts, scholarly papers, news reports, or social media) can guide this decision-making process. Simply put, the more usage and actionable insights gained, the more important it becomes to empower individuals to properly construct and interpret that visualization.

Graphic symbols. Graphic symbols are essential to data visualization, as they give data records a visual representation. Any comprehensive framework must acknowledge and build on previous efforts to classify and name these symbols. Bertin (25), for instance, calls graphic symbols “visual marks” and proposes three geometric elements: point, line, and area. MacEachren (29) adopts Bertin’s framework to explain how geospatial maps work. Harris (28) distinguishes two classes of symbols, geometric and pictorial, and provides numerous examples. Horn (49) distinguishes three general classes of graphic symbols: shapes, words, and images. Other approaches are discussed in Atlas of Knowledge: Anyone Can Map (33).

The DVL-FW distinguishes three general classes consisting of 11 graphic symbols: geometric (point, line, area, surface, volume), linguistic (text, numerals, and punctuation marks), and pictorial (images, icons, and statistical glyphs). Fig. 3 shows a subset of graphic symbols and graphic variables listed in columns 5 and 6 in Table 1. A four-page table of 11 graphic symbol types vs. 24 graphic variable types can be found in Börner (33). Different graphic symbol types can be combined (e.g., a geometric symbol used to represent a node in a network might have an associated linguistic symbol label).

User studies that aim to assess the effectiveness of different graphic symbol types are typically combined with studies on graphic variable types and are discussed subsequently.

Fig. 3. Four graphic symbols and 11 graphic variables from full 11 graphic symbols by 24 graphic variables set in ref. 34. Qualitative nominal variables (shape, color hue, and pattern) have a gray mark.

Graphic variables. Data records commonly have attribute values that can be represented by so-called graphic variables (e.g., color or size) of graphic symbols. Bertin (25) calls graphic variables “visual channels” and identifies “retinal variables,” such as shape, orientation, color, texture, value, and size. MacEachren’s (29) instantiations (which he calls “implantations”) of different graphic variables for different symbol types include position and Bertin’s retinal variables. Munzner (30) distinguishes “magnitude channels” (e.g., position [1-3D], size [1-3D], color luminance/saturation, and curvature) from “identity channels” (e.g., color hue, shape, and motion). Wilkinson (31) proposes position, form, color, texture, optics, and transparency. The DVL-FW details and exemplifies a superset of these proposed graphic variables organized into spatial (x, y, z position) and retinal variables. Retinal variables are further divided into form (size, shape, rotation, curvature, angle, closure), color (value, hue, saturation), texture (spacing, granularity, pattern, orientation, gradient), optics (blur, transparency, shading, stereoscopic depth), and motion (speed, velocity, and rhythm) and grouped into quantitative and qualitative variables (33). A subset of these 24 variables is given in Fig. 3. Like qualitative data variables, qualitative graphic variables (e.g., shape or color hue) have no intrinsic ordering. In contrast, quantitative graphic variables (e.g., size or color intensity) can have different ordering directions, such as sequential, diverging, or cyclic (30, 50); examples are given in Fig. 4. Quantitative graphic variables, such as motion, can be binned to encode qualitative data variables [e.g., binary yes/no motion can be used to encode binary data values as proposed by Munzner (30)].

Fig. 4. Exemplary color schemas for qualitative and quantitative data variables using colors from ColorBrewer (50).

Laboratory studies by Cleveland and McGill (51) and crowdsourced studies using Amazon Mechanical Turk by Heer and Bostock (52) quantify how accurately humans can perceive different graphic variables. Both studies show that position encoding has the highest accuracy followed by length (“size” in the DVL-FW), angle and rotation, and then area. When examining area encodings more closely, rectangular and circular area encoding yielded the
lowest accuracy, explaining why visualizations, such as bubble charts and tree maps, are harder to read.

Kim and Heer (41) conducted a Mechanical Turk study with 1,920 subjects using US daily weather measurements to determine what combination of qualitative and quantitative data-scale types plus visual encodings leads to the best performance on key literacy tasks (e.g., read value, find maximum, compare values, and compare averages). Results from these studies help guide the design and interpretation of visualizations.

Different graphic variable types can be combined (e.g., a node in a network may be coded by size and color as shown in a proportional symbol map). Szafir et al. (53) investigated what graphic symbol and graphic variable types (position, size, orientation, color, and luminance) together support what visual insight needs (e.g., the identification of outliers, trends, or clustering as shown in Fig. 5). These findings make it possible to order graphic variables by effectiveness and guide the selection and combination of variables when constructing data visualizations.

Fig. 5. Geometric symbol (circle) encoding using different graphic variable types in support of outlier, trend, and cluster identification as inspired by Szafir et al. (53).

Interactions. The DVL-FW recognizes that, while some visualizations are static (e.g., printed on paper), many can be manipulated dynamically using diverse types of interaction. Shneiderman (32) identifies overview, zoom, filter, details on demand, relate (viewing relationships among items), history (keeping a log of actions to support undo, replay, and progressive refinement), and extract (access subcollections and query parameters). Keim (54) distinguishes zoom, filter, and link and brush as well as projection and distortion techniques as a means to provide focus and context. The typology proposed by Brehmer and Munzner (55) covers two main abstract visualization tasks. The first is “why,” which includes consume (present, discover, enjoy, produce), search (lookup, browse, locate, explore), and query (identify, compare, summarize). The second is “how,” which consists of encode, manipulate (select, navigate, arrange, change, filter, aggregate), and introduce (annotate, import, derive, record). Heer and Shneiderman (56) focus on the flexible and iterative use of visualizations by naming 12 actions ordered into three high-level categories: data and view specification (visualize, filter, sort, derive), view manipulation (select, manage, coordinate, organize), and process and provenance (record, annotate, share, guide). As before, the DVL-FW covers core interaction types, including zoom, search and locate,

showing common paths when constructing graphs; it features four main tasks (load data, build constructs, combine constructs, and correct) and several mental and physical subtasks. Grammel et al. (61) conducted exploratory laboratory studies to identify three activities central to the interactive visualization construction process: data variable selection, visual template selection, and visual mapping specification (i.e., assigning graphic symbol types and variables to data variables). “Voyager: Exploratory analysis via faceted browsing of visualization recommendations” by Wongsuphasawat et al. (62) is a recommendation engine that suggests diverse visualization options to help users pick different data variable subsets, data transformations (e.g., aggregation and binning), and visual data encodings for graph-type visualizations using the Vega-Lite visualization specification language (63). The system also ranks results by perceptual effectiveness score and prunes visually similar results.

Building on this prior work and extending Börner (33), we identify key process steps involved in data visualization construction and interpretation (Fig. 6) and interlink process steps with the DVL-FW typology (Table 1). Subsequently, we discuss the important role of stakeholders and detail five process steps and their interrelation with the typology.

Stakeholders. The data visualization process (also called “workflow”) starts with the identification of stakeholders and their insight needs (Table 1, column 1). Just as a verbal math problem needs to be reformulated into a numerical math problem, the verbal or textual description of a real-world problem presented by a stakeholder must be operationalized (i.e., reformulated into a data visualization problem so that appropriate datasets, analysis and visualization workflows, and deployment options can be identified). Math assessment frameworks allocate up to one-half of the overall problem-solving effort for the translation of verbal to numerical problems; analogously, major effort is required to translate real-world problems into well-defined insight needs.

Acquire. Given well-defined insight needs, relevant datasets and other resources can be acquired. Data quality and coverage will strongly impact the quality of results, and much care must be taken to acquire the best dataset with data scales that support subsequent analysis and visualization.

Analyze. Typically, data need to be preprocessed before they can be visualized. This step can include data cleaning (e.g., identify and correct errors, deduplicate data, deal with missing data, anomalies, unusual distributions); data transformations (e.g., aggregations, geocoding, network extraction); and statistical, temporal, geospatial, topical, or relational network analyses (Table 1, column 3).

Visualize. This step can be split into two main activities: pick reference system (or base map) and design data overlay. The first activity is associated with selecting a visualization type, and the second activity is associated with mapping data records and
filter, details on demand, history, extract, link and brush, pro-
jection, and distortion (Table 1, column 7).

DVL-FW Process Model. Human sense making in general—and


particularly, sense making of data and data visualizations—has
been studied extensively. Pirolli and Card (57) used cognitive
task analysis to develop a notational model of sense making for
intelligence analysts. It consists of a foraging loop (seeking in-
formation, searching and filtering it, and reading and extracting
information) and a sense-making loop (iterative development of
a mental model that best fits the evidence). Klein et al. (58)
proposed a data/frame theory of how domain practitioners make
decisions in complex real-world contexts. Lee et al. (59) observed
novice users examining unfamiliar visualizations and identified
five major cognitive activities: encountering visualization, con-
structing a frame, exploring visualization, questioning the frame, Fig. 6. Process of data visualization construction and interpretation with
and floundering on visualization. Pioneering work by Mackinlay major steps in white letters. Types identified in Table 1 are given in italics,
(60) aimed to automate the design of effective visualizations for and an exemplary US reference system and sample data overlays are given
graph-type visualizations. Huron et al. (21) published a flow diagram in Right.

Börner et al. PNAS | February 5, 2019 | vol. 116 | no. 6 | 1861


variables to graphic symbols and graphic variables (e.g., position and retinal variables). For example, Fig. 6 shows a US map reference system with graphic symbols (circles and lines) that are first positioned and then coded by both size and color.

Deploy. Different deployments (e.g., a printout on paper or in 3D; an interactive display on a handheld device, large tiled wall, or virtual reality headset) will support different types of interactions via different human–user interfaces and metaphors (e.g., zoom might be achieved by physical body movement toward a large format map, pinch on a touch panel, or body movement in a virtual reality setup). Different interface controls make diverse interactions possible: buttons, menus, and tabs support selection; sliders and zoom controls let users filter by time, region, or topic; hover and double click help users retrieve details on demand; and multiple coordinated windows are connected via link and brush.

Interpret. Finally, the visualization is read and interpreted by its author(s) and/or stakeholder(s). This process includes translating the visualization results into insights and stories that make a difference in the real-world application.

Frequently, the entire process is cyclical (i.e., a first look at the data results in a discussion of adding more/alternative datasets, running different analysis and visualization workflows, or using different data mappings or even deployment options). In some cases, a better understanding of existing data leads to asking novel questions, compiling new datasets, and developing better algorithms. Sometimes, a first analysis of the data might result in the acquisition of new/different data, a first visualization might result in choosing a different analysis algorithm, and a first deployment might reveal that a different visualization is easier to read/interact with.

Exercises and Assessments. Given the core DVL-FW typology in Table 1 and the associated process in Fig. 6, it becomes possible to design effective interventions that measurably improve DVL. This section presents selected exercises that facilitate learning and DVL assessment. Additional theoretical lectures and hands-on exercises can be accessed online via the IVMOOC (see below). Textual literacy (e.g., proper spelling of titles, axis labels, etc.), math literacy (e.g., measurement, estimation, percentages, correlations, and probabilities), and visual literacy (e.g., composition of a visualization, color theory) have standardized tests and are not covered here.

DVL-FW typology. Factual knowledge of the core types in Table 1 can be taught and assessed by presenting students with short answer, multiple choice, fill in the blank, or matching tasks that ask them to pick the correct set of DVL-FW types. For example, different types can be trained and assessed as follows.

Insight needs. Given a verbal description of a real-world problem or a recording of a stakeholder explaining a need, identify and operationalize insight need(s) in short answer responses.

Data scales and analysis. Given a data table, identify the scale of data in each column and suggest analyses that could be run to meet a given set of insight needs. Then, review if there is missing or erroneous data, disambiguate text, geocode addresses, and sample or aggregate data as needed to run certain visualization workflows. Decide which data variables are control, independent, or dependent variables.

Visualizations. Given a visualization, correctly name and classify it by type (graph, map, etc.). For a greater challenge, have students take analysis results and visualize them to satisfy a set of insight needs. While there might not always be a single best visualization solution, the DVL-FW provides guidance for visualization construction and assessment.

Graphic symbols and variables. Graphic symbols/variables knowledge can be evaluated using matching problems that require students to deconstruct a visualization’s graphic symbols and variables into a data table (an example is in Fig. 7) or construct a visualization based on a dataset by matching data types and scales to graphic symbol types used in a visualization. These exercises can vary in complexity depending on the number of data variables and visualization types.

Fig. 7. Data table with two data records (Left) and a scatterplot graph of the data showing the correct spatial position of both records and encoding weight by size and type by color hue (Right).

DVL-FW process model. General steps in the design of data visualizations (Fig. 6) may be evaluated using fill in the blank and matching exercises that assess overall knowledge of the iterative process.

Construction. Hands-on homework assignments can assess students’ ability to create and evaluate visualizations. Students are provided with insight needs, data, analysis and visualization algorithms, and rubrics that scaffold visualization construction, interpretation, and assessment. The rubrics cover the completeness and accuracy of the visualization, whether the results meet the insight needs, and if reported insights are supported by the data. Peer responses use rubrics to evaluate a peer submission, which also provide a means of training and evaluating a reviewer’s ability to use the DVL-FW to assess a visualization accurately.

Interpretation. DVL assessments systematically evaluate students’ ability to interpret visualizations that address different insight needs. By using standardized visualizations with known interaction types, we can both quantify students’ ability to interpret visualizations across tasks and DVL-FW types and identify different kinds of misinterpretations. The assessments can be used pre- and postevaluation to determine changes in the interpretation ability of students.

Going forward, a close collaboration between researchers and educators is desirable to design exercises and assessments for use in different learning environments that leverage active learning, social learning, scaffolding, and horizontal transfer as suggested by Chevalier et al. (23).

DVL-FW Usage in IVMOOC
Over the last 15 years, earlier versions of the DVL-FW typology, process model, exercises, and assessments presented here have been implemented in the Information Visualization course at Indiana University, providing first evidence that the framework can be used to teach and assess DVL. Data on student engagement, performance, and feedback guided the continuous improvement of the DVL-FW. Since 2013, the DVL-FW has been taught in an online course called IVMOOC (https://ivmooc.cns.iu.edu/). More than 8,500 students registered and, as is typical for MOOCs, about 10% of those students completed the course. IVMOOC students’ online activity is captured in extensive detail, providing unique opportunities to validate and further improve the DVL-FW.

Learning Objectives. In the IVMOOC, the main learning objective is mastery over the typology, process, and exercises defined in the DVL-FW. A special focus is on the construction of data visualizations (i.e., given an insight need and dataset, pick a valid data analysis and visualization workflow that renders data into insights; improve workflow preparation and parameter selection for statistical, temporal, geospatial, topical, and network analyses and visualizations; interpret results and add title, legend, and descriptive text; and document the workflow so that it can be replicated by others).

Customization of exercises is possible and desirable, as there are at least four critical factors that affect the comprehension of graphs: purpose for usage, task characteristics, discipline characteristics, and reader characteristics (64). Interventions can be tailored to fit different types of learners (e.g., theory or hands-on first) and different levels of expertise. Instructors might like to introduce students to the real-world stakeholder needs and datasets that are particularly relevant for a subject area (e.g., social networks in sociology; food webs in biology).
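The “Graphic symbols and variables” exercise asks students to move between a data table and a visualization in both directions. The following Python sketch illustrates that round trip for a Fig. 7-style scatterplot, in which position encodes two quantitative variables, size encodes weight, and color hue encodes type. The record values, the palette, and the function names `encode` and `decode` are illustrative assumptions; they are not taken from Fig. 7 or from the IVMOOC materials.

```python
# Hypothetical data table with two records; the values are illustrative
# and are not the ones shown in Fig. 7.
RECORDS = [
    {"x": 1.0, "y": 2.0, "weight": 10.0, "type": "A"},
    {"x": 3.0, "y": 1.0, "weight": 40.0, "type": "B"},
]

# Assumed qualitative palette: one hue per category of a nominal variable.
HUES = ["blue", "orange", "green", "red"]


def encode(records, max_area=400.0):
    """Construct: map each record to a circle (position, size, color hue)."""
    max_weight = max(r["weight"] for r in records)
    categories = sorted({r["type"] for r in records})
    if len(categories) > len(HUES):
        raise ValueError("more categories than available hues")
    hue_of = dict(zip(categories, HUES))
    symbols = [
        {
            "symbol": "circle",
            "x": r["x"],  # position encodes the two quantitative variables
            "y": r["y"],
            "area": max_area * r["weight"] / max_weight,  # size encodes weight
            "hue": hue_of[r["type"]],  # color hue encodes type
        }
        for r in records
    ]
    legend = {"hue_of": hue_of, "max_weight": max_weight, "max_area": max_area}
    return symbols, legend


def decode(symbols, legend):
    """Deconstruct: read the graphic variables back into a data table."""
    type_of = {hue: cat for cat, hue in legend["hue_of"].items()}
    return [
        {
            "x": s["x"],
            "y": s["y"],
            "weight": s["area"] * legend["max_weight"] / legend["max_area"],
            "type": type_of[s["hue"]],
        }
        for s in symbols
    ]


symbols, legend = encode(RECORDS)
print(symbols)
print(decode(symbols, legend))
```

Scaling circle area (rather than radius) by weight is a design choice of this sketch. The legend returned by `encode` is what makes deconstruction possible: without it, a reader cannot recover weights or types from sizes and hues.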

1862 | www.pnas.org/cgi/doi/10.1073/pnas.1807180116 Börner et al.
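The “Data scales and analysis” exercise can be illustrated with a short Python sketch. It assumes the four scale types of Stevens (39), lists for each column the summary statistics that are meaningful at its scale, and then reviews the table for missing values and exact duplicates. The column names, scale assignments, data rows, and function names are hypothetical and serve only as an example.

```python
# The four scale types of Stevens (39), ordered from least to most structure.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Example summary statistic that becomes meaningful at each scale; a scale
# also admits the statistics of all scales listed before it.
NEW_STATS = {
    "nominal": "mode",
    "ordinal": "median",
    "interval": "mean",
    "ratio": "coefficient of variation",
}


def permissible_stats(scale):
    """Return the summary statistics that are meaningful for a given scale."""
    rank = SCALES.index(scale)  # raises ValueError for an unknown scale
    return [NEW_STATS[s] for s in SCALES[: rank + 1]]


def review_table(rows, columns):
    """Count missing values per column and drop exact duplicate rows."""
    missing = {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}
    seen, unique = set(), []
    for r in rows:
        key = tuple(r.get(c) for c in columns)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return missing, unique


# Hypothetical data table and scale assignments, for illustration only.
COLUMN_SCALES = {
    "city": "nominal",
    "rating": "ordinal",
    "year": "interval",
    "weight_kg": "ratio",
}
ROWS = [
    {"city": "Bloomington", "rating": "high", "year": 2013, "weight_kg": 2.5},
    {"city": "Dresden", "rating": "low", "year": 2018, "weight_kg": None},
    {"city": "Bloomington", "rating": "high", "year": 2013, "weight_kg": 2.5},
]

for column, scale in COLUMN_SCALES.items():
    print(column, scale, permissible_stats(scale))
missing, unique = review_table(ROWS, list(COLUMN_SCALES))
print(missing, len(unique))
```

The scale of a column is declared by the student rather than inferred by the code, since deciding between, for example, interval and ratio requires knowledge of what the values mean.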


Instructional Strategies and Tools. The IVMOOC uses a combination of lectures and quizzes, hands-on exercises and homework, real-world client projects, and examinations to increase and assess students’ DVL. The DVL-FW is introduced in the first week and used to structure the initial 7 weeks of the course, which include theory and hands-on lectures and exercises. Furthermore, the DVL-FW dictates the menu system of the Sci2 Tool (65) that provides easy access to 180+ analysis and visualization algorithms and organizes the 50 visualizations featured in the IVMOOC flashcard app. In the last 7 weeks of the course, students collaborate on real-world client projects that ask students to implement the full DVL-FW process, from stakeholder interviews to identify insight needs, data acquisition, analysis, visualization, and deployment to interpretation. Sample client projects are documented in the textbook Visual Insights: A Practical Guide to Making Sense of Data (42).

Assessments. IVMOOC quizzes, homework, examinations, and peer reviews assess students’ knowledge and application of the DVL-FW typology (Table 1), process steps (Fig. 6), and interrelations between types and steps using classical test theory. Examination and quiz responses are analyzed using IRT to evaluate question difficulty and student misconceptions. Peer evaluation uses rubrics that scaffold the visualization evaluation process using the DVL-FW. Together, the assessments allow instructors to check the degree to which the students are meeting the learning objectives.

Discussion and Outlook
In this paper, we have presented a typology, process model, and exercises for defining, teaching, and assessing DVL. The DVL-FW combines and extends pioneering works by leading experts to arrive at a comprehensive set of core types and major process steps required for the systematic construction and interpretation of data visualizations. As a key contribution, this paper interlinks the typology and process steps and presents a set of DVL-FW exercises and assessments that can be used by anyone interested in measurably improving DVL. Early versions of the DVL-FW were implemented and tested in the IVMOOC over the last 6 years and have informed the DVL-FW typology, process model, exercises, and assessments presented here.

Controlled User Studies. Going forward, there is a need to run controlled user studies to understand difficulty levels for the diverse DVL-FW types and process steps and their combinations and to provide additional guidance for the construction of effective visualizations based on scientific evidence. Seminal studies by Cleveland and McGill (66) and Heer and Bostock (52) have examined the effectiveness of different visual encodings. A similar study design can be used to examine the effectiveness of a larger range of graphic symbol types, variable types, and their combinations (53). Work by Wainer (67) and Boy et al. (15) used IRT to compute DVL scores for the interpretation of different graph visualizations. IRT was also used in the IVMOOC to assess student DVL when constructing visualizations, but more work is needed to optimally use the DVL-FW for teaching visualization construction.

User Studies in the Wild. In addition to laboratory experiments, there is a need to understand how general audiences can construct and interpret data visualizations in real-world settings using so-called “research in the wild” (68). Building on prior work assessing the DVL of science museum visitors (17), we are developing a museum experience that lets visitors first generate and then visualize their very own data using a so-called “Make-a-Vis” (MaV) setup. MaV is aligned with the DVL-FW and supports the mapping of data to visual variables via the drag and drop of column headers to axis and legend areas in a data visualization. The active learning setup aims to empower learners to become producers and creators across the lifespan, in line with recommendations found in How People Learn II: Learners, Contexts, and Cultures (69).

Planned User Studies. Going forward, we are interested in further improving DVL instruction using concreteness fading, scaffolding, horizontal transfer, and reciprocity.

Concreteness fading. Alper et al. (22) and Chevalier et al. (23) examined visualization literacy teaching methods for elementary school children and developed their proof-of-concept tool C’est la Vis. The tool uses concreteness fading, an approach where concrete, countable entities (pictograms) are gradually transformed into abstract visual representations of quantitative data. We plan to use concreteness fading to ease the construction of different visualization types.

Scaffolding. Studies are needed to determine what sequence is best for introducing the DVL-FW typology and process steps to support effective scaffolding. As for factual scaffolding (Table 1), only a subset of visualizations, graphic symbol, and variable types and their members might initially be familiar to a student. As students learn more types and their members, their DVL increases. In terms of procedural scaffolding (Fig. 6), students might be presented with a sequence of successively harder tasks: (i) examine a graph and answer yes/no insight questions by modifying usage of graphic variable types; (ii) read a simple case study that defines an insight need and dataset, and then, select the best visualization, graphic symbols, and variable types to meet the predefined need; and (iii) listen to a client explaining a real-world problem, identify insight need(s), pick the most relevant dataset(s), construct an appropriate visualization, and verbally communicate key insights to the client.

Horizontal transfer. The DVL-FW aims to ease the transfer of knowledge across visualization-type reference systems (Fig. 2). Knowledge on how to construct graphs with diverse data overlays should make it easier to read and construct other visualization types. For example, Friel et al. (64) propose a sequence for the introduction of graph-type visualizations to students of different ages. Additional user studies are needed to determine how prior knowledge impacts the reading and construction of visualizations so that the typology and process steps can be taught most effectively.

Reciprocity. Recent work shows that visualization construction (i.e., starting with a reference system and then adding graphic symbols and additional graphic variables) leads to better understanding and interpretation of the visualization than deconstructing a complete visualization (70). Additional user studies are needed to determine the strength of transfer between constructing and reading visualizations of different types and what construction workflows are most effective for increasing DVL.

Outlook. DVL is of increasing importance for making sound decisions in one’s personal and professional life. Existing literacy tests (a review is in Pursuit of Universal Literacy) include statistical graphs as part of mathematical and financial literacy tests. In the United States, K–12 national standards for math and science cover statistical graphs (71, 72) and geospatial maps (72). However, most exercises ask students to read (not construct) data visualizations; topical or network analyses and visualizations are rarely covered. Adding DVL exercises and assessments to existing tests or establishing separate DVL tests will make it possible to assess how effectively different classes, schools, corporations, countries, etc. are preparing students to read and construct data visualizations; what interventions and exercises work for what age groups and industries/research areas; and how to further improve DVL typology, processes, exercises, and assessments via a close collaboration among academic and industry experts, learning scientists, instructional developers, teachers, and learners.

ACKNOWLEDGMENTS. We thank Anna Keune and the anonymous reviewers for their extensive expert comments on an earlier version of this paper. We appreciate the support of figure design by Tracey Theriault and Leonard E. Cross and copyediting by Todd Theriault. This work was partially supported by a Humboldt Research Award, NIH Awards U01CA198934 and OT2OD026671, and NSF Awards 1713567, 1735095, and 1839167. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
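The Assessments subsection names classical test theory and IRT without giving formulas. As a rough illustration, the Python sketch below computes the classical item difficulty index (the share of students who answered an item correctly) and the response probability under a two-parameter logistic (2PL) IRT model. The paper does not state which IRT model was fitted to the IVMOOC data, so the 2PL choice, the function names, and the small response matrix are assumptions made for this example.

```python
import math


def item_difficulty(responses):
    """Classical test theory: proportion of correct (1) answers per item.

    `responses` is a list of rows, one per student, each holding 0/1 scores.
    Higher values indicate easier items.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_students for j in range(n_items)]


def p_correct_2pl(theta, a, b):
    """2PL IRT: probability that a student with ability `theta` answers an
    item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical scores of four students on two quiz items.
RESPONSES = [
    [1, 0],
    [1, 1],
    [0, 0],
    [1, 0],
]

print(item_difficulty(RESPONSES))
print(p_correct_2pl(theta=0.0, a=1.0, b=0.0))
```

In the 2PL model, a student whose ability equals the item difficulty has an even chance of answering correctly, and the discrimination parameter controls how sharply that probability changes around this point. Estimating the parameters from real responses requires a fitting procedure that is outside the scope of this sketch.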



1. OECD (2013) PISA 2015 Draft Reading Literacy Framework (OECD Publishing, Paris).
2. OECD (2018) PISA 2018 released field trial new reading items (ETS Core A, Paris). Available at https://www.oecd.org/pisa/test/PISA_2018_FT_Released_New_Reading_Items.pdf. Accessed December 20, 2018.
3. Educational Testing Service (2018) Graduate record examination (ETS, Ewing, NJ). Available at https://www.ets.org/gre/revised_general/about/?WT.ac=grehome_greabout_a_180410. Accessed December 20, 2018.
4. Educational Testing Service (2018) Graduate record examination subject test (ETS, Ewing, NJ). Available at https://www.ets.org/gre/subject/about/?WT.ac=grehome_gresubject_180410. Accessed December 20, 2018.
5. Schleicher A (2013) Beyond PISA 2015: A longer-term strategy of PISA. Available at https://www.oecd.org/pisa/pisaproducts/Longer-term-strategy-of-PISA.pdf. Accessed December 20, 2018.
6. Reyna VF, Nelson WL, Han PK, Dieckmann NF (2009) How numeracy influences risk comprehension and medical decision making. Psychol Bull 135:943–973.
7. OECD (2018) Programme for international student assessment. Available at www.oecd.org/pisa/. Accessed December 20, 2018.
8. OECD (2013) PISA 2015 Draft Mathematics Framework (OECD Publishing, Paris).
9. Fransecky RB, Debes JL (1972) Visual Literacy: A Way to Learn–A Way to Teach (Association for Educational Communications and Technology, Washington, DC).
10. Ausburn LJ, Ausburn FB (1978) Visual literacy: Background, theory and practice. Program Learn Educ Technol 15:291–297.
11. Association of College and Research Libraries (2018) ACRL visual literacy competency standards for higher education (American Library Association). Available at www.ala.org/acrl/standards/visualliteracy. Accessed December 20, 2018.
12. Hattwig D, Burgess J, Bussert K, Medaille A (2013) Visual literacy standards in higher education: New opportunities for libraries and student learning. Portal 13:61–89.
13. Avgerinou MD (2007) Towards a visual literacy index. J Vis Lit 27:29–46.
14. Taylor C (2003) New kinds of literacy, and the world of visual information. Proceedings of the Explanatory & Instructional Graphics and Visual Information Literacy Workshop. Available at www.conradiator.com/resources/pdf/literacies4eigvil_ct2003.pdf. Accessed January 6, 2019.
15. Boy J, Rensink RA, Bertini E, Fekete J-D (2014) A principled way of assessing visualization literacy. IEEE Trans Vis Comput Graph 20:1963–1972.
16. Lee S, Kim S-H, Kwon BC (2017) VLAT: Development of a visualization literacy assessment test. IEEE Trans Vis Comput Graph 23:551–560.
17. Börner K, Maltese A, Balliet RN, Heimlich J (2016) Investigating aspects of data visualization literacy using 20 information visualizations and 273 science museum visitors. Inf Vis 15:198–213.
18. Zoss A (2018) Network visualization literacy: Task, context, and layout. PhD thesis (Indiana University, Bloomington, IN).
19. Maltese AV, Harsh JA, Svetina D (2015) Data visualization literacy: Investigating data interpretation along the novice–expert continuum. J Coll Sci Teach 45:84–90.
20. Huron S, Carpendale S, Thudt A, Tang A, Mauerer M (2014) Constructive visualization. Proceedings of the 2014 Conference on Designing Interactive Systems (ACM, New York), pp 433–442.
21. Huron S, Jansen Y, Carpendale S (2014) Constructing visual representations: Investigating the use of tangible tokens. IEEE Trans Vis Comput Graph 20:2102–2111.
22. Alper B, Riche NH, Chevalier F, Boy J, Sezgin M (2017) Visualization literacy at elementary school. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (ACM, New York), pp 5485–5497.
23. Chevalier F, et al. (2018) Observations and reflections on visualization literacy in elementary school. IEEE Comput Graph Appl 38:21–29.
24. Bortz D (2017) The top data skills you need to get hired. Available at https://www.monster.com/career-advice/article/top-data-skills-0617. Accessed December 20, 2018.
25. Bertin J (1983) Semiology of Graphics: Diagrams, Networks, Maps (Esri Press, Seattle).
26. Cairo A (2012) The Functional Art (New Riders, San Francisco).
27. Few S (2012) Show Me the Numbers: Designing Tables and Graphs to Enlighten (Analytics, Oakland, CA).
28. Harris RL (1999) Information Graphics: A Comprehensive Illustrated Reference (Oxford Univ Press, New York).
29. MacEachren AM (2004) How Maps Work: Representation, Visualization, and Design (Guilford, New York).
30. Munzner T (2014) Visualization Analysis and Design (CRC, Boca Raton, FL).
31. Wilkinson L (2005) The Grammar of Graphics (Statistics and Computing) (Springer Science + Business Media, New York).
32. Shneiderman B (1996) The eyes have it: A task by data type taxonomy for information visualizations. Proceedings of the IEEE Symposium on Visual Languages (IEEE Computer Society, Washington, DC), pp 336–343.
33. Börner K (2015) Atlas of Knowledge: Anyone Can Map (MIT Press, Cambridge, MA).
34. Ward MO, Grinstein G, Keim D (2015) Interactive Data Visualization: Foundations, Techniques, and Applications (A. K. Peters Ltd., Natick, MA).
35. StataCorp (2018) Stata (15). Available at https://www.stata.com. Accessed December 20, 2018.
36. Wickham H (2010) A layered grammar of graphics. J Comput Graph Stat 19:3–28.
37. Yau N (2011) Visualize This: The FlowingData Guide to Design, Visualization, and Statistics (Wiley, Indianapolis).
38. Fisher D, Meyer M (2018) Making Data Visual: A Practical Guide to Using Visualization for Insight (O’Reilly Media, Sebastopol, CA).
39. Stevens SS (1946) On the theory of scales of measurement. Science 103:677–680.
40. Kruskal JB, Shepard RN (1974) A nonmetric variety of linear factor analysis. Psychometrika 39:123–157.
41. Kim Y, Heer J (2018) Assessing effects of task and data distribution on the effectiveness of visual encodings. Comput Graph Forum 37:157–167.
42. Börner K, Polley DE (2014) Visual Insights: A Practical Guide to Making Sense of Data (MIT Press, Cambridge, MA).
43. Börner K, et al. (2012) Design and update of a classification system: The UCSD map of science. PLoS One 7:e39464.
44. Saket B, Scheidegger C, Kobourov S, Börner K (2015) Map-Based Visualizations Increase Recall Accuracy of Data (EUROGRAPHICS, Zurich), pp 441–450.
45. Pinker S (1990) A theory of graph comprehension. Artificial Intelligence and the Future of Testing, ed Freedle R (L. Erlbaum Associates, Hillsdale, NJ), pp 73–126.
46. Kosslyn SM, Ball TM, Reiser BJ (1978) Visual images preserve metric spatial information: Evidence from studies of image scanning. J Exp Psychol Hum Percept Perform 4:47–60.
47. Shah P (1997) A model of the cognitive and perceptual processes in graphical display comprehension. Reasoning with Diagrammatic Representations, ed Anderson M (AAAI Press, Menlo Park, CA), pp 94–101.
48. Trickett SB, Trafton JG (2006) Toward a comprehensive model of graph comprehension: Making the case for spatial cognition. Proceedings of the International Conference on Theory and Application of Diagrams (Springer, Berlin), pp 286–300.
49. Horn RE (1998) Visual Language: Global Communication for the 21st Century (MacroVU Inc., Bainbridge Island, WA).
50. ColorBrewer Team (2018) ColorBrewer 2.0 (Cynthia Brewer). Available at colorbrewer2.org. Accessed December 20, 2018.
51. Cleveland WS, McGill R (1984) Graphical perception: Theory, experimentation, and application to the development of graphical methods. J Am Stat Assoc 79:531–554.
52. Heer J, Bostock M (2010) Crowdsourcing graphical perception: Using Mechanical Turk to assess visualization design. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (ACM, New York), pp 203–212.
53. Szafir DA, Haroz S, Gleicher M, Franconeri S (2016) Four types of ensemble coding in data visualizations. J Vis 16:11.
54. Keim DA (2001) Visual exploration of large data sets. Commun ACM 44:38–44.
55. Brehmer M, Munzner T (2013) A multi-level typology of abstract visualization tasks. IEEE Trans Vis Comput Graph 19:2376–2385.
56. Heer J, Shneiderman B (2012) Interactive dynamics for visual analysis. Commun ACM 55:45–54.
57. Pirolli P, Card S (2005) The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. Proceedings of the International Conference on Intelligence Analysis. Available at https://www.e-education.psu.edu/geog885/sites/www.e-education.psu.edu.geog885/files/geog885q/file/Lesson_02/Sense_Making_206_Camera_Ready_Paper.pdf. Accessed January 6, 2019.
58. Klein G, Moon B, Hoffman RR (2006) Making sense of sensemaking 2: A macrocognitive model. IEEE Intell Syst 21:88–92.
59. Lee S, et al. (2016) How do people make sense of unfamiliar visualizations? A grounded model of novice’s information visualization sensemaking. IEEE Trans Vis Comput Graph 22:499–508.
60. Mackinlay J (1986) Automating the design of graphical presentations of relational information. ACM Trans Graph 5:110–141.
61. Grammel L, Tory M, Storey M-A (2010) How information visualization novices construct visualizations. IEEE Trans Vis Comput Graph 16:943–952.
62. Wongsuphasawat K, et al. (2016) Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE Trans Vis Comput Graph 22:649–658.
63. Satyanarayan A, Moritz D, Wongsuphasawat K, Heer J (2017) Vega-lite: A grammar of interactive graphics. IEEE Trans Vis Comput Graph 23:341–350.
64. Friel SN, Curcio FR, Bright GW (2001) Making sense of graphs: Critical factors influencing comprehension and instructional implications. J Res Math Educ 32:124–158.
65. Sci2 Team (2009) Science of Science (Sci2) Tool (1.3; Indiana University and SciTech Strategies). Available at https://sci2.cns.iu.edu/user/index.php. Accessed December 20, 2018.
66. Cleveland WS, McGill R (1984) The many faces of a scatterplot. J Am Stat Assoc 79:807–822.
67. Wainer H (1980) A test of graphicacy in children. Appl Psychol Meas 4:331–340.
68. Rogers Y, Marshall P (2017) Research in the wild. Synth Lect Hum Cent Inf 10:i–97.
69. National Academies of Sciences, Engineering, and Medicine (2018) How People Learn II: Learners, Contexts, and Cultures (Natl Acad Press, Washington, DC).
70. Wojton MA, Palmquist S, Yocco V, Heimlich JE (2014) Making meaning through data representation: Construction and deconstruction, Evaluation Reports 41. Available at www.informalscience.org/meaning-making-through-data-representation-construction-and-deconstruction. Accessed December 20, 2018.
71. National Governors Association Center for Best Practices and Council of Chief State School Officers (2010) Common Core State Standards for Mathematics (NGA and CCSSO, Washington, DC).
72. NGSS Lead States (2013) Next Generation Science Standards: For States, by States (Natl Acad Press, Washington, DC).

