
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015

Task-Based Effectiveness of Basic Visualizations


Bahador Saket, Alex Endert, and Çağatay Demiralp

Abstract—Visualizations of tabular data are widely used; understanding their effectiveness in different task and data contexts is
fundamental to scaling their impact. However, little is known about how basic tabular data visualizations perform across varying data
analysis tasks. In this paper, we report results from a crowdsourced experiment to evaluate the effectiveness of five small-scale (5-34
data points) two-dimensional visualization types—Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart—across ten common data
analysis tasks using two datasets. We find that the effectiveness of these visualization types varies significantly across tasks, suggesting
that visualization design would benefit from considering context-dependent effectiveness. Based on our findings, we derive
recommendations on which visualizations to choose for different tasks. We finally train a decision tree on the data we collected to drive a
recommender, showcasing how to effectively engineer experimental user data into practical visualization systems.
arXiv:1709.08546v3 [cs.HC] 24 Apr 2018

Index Terms—Information Visualization, Visualization Types, Visualization Effectiveness, Graphical Perception

1 INTRODUCTION

VISUALIZATIONS aim to enhance understanding of underlying data by leveraging visual perception, evolved for fast pattern detection and recognition. Understanding the effectiveness of a given visualization in achieving this goal is a fundamental pursuit in visualization research and has important implications in practice.

A large body of prior research evaluated the general effectiveness of different visualization types [7], [8], [13], [16], [17], [21], [35]. Guidelines and insights derived from these earlier studies have significant influence on data visualization today. However, these studies were conducted under conditions that were inconsistent across studies, with varying sample sizes, a limited number of tasks, and different datasets. Research indicates, however, that the effectiveness of a visualization depends on several factors, including the task at hand [1] and the data attributes and datasets visualized [31]. For example, while one chart might be suitable for answering a specific type of question (e.g., to check whether there is a correlation between two data attributes), it might not be appropriate for other types (e.g., to find a data point with the highest value). Yet, we know little about how some of the basic visualizations perform across different visual analysis tasks.

In this paper, we conducted a crowdsourced study to evaluate the effectiveness of five small-scale (5-34 data points) two-dimensional visualization types (Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart) across 10 different visual analysis tasks [1] on two different datasets (Cars and Movies). Our results indicate that the effectiveness of these visualization types often varies significantly across tasks. For example, while pie charts are one of the most effective visualizations for finding the extremum value, they are less effective for finding the correlation between two data attributes. We also asked participants to rank the five different visualization types in the order of their preference for performing each task. We found a positive correlation between accuracy and user preference, indicating that people prefer visualizations that allow them to accurately complete a task.

There is renewed interest (e.g., [4], [19], [24], [29], [41], [42]) in visualization recommendation systems that aim to shift some of the burden of visualization design and exploration decisions from users to algorithms. Our results can be used to improve visualization recommendation systems moving forward. In particular, our findings from the current study inform the ongoing design and development of Foresight [11] at IBM. We envision creating a recommendation engine that suggests visualizations based on user-specified tasks. To this end, we develop Kopol¹, a prototype visualization recommender. A decision tree model is trained on the user data and then used by Kopol to provide ranked recommendations for a given task and data type. This model takes into account performance time, accuracy, and user preference. One relevant application area of such a recommendation engine is natural language interfaces for data visualization (e.g., [37], [38]). In such interfaces, people tend to specify tasks as a part of their queries (e.g., "Is there a correlation between price and width of the cars in this dataset?"). Such an engine can be used to suggest more effective visualizations given the task context.

• Bahador Saket and Alex Endert are with Georgia Tech. E-mail: {saket, endert}@gatech.edu.
• Çağatay Demiralp is with IBM Research. E-mail: [email protected]

Manuscript received April 19, 2005; revised August 26, 2015.

1. https://kopoljs.github.io/
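The Kopol recommender described above trains a decision tree on the collected study data. The sketch below is not that model: it substitutes a simple weighted-score ranking over invented records (all numbers and weights here are made up) to illustrate how task-conditioned measurements of accuracy, time, and preference can be engineered into ranked visualization recommendations.

```python
from collections import defaultdict

# Hypothetical study records: (task, visualization, mean accuracy in [0, 1],
# completion time in seconds, preference rank from 1 (least) to 5 (most)).
records = [
    ("find_extremum", "bar_chart",   0.93, 10.5, 5),
    ("find_extremum", "pie_chart",   0.95, 12.0, 4),
    ("find_extremum", "line_chart",  0.70, 21.0, 2),
    ("correlation",   "scatterplot", 0.85, 15.0, 5),
    ("correlation",   "line_chart",  0.88, 18.0, 4),
    ("correlation",   "pie_chart",   0.40, 30.0, 1),
]

def recommend(task, records, w_acc=0.5, w_time=0.3, w_pref=0.2):
    """Rank visualization types for a task by a weighted combination of
    mean accuracy, speed (inverse time), and user preference."""
    by_vis = defaultdict(list)
    for t, vis, acc, sec, pref in records:
        if t == task:
            by_vis[vis].append((acc, sec, pref))
    scores = {}
    for vis, rows in by_vis.items():
        n = len(rows)
        acc = sum(r[0] for r in rows) / n
        sec = sum(r[1] for r in rows) / n
        pref = sum(r[2] for r in rows) / n
        # Faster is better, so time enters inverted; preference is scaled to [0, 1].
        scores[vis] = w_acc * acc + w_time * (10.0 / sec) + w_pref * (pref / 5.0)
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("correlation", records))  # ['scatterplot', 'line_chart', 'pie_chart']
```

A decision tree trained on such records (as Kopol does) would additionally generalize across data attribute types rather than looking up each task independently.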
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 2

Eells [13] investigated the effectiveness of a proportional comparison (percentage estimation) task in divided (stacked) bar charts and pie charts. Eells asked participants to estimate the proportions in pie charts and bar charts. He found pie charts to be as fast as and more accurate than bar charts for proportional comparison tasks. He also found that as the number of components increases, divided bar charts become less accurate but pie charts become more accurate (a maximum of five components was considered). In a follow-up study with a different setting, Croxton and Stryker [8] also tested the effectiveness of divided bar charts and pie charts using a proportional comparison task. They also found pie charts to be more accurate than divided bar charts in most cases, but contrary to Eells' study, not all.

Spence et al. [35] studied the effectiveness of bar charts, tables, and pie charts. They found that when participants were asked to compare combinations of proportions, pie charts outperformed bar charts. Their results also show that for tasks where participants were asked to retrieve the exact value of proportions, tables outperform pie charts and bar charts. In another study comparing the effectiveness of bar charts and line charts, Zacks and Tversky [43] indicated that when participants were shown these two types of visualizations and asked to describe the data, they consistently used bar charts to reference the compared values (e.g., A is 10% greater than B), whereas with line charts, participants described trends.

The study by Siegrist [32] was one of the first to compare 2D with 3D visualizations. Siegrist found no significant difference between 2D and 3D bar charts in terms of accuracy. However, participants using 3D bar charts took slightly longer to perform tasks. In addition, Siegrist found that the accuracy of perceiving 3D pie charts is significantly lower than for 2D ones, probably because some of the slices in the 3D pie charts are more obscured. Harrison et al. [17] measured the effectiveness of different visualizations for explaining correlation, finding that parallel coordinates and scatterplots are best at showing correlation. They also found that stacked bar charts outperform stacked area and stacked line charts. In a follow-up study, Kay and Heer [20] reanalyzed the data collected by Harrison et al. [17]. The top-ranking visualization remained the same.

While these independent studies provide helpful generic guidelines, they were conducted under different conditions, with varying sample sizes and datasets, and for a dispersed set of tasks. In fact, several of these studies used manually created visualizations in their experiments without using actual datasets [8], [13], [35], [43] or created visualizations using artificial datasets [17]. Also, these earlier studies typically conducted experiments using atomic generic tasks such as comparison of data values (e.g., [8], [43]) or estimation of proportions (e.g., [13], [34], [35]). However, many visual analysis tasks (e.g., filtering, finding clusters) require integration of results from multiple atomic tasks, limiting the applicability of earlier findings [1], [2]. Inconsistency in experimental settings and the limited atomic tasks used in previous work encourage studying the effectiveness of visualization types for a larger spectrum of tasks in a more consistent setting.

3 STUDY DESIGN

When deciding which visualization types to include in our experiment, we balanced the familiarity of the visualizations considered with the comprehensiveness of the experiment. On the one hand, we would like to have more generalizable results, which suggested considering a broad set of visualization techniques in our experiment. At the same time, we would like our study to have members of the general public as our participants; this would suggest including a set of visualization techniques that are understandable by all participants. Building on previous work [23] and investigations of the visualization techniques supported by different visualization tools (e.g., Microsoft Excel, Tableau, Spotfire, QlikView, Adobe Analytics, IBM Watson Analytics), we decided to include five well-recognized visualization techniques in our study: Bar Chart, Line Chart, Scatterplot, Table, and Pie Chart (see Figure 1).

3.1 Datasets

Selecting Datasets: To create visualizations for our experiment, we selected datasets where the participants were unfamiliar with the content but familiar with the meaning of the data attributes used in the dataset. This is particularly important since we did not want user performance to be affected by how familiar participants are with the meaning of the data attributes.

We first selected five different datasets: Cereals [28], Cars [18], Movies [10], Summer Olympics Medalists [10], and University Professors [28]. We then printed a part of each dataset on paper and showed them to six pilot participants (4 male, 2 female). We asked participants, "Please look at the data attributes used in each of these datasets. Which datasets do you feel contain data attributes that you are more familiar with?" The Cars and Movies datasets were the ones that five out of the six participants selected. The Cars dataset [18] provides details for 407 new cars and the Movies dataset [10] provides details for 335 movies released from 2007 to 2012.

Data Attribute Types: Both datasets include data attributes of Nominal, Ordinal, and Numerical types. We define the Nominal data attribute type as categorically discrete data, such as types of cars (e.g., Sedan, SUV, Wagon). Ordinal is defined as quantities within a specific range that have a natural ordering, such as ratings of movies (the number of unique data values ranged from 6 to 12). We define Numerical as continuous numerical data, such as profit values of movies. We generated visualizations using pairwise combinations of all three types of data attributes available in our datasets (e.g., Nominal * Numerical or Ordinal * Numerical).

Data Sampling: During our informal pilot study, we generated visualizations representing different numbers of data points, ranging from 50 to 300 in increments of 50 data points. In these visualizations, each visual mark (e.g., a circle or a bar) represented a data point. We noticed our pilot participants faced two challenges using static visualizations containing more than 50 visual marks. First, participants had difficulties performing some of the tasks (e.g., compute derived value and characterize distribution) using static visualizations (error rates increased and in some cases participants gave up). In addition, in some cases participants had to spend more than two minutes performing the tasks. Due to practical limitations of conducting the study (e.g., length and complexity of the experiment) with a high number of visual marks, we decided not to show more than 50 visual marks at a time. We had two options for not showing all the data points in our datasets.

First, we could pick a subset of data points and create visualizations using only that subset. In that case, each visual mark would represent a data point in our dataset. We could then create a bar chart showing manufacturers on the x-axis and price on the y-axis. In this case, each bar represents a data point/car and the y-axis is the absolute price value for each car.
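A minimal sketch of this first option, subsampling so that each visual mark still corresponds to one underlying data point, might look as follows (the rows and field names here are invented stand-ins for the Cars dataset):

```python
import random

def subsample(rows, max_marks=50, seed=0):
    """Cap the number of visual marks by drawing a fixed-size random
    subset; each remaining mark is still one real data point."""
    if len(rows) <= max_marks:
        return list(rows)
    return random.Random(seed).sample(rows, max_marks)

# Hypothetical stand-in for the Cars dataset (407 rows in the paper).
cars = [{"manufacturer": f"M{i % 8}", "price": 15000 + 100 * i}
        for i in range(407)]
subset = subsample(cars)  # 50 rows; one bar per car in a bar chart
```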
Fig. 1. Five visualization types used in this study. In this figure, each visualization shows the average highway miles per gallon (a numerical data attribute) for cars with different numbers of cylinders (an ordinal data attribute).
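The aggregation behind Figure 1 (one mark per category, showing the mean of a numerical attribute) can be sketched as follows; the rows and field names here are invented:

```python
from collections import defaultdict

def aggregate_mean(rows, key, value):
    """One mark per category of `key`: the number of marks equals the
    attribute's cardinality, and each mark shows the mean of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: sum(vs) / len(vs) for k, vs in groups.items()}

cars = [
    {"cylinders": 4, "highway_mpg": 34},
    {"cylinders": 4, "highway_mpg": 30},
    {"cylinders": 8, "highway_mpg": 20},
]
print(aggregate_mean(cars, "cylinders", "highway_mpg"))  # {4: 32.0, 8: 20.0}
```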

Second, we could use the cardinality of the data attributes to define how many visual marks (e.g., bars in a bar chart) should be shown in a visualization. For example, imagine a bar chart that has manufacturers on the x-axis and price on the y-axis. In this case, we show 8 bars, each representing a manufacturer (e.g., Toyota, BMW, etc.), with the average price for each car manufacturer on the y-axis. Thus, glyphs do not represent the data points, but the cardinality of the paired data attribute. This approach requires us to have an averaged data attribute on one of the axes (e.g., average price for different manufacturers). The cardinality of data attributes that were less than 50 ranged from 5 (minimum number of visual marks) to 34 (maximum number of visual marks). In our study design, we went with this second approach.

3.2 Tasks

We selected the tasks for our study based on two considerations. First, tasks should be drawn from those commonly encountered while analyzing tabular data. Second, the tasks should be present in existing task taxonomies and often used in other studies to evaluate visualizations.

Previously, Amar et al. [1] proposed a set of ten low-level analysis tasks that describe users' activities while using visualization tools to understand their data. First, these tasks are real-world tasks because users came up with them while exploring five different datasets with different visualization tools. Second, different studies have used these tasks to evaluate the effectiveness of visualizations. With this in mind, we used the low-level taxonomy by Amar et al. [1], described below.

Find Anomalies. We asked participants to identify any anomalies within a given set of data points with respect to a given relationship or expectation. We crafted these anomalies manually so that, once noticed, it would be straightforward to verify that the observed value was inconsistent with what would normally be present in the data (e.g., movies with zero or negative length would be considered abnormal). For example, which genre of movies appears to have abnormal length?

Find Clusters. For a given set of data points, we asked participants to count the number of groups of similar data attribute values. For example, how many different genres are shown in the chart below?

Find Correlation. For a given set of two data attributes, we asked participants to determine if there is a correlation between them. To verify the responses to correlation tasks, we computed Pearson's correlation coefficient (r) to ensure that there was a strong correlation (r ≤ −0.7 or r ≥ 0.7) between the two data attributes. For example, is there a strong correlation between average budget and movie rating?

Compute Derived Value. For a given set of data points, we asked participants to compute an aggregate value of those data points. For example, what is the sum of the budget for the action and the sci-fi movies?

Characterize Distribution. For a given set of data points and an attribute of interest, we asked participants to identify the distribution of that attribute's values over the set. For example, what percentage of the movie genres have an average gross value higher than 10 million?

Find Extremum. For this task, we asked participants to find data points having an extreme value of a data attribute. For example, which car has the highest number of cylinders?

Filter. For given concrete conditions on data attribute values, we asked participants to find data points satisfying those conditions. For example, which car types have city miles per gallon ranging from 25 to 56?

Order. For a given set of data points, we asked participants to rank them according to a specific ordinal metric. For example, which of the following options contains the correct sequence of movie genres, if you were to put them in order from largest average gross value to lowest?

Determine Range. For a given set of data points and an attribute of interest, we asked participants to find the span of values within the set. For example, what is the range of car prices?

Retrieve Value. For this task, we asked participants to identify values of attributes for given data points. For example, what is the value of horsepower for the cars?

3.3 Visualization Design

To generate visualizations, we used three pairwise combinations of the three data attribute types available in our datasets. In particular, we used Nominal * Numerical, Ordinal * Numerical, and Numerical * Numerical. We did not include Nominal * Nominal because it is not possible to represent this combination using all five visualizations considered in this study (e.g., line chart).

To create Scatterplots, Bar Charts, and Line Charts, we used the same length, font size, and color to draw their x-y axes. In addition, all the visual elements (e.g., bars in a bar chart) used in the three charts had the same blue color. Unlike the other visualizations, pie charts do not have any axis to read values from. That is, to create Pie Charts we had to make design decisions on how to show the values of the two data attributes used to generate them. The main design decision that we had to make for Pie Charts was whether to include legends. Instead of having legends, we could potentially add labels on top of the slices of Pie Charts. We tried putting the labels on top of the slices, but this caused visual clutter, particularly in cases where the labels were long. Additionally, using legends for Pie Charts is common practice in the majority of commercial visualization dashboards [36], [39]. We decided not to show any value on top of the slices of Pie Charts, instead showing the values of one data attribute using a legend and the other beside the slices. For Tables, we separated the rows of the table using light gray lines. We used a darker background color to make the labels (the two data attributes used for creating the table) distinguishable. See Figure 1 for more details.
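The strong-correlation screening described under Find Correlation in Section 3.2 can be reproduced in a few lines of pure Python; this is a generic Pearson computation, not the authors' code:

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r between two data attributes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def strongly_correlated(xs, ys, threshold=0.7):
    """Screening rule from Section 3.2: keep attribute pairs with
    r <= -0.7 or r >= 0.7."""
    return abs(pearson_r(xs, ys)) >= threshold
```

For example, an attribute pair such as (budget, gross) would only be used in a Correlation trial if `strongly_correlated` returns True for its values.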

The following chart shows the average Highway Miles Per Gallon for 5 types of cars (e.g. at least 100 approved HITs as a quality check. We implemented
Sedan and SUV). What is the value of Highway Miles Per Gallon for the type Wagon?
our experiment as a web application hosted on a server external
About 26
About 32 to MTurk. Participants accessed the experiment through a URL
About 20 link posted on the MTurk site. Each worker could participate in
About 15
Sumbit
Highway Miles Per Gallon our study only once. The study took about 25 to 40 minutes to
25
complete and we compensated the workers who participated $4.
20 In order to determine the minimum number of participants
15 needed for our study, we first conducted a pilot study with
10
50 participants on Amazon’s Mechanical Turk. Based on the
data collected from our pilot study, we conducted a statistical
5
power analysis to ensure that our experiment included enough
0
Sedan SUV Sports Car Wagon Minivan
participants to reliably detect meaningful performance differences
Type across independent variables of the experiment. Our power analysis,
(a) Retrieve Value Task
based on the results of the pilot study, indicated that at least 160
participants would be required to detect a large effect.
The following chart shows the average budget for movies with different ratings. Movies After determining the number of subjects required to participate
with what ratings have the budget ranging from 115 to 190? in our study, we recruited 203 workers to participate in our
7, 9 study. Among the 203 who participated in our study 180 of
9, 4
7, 9, 6 them (105 Male, 75 Female) completed the study. The age of
6, 8, 9
our workers ranged from 25–40 years. All workers participated in
Sumbit
our experiment were based in the United States and have used vi-
sualizations before. 107 of the participants had experience creating
visualizations using Microsoft Excel. Five of the participants also
had experience in creating visualizations using Tableau software.

4.2 Procedure
Training. Before starting the main experiment, participants were
briefed about the purpose of the study and their rights. At
(b) Determine Range Task this stage, the participants were also asked to answer to some
demographic questions (e.g., age, sex, and prior experience in
The following chart shows the rating for movies with different genres. Which genre of creating visualizations). Participants were then asked to perform
movie has the highest rating?
5 trial questions (one question per visualization) as quickly and
horror
biography accurately as possible. Trial questions were presented in a random
drama order. For each participant, the training questions were a randomly
comedy
Sumbit
ordered set of these five questions. During this session, after
answering each question participants received feedback that showed
Rating

the correctness of their answers. To prevent the participants from


skipping the training questions, participants were not able to move
to the next question unless they answered the question correctly.
Main Experiment. During the main experiment 180 partici-
pants were randomly assigned to 10 tasks (18 participants per
Genre task). So, each participant performed questions designed for
one type of task. For each type of task, we had 30 questions
(c) Find Extremum
(5 Visualizations × 2 Datasets × 3 Trials). As recommended by
previous work [25], we also designed two additional questions
Fig. 2. Screenshots of three of the trials used in this experiment. Each of to detect if a participant answered the questions randomly. These
the trials asks users to perform a specific task.
two questions were straightforward and designed to make sure
that participants read the questions. Questions were presented in
a random order to prevent participants from extrapolating new
4 U SER E XPERIMENT
judgments from previous ones. We counterbalanced the difficulty
In this section, we explain the details of the experiment. We make (number of visual marks shown in a visualization) of the questions
all the relevant materials for our analysis publicly available2 . for each visualization type. Screenshots of the questions for the
main experiment are shown in Figure 2. More task screenshots are
provided in our supplemental materials3 ).
4.1 Experimental Platform & Participants
Follow Up Questions. After completing the main experiment, the
We conducted our experiment by posting it as a job, Human participants were asked to perform 6 additional ranking questions
Intelligence Task (HIT), on Amazon’s Mechanical Turk (MTurk). (3 Trials × 2 Datasets). In each ranking question the participants
To be able to participate in our study, MTurk workers (who perform were asked to rank the five different visualizations in the order
tasks posted on MTurk), had to have an approval rate of 95% and of their preference for performing this task. Before finishing the

2. https://github.com/gtvalab/ChartsEffectiveness 3. https://github.com/gtvalab/ChartsEffectiveness

[Figure 3 appears here: for each of the ten tasks (Find Anomalies, Find Clusters, Correlation, Derived Value, Distribution, Find Extremum, Order, Retrieve Value, Filter, Determine Range), panels for Accuracy, Time, and User Preference show pairwise significance relations among Scatterplot, Table, Bar Chart, Line Chart, and Pie Chart.]

Fig. 3. Pairwise relation between visualization types across tasks and performance metrics. Arrows show that the source is significantly better than the target.

experiment, we asked participants to "Please enter the criteria you used for ranking the charts along with any other additional comments you have about the experiment in general". This was to allow the participants to convey their feedback and to solicit potentially unexpected insights.

Questions (training questions, main experiment questions, and ranking questions) were pre-generated in an iterative process by all three authors in multiple sessions. After each session, we conducted a pilot study to extract the major problems with the designed questions. We had two criteria while designing questions for our experiment. First, for our study to have a reasonable length and complexity, we had to design questions with a reasonable level of difficulty; questions with a high level of difficulty could frustrate the participants. To define difficulty, we designed the questions so that the average response time for a single question is in the range of 10 to 40 seconds. Second, questions had to be balanced across the different datasets and present comparable values. For example, if a categorical attribute in the Movies dataset had five categories, we tried to pick a variable from the Cars dataset that also had five (or around five) categories.

4.3 Data Analysis

To analyze the differences among the various visualizations, for each participant we calculated mean performance values for each task and visualization type. That is, we averaged the time and accuracy of questions for each visualization type and task. Before testing, we checked that the collected data met the assumptions of the appropriate statistical tests. The assumption of normality was not satisfied for performance time; however, normality was satisfied for a log transformation of the time values. So, we treated log-transformed values as our time measurements. We conducted a repeated-measures analysis of variance (ANOVA) for each task independently to test for differences among the various visualizations, datasets, and their interactions with one another. While the Visualization had significant effects on both accuracy and time, the Dataset had no significant effect on accuracy or time.

5 RESULTS

We first give an overview of our analysis of the results and then discuss them in detail for each task. We provide a detailed analysis of the results in Table 1. Throughout the following sections, accuracy refers to values in percentages (%) and time refers to values in seconds.

Results, aggregated over tasks and datasets, show that Bar Chart is the fastest and the most accurate visualization type. This result is in line with prior work on graphical perception showing that people can decode values encoded with length faster than other encodings such as angle or volume [5], [33], [40]. Conversely, Line Chart has the lowest aggregate accuracy and speed. However, Line Chart is significantly more accurate than the other charts for the Correlation and Distribution tasks. This finding concurs with earlier research reporting the effectiveness of line charts for trend-finding tasks (e.g., [43]). Nonetheless, the overall low performance of Line Chart is surprising and, for some tasks, can be attributed to the fact that the axis values ("ticks") were drawn at intervals. This makes it difficult to precisely identify the value for a specific data point. While Pie Chart is comparably as accurate and fast as Bar Chart and Table for the Retrieve, Range, Order, Filter, Extremum, Derived, and Cluster tasks, it is less accurate for the Correlation, Anomalies, and Distribution tasks. Pie Chart is the fastest visualization for performing the Cluster task. The high performance of Pie Chart

TABLE 1
This figure shows performance results for 10 different tasks. Performance results for each task are shown using three sub-charts. Mean accuracy
results are shown on the left (mean accuracy is measured in percentage), mean time results are shown in the middle, and user preferences/rankings
are shown at the right (1 shows least preferred and 5 shows the most preferred). Statistical test results are also shown below the charts. All tests
display 95% confidence intervals and are Bonferroni-corrected.

Find Anomalies
Ranking — Accuracy (highest first): Scatterplot, Bar Chart, Table, Pie Chart, Line Chart. Time (fastest first): Bar Chart, Scatterplot, Pie Chart, Table, Line Chart. Preference (most preferred first): Scatterplot, Bar Chart, Table, Line Chart, Pie Chart.
Accuracy: F(3.4,4915.1) = 3.03, p < 0.05, ηp2 = 0.15. Results of Bonferroni-corrected post-hoc comparisons showed that Line Chart was significantly less accurate than Scatterplot (p < 0.05).
Time: F(4,68) = 0.48, p < 0.05, ηp2 = 0.27. Post-hoc comparisons indicate that Bar Chart was significantly faster than Line Chart and Table (p < 0.05). This might be because people can decode values encoded with length faster than values encoded with other channels such as angle or distance [5], [33], [40].
Preference: F(3.1,45.56) = 5.9, p < 0.05, ηp2 = 0.26. Results of pairwise comparisons show that user preference for performing Anomalies tasks with Bar Chart and Scatterplot was significantly higher than for Pie Chart and Line Chart (p < 0.05).

Find Clusters
Ranking — Accuracy (highest first): Bar Chart, Pie Chart, Table, Scatterplot, Line Chart. Time (fastest first): Pie Chart, Bar Chart, Scatterplot, Table, Line Chart. Preference (most preferred first): Bar Chart, Table, Pie Chart, Scatterplot, Line Chart.
Accuracy: F(2.6,45065.1) = 60.7, p < 0.05, ηp2 = 0.78. Results of Bonferroni-corrected post-hoc comparisons show that Pie Chart and Bar Chart were significantly more accurate than the other visualizations (p < 0.05).
Time: F(3.9,67.9) = 6.9, p < 0.05, ηp2 = 0.29. Pie Chart and Bar Chart were significantly faster than Table (p < 0.05) and Line Chart (p < 0.05). We believe that uniquely coloring different slices of pie charts improved the performance of Pie Chart for this type of task.
Preference: F(2.9,188.56) = 30.2, p < 0.05, ηp2 = 0.64. User preference for Bar Chart and Table was significantly higher than for the other visualizations (p < 0.05). While the preference for Bar Chart can be explained by its high accuracy and speed, it is surprising that Table was also highly preferred by users for Cluster tasks.

Compute Derived Value
Ranking — Accuracy (highest first): Table, Bar Chart, Pie Chart, Scatterplot, Line Chart. Time (fastest first): Table, Pie Chart, Bar Chart, Scatterplot, Line Chart. Preference (most preferred first): Table, Pie Chart, Bar Chart, Scatterplot, Line Chart.
Accuracy: F(2.7,18234.2) = 16.2, p < 0.05, ηp2 = 0.49. Accuracy of Line Chart was significantly lower than that of the other four chart types (p < 0.05). On the other hand, there was no significant difference among Bar Chart, Scatterplot, Pie Chart, and Table. The high accuracy of Pie Chart may have been further helped by having text labels showing the data values.
Time: F(3.2,0.4) = 9.6, p < 0.05, ηp2 = 0.36. Table was significantly faster than Bar Chart (p < 0.05), Scatterplot (p < 0.05), and Line Chart (p < 0.05) for this type of task. The high effectiveness of Table might be because the exact values for each data point are shown in tables, so less cognitive work may be required to aggregate the values when the exact values are shown.
Preference: F(3.1,187.8) = 35.3, p < 0.05, ηp2 = 0.67. Participants' preference for Table, Pie Chart, and Bar Chart was significantly higher than for Scatterplot (p < 0.05) and Line Chart (p < 0.05).

Correlation
Ranking — Accuracy (highest first): Line Chart, Scatterplot, Bar Chart, Pie Chart, Table. Time (fastest first): Line Chart, Scatterplot, Bar Chart, Table, Pie Chart. Preference (most preferred first): Line Chart, Bar Chart, Scatterplot, Table, Pie Chart.
Accuracy: F(2.5,20528.2) = 12.1, p < 0.05, ηp2 = 0.41. Pairwise comparisons show that Line Chart and Scatterplot were significantly more accurate than the other charts (p < 0.05). Bar Chart was also significantly more accurate than Pie Chart and Table (p < 0.05).
Time: F(1,479.7) = 42.3, p < 0.05, ηp2 = 0.7. We found that Line Chart, Bar Chart and Scatterplot were significantly faster than Pie Chart and Table (p < 0.05). In fact, our results validate the findings of previous work that showed the effectiveness of scatterplots and line charts for Correlation tasks [17], [26].
Preference: F(3.6,75.2) = 13.6, p < 0.05, ηp2 = 0.44. User preference for performing Correlation tasks with Bar Chart and Line Chart was significantly higher than for Pie Chart, Scatterplot, and Table (p < 0.05).

Characterize Distribution
Ranking — Accuracy (highest first): Bar Chart, Scatterplot, Line Chart, Pie Chart, Table. Time (fastest first): Scatterplot, Bar Chart, Line Chart, Pie Chart, Table. Preference (most preferred first): Bar Chart, Table, Scatterplot, Pie Chart, Line Chart.
Accuracy: No significant main effect was found.
Time: F(4,68) = 5.6, p < 0.05, ηp2 = 0.25. Our results indicate that Scatterplot and Bar Chart are significantly faster than Pie Chart (p < 0.05) and Table (p < 0.05) for Distribution tasks. Previous work also showed the fast speed of Scatterplot for correlation tasks [17], [20].
Preference: F(2.5,20528.2) = 12.1, p < 0.05, ηp2 = 0.41. Our results indicate that participants preferred Bar Chart, Scatterplot, and Table significantly more than Pie Chart (p < 0.05) and Line Chart (p < 0.05). It is surprising that even though Table was not faster than the other four visualizations, participants highly preferred using it.

Find Extremum
Ranking — Accuracy (highest first): Scatterplot, Bar Chart, Pie Chart, Table, Line Chart. Time (fastest first): Bar Chart, Line Chart, Scatterplot, Table, Pie Chart. Preference (most preferred first): Bar Chart, Table, Scatterplot, Line Chart, Pie Chart.
Accuracy: No significant main effect was found.
Time: F(4,0.4) = 10.4, p < 0.05, ηp2 = 0.38. Bar Chart is significantly faster than Table (p < 0.05) and Pie Chart (p < 0.05). Previous work also recommends using Bar Chart in cases where readers are looking for maximum or minimum values [15].
Preference: F(2.8,89.4) = 8.2, p < 0.05, ηp2 = 0.61. There is a significant main effect of Visualization on user preference. For Extremum tasks, participants' preference for bar charts is significantly higher than for all other visualizations (p < 0.05).
Order
Ranking — Accuracy (highest first): Bar Chart, Pie Chart, Scatterplot, Table, Line Chart. Time (fastest first): Bar Chart, Line Chart, Scatterplot, Table, Pie Chart. Preference (most preferred first): Bar Chart, Table, Scatterplot, Line Chart, Pie Chart.
Accuracy: F(4,.03) = 2.6, p < 0.05, ηp2 = 0.17. Bar Chart is significantly more accurate than Line Chart (p < 0.05). We did not find a significant difference among Bar Chart, Pie Chart, Scatterplot, and Table.
Time: F(3.3,0.6) = 9.3, p < 0.05, ηp2 = 0.35. Bar Chart is significantly faster than Pie Chart (p < 0.05) and Table (p < 0.001). Line Chart is also significantly faster than Table (p < 0.05) for Order tasks. We also found that Scatterplot is significantly faster than Pie Chart (p < 0.05). The high performance of Line Chart, Scatterplot, and Bar Chart could be due to their use of length and position as primary graphical encodings; length and position are the fastest encodings to perceive [5], [6].
Preference: F(3.0,103.3) = 11.8, p < 0.05, ηp2 = 0.52. For Order tasks, users preferred Bar Chart significantly more than the other visualizations (p < 0.05). Moreover, our results indicate that user preference for Pie Chart is significantly lower than for the other visualizations. There was no significant difference in user preference between Line Chart and Scatterplot.

Retrieve Value
Accuracy: F(2.9,7114.1) = 7.7, p < 0.05, ηp2 = 0.32. Overall, Bar Chart, Table and Pie Chart were significantly more accurate than Line Chart (p < 0.05). The difference in accuracy between Scatterplot and Line Chart was not significant. We would like to mention that Pie Chart may have been further helped by having text labels showing the data values.
Time: F(3.0,52.1) = 4.34, p < 0.05, ηp2 = 0.26. Table, Pie Chart, and Bar Chart are significantly faster than Scatterplot (p < 0.05) and Line Chart (p < 0.05) for performing Retrieve tasks. Successful performance time on Retrieve tasks highly depends on readers' ability to rapidly identify the value for a certain data point. As Ehrenberg [14] points out, tables are well-suited for retrieving the numerical value of a data point when a relatively small number of data points are displayed.
Preference: F(1.5,417.2) = 47.1, p < 0.05, ηp2 = 0.73. User preference for performing Retrieve tasks with Table is significantly higher than for the other visualizations. After Table, Bar Chart is the visualization type most highly preferred by users for this type of task (p < 0.05); user preference for Bar Chart is significantly higher than for Pie Chart, Scatterplot, and Line Chart.
Filter
Ranking — Accuracy (highest first): Table, Pie Chart, Bar Chart, Line Chart, Scatterplot. Time (fastest first): Bar Chart, Table, Scatterplot, Pie Chart, Line Chart. Preference (most preferred first): Table, Bar Chart, Pie Chart, Scatterplot, Line Chart.
Accuracy: No significant main effect was found.
Time: F(2.2,210.5) = 42.2, p < 0.05, ηp2 = 0.72. Bar Chart and Table are significantly faster than the other visualizations (p < 0.05).
Preference: F(3.6,75.2) = 13.6, p < 0.05, ηp2 = 0.44. Participants' preference for Table, Bar Chart, and Pie Chart is significantly higher than for Line Chart (p < 0.05) and Scatterplot (p < 0.05) for Filter tasks.

Determine Range
Ranking — Accuracy (highest first): Bar Chart, Pie Chart, Scatterplot, Table, Line Chart. Time (fastest first): Scatterplot, Line Chart, Pie Chart, Table, Bar Chart. Preference (most preferred first): Scatterplot, Line Chart, Pie Chart, Bar Chart, Table.
Accuracy: No significant main effect was found.
Time: No significant main effect was found.
Preference: No significant main effect was found.
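The post-hoc comparisons above all use a Bonferroni correction. As a quick illustration (a sketch of the arithmetic, not the authors' analysis code), comparing five visualization types pairwise yields ten tests, so each test runs at one tenth of the familywise alpha:

```python
from itertools import combinations

# The five visualization types compared pairwise in the post-hoc tests.
visualizations = ["Table", "Line Chart", "Bar Chart", "Scatterplot", "Pie Chart"]
pairs = list(combinations(visualizations, 2))

# Bonferroni correction: divide the familywise alpha by the number of comparisons.
alpha = 0.05
adjusted_alpha = alpha / len(pairs)

print(len(pairs), adjusted_alpha)  # 10 pairwise tests, each at alpha = 0.005
```

Dividing alpha this way keeps the probability of any false positive across all ten comparisons at or below 0.05.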

for these tasks can be attributed to its relative effectiveness in conveying part-whole relations and facilitating proportional judgments, particularly when the number of data points visualized is small [13], [35]. Pie Chart may have been further helped by having colored slices with text labels showing the data values.
Overall, Scatterplot performs reasonably well in terms of both accuracy and time. For the majority of tasks, Scatterplot is among the top three most effective visualizations, and it was never the least accurate or slowest visualization for any of the tasks.
Bar Chart and Table are the two visualization types most highly preferred by participants across most of the tasks. Bar Chart is among the two top-performing visualizations for almost all tasks, so it makes sense that people prefer using Bar Chart over other visualizations. Surprisingly, while Table is relatively slow and less accurate for some of the tasks (e.g., Distribution, Anomalies), participants still prefer Table for performing these tasks. People's familiarity with tables and the ease of understanding them could have led people to prefer tables over other visualizations. To determine whether performance time and accuracy are related to user preferences, we calculated the correlation between performance time, accuracy, and user preference. We found a positive correlation between accuracy and user preference (Pearson's r(5) = 0.68, p < 0.05), indicating that people prefer visualizations that allow them to accurately complete a task. We also found a weak negative correlation between performance time and user preference (Pearson's r(5) = −0.43, p < 0.05).

6 DISCUSSION
In this section, we reflect on the results of our work more broadly with respect to information visualization.

6.1 No One Size Fits All
Depending on the task at hand, various visualizations perform differently. That is, we do not advocate generalizing the performance of a specific visualization on a particular task to every task. For example, throughout the history of graphical perception research, pie charts have been the subject of passionate arguments for and against their use [8], [13], [35]. Although the current common wisdom among visualization researchers is to avoid them, pie charts continue to be popular in everyday visualizations. Results of our study present a more nuanced view of pie charts. We found that pie charts can be as effective as other visualizations for task types such as Cluster, Extremum, Filter, Retrieve, and Range. On the other hand, our results suggest that pie charts perform poorly in Correlation and Distribution tasks.

6.2 User Preferences
Our results show that user preferences correlate with user accuracy and speed in completing tasks. Before completing the study, we asked participants to explain the criteria they used for ranking the visualizations. Some participants explicitly mentioned the perceived accuracy of the charts as one of the factors that influenced their decision while ranking visualizations. For example, one of the
participants stated: “Just by how accurate I felt my own answer was, and how easy it was to derive the answer from the graphs.” Neither accuracy nor speed appears to be the only criterion by which participants describe their individual rankings. Additionally, perceived accuracy does not always match task accuracy. We noticed that for some task types, such as Distribution and Cluster, preference for using tables and bar charts is significantly higher than for other visualizations, even though these two visualizations are not the most effective ones for these types of tasks. Interestingly, some of the participants took their familiarity with visualizations into account as one of the factors for preferring some visualizations over others. For example, one of the participants mentioned: “I just went with the ones I felt were familiar to me.” Another participant stated: “I deal with bars a lot. I know how to read them.”

6.3 Which Visualization Type to Use?
Based on our results, when time, accuracy and preference are important factors to consider, we provide the following guidelines:
G1. Use bar charts for finding clusters. Our results show that pie charts and bar charts are significantly faster and more accurate for this type of task. However, user preference for bar charts was significantly higher than for pie charts for finding clusters. Thus, bar charts have the better overall performance in terms of time, accuracy, and user preference for finding clusters.
G2. Use line charts for finding correlations. We found that line charts and scatterplots have significantly higher accuracy and speed for finding correlations. However, user preference for line charts for finding correlations was significantly higher than for scatterplots. Thus, line charts performed better in terms of time, accuracy and user preference.
G3. Use scatterplots for finding anomalies. Results of our study indicate that scatterplots have high accuracy and speed, and are highly preferred by users for this type of task.
G4. Avoid line charts for tasks that require readers to precisely identify the value of a specific data point. The low performance of line charts for some tasks, such as Derived Value and Cluster, might be attributed to the fact that the axis values (i.e., the “ticks”) were drawn at uniform intervals. This makes it difficult to precisely identify the value of a specific data point.
G5. Avoid using tables and pie charts for correlation tasks. Findings indicate that tables and pie charts are significantly less accurate, slower, and less preferred by users for this type of task.

6.4 How to engineer empirical user performance data into practical systems?
Graphical perception experiments are the work-horse of our quest to understand and improve the effectiveness of visualizations. The guidelines and heuristics that we use today in data visualization are primarily due to the accumulation of experimental results over decades. It is not, however, always possible to extract guidelines from data collected by user experiments. Even when this is possible, such derived guidelines require visualization practitioners to manually incorporate them in visualization systems. We believe machine learning models provide a practical opportunity to implicitly engineer the insights embodied by empirical performance data into visualization systems in an unbiased and rigorous manner. Kopol is a basic example of how this can be achieved. To drive Kopol, we train a decision tree on the data we collected. Kopol then uses the learned model to recommend visualizations at “test” time for given user tasks and datasets.

7 LIMITATIONS AND FUTURE WORK
Our experimental results should be interpreted in the context of the specified visualizations, tasks, and datasets. While our findings should be interpreted in the context of the specified settings and conditions, we tested the most common visualization techniques incorporated in various visualization dashboards [23], analytical tasks used in different studies [1], [2], and datasets used in various studies [10], [28]. That being said, additional studies are required to test our research questions taking into account different visualization techniques, tasks and datasets.
In this study, participants were required to perform the tasks using static visualizations. While we are aware of the importance of interactivity and the fact that interactivity could impact user experience with a specific visualization, we decided to exclude interactivity for the following reasons. First, adding interactivity increases the complexity of the study design. In fact, it would require us to take into account another set of factors, including users’ input devices such as mouse, trackpad, and touch. Moreover, we would have to take interaction design and implementation into account; for example, the implementation of each interaction varies across different input devices. Second, static visualizations are commonly used for presentation and educational purposes (e.g., visualizations used in books, newspapers, and presentations). In many of these cases, visualization consumers still need to perform a variety of tasks using static visualizations. That being said, we encourage additional studies to directly investigate the effectiveness of these visualizations taking interactivity into account.
Due to practical limitations of conducting the study using static visualizations with a large number of visual marks (e.g., the length and complexity of the experiment), the number of visual marks (e.g., bars in a bar chart, circles in a scatterplot) shown in the visualizations used in our study is restricted to between 5 and 34. We used the cardinality of the data attributes to define how many visual marks should be shown in a visualization. However, we would like to emphasize that the performance of these visualization types might change depending on the number of data points encoded by them. Our study results hold for static visualizations with between 5 and 34 visual marks. We defer investigation of how data point cardinality affects the task-based performance of visualizations to future work.
In this study, we investigated the effectiveness of five basic two-dimensional visualization types. However, some of the visualization types can be extended to more than two dimensions (e.g., line chart). Performance of these visualization types might change depending on their dimensionality. One interesting avenue of continued research is to investigate the impact of the number of dimensions represented by a visualization type on its effectiveness.

8 CONCLUSION
In this work, we report the results of a study that gathers user performance and preference data for performing ten common data analysis tasks using five small-scale (5-34 data points) two-dimensional visualization types: Table, Line Chart, Bar Chart, Scatterplot, and Pie Chart. We use two different datasets to further support the ecological validity of the results. We find that the effectiveness of the visualization types considered changes significantly from one task to another. We compile our findings into a set of recommendations to inform data visualization in practice.
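The recommender idea from Section 6.4 can be caricatured with a minimal stdlib sketch. The records, names, and scores below are hypothetical, and the model is deliberately simplified: Kopol itself trains a decision tree on the collected study data, whereas this sketch merely ranks visualizations by mean observed accuracy for a requested task:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (task, visualization, accuracy) records standing in for
# empirical performance data collected in a study like this one.
records = [
    ("Correlation", "Line Chart", 0.90),
    ("Correlation", "Scatterplot", 0.88),
    ("Correlation", "Pie Chart", 0.55),
    ("Cluster", "Bar Chart", 0.92),
    ("Cluster", "Pie Chart", 0.91),
    ("Cluster", "Line Chart", 0.40),
]

def recommend(task, records):
    """Rank visualizations for a task by mean observed accuracy, best first."""
    scores = defaultdict(list)
    for t, vis, acc in records:
        if t == task:
            scores[vis].append(acc)
    return sorted(scores, key=lambda v: mean(scores[v]), reverse=True)

print(recommend("Correlation", records))
```

A learned model such as a decision tree generalizes this lookup: it can also rank visualizations for task and data-type combinations that were not observed verbatim in the training data.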
REFERENCES
[1] R. Amar, J. Eagan, and J. Stasko. Low-level components of analytic activity in information visualization. In Proceedings of the 2005 IEEE Symposium on Information Visualization, INFOVIS '05, pages 15–, Washington, DC, USA, 2005. IEEE Computer Society.
[2] R. Amar and J. Stasko. A knowledge task-based framework for design and evaluation of information visualizations. In IEEE Symposium on Information Visualization, pages 143–150, 2004.
[3] J. Bertin. Semiology of Graphics. University of Wisconsin Press, 1983.
[4] F. Bouali, A. Guettala, and G. Venturini. VizAssist: an interactive user assistant for visual data mining. The Visual Computer, pages 1–17, 2015.
[5] W. S. Cleveland and R. McGill. Graphical perception: Theory, experimentation, and application to the development of graphical methods. Journal of the American Statistical Association, 79(387):531–554, 1984.
[6] W. S. Cleveland and R. McGill. Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716):828–833, 1985.
[7] M. Correll and M. Gleicher. Error bars considered harmful: Exploring alternate encodings for mean and error. IEEE Transactions on Visualization and Computer Graphics, 20(12):2142–2151, 2014.
[8] F. E. Croxton and R. E. Stryker. Bar charts versus circle diagrams. Journal of the American Statistical Association, 22(160):473–482, 1927.
[9] M. Dambacher, P. Haffke, D. Groß, and R. Hübner. Graphs versus numbers: How information format affects risk aversion in gambling. Judgment and Decision Making, 11(3):223, 2016.
[10] Tableau Datasets. https://public.tableau.com/s/resources, 2015.
[11] C. Demiralp, P. J. Haas, S. Parthasarathy, and T. Pedapati. Foresight: Recommending visual insights. Proc. VLDB Endow., 10(12):1937–1940, 2017.
[12] E. Dimara, A. Bezerianos, and P. Dragicevic. Conceptual and methodological issues in evaluating multidimensional visualizations for decision support. IEEE Transactions on Visualization and Computer Graphics, 24(1):749–759, Jan 2018.
[13] W. C. Eells. The relative merits of circles and bars for representing component parts. Journal of the American Statistical Association, 21(154):119–132, 1926.
[14] A. E. Ehrenberg. Data Reduction: Analysing and Interpreting Statistical Data. John Wiley and Sons, London, 1975.
[15] S. Few. Information Dashboard Design. O'Reilly, 2006.
[16] R. Garcia-Retamero and M. Galesic. Who profits from visual aids: Overcoming challenges in people's understanding of risks. Social Science & Medicine, 70(7):1019–1025, 2010.
[17] L. Harrison, F. Yang, S. Franconeri, and R. Chang. Ranking visualizations of correlation using Weber's law. IEEE Transactions on Visualization and Computer Graphics, 20(12):1943–1952, 2014.
[18] H. V. Henderson and P. F. Velleman. Building multiple regression models interactively. Biometrics, 37(2):391–411, 1981.
[19] S. Kandel, R. Parikh, A. Paepcke, J. M. Hellerstein, and J. Heer. Profiler: Integrated statistical analysis and visualization for data quality assessment. In Proceedings of the International Working Conference on Advanced Visual Interfaces, AVI '12, pages 547–554, New York, NY, USA, 2012. ACM.
[20] M. Kay and J. Heer. Beyond Weber's law: A second look at ranking visualizations of correlation. IEEE Transactions on Visualization and Computer Graphics, 22(1):469–478, Jan 2016.
[21] M. Kay, T. Kola, J. R. Hullman, and S. A. Munson. When (ish) is my bus? User-centered visualizations of uncertainty in everyday, mobile predictive systems. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 5092–5103. ACM, 2016.
[22] S. M. Kosslyn. Understanding charts and graphs. Applied Cognitive Psychology, 3(3):185–225, 1989.
[23] S. Lee, S. H. Kim, and B. C. Kwon. VLAT: Development of a visualization literacy assessment test. IEEE Transactions on Visualization and Computer Graphics, PP(99):1–1, 2016.
[24] J. Mackinlay, P. Hanrahan, and C. Stolte. Show Me: Automatic presentation for visual analysis. IEEE Transactions on Visualization and Computer Graphics, 13(6):1137–1144, Nov. 2007.
[25] J. S. Olson and W. A. Kellogg. Ways of Knowing in HCI. Springer, 2014.
[26] A. V. Pandey, J. Krause, C. Felix, J. Boy, and E. Bertini. Towards understanding human similarity perception in the analysis of large sets of scatter plots. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI '16, pages 3659–3669, New York, NY, USA, 2016. ACM.
[27] S. Pinker. A theory of graph comprehension. Artificial Intelligence and the Future of Testing, pages 73–126, 1990.
[28] UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets.html, 2016.
[29] B. Saket, H. Kim, E. T. Brown, and A. Endert. Visualization by demonstration: An interaction paradigm for visual data exploration. IEEE Transactions on Visualization & Computer Graphics, (1):331–340, 2017.
[30] B. Saket, A. Srinivasan, E. D. Ragan, and A. Endert. Evaluating interactive graphical encodings for data visualization. IEEE Transactions on Visualization and Computer Graphics, 2017.
[31] B. S. Santos. Evaluating visualization techniques and tools: What are the main issues? The AVI Workshop on Beyond Time and Errors: Novel Evaluation Methods for Information Visualization (BELIV '08), 2008.
[32] M. Siegrist. The use or misuse of three-dimensional graphs to represent lower-dimensional data. Behaviour & Information Technology, 15(2):96–100, 1996.
[33] D. Simkin and R. Hastie. An information-processing analysis of graph perception. Journal of the American Statistical Association, 82(398):454–465, 1987.
[34] D. Skau and R. Kosara. Arcs, angles, or areas: Individual data encodings in pie and donut charts. In Computer Graphics Forum, volume 35, pages 121–130. Wiley Online Library, 2016.
[35] I. Spence and S. Lewandowsky. Displaying proportions and percentages. Applied Cognitive Psychology, 5(1):61–77, 1991.
[36] SpotFire. http://www.spotfire.com, 2016.
[37] A. Srinivasan and J. T. Stasko. Natural language interfaces for data analysis with visualization: Considering what has and could be asked. In B. Kozlikova, T. Schreck, and T. Wischgoll, editors, EuroVis 2017 – Short Papers. The Eurographics Association, 2017.
[38] Y. Sun, J. Leigh, A. Johnson, and S. Lee. Articulate: A semi-automated model for translating natural language queries into meaningful visualizations. In R. Taylor, P. Boulanger, A. Krüger, and P. Olivier, editors, Smart Graphics, pages 184–195. Springer Berlin Heidelberg, 2010.
[39] Tableau. Tableau Software, http://www.tableau.com/, 2016.
[40] J. Talbot, V. Setlur, and A. Anand. Four experiments on the perception of bar charts. IEEE Transactions on Visualization and Computer Graphics, 20(12):2152–2160, Dec 2014.
[41] M. Vartak, S. Madden, A. Parameswaran, and N. Polyzotis. SeeDB: Automatically generating query visualizations. Proc. VLDB Endow., 7(13):1581–1584, Aug. 2014.
[42] K. Wongsuphasawat, D. Moritz, A. Anand, J. Mackinlay, B. Howe, and J. Heer. Voyager: Exploratory analysis via faceted browsing of visualization recommendations. IEEE Transactions on Visualization and Computer Graphics, 22(1):649–658, Jan 2016.
[43] J. Zacks and B. Tversky. Bars and lines: A study of graphic communication. Memory & Cognition, 27(6):1073–1079, 1999.

Bahador Saket is currently a Ph.D. student at Georgia Institute of Technology. His research focuses on the design of interaction techniques for visual data exploration. He is also interested in conducting experiments as a method to understand how visualizations can be used to support data analysis.

Alex Endert is an Assistant Professor in the School of Interactive Computing at Georgia Tech. He directs the Visual Analytics Lab, where he and his students explore novel user interaction techniques for visual analytics. His lab often applies these fundamental advances to domains including text analysis, intelligence analysis, cyber security, decision-making, and others. He received his Ph.D. in Computer Science at Virginia Tech in 2012.

Çağatay Demiralp is a research scientist at IBM and the co-founder and chief scientific advisor at Fitnescity. His current research focuses on two themes: 1) automating visual data exploration for scalable guided data analysis, and 2) improving the data science pipeline with interactive tools that facilitate iterative visual data and model experimentation. Before IBM, Çağatay was a postdoctoral scholar at Stanford University and a member of the Interactive Data Lab at the University of Washington. He obtained his Ph.D. from Brown University.
