Module 4

This document discusses visual encodings for arranging tables of data. It explains that spatial position is the most important visual encoding because it dominates the user's mental model. Common visual encodings like scatterplots, bar charts, and line charts are described in terms of how they arrange data spatially using techniques like separating data into regions and list alignments. Scatterplots encode two quantitative variables spatially, while bar charts encode one quantitative variable spatially against a categorical key. Stacked bar charts and dot charts are also discussed as variants that encode additional dimensions of data.


Visual Analytics

Dr. Jyotismita Chaki


Arrange Tables: Why?
• The arrange design choice covers all aspects of the use of spatial channels
for visual encoding.
• It is the most crucial visual encoding choice because the use of space
dominates the user’s mental model of the dataset.
• The three highest ranked effectiveness channels for quantitative and
ordered attributes are all related to spatial position: planar position against
a common scale, planar position along an unaligned scale, and length.
• The highest ranked effectiveness channel for categorical attributes,
grouping items within the same region, is also about the use of space.
• Moreover, there are no nonspatial channels that are highly effective for all
attribute types: the others are split into being suitable for either ordered or
categorical attributes, but not both, because of the principle of
expressiveness.
Arrange Tables: Arrange by Keys and Values
• The distinction between key and value attributes is very relevant to
visually encoding table data.
• A key is an independent attribute that can be used as a unique index
to look up items in a table, while a value is a dependent attribute: the
value of a cell in a table.
• Key attributes can be categorical or ordinal, whereas values can be all
three of the types: categorical, ordinal, or quantitative.
• The unique values for a categorical or ordered attribute are called
levels, to avoid the confusion of overloading the term value.
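The key/value distinction can be sketched in a few lines of Python. The table below is hypothetical (the species, diets, and weights are invented for illustration): `species` acts as the key because it uniquely indexes each item, `weight_kg` is a quantitative value, and the unique entries of the categorical `diet` attribute are its levels.

```python
# Hypothetical table, invented for illustration only.
rows = [
    {"species": "bear", "diet": "omnivore", "weight_kg": 300.0},
    {"species": "cat", "diet": "carnivore", "weight_kg": 4.0},
    {"species": "dog", "diet": "omnivore", "weight_kg": 30.0},
]

# A key is a unique index: looking up by species returns exactly one item.
by_species = {row["species"]: row for row in rows}

# The levels of a categorical attribute are its unique values.
diet_levels = sorted({row["diet"] for row in rows})

print(by_species["cat"]["weight_kg"])
print(diet_levels)
```

If two rows shared a species, the dictionary would hold fewer entries than the table has rows, which is a quick check that an attribute really is a key.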
Arrange Tables: Arrange by Keys and Values
• The core design choices for visually encoding tables directly relate to
the semantics of the table’s attributes: how many keys and how many
values does it have?
• An idiom could show only values, with no keys; scatterplots are the
canonical example of showing two value attributes.
• An idiom could show one key and one value attribute; bar charts are
the best-known example.
• An idiom could show two keys and one value; for example, heatmaps.
• Idioms that show many keys and many values often recursively
subdivide space into many regions, as with scatterplot matrices.
Arrange Tables: Express: Quantitative Values:
Scatterplots
• The idiom of scatterplots encodes two quantitative value variables
using both the vertical and horizontal spatial position channels, and
the mark type is necessarily a point.
• Scatterplots are effective for the abstract tasks of providing overviews
and characterizing distributions, and specifically for finding outliers
and extreme values.
• Scatterplots are also highly effective for the abstract task of judging
the correlation between two attributes.
• The stronger the correlation, the closer the points fall along a perfect
diagonal line; positive correlation is an upward slope, and negative is
downward.
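The link between slope direction and the sign of the correlation can be checked numerically. The following is a minimal sketch that computes the Pearson correlation coefficient by hand; the data points are hypothetical and chosen only so that one series rises and the other falls.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.1, 3.9, 6.2, 8.0, 9.8]     # points rise from left to right
y_down = [9.8, 8.0, 6.2, 3.9, 2.1]   # points fall from left to right

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

A coefficient near +1 corresponds to points lying close to an upward diagonal, and a coefficient near -1 to points lying close to a downward one.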
Arrange Tables: Express: Quantitative Values:
Scatterplots
Each point mark represents a country, with horizontal and vertical spatial
position encoding the primary quantitative attributes of life expectancy and
infant mortality. The color channel is used for the categorical country
attribute and the size channel for the quantitative population attribute.
The dataset is highly negatively correlated.
Arrange Tables: Express: Quantitative Values:
Scatterplots
• Additional transformations can also be used to shed more light on the
data.
• Figure (a) shows the relationship between diamond price and weight.
• Figure (b) shows a scatterplot of derived attributes created by
logarithmically scaling the originals; the transformed attributes are
strongly positively correlated.
• When judging correlation is the primary intended task, the derived
data of a calculated regression line is often superimposed on the raw
scatterplot of points.
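As a rough illustration of deriving log-scaled attributes and a regression line, the sketch below uses synthetic data, not the diamond dataset from the figure. It assumes a hypothetical power-law relationship (price = 4000 × carat^1.7, with both constants invented) and fits an ordinary least-squares line to the log-transformed attributes.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, hypothetical data: the constants 4000 and 1.7 are assumptions.
carats = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
prices = [4000.0 * c ** 1.7 for c in carats]

# Derived attributes: log-scale both originals.
log_carats = [math.log10(c) for c in carats]
log_prices = [math.log10(p) for p in prices]

# Derived data: the regression line to superimpose on the log-log scatterplot.
slope, intercept = fit_line(log_carats, log_prices)
print(slope, intercept)
```

Because a power law becomes a straight line after taking logarithms of both attributes, the fitted slope recovers the assumed exponent, which is why the transformed scatterplot looks so much more linear than the original.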
Arrange Tables: Express: Quantitative Values:
Scatterplots

(a) Original diamond price/carat data. (b) Derived log-scale attributes are
highly positively correlated.
Arrange Tables: Express: Quantitative Values:
Scatterplots
• Scatterplots are often augmented with color coding to show an additional
attribute.
• Size coding can also portray yet another attribute; size-coded scatterplots are
sometimes called bubble plots.
• Figure (previous slide) shows an example of demographic data, plotting infant
mortality on the vertical axis against life expectancy on the horizontal axis.
Arrange Tables: Categorical Regions: List
Alignment: One Key
• The use of space to encode categorical attributes is more complex than the
simple case of quantitative attributes where the value can be expressed
with spatial position.
• Spatial position is an ordered magnitude visual channel, but categorical
attributes have unordered identity semantics.
• With a single key, separating into regions using that key yields one region
per item.
• The regions are frequently arranged in a one-dimensional list alignment,
either horizontal or vertical.
• The view itself covers a two-dimensional area: the aligned list of items
stretches across one of the spatial dimensions, and the region in which the
values are shown stretches across the other.
Arrange Tables: Categorical Regions: List
Alignment: One Key: Bar Chart
• The well-known bar chart idiom is a simple initial example. Figure (next slide) shows a
bar chart of approximate weights on the vertical axis for each of three animal species on
the horizontal axis.
• Analyzing the visual encoding, bar charts use a line mark and encode a quantitative value
attribute with one spatial position channel.
• The other attribute shown in the chart, animal species, is a categorical key attribute.
• Each line mark is indeed in a separate region of space, and there is one for each level of
the categorical attribute.
• In Figure (a) the regions are ordered alphabetically by species name.
• Figure (b) shows this dataset with the regions ordered by the values of the same value
attribute that is encoded by the bar heights, animal weight.
• This kind of data-driven ordering makes it easier to see dataset trends.
• Bar charts are also well suited for the abstract task of looking up individual values.
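The two orderings can be compared with a small text-only sketch. The species and weights below are hypothetical stand-ins, not the values from the figure: one ordering sorts the regions alphabetically by key, the other sorts them by the value attribute that the bar length encodes.

```python
# Hypothetical approximate weights in kg, invented for illustration.
weights = {"bear": 300, "cat": 4, "dog": 30}

# Regions ordered alphabetically by the categorical key.
alphabetical = sorted(weights)

# Data-driven ordering: regions ordered by the encoded value, largest first.
by_value = sorted(weights, key=weights.get, reverse=True)

def text_bars(order, table, width=60):
    """Render one text bar per key, with length proportional to the value."""
    top = max(table.values())
    return [f"{key:<5}|{'#' * round(table[key] / top * width)}" for key in order]

for line in text_bars(by_value, weights):
    print(line)
```

With the data-driven ordering the bars shrink steadily from top to bottom, so the trend is visible at a glance; with the alphabetical ordering the reader has to compare bar lengths that are not adjacent in rank.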
Arrange Tables: Categorical Regions: List
Alignment: Two Key: Stacked Bar Chart
• A stacked bar chart uses a more complex glyph for each bar, where
multiple sub-bars are stacked vertically.
• The length of the composite glyph still encodes a value, as in a standard bar
chart, but each subcomponent also encodes a value with its own length.
• Stacked bar charts show information about multidimensional tables,
specifically a two-dimensional table with two keys.
• The composite glyphs are arranged as a list according to a primary key.
• The secondary key is used in constructing the vertical structure of the
glyph itself.
• Stacked bar charts are an example of a list alignment used with more than
one key attribute.
Arrange Tables: Categorical Regions: List
Alignment: Two Key: Stacked Bar Chart
• Stacked bar charts typically use color as well as length coding.
• Each subcomponent is colored according to the same key that is used
to determine the vertical ordering; since the subcomponents are all
abutted end to end without a break and are the same width, they
would not be distinguishable without different coloring.
• While it would be possible to use only black outlines with white fill as
the rectangles within a bar, comparing subcomponents across
different bars would be considerably more difficult.
Arrange Tables: Categorical Regions: List
Alignment: Two Key: Stacked Bar Chart
• Figure (next slide) shows an example of a stacked bar chart used to
inspect information from a computer memory profiler.
• The key used to distribute composite bars along the axis is the
combination of a processor and a procedure.
• The key used to stack and color the glyph subcomponents is the type
of cache miss; the height of each full bar encodes all cache misses for
each processor–procedure combination.
• Each component of the bar is separately stacked, so that the full bar
height shows the value for the combination of all items in the stack.
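The stacking logic can be sketched as a running sum over the secondary key. The processor names, procedure names, miss types, and counts below are all hypothetical placeholders and are not taken from the profiler figure.

```python
# Hypothetical two-key table: (processor, procedure) -> miss type -> count.
misses = {
    ("cpu0", "procA"): {"cold": 5, "capacity": 3, "conflict": 2},
    ("cpu0", "procB"): {"cold": 1, "capacity": 4, "conflict": 0},
    ("cpu1", "procA"): {"cold": 2, "capacity": 2, "conflict": 6},
}

# The secondary key fixes the vertical order (and the color) of the sub-bars.
stack_order = ["cold", "capacity", "conflict"]

def stack_segments(values, order):
    """Return (level, bottom, top) for each sub-bar, stacked in the given order."""
    segments = []
    bottom = 0
    for level in order:
        top = bottom + values[level]
        segments.append((level, bottom, top))
        bottom = top
    return segments

# One composite glyph per level of the primary key.
bars = {key: stack_segments(vals, stack_order) for key, vals in misses.items()}

for key, segments in bars.items():
    print(key, segments)
```

The top of the last segment equals the total for that bar, which is what the full bar height encodes. Only the lowest sub-bar starts at a common baseline, so the other sub-bars are harder to compare across bars.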
Arrange Tables: Categorical Regions: List
Alignment: One Key: Dot Chart
• The dot chart idiom is a visual encoding of one
quantitative attribute using spatial position against
one categorical attribute using point marks, rather
than the line marks of a bar chart.
• The figure shows a dot chart of cat weight over time with
the ordered variable of year on the horizontal axis and
the quantitative weight of a specific cat on the vertical
axis.
Arrange Tables: Categorical Regions: List
Alignment: One Key: Line Chart
• The idiom of line charts augments dot
charts with line connection marks running
between the points.
Arrange Tables: Categorical Regions: List
Alignment: One Key: Line Chart
Bar charts and line charts both encode a single attribute. Bar charts encourage
discrete comparisons, while line charts encourage trend assessments. Line charts
should not be used for categorical data, as in the upper right, because their
implications are misleading.
Arrange Tables: Matrix Alignment: Two Keys:
Cluster Heatmaps
• Datasets with two keys are often arranged in a two-dimensional matrix alignment
where one key is distributed along the rows and the other along the columns, so
a rectangular cell in the matrix is the region for showing the item values.
• The idiom of heatmaps is one of the simplest uses of the matrix alignment: each
cell is fully occupied by an area mark encoding a single quantitative value
attribute with color.
• Heatmaps are often used with bioinformatics datasets.
• Figure (next slide) shows an example where the keys are genes and experimental
conditions, and the quantitative value attribute is the activity level of a particular
gene in a particular experimental condition as measured by a microarray.
• This heatmap uses a diverging red–green colormap.
• The benefit of heatmaps is that visually encoding quantitative data with color
using small area marks is very compact, so they are good for providing overviews
with high information density.
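A heatmap's encoding can be sketched as a lookup from the two keys into a matrix of colors. The gene names, condition names, and activity levels below are hypothetical. The colormap is also an assumption: it fades through black at zero and uses red for positive values and green for negative ones, which may not match the figure.

```python
def diverging_red_green(value, limit):
    """Map a value in [-limit, limit] to an (r, g, b) color.

    Assumed convention: green for negative, black at zero, red for positive.
    Values outside the range are clamped.
    """
    t = max(-1.0, min(1.0, value / limit))
    if t >= 0:
        return (round(255 * t), 0, 0)
    return (0, round(255 * -t), 0)

# Hypothetical two-key table: (gene, condition) -> activity level.
genes = ["g1", "g2"]
conditions = ["c1", "c2", "c3"]
activity = {
    ("g1", "c1"): 1.0, ("g1", "c2"): 0.0, ("g1", "c3"): -1.0,
    ("g2", "c1"): -0.2, ("g2", "c2"): 0.2, ("g2", "c3"): 2.0,
}

# One key runs along the rows, the other along the columns;
# each cell holds the color of a single area mark.
cells = [[diverging_red_green(activity[(g, c)], 1.0) for c in conditions]
         for g in genes]

for gene, row in zip(genes, cells):
    print(gene, row)
```

Each cell needs only one small colored rectangle, which is why the idiom scales to tables with many rows and columns.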
Arrange Tables: Matrix Alignment: Two Keys:
Cluster Heatmaps
• The cluster heatmap idiom combines the basic heatmap with matrix
reordering, where two attributes are reordered in combination.
• The goal of matrix reordering is to group similar cells in order to check
for large-scale patterns between both attributes, just as the goal of
reordering a single attribute is to see trends across a single one.
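Reordering both keys together can be illustrated with a deliberately simple stand-in: sorting rows and columns by their sums. This is not the clustering used by cluster heatmaps; it is only a sketch of how permuting rows and columns in combination moves similar cells next to each other. The matrix and labels are hypothetical.

```python
def reorder(matrix, row_labels, col_labels):
    """Sort rows by row sum and columns by column sum (a simple stand-in
    for clustering-based reordering)."""
    row_order = sorted(range(len(matrix)), key=lambda i: sum(matrix[i]))
    col_order = sorted(range(len(matrix[0])),
                       key=lambda j: sum(row[j] for row in matrix))
    reordered = [[matrix[i][j] for j in col_order] for i in row_order]
    return (reordered,
            [row_labels[i] for i in row_order],
            [col_labels[j] for j in col_order])

# Hypothetical values, invented for illustration.
matrix = [
    [9, 1, 8],
    [2, 0, 1],
    [8, 2, 9],
]
rows = ["g1", "g2", "g3"]
cols = ["c1", "c2", "c3"]

new_matrix, new_rows, new_cols = reorder(matrix, rows, cols)
for label, row in zip(new_rows, new_matrix):
    print(label, row)
print(new_cols)
```

After reordering, the large values sit together in one corner of the matrix instead of being scattered, which is the kind of large-scale pattern the reordering is meant to expose.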
Arrange Tables: Matrix Alignment: Two Keys:
Scatterplot Matrix
• A scatterplot matrix (SPLOM) is a matrix where each cell contains an
entire scatterplot chart. A SPLOM shows all possible pairwise
combinations of attributes, with the original attributes as the rows
and columns.
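The layout of a SPLOM can be sketched by enumerating which pair of attributes each cell shows. The attribute names below are hypothetical.

```python
from itertools import combinations

# Hypothetical attribute names, invented for illustration.
attributes = ["price", "carat", "depth"]

# The same attributes label both the rows and the columns;
# each cell is the scatterplot of (row attribute, column attribute).
grid = [[(row_attr, col_attr) for col_attr in attributes]
        for row_attr in attributes]

# Each unordered pair appears twice in the grid, mirrored across the
# diagonal; the diagonal cells pair an attribute with itself.
distinct_pairs = list(combinations(attributes, 2))

for row in grid:
    print(row)
print(distinct_pairs)
```

For n attributes the grid has n × n cells but only n(n − 1)/2 distinct pairs, so the number of scatterplots grows quadratically with the number of attributes.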
Arrange Tables: Volumetric Grid: Three Keys
• Just as data can be aligned in a 1D list or a 2D matrix, it is possible to
align data in three dimensions, in a 3D volumetric grid.

Arrange Tables: Recursive Subdivision:
Multiple Keys
• With multiple keys, it’s possible to extend the above approaches by
recursively subdividing the cell within a list or matrix.
Arrange Tables: Area charts
Arrange Tables: Comparison between Line
and Area chart
Arrange Tables: Spatial Axis Orientation:
Rectilinear Layouts
• An additional design choice with the use of space is how to orient the
spatial axes: whether to use rectilinear, parallel, or radial layout.
• In a rectilinear layout, regions or items are distributed along two
perpendicular axes, horizontal and vertical spatial position, that range
from minimum value on one side of the axis to a maximum value on
the other side.
• Rectilinear layouts are heavily used in vis design and occur in many
common statistical charts.
• All of the examples above use rectilinear layouts.
Arrange Tables: Spatial Axis Orientation:
Parallel Layouts
• The idiom of parallel coordinates is an approach for visualizing many
quantitative attributes at once using spatial position.
• As the name suggests, the axes are placed parallel to each other,
rather than perpendicularly at right angles.
• While an item is shown with a dot in a scatterplot, with parallel
coordinates a single item is represented by a jagged line that zigzags
through the parallel axes, crossing each axis exactly once at the
location of the item’s value for the associated attribute.
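The construction of one polyline per item can be sketched as follows. The items and attribute names are hypothetical. Each attribute is rescaled to the range 0 to 1 along its own axis, which is one common choice and is assumed here rather than taken from the slides.

```python
# Hypothetical table: three items with three quantitative attributes.
attributes = ["math", "physics", "dance"]
items = [
    {"math": 85, "physics": 95, "dance": 70},
    {"math": 90, "physics": 80, "dance": 60},
    {"math": 65, "physics": 50, "dance": 90},
]

# Each parallel axis gets its own scale, from that attribute's min to its max.
ranges = {name: (min(item[name] for item in items),
                 max(item[name] for item in items))
          for name in attributes}

def polyline(item, attributes, ranges):
    """Return one (axis_index, normalized_height) vertex per axis."""
    points = []
    for axis, name in enumerate(attributes):
        low, high = ranges[name]
        points.append((axis, (item[name] - low) / (high - low)))
    return points

lines = [polyline(item, attributes, ranges) for item in items]
for line in lines:
    print(line)
```

Each item yields exactly one vertex per axis, so the line crosses every axis once. Joining consecutive vertices produces the zigzag described above.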
Arrange Tables: Spatial Axis Orientation:
Parallel Layouts

Comparison of scatterplot matrix and parallel coordinate idioms for a small data table.
Arrange Tables: Spatial Axis Orientation:
Radial Layouts
• In a radial spatial layout, items are distributed around a circle using
the angle channel in addition to one or more linear spatial channels,
in contrast to the rectilinear layouts that use only two spatial
channels.
• The natural coordinate system in radial layouts is polar coordinates,
where one dimension is measured as an angle from a starting line and
the other is measured as a distance from a center point.
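The relationship between polar and rectilinear coordinates can be sketched with the standard conversion x = r cos θ, y = r sin θ. The five values below are hypothetical. The sketch places each item at an equal angular step around the circle and uses its value as the distance from the center, as a radial bar chart would.

```python
import math

def polar_to_cartesian(angle, radius):
    """Convert an angle in radians (measured from the positive x axis)
    and a distance from the center into (x, y)."""
    return (radius * math.cos(angle), radius * math.sin(angle))

# Hypothetical values for five items, invented for illustration.
values = [3.0, 5.0, 2.0, 4.0, 1.0]

# Items are spread evenly around the circle; the value sets the radius.
step = 2 * math.pi / len(values)
bar_tips = [polar_to_cartesian(i * step, v) for i, v in enumerate(values)]

for tip in bar_tips:
    print(tip)
```

The distance of each tip from the origin equals the item's value, so length is still the channel that carries the quantitative attribute; only the orientation of the axis has changed.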
Arrange Tables: Spatial Axis Orientation:
Radial Layouts

(a) Radial layouts use polar coordinates, with one spatial position and one angle
channel. (b) Rectilinear layouts use two perpendicular spatial position channels. (c)
Transforming rectilinear to radial layouts maps two parallel bounding lines to a
point at the center and a circle at the perimeter.
Arrange Tables: Spatial Axis Orientation:
Radial Layouts
• The same five-attribute dataset is encoded with a rectilinear bar chart
in Figure (a) and with a radial alternative in Figure (b).
• In both cases, line marks are used to encode a quantitative attribute
with the length channel, and the only difference is the radial versus
the rectilinear orientation of the axes.

Radial versus rectilinear layouts. (a) Rectilinear bar chart. (b) Radial bar chart.
Arrange Tables: Spatial Axis Orientation:
Radial Layouts

Pie chart versus bar chart accuracy. (a) Pie charts require angle and area
judgements. (b) Bar charts require only high-accuracy length judgements for
individual items. (c) Polar area charts are a more direct equivalent of bar charts,
where the length of each wedge varies like the length of each bar.
Arrange Tables: Spatial Axis Orientation:
Radial Layouts: Polar plot

Arrange Tables: Spatial Axis Orientation:
Radial Layouts
• Figure (next slide) compares rectilinear and radial layouts for 12 iconic
time-series datasets: linear increasing, decreasing, shifted, single
peak, single dip, combined linear and nonlinear, seasonal trends with
different scales, and a combined linear and seasonal trend.
• The rectilinear layouts in Figure (a) are more effective at showing the
differences between the linear and nonlinear trends, whereas the
radial layouts in Figure (b) are more effective at showing cyclic patterns.
Arrange Tables: Spatial Axis Orientation:
Rectilinear Layouts
Arrange Tables: Spatial Axis Orientation:
Radial Layouts
