Network Data 1
Network Data 1
@tamaramunzner
Network data
• networks
– model relationships Spatial
between things
• aka graphs
– two kinds of items,
both can have attributes
• nodes
• links
• tree
– special case
– no cycles
• one parent per node
2
Network tasks: topology-based and attribute-based
• topology based tasks
– find paths
– find (topological) neighbors
– compare centrality/importance measures
– identify clusters / communities
• attribute based tasks (similar to table data)
– find distributions, ...
• combination tasks, incorporating both
– example: find friends-of-friends who like cats
• topology: find all adjacent nodes of given node
• attributes: check if has-pet (node attribute) == cat
3
Node-link diagrams
• nodes: point marks
• links: line marks
– straight lines or arcs
– connections between nodes
• intuitive & familiar
– most common
– many, many variants
4
Criteria for good node-link layouts
• minimize
– edge crossings, node overlaps
– distances between topological neighbor nodes
– total drawing area
– edge bends
• maximize
– angular distance between different edges
– aspect ratio disparities
• emphasize symmetry
– similar graph structures should look similar in layout
5
Criteria conflict
• most criteria NP-hard individually
• many criteria directly conflict with each other
6
Optimization-based layouts
• formulate layout problem as optimization problem
• convert criteria into weighted cost function
– F(layout) = a*[crossing counts] + b*[drawing space used]+...
• use known optimization techniques to find layout at minimal cost
– energy-based physics models
– force-directed placement
– spring embedders
7
Force-directed placement
• physics model
– links = springs pull together
– nodes = magnets repulse apart
• algorithm
– place vertices in random locations
– while not equilibrium
• calculate force on vertex
– sum of
» pairwise repulsion of all nodes
» attraction between connected nodes
• move vertex by c * vertex_force
http://mbostock.github.com/d3/ex/force.html
8
Force-directed placement properties
• strengths
– reasonable layout for small, sparse graphs
– clusters typically visible
– edge length uniformity
• weaknesses
– nondeterministic
– computationally expensive: O(n^3) for n nodes
• each step is n^2, takes ~n cycles to reach equilibrium
– naive FD doesn't scale well beyond 1K nodes https://bl.ocks.org/steveharoz/8c3e2524079a8c440df60c1ab72b5d03
9
Idiom: force-directed placement
• visual encoding
–link connection marks, node point marks
• considerations
–spatial position: no meaning directly encoded
• left free to minimize crossings
–proximity semantics?
• sometimes meaningful
• sometimes arbitrary, artifact of layout algorithm
• tension with length
–long edges more visually salient than short http://mbostock.github.com/d3/ex/force.html
• tasks
–explore topology; locate paths, clusters
• scalability
–node/edge density E < 4N
10
Idiom: circular layouts / arc diagrams (node-link)
• restricted node-link layouts: lay out nodes around circle or along
line
• data
– original: network
–derived: node ordering attribute (global computation)
• considerations: node ordering crucial to avoid
excessive clutter from edge crossings
–examples: before & after barycentric ordering
http://profs.etsmtl.ca/mmcguffin/research/2012-mcguffin-simpleNetVis/mcguffin-2012-simpleNetVis.pdf 11
Adjacency matrix representations
• derive adjacency matrix from network
12
Adjacency matrix examples
13
Node order is crucial: Reordering
https://bost.ocks.org/mike/miserables/ 14
Adjacency matrix
•˜
15
Structures visible in both
http://www.michaelmcguffin.com/courses/vis/patternsInAdjacencyMatrix.png 16
Idiom: adjacency matrix view
• data: network
– transform into same data/encoding as heatmap
• derived data: table from network [NodeTrix: a Hybrid Visualization of Social Networks.
Henry, Fekete, and McGuffin. IEEE TVCG (Proc.
– 1 quant attrib InfoVis) 13(6):1302-1309, 2007.]
17
Node-link vs. matrix comparison
• node-link diagram strengths
–topology understanding, path tracing
–intuitive, flexible, no training needed
• adjacency matrix strengths http://www.michaelmcguffin.com/courses/vis/patternsInAdjacencyMatrix.png
–focus on edges rather than nodes
–layout straightforward (reordering needed)
–predictability, scalability
–some topology tasks trainable
• empirical study
–node-link best for small networks
–matrix best for large networks [On the readability of graphs using node-link and matrix-based
• if tasks don’t involve path tracing! representations: a controlled experiment and statistical analysis. Ghoniem,
Fekete, and Castagliola. Information Visualization 4:2 (2005), 114–135.]
18
Idiom: NodeTrix
• hybrid nodelink/matrix
• capture strengths of both
[NodeTrix: a Hybrid Visualization of Social Networks. Henry, Fekete, and McGuffin. IEEE TVCG (Proc.
InfoVis) 13(6):1302-1309, 2007.]
19
Trees
20
Node-link trees
• Reingold-Tilford
– tidy drawings of trees
• exploit parent/child structure
– allocate space: compact but
without overlap
• rectilinear and radial variants
http://billmill.org/pymag-trees/ http://bl.ocks.org/mbostock/4063550
http://bl.ocks.org/mbostock/4339184
21
Idiom: radial node-link tree
http://mbostock.github.com/d3/ex/tree.html
• data
– tree
• encoding
– link connection marks
– point node marks
– radial axis orientation
• angular proximity: siblings
• distance from center: depth in tree
• tasks
– understanding topology, following paths
• scalability
– 1K - 10K nodes (with/without labels)
22
Link marks: Connection and containment
• marks as links (vs. nodes)
– common case in network drawing
– 1D case: connection
• ex: all node-link diagrams
• emphasizes topology, path tracing
• networks and trees
– 2D case: containment
• ex: all treemap variants
• emphasizes attribute values at leaves (size coding)
• only trees
23
Idiom: treemap
• data
– tree
– 1 quant attrib at leaf nodes
• encoding
– area containment marks for hierarchical structure
– rectilinear orientation
– size encodes quant attrib
• tasks
– query attribute at leaf nodes
– ex: disk space usage within filesystem https://www.win.tue.nl/sequoiaview/
24
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
25
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
26
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
Implicit
Spatial Position
27
Tree drawing idioms comparison
https://treevis.net/ 32
Arrange networks and trees
Implicit
Spatial Position