IN4089 - Lecture 05 - Graphs and Dimensionality Reduction-Pdfjam


Graphs

IN4089 Data Visualization


Martin Skrodzki – Computer Graphics & Visualization

Graph (Network) Visualization

What is a graph $G$?
$G = (V, E)$, $E \subseteq V^2$

$V$ are vertices
$E$ are edges connecting two vertices

Also called a network, with attributes on $V$ and $E$

[Figure: example graphs labeled Undirected and Directed]

Tree

• Graph that is
  • Connected: all nodes can be reached
  • Acyclic: without cycles
• Can determine a root, then:
  • Each (child) node has a parent
  • Distance from root: depth

[Figure: a tree annotated with Root, Parent, Child, and Depth; examples labeled Tree and Not trees]
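The tree definition above (connected, acyclic, depth measured from a chosen root) can be sketched in a few lines. This is a minimal illustration, not part of the lecture material: the function names `is_tree` and `depths` and the edge-list representation are assumptions made for the example.

```python
from collections import deque


def depths(vertices, edges, root):
    """Breadth-first search from root; returns {vertex: depth} for reachable vertices."""
    neighbors = {v: [] for v in vertices}
    for u, v in edges:
        # Undirected graph: each edge is usable in both directions.
        neighbors[u].append(v)
        neighbors[v].append(u)
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return depth


def is_tree(vertices, edges):
    """True if the undirected graph (vertices, edges) is connected and acyclic."""
    vertices = list(vertices)
    if not vertices:
        return False  # convention chosen for this sketch
    # A connected graph on n vertices is acyclic exactly when it has n - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    # Connected: every vertex is reachable from an arbitrary start vertex.
    return len(depths(vertices, edges, vertices[0])) == len(vertices)
```

Any vertex can serve as the root; calling `depths` with a different root changes the parent/child roles and the depths, but not whether the graph is a tree.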

Networks/graphs and trees

Munzner Book - Chapter 9

Node-link techniques

• Layout:
  • Nodes should not overlap
  • Minimize edge crossings
  • Edge length homogeneous
  • Graph structures easily recognizable
  • …
• A lot of research in the Graph Drawing community

http://mbostock.github.com/d3/ex/force.html

Idiom: force-directed placement

• visual encoding
  • link connection marks, node point marks
• considerations
  • spatial position: no meaning directly encoded
    • left free to minimize crossings
  • proximity semantics?
    • sometimes meaningful
    • sometimes arbitrary, artifact of layout algorithm
  • tension with length
    • long edges more visually salient than short
• tasks
  • explore topology; locate paths, clusters
• scalability
  • node/edge density E < 4N

http://mbostock.github.com/d3/ex/force.html

Node-link techniques: Force based

Force-directed algorithms:
• Mechanical laws
• Model edges as springs; nodes also repel each other.
• Numerically simulate until a stable state is reached.
• Repelling force on vertices (all other vertices)
• Attracting force on edges (only connected vertices)

Based on slides by Michel Westenberg


Vertex Force

• Repelling force between vertices $i$ and $j$

$$g_{i,j} = \underbrace{\frac{r_{ij}}{d^2(x_i, x_j)}}_{\text{repulsion strength}} \cdot \underbrace{\frac{x_i - x_j}{d(x_i, x_j)}}_{\text{repulsion direction}}$$

• Prevents vertices from coming too close to each other

Based on slides by Michel Westenberg

Edge Force

• Spring forces on edge
• Attracts vertices connected by edge

$$f_{ij} = k_{ij} \left( d(x_i, x_j) - s_{ij} \right) \frac{x_i - x_j}{d(x_i, x_j)}$$

with spring tension $k_{ij}$ and spring length $s_{ij}$

• Prevents these vertices from getting too far apart

Based on slides by Michel Westenberg

Layout

• Compute forces
• Move vertices according to forces
• Terminate on
  • Fixed number of iterations
  • Total energy below some threshold
  • Local minimum
  • User input

Based on slides by Michel Westenberg
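The loop described above (compute forces, move vertices, terminate) can be sketched as follows. This is a minimal sketch under stated assumptions, not the lecture's implementation: the constants $r_{ij}$, $k_{ij}$, $s_{ij}$ are replaced by single values `r`, `k`, `s`; the edge force $f_{ij}$ is applied to $x_j$ (and its negative to $x_i$) so that a stretched spring pulls both endpoints together; the largest force component stands in for the energy threshold; and the step size and displacement cap are illustrative choices. The function name `force_layout` is hypothetical.

```python
import numpy as np


def force_layout(n, edges, r=1.0, k=1.0, s=1.0, step=0.05,
                 max_move=0.5, iterations=300, tol=1e-3, seed=0):
    """Place n vertices in 2D by simulating repulsion and spring forces."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n, 2))  # random initial positions
    for _ in range(iterations):  # terminate on: fixed number of iterations
        # Vertex force: every vertex is repelled by all other vertices.
        diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
        dist = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(dist, 1.0)                  # diff is zero on the diagonal anyway
        dist = np.maximum(dist, 1e-9)                # guard against coincident vertices
        force = (r * diff / dist[:, :, None] ** 3).sum(axis=1)
        # Edge force: springs act only between connected vertices.
        for i, j in edges:
            d = x[i] - x[j]
            length = max(np.linalg.norm(d), 1e-9)
            f = k * (length - s) * d / length
            force[i] -= f   # assumption: f pulls x_j toward x_i when stretched,
            force[j] += f   # and x_i receives the opposite force
        # Move vertices according to forces, capping the move per iteration.
        move = step * force
        norms = np.linalg.norm(move, axis=1, keepdims=True)
        move *= np.minimum(1.0, max_move / np.maximum(norms, 1e-12))
        x += move
        if np.abs(force).max() < tol:  # terminate on: (near) stable state
            break
    return x
```

A call such as `force_layout(3, [(0, 1), (1, 2), (0, 2)])` returns one 2D position per vertex. For larger graphs the all-pairs repulsion costs O(N²) per iteration, which is one reason practical implementations approximate it.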


Idiom: adjacency matrix view

• data: network
  • transform into same data/encoding as heatmap
• derived data: table from network
  • 1 quant attrib
    • weighted edge between nodes
  • 2 categ attribs: node list x 2
• visual encoding
  • cell shows presence/absence of edge
• scalability
  • 1K nodes, 1M edges

NodeTrix: a Hybrid Visualization of Social Networks. Henry, Fekete, and McGuffin. IEEE TVCG (Proc. InfoVis) 13(6):1302-1309, 2007.
Points of view: Networks. Gehlenborg and Wong. Nature Methods 9:115.

Connection vs. adjacency comparison

• adjacency matrix strengths
  • predictability, scalability, supports reordering
  • some topology tasks trainable
• node-link diagram strengths
  • topology understanding, path tracing
  • intuitive, no training needed
• empirical study
  • node-link best for small networks
  • matrix best for large networks
    • if tasks don’t involve topological structure!

http://www.michaelmcguffin.com/courses/vis/patternsInAdjacencyMatrix.png
On the readability of graphs using node-link and matrix-based representations: a controlled experiment and statistical analysis. Ghoniem, Fekete, and Castagliola. Information Visualization 4:2 (2005), 114–135.
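The derived table behind the adjacency matrix view (node list × node list, one quantitative value per cell) can be built as sketched below. The function name `adjacency_matrix`, the `(u, v, weight)` edge format, and the `order` parameter are assumptions made for this example; `order` illustrates why the idiom "supports reordering": the same network yields a different-looking matrix for each node order.

```python
import numpy as np


def adjacency_matrix(nodes, weighted_edges, order=None):
    """Build a symmetric matrix of edge weights for an undirected network."""
    order = list(nodes) if order is None else list(order)
    index = {node: i for i, node in enumerate(order)}
    matrix = np.zeros((len(order), len(order)))
    for u, v, weight in weighted_edges:
        # A zero cell means "no edge"; a nonzero cell holds the edge weight.
        matrix[index[u], index[v]] = weight
        matrix[index[v], index[u]] = weight  # undirected: mirror the entry
    return matrix
```

Rendering the resulting array as a heatmap gives the matrix view; for example, `adjacency_matrix(["a", "b", "c"], [("a", "b", 2.0)], order=["c", "a", "b"])` places the single edge in different cells than the default order does.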

Networks/graphs and trees

Munzner Book - Chapter 9

Idiom: (radial) node-link tree

• data
  • tree
• encoding
  • link connection marks
  • point node marks
  • (radial) axis orientation
    • vertical/angular proximity: siblings
    • horizontal distance / distance from center: depth in tree
• tasks
  • understanding topology, following paths
• scalability
  • 1K - 10K nodes

https://observablehq.com/@d3/tidy-tree
Idiom: treemap

• data
  • tree
  • 1 quantitative attribute at leaf nodes
• encoding
  • area containment marks for hierarchical structure
  • rectilinear orientation
  • size encodes quant attrib
• tasks
  • query attribute (at leaf nodes)
• scalability
  • 1M leaf nodes

https://observablehq.com/@d3/treemap

Idiom: icicle/sunburst

[Figure: Icicle and SunBurst examples]

• Every level one row/circle
• No overlap between parent and child: attributes are easier to display
• Space on interior nodes can be used
• Not as dense as Treemaps

http://homes.cs.washington.edu/~jheer/files/zoo/
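To make the treemap encoding concrete (containment for hierarchy, area for the leaf attribute), here is a minimal sketch of the simple slice-and-dice layout. It is one possible treemap algorithm chosen for brevity, not necessarily the one behind the linked examples. The tree format, `(name, value)` for a leaf and `(name, [children])` for an interior node, and the function names are assumptions; leaf values are assumed positive.

```python
def subtree_size(node):
    """Sum of the quantitative attribute over all leaves below node."""
    _, payload = node
    if isinstance(payload, list):
        return sum(subtree_size(child) for child in payload)
    return payload


def treemap(node, x, y, width, height, depth=0, out=None):
    """Assign each leaf a rectangle (x, y, width, height) inside its parent's rectangle."""
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, width, height)
        return out
    total = subtree_size(node)
    offset = 0.0
    for child in payload:
        share = subtree_size(child) / total
        if depth % 2 == 0:
            # Even depth: split the rectangle left to right.
            treemap(child, x + offset * width, y, share * width, height, depth + 1, out)
        else:
            # Odd depth: split the rectangle top to bottom.
            treemap(child, x, y + offset * height, width, share * height, depth + 1, out)
        offset += share
    return out
```

Each leaf's area is proportional to its value, and every child rectangle lies inside its parent's rectangle. Slice-and-dice can produce long thin rectangles, which is why other treemap algorithms trade stability of ordering for better aspect ratios.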

Idiom: icicle/sunburst

Thomas Höllt, Nicola Pezzotti, Vincent van Unen, Frits Koning, Boudewijn P.F. Lelieveldt, and Anna Vilanova. CyteGuide: Visual Guidance for Hierarchical Single-Cell Analysis. IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2017), 24(1), 2018.

Link marks: Connection and containment

• marks as links (vs. nodes)
  • common case in network drawing
  • 1D case: connection
    • ex: all node-link diagrams
    • emphasizes topology, path tracing
    • networks and trees
  • 2D case: containment
    • ex: all treemap variants
    • emphasizes attribute values at leaves (size coding)
    • only trees

Elastic Hierarchies: Combining Treemaps and Node-Link Diagrams. Dong, McGuffin, and Chignell. Proc. InfoVis 2005, p. 57-64.
Tree drawing idioms comparison

• data shown
  – link relationships
  – tree depth
  – sibling order
• design choices
  – connection vs containment link marks
  – rectilinear vs radial layout
  – spatial position channels
• considerations
  – information density?
    • avoid wasting space

What’s a fitting data set for any of these idioms?

Quantifying the Space-Efficiency of 2D Graphical Representations of Trees. McGuffin and Robert. Information Visualization 9:2 (2010), 115–140.

Further reading

• Visual Analysis of Large Graphs: State-of-the-Art and Future Research Challenges. von Landesberger et al. Computer Graphics Forum 30:6 (2011), 1719–1749.
• Simple Algorithms for Network Visualization: A Tutorial. McGuffin. Tsinghua Science and Technology (Special Issue on Visualization and Computer Graphics) 17:4 (2012), 383–398.
• Drawing on Physical Analogies. Brandes. In Drawing Graphs: Methods and Models, edited by M. Kaufmann and D. Wagner, LNCS Tutorial 2025, pp. 71–86. Springer-Verlag, 2001.
• http://www.treevis.net Treevis.net: A Tree Visualization Reference. Schulz. IEEE Computer Graphics and Applications 31:6 (2011), 11–15.
• Perceptual Guidelines for Creating Rectangular Treemaps. Kong, Heer, and Agrawala. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis) 16:6 (2010), 990–998.

Questions!

Slides (partially) based on lectures by
Anna Vilanova (TU Eindhoven)
Tamara Munzner (University of British Columbia)
Thomas Höllt (TU Delft)

Dimensionality Reduction

IN4089 Data Visualization
Martin Skrodzki – Computer Graphics & Visualization

Recap: Idiom scatterplot matrix

• scatterplot matrix (SPLOM)
  • rectilinear axes, point mark
  • all possible pairs of axes
• scalability
  • one dozen attributes
  • dozen to hundreds of items
• Interaction is crucial

Reduce items and attributes

• filter
  • pro: straightforward and intuitive
  • con: out of sight, out of mind
• aggregation
  • pro: inform about whole set
  • con: difficult to avoid losing signal
• not mutually exclusive
  • combine filter, aggregate
  • combine reduce, change, facet

Aggregation

• Clustering
  • Typically group items
  • Also on attributes (bi-clustering)
• Dimensionality Reduction
  • Aggregate attributes
  • Remove redundancies in the data
  • Reduce downstream costs
  • Visualization

Dimensionality Reduction

• Filter attributes possible
  … but which ones?
• New space/embedding preserves specific properties (e.g., variance, cluster and structure) of the original high-dimensional space as much as possible.
• Typically target 2D, vis as Scatterplot!

Types of Dimensionality Reduction

• Linear
  Resulting attributes are linear combinations of existing attributes (interpretable)
  • Principal Component Analysis (PCA)
  • Linear Discriminant Analysis (LDA)
  • …
• Non-Linear
  Resulting attributes do not have a straightforward relation to original attributes
  • Multi-Dimensional Scaling (MDS) – preserve distances
  • t-Distributed Stochastic Neighbor Embedding (t-SNE) – preserve neighborhoods
  • …

PCA - Intuition

Principal Component Analysis (PCA) in Brief

• Given a dataset with n attributes (n-dimensional problem)
• PCA:
  • finds a new coordinate system obtained from the previous one by translation and rotation only – changes the point of view
  • moves the center of the coordinate system to the center of the data
  • moves the x-axis into the principal axis of variation
  • orders axes by amount of variation (importance)

PCA

• PCA transforms an n-dimensional space to an n-dimensional space
• In the new space dimensions are ordered by importance (highest variance)
• Dimensionality reduction: take the first m dimensions (m<n)

PCA

• PCA - Projection that best represents the data variation
• Projections that consider the classes need other dimensionality reduction methods – NOT PCA

[Figure labels: variance, PC]
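The PCA steps above (center the data, rotate onto the axes of largest variation, order axes by variance, keep the first m) can be sketched with a singular value decomposition. This is one standard way to compute PCA, shown as a minimal illustration; the function name `pca` and its return values are assumptions made for the example.

```python
import numpy as np


def pca(data, m):
    """Project data (rows = items, columns = attributes) onto its first m principal axes."""
    # Translation: move the origin to the center of the data.
    centered = data - data.mean(axis=0)
    # Rotation: rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    # Variance along each principal axis (the "importance" ordering).
    variances = singular_values ** 2 / (len(data) - 1)
    # Dimensionality reduction: keep only the first m coordinates.
    projected = centered @ vt[:m].T
    return projected, vt[:m], variances
```

The sign of each principal axis is arbitrary, so two runs or two libraries may return mirrored embeddings. Each new attribute is a linear combination of the original attributes, with the coefficients given by the corresponding row of the returned axes.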

Questions!

MNIST dataset – Handwritten numbers

t-SNE Intuition

• Non-linear dimensionality reduction
• Compute neighborhoods in hi-D
• Model low-D to preserve neighborhoods
• Preserves local neighborhoods
➫ Preserves high-D clusters!

[Figure: embedding with clusters labeled by digit, 0 to 9]


t-SNE in Brief

• Create probability distributions P/Q
  • P: similarities in HD
  • Q: random init
• Minimize Kullback-Leibler Divergence KLD(P,Q)

t-SNE in Brief

Remember force directed graphs?

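The objective above can be sketched as follows: P from Gaussian similarities in the high-dimensional data, Q from a Student-t kernel on the low-dimensional embedding, and the KL divergence between them. This is a simplified sketch, not a full t-SNE implementation: it uses one fixed `sigma` for all points (actual t-SNE tunes a bandwidth per point from the perplexity), and all function names are assumptions. The gradient shows the link to force-directed layouts: the P term pulls similar points together, and the Q term pushes all points apart.

```python
import numpy as np


def squared_distances(points):
    """Matrix of squared Euclidean distances between all pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return (diff ** 2).sum(axis=2)


def similarities_p(data, sigma=1.0):
    """High-dimensional similarities P (simplification: one sigma for all points)."""
    affinity = np.exp(-squared_distances(data) / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    conditional = affinity / affinity.sum(axis=1, keepdims=True)  # p_{j|i}
    return (conditional + conditional.T) / (2.0 * len(data))      # symmetric, sums to 1


def similarities_q(embedding):
    """Low-dimensional similarities Q from a Student-t kernel."""
    kernel = 1.0 / (1.0 + squared_distances(embedding))
    np.fill_diagonal(kernel, 0.0)
    return kernel / kernel.sum()


def kl_divergence(p, q):
    """KLD(P, Q) over all pairs with nonzero p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def gradient(p, q, embedding):
    """Gradient of KLD(P, Q) with respect to each embedded point."""
    diff = embedding[:, None, :] - embedding[None, :, :]   # y_i - y_j
    weight = (p - q) / (1.0 + (diff ** 2).sum(axis=2))
    # p > q: attraction along the "edge" (i, j); p < q: repulsion.
    return 4.0 * (weight[:, :, None] * diff).sum(axis=1)
```

Starting from a random embedding and repeatedly applying `embedding -= learning_rate * gradient(p, q, embedding)` is plain gradient descent on the objective; optimized implementations add momentum, early exaggeration, and approximations of the repulsive term.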
t-SNE in Brief

• Computationally intensive
  • compute high dimensional neighborhoods
  • optimize low dimensional neighborhoods
  (Many optimized implementations)
• Several parameters
  • Some can severely impact results
  https://distill.pub/2016/misread-tsne/

t-SNE Parameters

• Perplexity
• Number of iterations
• Learning rate
• Theta (for BH t-SNE)
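Of the parameters listed above, perplexity is the least self-explanatory. It can be read as the effective number of neighbors each point considers: for the neighbor distribution of point i, perplexity is 2 raised to its Shannon entropy, and the Gaussian bandwidth sigma of that point is tuned until the target perplexity is met. The sketch below illustrates this with a bisection search; the function names, search bounds, and step count are assumptions made for the example.

```python
import numpy as np


def conditional_probabilities(sq_dists, sigma):
    """Neighbor distribution of one point, given squared distances to all other points."""
    logits = -sq_dists / (2.0 * sigma ** 2)
    logits = logits - logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def perplexity(probabilities):
    """2 ** (Shannon entropy in bits): the effective number of neighbors."""
    nonzero = probabilities[probabilities > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return float(2.0 ** entropy)


def sigma_for_perplexity(sq_dists, target, low=1e-3, high=1e3, steps=60):
    """Bisection on a log scale: a larger sigma spreads weight over more neighbors."""
    for _ in range(steps):
        mid = np.sqrt(low * high)
        if perplexity(conditional_probabilities(sq_dists, mid)) > target:
            high = mid
        else:
            low = mid
    return float(np.sqrt(low * high))
```

A uniform distribution over k neighbors has perplexity exactly k, which is why a low perplexity emphasizes very local structure and a high perplexity blends in more distant points.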

Progressive and Approximated tSNE at Tensorflow

collaboration with

GPGPU Linear Complexity t-SNE Optimization


Nicola Pezzotti, et al.
IEEE TVCG (Proceedings of VAST 2019)

https://nicola17.github.io/tfjs-tsne-demo/
https://github.com/tensorflow/tfjs-tsne

T. Höllt et al.: Interactive Immune Cell Phenotyping for Large Single-Cell Datasets, EuroVis 2016

Dimensionality Reduction

• Tensorflow GPU t-SNE in JavaScript: https://nicola17.github.io/tfjs-tsne-demo/
• Many algorithms in JavaScript: https://github.com/saehm/DruidJS
• Further reading: Visualizing Dimensionally-Reduced Data: Interviews with Analysts and a Characterization of Task Sequences. Brehmer et al. In BELIV 2014.

Theses Projects

• Dimensionality Reduction is an active field of research
• Both application-driven and theory-based projects possible
• Come talk to us if you are interested!

Questions!
Slides (partially) based on lectures by
Thomas Höllt (TU Delft)
Anna Vilanova (TU Eindhoven)
Tamara Munzner (University of British Columbia)
