Module-2 Notes

Uploaded by

suhascheruvu2003


Course code : CSE3006

Course title : Data Visualization

Module :2

Visualization in High-Dimensional Data

Graph-theoretic Graphics, High-dimensional Data Visualization, Multivariate Data
Glyphs: Principles and Practice, Linked Views for Visual Exploration, Linked Data
Views, Visualizing Trees and Forests

Graph-theoretic Graphics
• A graph is a non-linear data structure that consists of a pair (V,E)
• A finite collection of vertices or nodes V
• A finite collection of edges E, represented as pairs of vertices (u,v); the pairs are ordered in a directed graph and unordered in an undirected graph

Graph has a set of vertices V= { 1,2,3,4,5} and

A set of edges E= { (1,2),(1,3),(2,3),(2,4),(2,5),(3,5),(4,5) }.
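As a minimal sketch, the example graph can be stored as an adjacency list and the degree of each vertex read off from it. Treating every pair as an undirected edge is an assumption here, and the names `adj` and `degree` are illustrative only.

```python
from collections import defaultdict

# Vertex and edge sets copied from the example above.
V = {1, 2, 3, 4, 5}
E = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5), (4, 5)]

# Adjacency list; each pair is treated as an undirected edge (assumption).
adj = defaultdict(set)
for u, v in E:
    adj[u].add(v)
    adj[v].add(u)

# Degree = number of edges incident to a vertex.
degree = {v: len(adj[v]) for v in sorted(V)}
print(degree)
```

Each edge contributes to the degree of two vertices, so the degrees sum to twice the number of edges.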

Term              Description
Vertex            Every individual data element is called a vertex or a node.
Edge (Arc)        A connecting link between two nodes or vertices. Each edge has two ends and is represented as (startingVertex, endingVertex).
Undirected Edge   A bidirectional edge.
Directed Edge     A unidirectional edge.
Weighted Edge     An edge with a value on it.
Degree            The total number of edges connected to a vertex in a graph.
Indegree          The total number of incoming edges connected to a vertex.
Outdegree         The total number of outgoing edges connected to a vertex.
Self-loop         An edge whose two endpoints coincide.
Adjacency         Two vertices are adjacent if an edge connects them.

Path:
• A sequence of edges that joins a sequence of vertices; it may be finite or infinite.
• It can connect two or more nodes.
• If there is a path between every pair of nodes, the graph is a connected graph; otherwise it is called a disconnected graph.
• There may or may not be a path to each and every node of a graph. If there is no path to a node, that node is an isolated node.
• The path from 'a' to 'e' is {a, b, c, d, e}
Closed Path:
• A path is called a closed path if its initial node is the same as its terminal (end) node, i.e., if V0 = Vn, where V0 is the starting node of the path and Vn is the last node.
• The closed path = {e, d, f, g, e}
Simple Path:
• A path that does not repeat any nodes (vertices).
• A closed simple path is one in which all the nodes are distinct except for the first and the last vertex, which coincide.
• Example of a simple path: {a, b, c, d}

Cycle Graph:
• A simple graph of ‘n’ nodes(vertices) (n>=3) and n edges forming a
cycle of length ‘n’ is called as a cycle graph.
• In a cycle graph, all the vertices are of degree 2.
Connected Graph:
• A graph in which there is an edge or path joining each pair of vertices.
• In connected graph,
• Can visit from any one vertex to any other vertex.
• There exists at least one path between every pair of vertices.
• No vertex in a connected graph is unreachable (or isolated).
Complete Graph (full graph)
• There is an edge between every pair of nodes in the graph, i.e., every vertex has an edge to all other vertices.
• A complete graph of ‘n’ vertices contains exactly nC2 = n*(n-1)/2 edges.
• A complete graph of ‘n’ vertices is represented as Kn
• Every complete graph is a connected graph, but a connected graph need not be complete.
• In a Complete graph, the degree of every node is n-1, where, n = number of nodes.
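The definitions of connected and complete graphs can be checked mechanically. The sketch below uses a breadth-first search for connectivity and builds K_n from all unordered vertex pairs; the function names and the small test graphs are illustrative assumptions, not part of the notes.

```python
from collections import deque
from itertools import combinations

def is_connected(vertices, edges):
    """Return True if every vertex is reachable from an arbitrary start vertex."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node] - seen:
            seen.add(nbr)
            queue.append(nbr)
    return len(seen) == len(vertices)

def complete_graph(n):
    """Edges of K_n on vertices 0..n-1: one edge per unordered pair."""
    return list(combinations(range(n), 2))

k5 = complete_graph(5)                      # 5 * 4 / 2 = 10 edges
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]   # connected, but not complete
two_parts = [(0, 1), (2, 3)]                # disconnected
```

The 4-cycle is connected with only 4 edges, whereas K4 needs 6, which illustrates that connected does not imply complete.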

Undirected: A graph in which all the edges are bi-directional.
The edges do not point in a specific direction.

Directed (digraph): A graph in which all the edges are uni-directional.

The edges point in a single direction. Pair of vertices in an edge is
ordered.
Weighted Graph:
• A graph with a value associated with every edge. The values
corresponding to the edges are called weights.
• A value in a weighted graph can represent quantities such as
cost, distance, and time, depending on the graph.
• An edge in a weighted graph is represented as (u, v, w), where:
• u is the source vertex
• v is the destination vertex
• w represents the weight associated with going from u to v
Finite Graph: The graph G=(V, E) is called a finite graph if the number of vertices and the number of edges are both finite.

Infinite Graph: The graph G=(V, E) is called an infinite graph if the number of vertices or the number of edges is infinite.

Trivial Graph: If a finite graph has only a single vertex and no edge, it is known as a trivial graph.

Loop:
• A loop (also called a self-loop) is an edge that connects a vertex to
itself.
• An edge with both ends as the same vertex.
• Every loop is a cycle, but not every cycle is a loop: a cycle may pass through several vertices, and it does not repeat edges or vertices except for the starting and ending vertex.

• A path is cyclic if a node appears more than once in its corresponding list of edges.
• A graph without cycles is called an acyclic graph; a connected acyclic undirected graph is called a tree.

Isomorphic :
Two graphs G1=(V1,E1) and G2=(V2,E2) are isomorphic if there exists a bijective mapping between
the vertices in V1 and V2 and there is an edge between two vertices of one graph if and only if there is
an edge between the two corresponding vertices in the other graph.

Checklist
• Are the number of vertices in both graphs the same?
• Yes, both graphs have 4 vertices.
• Are the number of edges in both graphs the same?
• Yes, both graphs have 4 edges.
• Is the degree sequence in both graphs the same?
• Yes, each vertex is of degree 2.
• If the vertices in one graph can form a cycle of length k, can we
find the same cycle length in the other graph?
• Yes, each graph has a cycle of length 4.
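• A "no" to any of the four questions above shows that the graphs are not isomorphic. A "yes" to all four is necessary but, in general, not sufficient; to confirm isomorphism, exhibit a bijection between the vertex sets that preserves edges.
• For the two graphs here such a bijection exists (each is a single cycle of length 4), so they are isomorphic.
• In other words, they are equivalent graphs, just drawn in different forms.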
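For small graphs the definition can be tested directly by trying every bijection between the vertex sets. This is a brute-force sketch (factorial time, so only suitable for a handful of vertices); the function name and the example graphs are hypothetical.

```python
from itertools import permutations

def are_isomorphic(v1, e1, v2, e2):
    """Brute-force isomorphism test for small undirected simple graphs."""
    if len(v1) != len(v2) or len(e1) != len(e2):
        return False
    edges1 = {frozenset(e) for e in e1}
    edges2 = {frozenset(e) for e in e2}
    v1 = list(v1)
    for perm in permutations(v2):
        mapping = dict(zip(v1, perm))          # candidate bijection
        mapped = {frozenset(mapping[x] for x in e) for e in edges1}
        if mapped == edges2:                   # edges preserved both ways
            return True
    return False

# Two drawings of a 4-cycle (hypothetical labels).
square = [(1, 2), (2, 3), (3, 4), (4, 1)]
bowtie = [("a", "c"), ("c", "b"), ("b", "d"), ("d", "a")]

# Same vertex count, edge count and degree sequence, yet not isomorphic.
hexagon = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
```

The hexagon and the pair of triangles agree on the first three checklist questions and differ only on cycle length, which is why each check is a filter rather than a proof.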
• The graph-theoretic distance (or geodesic distance) between connected nodes u
and v is the sum of the weights of the edges in any shortest path connecting the
nodes.
• If no such path exists (i.e., if the vertices lie in different connected components), then
the distance is set equal to infinity.
• In a grid graph the distance between two vertices is the sum of the "vertical" and
the "horizontal" distances.
• The matrix dij consisting of all distances from vertex vi to vertex vj is known as the all-pairs shortest path matrix, or the graph distance matrix.
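One standard way to obtain the graph distance matrix is the Floyd-Warshall algorithm; the notes do not name an algorithm, so this choice is an assumption. The sketch treats edges as undirected and uses infinity for vertices in different connected components. The edge list is hypothetical.

```python
import math

def all_pairs_shortest_paths(n, weighted_edges):
    """Floyd-Warshall on vertices 0..n-1 with undirected (u, v, w) edges."""
    d = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):              # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Hypothetical graph: vertex 3 has no edges, so it is unreachable.
edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2)]
dist = all_pairs_shortest_paths(4, edges)
```

Here the direct edge 0-1 has weight 4, but the route through vertex 2 costs 1 + 2 = 3, so the geodesic distance between 0 and 1 is 3.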

Adjacency matrix:
• An adjacency matrix is a V x V 2D array, where V is the number of vertices. Each row and each column represents a vertex.
• If the value of an element a[i][j] is 1, there is an edge connecting vertex i and vertex j.

Figures: Undirected Graph, Weighted Undirected Graph, Directed Graph

• In an undirected graph, marking edge (A,C) also requires marking edge (C,A), which makes the adjacency matrix symmetric about the diagonal.
• The adjacency matrix of a directed graph is, in general, not symmetric.
• The set of eigenvalues of the graph adjacency matrix is called the graph spectrum.
• The spectrum is useful for identifying the dimensionality of a space in which a graph may be embedded or represented as a set of points (for vertices) and a set of connecting lines (for edges).
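A short sketch of building an adjacency matrix and testing its symmetry. The vertex labels and helper names are hypothetical. Note that a directed graph's matrix is symmetric only in the special case where every edge is matched by its reverse.

```python
def adjacency_matrix(vertices, edges, directed=False):
    """0/1 adjacency matrix; row and column order follow `vertices`."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[index[u]][index[v]] = 1
        if not directed:
            a[index[v]][index[u]] = 1   # mirror the entry for undirected edges
    return a

def is_symmetric(a):
    n = len(a)
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(n))

verts = ["A", "B", "C"]
edges = [("A", "C"), ("A", "B")]
undirected = adjacency_matrix(verts, edges)
directed = adjacency_matrix(verts, edges, directed=True)
two_way = adjacency_matrix(verts, [("A", "B"), ("B", "A")], directed=True)
```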
• A tree is a mathematical structure which is used to model the actual evolutionary
history of a group of sequences or organisms. This actual pattern of historical
relationships is the phylogeny or evolutionary tree.
• A tree consists of nodes connected by branches (edges).

• The nodes and branches of a tree may have various kinds of information associated with them, such as an estimate of the amount of evolution that takes place between each pair of nodes on the tree, which can be represented as branch lengths (or edge lengths).
• Trees with branch lengths are sometimes called weighted trees.

• A tree is a graph in which any two nodes are connected by
exactly one path.
• Trees are acyclic connected graphs.
• Trees may be directed or undirected.
• A tree with one node labeled root is a rooted tree.
• Directed trees are rooted trees; the root of a directed tree is the node having no incoming edges.
• A hierarchical tree is a directed tree with a set of leaf nodes representing a set of objects and a set of parent nodes representing relations among the objects.
• In a hierarchical tree, every node has exactly one parent, except for the root node, which has one or more children and no parent.
• Examples of hierarchical trees: decision trees.
• Shorthand representation of the tree with leaves A, B, C, D, E: (((A,B),C),(D,E))
• A spanning tree is an undirected geometric tree that has n − 1 edges, which define all distances between n nodes.
• A Minimum Spanning Tree (MST) has the shortest total edge length of all possible spanning trees.
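A minimum spanning tree can be found with Kruskal's algorithm: sort the edges by length and keep each edge that joins two previously separate components. The notes do not prescribe an algorithm, so this is one possible sketch; the edge list is hypothetical.

```python
def minimum_spanning_tree(n, weighted_edges):
    """Kruskal's algorithm on vertices 0..n-1 with (u, v, w) edges."""
    parent = list(range(n))           # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted((w, u, v) for u, v, w in weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:                  # edge joins two components: keep it
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
mst = minimum_spanning_tree(4, edges)
total_length = sum(w for _, _, w in mst)
```

For this connected graph on 4 nodes the result has 4 − 1 = 3 edges, as the spanning-tree property requires.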
Cladograms, ultrametric trees and additive trees

Cladograms:
• Branch lengths are meaningless.
• A cladogram shows only the evolutionary relationships among the nodes (leaves A, B, C, D, E in the figure).

Ultrametric Trees (Ultrametric spaces or Chronogram)
• An Ultrametric tree is a rooted tree with edge lengths where all
leaves are equidistant from the root.
• Ultrametric trees represent the molecular clock which states that the
rate of mutation is the same across all lineages of the tree.
• The term "Ultrametric" refers to a specific type of metric space where the distance between any two points is always less than or equal to the maximum of their distances to any third point; that is, the metric satisfies a stronger form of the triangle inequality.
• Mathematically, for points x, y, z in the space, the Ultrametric
inequality is given by:
d(x, y) ≤ max(d(x, z), d(y, z))
• In an Ultrametric tree, the graph-theoretic distances take at most n − 1
possible values, where n is the number of leaves.
• Ultrametric trees have applications in computer science, particularly in
hierarchical clustering algorithms. They are also used in
mathematical analysis and the study of p-adic numbers.
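The ultrametric inequality can be checked on a path-length matrix by testing every triple of points. Both matrices below are hypothetical: the first corresponds to a rooted tree whose leaves are all equidistant from the root, the second to a tree with unequal branch lengths.

```python
from itertools import permutations

def is_ultrametric(d):
    """Check d[x][y] <= max(d[x][z], d[y][z]) for every triple of points."""
    points = range(len(d))
    return all(d[x][y] <= max(d[x][z], d[y][z])
               for x, y, z in permutations(points, 3))

# Leaves A, B, C: A and B join at height 1, C joins them at height 2.
ultra = [[0, 2, 4],
         [2, 0, 4],
         [4, 4, 0]]

# Distances from a tree with unequal branch lengths (not ultrametric).
additive = [[0, 3, 7],
            [3, 0, 6],
            [7, 6, 0]]

distinct_values = {ultra[i][j] for i in range(3) for j in range(3) if i != j}
```

With n = 3 leaves the ultrametric matrix contains only 2 distinct off-diagonal distances, consistent with the bound of n − 1 stated above.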

Additive Trees (Phylogram or Additive hierarchical clustering or Additive
binary trees)
• Branch Lengths measure evolutionary distance.
• The rate of evolution may vary over time.
• Additive trees possess the additive property, which means that the distance between two leaves (data points) is equal to the sum of the edge lengths along the unique path connecting those leaves in the tree.
• Mathematically, for leaves i and j and their most recent common ancestor k, which lies on that path:
d(i, j) = d(i, k) + d(k, j)
• Additive trees are widely used in clustering analysis, classification, and
visualization of relationships within datasets.
• They are applied in fields such as
• Bioinformatics to represent evolutionary relationships,
• Linguistics for language classification, and
• Various domains for exploratory data analysis.
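The additive property can be illustrated by computing leaf-to-leaf distances from branch lengths. In this sketch the tree is stored as a child-to-(parent, branch length) map; the tree, its labels and its lengths are all hypothetical.

```python
def path_to_root(node, parent):
    """List of (ancestor, distance from `node`) pairs, walking up to the root."""
    out, dist = [(node, 0.0)], 0.0
    while node in parent:
        node, length = parent[node]
        dist += length
        out.append((node, dist))
    return out

def leaf_distance(i, j, parent):
    """Sum of branch lengths on the unique path between i and j."""
    up_i = dict(path_to_root(i, parent))
    for node, dist_j in path_to_root(j, parent):
        if node in up_i:              # first shared ancestor k
            return up_i[node] + dist_j
    raise ValueError("nodes are not in the same tree")

# Hypothetical phylogram: child -> (parent, branch length).
parent = {"A": ("x", 2.0), "B": ("x", 1.0),
          "x": ("root", 1.5), "C": ("root", 5.0)}

d_ab = leaf_distance("A", "B", parent)   # 2.0 + 1.0
d_ac = leaf_distance("A", "C", parent)   # 2.0 + 1.5 + 5.0
```

Unlike in an ultrametric tree, leaves A and B sit at different distances from their common ancestor, which is what lets branch lengths encode different amounts of change.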


(a) Ultrametric and (b) additive trees along with their corresponding path-length matrices.

Graph-theoretic Graphics - Graph Drawing
• When a connected graph can be drawn without any edges crossing, it is called
planar.
• When a planar graph is drawn in this way, it divides the plane into regions called
faces.

• The graph above has 3 faces (including the “outside” region as a face).
• The number of faces does not change no matter how you draw the graph (as long
as you do so without the edges crossing), so it makes sense to ascribe the number
of faces as a property of the planar graph.

• If you try to count faces using the graph on the left, you might say there are 5 faces
(including the outside).
• But drawing the graph with a planar representation shows that in fact there are only
4 faces.
• Euler's formula:
• For any connected planar graph with v vertices, e edges, and f faces
v − e + f = 2
• The graph G has 6 vertices with degrees 2, 2, 3, 4, 4, 5
• How many edges does G have?
• Could G be planar?
• If so, how many faces would it have?

• Solution:
No. of edges = (2 + 2 + 3 + 4 + 4 + 5) / 2 = 10, since each edge contributes to the degree of two vertices.

It could be planar. By Euler's formula v − e + f = 2:
6 − 10 + f = 2
f = 6
To make sure that it is actually planar though, we would need to draw a graph with
those vertex degrees without edges crossing.
This can be done by trial and error (and is possible).
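The arithmetic of this exercise can be reproduced in a few lines. The helper names are illustrative. The bound e ≤ 3v − 6 used at the end is a standard necessary condition for simple planar graphs with at least 3 vertices; it is not stated in the notes and is included only as a quick sanity check.

```python
def edge_count(degrees):
    """Each edge adds 1 to the degree of two vertices, so e = sum(deg) / 2."""
    total = sum(degrees)
    if total % 2:
        raise ValueError("degree sum must be even")
    return total // 2

def faces_if_planar(v, e):
    """Faces of a connected planar graph from Euler's formula v - e + f = 2."""
    return 2 - v + e

degrees = [2, 2, 3, 4, 4, 5]
v = len(degrees)
e = edge_count(degrees)
f = faces_if_planar(v, e)

# Necessary (not sufficient) planarity condition for simple graphs, v >= 3.
passes_edge_bound = e <= 3 * v - 6
```

Passing the bound does not prove planarity; as noted above, an actual crossing-free drawing is still needed.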
• Drawing graphs is more than a theoretical exercise.
• Finding compact planar drawings of graphs representing electrical circuits is a critical
application in the semiconductor industry. [If the circuit can be redrawn without any
wires crossing each other, then it is planar].

• The graph-drawing (or graph-layout) problem is as follows.
• Given a planar graph, how do we produce an embedding on the plane or
sphere? And if a graph is not planar, how do we produce a planar layout that
minimizes edge crossings?
• Different types of graphs require different algorithms for clean layouts
• Hierarchical Trees
• Spanning Trees
• Networks
• Directed Graphs
• Tree Maps

Hierarchical Trees:
• Suppose we are given a recursive list of single parents and their children.
• In this list, each child has one parent and each parent has one or more children.
• One node, the root, has no parent.
• This tree is a directed graph because the edge relation is asymmetric.

Figures: Horizontal layout, Vertical layout
Hierarchical Trees:
• Suppose you are given only a list of edges and told to lay out a rooted tree.
• To lay out a tree using only an edge list, the first step is to make an inventory of existing content, i.e., to inventory the parent–child relationships.
• First, identify leaves by locating nodes appearing only once in the edge list.
• Then assign a layer value to each node by finding the longest path to any leaf from that node.
• Then begin with the leaves, group children by parent, and align parents above the middle child in each group.

Example: construct a binary tree.
Given edges : {{'a','b'}, {'a','d'}, {'b','c'}}
An edge may be written as (vertex, parent_vertex) or (parent_vertex, vertex).
Order: (parent_vertex, vertex)
{a,b}, b's parent node is a.
{a,d}, d's parent node is a.
{b,c}, c's parent node is b.
Resulting tree: a is the root with children b and d; c is the child of b.
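The inventory and layering steps can be sketched as follows for edges given in (parent_vertex, vertex) order. One deliberate difference, flagged as an assumption: leaves are identified as nodes that never appear as a parent, rather than nodes appearing once in the list, so that a root with a single child is not mistaken for a leaf. The function name is hypothetical.

```python
def tree_layers(edges):
    """edges: (parent, child) pairs. Returns (leaves, layer), where layer[node]
    is the length of the longest downward path from node to any leaf."""
    children = {}
    nodes = set()
    for parent, child in edges:          # inventory parent-child relations
        children.setdefault(parent, []).append(child)
        nodes.update((parent, child))
    leaves = sorted(n for n in nodes if n not in children)

    layer = {}

    def height(node):
        if node not in layer:
            kids = children.get(node, [])
            layer[node] = 1 + max(map(height, kids)) if kids else 0
        return layer[node]

    for n in nodes:
        height(n)
    return leaves, layer

leaves, layer = tree_layers([("a", "b"), ("a", "d"), ("b", "c")])
```

For the example edge list, c and d land in layer 0, b in layer 1 and the root a in layer 2; horizontal positions would then be assigned by grouping children under each parent.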

Hierarchical Trees:
• The data are adapted from weblogs of a small
website.
• The thicknesses of the branches of the tree are
proportional to the number of visitors navigating
between pages represented by nodes in the tree.

Layout of a website tree

Hierarchical Trees:
• If the nodes of a tree are ordered by an external variable such as joining or splitting distance, locate them on a scale instead of using paternity (the parent–child relationship) to determine ordering.
• The result is an inverted tree-shaped structure called a dendrogram.
• There are two types of hierarchical clustering:
• Agglomerative: The data points are clustered using a bottom-up approach
starting with individual data points.
• Divisive: The top-down approach is followed where all the data points are
treated as one big cluster and the clustering process involves dividing the one big
cluster into several small clusters.

Violent Crime Rates By US State
Hierarchical Trees:
• Hierarchical trees with many leaves can become unwieldy in rectangular layouts; in such cases, use circular layouts based on polar coordinates.
• Circular layouts are popular in biological applications involving many variables
because of their space-saving characteristics.

Polar hierarchical cluster tree of US murder rates
Hierarchical Trees:
• The nodes of hierarchical trees may represent nested
collections of objects.
• Classification and regression trees hierarchically
partition a set of objects.
• The layout shown highlights the marginality of splits, i.e., outlying splits are shifted away from the bulk of the display.
• This layout is relatively inefficient with regard to
space, and it is not well suited to a polar arrangement
because the balance metaphor has no meaning in that
context.
Classification and regression trees

Hierarchical Trees:
• A directed geometric tree with one root having many children.
• Such a tree may represent a flow from a source at the root branching to sinks at the
leaves.
• Example: Water and migration flows

Treemaps:
• Used to identify categories and the proportional size of categories in a data set.
• A treemap visualizes large amounts of hierarchically structured data.
• The structure illustrates the hierarchy of the data content and the area of the
rectangle is proportionate to the amount of data it represents.

A treemap is organized by color in the following ways:

• By value: Each color and rectangle represents one value.
• By category: Each color represents a category and each category is further divided
into multiple levels.
Display modes of Treemap
• Squarified: The default display mode.
• Slice: Displays measures that belong to the same node in a vertically sliced way.
• Dice: Displays measures that belong to the same node in a horizontally sliced way.
• Slice - Dice: Displays stacked hierarchical measures that belong to the same node
in a vertically sliced way.
Treemaps:
• Treemaps are recursive partitions of a space. The simplest form is a nested rectangular partitioning of
the plane.
• To transform a binary tree into a rectangular treemap, start at the root of the tree.
• Partition a rectangle vertically; each block (tile) represents one of the two children of the root.
• Then partition each of the two blocks horizontally so that the resulting nested blocks represent the children of the children.
• Apply the algorithm recursively until all the tree nodes are covered.
• The recursive splits alternate between vertical and horizontal.
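The recursive partition can be sketched as below. The tree encoding (nested lists with numeric leaf sizes) and the function names are assumptions made for the example; the split direction alternates at each level as described above.

```python
def size(node):
    """A node is either a number (leaf size) or a list of child nodes."""
    return node if isinstance(node, (int, float)) else sum(size(c) for c in node)

def treemap(node, x, y, w, h, vertical=True, out=None):
    """Return leaf rectangles (x, y, w, h, size) for a nested-list tree."""
    if out is None:
        out = []
    if isinstance(node, (int, float)):
        out.append((x, y, w, h, node))
        return out
    total = size(node)
    offset = 0.0
    for child in node:
        frac = size(child) / total
        if vertical:   # vertical cuts: children sit side by side along x
            treemap(child, x + offset * w, y, w * frac, h, False, out)
        else:          # horizontal cuts: children stack along y
            treemap(child, x, y + offset * h, w, h * frac, True, out)
        offset += frac
    return out

# Hypothetical binary tree: root -> ([1, 3], 4), drawn in an 8 x 4 rectangle.
rects = treemap([[1, 3], 4], 0, 0, 8, 4)
```

Each leaf's area is proportional to its size: the 8 x 4 rectangle has area 32 for a total size of 8, so a leaf of size s receives area 4s.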

Displaying region-wise customer complaints about a product
Suppose there are 10 different types of complaints (denoted C1 to C10) about a product, and the company wants to visualize which complaints are relevant to each region. In such a case a treemap could be used as shown below.
Here, it can be clearly seen how different regions have specific types of user complaints.

Showcasing category-wise product availability of mobile phones
Let us assume that there are four categories of mobile phones with their market share percentages, i.e., Low-end (up to 10,000 INR: 15%), Mid-Range (10,000-25,000 INR: 55%), Premium (above 25,000 to 50,000 INR: 25%), and Top-end (above 50,000 INR: 10%). Construct the treemap and draw your insights.

From this treemap, we can estimate that there is a bigger demand and market for Mid-Range phones while there are limited phones available in the Top-End category.

Explore customer segmentation for a product
• Usually, companies for apparel or personal products divide their customers based on
their age.
• This way they can categorize their products and the product variants separately for
each age group.
• In the case of this treemap, the company could decide whether to launch more
products for particular customer segments based on the distribution.

Continent          Country                    Area
Northern America   Canada                     9976140
                   United States              9372610
                   Greenland                  2175600
Africa             Central African Republic   622980
                   Cameroon                   475440
                   Zimbabwe                   390580
Asia               Indonesia                  1919440
                   Mongolia                   1565000
                   India                      3287590
Europe             Finland                    337030
                   Ukraine                    603700
                   Poland                     312683
                   Germany                    356910

http://6.anychart.com/products/anychart/docs/users-guide/Tree-Map-Chart.html
• Treemap charts can be used for a variety of presentation types, industries, and areas of
study.
• For Business Analysis: Treemap charts can help businesses compare their sales
numbers of different models and brands. Such businesses will employ treemap charts to
visualize organizational structure, revenue breakdowns, market segmentations, and other
factors over a certain period of time.
• File Systems: Treemaps can identify the allocation of storage space in file systems. These
charts also enable users to identify large data sets, such as files or folders that can occupy
excessive space, through trends and patterns in the data chart.
• Inventory of different trends within a population: Treemap charts can depict literacy
rates or population densities in certain geographic areas over a specific time period.
• Portfolio Management: Treemap charts are also a useful tool for investors in order to
analyze portfolio allocations and assess how their investments are distributed across
resource categories and industries.
• Social Sciences: Researchers and scientists can use treemap charts to refer to
demographic information, inventory of animals, etc. This data chart can help facilitate the
exploration of population trends and other related factors among these distributions.
• Treemaps are a good choice for categorical data visualization
• Treemaps do not support data with negative numbers.
• A treemap ignores negative values.
• Alternatives to Treemaps
• Treemaps prove difficult to read and ineffective when there are too many categories to visualize and the focus is on finding the top ‘n’ categories based on a value, or when there is simply no hierarchy in the data to be plotted.
• A Bar chart can replace a treemap where the data to be plotted has one
quantitative and one categorical variable.
• A Scatter plot could be a replacement where the plotted data has two
quantitative variables.
• Example:
• To identify products with higher sales volume and profits, a 2D scatter plot is
a better option since both variables are quantitative.
• On the other hand, a bar chart could be a better choice if we only intend to
plot sales volume for different products or total revenue.

Treemap Problems
• Too disorderly
  • What does adjacency mean?
  • Uncontrolled aspect ratios lead to lots of skinny boxes that clutter
• Hard to understand
  • Must mentally convert nesting to hierarchy descent
• Color not used appropriately
  • In fact, it is meaningless here
• Wrong application
  • Don’t need all this to just see the largest files in the OS

High-dimensional Data Visualization
• Data sets of dimensions 1,2,3 are common
• Number of variables per class
• 1 - Univariate data
• 2 - Bivariate data
• 3 - Trivariate data
• >3 – Hypervariate / Multivariate data
• One of the biggest challenges in data visualization is to find general representations
of data that can display the multivariate structure of more than two variables.
• Several graphic types like mosaic plots, parallel coordinate plots, trellis displays,
and the grand tour have been developed over the course of the last three
decades.

Mosaic Plot:
• To draw a mosaic plot, begin by placing one categorical variable along the x axis and
subdivide the x axis by the relative proportions that make up the categories.
• Then place the other categorical variable along the y axis and, within each category
along the x axis, subdivide the y axis by the relative proportions that make up the
categories of the y variable.
• The result is a set of rectangles whose areas are proportional to the number of cases
representing each possible combination of the two categorical variables.
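The construction can be sketched by computing the rectangle for each cell inside a unit square: widths come from the x variable's proportions, and heights from the y variable's proportions within each x category. The counts and category names are hypothetical, and the gaps usually drawn between tiles are omitted.

```python
def mosaic_rectangles(table):
    """table[x_cat][y_cat] = count. Returns {(x_cat, y_cat): (x, y, w, h)}
    for tiles inside a unit square."""
    grand_total = sum(sum(col.values()) for col in table.values())
    rects, x = {}, 0.0
    for x_cat, col in table.items():
        col_total = sum(col.values())
        width = col_total / grand_total      # share of the x axis
        y = 0.0
        for y_cat, count in col.items():
            height = count / col_total       # share within this x category
            rects[(x_cat, y_cat)] = (x, y, width, height)
            y += height
        x += width
    return rects

# Hypothetical two-way table of counts.
table = {"Old": {"Yes": 30, "No": 10}, "Young": {"Yes": 20, "No": 40}}
rects = mosaic_rectangles(table)
areas = {cell: w * h for cell, (_, _, w, h) in rects.items()}
```

Each tile's area equals its cell count divided by the grand total, which is the sense in which areas are proportional to the number of cases.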

• A contingency table is simply a table that displays a count (frequency) in each cell
that resides at the column and row intersections of two or more categorical variables.
• Consider a group of individuals for whom data was collected regarding two variables:
hair color (black, brown, red, and blond) and eye color (brown, blue, hazel, and
green).

Mosaic plot representation, in which the counts in the table are approximations of the frequencies.
• The widths of the rectangles represent the proportion of
people with each hair color and their heights represent
the proportion of people with each eye color within each
hair color group.
• The areas represent the numbers in the body of the
contingency table.
• To create a mosaic plot, begin with a large rectangle
and divide it into vertical sections based on the first
categorical variable, in this case hair color, and then
you add a little space between the sections.

• Then divide it into horizontal sections based on the second variable, in this case eye color, and once
again add some space between them.
• Spaces between the sections are conventional, but not necessary. When these spaces are omitted, the graph is sometimes called a Mondrian diagram.
• Mosaic plots represent the data as it is, and do not make any attempt to generalize to the full
population.
• To make inferences about the population, we need to provide measures of statistical significance.
• By the chi-square test, define Pearson residuals which measure the departure of each cell from
independence. The formula is (actual - expected)/sqrt(expected).
• The units are in standard deviations, so a residual greater than 2 or less than -2 represents a departure
significant at the 95% level.
• The expected count under independence is (row marginal)*(column marginal)/(table total).
• The residuals interpretation:
• A cell is shaded blue if we are confident that it is taller than the
other cells in the same row.
• A cell is shaded red if we are confident that it is shorter than the
other cells in the same row.
• If a cell is visibly short, but does not get shaded red, then
there is not enough data to conclude that the cell would continue
to be short if we took another sample.
• A blue cell is usually accompanied by a red cell in the same row, but not always; see, e.g., the bottom row of the plot (green eyes).
• Note that shading does not say anything about the relative
height of boxes in the same column.
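The residual formula above can be applied cell by cell. The 2×2 table here is hypothetical and chosen so the expected counts are round numbers; the function name is illustrative.

```python
import math

def pearson_residuals(table):
    """(actual - expected) / sqrt(expected) for each cell of a two-way table,
    with expected = row total * column total / grand total."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    res = []
    for i, row in enumerate(table):
        res.append([])
        for j, actual in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            res[i].append((actual - expected) / math.sqrt(expected))
    return res

table = [[30, 10],
         [20, 40]]          # hypothetical counts
residuals = pearson_residuals(table)

# Cells whose residual exceeds 2 in absolute value would be shaded.
shaded = [[abs(r) > 2 for r in row] for row in residuals]
chi_square = sum(r * r for row in residuals for r in row)
```

In this table the expected counts are 20, 20, 30 and 30, so only the first row crosses the ±2 threshold. The sum of the squared residuals is the Pearson chi-square statistic.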

2×2×2 contingency table

Data Entry
# Cell counts; R fills arrays with the first dimension (Age) varying fastest
music <- c(210, 194, 170, 110,
           190, 406, 730, 290)
# Shape the vector into a 2 x 2 x 2 table and label its dimensions
dim(music) <- c(2, 2, 2)
dimnames(music) <-
  list(Age = c("Old", "Young"),
       Education = c("High", "Low"),
       Listen = c("Yes", "No"))
The R function which produces mosaic plots is called mosaicplot. The simplest way to produce a
mosaic plot is:
mosaicplot(music)
It is also easy to colour the plot and to add a title.
mosaicplot(music, col = hcl(240),
main = "Classical Music Listening")

Example: Survival on the Titanic
On Sunday, April 14th, 1912 at 11:40pm, the RMS
Titanic struck an iceberg in the North Atlantic. Within
two hours the ship had sunk. At best reckoning 705
survived the sinking, 1,523 did not.
The Data
• There is very good documentation on who survived
and who did not survive the sinking of the Titanic.
• Passengers on the Titanic, cross-classified by:
• Class: 1st, 2nd, 3rd, Crew.
• Sex: Male, Female.
• Age: Child, Adult.
• Survived: No, Yes.

Example: Sexual Discrimination at Berkeley

• In the 1980s, a court case was brought against the University of California at Berkeley by women seeking admission to graduate programs there.

• The women claimed that the proportion of women admitted to Berkeley was much
lower than that for men, and that this was the result of discrimination.

• It is clear that a higher proportion of males is being admitted.

The University Case
• The Dean of Letters and Science at
Berkeley was a famous statistician (Peter
Bickel) and he was able to argue that the
difference in admissions rates was not
caused by sexual discrimination in the
Berkeley admissions policy, but was
caused by the fact that males and
females generally sought admission to
different departments.
• The Dean broke the admissions data
down by department and showed that
within each program there was no
admission discrimination against women.
Indeed, there seemed to be some
admissions bias in favour of women.
• The widths of the boxes are proportional to the percentage of females and males, respectively.
• In fact, 41% of applicants were female and 59% were male.
• The heights of the boxes are proportional to percent admitted.
• In fact, 45% of the male applicants were admitted, while only 30% of the female applicants were
admitted.
• This seems to show a large gender-bias in admission.
• To make the plot easier to interpret, the boxes for admitted females and males are colored blue while the
not admitted females and males are colored pink.
• It is easy to see that females’ blue box on the left is much shorter than the males’ blue box on the right

• To understand the admission pattern, the department to which each applicant applied was also considered.
• In the following plot, the departments are shown across the plot in different colors, from department A
on the left in pink to department F on the right in yellow.
• The percentage of applicants to each department is proportional to the width of the bars.
• It is obvious that departments A and C have the largest number of applicants and departments B and
E have the smallest.

• By construction, the percent admitted within each gender-by-department combination is the width of the corresponding box.

Stratification on department:
• It appears that most departments have no gender bias,
and those departments that are biased favor women.
How can this be?
• First, note that depts A and B have very few female
applicants (the columns are narrow).
• It is also relatively easy to get into those departments: the proportion rejected is lower than in other departments, especially F.
• So one explanation is that more males get in because they are applying to the hungrier, perhaps fastest-growing, departments.
• One problem with the mosaic plot in this context is that
when a proportion is very small, the corresponding box
is nearly invisible.
• Unusually large cells are emphasized in a mosaic plot,
while unusually small cells are hidden.
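The reversal between the overall and per-department admission rates can be reproduced numerically. The counts below are those of R's built-in `UCBAdmissions` dataset; that the slides' plots are based on this dataset is an assumption, although its totals agree with the 45% and 30% admission rates quoted earlier.

```python
# (admitted, rejected) per department and gender, as in R's UCBAdmissions.
admissions = {
    "A": {"Male": (512, 313), "Female": (89, 19)},
    "B": {"Male": (353, 207), "Female": (17, 8)},
    "C": {"Male": (120, 205), "Female": (202, 391)},
    "D": {"Male": (138, 279), "Female": (131, 244)},
    "E": {"Male": (53, 138), "Female": (94, 299)},
    "F": {"Male": (22, 351), "Female": (24, 317)},
}

def rate(admitted, rejected):
    return admitted / (admitted + rejected)

# Aggregated over departments: men appear strongly favoured.
overall = {}
for gender in ("Male", "Female"):
    adm = sum(d[gender][0] for d in admissions.values())
    rej = sum(d[gender][1] for d in admissions.values())
    overall[gender] = rate(adm, rej)

# Stratified by department: the gap shrinks or reverses.
by_dept = {dept: {g: rate(*counts) for g, counts in d.items()}
           for dept, d in admissions.items()}
female_higher = sorted(dept for dept, r in by_dept.items()
                       if r["Female"] > r["Male"])
```

Women have the higher admission rate in four of the six departments, yet the lower rate overall, because most female applicants applied to departments with low admission rates for everyone.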

Mosaic Plot vs Treemaps:
• To draw a mosaic plot, begin by placing one categorical variable along the x axis and
subdivide the x axis by the relative proportions that make up the categories.
• Then place the other categorical variable along the y axis and, within each category
along the x axis, subdivide the y axis by the relative proportions that make up the
categories of the y variable.
• The result is a set of rectangles whose areas are proportional to the number of cases
representing each possible combination of the two categorical variables.

• In a treemap, take an enclosing rectangle and subdivide it into smaller rectangles whose areas represent the proportions.
• In a treemap, recursively nest rectangles inside each other.

Trellis Displays (Lattice Graphics / Lattice Displays)
• Trellis Graphics is a family of techniques for viewing complex, multi-variable data
sets.
• The techniques were given the name Trellis because they usually result in a
rectangular array of plots, resembling a garden trellis.
• A number of statistical software systems provide multi-panel conditioning plots
under the name Trellis plots or Cross plots.
• Trellis displays use a grid-like structure to plot the data conditioned on certain subgroups.
• Each small plot in the grid represents a subset of the data, allowing for the comparison
of multiple conditions or variables simultaneously.
• To make plots comparable across rows and columns, the same scales are used in
all the panel plots.
• This technique is particularly useful for exploring and understanding complex
datasets.
High-dimensional Data Visualization
R : lattice package

Python: plotly or pandas

• The Trellis Technology

• There are a variety of displays which can be produced by Trellis, including:
• Bar Charts
• Dot Charts
• Box and Whisker Plots
• Histograms
• Density Traces
• QQ Plots
• Scatter Plots
• A common framework is used to produce all these plots.
High-dimensional Data Visualization
• Every Trellis display consists of a series of rectangular panels, laid out in a
regular row-by-column array.
• The indexing of the array is left-to-right, bottom-to-top.
• The x axes of all the panels are identical. This is also true for the y axes.
• Each panel of a display corresponds to conditioning, either on the levels of a factor or on sub-intervals of the range of a numeric variable.
• Up to three categorical variables can be used as conditioning variables to form the rows, columns, and pages of the trellis display.
• Strip labels annotate each panel plot with the names of its conditioning categories.
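The panel structure described above can be sketched without a plotting library. The code below conditions on two factors only, indexes panels left-to-right and bottom-to-top, and computes one shared x and y range for all panels. The function name, the field names, and the records are hypothetical.

```python
from itertools import product


def trellis_panels(records, x, y, col_factor, row_factor):
    """Split records into panels conditioned on two factors.

    Returns (panels, xlim, ylim). panels maps (row, col) grid positions to
    {"strip": label, "points": [(x, y), ...]}. Row 0 is the bottom row, so
    panels are indexed left-to-right, bottom-to-top.
    """
    col_levels = sorted({r[col_factor] for r in records})
    row_levels = sorted({r[row_factor] for r in records})
    panels = {}
    for (i, row_level), (j, col_level) in product(
        enumerate(row_levels), enumerate(col_levels)
    ):
        points = [
            (r[x], r[y])
            for r in records
            if r[row_factor] == row_level and r[col_factor] == col_level
        ]
        panels[(i, j)] = {"strip": f"{row_level} / {col_level}", "points": points}
    # One common scale per axis so that panels are directly comparable.
    xs = [r[x] for r in records]
    ys = [r[y] for r in records]
    return panels, (min(xs), max(xs)), (min(ys), max(ys))


# Hypothetical records, for illustration only.
records = [
    {"gender": "Female", "party": "Democrat", "age": 30, "income": 50},
    {"gender": "Female", "party": "Republican", "age": 45, "income": 70},
    {"gender": "Male", "party": "Democrat", "age": 25, "income": 40},
    {"gender": "Male", "party": "Republican", "age": 60, "income": 90},
    {"gender": "Male", "party": "Republican", "age": 35, "income": 55},
]
panels, xlim, ylim = trellis_panels(records, "age", "income", "party", "gender")
for position, panel in sorted(panels.items()):
    print(position, panel["strip"], panel["points"])
print("shared x range:", xlim, "shared y range:", ylim)
```

Conditioning on sub-intervals of a numeric variable would work the same way after replacing the factor levels by interval labels.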

High-dimensional Data Visualization

Figure: Standard non-trellised scatter plot.

Figure: Trellis visualization based on the two variables "Gender" and "Political affiliation". This results in four separate panels representing the combinations Female-Republican, Female-Democrat, Male-Republican, and Male-Democrat.
High-dimensional Data Visualization
Parallel Coordinate Plots or Parallel plot
• Parallel coordinate plots escape the limit of two or three display dimensions and can accommodate many variables at a time by plotting the coordinate axes in parallel.
• When you are trying to visualize high-dimensional numerical data, a single parallel coordinates plot can be more useful than multiple bar/line charts (one for each numerical variable).
• A parallel coordinates plot is used to analyze multivariate numerical data: it allows the samples or observations (series) to be compared across multiple numerical variables.
• Each vertical axis represents a variable and often has its own scale. (The units can even be different.)
• Each observation is then plotted as a line connecting its values across the axes.
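Example: the point (x, y, z, w) = (0, 1, -1, 2) is drawn as a polyline that crosses the four parallel axes x, y, z, and w at those values.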
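The mapping from one observation to its polyline can be written out directly. This is a minimal sketch: the function name is hypothetical, and the common axis range (-2, 2) used for the example point is an assumption chosen for illustration.

```python
def polyline(point, ranges, height=1.0):
    """Map one observation to polyline vertices (axis_index, y).

    ranges[i] = (lo, hi) of the i-th axis; each axis is scaled to [0, height].
    """
    vertices = []
    for i, (value, (lo, hi)) in enumerate(zip(point, ranges)):
        span = hi - lo
        # A constant axis has no spread; place its value in the middle.
        y = height * (value - lo) / span if span else height / 2
        vertices.append((i, y))
    return vertices


# The example point from the slide, with an assumed range of (-2, 2) per axis.
print(polyline((0, 1, -1, 2), [(-2, 2)] * 4))
# [(0, 0.5), (1, 0.75), (2, 0.25), (3, 1.0)]
```

Giving each axis its own (lo, hi) pair corresponds to individually scaled axes; passing the same pair for every axis corresponds to a common scale.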

High-dimensional Data Visualization
• https://r-graph-gallery.com/parallel-plot-ggally.html - R
• https://www.analyticsvidhya.com/blog/2021/11/visualize-data-using-parallel-coordinates-plot/ - Python

# R
library(GGally)
ggparcoord()

# Python, using pandas
import pandas as pd
pd.plotting.parallel_coordinates()

With the pandas interface, we have 2 issues:
1. Cannot control the scale of individual axes
2. Cannot label the (poly-)lines inline

# Python, using the Plotly Express interface
import plotly.express as px
px.parallel_coordinates()

# Python, using Plotly's graph_objects interface
import plotly.graph_objects as go
go.Figure(data=go.Parcoords())
High-dimensional Data Visualization
• The most interesting aspects in using parallel coordinate plots are the
investigation of groups/clusters, outliers, and structures over many variables
at a time.
• Three main uses of parallel coordinate plots in exploratory data analysis :
• Overview: An ideal tool to get a first overview of a data set.
• Profiles: Used to visualize the profile of a single case via highlighting.
• Profiles are not only restricted to single cases but can be plotted for a whole
group, to compare the profile of that group with the rest of the data.
• Monitor: When working on subsets of a data set parallel coordinate plots can
help to relate features of a specific subset to the rest of the data set.

High-dimensional Data Visualization
Sorting (ordering) and Scaling Issues:
• Parallel coordinate plots are especially useful for variables which either have an order, such as time, or all share a common scale.
• Ordering: The order of the axes is critical for finding features, and in typical data analysis and data visualization many reorderings will need to be tried.
• Scaling: The most important scaling option is to either individually scale the
axes or to use a common scale over all axes.
• Scaling options define the alignment of the values, which can be aligned at:
• The mean
• The median
• A specific case
• A specific value
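The scaling and alignment options listed above can be sketched as plain functions on a table of columns. All names are hypothetical, and min-max scaling is one assumed choice of scaling; other choices are possible.

```python
from statistics import mean, median


def scale_individual(columns):
    """Min-max scale every axis separately to [0, 1]."""
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero on constant axes
        scaled[name] = [(v - lo) / span for v in values]
    return scaled


def scale_common(columns):
    """Scale all axes with one shared minimum and maximum."""
    lo = min(min(v) for v in columns.values())
    hi = max(max(v) for v in columns.values())
    span = (hi - lo) or 1.0
    return {name: [(v - lo) / span for v in values] for name, values in columns.items()}


def align(columns, at="mean", case=None, value=None):
    """Shift each axis so that the chosen reference sits at 0.

    at: "mean", "median", "case" (row index `case`), or "value" (constant `value`).
    """
    aligned = {}
    for name, values in columns.items():
        if at == "mean":
            ref = mean(values)
        elif at == "median":
            ref = median(values)
        elif at == "case":
            ref = values[case]
        elif at == "value":
            ref = value
        else:
            raise ValueError(f"unknown alignment: {at}")
        aligned[name] = [v - ref for v in values]
    return aligned


# Hypothetical data: two variables measured on three cases.
columns = {"a": [1, 2, 3], "b": [10, 20, 60]}
print(scale_individual(columns))
print(scale_common(columns))
print(align(columns, at="median"))
```

Individual scaling uses the full height of every axis, while the common scale keeps magnitudes comparable across axes at the cost of compressing variables with a small range.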

Multivariate Data Glyphs: Principles and Practice
• In the context of data visualization, a glyph is the visual representation of a piece
of data where the attributes of a graphical entity are dictated by one or more
attributes of a data record.
• Glyphs add extra dimensions of data to a visualization.
• A glyph consists of a graphical entity with p components, each of which may have r
geometric attributes and s appearance attributes.
• Geometric attributes: shape, size, orientation, position, direction / magnitude of
motion
• Appearance attributes: color, texture, and transparency

• Demo

Multivariate Data Glyphs: Principles and Practice
Mappings
• Graphical attributes to which data values can be mapped include:
• Position (1-, 2-, or 3-D)
• Size (length, area, or volume),
• Shape, orientation,
• Material (hue, saturation, intensity, texture, or opacity),
• Line style (width, dashes), and
• Dynamics (speed of motion, direction of motion, rate of flashing).
• Mappings can be classified as follows:
• One-to-one mappings: Each data attribute maps to a distinct and different
graphical attribute;
• One-to-many mappings: Redundant mappings are used to improve the accuracy and ease with which a user can interpret data values; and
• Many-to-one mappings: Several or all data attributes map to a common type of
graphical attribute, separated in space, orientation, or other transformation.
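The three classes of mapping can be illustrated with a small sketch. The record fields (price, mpg, weight), their ranges, and the function names are hypothetical, chosen only to make the example concrete.

```python
def normalize(value, lo, hi):
    """Rescale value from [lo, hi] to [0, 1]."""
    return (value - lo) / (hi - lo)


def one_to_one(record, ranges):
    """Each data attribute drives a different graphical attribute."""
    return {
        "size": normalize(record["price"], *ranges["price"]),
        "hue": normalize(record["mpg"], *ranges["mpg"]),
        "orientation": 360 * normalize(record["weight"], *ranges["weight"]),
    }


def one_to_many(record, ranges):
    """One data attribute is encoded redundantly as both size and hue."""
    t = normalize(record["price"], *ranges["price"])
    return {"size": t, "hue": t}


def many_to_one(record, ranges):
    """All attributes use one graphical attribute (bar height), separated in space."""
    return [
        {"x_offset": i, "height": normalize(record[name], *ranges[name])}
        for i, name in enumerate(ranges)
    ]


# Hypothetical record and value ranges, for illustration only.
ranges = {"price": (10000, 30000), "mpg": (10, 50), "weight": (2000, 4000)}
record = {"price": 20000, "mpg": 30, "weight": 3000}
print(one_to_one(record, ranges))
print(one_to_many(record, ranges))
print(many_to_one(record, ranges))
```

The many-to-one case is what profile and star glyphs do: every variable becomes a length, and only its position or angle tells the variables apart.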
Multivariate Data Glyphs: Principles and Practice
• Profiles: Height and color of bars.
• Stars: Length of evenly spaced rays emanating from center.
• Stars and Anderson/metroglyphs: Length of rays.
• Stick figure icons: Length, angle, color of limbs.
• Trees: Length, thickness, angles of branches; branch structure derived from analyzing relations between dimensions.
• Autoglyph: Color of boxes.
• Boxes: Height, width, depth of first box; height of successive boxes.
• Faces: Size and position of eyes, nose, mouth; curvature of mouth; angle of eyebrows.
• Arrows: Length, width, taper, and color of base and head.
• Weathervanes: Level in bulb, length of flags.
• Circular profiles: Distance from center to vertices at equal angles.
• Bugs: Wing shapes controlled by time series; length of head spikes (antennae); size and color of tail; size of body markings.
• Wheels: Time wheels create ring of time series plots, value controls distance from base ring; 3D wheel maps time to height, variable value to radius.
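As one concrete instance from the list above, the geometry of a star glyph (evenly spaced rays whose lengths encode the values) can be sketched as follows. The function name is hypothetical, and the values are assumed to be normalized to [0, 1] beforehand.

```python
import math


def star_glyph(values, center=(0.0, 0.0), max_radius=1.0):
    """Return the ray end points of a star glyph.

    values are assumed to be already normalized to [0, 1]; ray k points at
    angle 2*pi*k/n and has length values[k] * max_radius.
    """
    n = len(values)
    cx, cy = center
    points = []
    for k, v in enumerate(values):
        angle = 2 * math.pi * k / n
        r = v * max_radius
        points.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return points


# Four normalized values give rays pointing right, up, left, and down.
for x, y in star_glyph([1.0, 0.5, 1.0, 0.5]):
    print(f"({x:.2f}, {y:.2f})")
```

Connecting consecutive end points instead of drawing the rays gives the outline used by circular profiles.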
Multivariate Data Glyphs: Principles and Practice
Biases in Glyph Mappings
• Perception-based bias: The concept of figure-ground relationship explains why the example image can be perceived either as a vase or as a pair of faces.
• Proximity-based bias: The Gestalt principle of proximity suggests that you see (a) one block of dots on the left side and (b) three columns on the right side.
• Grouping-based bias: When looking at an array of dots, we likely perceive alternating rows of colors. We are grouping these dots according to the principle of similarity. Likewise, when watching a football game, we tend to group individuals based on the colors of their uniforms.
Multivariate Data Glyphs: Principles and Practice
Glyph Placement Strategies
• Data Driven
  • Raw
  • Derived
• Structure Driven
  • Ordered
  • Hierarchical
  • Network
Multivariate Data Glyphs: Principles and Practice
Glyph Layout Options / Placement Strategies
• The position of glyphs can convey many attributes of data, including data values or structure (order,
hierarchy), relationships, and derived attributes.
• Data-driven Placement:
• The data are used to compute or specify the location parameters for the glyph.
• The two categories of this strategy class are raw and derived based on whether the original data
values are used directly or whether positions are derived via computations involving these data
values.
• Derived Techniques: Dimension Reduction Techniques include Principal Component Analysis
(PCA), Multidimensional Scaling (MDS), and Self-Organizing Maps (SOMs).
• Resulting display coordinates have no semantic meaning.
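The difference between raw and derived data-driven placement can be sketched in plain Python. Raw placement picks two original dimensions; derived placement here uses PCA, computed with power iteration and deflation as an assumed, simple way to obtain the leading principal axes. All names and the sample rows are hypothetical, and the sketch assumes the data vary in at least k directions.

```python
import math


def raw_positions(rows, i, j):
    """Raw placement: use two original data dimensions directly as (x, y)."""
    return [(r[i], r[j]) for r in rows]


def center(rows):
    """Subtract the column means from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)] for i in range(d)]


def leading_eigenvector(matrix, iterations=500):
    """Power iteration; returns (eigenvalue, unit eigenvector)."""
    d = len(matrix)
    start = [float(i + 1) for i in range(d)]  # arbitrary non-symmetric start vector
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigenvalue, v


def pca_positions(rows, k=2):
    """Derived placement: project each record onto the first k principal axes."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(k):
        eigenvalue, v = leading_eigenvector(cov)
        axes.append(v)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [tuple(sum(r[j] * a[j] for j in range(d)) for a in axes) for r in centered]


# Hypothetical 3-D records with most spread along the first dimension.
rows = [[3, 0, 0], [-3, 0, 0], [0, 2, 0], [0, -2, 0], [0, 0, 1], [0, 0, -1]]
print(raw_positions(rows, 0, 1))
print([(round(x, 3), round(y, 3)) for x, y in pca_positions(rows)])
```

The derived coordinates are linear combinations of all input dimensions, which is why the resulting display axes carry no direct semantic meaning.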

Multivariate Data Glyphs: Principles and Practice
Glyph Layout Options / Placement Strategies
• Structure implies relationships or connectivity
• Explicit structure: One or more data dimensions drive the structure
• Implicit structure: Structure derived from analyzing data
• Common structures: Ordered, Hierarchical, Network/graph
• Each kind of structure can help drive placement algorithm in distinct ways.

Multivariate Data Glyphs: Principles and Practice
Glyph Layout Options / Placement Strategies
• Ordered structure may be linear (1-D) or grid-based (N-D).
• The data are sorted on one or more dimensions, and this ordering is used to specify the glyph placement.
• Various placement patterns exist for linearly structured data, including raster, radial, and recursive raster.
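Two of the placement patterns for linearly ordered data can be sketched as follows. The function names and records are hypothetical; the radial pattern is assumed here to mean even spacing on a circle, and recursive raster is not covered.

```python
import math


def raster_positions(n, columns):
    """Row-by-row grid positions (col, row) for n ordered glyphs."""
    return [(k % columns, k // columns) for k in range(n)]


def radial_positions(n, radius=1.0):
    """Positions on a circle, in order, starting at angle 0."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


def place_ordered(records, key, columns):
    """Sort records on one dimension and pair each with its raster position."""
    ordered = sorted(records, key=lambda r: r[key])
    return list(zip(ordered, raster_positions(len(ordered), columns)))


# Hypothetical records ordered by year.
records = [{"year": 2003}, {"year": 2001}, {"year": 2005}, {"year": 2002}, {"year": 2004}]
for record, position in place_ordered(records, "year", columns=3):
    print(record["year"], position)
```

Sorting first and placing second keeps the two decisions separate, so the same ordering can be reused with any placement pattern.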

Multivariate Data Glyphs: Principles and Practice
Hierarchical Structure
• Hierarchical structure in data sets can be explicit or implicit.
• Explicit: Each level of the hierarchy is associated with a single data dimension, and the
branches deriving from this level correspond to some number of distinct ranges for that data
dimension.
• Example: Sales data may have dimensions associated with particular time periods,
geographical locations, sales personnel, and products.
• Different hierarchies are generated depending on the order in which the dimensions are
processed. Other examples of explicit hierarchies are file systems and organizational
charts.
• Implicit: Hierarchies are generated algorithmically using clustering or partitioning
algorithms in conjunction with some N-dimensional distance or similarity metric.
• Given a hierarchical structure, the task is to position glyphs on the display in such a way as
to convey the relationships inherent in the structure.
• Node-link graphs vary by
• Where the root node is relative to the rest of the tree (e.g., centered, top-most)
• Relative direction between a node and its children (e.g., radially outward, horizontal, vertical, or alternating horizontal and vertical).
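One of the node-link variants above (root top-most, children placed vertically below their parent) can be sketched as a small recursive layout. The function name and the example hierarchy are hypothetical; this is only one simple way to assign positions.

```python
def layout_tree(tree, root):
    """Top-down node-link layout: returns {node: (x, depth)}.

    tree maps each node to a list of its children. Leaves get consecutive x
    positions; every internal node is centered above its children.
    """
    positions = {}
    next_leaf_x = 0

    def place(node, depth):
        nonlocal next_leaf_x
        children = tree.get(node, [])
        if not children:
            x = float(next_leaf_x)
            next_leaf_x += 1
        else:
            child_xs = [place(child, depth + 1) for child in children]
            x = (child_xs[0] + child_xs[-1]) / 2
        positions[node] = (x, depth)
        return x

    place(root, 0)
    return positions


# Hypothetical explicit hierarchy, for illustration only.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": []}
for node, (x, depth) in layout_tree(tree, "root").items():
    print(f"{node}: x={x}, depth={depth}")
```

Swapping the roles of x and depth gives a horizontal layout, and mapping depth to radius and x to angle gives a radial one.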
Multivariate Data Glyphs: Principles and Practice
Graph/Network Structure
• A generalization of hierarchical structure is that of a graph or network, which
consists of a set of nodes (the data points) and a finite set of directed or undirected
links / connections, each of which represents a relationship between a pair of nodes.
• Relationships are harder to convey through positioning alone; explicit links are needed.
• Many factors to consider in the layout:
• Minimizing crossings
• Uniform node distribution
• Drawing conventions for links
• Centering, clustering subgraphs
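The first criterion, minimizing crossings, needs a way to count them for a given layout. The sketch below counts pairs of straight-line links that properly cross; the function names, node positions, and edge list are hypothetical.

```python
def _orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, 0, or -1."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def edges_cross(a, b, c, d):
    """True if segments ab and cd cross at a point interior to both."""
    return (
        _orientation(a, b, c) * _orientation(a, b, d) < 0
        and _orientation(c, d, a) * _orientation(c, d, b) < 0
    )


def count_crossings(positions, edges):
    """Number of pairs of edges that cross in the given layout."""
    crossings = 0
    for i, (u1, v1) in enumerate(edges):
        for u2, v2 in edges[i + 1:]:
            if len({u1, v1, u2, v2}) < 4:
                continue  # edges sharing a node meet there but do not cross
            if edges_cross(positions[u1], positions[v1], positions[u2], positions[v2]):
                crossings += 1
    return crossings


# A complete graph on four nodes, drawn in two hypothetical layouts.
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3), (2, 4)]
square = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
nested = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0.7, 0.3)}
print(count_crossings(square, edges), count_crossings(nested, edges))
```

The same graph has one crossing when its nodes sit on the corners of a square and none when one node is moved inside the triangle formed by the other three, which is why node placement is part of the layout problem.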

Linked Views for Visual Exploration
• The basic problem in visualization is still the physical limitation of the 2-D presentation space of paper and computer screens.
• Four approaches address this problem and overcome the restrictions of 2-D:
1. Create a virtual reality environment or a pseudo-3-D environment by rotation that
is capable of portraying higher-dimensional data at least in a 3-D setting.
2. Project high-dimensional data onto a 2-D coordinate system by using a data
reduction method such as principal component analysis, projection pursuit,
multidimensional scaling, or correspondence analysis.
3. Use a nonorthogonal coordinate system such as parallel coordinates which is less
restricted by the two-dimensionality of paper.
4. Link low-dimensional displays.

Demo : When you click on a point in the scatter plot, the histogram updates to show
the distribution of the corresponding x-coordinate.

Linked Views for Visual Exploration
• Linking procedures become particularly effective when datasets are complex, i.e., they
are large (many observations) and/or high-dimensional (many variables), consist of a
mixture of categorical and continuous variables, and have a lot of incomplete
observations (missing values).

• The main application focus of linked displays is in statistical exploration of datasets, in particular, addressing issues such as:
• Investigating distributional characteristics,
• Finding unusual or unexpected behavior, and
• Detecting relationships, structure, and patterns

Linked Views for Visual Exploration
• Linked views in data visualization refer to the coordination or connection between multiple graphical displays, allowing users to interactively explore and analyze data from different perspectives simultaneously. The advantages of linked views are:
• Ease of Graphical Displays:
• Enhance the simplicity and clarity of graphical displays.
• Users can easily grasp complex data relationships by observing multiple visualizations simultaneously.
• The interconnected nature helps in presenting information in a visually coherent manner, making it
easier for users to interpret and understand the data.
• Speed of Exploration:
• Facilitate a faster and more efficient exploration of data.
• Users can quickly navigate between different visualizations to uncover patterns, trends, and
relationships.
• Flexibility in Portraying Different Aspects:
• Allows users to portray various aspects of the data seamlessly.
• Comparative Analysis:
• Users can compare not only within the same visualization type but also across different types, enabling a
more comprehensive understanding of the data.
• Interactivity:
• Allows users to dynamically modify parameters or filters. This interactivity empowers users to focus on specific subsets of the data or zoom in on interesting patterns, enhancing the depth of exploration.
Linked Views for Visual Exploration
Theoretical Structures for Linked Views
• Linking views means that two or more plots share and exchange information with each other.
• To achieve the exchange of information, a linking procedure needs to establish a
relationship between two or more plots.
• Once a relation between two plots has been established, the question is which information
is shared and how the sharing of information can be realized?
• To explore the wide range of possible linking schemes and structures, a display is first formalized.
• A data analysis display D consists of a frame F, a type and its associated set of graphical elements G as well as its set of scale-representing axes sG, a model X and its scale sX, and a sample population Ω.
Linked Views for Visual Exploration
1. Frame (F): The frame is the outer boundary or container that defines the spatial limits of the
data display. It provides a structure within which the various graphical elements and
components are organized.
2. Type and Graphical Elements (G):
• The type of the data display refers to its overall format or structure, such as bar chart, line
chart, scatter plot, etc.
• The graphical elements (G) are the individual components or marks used to represent
data points within the chosen type. For example, in a bar chart, the bars themselves would
be the graphical elements.
3. Scale-Representing Axes (sG):
• The scale-representing axes are the axes on the display that represent the scales for the
variables being measured.
4. Model (X) and its Scale (sX):
• The model (X) refers to the mathematical or statistical model used to analyze the data.
• The scale (sX) associated with the model represents the range or values of the variables
in the model.
5. Sample Population (Ω):
• The set of data points or observations that are being analyzed and displayed.
• It is the dataset from which the information for the display is derived.
Linked Views for Visual Exploration
D = (F, (G, sG), (X, sX), Ω).
• The pair ((X, sX), Ω) is the data part and (F, (G, sG)) is the plotting part.
• The linking structure controls the exchange and transfer of information between
different plots.
• The concept introduces the idea of an "active plot" and "passive plots."
• The active plot is the one from which changes or messages are initiated.
• The passive plots, on the other hand, receive these messages and respond
accordingly.
• This distinction is analogous to the sender-receiver relationship in communication
theory.
• The definition of data displays and the abstract concept of linking open the possibility of defining a linking structure as a set of relations among any two components of the two displays.
Linked Views for Visual Exploration

• A general view on possible linking structures between the active plot D1 and the passive plot D2, assuming that information sharing is only possible among identical plot layers, where D1 = (Ω1, X1, G1, F1) and D2 = (Ω2, X2, G2, F2).
• Four types of linking structures:
• Linking frames,
• Linking types,
• Linking models, and
• Linking sample populations
• At the type and at the model level the linking structures can be further differentiated into data linking and scale linking, the latter being used when scales or scale-representing objects are involved in the linking process.
Linked Views for Visual Exploration

• Sharing and exchanging information between two plots can now be resolved in two different
ways.
• The direct linking scheme from one layer in display D1 to the corresponding layer in display
D2 .
• A combined scheme that first propagates the information internally in the active plot to
the sample population layer; then the sample population link is used to connect the two
displays, and the linked information is then internally propagated in the passive plot to the
relevant layers. Hence the most widely used and most important linking structure is
sample population linking.
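The combined scheme can be sketched with a bar chart as the active plot and a histogram as the passive plot. The selection is first propagated inside the active plot from bars to case ids, the case ids are shared through the sample population link, and the passive plot then propagates them to its own bins. All names and data below are hypothetical.

```python
def bar_selection_to_cases(data, category_var, selected_categories):
    """Active plot, internal propagation: selected bars -> case ids."""
    return {i for i, row in enumerate(data) if row[category_var] in selected_categories}


def cases_to_histogram_bins(data, numeric_var, edges, case_ids):
    """Passive plot, internal propagation: case ids -> highlighted count per bin."""
    counts = [0] * (len(edges) - 1)
    for i in case_ids:
        value = data[i][numeric_var]
        for b in range(len(counts)):
            last = b == len(counts) - 1
            # Bins are half-open; the last bin also includes the upper edge.
            if edges[b] <= value < edges[b + 1] or (last and value == edges[-1]):
                counts[b] += 1
                break
    return counts


def link(data, category_var, selected_categories, numeric_var, edges):
    """Sample population linking: both plots refer to the same cases."""
    case_ids = bar_selection_to_cases(data, category_var, selected_categories)
    return cases_to_histogram_bins(data, numeric_var, edges, case_ids)


# Hypothetical sample population shared by both displays.
data = [
    {"dept": "A", "score": 1},
    {"dept": "B", "score": 5},
    {"dept": "A", "score": 7},
    {"dept": "C", "score": 10},
]
print(link(data, "dept", {"A"}, "score", edges=[0, 5, 10]))
print(link(data, "dept", {"B", "C"}, "score", edges=[0, 5, 10]))
```

Because only case ids cross between the two displays, the passive plot can be of any type as long as it is drawn from the same sample population.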
Visualization Techniques for Linked Views
• Replacement
• Overlaying
• Repetition
• Special Forms
Replacement
• In replacement mode, old information is replaced by new information, so there is a risk of losing valuable insights, especially with subsetting and conditioning approaches.
• This loss is particularly notable in the context of marginal distributions. [A marginal distribution gives the probabilities of the values of a subset of the variables without reference to the values of the other variables.]
• The user can only compare the current image with a mental copy of the previous image, and hence the comparison might get distorted.
• Especially in the exploratory stage of data analysis, for which interactive graphics are designed, it is helpful to keep track of changing scenarios and the different plot versions.
Visualization Techniques for Linked Views
Overlaying
• Overlaying is a common strategy for looking at conditional distributions in area plots.
• The conditional distribution of a variable is the distribution of that variable given
specific conditions or values of other variables. It provides insights into how the
distribution of one variable changes when another variable takes on a certain
value or falls within a certain range.
• Provides a framework for comparison between conditional and marginal distributions.
• Overlaying creates two problems:
• Basic restriction in the freedom of parameter choice for the selected subset since the
plot parameters are inherited from the original plot;
• Occlusion/Overplotting: Part of the original display is hidden by new overlaid plot.

• Two categories in the bar chart are selected.

• This selection is propagated to the histogram in which a
histogram representing the selected subset is overlaid.
• The overlaid histogram uses the same axis, scale, and plot
parameters as the original display and hence establishes
comparability between the subgroup and the total sample
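The key property of overlaying, that the subset histogram inherits the bins of the original plot, can be sketched as follows. The function names, the equal-width binning, and the data are assumptions made for illustration.

```python
def histogram(values, lo, hi, bins):
    """Counts per equal-width bin on [lo, hi]; hi falls into the last bin."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range
    for v in values:
        if lo <= v <= hi:
            index = min(int((v - lo) / width), bins - 1)
            counts[index] += 1
    return counts


def overlay(values, selected, bins=5):
    """Histogram of all values plus the selected subset on inherited bins."""
    lo, hi = min(values), max(values)  # plot parameters come from the full sample
    total = histogram(values, lo, hi, bins)
    subset = histogram([v for v, s in zip(values, selected) if s], lo, hi, bins)
    # Share of each bin that is highlighted: the conditional view per bin.
    share = [s / t if t else 0.0 for s, t in zip(subset, total)]
    return total, subset, share


# Hypothetical values and a hypothetical selection propagated from another plot.
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
selected = [True, False, True, True, False, False, False, False, True, True]
total, subset, share = overlay(values, selected)
print(total, subset, share)
```

The subset never chooses its own range or bin width here, which illustrates the first problem listed above: the overlaid plot is comparable to the original precisely because it gives up that freedom.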
Visualization Techniques for Linked Views
Repetition
• The displays are repeated and different views of the
same data are available at the same time.
• The advantage is a more comprehensive picture / complete overview of the data; the impact of parameter changes and user interactions is observable.
• The disadvantage is user confusion from multiple changing displays / views.
Requirements:
• Keep track of user changes / interactions
• Easy, powerful method to rearrange displays on computer screen.
• A condensed form of the repetition strategy that works very well for the purpose of subsetting is juxtaposition. Instead of overlaying the plot for the selected subgroup, it is placed next to the original one such that no overlapping occurs.
• Juxtaposition is a common repetition strategy for subsetting, well known for static plots but not yet widely used in interactive graphical systems.
