
Unit 5-2

The document discusses various data visualization techniques, including pixel-oriented, geometric projection, icon-based, and hierarchical methods, aimed at effectively communicating data through graphical representation. It highlights the challenges of visualizing high-dimensional data and introduces techniques like scatter plots, parallel coordinates, Chernoff faces, and tree-maps. Additionally, it emphasizes the growing interest in visualizing non-numeric data and the role of visualization in data mining processes.

Uploaded by

greekathena0501
Copyright
© All Rights Reserved

Data Visualization:

Pixel-Oriented Visualization Techniques, Geometric Projection
Visualization Techniques, Icon-Based Visualization Techniques,
Hierarchical Visualization Techniques, Visualizing Complex Data
and Relations.
• Data visualization aims to communicate data clearly and effectively
through graphical representation.
• Data visualization has been used extensively in many applications,
for example, at work for reporting, managing business operations,
and tracking progress of tasks.
• We discuss several representative approaches, including
pixel-oriented techniques, geometric projection techniques, icon-based
techniques, and hierarchical and graph-based techniques.
• A simple way to visualize the value of a dimension is to use a pixel where the
color of the pixel reflects the dimension’s value.
• For a data set of m dimensions, pixel-oriented techniques create m windows on
the screen, one for each dimension.
• The m dimension values of a record are mapped to m pixels at the corresponding
positions in the windows.
• The colors of the pixels reflect the corresponding values.
• Inside a window, the data values are arranged in some global order shared by all
windows.
• The order may be obtained by sorting all data records in a way that’s meaningful
for the task at hand.
AllElectronics maintains a customer information table, which consists of four dimensions:
income, credit limit, transaction volume, and age.
We can sort all customers in income-ascending order, and use this order to lay out the customer
data in the four visualization windows.
• A drawback of pixel-oriented visualization techniques is that they cannot help us much in
understanding the distribution of data in a multidimensional space.
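The pixel-oriented layout described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `pixel_windows`, the customer values, the window width, and the gray scale (smaller values drawn lighter) are all assumptions made for the example.

```python
def pixel_windows(records, dimensions, sort_by, width):
    """Return one window (a 2-D grid of gray levels 0-255) per dimension."""
    # Global order shared by all windows: sort the records once.
    ordered = sorted(records, key=lambda r: r[sort_by])
    windows = {}
    for dim in dimensions:
        values = [r[dim] for r in ordered]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1  # avoid division by zero for constant columns
        # Assumed color scale: smallest value -> 255 (light), largest -> 0 (dark).
        levels = [round(255 * (1 - (v - lo) / span)) for v in values]
        # Fill the window row by row, following the shared order.
        windows[dim] = [levels[i:i + width] for i in range(0, len(levels), width)]
    return windows


# Hypothetical customer records (illustrative values only).
customers = [
    {"income": 52000, "credit_limit": 8000, "transaction_volume": 120, "age": 41},
    {"income": 31000, "credit_limit": 3000, "transaction_volume": 300, "age": 29},
    {"income": 78000, "credit_limit": 15000, "transaction_volume": 90, "age": 55},
    {"income": 45000, "credit_limit": 6000, "transaction_volume": 210, "age": 35},
]
dims = ["income", "credit_limit", "transaction_volume", "age"]
windows = pixel_windows(customers, dims, sort_by="income", width=2)
for dim in dims:
    print(dim, windows[dim])
```

Because every window uses the same income-ascending order, the pixel at a given position in each of the four windows belongs to the same customer, which is what lets a viewer compare dimensions position by position.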
• The central challenge the geometric projection techniques try to address is how to
visualize a high-dimensional space on a 2-D display.
• A 3-D scatter plot uses three axes in a Cartesian coordinate system. If it also uses
color, it can display up to 4-D data points.
• The scatter-plot matrix technique is a useful extension to the scatter plot.
• For an n-dimensional data set, a scatter-plot matrix is an n × n grid of 2-D scatter
plots that provides a visualization of each dimension with every other dimension.
• The accompanying figure shows an example, which visualizes the Iris data set. The
data set consists of 50 samples from each of three species of Iris flowers (150 in
total). There are five dimensions in the data set: length and width of sepal and
petal, and species.
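The structure of a scatter-plot matrix can be sketched as follows. This is only an illustration: the function name `scatter_plot_matrix` and the measurement values are made up, and the sketch assumes that only the four numeric dimensions become panels while species would be shown by color.

```python
def scatter_plot_matrix(records, dimensions):
    """Return an n x n grid; cell [i][j] holds the (x, y) points of one panel."""
    grid = []
    for row_dim in dimensions:          # y-axis dimension for this row of panels
        row = []
        for col_dim in dimensions:      # x-axis dimension for this column of panels
            points = [(r[col_dim], r[row_dim]) for r in records]
            row.append({"x": col_dim, "y": row_dim, "points": points})
        grid.append(row)
    return grid


# Hypothetical flower measurements (not taken from the real data set).
flowers = [
    {"sepal_length": 5.0, "sepal_width": 3.4, "petal_length": 1.5, "petal_width": 0.2},
    {"sepal_length": 6.1, "sepal_width": 2.8, "petal_length": 4.3, "petal_width": 1.3},
    {"sepal_length": 6.7, "sepal_width": 3.0, "petal_length": 5.6, "petal_width": 2.1},
]
dims = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
grid = scatter_plot_matrix(flowers, dims)
print(len(grid), "x", len(grid[0]), "panels")
```

With n = 4 numeric dimensions the grid already holds 16 panels, and the panel count grows as n squared.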
• The scatter-plot matrix becomes less effective as the dimensionality increases.
Another popular technique, called parallel coordinates, can handle higher
dimensionality.
• To visualize n-dimensional data points, the parallel coordinates technique draws
n equally spaced axes, one for each dimension, parallel to one of the display axes.
• A data record is represented by a polygonal line that intersects each axis at the
point corresponding to the associated dimension value.
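As a rough sketch of the parallel coordinates construction, the code below computes the vertices of each record's polygonal line. The function name `parallel_coordinates`, the unit-square display, and the min-max scaling of each axis are assumptions chosen for the example.

```python
def parallel_coordinates(records, dimensions, width=1.0, height=1.0):
    """Return one polygonal line (list of (x, y) vertices) per record."""
    n = len(dimensions)
    if n < 2:
        raise ValueError("need at least two dimensions")
    # n equally spaced vertical axes across the display width.
    xs = [i * width / (n - 1) for i in range(n)]
    # Per-dimension value range, used to scale each axis independently.
    ranges = {}
    for dim in dimensions:
        values = [r[dim] for r in records]
        ranges[dim] = (min(values), max(values))
    lines = []
    for r in records:
        line = []
        for x, dim in zip(xs, dimensions):
            lo, hi = ranges[dim]
            span = (hi - lo) or 1  # constant dimensions sit at the axis bottom
            line.append((x, height * (r[dim] - lo) / span))
        lines.append(line)
    return lines


# Hypothetical three-dimensional records.
data = [{"a": 0, "b": 10, "c": 5}, {"a": 10, "b": 0, "c": 5}]
lines = parallel_coordinates(data, ["a", "b", "c"])
print(lines[0])
```

Each record contributes exactly one vertex per axis, so adding a dimension adds one axis rather than a whole row and column of panels.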
• Icon-based visualization techniques use small icons to represent multidimensional
data values. We look at two popular icon-based techniques: Chernoff faces and
stick figures.
• Chernoff faces were introduced in 1973 by statistician Herman Chernoff. They
display multidimensional data of up to 18 variables (or dimensions) as a cartoon
human face.
• Chernoff faces help reveal trends in the data. Components of the face, such as the
eyes, ears, mouth, and nose, represent values of the dimensions by their shape,
size, placement, and orientation.
• For example, dimensions can be mapped to the following facial characteristics:
eye size, eye spacing, nose length, nose width, mouth curvature, mouth width,
mouth openness, pupil size, eyebrow slant.
• Chernoff faces make use of the ability of the human mind to recognize small
differences in facial characteristics and to assimilate many facial characteristics at
once.
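The mapping from dimensions to facial characteristics can be sketched as a simple rescaling step. Everything specific here is an assumption for illustration: the function name `chernoff_parameters`, the drawing ranges in `FEATURE_RANGES`, the sample records, and the choice of which dimension drives which feature. The sketch only computes feature values and does not draw the faces.

```python
# Hypothetical drawing ranges for a few facial features (illustrative only).
FEATURE_RANGES = {
    "eye_size": (0.05, 0.20),
    "nose_length": (0.10, 0.30),
    "mouth_curvature": (-1.0, 1.0),  # -1 = frown, +1 = smile (assumed convention)
}


def chernoff_parameters(records, mapping):
    """Turn each record into facial-feature values; mapping is dimension -> feature."""
    ranges = {}
    for dim in mapping:
        values = [r[dim] for r in records]
        ranges[dim] = (min(values), max(values))
    faces = []
    for r in records:
        face = {}
        for dim, feature in mapping.items():
            lo, hi = ranges[dim]
            t = (r[dim] - lo) / ((hi - lo) or 1)      # normalize to [0, 1]
            f_lo, f_hi = FEATURE_RANGES[feature]
            face[feature] = f_lo + t * (f_hi - f_lo)  # rescale to the feature range
        faces.append(face)
    return faces


# Hypothetical records and an arbitrary dimension-to-feature assignment.
records = [
    {"income": 30000, "age": 25, "satisfaction": 1},
    {"income": 60000, "age": 45, "satisfaction": 3},
    {"income": 90000, "age": 65, "satisfaction": 5},
]
mapping = {"income": "eye_size", "age": "nose_length", "satisfaction": "mouth_curvature"}
faces = chernoff_parameters(records, mapping)
for face in faces:
    print(face)
```

Which dimension is assigned to which feature is a design decision, since viewers do not perceive all facial features equally strongly.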
• The stick figure visualization technique maps multidimensional data to five-piece
stick figures, where each figure has four limbs and a body.
• The accompanying figure shows census data, where age and income are mapped to
the display axes, and the remaining dimensions (gender, education, and so on) are
mapped to stick figures.
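One possible way to encode records as stick figures is sketched below. It assumes that two dimensions give the figure's position and that each remaining dimension controls the angle of one of the five pieces; the function name `stick_figures`, the piece names, the 90-degree angle range, and the census values are all hypothetical.

```python
PIECES = ["body", "left_arm", "right_arm", "left_leg", "right_leg"]


def stick_figures(records, x_dim, y_dim, other_dims, max_angle=90.0):
    """Place one figure per record; other_dims set the angles of the pieces."""
    if len(other_dims) > len(PIECES):
        raise ValueError("a five-piece figure can encode at most five extra dimensions")
    ranges = {}
    for dim in other_dims:
        values = [r[dim] for r in records]
        ranges[dim] = (min(values), max(values))
    figures = []
    for r in records:
        angles = {}
        for piece, dim in zip(PIECES, other_dims):
            lo, hi = ranges[dim]
            span = (hi - lo) or 1
            # Assumed encoding: scale the value linearly to an angle in degrees.
            angles[piece] = max_angle * (r[dim] - lo) / span
        figures.append({"x": r[x_dim], "y": r[y_dim], "angles": angles})
    return figures


# Hypothetical census-style records (illustrative values only).
census = [
    {"age": 30, "income": 40000, "education_years": 12, "household_size": 1},
    {"age": 50, "income": 85000, "education_years": 18, "household_size": 4},
]
figures = stick_figures(census, "age", "income", ["education_years", "household_size"])
print(figures[1])
```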
• The visualization techniques discussed so far focus on visualizing multiple
dimensions simultaneously. However, for a large data set of high dimensionality, it
would be difficult to visualize all dimensions at the same time.
• Hierarchical visualization techniques partition all dimensions into subsets (i.e.,
subspaces). The subspaces are visualized in a hierarchical manner.
• “Worlds-within-Worlds,” also known as n-Vision, is a representative hierarchical
visualization method.
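• Suppose we want to visualize a 6-D data set, where the dimensions are F, X1, ..., X5.
To observe how F changes with the other dimensions, we can fix X3, X4, and X5 at
selected values and plot F against X1 and X2 in a 3-D “inner world”; the position of
that inner world within an “outer world” spanned by X3, X4, and X5 encodes the
fixed values.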
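A minimal sketch of the Worlds-within-Worlds idea for the 6-D case follows. It assumes the usual reading of the method, in which three dimensions (here X3, X4, X5) are held at chosen values and F is tabulated over the remaining two. The function `f` is a made-up stand-in for the data, and `inner_world` is a hypothetical helper name.

```python
def inner_world(f, fixed, x1_values, x2_values):
    """Tabulate F over (X1, X2) with X3, X4, X5 held at the values in `fixed`."""
    c3, c4, c5 = fixed
    # One row per X2 value, one column per X1 value.
    return [[f(x1, x2, c3, c4, c5) for x1 in x1_values] for x2 in x2_values]


def f(x1, x2, x3, x4, x5):
    # Hypothetical dependent dimension F, chosen only to have something to plot.
    return x1 + 2 * x2 + x3 * x4 - x5


# Picking a point (1, 2, 3) in the outer world selects which inner world is shown.
surface = inner_world(f, fixed=(1, 2, 3), x1_values=[0, 1], x2_values=[0, 1])
print(surface)

# Moving to another point in the outer world yields a different inner world.
other = inner_world(f, fixed=(2, 2, 0), x1_values=[0, 1], x2_values=[0, 1])
print(other)
```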
• As another example of hierarchical visualization methods, tree-maps display
hierarchical data as a set of nested rectangles.
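One simple way to lay out a tree-map is the slice-and-dice scheme, sketched below: each node's rectangle is split among its children in proportion to their sizes, alternating the split direction at each level. The slides do not say which layout is used, so the scheme, the function names `size` and `treemap`, and the sample hierarchy are assumptions.

```python
def size(node):
    """Total size of a node: its own size for a leaf, else the sum of its children."""
    if "children" in node:
        return sum(size(child) for child in node["children"])
    return node["size"]


def treemap(node, x, y, w, h, horizontal=True, out=None):
    """Slice-and-dice layout; returns a list of (name, x, y, width, height)."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    total = size(node)  # assumed positive for every internal node
    offset = 0.0
    for child in node.get("children", []):
        share = size(child) / total
        if horizontal:   # split the rectangle left to right
            treemap(child, x + offset * w, y, share * w, h, False, out)
        else:            # split the rectangle top to bottom
            treemap(child, x, y + offset * h, w, share * h, True, out)
        offset += share
    return out


# Hypothetical hierarchy with leaf sizes.
tree = {"name": "root", "children": [
    {"name": "A", "size": 1},
    {"name": "B", "children": [{"name": "B1", "size": 1}, {"name": "B2", "size": 2}]},
]}
rects = {name: (x, y, w, h) for name, x, y, w, h in treemap(tree, 0, 0, 4, 2)}
for name, rect in rects.items():
    print(name, rect)
```

The area of each rectangle is proportional to the size of the subtree it represents, so large branches of the hierarchy are visible at a glance.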
• In early days, visualization techniques were mainly for numeric data. Recently,
more and more non-numeric data, such as text and social networks, have become
available. Visualizing and analyzing such data attracts a lot of interest.
• There are many new visualization techniques dedicated to these kinds of data.
For example, many people on the Web tag various objects such as pictures, blog
entries, and product reviews.
• A tag cloud is a visualization of statistics of user-generated tags. Often, in a tag
cloud, tags are listed alphabetically or in a user-preferred order. The importance
of a tag is indicated by font size or color.
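The statistics behind a tag cloud can be sketched as a frequency count followed by a mapping to font sizes. The function name `tag_cloud`, the point-size range, the linear scaling, and the sample tags are assumptions for illustration; here importance is taken to be how often a tag is used.

```python
from collections import Counter


def tag_cloud(tags, min_pt=10, max_pt=32):
    """Return (tag, count, font size in points), listed alphabetically."""
    counts = Counter(tags)
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # all tags equally frequent -> all get min_pt
    cloud = []
    for tag in sorted(counts):  # alphabetical order
        # Linear scaling from frequency to font size (assumed encoding).
        pt = min_pt + (max_pt - min_pt) * (counts[tag] - lo) / span
        cloud.append((tag, counts[tag], round(pt)))
    return cloud


# Hypothetical user-generated tags.
tags = ["python", "data", "python", "viz", "data", "python"]
cloud = tag_cloud(tags)
for tag, count, pt in cloud:
    print(f"{tag}: used {count}x, {pt}pt")
```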
• The accompanying figure uses a disease influence graph to visualize the
correlations between diseases.
• The nodes in the graph are diseases, and the size of each node is proportional to
the prevalence of the corresponding disease.
• In summary, visualization provides effective tools to explore data. We have
introduced several popular methods and the essential ideas behind them.
• There are many existing tools and methods. Moreover, visualization can be used
in data mining in various aspects.
• In addition to visualizing data, visualization can be used to represent the data
mining process, the patterns obtained from a mining method, and user interaction
with the data.
• Visual data mining is an important research and development direction.
