
EDS Unit 5

The document outlines Unit 5 of a Data Visualization course, covering various visualization techniques including pixel-oriented, geometric projection, icon-based, and hierarchical methods. It details different types of charts and graphs such as pie charts, bar graphs, box plots, histograms, and line graphs, along with regression analysis methods like linear and multiple linear regression. The content emphasizes the importance of graphical representations in understanding and analyzing complex datasets.

Unit 5 Data Visualization


Syllabus
Data Visualization: Pixel-Oriented Visualization, Geometric Projection
Visualization Techniques, Icon-Based Visualization Techniques, Hierarchical
Visualization Techniques, Visualizing complex Data and Relations.
Charts and Graphs: Introduction, Pie Chart: Chart Legend, Bar Chart, Box Plot,
Histogram, Line Graph: Multiple Lines in Line Graph, Scatter Plot.
Regression: Linear Regression Analysis, Multiple Linear regression

Data Visualization
Data visualization is a technique for understanding and analyzing data
using graphical representations. It provides a qualitative overview of large
datasets.

Types of Data Visualizations


Pixel-Oriented Visualization
Pixel-Oriented Visualization represents data using individual pixels, where
each pixel corresponds to a specific data point.

The color of the pixel reflects the value of the data point. The smaller the
value, the lighter the pixel’s color.

It is used to find correlations between different dimensions/attributes in a
dataset.

Example: Heatmaps, Geospatial Visualizations



In the accompanying example, income increases as the credit limit increases.
Customers in the middle range of income are more likely to purchase more.
There is no relationship between age and income.
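As a minimal sketch of the value-to-color rule described above, the following assumes an 8-bit grayscale (255 is white, 0 is black) and min-max scaling. The function names and the sample incomes are hypothetical, chosen only for illustration.

```python
def value_to_gray(value, lo, hi):
    """Map a value in [lo, hi] to a gray level: small values are light."""
    if hi == lo:
        return 255  # all values equal: draw every pixel at the lightest shade
    fraction = (value - lo) / (hi - lo)
    fraction = min(1.0, max(0.0, fraction))  # clamp to [0, 1]
    return round(255 * (1.0 - fraction))


def to_pixels(values):
    """Convert one attribute (a list of numbers) into one gray level per record."""
    lo, hi = min(values), max(values)
    return [value_to_gray(v, lo, hi) for v in values]


incomes = [20, 35, 50, 80]  # hypothetical values, already sorted
pixels = to_pixels(incomes)
print(pixels)
```

With this mapping the smallest value becomes the lightest pixel and the largest becomes the darkest, so a sorted attribute appears as a light-to-dark gradient.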

Geometric Projection Visualization Technique


A drawback of pixel-oriented visualization is that it does little to help us
understand the distribution of data in multidimensional space. This is
where geometric projection visualization comes into play.

Geometric projection visualization techniques help us understand the
distribution of data in multidimensional space.

They project multidimensional data into 2D or 3D space while retaining
relationships between data points.

Example: Scatter Plot, Parallel Coordinates, Landscapes



Landscape

Icon Based Visualization Techniques


This visualization uses icons to represent multidimensional data.

Uses icons, glyphs, or shapes.

There are two popular icon-based visualizations:

Chernoff Faces

Stick Figures

Chernoff Faces


In this icon-based representation, each feature of the face (eyes, mouth,
nose, etc.) represents a data dimension.

Each face represents one n-dimensional data point (up to 18 dimensions).

Example

Stick Figures
It maps data onto a five-piece stick figure. A stick figure consists of four
limbs and one body.

Two dimensions are mapped to the display's X and Y axes, and the remaining
dimensions are mapped to the angle and/or length of the limbs.

Hierarchical Visualization Technique


Hierarchical visualization techniques are used to represent data with
inherent hierarchical structures, such as trees, taxonomies, or nested
relationships, in a visually interpretable format.

These techniques arrange data in layers or levels, showing parent-child
relationships and enabling us to explore/analyze the structure and
distribution of data at various levels of detail.

1. Dimensional Stacking: It is used to show complex data in nested 2D plots
to find patterns.

2. Worlds Within Worlds: It is used to display data in layers within 3D spaces
for easy exploration.

3. Tree Map: It is used to show hierarchical data with nested rectangles that
vary in size and color.

4. Info Cube: It is used to show data in a cube to analyze it from different
angles.

5. 3D Cone Trees: It is used to display hierarchical data in a 3D cone shape
for better navigation.
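To make the tree map idea (item 3) concrete, here is a minimal sketch of how rectangle sizes can follow the data. It assumes a simple left-to-right "slice" layout for a single level of the hierarchy; the function name, the coordinates, and the sample sizes are hypothetical, and real tree map tools use more refined layouts.

```python
def slice_layout(sizes, x, y, width, height):
    """Split one rectangle into side-by-side rectangles.

    Each child gets a width proportional to its size, so its area is
    proportional to its value. Returns name -> (x, y, width, height).
    """
    total = sum(sizes.values())
    rects = {}
    offset = x
    for name, size in sizes.items():
        w = width * size / total
        rects[name] = (offset, y, w, height)
        offset += w
    return rects


# Hypothetical folder sizes laid out in an 8 x 2 rectangle.
rects = slice_layout({"docs": 1, "media": 3}, x=0, y=0, width=8, height=2)
print(rects)
```

Applying the same function again inside each child rectangle (with the direction swapped) would produce the nested rectangles that give the tree map its name.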

Charts and Graphs


Pie Charts
A pie chart is a type of graph that represents data as a circle divided
into slices.

A pie chart requires a categorical variable and a corresponding numerical
variable.

Each slice represents one category, and its size represents the quantity of
data in that category.

The data in a pie chart is represented using percentages (0-100%), with a
different color for each category.
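The percentage rule above can be sketched as follows. This is an illustrative example only: the function name and the sales figures are hypothetical, and the slice angle is simply the same share applied to 360 degrees.

```python
def pie_slices(counts):
    """Return category -> (percentage, angle in degrees) for a pie chart."""
    total = sum(counts.values())
    slices = {}
    for category, count in counts.items():
        share = count / total
        slices[category] = (share * 100, share * 360)
    return slices


sales = {"A": 50, "B": 30, "C": 20}  # hypothetical category totals
slices = pie_slices(sales)
for category, (percent, angle) in slices.items():
    print(f"{category}: {percent:.1f}% of the pie, {angle:.1f} degrees")
```

The percentages always sum to 100, which is why a pie chart suits data that describes parts of a whole.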



Bar Graphs
It is a type of graph that represents data as vertical or horizontal
bars.

The length of each bar corresponds to the value it measures.

Bars have equal width and equal spacing between them.

All bars start from the same baseline.

1. Vertical Bar Graphs

The bars are drawn vertically from bottom to top.

Categories are placed on the horizontal axis, and data values on the vertical
axis.

2. Horizontal Bar Graphs

The bars are drawn horizontally from left to right.

Categories are placed on the vertical axis, and data values on the
horizontal axis.

3. Grouped Bar Graphs

In these bar graphs, bars are grouped into sets.

Each set of data is graphically separated but shown on the same graph.

4. Stacked Bar Graphs

Here each bar is subdivided into sub-bars, stacked end to end.

It is used to represent data that has two categorical variables.
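The "stacked end to end" rule for stacked bar graphs can be sketched by computing where each sub-bar starts and ends. The function name, the sub-category names, and the numbers below are hypothetical, used only to illustrate the idea.

```python
def stack_segments(series):
    """series maps a sub-category to one value per bar.

    Returns sub-category -> list of (start, end) positions, where each
    sub-bar starts at the point the previous one ended.
    """
    n_bars = len(next(iter(series.values())))
    bottoms = [0] * n_bars
    segments = {}
    for name, values in series.items():
        segments[name] = [(bottoms[i], bottoms[i] + values[i]) for i in range(n_bars)]
        bottoms = [bottoms[i] + values[i] for i in range(n_bars)]
    return segments


# Two bars (for example, two months), each split by sales channel.
sales = {"online": [3, 5], "in_store": [4, 1]}
segments = stack_segments(sales)
print(segments)
```

The total length of each bar is the sum of its sub-bars, so a stacked bar graph shows both the overall value and its breakdown.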



Box Plot
A box plot (also called a box-and-whisker plot) is a graphical
representation of the distribution of a dataset.

It shows the minimum, first quartile (Q1), median, third quartile (Q3), and
maximum of the data.

It is particularly useful for visualizing spread and central tendency, and
for identifying outliers.
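The five values a box plot displays can be computed as in the sketch below. Two things here are assumptions not stated in the notes: quartiles are computed with Python's `statistics.quantiles` using its "inclusive" method (other quartile conventions give slightly different values), and outliers are flagged with the common 1.5 x IQR rule. The function name and sample data are hypothetical.

```python
import statistics


def box_plot_summary(data):
    """Return the five-number summary plus outliers for a list of numbers."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in ordered if v < low_fence or v > high_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "outliers": outliers,
    }


summary = box_plot_summary([2, 4, 4, 5, 7, 8, 9, 10, 30])
print(summary)
```

In this sample the value 30 lies far above the third quartile, so it is reported as an outlier rather than stretching the whisker.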

Histogram
It is a graphical representation of the distribution of quantitative data.

It is represented by a set of rectangles adjacent to each other, where each
bar represents a range of values (a class interval).

A frequency distribution can be shown graphically using a histogram.

Unlike a bar graph, a histogram's bars need not have equal width, and there
is no spacing between consecutive bars.
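A minimal sketch of how a histogram's bars are derived from raw data is shown below. It assumes each interval includes its left edge and excludes its right edge, except the last interval, which includes both. The function name, the ages, and the interval edges are hypothetical; the edges are deliberately unequal to show that bar widths may differ.

```python
def histogram_counts(values, bin_edges):
    """Count how many values fall into each interval defined by bin_edges."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            left, right = bin_edges[i], bin_edges[i + 1]
            is_last = i == len(counts) - 1
            if left <= v < right or (is_last and v == right):
                counts[i] += 1
                break
    return counts


ages = [21, 22, 25, 31, 34, 38, 39, 45, 60]
edges = [20, 30, 40, 60]  # the last interval is twice as wide
counts = histogram_counts(ages, edges)

# When widths differ, bar height = frequency / width, so area = frequency.
heights = [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]
print(counts, heights)
```

Dividing by the width keeps a wide interval from looking more frequent than it is, since it is the area of each rectangle that represents the frequency.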

Line Graphs
It is a graphical representation of two or more variables in the form of lines
or curves

1. Simple Line Graph

A single line represents the relationship between two variables over time.

2. Multiple Line Graph

Two or more lines represent the relationship between two variables over
time.

Shows the relationship between two or more similar categories on a single
graph.

3. Compound Line Graph

Multiple lines are combined into a single graph showing different
categories or variables.

Shows the relationship between different categories on a single graph.

Difference between Histogram and Bar Graphs

| Bar graph | Histogram |
| --- | --- |
| The bar graph is the graphical representation of categorical data. | A histogram is the graphical representation of quantitative data. |
| There is equal space between each pair of consecutive bars. | There is no space between consecutive bars. |
| The height of the bars shows the frequency, and the widths of the bars are the same. | The area of the rectangular bars shows the frequency of the data, and the widths of the bars need not be the same. |
| Data can be arranged in any order. | Data is arranged in order of range. |
| The x-axis can represent anything. | The x-axis should represent only continuous data that is in terms of numbers. |



Regression: Linear Regression Analysis &
Multiple Linear Regression
1. Linear Regression Analysis
Definition: Linear regression is a statistical method used to model the
relationship between a dependent variable (Y) and an independent variable
(X) by fitting a straight line to the data points. The goal is to find the equation of
the line that best predicts Y based on X.

Equation:
$$Y = \beta_0 + \beta_1 X + \epsilon$$

Where:

$Y$ is the dependent variable.

$X$ is the independent variable.

$\beta_0$ is the intercept.

$\beta_1$ is the slope (coefficient of $X$).

$\epsilon$ is the error term.


Purpose:

To predict or understand the relationship between two variables.

Useful when you want to predict a continuous outcome based on one
predictor.

Example: Predicting a person’s weight (Y) based on their height (X).
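As a minimal sketch of fitting the line, the following assumes the usual least-squares criterion (choose the intercept and slope that minimize the sum of squared errors), which the notes do not name explicitly. The function name and the height/weight values are hypothetical; the data is constructed to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares estimates of the intercept (beta0) and slope (beta1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx               # slope
    beta0 = mean_y - beta1 * mean_x  # intercept
    return beta0, beta1


heights = [150, 160, 170, 180]  # X, in cm (hypothetical)
weights = [50, 58, 66, 74]      # Y, in kg (hypothetical)
beta0, beta1 = fit_line(heights, weights)

# Predict the weight for a new height using Y = beta0 + beta1 * X.
predicted = beta0 + beta1 * 175
print(beta0, beta1, predicted)
```

Here the fitted slope is about 0.8 kg per cm and the intercept about -70, so the prediction for a height of 175 cm is about 70 kg. With real data the points would scatter around the line, and the error term accounts for that scatter.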

2. Multiple Linear Regression


Definition: Multiple linear regression extends linear regression by modeling the
relationship between a dependent variable (Y) and multiple independent
variables (X1, X2, X3,...). This is used when there are multiple factors that
influence the outcome.

Equation:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \epsilon$$

Where:

$Y$ is the dependent variable.

$X_1, X_2, \ldots, X_n$ are the independent variables.

$\beta_0$ is the intercept.

$\beta_1, \beta_2, \ldots, \beta_n$ are the coefficients for each independent variable.

$\epsilon$ is the error term.


Purpose:

To understand how multiple predictors influence a single outcome.

Useful when there are multiple factors that contribute to the predicted
value.

Example: Predicting a person’s weight (Y) based on height (X1), age (X2), and
gender (X3).
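The coefficients of a multiple linear regression can be estimated as in the sketch below. It assumes the least-squares criterion and solves the resulting normal equations with a small hand-written elimination routine so that no external library is needed; in practice a numerical library would normally be used. All function names and the data are hypothetical: two generic numeric predictors, with Y generated exactly as 1 + 2*X1 + 3*X2 so the answer can be checked.

```python
def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination.

    Uses partial pivoting; assumes the system has a unique solution.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple(rows, ys):
    """Return [beta0, beta1, ..., betan] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]  # (X1, X2) per observation
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]    # exact, no error term
betas = fit_multiple(rows, ys)
print(betas)
```

A categorical predictor such as gender would first need to be encoded as a number (for example 0 or 1) before it could be added as another column.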

Key Differences:
Linear Regression: Involves one independent variable (X) and one
dependent variable (Y).

Multiple Linear Regression: Involves two or more independent variables
(X1, X2, X3, etc.) to predict the dependent variable (Y).
