Analysis Process: Data Visualization

The document discusses data visualization techniques. It covers creating summary tables and charts to interpret and analyze data. Effective design involves maximizing the data-ink ratio by removing unnecessary elements. Tables are useful when exact values or precise comparisons are needed, while charts show patterns and relationships. Examples demonstrate crosstabulations, pivot tables, scatter plots, line charts, and sparklines. The key is choosing the visualization that best conveys insights from the data.

Uploaded by

Chee
Copyright: © All Rights Reserved

11/6/2017

CHAPTER 3
DATA VISUALIZATION

Analysis Process
1. Pose a question
2. Decide what to measure and how to measure it
3. Collect data
4. Summarize and visualize data
5. Analyze data
6. Interpret results


Introduction
Data visualization involves:
◦ Creating a summary table for the data
◦ Generating charts to help interpret, analyze,
and learn from the data
Uses of data visualization:
◦ Helpful for identifying data errors
◦ Reduces the size of your data set by
highlighting important relationships and
trends in the data

Overview of Data
Visualization
EFFECTIVE DESIGN TECHNIQUES


Overview of Data Visualization


Data-ink ratio: Measures the proportion of what Tufte terms
“data-ink” to the total amount of ink used in a table or chart
◦ Edward R. Tufte first described the data-ink ratio
◦ Helpful for creating effective tables and charts for data
visualization
Data-ink: Ink used in a table or chart that is necessary
to convey the meaning of the data to the audience
Non-data-ink: Ink used in a table or chart that serves no
useful purpose in conveying the data to the audience

Table 3.1: Example of a Low Data-Ink Ratio Table


Figure 3.3: Example of a Low Data-Ink Ratio Chart


Table 3.2: Increasing the Data-Ink Ratio by Removing Unnecessary Gridlines

Figure 3.4: Increasing the Data-Ink Ratio by Adding Labels to Axes and Removing Unnecessary Lines and Labels

Tables


Tables
Tables should be used when:
1. The reader needs to refer to specific numerical values
2. The reader needs to make precise comparisons
between different values and not just relative
comparisons
3. The values being displayed have different units or very
different magnitudes

Table 3.3: Table Showing Exact Values for Costs and Revenues
by Month for Gossamer Industries
Figure 3.5: Line Chart of Monthly Costs and Revenues at
Gossamer Industries

Figure 3.6: Combined Line Chart and Table for Monthly Costs and Revenues at Gossamer Industries

Table 3.4: Table Displaying Head Count, Costs, and Revenues at Gossamer Industries

Tables
Table Design Principles
◦ Avoid using vertical lines in a table unless they
are necessary for clarity
◦ Horizontal lines are generally necessary only for
separating column titles from data values or
when indicating that a calculation has taken
place

Figure 3.7: Comparing Different Table Designs

Table 3.5: Larger Table Showing Revenues by Location for 12 Months of Data

Table 3.6: Quality Rating and Meal Price for 300 Los Angeles Restaurants

Table 3.7: Table of Quality Rating and Meal Price for 300 Los Angeles Restaurants (crosstabulation)

Insights gained from the crosstabulation
◦ The greatest number of restaurants in the sample (64) have a
very good rating and a meal price in the $20–29 range
◦ Only two restaurants have an excellent rating and a meal
price in the $10–19 range
◦ The right and bottom margins of the crosstabulation give the
frequency of quality rating and meal price separately

Tables
Crosstabulation: A useful type of table for
describing data of two variables
PivotTable: A crosstabulation in Microsoft
Excel
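
As a rough illustration of what a crosstabulation computes, the Python sketch below counts observations for each pair of categories and adds the row and column margins. The records are made up for illustration and are not the 300-restaurant data set; the function name crosstab is also an assumption of this sketch, not an Excel feature.

```python
from collections import Counter

# Hypothetical (quality rating, meal price range) records, for illustration only.
records = [
    ("Good", "$10-19"),
    ("Very Good", "$20-29"),
    ("Very Good", "$20-29"),
    ("Excellent", "$30-39"),
    ("Good", "$20-29"),
    ("Excellent", "$20-29"),
]

def crosstab(pairs):
    """Count each (row, column) combination and the row and column margins."""
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    return cells, row_totals, col_totals

cells, row_totals, col_totals = crosstab(records)

# Print the table with a "Total" column (right margin) and "Total" row (bottom margin).
rows = sorted(row_totals)
cols = sorted(col_totals)
print(f"{'':<10}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}")
for r in rows:
    counts = "".join(f"{cells[(r, c)]:>8}" for c in cols)
    print(f"{r:<10}{counts}{row_totals[r]:>8}")
print(f"{'Total':<10}" + "".join(f"{col_totals[c]:>8}" for c in cols) + f"{len(records):>8}")
```

A PivotTable produces the same kind of summary interactively; the sketch only shows the counting behind it.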

Figure 3.8: Excel Worksheet Containing Restaurant Data

Figure 3.9: Initial PivotTable Field List and PivotTable Field Report for the Restaurant Data

Figure 3.10: Completed PivotTable Field List and a Portion of the PivotTable Report for the Restaurant Data (Columns H:AK are Hidden)

Figure 3.11: Final PivotTable Report for the Restaurant Data

Figure 3.12: Percent Frequency Distribution as a PivotTable for the Restaurant Data

Figure 3.13: PivotTable Report for the Restaurant Data with Average Wait Times Added

Charts


Table 3.8: Sample Data for the San Francisco Electronics Store

Figure 3.17: Chart for the San Francisco Electronics Store (scatter chart with trendline)

Charts
Charts (or graphs): Visual methods of displaying data
Scatter chart: Graphical presentation of the relationship
between two quantitative variables
Trendline: A line that provides an approximation of the
relationship between the variables
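
The slides only say that a trendline approximates the relationship between the two variables. One common choice, assumed here, is a straight line fitted by ordinary least squares. The Python sketch below computes its slope and intercept; the data points and the name fit_trendline are hypothetical.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_trendline(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

With real data the points scatter around the line, and the fitted slope summarizes the overall direction of the relationship.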

Table 3.9: Monthly Sales Data of Air Compressors at Kirkland Industries

Figure 3.19: Charts for Monthly Sales Data at Kirkland Industries (scatter chart and line chart)

Charts
Line chart: A line connects the points in the chart
◦ Useful for time series data collected over a
period of time (minutes, hours, days, years,
etc.)

Table 3.10: Regional Sales Data by Month for Air Compressors at Kirkland Industries

Figure 3.21: Line Chart of Regional Sales Data at Kirkland Industries

Charts
Sparkline: Special type of line chart
◦ Minimalist type of line chart that can be placed
directly into a cell in Excel
◦ Contains no axes; displays only the line for the
data
◦ Takes up very little space and can be used
effectively to show overall trends in
time series data
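
To make the idea concrete outside Excel, the sketch below draws a text sparkline from Unicode block characters: each value is rescaled between the series minimum and maximum and mapped to one of eight bar heights. The monthly values and the function name sparkline are hypothetical, and this is only an approximation of what Excel renders in a cell.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest

def sparkline(values):
    """Map each value to a block character scaled between the min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no trend to show; use the lowest block throughout.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)

# Hypothetical monthly sales figures for one region.
monthly_sales = [120, 135, 128, 150, 170, 165, 180]
print(sparkline(monthly_sales))
```

Like an Excel sparkline, the output has no axes or labels, so it conveys the overall trend but not exact values.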

Figure 3.22: Sparklines for the Regional Sales Data at Kirkland Industries

Figure 3.23: Charts for Accounts Managed Data (bar chart)

Gentry manages the greatest number of accounts and Williams the fewest

Charts
Bar Charts: Use horizontal bars to display the magnitude
of the quantitative variable
Column Charts: Use vertical bars to display the
magnitude of the quantitative variable
Bar and column charts are very helpful in making
comparisons between categorical variables
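
The sketch below shows, in plain text, the two refinements the following figures illustrate: sorting the bars by magnitude and adding data labels. The manager names and account counts are hypothetical, as is the function name sorted_bar_chart.

```python
def sorted_bar_chart(data, width=30):
    """Return one text line per category: label, horizontal bar, data label.

    Bars are sorted from largest to smallest and scaled so the largest
    value fills `width` characters.
    """
    largest = max(data.values())
    lines = []
    for label, value in sorted(data.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<10} {bar} {value}")
    return lines

# Hypothetical number of accounts managed by each manager.
accounts = {"Manager A": 18, "Manager B": 45, "Manager C": 27}
for line in sorted_bar_chart(accounts):
    print(line)
```

Sorting makes the largest and smallest categories easy to spot, and the trailing data label gives the exact value that the bar length only approximates.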

Figure 3.24: Sorted Bar Chart for Accounts Managed Data

Figure 3.25: Bar Chart with Data Labels for Accounts Managed Data

Figure 3.21: Chart of Accounts Managed (pie chart)

Charts
Pie charts: Common form of chart used to
compare categorical data

Table 3.11: Sample Data on Billionaires per Country

Figure 3.27: Bubble Chart Comparing Billionaires by Country

Charts
Bubble chart:
◦ Graphical means of visualizing three variables in a
two-dimensional graph
◦ Sometimes a preferred alternative to a 3-D graph

Figure 3.28: Heat Map and Sparklines for Same-Store Sales Data

Charts
Heat map: A two-dimensional graphical
representation of data that uses different shades
of color to indicate magnitude
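
A text-only sketch of the same idea: each cell of a two-dimensional table is replaced by a shade character, darker for larger values. The sales grid is hypothetical, and using five shade characters over a single light-to-dark scale is an assumption of this sketch; a real heat map would typically use color.

```python
SHADES = " ░▒▓█"  # lightest to darkest

def heat_map(rows):
    """Replace every value in a 2-D table with a shade scaled to the table's range."""
    flat = [v for row in rows for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    top = len(SHADES) - 1
    return ["".join(SHADES[round((v - lo) / span * top)] for v in row) for row in rows]

# Hypothetical sales by store (rows) and month (columns).
sales = [
    [10, 12, 15, 20],
    [18, 16, 11, 10],
    [10, 14, 18, 22],
]
for line in heat_map(sales):
    print(line)
```

Reading across a row shows how one store changes over time, while reading down a column compares stores in the same month.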


Charts
Additional Charts for Multiple Variables
◦ Stacked column chart: Allows the reader to
compare the relative values of quantitative
variables for the same category in a bar chart
◦ Clustered column (or bar) chart: An
alternative chart to stacked column chart for
comparing quantitative variables
◦ Scatter chart matrix: Useful chart for
displaying multiple variables

Figure 3.25: Stacked-Column Chart for Regional Sales Data for Kirkland Industries

Figure 3.30: Comparing Stacked-, Clustered-, and Multiple-Column Charts for the Regional Sales Data for Kirkland Industries

Table 3.12: Data for New York City Subboroughs

Figure 3.31: Scatter-Chart Matrix for New York City Rent Data

Charts
PivotCharts in Excel
PivotChart: To summarize and analyze data with
both a crosstabulation and charting, Excel pairs
PivotCharts with PivotTables

Figure 3.32: PivotTable and PivotChart for the Restaurant Data

Advanced Data
Visualization
ADVANCED CHARTS
GEOGRAPHIC INFORMATION SYSTEMS CHARTS

Advanced Data Visualization


Parallel-coordinates plot: Chart for examining data with more than
two variables
◦ Includes a different vertical axis for each variable
◦ Each observation is represented by drawing a line on the parallel
coordinates plot connecting each vertical axis
◦ The height of the line on each vertical axis represents the value taken
by that observation for the variable corresponding to the vertical axis

Treemap: Useful for visualizing hierarchical data along multiple
dimensions

Figure 3.33: Parallel Coordinates Plot for Baseball Data

Figure 3.34: SmartMoney’s Map of the Market as an Example of a Treemap

Advanced Data Visualization


Geographic Information Systems Charts
◦ Geographic Information Systems (GIS): A
system that merges maps and statistics to
present data collected over different
geographies
◦ Helps in interpreting data and observing
patterns

Figure 3.35: GIS Chart for Cincinnati Zoo Member Data

Data Dashboards
PRINCIPLES OF EFFECTIVE DATA DASHBOARDS
APPLICATION OF DATA DASHBOARDS

Data Dashboards
Data dashboard: Data visualization tool that illustrates
multiple metrics and automatically updates these metrics
as new data become available
Key performance indicators (KPIs) in dashboards:
◦ Automobile dashboard: current speed, fuel
level, and oil pressure
◦ Business dashboard: financial position,
inventory on hand, and customer service metrics


Data Dashboards
Principles of Effective Data Dashboards
◦ Should provide timely summary information on KPIs that are
important to the user
◦ Should present all KPIs on a single screen that a user can quickly
scan to understand the business’s current state of operations
◦ The KPIs displayed in the data dashboard should convey
meaning to its user and be related to the decisions the user
makes
◦ A data dashboard should call attention to unusual measures
that may require attention
◦ Color should be used to call attention to specific values or to
differentiate categorical variables, but the use of color should
be restrained
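
One way to "call attention to unusual measures" can be sketched as a simple range check: each KPI has an acceptable range, and any reading outside it is flagged for highlighting. The KPI names, readings, and ranges below are hypothetical and are not taken from any dashboard in this chapter; flag_unusual is likewise an illustrative name.

```python
# Hypothetical KPI readings and acceptable (low, high) ranges.
kpis = {"calls_per_hour": 42, "avg_minutes_to_resolve": 95, "open_cases": 12}
limits = {
    "calls_per_hour": (10, 60),
    "avg_minutes_to_resolve": (0, 60),
    "open_cases": (0, 25),
}

def flag_unusual(kpis, limits):
    """Return the names of KPIs whose current value is outside its range."""
    flagged = []
    for name, value in kpis.items():
        low, high = limits[name]
        if not (low <= value <= high):
            flagged.append(name)
    return flagged

# Only flagged KPIs would be highlighted, which keeps the use of color restrained.
print(flag_unusual(kpis, limits))
```

Rerunning the check whenever new data arrive mirrors the automatic updating that the definition of a data dashboard calls for.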

Figure 3.36: Data Dashboard for the Grogan Oil Information Technology Call Center