ANL201 Study Unit 2 - 2023
School of Business
CONFIDENTIAL
Study Unit 2
Science and Art of Data Visualisation
Recap
Overview of Business Performance Measurements
Data Visualisation
The Power of Excel and Power BI
Data: https://www.coe-data.com/Data/SUSS/Superstore.xls
What is Data Visualisation?
• “Data visualisation” refers to transforming figures and raw data into visual objects: points, bars, line plots, maps, etc. “Data visualization is the art of depicting data in a fun and creative way, beyond the possibilities of Excel tables.”
• Charles Miglietti, an expert in data visualization and co-founder of Toucan Toco
• https://www.youtube.com/watch?v=jbkSRLYSojo
Benefits of Data Visualisation
Data Visualisation - Data visualisation in everyday life
https://blog.hubspot.com/marketing/great-data-visualization-examples
https://www.toucantoco.com/en/blog/7-examples-of-data-visualization
https://www.researchgate.net/topic/Data-Visualization
Perceptual Processing Model
Four Components of Data Visualisation
• Visual Cues
• Coordinate Systems
• Scales
• Context
These four components work together and each of them affects the others.
Visual Cues
Visual Cues - Types
1. Position (e.g., scatterplot)
2. Length (e.g., bar chart)
3. Angle (e.g., pie chart)
4. Direction (e.g., line graph)
5. Shape (e.g., scatterplot): shape or symbol is commonly used as a visual cue to differentiate products, categories, or objects in the data
6. Area and Volume (e.g., area charts)
7. Colour
SPOT THE BEAR!
Source: https://dapresy.com/wp-content/uploads/2016/07/Image-4.png;
http://www.sas-sr.com/intro_sas/mercredi/major.PNG
Class Discussion 1
Can you identify the visual cues used on the dashboard?
Colour
Shape
Angles
Length
Direction
Source: https://dapresy.com/wp-content/uploads/2016/07/Image-4.png;
http://www.sas-sr.com/intro_sas/mercredi/major.PNG
Coordinate Systems
Coordinate Systems - Types
The Cartesian coordinate system
‣ Two fixed perpendicular reference lines: the x-axis and the y-axis
The polar coordinate system
‣ The fixed point (0,0) is called the pole
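As a rough illustration of how a polar position (a distance from the pole and an angle) relates to the Cartesian x and y axes, here is a minimal Python sketch. The function names and the sample point are hypothetical and do not come from the slides.

```python
import math

def polar_to_cartesian(r, theta_degrees):
    """Convert a polar point (distance r from the pole, angle in degrees) to (x, y)."""
    theta = math.radians(theta_degrees)
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    """Convert (x, y) back to (distance from the pole, angle in degrees)."""
    r = math.hypot(x, y)
    theta_degrees = math.degrees(math.atan2(y, x))
    return (r, theta_degrees)

# Hypothetical point: 2 units from the pole, at a 90 degree angle.
x, y = polar_to_cartesian(2.0, 90.0)
print(round(x, 6), round(y, 6))
```

A pie chart is a common case: each slice is positioned by an angle around the pole rather than by x and y values.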
The geographic coordinate system
‣ Represents every location on the earth using latitude and longitude
Scales
Scales - Types
Linear Scale
‣ The visual spacing between data points is the same regardless of where the data points are on the axis
Source: https://www.google.com/search?q=examples+of+linear+scales&rlz=1C1GCEA_enSG969SG969
Logarithmic scale
‣ Takes very large, exponentially growing numbers and displays them in a way that is easier for the brain to understand
‣ Is used when numbers multiply by a factor larger than 2 from one time interval to another
Source: https://en.wikipedia.org/wiki/Logarithmic_scale
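To make the difference between linear and logarithmic spacing concrete, the following minimal Python sketch compares the gaps between consecutive points on each type of axis. The values are hypothetical (a series that grows tenfold per period), chosen only for illustration.

```python
import math

# Hypothetical values that grow tenfold per period (illustrative only).
values = [1, 10, 100, 1000, 10000]

# On a linear axis, position is proportional to the value itself,
# so the gaps between consecutive points grow with the data.
linear_gaps = [b - a for a, b in zip(values, values[1:])]

# On a log10 axis, position is proportional to log10(value),
# so each tenfold increase takes up the same amount of space.
log_positions = [math.log10(v) for v in values]
log_gaps = [b - a for a, b in zip(log_positions, log_positions[1:])]

print(linear_gaps)
print(log_gaps)
```

On the linear axis the early points are squeezed together near zero, while on the log axis every step is equally visible.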
Percent scale
‣ Used to represent a part of the whole; its maximum is 100 percent
Source: https://www.skillsyouneed.com/num/percentages.html
Categorical scale
Example: the housing variable with three categories
• for free
• own
• rent
Source: https://www.saedsayad.com/categorical_variables.htm
Time scale
Used to plot temporal data on a linear scale, or to divide the temporal data on a categorical scale, such as by year, month or day
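The two treatments of time can be sketched in Python as follows. This is only an illustration under assumed data: the dates and amounts are hypothetical, and the month label format is one possible choice of category.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records (illustrative values only).
records = [
    (date(2015, 1, 5), 120.0),
    (date(2015, 1, 20), 80.0),
    (date(2015, 2, 3), 200.0),
    (date(2016, 1, 9), 50.0),
]

# Linear treatment: each date becomes a position on a number line,
# here the number of days since the first record.
start = min(d for d, _ in records)
linear_positions = [(d - start).days for d, _ in records]

# Categorical treatment: dates are binned into discrete year-month labels.
monthly_totals = defaultdict(float)
for d, amount in records:
    monthly_totals[d.strftime("%Y-%m")] += amount

print(linear_positions)
print(dict(monthly_totals))
```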
Context
Context
The big idea
‣ Context is a data visualisation component that supports a better understanding of the who, what, when, where and why of the data
‣ Focus-context problem in data visualisation
1. The viewer needs both an overview and the details of the information simultaneously
2. The information needed in the overview may differ from that needed in the detail
3. Both types need to be combined within a single interactive data visualisation
Context
Solving the focus-context problem - distortion
Spatially distorts the data presentation to give more room to the designated points of interest, and to decrease the space given to regions away from those points
Figure: Hyperbolic Tree Browser
Source: https://zylab.files.wordpress.com/2010/09/hyperbolic_tree.png
Context
Solving the focus-context problem - rapid zooming
Allows viewers to zoom rapidly in and out of points of interest
Source: http://www.ceh.ac.uk/sites/default/files/hyrad-static-and-rapid.jpg
Context
Solving the focus-context problem - multiple windows
Allows viewers to have one window that shows an overview of the data, and several other windows that show the expanded details
Source: https://www.devexpress.com/products/net/dashboard/i/demos/winforms-hr-dashboard.png
Tableau (Class Activity)
Class Activity Task
Source: http://www.faculty.virginia.edu/ASTR3130/lablinks/GuidePlots.html
Source: http://www.eea.europa.eu/data-and-maps/daviz/learn-more/chart-dos-and-donts; http://www.mathworks.com/matlabcentral/mlc-downloads/downloads/submissions/35277/versions/3/previews/html/Pie_Chart_2D_2_01.png
Tableau (Class Activity)
Table Join
This file shows 51,290 rows of online shopping customers with orders and returns from 2012 to 2015.
1. Connect to data: global_superstore_2016.xlsx
2. Drag Orders to right side
3. Double click Orders
4. Drag Returns to right side, next to Orders
5. The default join type is an inner join, with Order ID as the join key.
Table Join
Join Type: Result
Inner: The resulting table contains values that have matches in both source tables. When a value doesn't match across both source tables, it is dropped entirely.
Left: The resulting table contains all values from the first source table and corresponding matches from the second source table. When a value in the first source table doesn't have a corresponding match in the second source table, you see a null value in the resulting table.
Right: The resulting table contains all values from the second source table and corresponding matches from the first source table. When a value in the second source table doesn't have a corresponding match in the first source table, you see a null value in the resulting table.
Full outer: The resulting table contains all values from both source tables. When a value from either source table doesn't have a match with the other table, you see a null value in the resulting table.
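The four join types can be illustrated outside Tableau with a minimal Python sketch. The tiny Orders and Returns tables below are hypothetical, each key is assumed to appear at most once per table (the real Orders sheet may repeat an Order ID across rows), and None stands in for a null value.

```python
# Hypothetical miniature versions of the Orders and Returns tables,
# keyed by Order ID (values are illustrative only).
orders = {"A1": 100.0, "A2": 250.0, "A3": 75.0}
returns = {"A2": "Yes", "A9": "Yes"}

def join(left, right, how):
    """Return rows (key, left_value, right_value); None stands in for null."""
    if how == "inner":
        keys = [k for k in left if k in right]
    elif how == "left":
        keys = list(left)
    elif how == "right":
        keys = list(right)
    elif how == "full":
        keys = list(left) + [k for k in right if k not in left]
    else:
        raise ValueError(f"unknown join type: {how}")
    return [(k, left.get(k), right.get(k)) for k in keys]

for how in ("inner", "left", "right", "full"):
    print(how, join(orders, returns, how))
```

Only the inner join drops unmatched keys; the other three keep them and fill the missing side with a null.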
Table Join – Joined Visual
1. Add visual
2. Order date ➔ Columns
3. Click the “+” sign before Year and move Year to Color
4. Returned ➔ Rows
5. Switch Returned Measure ➔ count
6. Sales ➔ Rows
Cross-database Join
1. Go to home page
2. Add MS Excel → Sales 2016
3. Add Text Files → Products 2016
4. Double click “Sheet1” on the right.
5. Drag “Products 2016.csv” to the right.
6. Drag “Sales” to the right
7. Go to a new worksheet
Data Blending
A company owns an Office Equipment store, and a Coffee Chain is attached to some of the stores. They wish to calculate combined revenue data in all US states. They have two tables from these different business units.
1. Go to home page
2. Add MS Excel → Office City
3. Click on database icon next to Purchases (Office City)
4. Click New Data Source
5. Add MS Excel → Coffee Chain
More info: https://help.tableau.com/current/pro/desktop/en-us/multiple_connections.htm
Data Blending
Discuss to identify: Primary and Secondary data sources
1. Rename each Sales measure to include its business unit
2. Rows → State (Office City)
3. Columns → Sales (Office City)
4. Columns → Sales (Coffee Chain)
Data Blending
1. Create Calculated Field in Office City
2. Name it TotalSales
3. Create Formula
SUM([Sheet1 (Coffee Chain)].[Sales Coffee Chain]) +
SUM([Sales Office City])
When the two sales figures are combined into a total this way, the total is not shown unless both business units have sales.
Data Blending
Therefore, a condition is needed to test whether Coffee Chain shows sales. If Coffee Chain does not have sales, only Office City is shown.
1. Edit Measure TotalSales
2. Change Formula to:
IIF(ISNULL(SUM([Sheet1 (Coffee Chain)].[Sales Coffee Chain])),
    SUM([Sales Office City]),
    SUM([Sheet1 (Coffee Chain)].[Sales Coffee Chain]) +
    SUM([Sales Office City])
)
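The logic of the null check in the TotalSales formula can be sketched in Python. This is not Tableau code: the state names and figures are hypothetical, and None stands in for the null that appears when a state has no Coffee Chain rows.

```python
# Hypothetical per-state sales from the two sources; a state missing from
# coffee_chain_sales plays the role of a null in the blended view.
office_city_sales = {"California": 500.0, "Texas": 300.0, "Ohio": 120.0}
coffee_chain_sales = {"California": 200.0, "Texas": 150.0}

def total_sales(state):
    office = office_city_sales[state]
    coffee = coffee_chain_sales.get(state)  # None when there is no match
    # Mirrors IIF(ISNULL(coffee), office, coffee + office)
    if coffee is None:
        return office
    return coffee + office

totals = {state: total_sales(state) for state in office_city_sales}
print(totals)
```

Without the check, adding a null to a number yields a null, which is why the unconditional sum shows no total for states that have only Office City sales.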
Pivot Data from Columns to Rows
Pivot from wide format to long format
Go to home page, connect to data_prep_flights.xlsx
More info: https://help.tableau.com/current/pro/desktop/en-us/pivot.htm
Pivot
Pivot from wide format to long format
To long format, date as rows:
1. Select all the date columns
2. Click “Pivot”
3. Rename the columns
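The wide-to-long reshaping that the pivot steps perform can be sketched in Python. The column names (Airline, Date, Flights) and values are assumptions made for illustration; the actual layout of data_prep_flights.xlsx is not shown in the slides.

```python
# Hypothetical wide-format rows: one column per date (illustrative values only).
wide_rows = [
    {"Airline": "AA", "2016-01-01": 10, "2016-01-02": 12},
    {"Airline": "BB", "2016-01-01": 7, "2016-01-02": 9},
]

def pivot_longer(rows, id_column, name_column, value_column):
    """Turn every non-id column into its own (name, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], name_column: column, value_column: value}
            )
    return long_rows

# The name_column and value_column arguments play the role of the rename step.
long_rows = pivot_longer(wide_rows, "Airline", "Date", "Flights")
for row in long_rows:
    print(row)
```

Two wide rows with two date columns each become four long rows, one per airline and date.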
Split
Split “Employee” column
1. Select “Employee” column, select “Custom
Split”
2. Use the separator “-”, and split off “All”
3. Rename the new columns
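A custom split on “-” that keeps all parts behaves much like a plain string split. The sketch below assumes a hypothetical format for the “Employee” values and hypothetical new column names, since the actual column contents are not shown in the slides.

```python
# Hypothetical "Employee" values (format assumed for illustration).
employees = ["1001-Alice Tan-Sales", "1002-Bob Lee-Finance"]

# Splitting on "-" with no limit returns every part,
# similar to choosing to split off all columns.
split_rows = [value.split("-") for value in employees]

# Rename the resulting columns (these names are assumptions).
column_names = ["Employee ID", "Name", "Department"]
records = [dict(zip(column_names, parts)) for parts in split_rows]
print(records)
```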
Split
Sort
Exercise
1. Use the dataset, survey_results_wide_format (survey_results)
2. Pivot the dataset from wide format to long format
3. Create the following chart
Tableau (Class Activity)
Calculation: Aggregate vs Record-Level
Building Measures
Data: Global_superstore_2016 (orders)
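One way to see why aggregate and record-level calculations differ is to compute a ratio both ways. The Python sketch below uses hypothetical (sales, profit) rows rather than the real orders data; in Tableau terms it roughly contrasts a row-level expression such as [Profit]/[Sales] with an aggregate one such as SUM([Profit])/SUM([Sales]), assuming the data source has those two fields.

```python
# Hypothetical order rows as (sales, profit); values are illustrative only.
orders = [(100.0, 20.0), (400.0, 20.0), (500.0, 160.0)]

# Record-level: the ratio is computed on each row first, then averaged.
row_ratios = [profit / sales for sales, profit in orders]
average_of_row_ratios = sum(row_ratios) / len(row_ratios)

# Aggregate: each measure is summed first, then the ratio is taken once.
ratio_of_sums = sum(p for _, p in orders) / sum(s for s, _ in orders)

print(round(average_of_row_ratios, 4), round(ratio_of_sums, 4))
```

The two results differ because the record-level version gives every row equal weight, while the aggregate version weights rows by their sales.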
Create a Calculated Field
Aggregate Functions
1. Aggregation of a measure
2. Aggregation of a dimension
More information:
https://help.tableau.com/current/pro/desktop/en-us/calculations_aggregation.htm
https://help.tableau.com/current/pro/desktop/en-us/calculations_calculatedfields_aggregate_create.htm
Aggregate Functions
• Attribute (ATTR): provides a way to aggregate dimensions when computing table calculations, which require an aggregate expression.
Exercise
• Create a calculated field that calculates, for each state, the combined sales from office city.xlsx and coffee chain.xlsx
Exercise
Go to Data, Office City, then create a calculated field
ZN(): returns the expression if it is not null; otherwise, it returns zero.
Change the color in “Marks” for each of the “Sales” columns
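The behaviour described for ZN() can be mimicked with a small Python helper. This is an illustrative sketch, not Tableau code: the helper name zn, the state names, and the figures are all hypothetical, and None stands in for a null.

```python
def zn(value):
    """Return value unless it is None (null), in which case return 0."""
    return 0 if value is None else value

# Hypothetical per-state figures; None marks a state missing from one source.
office = {"Ohio": 120.0, "Texas": 300.0}
coffee = {"Ohio": None, "Texas": 150.0}

# Wrapping each term means a missing value contributes zero to the sum.
combined = {state: zn(office[state]) + zn(coffee[state]) for state in office}
print(combined)
```

Wrapping each sales term this way keeps the combined total visible even when one business unit has no sales in a state.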
Quick Table Calculation
Percent of total
• Data: Global_superstore_2016 (orders)
• Rows → Sales
• Columns → Category
• Go to “Rows” and select the drop-down menu of “SUM(Sales)”.
• Select “Add Table Calculation” and change Calculation Type to “Percent of total”.
• Add mark label: press and hold Ctrl, then drag SUM(Sales) Δ to “Label” in Marks
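The arithmetic behind a percent-of-total calculation is simple enough to sketch in Python. The category totals below are hypothetical and are not taken from the Global Superstore data.

```python
# Hypothetical SUM(Sales) per Category (illustrative values only).
sales_by_category = {"Furniture": 300.0, "Office Supplies": 200.0, "Technology": 500.0}

# Each category's share is its sales divided by the grand total.
grand_total = sum(sales_by_category.values())
percent_of_total = {
    category: 100.0 * sales / grand_total
    for category, sales in sales_by_category.items()
}
print(percent_of_total)
```

The shares always add up to 100 percent across whatever set of marks the calculation is computed over.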
Quick Table Calculation
Running total: apply when one dimension is time
Running total for each year: Is this chart correct?
Running total for each year:
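The difference between a running total over the whole table and one that restarts every year can be sketched in Python. The rows are hypothetical, and the grouping by year is only an analogy for choosing how the table calculation is partitioned.

```python
from itertools import accumulate, groupby

# Hypothetical (year, quarter, sales) rows, already sorted by time.
rows = [
    (2014, 1, 10.0), (2014, 2, 20.0), (2014, 3, 30.0),
    (2015, 1, 5.0), (2015, 2, 15.0),
]

# Running total across the whole table: never restarts.
overall = list(accumulate(sales for _, _, sales in rows))

# Running total restarting every year: accumulate within each year group.
per_year = []
for year, group in groupby(rows, key=lambda row: row[0]):
    per_year.extend(accumulate(sales for _, _, sales in group))

print(overall)
print(per_year)
```

If the chart is meant to show a running total for each year, the second list is the intended result: the total resets at the first period of each new year.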
Compare “Compute Using”:
Exercise
Percent of total
More Information:
https://help.tableau.com/current/pro/desktop/en-us/calculations_tablecalculations.htm
Tableau File Extension
File Type (File Extension): Purpose
Tableau Workbook (.twb): Contains information on each sheet and dashboard in a workbook. It holds the details of the fields used in each view and the formula applied to the aggregation of the measures, along with the formatting and styles applied. It also contains the data source connection information and any metadata created for that connection.
Tableau Packaged Workbook (.twbx): Contains the details of the workbook as well as the local data used in the analysis. Its purpose is to be shared with other Tableau Desktop or Tableau Reader users, assuming it does not need data from the server.
Tableau Data Source (.tds): Stores the details of the connection used to create the Tableau report. The connection details include the source type (Excel, relational, SAP, etc.) as well as the data types of the columns.
Tableau Packaged Data Source (.tdsx): Similar to the .tds file, with the addition of the data along with the connection details.
Tableau Data Extract (.tde): Contains the data used in a .twb file in a highly compressed columnar data format, which helps with storage optimization. It also saves the aggregated calculations applied in the analysis. This file should be refreshed to get the updated data from the source.