Data Visualization - Day 1 - in Class Exercises - Connecting To Data - Solution Final
Notes:
Create an Entity Relationship Diagram to better display the data files you have been provided.
Exercise 2: Blending Data
You work for the Global Superstore, which has just merged with Bob’s Coffee Shop. The CEO wants to
better understand how the combined business is performing, so we need to develop some analytics from the
data we have available. Your manager has supplied two Excel files: the Global Superstore file, which contains
two data sets (Orders and Region Manager), and a second file that contains the Bob’s Coffee Shop data.
We will use Alteryx for the data blending and Tableau for the visualization.
Drag the Input Data tool to the worksheet. Connect to the Global Superstore data by browsing for the data file and
then choose the Orders table from the Global Superstore Excel Spreadsheet.
Run the extract using the Run button.
Step 4: Connect to the Bob’s Coffee Shop data, run and check the extract.
Bob’s Coffee Shop data has most of the fields we need, but it is missing a Country field. Because Bob’s Coffee
Shop operates only in the United States, they did not feel the field was necessary. Since we will be combining
this data with the Global Superstore data, we need to add a Country field whose value is United States.
Use the Text Input tool to insert one field, Country, with the value United States.
Append the Country field to the Bob’s Coffee Shop data, using Bob’s as the Target and the United States Text
Input as the Source. This is also a good time to rename the fields we flagged for change when we reviewed the ERD.
Run and check the data. Notice that the Country field with the value United States was appended to the coffee
shop data and that the other field names have changed.
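As a rough sketch of what the append step does, the plain-Python snippet below pairs every Target record with every Source record (here a single row), so each coffee shop row gains a Country value. It only illustrates the logic and is not how Alteryx implements the tool. The sample rows and the field names Order Number, Order ID and Sales are hypothetical stand-ins, not the actual columns in the exercise files.

```python
# Hypothetical sample rows standing in for Bob's Coffee Shop data (Target).
bobs_rows = [
    {"Order Number": "B-001", "Sales": 12.50},
    {"Order Number": "B-002", "Sales": 7.25},
]

# Single-row Text Input (Source) holding the constant Country value.
country_rows = [{"Country": "United States"}]

# Hypothetical rename map standing in for the field changes noted on the ERD.
rename_map = {"Order Number": "Order ID"}


def append_fields(target, source):
    """Pair every target row with every source row (a cross join)."""
    return [{**t, **s} for t in target for s in source]


def rename_fields(rows, mapping):
    """Return copies of the rows with keys renamed according to mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


bobs_with_country = rename_fields(append_fields(bobs_rows, country_rows), rename_map)
for row in bobs_with_country:
    print(row)
```

Because the Source has exactly one row, the record count of the Target does not change; only the new field is added.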
Step 6: Create a Union of the Global Superstore and Bob’s Coffee Shop data with the appended Country.
We use the Union tool to combine these 2 data streams, which hold similar data. We are not joining them;
we are stacking the records of one stream on top of the other.
Run and check the data. There should be 10 warnings for the 10 fields that are not common to both tables.
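The difference between a union and a join can be sketched in plain Python: a union stacks rows and fills any field a stream lacks with an empty value. This is an illustration of the idea under assumed data, not the Alteryx implementation. The rows and the fields Ship Mode and Store Type are hypothetical.

```python
# Hypothetical sample rows; the real files have many more fields and records.
gss_rows = [
    {"Order ID": "G-1", "Region": "Central", "Country": "United States",
     "Sales": 100.0, "Ship Mode": "First Class"},
]
bobs_rows = [
    {"Order ID": "B-1", "Region": "West", "Country": "United States",
     "Sales": 12.5, "Store Type": "Cafe"},
]


def union_rows(*streams):
    """Stack record streams; fields missing from a stream are filled with None."""
    all_fields = []
    for stream in streams:
        for row in stream:
            for field in row:
                if field not in all_fields:
                    all_fields.append(field)
    return [{f: row.get(f) for f in all_fields} for stream in streams for row in stream]


def uncommon_fields(*streams):
    """Fields that are not present in every stream (the ones that raise warnings)."""
    field_sets = [set().union(*(row.keys() for row in stream)) for stream in streams]
    return set.union(*field_sets) - set.intersection(*field_sets)


combined = union_rows(gss_rows, bobs_rows)
print(len(combined), "rows in the combined stream")
print("Fields not common to both:", sorted(uncommon_fields(gss_rows, bobs_rows)))
```

The row count of the result is the sum of the input row counts, which is a quick way to check a union after running it.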
Our combined data now has both the GSS and Bob’s data merged together.
Step 2: Connect to the Global Superstore Region Manager data, run the extract and check the data.
Step 3: Join the combined GSS and Bob’s order data to the Region Manager data.
Use the Join tool to join the two files on the common field Region. Run and check the
output. Remove any duplicate fields (e.g. Right_Region) and Unknown fields.
The J output contains the records from the L and R inputs that match on the common Region field.
The L and R outputs contain any records from the L and R inputs that do not match.
Because every record finds a match on Region, the L and R outputs are empty. You can also join by record
position or on multiple fields.
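The three outputs of the join can be sketched in plain Python as follows. This is a minimal illustration of the J, L and R logic under assumed data; the order rows and manager names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows standing in for the combined orders and the Region Manager table.
orders = [
    {"Order ID": "G-1", "Region": "Central"},
    {"Order ID": "B-1", "Region": "West"},
]
managers = [
    {"Region": "Central", "Manager": "Manager A"},
    {"Region": "West", "Manager": "Manager B"},
]


def join_on(left, right, key):
    """Return (J, L, R): matched rows, unmatched left rows, unmatched right rows."""
    right_by_key = defaultdict(list)
    for r in right:
        right_by_key[r[key]].append(r)
    left_keys = {l[key] for l in left}

    # Merging the dicts keeps a single copy of the key field,
    # which mirrors dropping the duplicate Right_Region field.
    joined = [{**l, **r} for l in left for r in right_by_key.get(l[key], [])]
    left_only = [l for l in left if l[key] not in right_by_key]
    right_only = [r for r in right if r[key] not in left_keys]
    return joined, left_only, right_only


j_out, l_out, r_out = join_on(orders, managers, "Region")
print("J:", j_out)
print("L:", l_out)
print("R:", r_out)
```

With this sample data every Region appears on both sides, so the L and R lists come back empty, as in the exercise.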
Use the Output Data tool to create a Tableau file called GSS and Bobs Data. Choose Tableau as the file format and
select Overwrite Existing Extract File so the file is replaced every time you re-run the workflow.
Run the workflow.
Step 1: Connect to the GSS and Bobs data we just created in Alteryx
We are going to connect to the data live in Tableau. Note that we could instead extract the data into Tableau, and
we could also filter it to select only part of the data. We will be using all of the data. Click Go to Worksheet at
the bottom.
You can see the Dimensions and Measures (Facts) from our star schema on the left-hand side. You also see GSS
and Bobs at the top as our data source.
Exercise 2: Tables / Crosstabs in Tableau – with only one measure
Step 1: Create a Table in Tableau with Sales by Region:
Drag the Sales measure to the Columns shelf and Region to the Rows shelf, then use the Show Me section to format
this as a crosstab.
When you have only one measure, Tableau does not show a title for the measure column. For the title to
show up, you need to move Measure Names to the column itself (not the Columns shelf at the top).
We can adjust the data on the screen to make it look better and get rid of the Title for now.
We can add a Summary to the sheet to show Average, Minimum, Maximum and Median Sales.
Format the Summary data as Currency with 2 decimal places. Format Sales as Currency with no decimal places.
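For reference, the four summary statistics and the two currency formats can be reproduced in a few lines of plain Python. The regional totals below are hypothetical and are not the real exercise figures; the helper name currency is also an assumption made for this sketch.

```python
from statistics import mean, median

# Hypothetical Sales totals by Region, not the real exercise figures.
sales_by_region = {"Central": 1200.00, "East": 849.75, "West": 1430.25}
values = list(sales_by_region.values())

summary = {
    "Average": mean(values),
    "Minimum": min(values),
    "Maximum": max(values),
    "Median": median(values),
}


def currency(amount, decimals):
    """Format a number as US currency with the given number of decimal places."""
    return f"${amount:,.{decimals}f}"


# Sales shown with no decimal places, summary values with two.
for region, sales in sales_by_region.items():
    print(region, currency(sales, 0))
for name, value in summary.items():
    print(name, currency(value, 2))
```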
Exercise 3: Tables / Crosstabs in Tableau
Create a new sheet in Tableau. Drag the 2 measures (Sales and Profit) to the Columns shelf and Region to the Rows
shelf, then use the Show Me section to format this as a crosstab.
b. Shade every other line in light blue
c. Total by Column
d. Sort from highest to lowest by Sales
Sort by Sales
e. Move the Sales Column so it is before the Profit Column
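The crosstab built in these steps amounts to summing both measures per Region, sorting by Sales in descending order, and adding a column total. A minimal plain-Python sketch of that calculation, using hypothetical order rows, might look like this:

```python
from collections import defaultdict

# Hypothetical order rows; the real data set has many more fields and records.
orders = [
    {"Region": "Central", "Sales": 100.0, "Profit": 20.0},
    {"Region": "West", "Sales": 250.0, "Profit": 35.0},
    {"Region": "Central", "Sales": 50.0, "Profit": -5.0},
]

# Sum Sales and Profit per Region.
totals = defaultdict(lambda: {"Sales": 0.0, "Profit": 0.0})
for row in orders:
    totals[row["Region"]]["Sales"] += row["Sales"]
    totals[row["Region"]]["Profit"] += row["Profit"]

# Sort regions from highest to lowest Sales.
table = sorted(totals.items(), key=lambda item: item[1]["Sales"], reverse=True)

# Column totals across all regions.
grand_total = {m: sum(v[m] for v in totals.values()) for m in ("Sales", "Profit")}

# Print Sales before Profit, followed by the total row.
for region, measures in table:
    print(f"{region:<10}{measures['Sales']:>10,.0f}{measures['Profit']:>10,.0f}")
print(f"{'Total':<10}{grand_total['Sales']:>10,.0f}{grand_total['Profit']:>10,.0f}")
```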
Group the following Regions into the United States: Central, South, North, West and East.
Rename the group to United States.
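Conceptually, a group maps several field values to one label and leaves the remaining values unchanged. The plain-Python sketch below shows that mapping for the five regions named above; the sales figures and the EMEA entry are hypothetical sample values.

```python
# The five regions the exercise groups under one label.
us_regions = {"Central", "South", "North", "West", "East"}


def region_group(region):
    """Map a region to its group label; ungrouped regions keep their own name."""
    return "United States" if region in us_regions else region


# Hypothetical Sales by Region, not the real exercise figures.
sales_by_region = {"Central": 100.0, "West": 250.0, "EMEA": 75.0}

sales_by_group = {}
for region, sales in sales_by_region.items():
    label = region_group(region)
    sales_by_group[label] = sales_by_group.get(label, 0.0) + sales

print(sales_by_group)
```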
Call this sheet Sales and Profit by Region
Exercise 2: Create a Table that supports a Hierarchy
Step 1: Create a new hierarchy within the GSS and Bobs data set that includes Category, Sub-Category and
Product Name
a. Notice that the GSS and Bobs data set has Category, Sub-Category and Product Name fields, so we can create
a hierarchy from them. Drag Sub-Category onto Category. This creates a hierarchy called Product
Category, Product Sub-Category. Move Product Name under Product Sub-Category to add it to the
hierarchy. Rename the hierarchy to Product Hierarchy.
b. Let’s look at Sales by this new Product Hierarchy. Move Sales to Rows and Product Hierarchy to Columns.
Click the plus sign to the left of the hierarchy on the Columns shelf to expand it to the next level.
c. I’d like to switch my columns and rows. I can do that by clicking the Swap button.
d. I changed my mind and want to go back to what I had before. Click the Back (<-) button.
e. Add Profit and change the chart to a table (crosstab) that includes Sales and Profit by Product
Category, Product Sub-Category and Product Name, then move Sales before Profit.
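Expanding a hierarchy level by level is equivalent to grouping by one more field each time. The sketch below illustrates this in plain Python; the product rows are hypothetical sample data and the function name rollup is an assumption made for this example.

```python
from collections import defaultdict

# Hypothetical product rows, not the real exercise data.
rows = [
    {"Category": "Technology", "Sub-Category": "Phones", "Product Name": "Phone A", "Sales": 300.0},
    {"Category": "Technology", "Sub-Category": "Phones", "Product Name": "Phone B", "Sales": 200.0},
    {"Category": "Technology", "Sub-Category": "Copiers", "Product Name": "Copier A", "Sales": 400.0},
    {"Category": "Furniture", "Sub-Category": "Chairs", "Product Name": "Chair A", "Sales": 150.0},
]

# Fields in the hierarchy, from the top level down.
hierarchy = ["Category", "Sub-Category", "Product Name"]


def rollup(records, fields):
    """Sum Sales for each distinct combination of the given fields."""
    totals = defaultdict(float)
    for record in records:
        totals[tuple(record[f] for f in fields)] += record["Sales"]
    return dict(totals)


# Each extra level corresponds to one click on the plus sign.
for depth in range(1, len(hierarchy) + 1):
    print(hierarchy[:depth])
    for key, sales in rollup(rows, hierarchy[:depth]).items():
        print("  ", " / ".join(key), f"{sales:,.0f}")
```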
Call this sheet Sales and Profit by Product Hierarchy
Exercise 3: Ratio of Sales by Product Category
a. Create a table (Crosstab) using Sales and Product Category and Total by Sales
c. Add the Sales Measure to the Table by moving it to the Rows
d. Change the name of the % column to Sales %
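The Sales % column in this exercise is presumably a percent-of-total ratio: each category's Sales divided by the total Sales across all categories. A minimal plain-Python sketch of that arithmetic, with hypothetical category totals, is:

```python
# Hypothetical Sales by Product Category, not the real exercise figures.
sales_by_category = {"Furniture": 200.0, "Office Supplies": 300.0, "Technology": 500.0}

total_sales = sum(sales_by_category.values())

# Each category's share of total Sales.
sales_pct = {category: sales / total_sales for category, sales in sales_by_category.items()}

for category, sales in sales_by_category.items():
    print(f"{category:<16}{sales:>10,.0f}{sales_pct[category]:>8.1%}")
print(f"{'Total':<16}{total_sales:>10,.0f}{sum(sales_pct.values()):>8.1%}")
```

The percentages should always add up to 100%, which is a useful check on the finished table.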
Exercise 4: Save
Call it DV ICE 1
Exercise 5: Answer the following
a. Which Customer has purchased the most from the Global Superstore?
Sean Miller
$27,469
d. What is our top selling product?
Cisco TelePresence System EX90