TABLEAU PREP

FOR BUSINESS INTELLIGENCE

With Best-Selling Tableau Instructor Dustin Cabral

*Copyright Maven Analytics, LLC


Course Structure

This is a project-based course for students looking for a practical, hands-on, and highly
engaging approach to learning Tableau Prep for business intelligence

Course resources include:

Downloadable Ebook to serve as a helpful reference when you’re offline or on the go

Quizzes & Homework Exercises to reinforce key concepts, with step-by-step solutions

Bonus Projects to test your abilities and apply the skills developed throughout the course



Course Outline

1 Intro to Tableau Prep: Download Tableau Prep, explore the Tableau Prep visual dictionary, and discover community features

2 Connecting to Data: Start and open a workflow, connect to data sources, manage field metadata and properties, perform wildcard unions and merge fields

3 Examining & Filtering: Build and organize your flow, review data types and size, and filter your data using values and calculations

4 Operations & Calculations: Leverage value and field operations like group, clean, convert, and split, and create custom calculations (LODs and more)

5 Combining & Pivoting: Combine and pivot your various data by leveraging aggregate, join, union and pivot tools

6 Sharing & Updating: Create full and incremental refresh outputs to deliver data to Tableau Server, databases or local flat files



Introducing the Course Project

THE SITUATION
You’ve just been hired by Maven Charter Schools, an up-and-coming private education institution. They have a wealth of public and private school data, but need help cleaning and transforming it in order to expose meaningful patterns and insights.

THE BRIEF
Maven Charter Schools would like you to examine, clean, shape, combine and share competitive education data from the Massachusetts education market.
All you’ve been given is a folder of Excel/CSV files containing information about teacher pay and performance, student SAT scores, pupil expenditures, and graduation rates by school and district.

THE OBJECTIVE
Use Tableau Prep to:
• Connect to multiple data sources
• Examine and filter your data
• Clean and shape fields
• Combine and aggregate data
• Share and update curated data sources



Setting Expectations

1 This course is designed to get you up & running with Tableau Prep
• Our goal is to provide a deep foundational understanding of Tableau Prep Builder; we won’t cover advanced topics like R/Python or Tableau Prep Server integration in depth

2 What you see on your screen may not always match mine
• Tableau Prep updates on a monthly basis for minor releases and quarterly/yearly for major releases, so features and functionality may change over time

3 This course is primarily geared towards data cleansing and preparation
• Data visualization is another key component of the analytics and business intelligence workflow, which we cover in depth in separate courses (Tableau Desktop for Beginners and Advanced Tableau Desktop)

4 We will not cover Tableau Prep Conductor as part of this course
• This course will focus on Tableau Prep Builder specifically; online flow automation and collaboration features will be reviewed at a high level only



Introducing Tableau Prep



Meet Tableau Prep

Tableau Prep is a self-service data preparation tool, providing users with visual and intuitive
tools to combine, shape, and clean raw data for analysis



Tableau User Roles

Tableau Prep is included as part of the Tableau Creator role, which includes Tableau Prep
Builder, Tableau Desktop, and one license of Tableau Server or Tableau Online

USER ROLES: Creator, Explorer, Viewer

PRODUCTS: Desktop, Prep, Server



Downloading Tableau Prep [Trial/Paid]

1) Go to tableau.com/products/prep and choose the free trial download

2) Enter a business email to start a 14-day free trial

By downloading a trial, you’ll get 14 days free before starting a paid monthly subscription
• If you start a paid subscription, we recommend the Tableau Creator [For Individuals] option



Tableau Prep Workspace
Flow Pane: A visual representation of each operation or step in the data preparation process

Connections Pane: Connect to local, server, or published data sources

Profile Pane: Displays a summary of each field in your data sample

Data Grid: Displays a preview of the rows and columns in your source data



Visual Dictionary

Tableau Prep uses visual indicators to represent steps, field types, and notifications within a flow;
familiarizing yourself with these indicators will help you interpret exactly how a flow functions

Input Steps (icons in the flow pane show the data source type): Data Source, Data Source with Wildcard Union, Excel, Excel with Wildcard Union, CSV, CSV with Wildcard Union, Tableau Extract

Clean Steps, Changes Pane & Toolbars (icons track changes made to data): Calculated Field, Change Data Type, Edit Value, Exclude Values, Filter Values, Group Values, Keep Only, Hide Profile Pane, Show Profile Pane, Merge Fields, Remove Fields, Rename Field, Search, Split Fields

Join Steps (icons define join types between data sources): Full Anti Join, Inner Join, Left Inner Join, Left Outer Join, Full Outer Join, Right Inner Join, Right Outer Join

Aggregate Steps (aggregated data shown with Sigma icon): Aggregate Data

Pivot Steps (icon represents data pivoting columns to rows): Pivot Data

Union Steps (icon shows where data sources are combined): Union Data



Visual Dictionary

Profile Cards (icons identify data types and field transformations): Calculated Field, Change Data Type, Edit Value, Exclude Values, Filter Values, Group Values, Keep Only, Merge Fields, Remove Field, Rename Field, Search, Split Fields, Boolean Data Type, Date Data Type, Date Time Data Type, Numeric Data Type, Text Data Type

Output Steps (icons identify data output types and running flow): CSV File, Published Data Source, Local Tableau Data Extract, Run Flow

Profile Pane (summary of row count and data sample indicator): Shows when data is sampled, Hover to show exact row count

Notifications (identify problems, errors or alerts): No Notifications, Notification Alert, Error in the Step



Example Flow Diagram

[Figure: example flow diagram running from INPUT to OUTPUT, with Clean, Pivot, Union, Join, and Aggregation steps labeled]



PRO TIP: Data Design

It’s important to think about data design before you begin to clean or transform your data, as design
needs will vary based on your audience, use case, and performance needs

Who is the end user or audience consuming the data?
• Is the data to be utilized by analysts, managers or executives? How many users need access?

What purpose or use case is the data designed to support?
• Is the data intended for ad-hoc data pulls, deep dive dashboards, or executive-level KPI reporting?

Are there speed or performance implications to consider?
• What are the expectations regarding query performance, refresh frequency, and data depth?



PRO TIP: Data Design

Vertical Views
• Row-heavy data which is the most flexible structure for Tableau Desktop
• Ideal combo of good performance & dynamic aggregation
• Commonly used with transactional data and unique record data sets

Wide Views
• Highly dimensional data with many columns
• Allows for deep analysis and many “cuts” of data
• Most common with survey data

Aggregated Views
• Highly aggregated and curated views for best performance
• Ideal for executive-level visualizations and specific high-level use cases
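
To make the three layouts concrete, here is a minimal Python sketch (outside Tableau Prep) that reshapes the same records into a wide view and an aggregated view. The school names, metrics, and values are hypothetical, and the helper names `to_wide` and `to_aggregated` are illustrative only.

```python
from collections import defaultdict

# Hypothetical school metrics in a "vertical" (row-heavy) layout:
# one row per school, year, and metric.
vertical = [
    {"school": "A", "year": 2019, "metric": "sat_avg", "value": 1100},
    {"school": "A", "year": 2019, "metric": "grad_rate", "value": 0.91},
    {"school": "B", "year": 2019, "metric": "sat_avg", "value": 1040},
    {"school": "B", "year": 2019, "metric": "grad_rate", "value": 0.88},
]

def to_wide(rows):
    """Wide view: one row per (school, year), one column per metric."""
    wide = defaultdict(dict)
    for r in rows:
        wide[(r["school"], r["year"])][r["metric"]] = r["value"]
    return [{"school": s, "year": y, **metrics} for (s, y), metrics in wide.items()]

def to_aggregated(rows, metric):
    """Aggregated view: a single average per year for one metric."""
    by_year = defaultdict(list)
    for r in rows:
        if r["metric"] == metric:
            by_year[r["year"]].append(r["value"])
    return {year: sum(v) / len(v) for year, v in by_year.items()}

wide_view = to_wide(vertical)
aggregated_view = to_aggregated(vertical, "sat_avg")
```

The vertical list keeps every record and can be re-aggregated freely, while the aggregated dictionary is the smallest and fastest to read but can no longer be broken down by school.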



Connecting to Data



Connecting to Data

Tableau Prep enables users to connect, clean, and configure raw data from virtually any source

Connect
• Connect to local files, databases or published sources
• Enhance connections with wildcard unions, SQL and more

Clean
• Clean your data upfront with tools like data interpreter
• Filter initial data down before the main flow

Configure
• Configure field names, data types, text settings, etc.
• Choose which fields to include or exclude from the flow



Data Connection Types

Tableau Prep Builder enables users to connect to many data sources and platforms, including:
• Flat Files (xlsx, csv, access, pdf, .hyper, etc.)
• Servers (SQL Server, Salesforce, Hadoop, Snowflake, Postgres, etc.)
• Published Data Sources (Tableau Server / Online Sources)

PRO TIP: Don’t have local credentials? Leverage published Tableau Server connections as data inputs!



Data Connection Examples

https://tableau.mavenanalytics.com

Local Files: When you connect to local flat files, Prep Builder will show tabs for Excel files and can union files within a given directory
NOTE: Data Interpreter is available for text/csv files

Databases: When you connect to a database, you must enter credentials in order to access the schemas, tables and views available
NOTE: Data Interpreter is NOT available for database connections

Tableau Server: When you connect to Tableau Server, enter your server credentials to view all published data sources, tables, and files
NOTE: Data Interpreter is NOT available for Tableau Server connections



Wildcard Unions

Wildcard unions allow you to combine files or tables within a folder or directory at the input stage

Search In: Select the directory/schema to use to find files/tables for the union

Include Subfolders: Includes files contained in subdirectories of the parent folder

Files, Sheets & Tables: Include or exclude files, sheets or tables using these dropdowns

Matching Pattern: Includes only files, sheets or tables which contain specific text (*), or leave blank to union all files

Included Files & Tables: Previews the files or tables matched based on the wildcard settings

PRO TIP: CSVs union automatically in the same directory, as well as sheets in Excel workbooks!
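
The idea behind a wildcard union can be sketched in plain Python: search a folder, keep the files whose names match a pattern, and stack their rows. This is only an illustration of the concept, not how Tableau Prep implements it; the function name `wildcard_union`, the `source_file` column, and the demo file names are assumptions made for the example.

```python
import csv
import fnmatch
import os
import tempfile

def wildcard_union(search_in, pattern="*", include_subfolders=False):
    """Stack rows from every CSV under search_in whose file name matches pattern."""
    rows = []
    for folder, _subdirs, files in os.walk(search_in):
        for name in sorted(files):
            if name.lower().endswith(".csv") and fnmatch.fnmatch(name, pattern):
                path = os.path.join(folder, name)
                with open(path, newline="", encoding="utf-8") as f:
                    for row in csv.DictReader(f):
                        row["source_file"] = name  # hypothetical lineage column
                        rows.append(row)
        if not include_subfolders:
            break  # os.walk yields the top folder first; stop before descending
    return rows

# Demo with throwaway files: two yearly files match, notes.csv does not.
with tempfile.TemporaryDirectory() as folder:
    for year, score in [("2018", "7.6"), ("2019", "7.8")]:
        path = os.path.join(folder, f"happiness_{year}.csv")
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows([["Country", "Score"], ["Finland", score]])
    with open(os.path.join(folder, "notes.csv"), "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["Note"], ["ignore me"]])
    unioned = wildcard_union(folder, pattern="happiness_*")
```

Leaving `pattern` at its default of `*` unions every CSV in the folder, which mirrors leaving the matching pattern blank.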



PRO TIP: Input Joins

Joins can also be created at the input stage for certain database connections; if table relationships
are present, Linked Keys will be available to specify which fields to use for the join

Linked Keys
Unique Identifier (Primary Key)
Related Fields (Foreign Key)
Unique and Related Fields



Input Cleaning

Change Data Type: Click type to change from given drop-down menu

Filter Values: Click Filter icon and create filter from calculation window

Remove Field: Uncheck fields to remove them from the flow

Rename Field: Double-click fields to enter a new field name



Text Configuration

Text files require additional configuration in the Settings tab to determine how they will be ingested

First Line Contains Header is the default, and pulls the first row as headers

Generate Field Names Automatically will generate generic headers (F1, F2, etc.)

Field Separator gives a character dropdown to choose a field delimiter


• NOTE: Choosing “Other” will allow for a custom delimiter

Text Qualifier selects the character that encloses the values in a file
• NOTE: This defaults to automatic and gives single quote ('), double quote ("), and “none” as options

Character Set selects the character set that describes the file encoding (UTF-8, etc.)

Locale sets the geographic location to parse the file (important for dates, currency,
decimals/thousands separators, etc.)
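
As a rough analogy, the same settings appear when reading a delimited file with Python's standard `csv` module. The sketch below is not Tableau Prep's parser; the sample data, the function name `read_text_file`, and the simplified locale handling (treating a comma as the decimal separator) are assumptions for illustration.

```python
import csv
import io

# Hypothetical raw file contents: semicolon-separated, double-quote qualified,
# with a decimal comma as used in many European locales.
raw_bytes = 'School;Spend\n"Maven; North";"1234,50"\n'.encode("utf-8")

def read_text_file(data, field_separator=";", text_qualifier='"',
                   character_set="utf-8", first_line_header=True):
    text = data.decode(character_set)                      # Character Set
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=field_separator,         # Field Separator
                        quotechar=text_qualifier)          # Text Qualifier
    rows = list(reader)
    if first_line_header:
        header, body = rows[0], rows[1:]                   # First Line Contains Header
    else:
        header = [f"F{i + 1}" for i in range(len(rows[0]))]  # generated names F1, F2, ...
        body = rows
    return [dict(zip(header, r)) for r in body]

records = read_text_file(raw_bytes)
no_header = read_text_file(raw_bytes, first_line_header=False)

# Locale handling, simplified: treat "," as the decimal separator.
spend = float(records[0]["Spend"].replace(",", "."))
```

Note how the text qualifier keeps `Maven; North` in one field even though it contains the separator character.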



Data Sampling

To optimize performance, Tableau Prep samples large data sets and returns a subset of records

Default sample amount: Prep Builder determines the number of rows to return

Use all data: Retrieves all rows regardless of size (can cause performance issues)
• NOTE: Data is still limited to 1 million rows (Aggregate/Union) and 3 million rows (Join/Pivot)

Fixed number of rows: Select a custom number of rows (recommended <1 million)

Quick select (default): Sample is returned as quickly as possible, using N rows or cached data available from a prior query

Random sample: Returns the number of rows requested, but looks at all records and returns a representative sample (may impact performance prior to cache)
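
The trade-off between the two sampling modes can be illustrated with a short Python sketch. It assumes that a quick select simply takes the first N rows it can read, and it uses reservoir sampling as a stand-in for the random option; the slides do not state which algorithm Tableau Prep actually uses.

```python
import itertools
import random

def quick_select(rows, n):
    """Return the first n rows without scanning the rest of the source."""
    return list(itertools.islice(rows, n))

def random_sample(rows, n, seed=0):
    """Reservoir sampling: one pass over all rows, keeping n chosen uniformly."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < n:
            reservoir.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = row
    return reservoir

source = range(100_000)  # stand-in for a large input table
first_rows = quick_select(iter(source), 5)
sampled_rows = random_sample(iter(source), 5)
```

`quick_select` stops after five rows, while `random_sample` must read every row before it can return, which is why a random sample is more representative but slower.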



Refreshing Data

If data changes while building a flow, you can refresh during the input stage using several methods:

OPTION 1: Refresh
For File Inputs, refresh using the refresh icon or the input step

OPTION 2: Edit Connection
Edit the data connection and return to the flow

OPTION 3: Remove & Re-Add
Completely remove the input step, re-connect, and drag the table back into the flow



HOMEWORK: Connecting to Data

THE SITUATION
Happy Hipsters, a lifestyle apparel company, wants to analyze World Happiness data to support an upcoming marketing campaign, and has enlisted your help

THE BRIEF
The Happy Hipsters team has asked you to help clean and consolidate their raw data into a single source, which will enable them to explore and analyze key global happiness metrics for their new campaign

THE OBJECTIVE
Use Tableau Prep to:
• Connect to source data
• Use a wildcard union to combine files
• Clean data upon input
• Configure and refresh data sources

Examining & Filtering



Examining & Filtering

After connecting to sources, users can examine & filter data using Tableau Prep’s visual interface;
it’s important to conduct these steps before making any major changes to your data in the flow!

Examine
• Profile your data by looking at field value distributions
• Review data types, data size, and find specific fields or values
• Sort & Highlight values in your fields to find gaps or deficiencies

Filter
• Reduce the data being pulled, using various filtering tools
• Organize your flow’s tools and settings for optimal performance and clear documentation



Data Types & Sizes

One of the first steps in evaluating data is to examine data size, field types and unique values;
this can be done at several stages, but the simplest approach is to add a clean step

Adding a Clean Step: Clean Steps can be added to a flow in two distinct ways: Automatic (select the gray outline) or Manual (+)

Field Data Type: Data types can be modified by selecting from the header

Data Size: Shows the number of fields and row count (hover to see exact count)

Unique Values: Displays the distinct values in each field

Value Distribution

The profile pane allows you to visualize the distribution of your data, by plotting the frequency of
each distinct value as bins in a histogram; this is a great way to identify outliers and null values!

Summary View: Continuous view of values showing both the range and frequency in which they appear

Detail View: Discrete view of individual values within the column
NOTE: Click the distribution in the column to skip to desired values

View State Selection: Summary visualizes the distribution, detail shows all distinct values



Finding Fields & Values

Use the toolbar search or field search options to find specific fields or values in your data

Search for Values: Search for values using various match options (contains, starts with, exact match, etc.) or click (…) for advanced options or to filter found values

Search for Fields: Enter a full or partial search to return matching fields



Sorting & Moving Profile Cards

Within the profile pane, you can sort bins by either frequency or alphabetical order (ascending or
descending), or click to drag and rearrange profile cards

Sort Bins & Fields: Sort by count (frequency) or domain (alphabetical)

Move Cards: Reorganize profile cards by dragging until a black line appears



Highlighting

Highlighting is a quick way to trace fields back through flow steps, see related values across
fields, and pinpoint identical values in your data

Trace Fields: Select a field to trace where it was used or modified within your flow

Related Values: Highlight related values by selecting a value/bin in the profile pane
NOTE: Related values are highlighted in blue

Identical Values: Select a value in the data grid to highlight all identical values



Filtering Methods

There are several filtering methods in Tableau Prep, based on the field type and step chosen:

Keep or Exclude: Keeps or removes selected value or field (available for all field types; String, Number, Date, Date Time, etc.)

Calculation Filter: Filters values based on calculated field condition (available for all field types)

Selected Values Filter: Chooses values to keep or exclude even if they aren’t in the data source (available for all field types)

Range of Values Filter: Filters by minimum and maximum value parameters (available for Number field type)

Range of Dates Filter: Filters by minimum and maximum date value parameters (available for Date and Date Time field types)

Wildcard Match Filter: Filters by partial or whole matching text (available for String field type)

Null Values Filter: Keeps only Null or Non-Null Values (available for all field types)
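
The logic of several of these filters can be expressed as simple predicates over rows. The Python sketch below is only a conceptual illustration; the dish records and the function names are hypothetical, and glob-style patterns stand in for Tableau Prep's wildcard match options.

```python
import fnmatch

# Hypothetical records; None stands in for a null value.
rows = [
    {"dish": "Rasgulla", "prep_minutes": 10, "region": "East"},
    {"dish": "Gulab Jamun", "prep_minutes": 15, "region": "North"},
    {"dish": "Mysore Pak", "prep_minutes": 5, "region": None},
]

def range_of_values(rows, field, minimum, maximum):
    """Keep rows whose numeric value lies within [minimum, maximum]."""
    return [r for r in rows if r[field] is not None and minimum <= r[field] <= maximum]

def wildcard_match(rows, field, pattern):
    """Keep rows whose text matches a glob-style pattern such as '*Jamun'."""
    return [r for r in rows if r[field] is not None and fnmatch.fnmatchcase(r[field], pattern)]

def null_values(rows, field, keep_null):
    """Keep only null rows (keep_null=True) or only non-null rows (keep_null=False)."""
    return [r for r in rows if (r[field] is None) == keep_null]

def calculation_filter(rows, condition):
    """Keep rows for which a Boolean condition evaluates to True."""
    return [r for r in rows if condition(r) is True]

in_range = range_of_values(rows, "prep_minutes", 10, 20)
jamun = wildcard_match(rows, "dish", "*Jamun")
missing_region = null_values(rows, "region", keep_null=True)
quick_dishes = calculation_filter(rows, lambda r: r["prep_minutes"] < 10)
```

The calculation filter is the most general of the four: each of the other filters could be written as a condition passed to it.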



Filtering Methods

Keep Only/Exclude: Single or multi-select values from the profile card to keep or exclude

Calculation Filter: Condition must be Boolean (only filter available in steps other than clean step)

Selected Values: Manually select values to keep/exclude (keyed values can be added even if not in data)

Range of Values: Filter numeric values within a specified lower/upper limit

Range/Relative Dates: Range of dates (upper/lower) or time period relative to today or an anchor date

Null Values: Filter to only null or non-null values

Wildcard Match: Keep/exclude values based on a pattern (filter results display on left pane)



HOMEWORK: Examining & Filtering

THE SITUATION
Your brother-in-law Sai just started his first business venture: a food truck specializing in Indian desserts called Bengali Sweet Treats. As the family’s resident data nerd, you’ve been enlisted to help him analyze popular Indian dishes to help him perfect his menu.

THE BRIEF
Sai needs you to examine a spreadsheet containing hundreds of Indian dishes, and profile their ingredients, prep time, regional origin, and flavor profile.
You’ll need to connect, profile, and filter the data to give Sai some ideas for his award-winning food truck!

THE OBJECTIVE
Use Tableau Prep to:
• Examine data types and sizes
• Profile value distribution across fields
• Sort, move and highlight relevant data
• Filter values to pinpoint key records



Operations & Calculations



Operations & Calculations
Tableau Prep includes a range of tools for cleaning and transforming data, including value & field
operations (grouping, cleaning, converting, splitting, etc.) and calculations (analytic, logical, LOD, etc.)

Value & Field Operations
• Clean & transform data using a range of value and field operations (group, filter, split, etc.)
• NOTE: Cleaning steps can be performed in multiple flow steps (except output)

Calculated Fields
• Perform logical, string, aggregate or level of detail calculations to create new fields
• Apply analytic functions (e.g. rank) across tables or partitions



Value & Field Operations
Common value & field operations fall into three main categories based on the scope of impact
(records, fields and values) and can be accessed from multiple flow steps
Flow Steps: Input, Clean, Aggregate, Pivot, Join, Union

Operations by scope:
• Records: Filter
• Fields: Keep / Remove Field, Rename Field, Duplicate Field, Calculated Field
• Values: Clean, Convert Dates, Edit Values, Group Values, Split Values, Change Data Type



Clean Step Layouts

Cleaning Operations: Accessible via the profile pane or drop-down menu

Layout Options:
• Profile Pane (default): Shows profile pane + data grid
• Data Grid: Shows detailed data view
• List View: Shows columns in list form



PRO TIP: Pausing Data Updates
Pause data updates to optimize performance during flow development (NOTE: the view will
automatically switch over to list view while data updates are paused)

Pause/Resume Updates
Options to pause or resume updates

Limited Features
Features which require visual representation of values (splitting,
filtering, grouping, etc.) are disabled while updates are paused



Value Operations

Value operations can be used to filter, clean, group or split values inside fields

Filter allows you to reduce the number of records using various filter criteria

Clean provides a list of quick cleaning operations which apply to all values in the field

Group Values replaces individual or multiple values with a new group value

Split Values parses values using an automatically detected or custom-defined delimiter

PRO TIP: Use Tableau Prep’s recommendations (light bulb) to automatically clean your data



Value Operations | Clean

Use cleaning tools to change text case, remove specific characters, or trim spaces from strings

Make Uppercase changes text case to upper

Make Lowercase changes text case to lower

Remove Letters removes all letter characters from a string

Remove Numbers removes all number characters from a string

Remove Punctuation removes all forms of punctuation

Trim Spaces removes leading or trailing spaces

Remove Extra Spaces removes extra spaces (when >1)

Remove All Spaces removes any spaces contained in the string
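
For readers who think in code, each quick clean operation corresponds to a simple string transformation. The Python sketch below approximates them as described on this slide; it handles ASCII letters and punctuation only, and Tableau Prep's exact behavior (for example with accented characters) may differ. The `clean` helper and the sample value are hypothetical.

```python
import re
import string

# Approximations of the quick clean operations, keyed by their menu names.
CLEAN_OPERATIONS = {
    "Make Uppercase": str.upper,
    "Make Lowercase": str.lower,
    "Remove Letters": lambda s: re.sub(r"[A-Za-z]", "", s),
    "Remove Numbers": lambda s: re.sub(r"[0-9]", "", s),
    "Remove Punctuation": lambda s: s.translate(str.maketrans("", "", string.punctuation)),
    "Trim Spaces": str.strip,
    "Remove Extra Spaces": lambda s: re.sub(r" {2,}", " ", s),
    "Remove All Spaces": lambda s: s.replace(" ", ""),
}

def clean(value, *operations):
    """Apply the named operations to a value, in the order given."""
    for name in operations:
        value = CLEAN_OPERATIONS[name](value)
    return value

cleaned = clean("  Maven   Charter,  School #12 ",
                "Trim Spaces", "Remove Extra Spaces", "Remove Punctuation")
```

Order matters when chaining operations: removing punctuation before collapsing spaces can leave new runs of spaces behind.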



Value Operations | Manual Grouping

Manually group text values using multi-select or checkbox selections

Multi-Select: Search for a matching string and use Ctrl/Cmd to select values to group

Checkbox Selection: Use checkboxes to add/remove values from a group

PRO TIP: To add new values which do not currently exist in the data set, select an existing group and manually type in the value (shown with a red asterisk)



Value Operations | Automatic Grouping

Automatically group text values using fuzzy matching algorithms based on pronunciation,
common characters or spelling

Pronunciation
Find and group values which sound alike, and move
threshold slider to the left or right to adjust strictness
(left = fewer groups, right = more groups)

Common Characters
Find and group values with letters and/or numbers in
common (i.e. “John Smith” and “Smith, John” likely
refer to the same person)

Spelling
Find and group values which are spelled alike, and move
threshold slider to the left or right to adjust strictness
(left = fewer groups, right = more groups)
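
Tableau Prep's matching algorithms are not described in this course, so the Python sketch below uses stand-ins to convey the idea: `difflib.SequenceMatcher` similarity for spelling, and a sorted-token key for common characters. The function names and the 0.8 threshold are assumptions for illustration.

```python
import difflib
import re

def common_characters_key(value):
    """Values that share the same letters/numbers tokens get the same key."""
    tokens = re.findall(r"[a-z0-9]+", value.lower())
    return " ".join(sorted(tokens))

def group_by_spelling(values, threshold=0.8):
    """Greedy grouping: join the first group whose first member is similar enough."""
    groups = []
    for value in values:
        for group in groups:
            ratio = difflib.SequenceMatcher(None, value.lower(), group[0].lower()).ratio()
            if ratio >= threshold:
                group.append(value)
                break
        else:
            groups.append([value])  # no similar group found, start a new one
    return groups

spelling_groups = group_by_spelling(["Massachusetts", "Massachusets", "Maine"])
same_person = common_characters_key("John Smith") == common_characters_key("Smith, John")
```

In this sketch a higher threshold is stricter and yields more, smaller groups, which corresponds to moving the slider to the right.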



Value Operations | Split Values

Split text-based values on automatic or custom-defined delimiters

Automatic Split
Splits values automatically using common delimiters

Custom Split
Define the delimiter and number of columns for the split

Calculated Split
Split text using a custom calculated field
NOTE: Calculations are automatically generated when
either split type (automatic or custom) is performed
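
The calculation generated by a split is token based: it returns the Nth piece of a string between delimiters, in the style of Tableau's SPLIT(string, delimiter, token number) function. The Python sketch below mirrors that idea; returning None for a missing token and the helper names are assumptions of this example rather than documented Tableau behavior.

```python
def split_token(value, delimiter, token_number):
    """1-based token lookup; negative numbers count from the right."""
    if token_number == 0:
        return None
    tokens = value.split(delimiter)
    index = token_number - 1 if token_number > 0 else token_number
    try:
        return tokens[index]
    except IndexError:
        return None  # assumption: a missing token yields no value

def custom_split(value, delimiter, fields):
    """Split off the first `fields` tokens, one per output column."""
    return [split_token(value, delimiter, i) for i in range(1, fields + 1)]

city_state = custom_split("Boston, MA, 02108", ", ", 2)
last_token = split_token("a-b-c", "-", -1)
```

Here `city_state` holds only the first two tokens, so the postal code is dropped, matching a custom split that asks for two columns.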



Value Operations | Edit

Values can be edited individually or as a group to correct inaccuracies or standardize variations

Double-Click
Double-click a value in the profile pane to edit it directly
(field turns into a group after the first try)

Right-Click
Right-click and choose “Edit Value” to edit or replace
the value with null

Edit Multiple Values (Group Values)


Ctrl/Cmd click to manually group multiple values



Value Operations | Convert Dates

Convert dates to modify formats without the need for calculated fields or parsing functions
Date and Time: Convert date field to datetime format (ex. 1/23/2020, 11:14:02 PM)

Year Number: Convert date field to year number format (ex. 2010, 2015, 2020)

Quarter Number: Convert date field to quarter number format (ex. 1, 2, 3, 4)

Month Number: Convert date field to month number format (ex. 1, 2, 3, 4 … 11, 12)

Week Number: Convert date field to week number format (ex. 1, 2, 3, 4 … 52, 53)

Day of the Month: Convert date field to day of month format (ex. 1, 2, 3, 4 … 31)

Custom Fiscal Year: Convert date field based on a custom fiscal calendar
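
The same date parts can be derived in Python, which may help when checking converted values. Two assumptions are made in this sketch: the week number uses ISO week rules, which can differ from Tableau's week numbering near the start and end of a year, and the fiscal year is named after the calendar year in which it ends.

```python
from datetime import date

def date_parts(d):
    """Date parts corresponding to the conversion options on this slide."""
    return {
        "Year Number": d.year,
        "Quarter Number": (d.month - 1) // 3 + 1,
        "Month Number": d.month,
        "Week Number": d.isocalendar()[1],  # ISO week; an assumption, see lead-in
        "Day of the Month": d.day,
    }

def fiscal_year(d, fiscal_start_month):
    """Fiscal year label, assuming it is named for the year in which it ends."""
    if fiscal_start_month > 1 and d.month >= fiscal_start_month:
        return d.year + 1
    return d.year

parts = date_parts(date(2020, 1, 23))
fy = fiscal_year(date(2020, 7, 1), fiscal_start_month=7)
```

With a fiscal calendar starting in July, 1 July 2020 falls in fiscal year 2021 under this convention, while 30 June 2020 still belongs to fiscal year 2020.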



Field Operations | Field Types

Field types can be customized in every flow step except the output, and are used to assign fields
as numbers (decimal or whole values), dates (date or datetime) or text strings

Number (decimal)
Numeric value with decimal values (best for exact values like dollars, ratios, etc.)

Number (whole)
Numeric value with no decimal (best for quantity, date parts, ID fields, etc.)

Date & Time


Date and Time in the same field (best for exact time needs – where parts of a day matter)

Date
Date fields (best when date filtering and date calculations are needed – datediff, dateadd, etc.)

String
String fields (best for most dimensional values, text values that should be parsed, etc.)

Note that data types not only impact how fields are used in Tableau Prep, but also how
data visualization tools interact with data and users



Field Operations | Data Roles

Data roles represent standard sets of values, which can be used to validate the values within a field

None (default): The default role for each field (no role assigned)

Show Values (Valid/Not Valid): Once applied, developers can view valid and not valid values and use value editing to correct potential issues

Geographic: Geospatial roles based on the same domains as Tableau Desktop
• Airport
• Area Code
• CBSA/MSA
• City
• Congressional District (US)
• Country/Region
• County
• NUTS Europe
• State/Province
• Zip code/Postal Code

URL: Web link-based role / URL fields

Email: Email role fields

Published Data Roles are used in Prep Builder in conjunction with Prep Conductor (not covered in this course) to compare values in your flow against published standardized data values
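
Conceptually, a data role is a validity check applied to every value in a field. The Python sketch below illustrates that idea with a deliberately loose email pattern, a basic URL check, and a small hypothetical list of state names; Tableau Prep's own validation rules and reference data are not shown in this course and will differ.

```python
import re
from urllib.parse import urlparse

EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose

def is_valid(value, role, known_values=frozenset()):
    """Check one value against a role; unknown roles fall back to a value list."""
    if value is None:
        return False
    if role == "Email":
        return bool(EMAIL_PATTERN.match(value))
    if role == "URL":
        parsed = urlparse(value)
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)
    return value in known_values  # e.g. a geographic role backed by a value list

def show_values(values, role, known_values=frozenset()):
    """Split values into (valid, not valid), like the Show Values option."""
    valid = [v for v in values if is_valid(v, role, known_values)]
    not_valid = [v for v in values if not is_valid(v, role, known_values)]
    return valid, not_valid

states = {"Massachusetts", "Maine", "Vermont"}  # hypothetical reference list
valid_states, invalid_states = show_values(["Maine", "Masachusetts"], "State/Province", states)
```

The misspelled "Masachusetts" lands in the not valid list, which is the kind of value you would then correct with value editing or grouping.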



Field Operations | Cleaning

Field cleaning operations can be used to modify, add or remove fields from the flow

Rename Field changes the field name referenced (can double-click name as well)

Duplicate Field creates a copy of the field (and adds a “-1” to the name)

Keep Only Field keeps only the selected field(s) in the flow.
• NOTE: Use Ctrl or Cmd to select more than one field to keep

Create Calculated Field creates a new calculated field with the selected field referenced
• NOTE: We’ll cover calculations in depth later in this section!

Remove removes the selected field(s) from the flow



Calculated Fields
Calculated fields can be created via standard editor or visual editor, depending on the function

Standard Editor: Standard calculation editor available for all functions

Visual Editor: Modified calculation editor for Fixed LOD and Rank

Aggregate: Summarize or change the level of granularity of your data

Analytic: Perform calculations across tables or partitions

Date: Create, modify, and calculate date/time fields

Logical: Determine if a conditional statement is true or false

Number: Computation-based functions used on numerical fields

String: Manipulate text-based data

Type Conversion: Convert fields from one data type to another



Level of Detail Calculations

Level of detail (LOD) calculations are used to perform aggregations at different grains of data

LOD Expression Syntax:
• Level of Detail Element: FIXED is the only option in Prep
• Dimension Declaration: Grain at which data is aggregated
• Aggregate Expression: Calculation to be performed

[Figure: LOD Visual Editor]
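
In Tableau's calculation language a FIXED expression takes the form { FIXED [Dimension] : AGG([Field]) }. What it computes can be sketched in Python: aggregate at the declared grain, then write that result back onto every row. The district and expenditure data and the helper name `fixed_lod` are hypothetical.

```python
from collections import defaultdict

def fixed_lod(rows, dimensions, field, aggregate=sum):
    """Emulate { FIXED dims : AGG(field) }: aggregate per group, repeat on every row."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[d] for d in dimensions)].append(r[field])
    totals = {key: aggregate(values) for key, values in groups.items()}
    return [totals[tuple(r[d] for d in dimensions)] for r in rows]

rows = [
    {"district": "Boston", "school": "A", "expenditure": 100},
    {"district": "Boston", "school": "B", "expenditure": 150},
    {"district": "Salem", "school": "C", "expenditure": 90},
]

# Comparable to { FIXED [district] : SUM([expenditure]) }
district_total = fixed_lod(rows, ["district"], "expenditure")
district_max = fixed_lod(rows, ["district"], "expenditure", aggregate=max)
```

The row count is unchanged: each school keeps its own row and gains its district's total, which is what distinguishes an LOD calculation from an aggregate step.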



Analytic Calculations | Rank
Rank calculations are a subset of analytic calculations which can be applied across an entire table or a
subset of rows (partition)

Analytic Calculation Editor:
• Partition: Designates rows to which calculation will be applied
• Order by: Specifies field to generate sequence for ranking
• Rank Calculation: Rank or row number calculation, with optional sort order (DESC by default)

[Figure: Visual Editor]

Options include:
• RANK()
• RANK_DENSE()
• RANK_MODIFIED()
• RANK_PERCENTILE()
• ROW_NUMBER()
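
The rank variants differ only in how they number tied values. The Python sketch below shows that difference for four of the options within a partition, using descending order by default; RANK_PERCENTILE is omitted, and the function name `rank_rows` and the sample scores are hypothetical.

```python
from collections import defaultdict

def rank_rows(rows, partition, order_by, kind="RANK", descending=True):
    """Return one rank per input row, computed separately within each partition."""
    partitions = defaultdict(list)
    for i, r in enumerate(rows):
        partitions[r[partition]].append(i)
    result = [None] * len(rows)
    for indexes in partitions.values():
        ordered = sorted(indexes, key=lambda i: rows[i][order_by], reverse=descending)
        values = [rows[i][order_by] for i in ordered]
        distinct = sorted(set(values), reverse=descending)
        for position, i in enumerate(ordered, start=1):
            v = rows[i][order_by]
            if kind == "ROW_NUMBER":       # unique sequence: 1, 2, 3
                result[i] = position
            elif kind == "RANK":           # ties share the lowest position: 1, 1, 3
                result[i] = values.index(v) + 1
            elif kind == "RANK_MODIFIED":  # ties share the highest position: 2, 2, 3
                result[i] = len(values) - values[::-1].index(v)
            elif kind == "RANK_DENSE":     # no gaps after ties: 1, 1, 2
                result[i] = distinct.index(v) + 1
            else:
                raise ValueError(f"unsupported rank type: {kind}")
    return result

rows = [
    {"district": "Boston", "sat": 1200},
    {"district": "Boston", "sat": 1200},
    {"district": "Boston", "sat": 1100},
    {"district": "Salem", "sat": 1000},
]
standard = rank_rows(rows, "district", "sat", "RANK")
dense = rank_rows(rows, "district", "sat", "RANK_DENSE")
modified = rank_rows(rows, "district", "sat", "RANK_MODIFIED")
row_numbers = rank_rows(rows, "district", "sat", "ROW_NUMBER")
```

Because the Salem row sits in its own partition, it ranks first regardless of how its score compares with the Boston rows.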



PRO TIP: Copy & Paste

Copy and paste individual elements within flows, including cleaning operations, fields or steps

Copy Cleaning Operations: Drag from the changes pane onto another field, or right-click to copy

Duplicate Fields: Copy fields using the “Duplicate Field” options within the same flow

Copy Flow Steps: Copy individual steps into different parts of the flow (or whitespace)



PRO TIP: Reusable Flow Steps

Reusable flow steps can be created, saved and imported into other flows, and are commonly used
for steps which are used frequently or leveraged by other users

Save to File: Save a file locally as a Flow or Packaged Flow

Insert Flows: Insert flows using the “Insert Flow” option from any step or whitespace

Publish to Server: Publish a flow to Tableau Server using publisher credentials
NOTE: Published flows which utilize file-based input steps are not yet supported



HOMEWORK: Operations & Calculations

THE SITUATION
Your old boss at Tech Data Talent (TDT) contracted you for some data prep assistance. You’ll need to use your Tableau Prep skills to make sure the TDT team is working with clean and accurate data.

THE BRIEF
Your task is to clean survey response data to help the team accurately analyze mental health trends in the tech industry. The key will be to clean and organize the data in a way that will allow TDT’s analytics group to easily analyze and visualize patterns.

THE OBJECTIVE
Use Tableau Prep to:
• Clean and manipulate values
• Modify and customize fields
• Create calculated fields
• Export flow steps



Combining & Pivoting



Combining & Pivoting

Data can be transformed and combined using several types of flow steps in Tableau Prep, including
Union, Join, Aggregate and Pivot

Union & Join
• Union and join are used to blend data together to create combined tables
• Union stacks records from common columns, and joining adds related fields from another table

Aggregate
• Change the granularity of your data using aggregate (e.g. daily to monthly)
• Group data by fields in your table to control the level of aggregation

Pivot
• Transpose rows to columns (or columns to rows) using a pivot step
• Set up data outputs for optimal consumption using different table layouts



Combine Data | Union

The union step appends (or “stacks”) records from multiple tables, based on matching columns

Add Union Step
From any step, select (+) and choose Union

Drag to Union
Manually drag one step over another to union, and use (+) to add more tables

PRO TIP: If you need to union 10+ tables, try using wildcard unions in the input step!



Union Results & Common Issues

Review the union results in the profile pane to identify and resolve common union issues, including
data type differences and mismatched fields

Inputs
Color-coded list of tables included in union

Data Type Differences
Columns with the same name but different types will automatically default to strings

Resulting Fields
Count of total and mismatched fields

Mismatched Fields
List of fields which did not union (may be unique to source or truly missed during union)

Merge Fields
Drag fields over each other to merge them into one (in case union didn’t identify them as a match)
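
To make the union behavior concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea, not how Tableau Prep works internally: tables are assumed to be lists of dictionaries, and the names `union`, `merge_fields`, `sales_2022`, and `sales_2023` are hypothetical.

```python
def union(*tables):
    """Stack rows from several tables, aligning them on column name."""
    # Collect every column name, preserving first-seen order.
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # A row that lacks a column gets None, like a mismatched field.
    return [{name: row.get(name) for name in columns}
            for table in tables for row in table]


def merge_fields(rows, keep, drop):
    """Fold column `drop` into column `keep`, then remove `drop`."""
    merged = []
    for row in rows:
        row = dict(row)                # copy so the input is not modified
        extra = row.pop(drop, None)
        if row.get(keep) is None:
            row[keep] = extra
        merged.append(row)
    return merged


# Hypothetical inputs: the same measure is named differently in each file.
sales_2022 = [{"order_id": 1, "amount": 10.0}]
sales_2023 = [{"order_id": 2, "sales_amount": 25.0}]

stacked = union(sales_2022, sales_2023)   # two mismatched fields, with nulls
fixed = merge_fields(stacked, keep="amount", drop="sales_amount")
```

After the union, `amount` and `sales_amount` each hold a null for the rows that came from the other table. Merging the two fields produces a single fully populated column, which is the same outcome as dragging one mismatched field onto the other.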



Combine Data | Aggregate

Aggregate allows you to change the granularity of your data by summarizing values at higher levels

Add Aggregate Step
Select the (+) icon next to your existing step and choose “Aggregate”

Adjust Grouping & Aggregation
Select the card headers to update grouping or aggregation logic

Additional Fields
Drag fields to the “Grouped Fields” or “Aggregated Fields” panes
(NOTE: fields not selected will not pass through this step)

PRO TIP: Use “Group By” with no aggregation to create a unique list of dimensions
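
The change in granularity can be sketched in plain Python as follows. This is an illustration under assumptions, not Tableau Prep's implementation: the `aggregate` helper, the sample rows, and the use of a sum as the aggregation are all hypothetical choices.

```python
from collections import defaultdict


def aggregate(rows, group_by, sum_field=None):
    """Group rows by the `group_by` fields and optionally sum one field.

    Fields that are neither grouped nor aggregated are dropped.
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in group_by)
        totals[key] += row[sum_field] if sum_field else 0.0
    result = []
    for key, total in totals.items():
        out = dict(zip(group_by, key))
        if sum_field:
            out[sum_field] = total
        result.append(out)
    return result


# Hypothetical daily data.
daily = [
    {"date": "2023-01-05", "region": "East", "sales": 100.0},
    {"date": "2023-01-20", "region": "East", "sales": 50.0},
    {"date": "2023-02-03", "region": "East", "sales": 70.0},
]
# Derive a month field (YYYY-MM) so the data can be grouped at that level.
daily_with_month = [{**row, "month": row["date"][:7]} for row in daily]

monthly = aggregate(daily_with_month, ["month", "region"], sum_field="sales")
regions = aggregate(daily_with_month, ["region"])  # group by only
```

Here `monthly` holds one row per month and region, and the `date` field is gone because it was neither grouped nor aggregated. Calling the helper with no aggregated field, as in `regions`, returns the unique list of dimension values described in the PRO TIP.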



Combine Data | Join

Join is used to combine data between tables which share common or related fields

Join Types

Inner
For each row, includes values that have matches in both tables

Left
Includes all values from left table and matches from right table

Right
Includes all values from right table and matches from left table

Left (only)
Includes only values from left table and no matches to right

Right (only)
Includes only values from right table and no matches to left

Outer (labeled “notInner” in Tableau Prep)
Includes all values from right and left that don’t match

Full Outer
Includes all values from both tables, non-matches are null

Creating a Join

Click to Join
Select (+) from a step and choose “Join”

Drag to Join
Add more sources to the join by dragging step to the “(+) Add” icon in the Join step
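
The seven join types above differ only in which groups of rows they keep: matched rows, left-only rows, and right-only rows. The plain-Python sketch below illustrates that idea under assumptions; it is not Tableau Prep's implementation. The `join` helper, the `how` labels, and the sample `drivers` and `results` tables are hypothetical, and each table is assumed to be non-empty with the same fields on every row.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` selects the join type."""
    left_cols = list(left[0])
    right_cols = list(right[0])

    # Index the right table by key so matches are quick to find.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)
    left_keys = {row[key] for row in left}

    matched, left_only, right_only = [], [], []
    for l_row in left:
        partners = right_index.get(l_row[key], [])
        for r_row in partners:
            matched.append({**l_row, **r_row})
        if not partners:
            # No match: fields from the right table are null.
            left_only.append({**dict.fromkeys(right_cols), **l_row})
    for r_row in right:
        if r_row[key] not in left_keys:
            # No match: fields from the left table are null.
            right_only.append({**dict.fromkeys(left_cols), **r_row})

    parts = {
        "inner": matched,
        "left": matched + left_only,
        "right": matched + right_only,
        "left_only": left_only,
        "right_only": right_only,
        "outer": left_only + right_only,   # the non-matching rows only
        "full": matched + left_only + right_only,
    }
    return parts[how]


# Hypothetical tables sharing a driver_id field.
drivers = [{"driver_id": 1, "name": "Ana"}, {"driver_id": 2, "name": "Ben"}]
results = [{"driver_id": 2, "points": 18}, {"driver_id": 3, "points": 25}]

inner_rows = join(drivers, results, "driver_id", "inner")
full_rows = join(drivers, results, "driver_id", "full")
```

With these inputs the inner join returns the single row for driver 2, while the full outer join returns three rows, two of which carry a null for the field that had no match.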



Join Results & Common Issues

Review the join results in the profile pane to identify and resolve common issues, including mismatched
values or incorrect join types or clauses

Add Join Clauses
To add join clauses, select (+) and choose fields

Applied Join Clauses
Fields applied as join clauses

Join Type
Define join type by selecting Venn Diagram

Summary of Join Results
Shows distribution of values included/excluded

Join Clause Recommendations
Fields recommended for use in join clause

Join Clauses Pane
Values which meet join criteria are black, mismatches are red

Join Results
Shows fields and values resulting from the applied join clauses



Pivot

Pivoting transposes rows to columns (or vice versa), allowing you to create “wide” or “tall” tables

Add Pivot Step
Select “Pivot” from latest step (+) dropdown

Pivot Type Selector
Columns to Rows or Rows to Columns

Pivot Names & Values
Assign Name and Value columns for pivoted fields

Add Fields
Select and drag fields into the “Pivoted Fields” pane

Wildcard Search
Use wildcard search to find all fields containing specific text and pivot automatically
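
As a rough illustration of the two pivot directions, here is a plain-Python sketch. It is not Tableau Prep's implementation: the helper names and the lap-time sample are hypothetical, and the rows-to-columns helper assumes one value per group (a real pivot step needs a rule for combining duplicates, whereas this sketch simply keeps the last value seen).

```python
def columns_to_rows(rows, pivot_fields, name_col, value_col):
    """Turn each pivoted column into its own row ("wide" to "tall")."""
    tall = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in pivot_fields}
        for field in pivot_fields:
            tall.append({**kept, name_col: field, value_col: row[field]})
    return tall


def rows_to_columns(rows, name_col, value_col):
    """Spread the values of `name_col` into columns ("tall" to "wide")."""
    wide = {}
    for row in rows:
        # Rows that agree on every other field collapse into one output row.
        key = tuple((k, v) for k, v in row.items()
                    if k not in (name_col, value_col))
        wide.setdefault(key, dict(key))[row[name_col]] = row[value_col]
    return list(wide.values())


# Hypothetical wide table with one column per lap.
wide_laps = [{"driver": "Ana", "lap_1": 91.2, "lap_2": 90.8}]

# Pick the fields to pivot by matching text, similar to a wildcard search.
lap_fields = [name for name in wide_laps[0] if "lap_" in name]

tall_laps = columns_to_rows(wide_laps, lap_fields, "lap", "seconds")
round_trip = rows_to_columns(tall_laps, "lap", "seconds")
```

`tall_laps` contains one row per driver and lap, with the old column names in `lap` and the old cell values in `seconds`. Pivoting that result back with `rows_to_columns` reproduces the original wide table.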



PRO TIP: Organize Your Flow

Descriptions
Add descriptive notation to steps
to provide details and clarity

Color Scheme
Customize colors to identify
related steps in the flow

Group Steps
Use groups to organize and compress large collections of
flow steps to make them easier to digest and share



HOMEWORK: Combining & Pivoting

THE SITUATION
As a leader of your local F1 racing fan club, you’re in charge of preparing data for the club’s
upcoming annual F1 fantasy draft.

THE BRIEF
You’ve been asked to gather data to help members accurately analyze driver stats, lap times,
and race results. The key will be to combine raw data into a centralized source that combines
all historical race data as well as peripheral driver and result information.

THE OBJECTIVE
Use Tableau Prep to:
• Join & Union multiple sources
• Aggregate and Group data
• Pivot columns to rows
• Organize flow steps





Sharing & Updating

Tableau Prep allows you to configure options for sharing data outputs and updating flows

Share
• Share data outputs as local files, published data sources, or updated tables in databases
• Preview your data in Tableau Desktop prior to automating your flow to ensure your success
criteria have been met

Update
• Refresh your flow and configure incremental update options
• Learn about the benefits of using Prep Conductor to fully automate prep flows



Saving Flows

Save your flow locally to retain steps, bundle local data sources, and share flows with other users

Save Flow
Manually save your flow as a .tfl file to retain your work

Save As
Use Save As to choose the type of flow file saved

Export Packaged Flow
Exports a packaged version of the flow directly as a .tflx file

Tableau Flow File
The standard flow as a .tfl file (no data retained)

Packaged Tableau Flow File
Packaged flow, which bundles Excel, text, and Tableau extracts with the flow as a .tflx file



PRO TIP: Preview in Desktop

Use the Preview in Tableau Desktop option to preview the output while developing a flow

[Screenshots: Tableau Prep | Tableau Desktop]



Create Local Extracts

Create local extracts in Tableau Prep to output as either .csv or .hyper file formats

Add Output
Select (+) > Output to create an output step

Save Output To
Choose output target (file, published data source, or database table)

Name
Name the output extract

Location
Choose a location to save the output

Output Type
Choose an output type (.hyper or .csv for local files)

Write Options
Choose local write options (create table or append to table)

Data Grid / List View
Choose list view for output field details

Run Flow
Execute the flow on full data



Save to External Databases

Prep can write to external databases as a new table or append/replace data in an existing table

Save Output To
Choose output target (file, published data source, or database table)

Connection
Select database type and enter credentials

Database
Select a database schema

Table
Select an existing table or create a new one

Write Options
Create, append, or replace table data

Custom SQL
Embed custom SQL code to execute before or after flow has written data to database

Run Flow
Execute the flow on full data
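
The three write options and the before/after custom SQL hooks can be sketched with Python's built-in `sqlite3` module. This is an assumption-laden illustration rather than what Tableau Prep runs: the `write_table` helper, the fixed two-column schema, and the `sales` table are hypothetical.

```python
import sqlite3


def write_table(conn, table, rows, mode="append", pre_sql=None, post_sql=None):
    """Write (id, amount) rows to `table` using one of three write options.

    create  - drop any existing table and build it again
    append  - add rows to whatever is already there
    replace - keep the table but clear its rows first
    """
    if pre_sql:
        conn.execute(pre_sql)      # custom SQL run before the write
    if mode == "create":
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER, amount REAL)")
    if mode == "replace":
        conn.execute(f"DELETE FROM {table}")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    if post_sql:
        conn.execute(post_sql)     # custom SQL run after the write
    conn.commit()


def row_count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")  # throwaway in-memory database

write_table(conn, "sales", [(1, 10.0)], mode="create")
write_table(conn, "sales", [(2, 20.0)], mode="append")
count_after_append = row_count(conn, "sales")

write_table(
    conn, "sales", [(3, 30.0)], mode="replace",
    post_sql="CREATE INDEX IF NOT EXISTS idx_sales_id ON sales (id)",
)
count_after_replace = row_count(conn, "sales")
```

Appending leaves two rows in the table, while replacing leaves only the newly written row. The `post_sql` argument shows one plausible use of an after-write statement: rebuilding an index once the data has landed.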

PRO TIP: Select “enable incremental refresh” on input/output to only add new data!
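
The idea behind an incremental refresh can be sketched in plain Python. This sketch assumes that new rows are identified by a field whose values only increase, such as a date or an ID; the `incremental_append` helper and the sample rows are hypothetical, and Tableau Prep's own logic may differ.

```python
def incremental_append(existing, incoming, field):
    """Append only the incoming rows newer than anything already written."""
    if not existing:
        return list(incoming)          # first run: behaves like a full refresh
    high_water = max(row[field] for row in existing)
    new_rows = [row for row in incoming if row[field] > high_water]
    return existing + new_rows


# Hypothetical output from a previous run, and the current input data.
already_written = [
    {"order_date": "2023-01-01", "amount": 10.0},
    {"order_date": "2023-01-02", "amount": 12.5},
]
current_input = [
    {"order_date": "2023-01-02", "amount": 12.5},   # already in the output
    {"order_date": "2023-01-03", "amount": 8.0},    # new since the last run
]

refreshed = incremental_append(already_written, current_input, "order_date")
```

Only the 2023-01-03 row is added, so `refreshed` holds three rows instead of four. ISO-formatted date strings are used here because they sort correctly when compared as text.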



Create Published Data Sources

Publish data sources to Tableau Server to grant user access to data and enable automated refresh

Sign-In to Tableau Server
Use credentials or SSO to log into Tableau Server / Tableau Online

Publish Flow
Configure the flow’s publishing options

Project
Select the project where your flow will be located

Name
Name your flow

Description
Give a brief description of what your flow does

Tags
Make the flow searchable on server using tags

Connections
Edit connections to embed credentials; local files
need to be uploaded (flat) or use direct connection
(refreshed on regular basis)

NOTE: Direct connection requires that Tableau Server is granted access to the share / database



PREVIEW: Tableau Prep Conductor

Tableau Prep Conductor can be used to automate and optimize flows in Tableau Server / Online

Schedule Flows
Schedule flows to automatically run on a
set day or at a specified refresh time

Create / Edit Flows
Create and edit flows in your browser, and run flows manually on-demand

Administration
View performance and scheduling to
optimize flow runs

Alerts
Configure alerts and email notifications to
notify you of failed flows



HOMEWORK: Sharing & Updating

THE SITUATION
Your friend Anna is a Director at Maven Financial, a local bank branch, and needs your help
extracting customer data from Tableau Prep.

THE BRIEF
Anna has asked you to set up outputs for various stakeholders, utilizing various file formats
and platforms. Your job is to deliver the data in a predictable and efficient way, to enable the
business to use it going forward.

THE OBJECTIVE
Use Tableau Prep to:
• Save flows locally
• Preview flows in Tableau Desktop
• Output flow data to local sources
• Output flow data to a database and Tableau Server*

*If you do not have access to Tableau Server, you can skip this step and review the solution video
