0% found this document useful (0 votes)
105 views14 pages

Data Blending in Tableau 27

Data blending in Tableau allows combining relevant data from two different data sources in a single Tableau worksheet. It differs from joins by only combining matching data and aggregating it before combining, avoiding data duplication. The tutorial explains how to define primary and secondary data sources and link their fields to blend them, demonstrating with an example of blending sales data from two years. It also compares data blending to joins, noting that blending aggregates data independently before combining while joins combine and then aggregate.

Uploaded by

Giri Raj
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
105 views14 pages

Data Blending in Tableau 27

Data blending in Tableau allows combining relevant data from two different data sources in a single Tableau worksheet. It differs from joins by only combining matching data and aggregating it before combining, avoiding data duplication. The tutorial explains how to define primary and secondary data sources and link their fields to blend them, demonstrating with an example of blending sales data from two years. It also compares data blending to joins, noting that blending aggregates data independently before combining while joins combine and then aggregate.

Uploaded by

Giri Raj
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 14

Data Blending in Tableau – Learn

to combine relevant data from


distinct data sources
In this tutorial, we will be learning a very interesting and handy feature of
Tableau, that is, Data Blending. Here, we will learn about the basics of data
blending in Tableau, how is it different from joins in Tableau and lastly how to
blend data from two different data sources.

What is Data Blending in Tableau?


In Tableau, the data blending features allows us to bring data from two
different data sources together in a single view or a single Tableau worksheet.

It is different from creating joins because blending only combines relevant


data from distinct data sources, whereas joins work on row-level and often
duplicate data that is repeating in several rows.

In data blending, there are two data sources; a primary data source and a
secondary data source. The additional relevant data of the secondary data
source is taken and displayed with the main data of the primary data source.

We can make graphs and charts using the data from both data sources at the
same time in a sheet.

Before moving further, get yourself acquainted with how to extract data in


Tableau?
Primary and Secondary Tableau Data Sources in
Tableau
To understand and perform data blending in Tableau, we first need to
acquaint ourselves with primary and secondary data sources. For data
blending, the primary data source is the main table whereas secondary data
source is the additional table.

There should be at least one secondary data source to perform blending in a


Tableau sheet. The data from the primary data source is the main data based
on which Tableau defines graphs and charts.

However, only those data values that are relevant, matching or corresponding
to the values of primary data sources are taken from secondary data source
leaving everything else at the source.

For instance, in our H&M sales data, if in the primary data source we have
regions like North, West, Central. But, the secondary data source has North,
South, West, and East.

Then, upon data blending, the final graph cannot show data related to the
South and East regions because it is not in the primary data source. So, you
should always select your primary and secondary data sources wisely,
depending upon the fields and values you wish to show in a chart. You can
select a data source as primary by simply using its fields first in a chart.

Check out different types of Tableau data sources with DataFlair.


Defining Link Relationships for Data Blending in
Tableau
We can define relationships or create links between data fields from a blended
primary and secondary data source. Ideally, Tableau creates links between
similar fields automatically, but you can define more relationships with the
help of the Edit Relationships option.
In order to create a link between two fields, you must have common
dimensions in both the data sources. Such common dimensions are known
as Linking fields. Once you have linking fields in both your primary and
secondary data sources, you can easily create a link.
Here are some important points you should keep in mind while establishing a
link between two linking fields.
 All the links are shown in the secondary data source pane in a sheet.
Here, you will find an orange link (chain) icon in front of the linked fields
and a grey broken link icon for the potential linking fields.
 You can create a link between two fields only if they contain similar
data. For instance, if Table 1 has the Regions field and Table 2
has Zones field with the same field values, that is, North, South, East, West,
you can manually establish a link between these two fields or common
dimensions rather than renaming one of them.
 You just need to click on a broken link icon to activate an inactive
linking field.
 If your primary and secondary data sources have more than one field in
common or more than one potential linking field, then multiple links can
be established between them in a single Tableau worksheet at a time.
How to Blend Data in Tableau?
Let us understand data blending in Tableau with the help of an example. We
have two data sets of sales of the store H&M in the year 2018 and in the year
2019. Now, the number of fields in both these data sets is the same but data is
different.

For instance, in the table “H&M 2018” the field named Region stores all the
regions data. Whereas in the “H&M 2019” table, the field having the same 5
regions is called Zone. With the help of data blending, we can equate these two
fields as both of them contain similar regions like north, south, central, east
and west.
Hence, data blending helps in establishing relationships between two relevant
data sources which makes data analysis more meaningful and insightful. We
can compare two data sets more efficiently by blending them in a single
Tableau worksheet.
So, before we start learning about blending data, let us first show you our two
sets of sample data of H&M sales.

We make separate connections to both the data sets; “H&M Sales 2018” and
“H&M sales 2019”.

Now, to see the existing relationship between these two data sets and to make
new relationships, we go to Data tab and then select the Edit
Relationship option.

A relationship dialog will open which shows the primary data source,
secondary data source and a list of already existing or automatically detected
relationships between the fields of the two tables.

You can change the primary and secondary data source from the drop-down
list as per your liking. Also, we can change the Automatic option to Custom to
make a new relationship.

Now, we will create a relation between the fields Region and Zone from the two


tables to state that the constituent data in them is similar (not identical). To
do this, we select the Custom option and click on Add.
Then a list of fields from both the tables will open. We
select Zone and Region from here. Click on OK.

This creates a new relationship between Zone and Region fields. Click on OK to


confirm. Please note that this relationship is applied only on the current
worksheet and not on other worksheets of a Tableau workbook.

On our worksheet, we have our two data sources which have blue and orange
tick marks in front of their names indicating primary (blue) and secondary
(orange) data source.

Now, we are ready to use data from these two data sets and start our analysis.
You will find a link icon in front of the fields that are linked between both the
tables. This means that you can use these fields from the primary data set as a
common field because they are linked.

As you can see in the screenshot below, we made a bar graph for total sales in
H&M stores in the years 2018 and 2019 (in USA). We were able to get region-
wise and state-wise sales data for both the years in one graph because of data
blending. Here, we were able to use the Zone field from 2019 sales dataset as a
common field between both the tables to provide information on regions.

Difference between Data Blending and Join in


Tableau
Both data blending and join creates a left join between two data tables. But the
main point of difference is when the aggregation takes place. When a join is
created, the data is combined first and then aggregated.

Whereas, in blending, data from primary and secondary data sources are
queried independently, aggregated, combined and then used for visualization.

Left Join in Tableau


The left join keeps the data of the left table as it is and takes only the matching
values from the right table. If in the right table, there is more than one
corresponding value for a single value in the left table, then the resultant table
will have row entry for every value entry in both the tables.
Another important thing to note about left joins is that it first combines the
two tables that are joining and then aggregates the data in them.

Have a clearer understanding of left joins by the example shown below.


As we can see, in the final table, data values corresponding only to customer
IDs CT1 to CT4 are present in the final table. These values were present in the
left table and that is why any value that does not correspond to these four
customer ID values will not be in the table or have null on those places.

Data Blending in Tableau


The main difference between joining two data tables and blending two data
tables is the step at which aggregation of data happens. Like in joining, the
tables are first combined and then the data is aggregated, resulting in some
duplicating values.

Whereas, in blending, the tables are kept separated at the database. Then it
aggregates the data and sends it to Tableau where it forms a combined table
with no duplicated data. It can often happen that the secondary table has more
than one corresponding value in its rows to the primary table.
For instance, in the tables shown below, Table 1 is the primary table and
Rajesh Sharma has bought two items, that is, iPad and Macbook. In this case,
Tableau will show an asterisk (*) in its place ( in product_name column).
If you have more than one numeric value corresponding to primary data
fields, then it sums or aggregates the values before displaying them in a table.
Like, you can see for the customer Rajesh Sharma, the total price of two
items; 40,000 and 80,000 sums up to be 1,20,000.
Also, it will show Null in places where the primary table does not find
corresponding values in the secondary table.
When to Blend your Data in Tableau?
Data blending feature in Tableau is particularly useful in the following cases;

1. When you cannot use cross-database joins in some specific database that do
not support it, like Oracle Essbase, Google Analytics (an extract only
connection), etc.
In such cases, you can import or connect to separate data sources in Tableau
and then combine them using data blending. This lets you use a combination
of data from distinct data sources on a single Tableau worksheet.

2. Another best-suited case where it is fit to use data blending is when your
data values exist at different levels of details or are having different
granularity.
3. Data blending is the best option to go with when you are using larger data
sets. Instead of going with Joins, you can blend the data because Joins
combine the data beforehand and then aggregate it for the view which affects
the performance when the database is large.
On the contrary, when we blend data, it aggregates the data first and then
combines it when required. It saves a lot of computational power in case of
large data sets.

Limitations of Data Blending in Tableau


1. We cannot publish a blended data source as a single data source on the
server. Instead, we need to publish the two data sources separately on the
same server and then blend the published sources.
2. Make your cube data source as the primary data source. (If you are using a
cube data source to blend with other sources).
3. It is a must to always aggregate data in calculations from the secondary data
sources.
4. You might face issues while using non-additive aggregates like MEDIAN,
COUNTD, etc when blending data.

You might also like