Data Blending in Tableau 27
In data blending, there are two data sources: a primary data source and a
secondary data source. The relevant additional data from the secondary data
source is brought in and displayed alongside the main data of the primary
data source. We can build graphs and charts that use data from both data
sources at the same time in a single sheet.
However, only those values from the secondary data source that match or
correspond to values in the primary data source are brought in; everything
else stays at the source.
For instance, suppose that in our H&M sales data the primary data source has
the regions North, West, and Central, while the secondary data source has
North, South, West, and East.
Then, upon data blending, the final graph cannot show data for the South and
East regions because they are not in the primary data source. So, you should
always choose your primary and secondary data sources carefully, depending on
the fields and values you wish to show in a chart. You can make a data source
the primary one simply by using its fields first in a chart.
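The matching rule described above behaves much like a left join keyed on the
primary data source. The following is a minimal Python sketch of that idea,
assuming invented sales figures and a hypothetical helper named blend. It only
models the behavior and is not how Tableau is implemented.

```python
# Hypothetical per-region sales totals; the numbers are invented for illustration.
primary = {"North": 120, "West": 95, "Central": 80}
secondary = {"North": 130, "South": 70, "West": 101, "East": 66}


def blend(primary, secondary):
    """Keep every primary region and attach the matching secondary value.

    Regions that exist only in the secondary source are dropped; primary
    regions without a match get None.
    """
    return {region: (value, secondary.get(region))
            for region, value in primary.items()}


blended = blend(primary, secondary)
for region, (primary_sales, secondary_sales) in blended.items():
    print(region, primary_sales, secondary_sales)
```

In this sketch South and East never reach the result, while Central appears
with no secondary value. Swapping the two arguments would keep South and East
and drop Central instead, which is why the choice of primary source matters.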
Blending also lets us link fields that have different names. For instance, in
the “H&M 2018” table the field named Region stores the region data, whereas
in the “H&M 2019” table the field holding the same five regions is called
Zone. With the help of data blending, we can equate these two fields because
both contain the same regions: North, South, Central, East, and West.
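Linking two differently named fields can be pictured as follows. This is a
rough Python sketch under stated assumptions: only the field names Region and
Zone come from the text, while the rows, the Sales field, and the helper
total_sales_by are hypothetical.

```python
# Hypothetical rows; only the field names Region and Zone come from the text.
rows_2018 = [
    {"Region": "North", "Sales": 100},
    {"Region": "South", "Sales": 60},
    {"Region": "North", "Sales": 40},
]
rows_2019 = [
    {"Zone": "North", "Sales": 150},
    {"Zone": "South", "Sales": 75},
]

# The custom relationship: which field in each source holds the region.
LINK = {"primary": "Region", "secondary": "Zone"}


def total_sales_by(rows, field):
    """Sum Sales for each distinct value of the given field."""
    totals = {}
    for row in rows:
        totals[row[field]] = totals.get(row[field], 0) + row["Sales"]
    return totals


totals_2018 = total_sales_by(rows_2018, LINK["primary"])
totals_2019 = total_sales_by(rows_2019, LINK["secondary"])

# Combine on the linked values, keeping the regions of the primary source.
sales_by_region = {
    region: {"2018": total, "2019": totals_2019.get(region)}
    for region, total in totals_2018.items()
}
print(sales_by_region)
```

Each source is summarized by its own field name, and the two summaries are
matched on the region values rather than on the field names.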
Hence, data blending helps establish relationships between two related data
sources, which makes data analysis more meaningful and insightful. We can
compare two data sets more efficiently by blending them in a single Tableau
worksheet.
So, before we start learning about blending data, let us first show you our two
sets of sample data of H&M sales.
We make separate connections to the two data sets: “H&M Sales 2018” and
“H&M Sales 2019”.
Now, to see the existing relationships between these two data sets and to
create new ones, we go to the Data menu and select the Edit Relationships
option (labeled Edit Blend Relationships in recent Tableau versions).
A Relationships dialog opens, showing the primary data source, the secondary
data source, and a list of existing or automatically detected relationships
between the fields of the two tables.
You can change the primary and secondary data sources from the drop-down
lists as you like. You can also change the Automatic option to Custom to
define a new relationship.
On our worksheet, we have our two data sources which have blue and orange
tick marks in front of their names indicating primary (blue) and secondary
(orange) data source.
Now, we are ready to use data from these two data sets and start our analysis.
You will find a link icon next to the fields that are linked between the two
tables. This means that you can use these fields from the primary data set as
common fields because they are linked.
As you can see in the screenshot below, we made a bar graph of total sales in
H&M stores in the USA for the years 2018 and 2019. Because of data blending,
we were able to get region-wise and state-wise sales data for both years in
one graph. Here, we used the Zone field from the 2019 sales data set as the
common field between the two tables to provide information on regions.
In blending, data from the primary and secondary data sources is queried
independently, aggregated, combined, and then used for visualization. The
tables are kept separate at the database. Each source aggregates its own data
and sends the result to Tableau, which forms a combined table with no
duplicated data. It can often happen that the secondary table has more than
one row whose values correspond to a single value in the primary table.
For instance, in the tables shown below, Table 1 is the primary table and
Rajesh Sharma has bought two items: an iPad and a MacBook. In this case,
Tableau shows an asterisk (*) in place of the product name (in the
product_name column).
If more than one numeric value corresponds to a value in the primary data
source, Tableau sums or otherwise aggregates those values before displaying
them in the table. For example, for the customer Rajesh Sharma, the prices of
the two items, 40,000 and 80,000, add up to 1,20,000.
Tableau also shows Null wherever a value in the primary table has no
corresponding value in the secondary table.
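The three display rules above (asterisk, summed numbers, Null) can be
imitated with a short Python sketch. It is only an illustration under stated
assumptions: the customer "Customer B", the assignment of each price to a
particular product, and the helper blend_orders are hypothetical, and None
stands in for Null.

```python
# Primary table: one row per customer. "Customer B" is a hypothetical
# customer with no orders, added to illustrate the Null case.
primary_customers = ["Rajesh Sharma", "Customer B"]

# Secondary table: (customer, product_name, price). Which price belongs to
# which product is an assumption; the text gives only the two amounts.
secondary_orders = [
    ("Rajesh Sharma", "iPad", 40000),
    ("Rajesh Sharma", "MacBook", 80000),
]


def blend_orders(customers, orders):
    """Return {customer: (product_name, total_price)} using the blend rules."""
    result = {}
    for customer in customers:
        matches = [(product, price)
                   for name, product, price in orders if name == customer]
        if not matches:
            # No corresponding secondary rows: both columns are Null.
            result[customer] = (None, None)
            continue
        products = {product for product, _ in matches}
        # Several distinct text values cannot be shown in one cell: asterisk.
        product_name = next(iter(products)) if len(products) == 1 else "*"
        # Numeric values are aggregated (summed here) before display.
        total_price = sum(price for _, price in matches)
        result[customer] = (product_name, total_price)
    return result


blended_orders = blend_orders(primary_customers, secondary_orders)
print(blended_orders)
```

For Rajesh Sharma the sketch yields an asterisk and a total of 120000
(1,20,000 in the notation used above), and the customer without orders gets
None in both columns.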
When to Blend Your Data in Tableau?
The data blending feature in Tableau is particularly useful in the following
cases:
1. When you cannot use cross-database joins because a particular data source
does not support them, such as Oracle Essbase or Google Analytics (an
extract-only connection).
In such cases, you can import or connect to separate data sources in Tableau
and then combine them using data blending. This lets you use a combination
of data from distinct data sources on a single Tableau worksheet.
2. Data blending is also well suited to cases where your data values exist at
different levels of detail, that is, at different granularity.
3. Data blending is the best option when you are working with large data
sets. Instead of using joins, you can blend the data, because joins combine
the data first and then aggregate it for the view, which hurts performance
when the database is large.
In contrast, when we blend data, Tableau aggregates the data first and then
combines it as required. This saves a lot of computation in the case of large
data sets.
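The difference in ordering described in the third case can be made concrete
with a toy model. The Python sketch below compares "combine, then aggregate"
with "aggregate, then combine" on a handful of invented rows. The function
names and the data are hypothetical, and the sketch does not measure real
Tableau performance.

```python
from collections import defaultdict
from itertools import product

# Invented (region, value) rows for two sources.
primary_rows = [("North", 10), ("North", 20), ("West", 5)]
secondary_rows = [("North", 1), ("North", 2), ("North", 3), ("West", 4)]


def join_then_aggregate(left, right):
    """Join-style: pair up every matching row first, then sum per region."""
    joined = [(lk, lv, rv)
              for (lk, lv), (rk, rv) in product(left, right) if lk == rk]
    totals = defaultdict(lambda: [0, 0])
    for region, left_value, right_value in joined:
        totals[region][0] += left_value
        totals[region][1] += right_value
    return len(joined), {region: tuple(t) for region, t in totals.items()}


def aggregate_then_combine(left, right):
    """Blend-style: sum each source per region first, then match the sums."""
    def totals(rows):
        out = defaultdict(int)
        for region, value in rows:
            out[region] += value
        return out

    left_totals, right_totals = totals(left), totals(right)
    combined = {region: (value, right_totals.get(region))
                for region, value in left_totals.items()}
    return len(combined), combined


joined_count, joined_totals = join_then_aggregate(primary_rows, secondary_rows)
blended_count, blended_totals = aggregate_then_combine(primary_rows, secondary_rows)
print(joined_count, joined_totals)
print(blended_count, blended_totals)
```

In this toy model the join-style path builds seven intermediate rows and
counts some values more than once, while the blend-style path combines only
two already aggregated rows, one per region.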