Data Blending in Tableau – Learn to combine relevant data from distinct data sources
In this tutorial, we will be learning a very interesting and handy feature of Tableau, that is, Data Blending. Here, we will learn about the basics of data blending in Tableau, how is it different from joins in Tableau and lastly how to blend data from two different data sources.
Don't become Obsolete & get a Pink Slip
Follow DataFlair on Google News & Stay ahead of the game
What is Data Blending in Tableau?
In Tableau, the data blending features allows us to bring data from two different data sources together in a single view or a single Tableau worksheet.
It is different from creating joins because blending only combines relevant data from distinct data sources, whereas joins work on row-level and often duplicate data that is repeating in several rows.
In data blending, there are two data sources; a primary data source and a secondary data source. The additional relevant data of the secondary data source is taken and displayed with the main data of the primary data source.
We can make graphs and charts using the data from both data sources at the same time in a sheet.
Before moving further, get yourself acquainted with how to extract data in Tableau?
Primary and Secondary Tableau Data Sources in Tableau
To understand and perform data blending in Tableau, we first need to acquaint ourselves with primary and secondary data sources. For data blending, the primary data source is the main table whereas secondary data source is the additional table.
There should be at least one secondary data source to perform blending in a Tableau sheet. The data from the primary data source is the main data based on which Tableau defines graphs and charts.
However, only those data values that are relevant, matching or corresponding to the values of primary data sources are taken from secondary data source leaving everything else at the source.
For instance, in our H&M sales data, if in the primary data source we have regions like North, West, Central. But, the secondary data source has North, South, West, and East.
Then, upon data blending, the final graph cannot show data related to the South and East regions because it is not in the primary data source. So, you should always select your primary and secondary data sources wisely, depending upon the fields and values you wish to show in a chart. You can select a data source as primary by simply using its fields first in a chart.
Check out different types of Tableau data sources with DataFlair.
Defining Link Relationships for Data Blending in Tableau
We can define relationships or create links between data fields from a blended primary and secondary data source. Ideally, Tableau creates links between similar fields automatically, but you can define more relationships with the help of the Edit Relationships option.
In order to create a link between two fields, you must have common dimensions in both the data sources. Such common dimensions are known as Linking fields. Once you have linking fields in both your primary and secondary data sources, you can easily create a link.
Here are some important points you should keep in mind while establishing a link between two linking fields.
- All the links are shown in the secondary data source pane in a sheet. Here, you will find an orange link (chain) icon in front of the linked fields and a grey broken link icon for the potential linking fields.
- You can create a link between two fields only if they contain similar data. For instance, if Table 1 has the Regions field and Table 2 has Zones field with the same field values, that is, North, South, East, West, you can manually establish a link between these two fields or common dimensions rather than renaming one of them.
- You just need to click on a broken link icon to activate an inactive linking field.
- If your primary and secondary data sources have more than one field in common or more than one potential linking field, then multiple links can be established between them in a single Tableau worksheet at a time.
How to Blend Data in Tableau?
Let us understand data blending in Tableau with the help of an example. We have two data sets of sales of the store H&M in the year 2018 and in the year 2019. Now, the number of fields in both these data sets is the same but data is different.
For instance, in the table “H&M 2018” the field named Region stores all the regions data. Whereas in the “H&M 2019” table, the field having the same 5 regions is called Zone. With the help of data blending, we can equate these two fields as both of them contain similar regions like north, south, central, east and west.
Hence, data blending helps in establishing relationships between two relevant data sources which makes data analysis more meaningful and insightful. We can compare two data sets more efficiently by blending them in a single Tableau worksheet.
A relationship dialog will open which shows the primary data source, secondary data source and a list of already existing or automatically detected relationships between the fields of the two tables.
Now, we will create a relation between the fields Region and Zone from the two tables to state that the constituent data in them is similar (not identical). To do this, we select the Custom option and click on Add.
This creates a new relationship between Zone and Region fields. Click on OK to confirm. Please note that this relationship is applied only on the current worksheet and not on other worksheets of a Tableau workbook.
Now, we are ready to use data from these two data sets and start our analysis. You will find a link icon in front of the fields that are linked between both the tables. This means that you can use these fields from the primary data set as a common field because they are linked.
As you can see in the screenshot below, we made a bar graph for total sales in H&M stores in the years 2018 and 2019 (in USA). We were able to get region-wise and state-wise sales data for both the years in one graph because of data blending. Here, we were able to use the Zone field from 2019 sales dataset as a common field between both the tables to provide information on regions.
Difference between Data Blending and Join in Tableau
Both data blending and join creates a left join between two data tables. But the main point of difference is when the aggregation takes place. When a join is created, the data is combined first and then aggregated.
Whereas, in blending, data from primary and secondary data sources are queried independently, aggregated, combined and then used for visualization.
Left Join in Tableau
The left join keeps the data of the left table as it is and takes only the matching values from the right table. If in the right table, there is more than one corresponding value for a single value in the left table, then the resultant table will have row entry for every value entry in both the tables.
Another important thing to note about left joins is that it first combines the two tables that are joining and then aggregates the data in them.
As we can see, in the final table, data values corresponding only to customer IDs CT1 to CT4 are present in the final table. These values were present in the left table and that is why any value that does not correspond to these four customer ID values will not be in the table or have null on those places.
Data Blending in Tableau
The main difference between joining two data tables and blending two data tables is the step at which aggregation of data happens. Like in joining, the tables are first combined and then the data is aggregated, resulting in some duplicating values.
Whereas, in blending, the tables are kept separated at the database. Then it aggregates the data and sends it to Tableau where it forms a combined table with no duplicated data. It can often happen that the secondary table has more than one corresponding value in its rows to the primary table.
For instance, in the tables shown below, Table 1 is the primary table and Rajesh Sharma has bought two items, that is, iPad and Macbook. In this case, Tableau will show an asterisk (*) in its place ( in product_name column).
If you have more than one numeric value corresponding to primary data fields, then it sums or aggregates the values before displaying them in a table. Like, you can see for the customer Rajesh Sharma, the total price of two items; 40,000 and 80,000 sums up to be 1,20,000.
When to Blend your Data in Tableau?
Data blending feature in Tableau is particularly useful in the following cases;
1. When you cannot use cross-database joins in some specific database that do not support it, like Oracle Essbase, Google Analytics (an extract only connection), etc.
In such cases, you can import or connect to separate data sources in Tableau and then combine them using data blending. This lets you use a combination of data from distinct data sources on a single Tableau worksheet.
2. Another best-suited case where it is fit to use data blending is when your data values exist at different levels of details or are having different granularity.
3. Data blending is the best option to go with when you are using larger data sets. Instead of going with Joins, you can blend the data because Joins combine the data beforehand and then aggregate it for the view which affects the performance when the database is large.
On the contrary, when we blend data, it aggregates the data first and then combines it when required. It saves a lot of computational power in case of large data sets.
Limitations of Data Blending in Tableau
1. We cannot publish a blended data source as a single data source on the server. Instead, we need to publish the two data sources separately on the same server and then blend the published sources.
2. Make your cube data source as the primary data source. (If you are using a cube data source to blend with other sources).
3. It is a must to always aggregate data in calculations from the secondary data sources.
4. You might face issues while using non-additive aggregates like MEDIAN, COUNTD, etc when blending data.
This concludes our lesson on data blending in Tableau. Here, we learned about the basic nature of data blending and how does it work. Also, we had a step by step learning session for applying data blending in a Tableau worksheet.
Lastly, we ventured to answer the question “How is data blending and joins different from each other”. We hope this tutorial helped you in learning Tableau a step further. Stay with us as we bring more interesting tutorials on Tableau.
Time to jump into DataFlair’s types of Tableau Filters article.
If you have any doubts in data blending in Tableau article, mention them in the comment section.