Power BI Today Class Notes PDF
Merge and Append Queries
Introduction:
Power BI’s merge and append operations allow you to combine data from
multiple tables.
The choice between merge and append depends on how you need to combine the
data:
• When you have one or more columns that you’d like to add to
another query, use the Merge Queries option.
• When you have additional rows of data that you’d like to add to an
existing query, use the Append Queries option.
Merge operations:
Suppose we have two tables, Sales Data and Product Data, as shown
below:
Sales Data:
Product Data:
1. From the left pane of Power Query Editor, select the query (table) into
which you want to merge the other query (table). In this case, it’s Sales
Data.
2. With the Sales Data table selected, click the Home tab in the ribbon.
3. Click Merge Queries in the Combine section.
4. Click Merge Queries as New.
The Merge dialog box appears.
5. From the first drop-down menu, select Sales Data and click
on Product_Key (common column between Sales and Product table)
6. From the second drop-down menu, select Product Data and click
on Product_Key.
7. Click OK.
Clicking ‘Merge Queries’ gives you two options: ‘Merge Queries’ and ‘Merge
Queries as New.’
Merge Queries:
This option merges the second table into the selected query and does not
create a new query.
Merge Queries as New:
This option merges the two tables into a new query and leaves the original
queries unchanged.
• On the merge screen, we can select the two tables from the drop-
down list and then select the column or columns (we can even select
multiple columns to join upon), which will be joined together.
• In the below example, we are using Product_Key from the Sales Data
table and Product_Key from the Product Data table.
• As you can see in the below image, the Join Kind defaults to a left
outer join, meaning all rows from the 1st table (Sales Data) will be
joined with the matching rows from the 2nd (Product Data) table.
• Note that the join finds a match between 1,63,072 of the rows in each
table.
There are six join kinds: left outer, right outer, full outer, inner, left anti,
and right anti. An anti join returns only the rows of one table that have no
match in the other.
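To make the join kinds concrete, here is a minimal sketch in plain Python. It is not Power Query code; it only mimics what a left outer, inner, and left anti join return. The sample rows and every column name other than Product_Key are made up for illustration.

```python
# Hypothetical miniature tables; only the Product_Key column comes from the notes.
sales = [
    {"Order": 1, "Product_Key": 10},
    {"Order": 2, "Product_Key": 20},
    {"Order": 3, "Product_Key": 99},  # has no matching product
]
products = [
    {"Product_Key": 10, "Product_Name": "Bike"},
    {"Product_Key": 20, "Product_Name": "Helmet"},
    {"Product_Key": 30, "Product_Name": "Gloves"},  # never sold
]


def merge(left, right, key, kind):
    """Join two lists of row dicts on `key`.

    `kind` is 'left_outer', 'inner' or 'left_anti'.
    """
    if kind not in ("left_outer", "inner", "left_anti"):
        raise ValueError(f"unsupported join kind: {kind}")

    # Index the right table by key so each left row can look up its matches.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    # Right-side columns, used to fill unmatched left rows with None (null).
    right_cols = {col: None for row in right for col in row if col != key}

    result = []
    for row in left:
        matches = index.get(row[key], [])
        if kind == "left_anti":
            if not matches:
                result.append(dict(row))  # keep only rows WITHOUT a match
        elif matches:
            for match in matches:
                result.append({**row, **match})
        elif kind == "left_outer":
            result.append({**row, **right_cols})  # unmatched row, nulls on the right
    return result


left_outer = merge(sales, products, "Product_Key", "left_outer")
inner = merge(sales, products, "Product_Key", "inner")
left_anti = merge(sales, products, "Product_Key", "left_anti")

print(len(left_outer), len(inner), len(left_anti))  # prints: 3 2 1
```

Swapping the two table arguments gives the right outer and right anti behaviour. A full outer join would additionally keep the unmatched rows of the second table; that case is left out to keep the sketch short.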
The result of the Merge is shown below. A new column is added to the Sales
Data dataset with a column name matching the 2nd table name, Product Data, in
the below example. The data are just listed as “Table,” which can be confusing.
• To see the related columns on the right-side column of the join, this
column needs to be expanded using the double arrow button in the
right corner of the column header.
• Clicking on this button opens a window that allows for selecting
specific columns from the second table that should be included in the
merged dataset.
• The ‘Use original column name as prefix’ checkbox can be turned on or
off; when it is on, the name of the merged column (here, Product Data) is
prefixed to each expanded column name.
• Expanding the column adds the selected field from the right-side table
to the merged dataset.
• The ‘Match by combining text parts’ option, one of the fuzzy matching
options, allows text parts to be combined when looking for a match, for
example left-side vs. leftside or part-of vs. part of.
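The expand step described in the bullets above can be pictured with a small plain-Python sketch. It assumes the merged column holds a nested table (shown as “Table” in the editor) and that the prefix is joined to the field name with a dot. The sample rows, the `expand` helper, and the Product_Name and Colour fields are hypothetical.

```python
# Merged rows before expanding: the "Product Data" column holds a nested table.
merged = [
    {"Order": 1, "Product_Key": 10,
     "Product Data": [{"Product_Key": 10, "Product_Name": "Bike", "Colour": "Red"}]},
    {"Order": 2, "Product_Key": 99, "Product Data": []},  # no match in second table
]


def expand(rows, table_col, fields, use_prefix=True):
    """Replace the nested-table column with the selected fields from it."""
    expanded = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != table_col}
        # An empty nested table still yields one output row, filled with None.
        nested = row[table_col] or [{}]
        for inner_row in nested:
            new_row = dict(base)
            for field in fields:
                name = f"{table_col}.{field}" if use_prefix else field
                new_row[name] = inner_row.get(field)
            expanded.append(new_row)
    return expanded


with_prefix = expand(merged, "Product Data", ["Product_Name"])
without_prefix = expand(merged, "Product Data", ["Product_Name"], use_prefix=False)

print(list(with_prefix[0]))     # column names with the prefix
print(list(without_prefix[0]))  # column names without the prefix
```

Only the fields you tick in the expand window end up in the result, which is why Colour does not appear in either output here.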
Append operations:
1. From the left pane of Power Query Editor, select the query (table) to
which you want to append the other query (table). In this case, it’s Sales
Data.
2. Click on Sales Data Table. Click on Home Tab in the Ribbon Menu.
3. Click on Append Queries in the Combine section.
4. Click on Append Queries as New.
• If you want to keep the existing query result as it is and create a new
query with the appended result, choose Append Queries as New.
Otherwise, just select Append Queries.
• In this example, I’ll use Append Queries as New because I want to keep
the existing queries intact.
You can choose the primary table (typically, this is the query that you
had selected before clicking Append Queries) and the table to append to it.
• You can also choose to append three or more tables and add tables to
the list as you wish.
• For this example, I have only two tables, so I’ll continue with the
above configuration.
• Append Queries simply stacks the rows of one query after the other, and
because the column names are identical in both queries, the result set will
have the same columns.
Append Queries will NOT remove duplicates; to get rid of them, use Group By or
Remove Duplicates on the appended result.
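A short plain-Python sketch of this behaviour, assuming two hypothetical monthly sales tables with identical columns: appending keeps the repeated row, and a separate de-duplication pass is needed to drop it. The `remove_duplicates` helper is illustrative only and is not a Power Query function.

```python
# Two hypothetical queries with the same columns; Order 2 appears in both.
sales_jan = [{"Order": 1, "Amount": 100}, {"Order": 2, "Amount": 250}]
sales_feb = [{"Order": 2, "Amount": 250}, {"Order": 3, "Amount": 75}]

# Append: rows of the second table are placed after the rows of the first.
appended = sales_jan + sales_feb


def remove_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))  # hashable view of the row
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


deduplicated = remove_duplicates(appended)

print(len(appended), len(deduplicated))  # prints: 4 3
```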
What if the columns do not match between the two source tables?
If the columns in the source queries differ, append still works, but the output
will contain one column for each distinct column name. For rows that come from
a source without that column, the cell value of that column will be null.
However, Append works best when the column names match exactly across the
source queries.
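The mismatched-column case can be sketched in plain Python as follows. The two tables and the Store column are made up; `None` stands in for Power BI’s null.

```python
def append(*tables):
    """Stack tables row-wise; columns missing from a source become None."""
    # Collect every distinct column name, in first-seen order.
    columns = []
    for table in tables:
        for row in table:
            for col in row:
                if col not in columns:
                    columns.append(col)
    # Rebuild each row with the full column list, filling gaps with None.
    return [{col: row.get(col) for col in columns}
            for table in tables for row in table]


# Hypothetical sources: only the second one has a Store column.
online_sales = [{"Order": 1, "Amount": 100}]
store_sales = [{"Order": 2, "Amount": 50, "Store": "Pune"}]

combined = append(online_sales, store_sales)
for row in combined:
    print(row)
```

The first output row has Store set to None because the online source never had that column, which is the null described above.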
Conclusion
Power BI merge and append queries are very handy for combining data from
multiple queries or tables when preparing your data for visualization.
The fuzzy matching feature makes merge queries even more powerful, allowing
the combination of two tables based on partial matches.
Group By
You can use the Group By feature to find the average, count, min, max, or any
other aggregate value for one column, based on unique values in other columns.
For example, you can use the Group By feature to find the average price of
the products in each product category, the sort of figure that you might add
to a Power BI report or dashboard.
You can apply the Group By feature to group data in one or multiple columns.
Similarly, you can use multiple aggregate functions with the Group By feature.
The CSV file for the dataset used in this article is available at the
following link. The dataset contains information about the passengers on board
the ill-fated Titanic.
https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
Import the CSV file from your local file system or directly via the online link into
Power BI Desktop.
You have two ways to load your dataset into the Power Query editor.
1. When importing the data, you will see the following window.
If you click the Transform Data button, the Power Query editor will open.
2. If you have already imported your data into the reports view by clicking
the Load button, you can still open the Power Query editor.
Click the Transform data option in the ribbon, as shown below.
You will see the window below when the Power Query editor opens.
To apply the Group By feature using a single column and a single aggregate
function, select the radio button for the Basic option as shown.
As our first example, let’s find the average fare paid by the passengers from the
different passenger classes.
In other words, let’s group the average fares in the Titanic dataset by passenger
classes.
You need to set four options to apply the Group By feature:
1. First, select the column you want to group the data by. This is your
grouping column.
In our example, this is the Pclass (passenger class) column.
2. Second, choose a name for the new column that will be created. Enter the
name for this column in the New column name box.
3. Third, choose the aggregate function used to summarise the values in each
group. You do this in the Operation drop-down list.
You can select functions like count, min, max, median, etc., as the aggregate
function.
In our example, the Operation field specifies the Average aggregate function.
4. Finally, select the column that contains the values to be aggregated. The
box for this is called Column. We have chosen
the Fare column.
In the output, you can see two new columns.
The first column contains the passenger classes (e.g. 1, 2, and 3), and the second
column contains the average fare paid by passengers in that class.
The name of the Average Fare column is as you specified in the previous step. You
can give any name to this column.
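What the Basic Group By does here can be sketched in plain Python. The handful of passenger rows below are invented for illustration and are not taken from the Titanic CSV, so the averages differ from the real output; only the column names Pclass and Fare follow the dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows using the dataset's column names.
passengers = [
    {"Pclass": 1, "Fare": 80.0},
    {"Pclass": 3, "Fare": 8.0},
    {"Pclass": 1, "Fare": 60.0},
    {"Pclass": 2, "Fare": 20.0},
    {"Pclass": 3, "Fare": 10.0},
]


def group_average(rows, group_col, value_col):
    """Average `value_col` for each unique value of `group_col`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {key: mean(values) for key, values in sorted(groups.items())}


average_fare = group_average(passengers, "Pclass", "Fare")
print(average_fare)  # prints: {1: 70.0, 2: 20.0, 3: 9.0}
```

Each key of the result corresponds to one row of the grouped table, and the value plays the role of the Average Fare column.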
You can also see the APPLIED STEPS pane in the lower right-hand section of your
screen. It shows the steps that you applied in the Power Query editor.
Let’s group average fares by the Pclass (passenger class) and Sex columns.
With the Basic option, you can select only one column to group by.
To add more columns, select the Advanced option and click the Add grouping button.
All the other options are as they were in the previous section.
You can now see three columns in the output table (below).
The unique values in the Pclass column are repeated for each of the unique values
in the Sex column.
You can see that the average fare paid by male passengers travelling in
first class was 67.22. Female passengers in first class, on the other hand,
paid an average fare of 106.12.
You can also apply more than one aggregate function: click the Add aggregation
button. This adds another row of input boxes to enter the relevant data.
The Group By feature in the above example uses two aggregate functions.
1. The first aggregate function computes the average fare for each
combination of the “Pclass” and “Sex” columns.
2. The second aggregate function computes the maximum age for each
combination of the “Pclass” and “Sex” columns.
The output is a table that summarises the data: the many rows of the original
table are condensed into one row per group.
You can easily see the average fares paid by the different genders in the various
classes on the Titanic.
You can also see the maximum age of the passengers in each of those groupings.
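Grouping by several columns with several aggregate functions can be sketched the same way in plain Python. The rows are again invented, so the numbers will not match the real Titanic output, and the `group_by` helper and the output names Average Fare and Max Age are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows using the dataset's column names.
passengers = [
    {"Pclass": 1, "Sex": "female", "Fare": 80.0, "Age": 38},
    {"Pclass": 1, "Sex": "female", "Fare": 100.0, "Age": 35},
    {"Pclass": 1, "Sex": "male", "Fare": 60.0, "Age": 54},
    {"Pclass": 3, "Sex": "male", "Fare": 8.0, "Age": 22},
    {"Pclass": 3, "Sex": "male", "Fare": 10.0, "Age": 30},
]


def group_by(rows, group_cols, aggregations):
    """Group on several columns and apply several aggregate functions.

    `aggregations` maps a new column name to (function, source column).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in group_cols)].append(row)

    summary = []
    for key, members in sorted(groups.items()):
        out = dict(zip(group_cols, key))  # one output row per unique combination
        for new_name, (func, col) in aggregations.items():
            out[new_name] = func([member[col] for member in members])
        summary.append(out)
    return summary


summary = group_by(
    passengers,
    ["Pclass", "Sex"],
    {"Average Fare": (mean, "Fare"), "Max Age": (max, "Age")},
)
for row in summary:
    print(row)
```

Every extra entry in the `aggregations` mapping corresponds to one more row of input boxes in the Group By window.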
Conclusion
The Power BI Group By feature is a powerful way to summarise data.
In this article, you saw how to use the Power BI Group By feature via the Power
Query editor GUI options.
You can also use the GROUPBY function in DAX to gain more fine-grained control
over how your data is grouped.