Lab 02 - Load Transformed Data in Power BI Desktop
Lab story
In this lab, you’ll use data cleansing and transformation techniques to start shaping your data
model. You’ll then apply the queries to load each as a table to the data model.
Get started
In this task, you’ll set up the environment for the lab.
Important: If you completed the previous lab in the same VM, skip to the next task.
Tip: By default, the Getting Started dialog box opens in front of Power BI Desktop.
You can choose to sign in, and then close the pop-up.
2. To open the starter Power BI Desktop file, select File > Open Report > Browse
Reports.
3. In the Open window, navigate to the
D:\PL300\Labs\02-load-data-with-power-query-in-power-bi-desktop\Starter folder,
and open the Sales Analysis file.
4. Close any informational windows that may open.
5. Notice the yellow warning message beneath the ribbon.
This message alerts you to the fact that the queries haven’t been applied to load as
model tables. You’ll apply the queries later in this lab.
To dismiss the warning message, at the right of the yellow warning message, select X.
6. To create a copy of the file, go to File > Save As, and save it to the
D:\PL300\MySolution folder.
7. If prompted to apply changes, select Apply Later.
Configure the Salesperson query
In this task, you’ll use Power Query Editor to configure the Salesperson query.
Important: When instructed to rename columns, it’s important that you rename them exactly
as described.
1. To open the Power Query Editor window, on the Home ribbon tab, from inside the
Queries group, select the Transform Data icon.
2. In the Power Query Editor window, in the Queries pane, select the DimEmployee
query.
3. To rename the query, in the Query Settings pane (located at the right), in the Name
box, replace the text with Salesperson, and then press Enter. Then verify that the
name has been updated in the Queries pane.
The query name determines the model table name. It’s recommended to define concise
and user-friendly names.
4. To locate a specific column, on the Home ribbon tab, select the Manage Columns
down-arrow, select the Choose Columns down-arrow, and then select Go to
Column.
Go to Column is useful when a query has many columns. Otherwise, you can scroll
horizontally to find columns.
5. In the Go to Column window, to order the list by column name, select the AZ sort
button, and then select Name. Then select SalesPersonFlag, and select OK.
6. Locate the SalesPersonFlag column, then filter the column to select only Salespeople
(that is, TRUE), and click OK.
7. In the Query Settings pane, in the Applied Steps list, notice the addition of the
Filtered Rows step.
Each transformation you create adds another step to the query logic. It’s possible to edit or
delete steps. It’s also possible to select a step to preview the query results at that
stage of the query transformation.
8. To remove columns, on the Home ribbon tab, from inside the Manage Columns group,
select the Choose Columns icon.
9. In the Choose Columns window, to uncheck all columns, uncheck the (Select All
Columns) item.
10. To include columns, check the following six columns:
o EmployeeKey
o EmployeeNationalIDAlternateKey
o FirstName
o LastName
o Title
o EmailAddress
11. In the Applied Steps list, notice the addition of another query step.
12. To create a single name column, first select the FirstName column header. While
pressing the Ctrl key, select the LastName column.
13. Right-click either of the selected column headers, and then in the context menu, select
Merge Columns.
14. In the Merge Columns window, in the Separator dropdown list, select Space.
15. In the New Column Name box, replace the text with Salesperson.
16. To rename the EmployeeNationalIDAlternateKey column, double-click the
EmployeeNationalIDAlternateKey column header and replace the text with
EmployeeID, and then press Enter.
17. Use the previous steps to rename the EmailAddress column to UPN.
18. At the bottom-left, in the status bar, verify that the query has five columns and 18
rows.
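The shaping applied in this task (filter rows, choose columns, merge columns, rename columns) can be sketched outside Power Query as follows. This is a plain Python illustration only, not the M code that Power Query Editor generates; the sample rows, the Gender column, and the shape_salesperson function name are hypothetical.

```python
# Hypothetical sample rows standing in for the DimEmployee source table.
employees = [
    {"EmployeeKey": 1, "EmployeeNationalIDAlternateKey": "111", "FirstName": "Ana",
     "LastName": "Lee", "Title": "Sales Representative",
     "EmailAddress": "ana@example.com", "SalesPersonFlag": True, "Gender": "F"},
    {"EmployeeKey": 2, "EmployeeNationalIDAlternateKey": "222", "FirstName": "Bo",
     "LastName": "Kim", "Title": "Engineer",
     "EmailAddress": "bo@example.com", "SalesPersonFlag": False, "Gender": "M"},
]

# The six columns kept by the Choose Columns step.
KEPT_COLUMNS = ["EmployeeKey", "EmployeeNationalIDAlternateKey", "FirstName",
                "LastName", "Title", "EmailAddress"]


def shape_salesperson(rows):
    """Apply the task's steps in order to a list of row dictionaries."""
    shaped = []
    for row in rows:
        # Filtered Rows: keep only salespeople.
        if row["SalesPersonFlag"] is not True:
            continue
        # Choose Columns: keep six columns and drop the rest.
        kept = {name: row[name] for name in KEPT_COLUMNS}
        # Merge Columns (space separator) and the two renames.
        shaped.append({
            "EmployeeKey": kept["EmployeeKey"],
            "EmployeeID": kept["EmployeeNationalIDAlternateKey"],
            "Salesperson": f'{kept["FirstName"]} {kept["LastName"]}',
            "Title": kept["Title"],
            "UPN": kept["EmailAddress"],
        })
    return shaped


salespeople = shape_salesperson(employees)
```

Each statement in the loop corresponds to one entry in the Applied Steps list, which is why the result has five columns: six are kept, and two of them are merged into one.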
Configure the SalespersonRegion query
In this task, you’ll configure the SalespersonRegion query.
Important: When detailed instructions have already been provided, lab steps will provide
more concise instructions. If you need the detailed instructions, you can refer back to the
steps of previous tasks.
6. Review the full list of columns, and then uncheck the (Select All Columns) box to
deselect all columns.
7. Select EnglishProductSubcategoryName and DimProductCategory, and uncheck
the Use Original Column Name as Prefix checkbox before selecting OK.
Query column names must always be unique. If left checked, this checkbox would
prefix each column with the expanded column name (in this case
DimProductSubcategory). Because it’s known that the selected column names don’t
collide with column names in the Product query, the option is deselected.
8. Notice that the transformation resulted in the addition of two columns, and that the
DimProductSubcategory column has been removed.
9. Expand the DimProductCategory column, and then include only the
EnglishProductCategoryName column.
10. Rename the following four columns:
o EnglishProductName to Product
o StandardCost to Standard Cost (include a space)
o EnglishProductSubcategoryName to Subcategory
o EnglishProductCategoryName to Category
11. In the status bar, verify that the query has six columns and 397 rows.
o StateProvinceName to State-Province
o EnglishCountryRegionName to Country-Region
8. In the status bar, verify that the query has six columns and 701 rows.
Note: You may recall from the Prepare Data in Power BI Desktop lab that a
small percentage of FactResellerSales rows had missing TotalProductCost
values. The DimProduct column has been included to retrieve the product
standard cost column, which helps fix the missing values.
3. Expand the DimProduct column, uncheck all columns, and then include only the
StandardCost column.
4. To create a custom column, on the Add Column ribbon tab, from inside the General
group, select Custom Column.
5. In the Custom Column window, in the New Column Name box, replace the text
with Cost.
6. In the Custom Column Formula box, enter the following expression (after the
equals symbol):
o You can copy the expression from the
D:\PL300\Labs\02-load-data-with-power-query-in-power-bi-desktop\Assets\Snippets.txt
file.
o This expression tests if the TotalProductCost value is missing. If missing, it
produces a value by multiplying the OrderQuantity value by the StandardCost
value; otherwise, it uses the existing TotalProductCost value.
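The logic described for the Cost column can be sketched in Python as follows. This is an illustration of the fallback rule only, not the Power Query expression from Snippets.txt; the cost function name is hypothetical, and a missing value is assumed to be represented as None.

```python
from decimal import Decimal


def cost(total_product_cost, order_quantity, standard_cost):
    """Return the existing cost, or derive one when it is missing."""
    # Assumption: a missing TotalProductCost arrives as None.
    if total_product_cost is None:
        # Fall back to quantity multiplied by the product's standard cost.
        return order_quantity * standard_cost
    # Otherwise keep the value that is already recorded.
    return total_product_cost


derived = cost(None, 3, Decimal("2.50"))
existing = cost(Decimal("9.99"), 3, Decimal("2.50"))
```

The fallback is applied row by row, so rows that already have a TotalProductCost value are left unchanged.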
Configuring the correct data type is important. When a column contains numeric
values, it’s also important to choose the correct numeric type if you expect to perform
mathematical calculations.
10. Modify the following three column data types to Fixed Decimal Number:
o Unit Price
o Sales
o Cost
The fixed decimal number data type allows for 19 digits, and it provides more
precision to avoid rounding errors. It’s important to use the fixed decimal number
type for financial values, or rates (like exchange rates).
11. In the status bar, verify that the query has 10 columns and 999+ rows.
A maximum of 1000 rows will be loaded as preview data for each query.
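The rounding concern behind the Fixed Decimal Number type in step 10 can be illustrated with a short Python comparison. This is a sketch only: Python’s Decimal type stands in for the fixed decimal number type, a scale of four decimal places is assumed for it, and the to_fixed_decimal helper is hypothetical.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so repeated
# addition accumulates a small error.
float_total = sum([0.1] * 10)

# A decimal type stores the value exactly, so the total is exact.
fixed_total = sum([Decimal("0.1")] * 10)

# Assumption: the fixed decimal number type keeps four decimal places.
SCALE = Decimal("0.0001")


def to_fixed_decimal(value):
    """Round a numeric string to the assumed four-decimal-place scale."""
    return Decimal(value).quantize(SCALE)


unit_price = to_fixed_decimal("1.23456")
```

Here float_total is not exactly 1.0, while fixed_total is. Small differences like this add up across many rows, which is why financial columns such as Unit Price, Sales, and Cost benefit from a decimal type.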
You may recall that the hyphen character was used in the source CSV file to represent
zero (0).
6. Rename the following two columns:
o Attribute to MonthNumber (there’s no space)
o Value to Target
7. To prepare the MonthNumber column values, right-click the MonthNumber
column header, and then select Replace Values.
8. In the Replace Values window, in the Value To Find box, enter M and leave the
Replace With box empty.
9. Modify the MonthNumber column data type to Whole Number.
You’ll now apply transformations to produce a date column. The date will be derived
from the Year and MonthNumber columns. You’ll create the column by using the
Column From Examples feature.
10. On the Add Column ribbon tab, from inside the General group, select the Column
From Examples icon.
11. Notice that the first row is for year 2017 and month number 7.
12. In the Column1 column, in the first grid cell, start entering 7/1/2017, and then
press Enter.
The virtual machine uses US regional settings, so this date is in fact July 1, 2017.
Other regional settings may require a 0 before the date.
13. Notice that the grid cells update with predicted values.
The feature has accurately predicted that you’re combining values from the Year and
MonthNumber columns.
14. Notice also the formula presented above the query grid.
15. To rename the new column, double-click the Merged column header and rename the
column as TargetMonth.
16. Remove the following columns:
o Year
o MonthNumber
17. Modify the following column data types:
o Target as fixed decimal number
o TargetMonth as date
18. To multiply the Target values by 1000, select the Target column header, and then on
the Transform ribbon tab, from inside the Number Column group, select Standard,
and then select Multiply.
You may recall that the target values were stored as thousands.
19. In the Multiply window, in the Value box, enter 1000, and select OK.
20. In the status bar, verify that the query has three columns and 809 rows.
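The transformations in steps 7 through 19 can be summarized in a Python sketch. It is an illustration only, not the formula that Column From Examples generates; the sample row, the "M07" value format, the EmployeeID column, and the shape_target function name are hypothetical.

```python
from datetime import date
from decimal import Decimal

# Hypothetical row as it might look after renaming Attribute and Value.
sample_row = {"EmployeeID": "111", "Year": 2017, "MonthNumber": "M07", "Target": "25"}


def shape_target(row):
    """Derive TargetMonth and scale Target, then drop Year and MonthNumber."""
    # Replace Values: remove the "M" prefix, then convert to a whole number.
    month_number = int(row["MonthNumber"].replace("M", ""))
    # Column From Examples: the first day of the month in the given year.
    target_month = date(row["Year"], month_number, 1)
    # Targets are stored as thousands, so multiply by 1000.
    target = Decimal(row["Target"]) * 1000
    return {
        "EmployeeID": row["EmployeeID"],
        "Target": target,
        "TargetMonth": target_month,
    }


shaped_target = shape_target(sample_row)
```

Building the date from its year and month parts avoids the regional ambiguity of a text value such as 7/1/2017.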
1. Select the ColorFormats query and notice that the first row contains the column
names.
2. On the Home ribbon tab, from inside the Transform group, select Use First Row as
Headers.
3. In the status bar, verify that the query has three columns and 10 rows.
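What Use First Row as Headers does can be sketched in Python as follows. This is an illustration only; the promote_headers function name and the color values are hypothetical.

```python
# Hypothetical raw rows: the first row holds the column names.
raw_rows = [
    ["Color", "Background Color Format", "Font Color Format"],
    ["Black", "#000000", "#FFFFFF"],
    ["Red", "#FF0000", "#FFFFFF"],
]


def promote_headers(rows):
    """Treat the first row as column names and the rest as data rows."""
    header, *data = rows
    return [dict(zip(header, values)) for values in data]


color_formats = promote_headers(raw_rows)
```

The row count drops by one because the header row becomes metadata instead of data.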
Merging queries allows integrating data, in this case from different data sources
(SQL Server and a CSV file).
3. In the Merge window, in the Product query grid, select the Color column header.
4. Beneath the Product query grid, in the dropdown list, select the ColorFormats
query.
Privacy levels can be configured for each data source to determine whether data can be
shared between sources. Setting each data source as Organizational allows them to
share data, if necessary. Private data sources can never be shared with other data
sources. It doesn’t mean that Private data can’t be shared; it means that the Power
Query engine can’t share data between the sources.
7. In the Merge window, keep the default Join Kind of Left Outer, and then select OK.
8. Expand the ColorFormats column to include the following two columns:
o Background Color Format
o Font Color Format
9. In the status bar, verify that the query now has eight columns and 397 rows.
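The Left Outer merge followed by the column expansion can be sketched in Python as follows. This is an illustration of the join behavior only, not Power Query’s implementation; the sample rows and the left_outer_merge function name are hypothetical, and each color is assumed to appear at most once in ColorFormats.

```python
# Hypothetical sample rows for the two queries.
products = [
    {"Product": "Road Bike", "Color": "Red"},
    {"Product": "Helmet", "Color": "Teal"},  # no matching color format
]
color_formats = [
    {"Color": "Red", "Background Color Format": "#FF0000", "Font Color Format": "#FFFFFF"},
]


def left_outer_merge(left_rows, right_rows):
    """Keep every left row and add the two format columns where a color matches."""
    # Assumption: Color is unique in the right-hand rows.
    formats_by_color = {row["Color"]: row for row in right_rows}
    merged = []
    for left in left_rows:
        match = formats_by_color.get(left["Color"], {})
        merged.append({
            **left,
            "Background Color Format": match.get("Background Color Format"),
            "Font Color Format": match.get("Font Color Format"),
        })
    return merged


merged_products = left_outer_merge(products, color_formats)
```

Under that uniqueness assumption the merge adds columns without changing the row count, which is consistent with the Product query keeping 397 rows while growing to eight columns.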
3. In the Query Properties window, uncheck the Enable Load To Report checkbox.
Disabling the load means the query will not load as a table to the data model. This is
done because the query was merged with the Product query, which is enabled to load to
the data model.
Finish up
3. In the Data pane (located at the right), notice the seven tables loaded to the data
model.
4. Save the Power BI Desktop file.
You’ll configure data model tables and relationships in the Model Data in Power BI Desktop
lab.