Study Notes
Home
Transform
Add Column
View
Renaming columns
Reordering columns
Removing columns
Choosing columns
Merging columns / Splitting columns
Going to a specific column
Pivoting or unpivoting columns
Filling or replacing values
Extracting text from columns
Keeping rows
Removing rows
Removing blank rows
Removing duplicates
Sorting data
Reversing the row order
Undoing a sort operation
Filtering data
Selecting specific values
Finding elements in the filter list
Filtering text ranges
Filtering numeric ranges
Table menu: This menu appears when you right-click the top corner of the grid
containing the data.
Column menu: This menu appears when you right-click a column title.
Cell menu: This menu appears when you right-click a data cell.
All transformations are recorded under APPLIED STEPS in the Query Settings pane (the right-hand pane).
These steps can be undone.
Adding or Merging columns. If the option is selected from the Transform ribbon then the existing columns
will be modified. If it is selected from the Add Column ribbon then the original column will not be altered
but a new column is added containing the results of the data transformation.
Merge and Append Queries are in the Combine Menu on the Home Ribbon.
Cannot: Move columns. This can only be done in the Power Query editor
To format columns
Select a column in Data View to enable the Column Tools ribbon. This allows changes to the column
from the following menus:
Structure menu:
Name, Data Type
Formatting menu
Currency, %, Comma separator, Decimal Places
Properties Menu
Summarization (Sum, Average, Min, Max, Count, Count(Distinct))
Default summarization can be overridden in Report view.
Data category – Power BI will place an icon beside the field in the Fields list.
Can be:
Uncategorized, Address, Place, City, County, State or Province, Postal code,
Country, Continent, Latitude, Longitude, Web URL, Image URL, Barcode
Sort menu
Sort by column
Groups menu
Data groups
Relationships menu
Manage Relationships
Calculations menu
New column
Cannot:
Format a range of cells in a table. Only the whole column.
Select multiple adjacent or noncontiguous columns and format in a single operation.
DAX Examples
TownAbbreviation = UPPER(LEFT(Clients[Town],3))
ROUND() functions modify the data whereas formatting numbers only changes appearance.
This formula returns a related value from another table as specified by the joins in the data model.
When typing RELATED in the formula Power BI will display a drop-down list of all the fields of all
the tables that can be joined to the current table in the data model.
Note: Lookup tables that use data only once cannot pull back data using the RELATED function.
These tables are on the “one” side of the “one to many” relationship. A lookup can only be performed
from the “many” side.
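As a minimal sketch of such a lookup, assume the Clients table sits on the "many" side of a relationship to the Countries table (joined on CountryID, as described further on) and that Countries has a CountryName column. That column name is hypothetical and used only for illustration. A calculated column in Clients could then read:

```dax
CountryName =
    -- Calculated column in the Clients table (the "many" side).
    -- Countries[CountryName] is an assumed column name.
    RELATED ( Countries[CountryName] )
```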
Safe Division
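DIVIDE() is the usual way to divide safely in DAX: it returns BLANK (or an optional alternate result) instead of an error when the denominator is zero or blank. A minimal sketch, assuming a calculated column in InvoiceLines and assuming that [Gross Margin] and [SalePrice] both live in that table (MarginRatio is a hypothetical name):

```dax
MarginRatio =
    -- The third argument (0) is the optional alternate result,
    -- returned when [SalePrice] is zero or blank.
    DIVIDE ( InvoiceLines[Gross Margin], InvoiceLines[SalePrice], 0 )
```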
COUNTROWS()
RELATEDTABLE()
This performs a count of clients in each country in the Countries table by looking up the Clients table.
In this case the lookup is again from the one side to the many side but will work using
RELATEDTABLE instead of RELATED where the Clients table is related to the Countries table by
the CountryID field.
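A sketch of the kind of formula described here, written as a calculated column in the Countries table (the column name ClientsPerCountry is an assumption):

```dax
ClientsPerCountry =
    -- Calculated column in the Countries table (the "one" side).
    -- RELATEDTABLE returns the Clients rows related to the current country,
    -- and COUNTROWS counts them.
    COUNTROWS ( RELATEDTABLE ( Clients ) )
```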
Flagging Data
IsPostCode = IF(ISBLANK([PostCode]),"NoPostCode","HasPostCode")
Note: The ISBLANK function returns TRUE only for genuinely blank cells. A cell that contains the
literal text "NULL" is not treated as blank.
Mileage Range = IF([Mileage] <= 50000, "Low", IF([Mileage] < 100000, "Medium", "High"))
Complex Logic
Here the OR statement is the second argument of the AND statement. The RELATED functions need
to be closed within the OR function. “Special” is the output of the IF function if both the AND and
the OR functions evaluate to TRUE. “Normal” is the output otherwise.
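The nesting described here can be sketched as follows. The column names and comparison values are assumptions chosen only to show the structure, and the sketch presumes that InvoiceLines reaches Clients through a chain of many-to-one relationships:

```dax
SaleCategory =
    -- Hypothetical calculated column in InvoiceLines.
    IF (
        AND (
            InvoiceLines[SalePrice] > 50000,                -- first argument of AND
            OR (                                            -- second argument of AND
                RELATED ( Clients[ClientType] ) = "Dealer",
                RELATED ( Clients[Town] ) = "London"
            )                                               -- each RELATED is closed inside OR
        ),
        "Special",                                          -- AND and OR both TRUE
        "Normal"                                            -- otherwise
    )
```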
A measure is attached to a table so that it appears as a field. However, it does not have to use any of
the fields in the table that “hosts” it. Measures can be applied to different tables. You can also create
an otherwise empty table just to serve as a container for measures.
Drag the measure onto the table/matrix/visual in report view to display the result.
Basic Measures
It is a common convention in written DAX to use := instead of = when defining a measure, so that
measures are easy to tell apart from calculated columns.
TotalSales := SUM(Stock[CostPrice])
Average Cost Price := AVERAGE(Stock[CostPrice])
Maximum Sale Price := MAX(InvoiceLines[SalePrice])
Minimum Sale Price := MIN(InvoiceLines[SalePrice])
Multiple Measures
RatioNetMargin := SUM([Gross Margin])/SUM([SalePrice])
Simple Filters
The CALCULATE() function lets you apply a range of filters to a measure. CALCULATE nearly
always involves some kind of filter operation and, together with its table counterpart
CALCULATETABLE, is the only function that can alter the filter context (see Evaluation Context
further on).
Text Filters
DealerSales := CALCULATE(SUM(InvoiceLines[SalePrice]), Clients[ClientType]="Dealer")
The first parameter is the expression to evaluate, in this case a SUM.
The second parameter is a filter that forces the engine to show only a subset of the data. In this case it
returns the sum of sales only when the client is a dealer.
The filter that is applied here comes from another column. Indeed, it comes from another table
altogether. When using the CALCULATE() function, you can use just about any column (either an
original data column or a calculated column) as the source for a filter.
Numeric Filters
LowPriceSales := CALCULATE(
SUM(InvoiceLines[SalePrice]),
InvoiceLines[SalePrice] < 50000
)
MakePercentage := DIVIDE(
    SUM(InvoiceLines[SalePrice]), -- close SUM
    CALCULATE(
        SUM(InvoiceLines[SalePrice]), -- close SUM
        ALL(Stock[Make]) -- close ALL
    ) -- close CALCULATE
) -- close DIVIDE
Note: To format a measure click on the measure in the Fields pane to enable the Measure Tools
ribbon. This can be done in Report view and Table view.
The ALL() function removes any filters currently being applied to the specified fields. So in the
example above the [Make] field is not filtered.
Removing Multiple Filter Elements
MakeAndColorPercentage := DIVIDE(
    SUM(InvoiceLines[SalePrice]), -- close SUM
    CALCULATE(
        SUM(InvoiceLines[SalePrice]), -- close SUM
        ALL(Stock[Make]), -- close ALL
        ALL(Colors[Color]) -- close ALL
    ) -- close CALCULATE
) -- close DIVIDE
This piece of DAX is simply saying, “Don’t apply any make or color filters when calculating.”
Note: CALCULATE does not require RELATED or RELATEDTABLE functions.
Visual Totals
Use ALLSELECTED() to apply any filters that have been added either at the report, page or
visualization level, or as slicers or cross-filters from other visuals. This is especially useful when
displaying tables and matrices with subtotals and grand totals where you want percentage totals to
reflect the figures shown and not include any records that have been removed by the filter.
SalesPercentage = DIVIDE(
SUM(InvoiceLines[SalePrice]),
CALCULATE(
SUM(InvoiceLines[SalePrice]),
ALLSELECTED()
)
)
ALLEXCEPT() removes filters from all the elements in a calculation except those specified.
AllButMakePercentage = DIVIDE(
SUM( InvoiceLines[SalePrice] ),
CALCULATE(
SUM(InvoiceLines[SalePrice]),
ALLEXCEPT (Stock, Stock[Make])
)
)
If you use this measure in a matrix where Make is the leftmost column, you can then add subgroups
using any other field. In this example, other filters (color here) are applied, but not make. So you
are displaying the percentage for each color compared to the aggregate total for the make.
Filtering on Measures
CALCULATE() must use columns or calculated columns as part of a comparison; it cannot use
measures there. FILTER() can use measures, but must use an iterator function such as SUMX() rather
than a simple aggregation to produce a correct result.
Note: Measures are only calculated at query time, when a visual needs them, and take up very little
space, whereas calculated columns are computed at refresh time, stored in the table and can consume
a lot of memory in large datasets.
HighNetMarginSales = CALCULATE(
SUM(InvoiceLines[SalePrice]),
FILTER(
InvoiceLines, [RatioNetMargin]>0.5
)
)
Here the FILTER function uses the measure RatioNetMargin, defined earlier as
SUM([Gross Margin]) / SUM([SalePrice]).
Displaying Rank
SalesRankByMake = RANKX(
    ALL(Stock[Make]), -- close ALL
    SUMX(
        RELATEDTABLE(InvoiceLines), -- close RELATEDTABLE
        [SalePrice]) -- close SUMX
) -- close RANKX
DAX 101 (Microsoft Learn Video – Working with Contoso Sample)
Some calculations can be expressed both with calculated columns and with measures but need
different DAX expressions for each.
Calculated column:
Measure:
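A sketch of the difference, assuming a Sales table with [Sales Amount] and [Total Cost] columns ([Total Cost] is a hypothetical name). The calculated column works row by row, whereas the measure has no current row and must aggregate each column:

```dax
Sales[Gross Margin] =
    -- Calculated column: evaluated for each row of Sales,
    -- so columns can be referenced directly.
    Sales[Sales Amount] - Sales[Total Cost]

Gross Margin :=
    -- Measure: there is no current row, so each column is aggregated first.
    SUM ( Sales[Sales Amount] ) - SUM ( Sales[Total Cost] )
```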
Basic aggregation functions such as SUM, AVERAGE, COUNT, MAX and MIN can only aggregate
on one column.
SUM(Orders[Price]) will work
SUM(Orders[Price] * Orders[Quantity]) will not
Iterators such as SUMX, AVERAGEX, MAXX and MINX will iterate over the table and evaluate
the expression for each row.
These functions always receive two parameters:
• The table to iterate
• The formula to evaluate for each row
Example:
Total Sales := SUMX(
Sales,
Sales[Net Price] * Sales[Quantity]
)
Counting Values
COUNTA()          Counts anything but blanks
COUNT()           Use only for numeric columns
COUNTBLANK()      Counts blanks
COUNTROWS()       Counts rows in a table, possibly with a filter condition or for a
                  particular category in a related table (for example, the number of
                  products sold in a particular Product Category)
DISTINCTCOUNT()   Counts the distinct values in a column
Boolean Logic
AND can be expressed with && (allowing multiple criteria)
OR can be expressed with || (allowing multiple criteria)
NOT can be expressed with !
Using Variables
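Variables are declared with VAR and the final result is given after RETURN. A minimal sketch, assuming the Sales table used elsewhere in these notes (the measure and variable names are hypothetical):

```dax
Average Sale Value :=
    -- Each variable is evaluated at most once, in the context where it is defined,
    -- and can then be reused by name after RETURN.
    VAR TotalAmount = SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
    VAR NumberOfSales = COUNTROWS ( Sales )
    RETURN
        DIVIDE ( TotalAmount, NumberOfSales )
```

Naming intermediate results this way tends to make a formula easier to read and avoids repeating the same sub-expression.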
Relational Functions
RELATED(): follows relationships defined in the data model and returns the value of a column in the
related table.
TotalSales := SUMX(Sales, Sales[Quantity] * RELATED( Product[Unit Price] ))
RELATED() works through any chain of relationships from the Many side to the One side. For sales
there is one product for many sales. The function will only work where the relationship from the table
to be iterated to the lookup table is Many to One.
To get the number of sales per product and produce a calculated column in the product table requires
the RELATEDTABLE() function, which will work where the relationship is One to Many.
NumberOfSales = COUNTROWS(RELATEDTABLE(Sales))
To get the total sales by product:
SalesOfProduct = SUMX(RELATEDTABLE(Sales), Sales[Quantity] * Sales[Unit Price])
To get instances of Product subcategories in Products table:
CountOfProducts = COUNTROWS(RELATEDTABLE('Product'))
NumberOfProducts = COUNTX(RELATEDTABLE('Product'), 'Product'[ProductSubcategoryKey])
These formulas will return the same result as the Product Subcategory table is related to the Products
table on the [ProductSubcategoryKey] field.
SalesOfPopularProducts :=
SUMX(
    FILTER(
        'Product Subcategory',
        COUNTROWS(
            RELATEDTABLE(Product)
        ) > 100
    ),
    SUMX(
        RELATEDTABLE(Sales),
        Sales[Sales Amount]
    )
)
This formula filters the Product Subcategory table, keeping only the subcategories that appear more
than 100 times in the Products table, then computes the total sales of each of these subcategories.
Unlike SQL, DAX does not require relationships to be specified in the code.
Table Functions
Basic functions that work on all tables:
FILTER
ALL
VALUES
DISTINCT
RELATEDTABLE
• Their result is often used in other functions
• They can be combined together to form complex expressions
SumOfMultipleSales :=
SUMX(
FILTER(
Sales,
Sales[Quantity] > 1
),
Sales[Quantity] * Sales[Net Price]
)
CountOfMultipleSales :=
COUNTX(
    FILTER(
        Sales,
        Sales[Quantity] > 1
    ),
    Sales[Quantity]
)
SalesAmountMultipleItems :=
VAR
MultipleItemSales = FILTER ( Sales, Sales[Quantity] > 1 )
RETURN
SUMX (
MultipleItemSales,
Sales[Quantity] * Sales[Unit Price]
)
Removing Filters with ALL
SUMX(
ALL(Orders),
Orders[Quantity] * Orders[Price]
)
ALL(Customers[Customer Name])
Mixing Filters
InternetSales :=
SUMX(
FILTER(
ALL(Orders),
Orders[Channel] = "Internet"
),
Orders[Amount]
)
Using DISTINCT()
NumOfProducts :=
COUNTROWS(
    DISTINCT(Product[ProductCode])
)
DISTINCT() is similar to ALL(), but whereas ALL() ignores any filters, DISTINCT() obeys them.
Using VALUES()
Returns the distinct values of a column that are visible within the current context, together with a
blank row for any unmatched rows in related tables.
NumOfProducts :=
COUNTROWS(
VALUES(Product[ProductID])
)
Power BI inserts a BLANK in the products table where there is an orphaned ProductID in the Sales
table.
The ALL() functions do not apply any filters so return the same values for each row.
Calculated Tables
FILTER returns a TABLE that represents a subset of another table or expression.
Calculated tables can be created with New table on the Modeling or Table tools ribbon.
Red Products = FILTER('Product', 'Product'[Color] = "Red")
ADDCOLUMNS()
BrandAndColorSales = ADDCOLUMNS( ALL ( 'Product'[Color], 'Product'[Brand] ),
"Total Sales", [Sales Amount] )
This formula first builds a table listing every existing combination of Color and Brand with ALL, the
inner function, which ignores all filters.
ADDCOLUMNS, the outer function, then adds a new column called "Total Sales", which computes the
total sales for each brand and color from the related Sales table using the Sales Amount measure.
Sales and Product have a many-to-one relationship on the ProductKey field.
Sales[Adjusted Cost] =
IF(
RELATED('Product Category'[Category]) = "Cell phones",
Sales[Unit Cost] * 0.95,
Sales[Unit Cost]
)
The RELATED function follows the chain of relationships defined in the data model to determine if
the sale is a Cell phone.
Relationship Chain
Understanding FILTER
FILTER receives a table and a logical condition as parameters. As a result, FILTER returns all the
rows satisfying the condition. FILTER is both a table function and an iterator at the same time. In
order to return a result, it scans the table evaluating the condition on a row-by-row basis.
FabrikamProducts =
FILTER (
'Product',
'Product'[Brand] = "Fabrikam"
)
RedSales :=
SUMX (
FILTER (
Sales,
RELATED ( 'Product'[Color] ) = "Red"
),
Sales[Quantity] * Sales[Net Price]
)
Nesting Filters
In general, nesting two filters produces the same result as combining the conditions of the two
FILTER functions with an AND function. The following queries produce the same result.
FabrikamHighMarginProducts =
FILTER (
FILTER (
'Product',
'Product'[Brand] = "Fabrikam"
),
'Product'[Unit Price] > 'Product'[Unit Cost] * 3
)
FabrikamHighMarginProducts =
FILTER (
'Product',
AND (
'Product'[Brand] = "Fabrikam",
'Product'[Unit Price] > 'Product'[Unit Cost] * 3
)
)
If one condition is more selective than the other, applying the most selective condition first by
using a nested FILTER function is considered best practice.
If there are many products with the Fabrikam brand, but few products priced at three times their cost,
then the following query applies the filter over Unit Price and Unit Cost in the innermost FILTER. By
doing so, the formula applies the most restrictive filter first, reducing the number of iterations needed
to check for the brand.
FabrikamHighMarginProducts =
FILTER (
FILTER (
'Product',
'Product'[Unit Price] > 'Product'[Unit Cost] * 3
),
'Product'[Brand] = "Fabrikam"
)
Sales Amount :=
SUMX (
Sales,
Sales[Quantity] * Sales[Net Price]
)
The parameter of ALL cannot be a table expression. It needs to be either a table name or a list of
column names. If we use a column in ALL instead of a table it will return all the distinct values of the
column:
Category
Audio
Cameras and camcorders
Cell phones
Computers
Games and Toys
Home Appliances
Music, Movies and Audio Books
TV and Video
Throughout all its variations, ALL ignores any existing filter in order to produce a result. We can use
ALL as an argument of an iteration function, such as SUMX and FILTER, or as a filter argument in a
CALCULATE function.
Task:
Produce a dashboard that shows the category and subcategory of products that sold more than twice
the average sales amount.
To produce this report, we need to first compute the average sales per subcategory and then, once
these values have been determined, retrieve from the list of subcategories the ones that have a sales
amount larger than twice that average.
BestCategories =
VAR Subcategories =
ALL ( 'Product'[Category], 'Product'[Subcategory] )
VAR AverageSales =
AVERAGEX (
Subcategories,
SUMX ( RELATEDTABLE ( Sales ), Sales[Quantity] * Sales[Net Price] )
)
VAR TopCategories =
FILTER (
Subcategories,
VAR SalesOfCategory =
SUMX ( RELATEDTABLE ( Sales ), Sales[Quantity] * Sales[Net Price] )
RETURN
SalesOfCategory >= AverageSales * 2
)
RETURN
TopCategories
ALL always returns all the distinct values of a column. On the other hand, VALUES returns only the
distinct visible values.
NumOfAllColors counts all the colors of the Product table, whereas NumOfColors counts only the
ones that, given the filter in the report, are visible.
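The two measures compared here could be defined along these lines. This is a sketch that assumes the colors are stored in 'Product'[Color]:

```dax
NumOfAllColors :=
    -- ALL ignores any filter, so every color in the Product table is counted.
    COUNTROWS ( ALL ( 'Product'[Color] ) )

NumOfColors :=
    -- VALUES respects the current filter context, so only visible colors are counted.
    COUNTROWS ( VALUES ( 'Product'[Color] ) )
```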
Category                        NumOfColors   NumOfAllColors
Audio                           10            16
Cameras and camcorders          14            16
Cell phones                     8             16
Computers                       12            16
Games and Toys                  11            16
Home Appliances                 13            16
Music, Movies and Audio Books   8             16
TV and Video                    4             16
Total                           16            16
Category                    NumOfColors
Audio                       3
  Headphones                2
  MP&MP3                    1
  Recording Pen             1
Cameras and camcorders      3
  Camcorders                1
  Camera Accessories        1
  Digital Cameras           1
  SLR Cameras               3
Cell phones                 2
  Phone Accessories         1
Suppose we want to see the names of brands beside their number. Using VALUES this is only
possible in the special case where there is only one value for the brand. In that case it is possible to
return the result of VALUES and DAX automatically converts it to a scalar value.
To make sure there is only one brand we need to use an IF statement.
Brand Name :=
IF (
COUNTROWS ( VALUES ( Product[Brand] ) ) = 1,
VALUES ( Product[Brand] )
)
Where the Brand Name column contains a blank, this means that there are two or more different
brands.
There is a simpler DAX function to check whether a column has only one visible value.
Brand Name :=
IF (
HASONEVALUE ( 'Product'[Brand] ),
VALUES ( 'Product'[Brand] )
)
DAX also offers a function that automatically checks if a column contains a single value and, if so, it
returns the value as a scalar. In case there are multiple values, it is also possible to define a default
value to be returned.
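The function described here is SELECTEDVALUE. A minimal sketch, in which the default text "Multiple brands" is an assumption chosen for illustration:

```dax
Brand Name :=
    -- Returns the brand when exactly one brand is visible in the current context;
    -- otherwise returns the optional second argument (BLANK if it is omitted).
    SELECTEDVALUE ( 'Product'[Brand], "Multiple brands" )
```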
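To list all the visible brands instead of returning a single one, CONCATENATEX joins them into one
string:
Brand Name :=
CONCATENATEX (
    VALUES ( 'Product'[Brand] ),
    'Product'[Brand],
    ", "
)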
ALLSELECTED
ALLSELECTED is useful when retrieving the list of values of a table, or a column, as visible in the
current report and considering all and only the filters outside of the current visual.
When the denominator uses the ALL function, it always computes the grand total of all sales,
regardless of any filter.
If some categories are selected in the slicer, the grand total of the matrix therefore no longer
accounts for 100%.
When the denominator uses ALLSELECTED instead, the total is 100% again and the numbers reported
reflect the percentage against the visible total, not against the grand total of all sales.
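The two behaviours can be sketched as a pair of percentage measures. The measure names are hypothetical, and the sales amount is computed as Quantity * Net Price as elsewhere in these notes:

```dax
Sales Pct All :=
    -- Denominator ignores every filter: percentage of the grand total of all sales.
    DIVIDE (
        SUMX ( Sales, Sales[Quantity] * Sales[Net Price] ),
        SUMX ( ALL ( Sales ), Sales[Quantity] * Sales[Net Price] )
    )

Sales Pct Visible :=
    -- Denominator keeps the filters set outside the visual (for example a slicer):
    -- percentage of the visible total.
    DIVIDE (
        SUMX ( Sales, Sales[Quantity] * Sales[Net Price] ),
        SUMX ( ALLSELECTED ( Sales ), Sales[Quantity] * Sales[Net Price] )
    )
```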
The calculated column definition adds the Due Fiscal Year column to the Due Date table. The
following steps describe how Microsoft Power BI evaluates the calculated column formula:
1. The addition operator (+) is evaluated before the text concatenation operator (&).
2. The YEAR DAX function returns the whole number value of the due date year.
3. The IF DAX function returns the value 1 when the due date month number is 7-12 (July to
December); otherwise, it returns BLANK. (For example, because the Adventure Works
financial year is July-June, the last six months of the calendar year will use the next calendar
year as their financial year.)
4. The year value is added to the value that is returned by the IF function, which is the value one
or BLANK. If the value is BLANK, it's implicitly converted to zero (0) to allow the addition
to produce the fiscal year value.
5. The literal text value "FY" is concatenated with the fiscal year value, which is implicitly
converted to text.
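A calculated column consistent with these steps might look like the following. This is a sketch that assumes the date is stored in 'Due Date'[Due Date]:

```dax
Due Fiscal Year =
    -- The addition is evaluated before the concatenation, so the year (plus 1 for
    -- July to December, plus BLANK treated as 0 otherwise) is computed first
    -- and then appended to "FY".
    "FY"
        & YEAR ( 'Due Date'[Due Date] )
            + IF ( MONTH ( 'Due Date'[Due Date] ) > 6, 1 )
```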
Top10ProductsAll =
SUMX (
TOPN ( 10, 'Product', Product[ProductIDSales]), [TotalSales] )
This will return the sales for the top 10 products sold ACROSS all regions, i.e. sales of the same 10
products in each region.
Top10Region =
VAR
RankingContext = VALUES(Product[Name])
RETURN
CALCULATE( [TotalSales],
TOPN(10, ALL(Product[Name]), [TotalSales]),
RankingContext )
Applied to a table or matrix broken down by region, this measure returns the sales amount for the
top 10 products sold in EACH region, so the products may be different in each region. However, the
grand total of the matrix will be wrong: it shows the total for the top 10 products across all regions
(the total of the first measure). This is because, at the total level, the measure is applied to the
whole product dataset.
Measure to fix this:
Top10Region =
VAR
RankingContext = VALUES(Product[Name])
RETURN
SUMX(
VALUES( SalesTerritory[CountryRegionCode] ),
CALCULATE([TotalSales],
TOPN(10, ALL(Product[Name]), [TotalSales]),
RankingContext )
)
Create Measure:
Top10Quantity =
VAR
RankingContext = VALUES(Product[Name])
RETURN
SUMX(
VALUES( SalesTerritory[CountryRegionCode] ),
CALCULATE([QuantityMeasure],
TOPN(10, ALL(Product[Name]), [TotalSales]),
RankingContext )
)
QuantityTest =
CALCULATE(
[QuantityMeasure],
'Product'[ProductID] = 966,
'SalesTerritory'[CountryRegionCode] = "FR",
USERELATIONSHIP(SalesOrderHeader[TerritoryID], SalesTerritory[TerritoryID])
)
The measure [QuantityMeasure] is used in CALCULATE with filters to find the total order
quantity of the specified product in France.
USERELATIONSHIP is not strictly necessary here, as the SalesOrderHeader table is also related to the
Customer table on CustomerID, and the Customer table is in turn related to the SalesTerritory table
on TerritoryID. This relationship is active. The chain of relationships runs from the
SalesOrderDetail table through to SalesTerritory via SalesOrderHeader and Customer.