Power BI Week 2

Power BI Live 13 focuses on loading and transforming data. It discusses data profiling, basic transformations like removing columns and changing data types, and best practices for data transformation in Power Query. It also covers topics like tables and relationships, normalization and de-normalization, and dimensional modeling concepts such as dimensions, facts, and granularity.

Uploaded by

star lord

Power BI Live 13

Week 2
Loading and Transforming Data

Data Profiling (Column Quality)
Data Profiling (Column Distribution)
Data Profiling (Column Profile)
Basic Transformations

Column specific: Remove columns, sort values, duplicate, move, change data types

Keep or Remove rows

Text specific: Split columns, format, merge and extract

Number specific: Statistics, Scientific, Information

Date specific
Basic Transformations

Add Index column

Adding conditional columns

Adding columns from examples

Adding custom columns

Append and Merge queries


Best Practices for Data Transformation in Power Query

1. Create Groups for your Queries

2. Naming convention for Tables and all Columns

3. Make sure all Data Types are correct

4. Delete irrelevant columns from Tables

5. Create new queries as per your dimensional modeling requirements

6. Perform row and column transformations

7. Create a Date Table


Tables and Relationships (Entity Relationship)

Each business has different entities. Entities have different characteristics, called attributes.

Example: Products, Customers, Regions

Normally, 1 entity = 1 table, and the entity's attributes become the columns of that table

Tables are related to each other through relationships


Relationship and cardinality
Any association between two entity types is called a relationship.
There are 3 types of relationships:
One-to-One Relationship

One-to-Many Relationship or Many-to-One Relationship (example from the slide: Department and Employees)

Many-to-Many Relationship
OLTP vs OLAP
Normalization and De-normalization

Normalization and de-normalization are two techniques to transform and store data in order to create a model

Normalization is the process of organizing the columns (attributes) and tables of a database to reduce data redundancy and improve data integrity

De-normalization is the opposite of normalization, that is, increasing data redundancy, with the goal of improving the understanding of the model
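
To make the trade-off concrete, here is a minimal Python sketch. The table and column names are hypothetical and not taken from the course material. The normalized form stores each product once and links sales to it by key; the de-normalized form repeats the product attributes on every sales row.

```python
# Normalized: product attributes are stored once; sales rows carry only the key.
products = {
    1: {"name": "Bike", "category": "Outdoor"},
    2: {"name": "Helmet", "category": "Safety"},
}
sales = [
    {"product_id": 1, "qty": 2},
    {"product_id": 1, "qty": 1},
    {"product_id": 2, "qty": 5},
]


def denormalize(sales_rows, product_lookup):
    """Copy product attributes onto every sales row (this adds redundancy)."""
    return [{**row, **product_lookup[row["product_id"]]} for row in sales_rows]


flat = denormalize(sales, products)
for row in flat:
    print(row)
```

In the flat table "Bike" and "Outdoor" appear twice, which is the redundancy that normalization removes and de-normalization accepts for the sake of a simpler model.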
Primary key

Foreign keys
Types of Joins
Left Outer Join: All from A and matching from B
Right Outer Join: All from B and matching from A
Full Outer Join: All from A and B
Inner Join: Only matching rows from A and B
[Figures: for each join type, Table-A and Table-B shown next to the combined result, Tables A + B]
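
The four join types can be sketched in plain Python. This is an illustration only, not how Power Query implements Merge; the `join` helper and the sample tables are assumptions made for the example.

```python
def join(table_a, table_b, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right or full."""
    cols_a = {c for row in table_a for c in row}
    cols_b = {c for row in table_b for c in row}
    out = []
    matched_b = set()
    for a in table_a:
        hits = [i for i, b in enumerate(table_b) if b[key] == a[key]]
        matched_b.update(hits)
        for i in hits:
            out.append({**a, **table_b[i]})
        if not hits and how in ("left", "full"):
            # Keep the unmatched row from A, padding B's columns with None.
            out.append({**dict.fromkeys(cols_b), **a})
    if how in ("right", "full"):
        for i, b in enumerate(table_b):
            if i not in matched_b:
                # Keep the unmatched row from B, padding A's columns with None.
                out.append({**dict.fromkeys(cols_a), **b})
    return out


table_a = [{"id": 1, "customer": "Ann"}, {"id": 2, "customer": "Bob"}]
table_b = [{"id": 2, "city": "Oslo"}, {"id": 3, "city": "Rome"}]

for how in ("left", "right", "full", "inner"):
    print(how, len(join(table_a, table_b, "id", how)))
```

Only id 2 exists in both tables, so the inner join returns one row, the left and right joins return two rows each, and the full outer join returns three.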


Dimensional Modeling for BI
Dimensions, Dimension Tables or Lookup Tables are entities with their attributes. Each record is uniquely identifiable through a primary key.

Facts or Fact Tables are a combination of foreign keys + quantitative values.

[Figure: a Fact table in the center linked to four surrounding Dimension tables]
Fact Constellation Or Galaxy Schema
What is granularity?

Granularity is the level of detail of your table

Higher granularity: more detailed information, and an increase in the number of rows and columns

Lower granularity: less detail, fewer rows
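
As a small illustration of grain, the Python sketch below rolls hypothetical daily sales rows up to one row per month and product. The data and the `to_monthly` helper are assumptions made for the example.

```python
from collections import defaultdict

# Higher granularity: one row per sale per day (hypothetical sample data).
daily_sales = [
    {"date": "2024-01-05", "product": "Bike", "amount": 100},
    {"date": "2024-01-20", "product": "Bike", "amount": 150},
    {"date": "2024-02-03", "product": "Bike", "amount": 200},
    {"date": "2024-02-03", "product": "Helmet", "amount": 30},
]


def to_monthly(rows):
    """Roll daily rows up to one row per (month, product)."""
    totals = defaultdict(int)
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM"
        totals[(month, row["product"])] += row["amount"]
    return [
        {"month": m, "product": p, "amount": amt}
        for (m, p), amt in sorted(totals.items())
    ]


monthly_sales = to_monthly(daily_sales)
print(len(daily_sales), "daily rows ->", len(monthly_sales), "monthly rows")
```

The monthly table has fewer rows, but the individual sale dates can no longer be recovered from it, which is why the grain has to be chosen before the fact table is built.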


Important concepts for Dimensional Modeling

• Types of Data in BI Reporting
• Tables, Relationships and Cardinality
• Normalization and De-normalization
• Primary Key
• Foreign Key
• Table joins
• Dimension Tables / Look-up Tables / Dimensions
• Fact Tables / Facts
• Grain
• Star schema / Snowflake schema / Galaxy schema
Modeling Design Steps
1. Identify the Business Process
• Source of data and all fields in data required for the model based on business questions

2. Identify the Grain
• What does 1 row in the fact table represent or mean?

3. Identify the Dimensions

4. Identify the Facts


Difference between Duplicate and Reference
• Duplicate
▪ Duplicate gives you an exact copy of the query with all of its steps.
▪ Duplicate is a good option when you want the two copies to be isolated from each other.
▪ Add as New Query is a Duplicate action.
• Reference
▪ Reference creates a new query that points to the output of the original query instead of copying its steps.
▪ Reference is a good option when you create different branches from one original query.
▪ Merging or appending a reference back into the original query it depends on creates a circular reference and is not possible.
▪ Append Queries as New and Merge Queries as New are Reference actions.
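
The difference can be modeled with a toy Python class. This is an analogy under stated assumptions, not Power Query's implementation: `Query`, `duplicate` and `reference` are hypothetical names, and each step is simply a function applied to a list of rows.

```python
class Query:
    """A toy stand-in for a Power Query query: a source plus a list of steps."""

    def __init__(self, source, steps=None):
        self.source = source  # raw rows, or another Query
        self.steps = list(steps or [])

    def evaluate(self):
        if isinstance(self.source, Query):
            rows = self.source.evaluate()
        else:
            rows = list(self.source)
        for step in self.steps:
            rows = step(rows)
        return rows

    def duplicate(self):
        # Independent copy: same source, its own copy of every step.
        return Query(self.source, self.steps)

    def reference(self):
        # New query with no steps of its own; it starts from this query's output.
        return Query(self)


original = Query([1, 2, 3, 4], steps=[lambda rows: [r for r in rows if r > 1]])
dup = original.duplicate()
ref = original.reference()

# Add a new step to the original after the copies were made.
original.steps.append(lambda rows: [r * 10 for r in rows])

print(original.evaluate())  # [20, 30, 40]
print(dup.evaluate())       # [2, 3, 4]: the duplicate is isolated from the change
print(ref.evaluate())       # [20, 30, 40]: the reference follows the original
```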
M-code
Relationships, Cardinality and cross filter direction in Power BI
• Relationship and Cardinality
▪ One-to-One
o Describes a relationship in which only one instance of a value is common between two tables.
o Requires unique values in both tables.
o Is not recommended because this relationship stores redundant information and suggests that the model is not designed correctly. It is better practice to combine the tables.
o An example of a one-to-one relationship would be if you had products and product IDs in two different tables.
o Creating a one-to-one relationship is redundant and these two tables should be combined.
▪ One-to-Many / Many-to-One
o Describes a relationship in which you have many instances of a value in one column that are related to only one unique corresponding instance in another column.
o Describes the directionality between fact and dimension tables.
o Is the most common type of directionality and is the Power BI default when you are automatically creating relationships.
▪ Many-to-Many
o Describes a relationship where many values are in common between two tables.
o Does not require unique values in either table in a relationship.
o Is not recommended; a lack of unique values introduces ambiguity and your users might not know which column of values is referring to what.
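
Since cardinality comes down to whether each linking column holds unique values, it can be checked with a few lines of Python. The sketch below is illustrative; `infer_cardinality` is a hypothetical helper and not a Power BI function.

```python
def infer_cardinality(left_keys, right_keys):
    """Classify a relationship by whether each key column holds unique values."""
    left_unique = len(set(left_keys)) == len(left_keys)
    right_unique = len(set(right_keys)) == len(right_keys)
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


product_ids = [1, 2, 3]              # dimension: one row per product
sales_product_ids = [1, 1, 2, 3, 3]  # fact: many sales per product

print(infer_cardinality(product_ids, sales_product_ids))  # one-to-many
```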
Relationships, Cardinality and cross filter direction in Power BI
• Cross Filter Direction
• Single
▪ Only one table in a relationship can be used to filter the data.
▪ For a one-to-many or many-to-one relationship, the cross-filter direction will be from the "one" side, meaning that a filter on the table with unique values propagates to the table with many values.
• Both
▪ Either table in a relationship can be used to filter the other. For instance, a dimension table can be filtered through the fact table, and the fact table can be filtered through the dimension table.
▪ You should not enable bi-directional cross-filtering relationships unless you fully understand the ramifications of doing so. Enabling it can lead to ambiguity, over-sampling, unexpected results, and potential performance degradation.
▪ Used in scenarios where one dimension table needs to filter another dimension table (dimension-to-dimension analysis).
▪ For one-to-one relationships, bi-directional cross-filtering is the only available option.
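
The two directions can be pictured with a small Python model. It is a simplified sketch with hypothetical tables and helper names; the real engine propagates filters through the relationship automatically.

```python
products = [  # "one" side: unique product_id
    {"product_id": 1, "name": "Bike"},
    {"product_id": 2, "name": "Helmet"},
    {"product_id": 3, "name": "Lock"},
]
sales = [  # "many" side
    {"product_id": 1, "amount": 100},
    {"product_id": 1, "amount": 150},
    {"product_id": 2, "amount": 30},
]


def filter_sales_by_products(selected_products):
    """Single direction: a filter on the dimension reaches the fact table."""
    keys = {p["product_id"] for p in selected_products}
    return [s for s in sales if s["product_id"] in keys]


def filter_products_by_sales(selected_sales):
    """Extra path enabled by 'Both': a filter on the fact reaches the dimension."""
    keys = {s["product_id"] for s in selected_sales}
    return [p for p in products if p["product_id"] in keys]


bikes = [p for p in products if p["name"] == "Bike"]
print(filter_sales_by_products(bikes))

big_sales = [s for s in sales if s["amount"] >= 100]
print(filter_products_by_sales(big_sales))
```

With 'Single', only the first function exists, so selecting "Bike" narrows the sales rows. 'Both' adds the second path, which is what lets a filter on sales narrow the product list.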
Data Modeling: Best Practices
• The best method to design a BI data model is the Star Schema
• The best way to visualize a star schema data model is the waterfall layout
• In the waterfall layout, dimensions sit on top and the fact table sits below the dimensions
• Relationships flow from dimensions to facts and show the direction of filtering
• Use One-to-Many / Many-to-One relationships as much as possible, with filter direction set to ‘Single’
• Avoid bi-directional filter direction in all scenarios unless there is no other option, as in the case of many-to-many relationships
• Only one relationship between two tables can be active. Use inactive relationships along with the USERELATIONSHIP() DAX function to perform calculations, or opt for role-playing dimensions
• Ensure linking columns between tables have the same data type. Keeping the same column names in both tables also allows for a consistent understanding
• Hide all linking columns in the report from the end users, as these offer no understanding for the end user and can be a source of confusion
Active vs Inactive Relationships
Filter Flow
DAX Programming
• Calculated Columns
• Measures
• Explicit and Implicit Measures
• Evaluation Context
• Filter Context (Aggregation)
• Row Context (Iteration)
• Calculated Tables
Introduction to DAX

The DAX Language

Language of Power Pivot, Power BI, SSAS Tabular

DAX is simple, but it is not easy

No concept of cells addressed by row and column as in Excel; DAX works with tables and columns

Designed for data models and business calculations


Introduction to DAX

DAX is a functional language; the execution flows with function calls

If it is not formatted, it is not DAX.

Code formatting is of paramount importance in DAX.


DAX Formulas
• Calculated Columns
• A column you add to an existing table
• DAX formula gets calculated for each row of the table
• It is calculated immediately and the result gets stored in memory
• It gets recalculated on refresh or when the table gets released from memory (closing and opening of Power BI)
• Not accessible in Power Query
• Measures
• Needs an aggregator or iterator function to work
• It is not calculated immediately
• It needs context to calculate results
• Same formula gives you different results depending on what filters are applied
• Calculated Tables
• Create new tables in your model using DAX
• Allows you to do calculations on different granularities or filtered tables
• They are recalculated if any of the tables they pull data from are refreshed or get updated
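
The three kinds of formula can be compared with a Python analogy. This is not DAX and the names are hypothetical; it only mirrors when each result is computed and what it depends on.

```python
sales = [
    {"region": "East", "qty": 2, "price": 10.0},
    {"region": "East", "qty": 1, "price": 20.0},
    {"region": "West", "qty": 5, "price": 4.0},
]

# Calculated column: evaluated once per row and stored with the table.
for row in sales:
    row["line_total"] = row["qty"] * row["price"]


def total_sales(rows, region=None):
    """Measure: evaluated on demand against whatever rows the filter leaves."""
    visible = [r for r in rows if region is None or r["region"] == region]
    return sum(r["line_total"] for r in visible)


# Calculated table: a new table derived from existing ones, here at region grain.
sales_by_region = [
    {"region": reg, "total": total_sales(sales, reg)}
    for reg in sorted({r["region"] for r in sales})
]

print(total_sales(sales))          # no filter applied
print(total_sales(sales, "East"))  # same formula, different filter
print(sales_by_region)
```

The stored `line_total` values never change with the filter, while `total_sales` returns a different number for each filter, which matches the behavior of calculated columns and measures described above.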
What is Context?

It is the environment in which a calculation is being completed

Context can come from a variety of different locations, such as filters and slicers

You can even change the context within the DAX formula

Calculating simple measures in DAX is very easy as the ‘context’ automatically does most of the work
Evaluation, Filter and Row Context
• Evaluation context is the filter context plus the row context
• Filter context
• This includes everything applied on the canvas + filters coming from the relationships within the model
• Filter context also includes the filters set within the formula, and aggregation functions are evaluated over it
• Filter context is applied automatically and propagates through the available directions in the relationship
• Row Context
• Row context is simply ‘row-by-row’ computation
• Calculated Columns do calculation using the Row Context
• If a filter context is needed inside a calculated column, CALCULATE can turn the current row context into an equivalent filter context (context transition).
• In measures, the row context comes into play through the Iterator Functions
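
A rough Python sketch of the two contexts follows. It is an analogy under stated assumptions, not DAX: `apply_filter_context` and `sumx` are hypothetical helpers, with `sumx` named after the DAX iterator it imitates.

```python
sales = [
    {"year": 2023, "qty": 2, "price": 10.0},
    {"year": 2023, "qty": 1, "price": 20.0},
    {"year": 2024, "qty": 5, "price": 4.0},
]


def apply_filter_context(rows, **filters):
    """Filter context: keep only the rows that match every active filter."""
    return [r for r in rows if all(r[col] == val for col, val in filters.items())]


def sumx(rows, expression):
    """Row context: evaluate `expression` once per row, then add the results."""
    return sum(expression(row) for row in rows)


def revenue(rows, **filters):
    visible = apply_filter_context(rows, **filters)        # filter context first
    return sumx(visible, lambda r: r["qty"] * r["price"])  # then row by row


print(revenue(sales))             # all rows are visible
print(revenue(sales, year=2023))  # a slicer-style filter narrows the rows
```

The filter decides which rows are visible, and the iterator then evaluates the expression for each visible row, so the same `revenue` definition returns different totals under different filters.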
DAX Functions
• Reference a column or a table
• A DAX function will return a value or a table
• Aggregation Functions
• Iteration Functions
• Relational Functions
• Table Functions
• Time Intelligence Functions