
Module 2 - Connecting & Shaping Data

The document outlines various data connectors available in Power BI, including flat files, databases, and online services. It details the Query Editor's functionalities for transforming and shaping data, including tools for editing tables, creating conditional columns, and merging queries. Additionally, it provides best practices for organizing data connections and managing refresh settings to optimize performance.

Uploaded by

Vishal Kapoor

CONNECTING & SHAPING DATA
TYPES OF DATA CONNECTORS

Power BI can connect to virtually any type of source data, including (but not limited to):
• Flat files & Folders (csv, text, xls, etc)
• Databases (SQL, Access, Oracle, IBM, Azure, etc)
• Online Services (SharePoint, GitHub, Dynamics 365, Google Analytics, Salesforce, Power BI Service, etc)
• Others (Web feeds, R scripts, Spark, Hadoop, etc)
THE QUERY EDITOR

Query Editing Tools (Table transformations, calculated columns, etc)

Formula Bar (this is “M” code)

Table Name & Properties

Query Pane

Applied Steps (like a macro)

*In older versions of Power BI, the Transform Data option may be named Edit Queries
QUERY EDITING TOOLS

The HOME tab includes general settings and common table transformation tools

The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc)

The ADD COLUMN tools create new columns (based on conditional rules, text operations, calculations, dates, etc)
BASIC TABLE TRANSFORMATIONS
• Sort values (A-Z, Low-High, etc.)
• Change data type (date, $, %, text, etc.)
• Promote header row
• Duplicate, move & rename columns
  Tip: Right-click the column header to access common tools
• Choose or remove columns
  Tip: use the “Remove Other Columns” option if you always want a specific set
• Keep or remove rows
  Tip: use the “Remove Duplicates” option to create a new lookup table from scratch
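
The “Remove Duplicates” tip can be pictured outside Power BI as keeping the first row for each key. The Python below is only an illustration of that logic, not Power BI code; the sample rows and the "ProductName" column are made up for the example.

```python
# Hypothetical transaction rows; the same product appears more than once
sales = [
    {"ProductKey": 310, "ProductName": "Road Bike", "OrderQuantity": 1},
    {"ProductKey": 480, "ProductName": "Helmet", "OrderQuantity": 2},
    {"ProductKey": 310, "ProductName": "Road Bike", "OrderQuantity": 3},
]

seen = set()
product_lookup = []
for row in sales:
    key = row["ProductKey"]
    if key not in seen:  # keep only the first row seen for each key
        seen.add(key)
        # keep only the descriptive columns, like "Remove Other Columns"
        product_lookup.append({"ProductKey": key, "ProductName": row["ProductName"]})
```

The result has one row per product, which is the shape a lookup table needs.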
TEXT-SPECIFIC TOOLS

• Split a text column based on either a specific delimiter or a number of characters
• Extract characters from a text column based on fixed lengths, first/last, ranges or delimiters
• Format a text column to upper, lower or proper case, or add a prefix or suffix
  Tip: Use “Trim” to eliminate leading & trailing spaces, or “Clean” to remove non-printable characters

Tip: Select two or more columns to merge (or concatenate) fields

HEY THIS IS IMPORTANT!
You can access many of these tools in both the “Transform” and “Add Column” menus; the difference is whether you want to add a new column or modify an existing one
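
As a rough illustration of what Split, Trim, Clean, Format and Merge do to a single text value, here is a small Python sketch. It is not Power BI code, the sample string is invented, and `str.isprintable` is only an approximation of what “Clean” treats as non-printable.

```python
def trim(text):
    # Like "Trim": drop leading and trailing whitespace only
    return text.strip()

def clean(text):
    # Like "Clean": drop non-printable characters (approximated with isprintable)
    return "".join(ch for ch in text if ch.isprintable())

raw = "  road bike\x07, red \n"

# Split on a delimiter, then clean and trim each piece
parts = [trim(clean(piece)) for piece in raw.split(",")]

proper = parts[0].title()    # Format: proper case
merged = " | ".join(parts)   # Merge columns with a separator
```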
NUMBER-SPECIFIC TOOLS

Statistics functions allow you to evaluate basic stats for the selected column (sum, min/max, average, count, countdistinct, etc)
Note: These tools return a SINGLE value, and are commonly used to explore a table rather than prepare it for loading

Standard, Scientific and Trigonometry tools allow you to apply standard operations (addition, multiplication, division, etc.) or more advanced calculations (power, logarithm, sine, tangent, etc) to each value in a column
Note: Unlike the Statistics options, these tools are applied to each individual row in the table

Information tools allow you to define binary flags (TRUE/FALSE or 1/0) to mark each row in a column as even, odd, positive or negative
DATE-SPECIFIC TOOLS

Date & Time tools are relatively straightforward, and include the following options:
• Age: Difference between the current time and the date in each row
• Date Only: Removes the time component of a date/time field
• Year/Month/Quarter/Week/Day: Extracts individual components from a date field (Time-specific options include Hour, Minute, Second, etc.)
• Earliest/Latest: Evaluates the earliest or latest date from a column as a single value (can only be accessed from the “Transform” menu)

Note: You will almost always want to perform these operations from the “Add Column” menu to build out new fields, rather than transforming an individual date/time column

PRO TIP:
Load up a table containing a single date column and use Date tools to build out an entire calendar table
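
A minimal Python sketch of the Age, Date Only and Earliest/Latest options, for illustration only. The timestamps are invented, and "now" is fixed to a constant so the example is reproducible; Power BI uses the actual current time.

```python
from datetime import datetime

stamps = [datetime(2016, 3, 14, 9, 30), datetime(2015, 11, 2, 17, 5)]
now = datetime(2017, 1, 1)  # assumed "current time" for the example

age = [now - s for s in stamps]         # Age: one duration per row
date_only = [s.date() for s in stamps]  # Date Only: time component removed
years = [s.year for s in stamps]        # Year: one component per row

# Earliest / Latest: the whole column collapses to a single value
earliest, latest = min(stamps), max(stamps)
```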
CREATING A BASIC CALENDAR TABLE

Use pre-defined Date options in the “Add Column” menu to quickly build out a calendar table from a list of dates
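
To make the idea concrete, the sketch below builds a tiny calendar table in Python: one row per date, with extra columns derived from the date. The function name and the column names are assumptions chosen for the example and are not Power BI output.

```python
from datetime import date, timedelta

def build_calendar(start, end):
    """Return one row per day from start to end, with derived date columns."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "Date": day,
            "Year": day.year,
            "Quarter": (day.month - 1) // 3 + 1,
            "Month": day.month,
            "Day of Week": day.isoweekday(),  # 1 = Monday ... 7 = Sunday
            "Start of Month": day.replace(day=1),
        })
        day += timedelta(days=1)
    return rows

calendar = build_calendar(date(2016, 12, 30), date(2017, 1, 2))
```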
ADDING INDEX COLUMNS

Index Columns contain a list of sequential values that can be used to identify each unique row in a table (typically starting from 0 or 1)

These columns are often used to create unique IDs that can be used to form relationships between tables (more on that later!)
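
In plain terms, an index column is just a running counter added to each row. A small Python illustration with invented rows, starting the index from 1:

```python
rows = [{"Product": "Helmet"}, {"Product": "Gloves"}, {"Product": "Helmet"}]

# Add a sequential "Index" value to every row, starting from 1
indexed = [{"Index": i, **row} for i, row in enumerate(rows, start=1)]
```

Even though "Helmet" appears twice, each row now has its own unique identifier.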
ADDING CONDITIONAL COLUMNS

Conditional Columns allow you to define new fields based on logical rules and conditions (IF/THEN statements)

In this case we’re creating a new conditional column called “QuantityType”, which depends on the values in the “OrderQuantity” column, as follows:
• If OrderQuantity = 1, QuantityType = “Single Item”
• If OrderQuantity > 1, QuantityType = “Multiple Items”
• Otherwise QuantityType = “Other”
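
The three rules above read as an ordinary if / else-if / else chain, checked from top to bottom. A Python rendering of the same logic, for illustration only (the function name is made up):

```python
def quantity_type(order_quantity):
    # Rules are checked in order; the first one that matches wins
    if order_quantity == 1:
        return "Single Item"
    elif order_quantity > 1:
        return "Multiple Items"
    else:
        return "Other"

labels = [quantity_type(q) for q in [1, 3, 0]]
```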
GROUPING & AGGREGATING DATA

Group By allows you to aggregate your data at a different level (e.g. transform daily data into monthly, roll up transaction-level data by store, etc)

In this case we’re transforming a daily, transaction-level table into a summary of “TotalQuantity” rolled up by “ProductKey”

NOTE: Any fields not specified in the Group By settings are lost
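
A rough Python picture of this roll-up, using a few invented transaction rows. It is an illustration of the logic rather than Power BI code; note that "OrderDate" does not survive, since it is not part of the grouping.

```python
from collections import defaultdict

sales = [
    {"OrderDate": "2016-01-01", "ProductKey": 310, "OrderQuantity": 1},
    {"OrderDate": "2016-01-01", "ProductKey": 480, "OrderQuantity": 2},
    {"OrderDate": "2016-01-02", "ProductKey": 310, "OrderQuantity": 3},
]

# Sum OrderQuantity for each ProductKey
totals = defaultdict(int)
for row in sales:
    totals[row["ProductKey"]] += row["OrderQuantity"]

summary = [
    {"ProductKey": key, "TotalQuantity": total}
    for key, total in sorted(totals.items())
]
```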
GROUPING & AGGREGATING DATA (ADVANCED)

This time we’re transforming the daily, transaction-level table into a summary of “TotalQuantity” aggregated by both “ProductKey” and “CustomerKey” (using the advanced option in the dialog box)

NOTE: This is similar to creating a PivotTable in Excel and pulling in “Sum of OrderQuantity” with ProductKey and CustomerKey as row labels
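
Grouping by two columns works the same way as grouping by one, except that each group is identified by a pair of values. A Python sketch with invented rows; `group_sum` is a hypothetical helper written for this example.

```python
from collections import defaultdict

def group_sum(rows, keys, value_col, total_name):
    """Sum value_col for every distinct combination of the key columns."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[value_col]
    # Rebuild one output row per group, in order of first appearance
    return [dict(zip(keys, group), **{total_name: total})
            for group, total in totals.items()]

sales = [
    {"ProductKey": 310, "CustomerKey": 1, "OrderQuantity": 1},
    {"ProductKey": 310, "CustomerKey": 1, "OrderQuantity": 2},
    {"ProductKey": 310, "CustomerKey": 2, "OrderQuantity": 1},
    {"ProductKey": 480, "CustomerKey": 1, "OrderQuantity": 4},
]

summary = group_sum(sales, ["ProductKey", "CustomerKey"],
                    "OrderQuantity", "TotalQuantity")
```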
PIVOTING & UNPIVOTING
“Pivoting” is a fancy way to describe the process of turning distinct row values into columns (“pivoting”) or turning columns into rows (“unpivoting”)

Imagine that the table is on a hinge; pivoting is like rotating it from a vertical to a horizontal layout, and unpivoting is like rotating it from horizontal to vertical

NOTE: Transpose works very similarly, but doesn’t recognize unique values; instead, the entire table is transformed so that each row becomes a column and vice versa
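
The sketch below shows the two rotations on a tiny invented table: a wide layout with one column per year is unpivoted into a long layout, then pivoted back. The helper functions and column names are assumptions made for the example, not Power BI functions.

```python
wide = [
    {"Product": "Helmet", "2015": 10, "2016": 12},
    {"Product": "Gloves", "2015": 4, "2016": 7},
]

def unpivot(rows, id_col, attr_name, value_name):
    # Columns become rows: one output row per (id, column) pair
    return [
        {id_col: row[id_col], attr_name: col, value_name: val}
        for row in rows
        for col, val in row.items()
        if col != id_col
    ]

def pivot(rows, id_col, attr_name, value_name):
    # Distinct row values become columns: one output row per id
    out = {}
    for row in rows:
        target = out.setdefault(row[id_col], {id_col: row[id_col]})
        target[row[attr_name]] = row[value_name]
    return list(out.values())

long = unpivot(wide, "Product", "Year", "Quantity")
round_trip = pivot(long, "Product", "Year", "Quantity")
```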
MERGING QUERIES

Merging queries allows you to join tables based on a common column (like VLOOKUP)

In this case we’re merging the AW_Sales_Data table with the AW_Product_Lookup table, which share a common “ProductKey” column

NOTE: Merging adds columns to an existing table

HEY THIS IS IMPORTANT!
Just because you can merge tables, doesn’t mean you should.
In general, it’s better to keep tables separate and define relationships between them (more on that later!)
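
A VLOOKUP-style merge can be sketched in Python as a dictionary lookup on the shared key. The rows and the "ProductName" column are invented; the sketch keeps every sales row and leaves the new column empty when no match exists, which corresponds to a left outer join.

```python
sales = [
    {"ProductKey": 310, "OrderQuantity": 2},
    {"ProductKey": 999, "OrderQuantity": 1},  # no matching product
]
product_lookup = [{"ProductKey": 310, "ProductName": "Road Bike"}]

# Index the lookup table by the common column
by_key = {p["ProductKey"]: p for p in product_lookup}

merged = []
for row in sales:
    match = by_key.get(row["ProductKey"], {})
    # Merging adds COLUMNS: every sales row is kept, with ProductName attached
    merged.append({**row, "ProductName": match.get("ProductName")})
```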
APPENDING QUERIES

Appending queries allows you to combine (or stack) tables that share the exact same column structure and data types

In this case we’re appending the AdventureWorks_Sales_2015 table to the AdventureWorks_Sales_2016 table, which is valid since they share identical table structures

NOTE: Appending adds rows to an existing table

PRO TIP:
Use the “Folder” option (Get Data > More > Folder) to append all files within a folder (assuming they share the same structure); as you add new files, simply refresh the query and they will automatically append!
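
Appending is the row-wise counterpart of merging. The Python sketch below stacks two invented tables and refuses to continue if the column structure differs; that strict check is a choice made for this example and is not a description of how Power BI reacts to mismatched columns.

```python
def append_tables(*tables):
    """Stack tables row by row; assumes each table is a non-empty list of dicts."""
    columns = list(tables[0][0].keys())
    combined = []
    for table in tables:
        for row in table:
            if list(row.keys()) != columns:
                raise ValueError("tables do not share the same column structure")
            combined.append(row)
    return combined

sales_2015 = [{"OrderDate": "2015-12-31", "OrderQuantity": 1}]
sales_2016 = [{"OrderDate": "2016-01-01", "OrderQuantity": 2},
              {"OrderDate": "2016-01-02", "OrderQuantity": 1}]

# Appending adds ROWS: 1 row + 2 rows gives 3 rows, same columns
all_sales = append_tables(sales_2015, sales_2016)
```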
DATA SOURCE SETTINGS

The Data Source Settings in the Query Editor allow you to manage data connections and permissions

HEY THIS IS IMPORTANT!
Connections to local files reference the exact path.
If the file name or location changes, you will need to change the source and browse to the current version
MODIFYING QUERIES

Select Transform Data* from the Home tab to launch the Query Editor

Within the editor, view or modify existing queries in the “Queries” pane

Within each query, you can click each item within the “Applied Steps” pane to view each stage of the transformation, add new steps or delete existing ones, or modify individual steps by clicking the gear icons

*Formerly known as “Edit Queries”
REFRESHING QUERIES

By default, ALL queries in the model will refresh when you use the “Refresh” command from the Home tab

From the Query Editor, uncheck “Include in report refresh” to exclude individual queries from the refresh

PRO TIP:
Exclude queries that don’t change often, like lookups or static data tables
DEFINING DATA CATEGORIES

Select a column in the Data view to access Column Tools, where you can edit field properties to define specific categories*

This is commonly used to help Power BI accurately map location-based fields like addresses, countries, cities, latitude/longitude coordinates, zip codes, etc.

*In older versions of Power BI, these tools can be found in the Modeling tab in the Data view
DEFINING HIERARCHIES
Hierarchies are groups of nested columns that reflect multiple levels of granularity
• For example, a “Geography” hierarchy might include Country, State, and City columns
• Each hierarchy can be treated as a single item in tables and reports, allowing users to “drill up” and
“drill down” through different levels of the hierarchy in a meaningful way

1) From within the Data view, right-click a field (or click the ellipsis) and select “New hierarchy” (here we’ve selected “Start of Year”)
2) This creates a hierarchy field containing “Start of Year”, which we’ve renamed “Date Hierarchy”
3) Right-click other fields (like “Start of Month”) and select “Add to Hierarchy”
PRO TIP: IMPORTING MODELS FROM EXCEL

Already have a fully-built model in Excel?

Import models built in Excel directly into Power BI Desktop using Import > Power Query, Power Pivot, Power View*

Imported models retain the following:
• Data source connections and queries
• Query editing procedures and applied steps
• Table relationships, hierarchies, field settings, etc.
• All calculated columns and DAX measures

PRO TIP:
Power Pivot includes some features that Power BI does not (filtering options, DAX function help, etc); if you are more comfortable in the Excel environment, build your models there and then import to Power BI!

*In older versions of Power BI, this import option was called “Excel Workbook Contents”
BEST PRACTICES: CONNECTING & SHAPING DATA
Get yourself organized before loading the data into Power BI
• Define clear and intuitive table names (no spaces!) from the start; updating them later can be a headache, especially if you’ve referenced them in multiple places
• Establish a file/folder structure that makes sense from the start, to avoid having to modify data source settings if file names or locations change

Disable report refresh for any static sources
• There’s no need to constantly refresh sources that don’t update frequently (or at all), like lookups or static data tables; only enable refresh for tables that will be changing

When working with large tables, only load the data you need
• Don’t include hourly data when you only need daily, or product-level transactions when you only care about store-level performance; extra data will only slow you down
Reference sources:
- Microsoft Power BI website
- Power BI resources on Coursera, Udemy
Disclaimer
The information in this document is highly confidential and may be legally privileged. It
is intended solely for the addressee. Access to this presentation by anyone else is
unauthorized. If you are not the intended recipient, any disclosure, copying, distribution
or any action taken or omitted to be taken in reliance on it, is prohibited and may be
unlawful. The sample screens shown in this presentation are CONVZ FZE’s IP and
cannot be used or distributed without their prior consent. This presentation is
considered approved for submission to the Client by the Above-Authorized signatory.