Power Query
What is Power Query?
Article • 06/28/2023
Power Query is a data transformation and data preparation engine. Power Query comes
with a graphical interface for getting data from sources and a Power Query Editor for
applying transformations. Because the engine is available in many products and services,
the destination where the data will be stored depends on where Power Query was used.
Using Power Query, you can perform the extract, transform, and load (ETL) processing of
data.
Existing challenge | How does Power Query help?
Finding and connecting to data is too difficult | Power Query enables connectivity to a wide range of data sources, including data of all sizes and shapes.
Experiences for data connectivity are too fragmented | Consistency of experience, and parity of query capabilities over all data sources.
Data often needs to be reshaped before consumption | Highly interactive and intuitive experience for rapidly and iteratively building queries over any data source, of any size.
Any shaping is one-off and not repeatable | When using Power Query to access and transform data, you define a repeatable process (query) that can be easily refreshed in the future to get up-to-date data. In the event that you need to modify the process or query to account for underlying data or schema changes, you can use the same interactive and intuitive experience you used when you initially defined the query.
Volume (data sizes), velocity (rate of change), and variety (breadth of data sources and data shapes) | Power Query offers the ability to work against a subset of the entire dataset to define the required data transformations, allowing you to easily filter down and transform your data to a manageable size. Power Query queries can be refreshed manually or by taking advantage of scheduled refresh capabilities in specific products (such as Power BI) or even programmatically (by using the Excel object model). Because Power Query provides connectivity to hundreds of data sources and over 350 different types of data transformations for each of these sources, you can work with data from any source and in any shape.
The Power Query Editor is the primary data preparation experience, where you can
connect to a wide range of data sources and apply hundreds of different data
transformations by previewing data and selecting transformations from the UI. These
data transformation capabilities are common across all data sources, whatever the
underlying data source limitations.
When you create a new transformation step by interacting with the components of the
Power Query interface, Power Query automatically creates the M code required to do
the transformation so you don't need to write any code.
Note
Although two Power Query experiences exist, they both provide almost the same
user experience in every scenario.
Transformations
The transformation engine in Power Query includes many prebuilt transformation
functions that can be used through the graphical interface of the Power Query Editor.
These transformations can be as simple as removing a column or filtering rows, or as
common as using the first row as a table header. There are also advanced
transformation options such as merge, append, group by, pivot, and unpivot.
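As a rough illustration of what a few of these transformations look like in M, the following sketch promotes the first row to headers, filters rows, removes a column, and groups the result. The sample table, its column names, and the filter threshold are hypothetical and exist only to keep the query self-contained.

```powerquery
let
    // Hypothetical sample data: the first row holds the column names.
    Source = #table(
        {"Column1", "Column2", "Column3"},
        {
            {"Region", "Product", "Units"},
            {"East", "Widget", 10},
            {"West", "Widget", 4},
            {"East", "Gadget", 7}
        }
    ),
    // Use the first row as the table header.
    PromotedHeaders = Table.PromoteHeaders(Source),
    // Filter rows: keep only rows with more than 5 units.
    FilteredRows = Table.SelectRows(PromotedHeaders, each [Units] > 5),
    // Remove a column that's no longer needed.
    RemovedColumn = Table.RemoveColumns(FilteredRows, {"Product"}),
    // Group by region and total the units.
    Grouped = Table.Group(
        RemovedColumn,
        {"Region"},
        {{"TotalUnits", each List.Sum([Units]), type number}}
    )
in
    Grouped
```

With this sample data, only the two East rows pass the filter, so the grouped result has a single East row with a total of 17 units. Each step corresponds to one entry that the Power Query Editor would add to the applied steps list.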
All these transformations are made possible by choosing the transformation option in
the menu, and then applying the options required for that transformation. The following
illustration shows a few of the transformations available in Power Query Editor.
Dataflows
Power Query can be used in many products, such as Power BI and Excel. However, using
Power Query within a product limits its usage to only that specific product. Dataflows
are a product-agnostic service version of the Power Query experience that runs in the
cloud. Using dataflows, you can get data and transform data in the same way, but
instead of sending the output to Power BI or Excel, you can store the output in other
storage options such as Dataverse or Azure Data Lake Storage. This way, you can use the
output of dataflows in other products and services.
The M language is the data transformation language of Power Query. Anything that
happens in the query is ultimately written in M. If you want to do advanced
transformations using the Power Query engine, you can use the Advanced Editor to
access the script of the query and modify it as you want. If you find that the user
interface functions and transformations won't perform the exact changes you need, use
the Advanced Editor and the M language to fine-tune your functions and
transformations.
Power Query M
let
    // Connect to the mailbox and open the Mail table.
    Source = Exchange.Contents("xyz@contoso.com"),
    Mail1 = Source{[Name="Mail"]}[Data],
    #"Expanded Sender" = Table.ExpandRecordColumn(Mail1, "Sender", {"Name"}, {"Name"}),
    // Keep only messages with attachments, the expected subject, and the Inbox folder.
    #"Filtered Rows" = Table.SelectRows(#"Expanded Sender", each ([HasAttachments] = true)),
    #"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each ([Subject] = "sample files for email PQ test") and ([Folder Path] = "\Inbox\")),
    // Keep and expand the attachments, skipping hidden files.
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows1", {"Attachments"}),
    #"Expanded Attachments" = Table.ExpandTableColumn(#"Removed Other Columns", "Attachments", {"Name", "AttachmentContent"}, {"Name", "AttachmentContent"}),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Expanded Attachments", each [Attributes]?[Hidden]? <> true),
    // Apply the custom function to each attachment and expand the resulting tables.
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from Mail", each #"Transform File from Mail"([AttachmentContent])),
    #"Removed Other Columns1" = Table.SelectColumns(#"Invoke Custom Function1", {"Transform File from Mail"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from Mail", Table.ColumnNames(#"Transform File from Mail"(#"Sample File"))),
    // Set every column to text.
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1", {{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}})
in
    #"Changed Type"
1 M engine: The underlying query execution engine that runs queries expressed in the Power Query formula language ("M").
2 Power Query Desktop: The Power Query experience found in desktop applications.
3 Power Query Online: The Power Query experience found in web browser applications.
4 Dataflows: Power Query as a service that runs in the cloud and is product-agnostic. The stored result can be used in other applications as services.
See also
Data sources in Power Query
Getting data
Power Query quickstart
Shape and combine data using Power Query
What are dataflows
Getting data overview
Article • 04/10/2023
Power Query can connect to many different data sources so you can work with the data
you need. This article walks you through the steps for bringing in data to Power Query
either in Power Query Desktop or Power Query Online.
Connecting to a data source with Power Query follows a standard set of stages before
landing the data at a destination. This article describes each of these stages.
Important
In some cases, a connector might have all stages of the get data experience, and in
other cases a connector might have just a few of them. For more information about
the experience of a specific connector, go to the documentation available for the
specific connector by searching on the Connectors in Power Query article.
1. Connection settings
2. Authentication
3. Data preview
4. Query destination
1. Connection settings
Most connectors initially require at least one parameter to initialize a connection to the
data source. For example, the SQL Server connector requires at least the host name to
establish a connection to the SQL Server database.
In comparison, when trying to connect to an Excel file, Power Query requires that you
use the file path to find the file you want to connect to.
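In M terms, these connection parameters become the arguments of the connector's source function. The following is a minimal sketch, assuming a hypothetical server name and file path; neither comes from this article, and the query only evaluates against sources that actually exist in your environment.

```powerquery
let
    // Hypothetical host name: the one parameter needed to list the
    // databases available on a SQL Server instance.
    SqlSource = Sql.Databases("sqlserver.contoso.com"),
    // Hypothetical file path: the Excel connector needs the path to the
    // workbook so that File.Contents can read it.
    ExcelSource = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"))
in
    // Returned together only for illustration; in practice each source
    // would normally be its own query.
    [Sql = SqlSource, Excel = ExcelSource]
```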
Note
Some connectors don't require you to enter any parameters at all. These are called
singleton connectors and will only have one data source path available per
environment. Some examples are Adobe Analytics, MailChimp, and Google
Analytics.
2. Authentication
Every single connection that's made in Power Query has to be authenticated. The
authentication methods vary from connector to connector, and some connectors might
offer multiple methods of authentication.
For example, the available authentication methods for the SQL Server database
connector are Windows, Database, and Microsoft account.
3. Data preview
The goal of the data preview stage is to provide you with a user-friendly way to preview
and select your data.
Depending on the connector that you're using, you can preview data by using either:
Navigator window
Table preview dialog box
The object selection pane is displayed on the left side of the window. The user can
interact with and select these objects.
Note
For Power Query in Excel, select the Select multiple items option from the
upper-left corner of the navigation window to select more than one object at
a time in the object selection pane.
Note
The list of objects in Power Query Desktop is limited to 10,000 items. This limit
does not exist in Power Query Online. For a workaround in Power Query
Desktop, see Object limitation workaround.
The data preview pane on the right side of the window shows a preview of the
data from the object you selected.
There’s a fixed limit of 10,000 objects in the Navigator in Power Query Desktop. This
limit doesn’t occur in Power Query Online. Eventually, the Power Query Online UI will
replace the one in the desktop. Until then, you can use the following workaround:
1. Right-click on the root node of the Navigator, and then select Transform Data.
2. Power Query Editor then opens with the full navigation table in the table preview
area. This view doesn't have a limit on the number of objects, and you can use
filters or any other Power Query transforms to explore the list and find the rows
you want (for example, based on the Name column).
3. Upon finding the item you want, you can get at the contents by selecting the data
link (such as the Table link in the following image).
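The workaround steps can be sketched in M as follows. The navigation table here is a hypothetical stand-in built inline so the query is self-contained; a real connector returns its own navigation table, whose columns may differ.

```powerquery
let
    // Hypothetical stand-in for a connector's navigation table, which
    // typically exposes a Name column and a Data column.
    Source = #table(
        {"Name", "Data", "Kind"},
        {
            {"Customers", #table({"Id"}, {{1}, {2}}), "Table"},
            {"SalesOrders", #table({"Id"}, {{10}}), "Table"},
            {"SalesReturns", #table({"Id"}, {{20}}), "Table"}
        }
    ),
    // Explore the list with a filter on the Name column.
    Matches = Table.SelectRows(Source, each Text.StartsWith([Name], "Sales")),
    // Get at the contents of one item, which is the equivalent of
    // selecting its Table link in the preview.
    SalesOrders = Source{[Name = "SalesOrders"]}[Data]
in
    SalesOrders
```

The Matches step shows the filtering approach; the final step drills into a single row by its Name value.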
4. Query destination
This is the stage in which you specify where to load the query. The options vary from
integration to integration, but the one option that's always available is loading data to
the Power Query editor to further transform and enrich the query.
Power Query Online experience
The stages for getting data in Power Query Online are:
1. Connection settings and connection credentials
2. Data preview
3. Query editor
Connection settings
In the connection settings section, you define the information needed to establish a
connection to your data source. Depending on your connector, that could be the name
of the server, the name of a database, a folder path, a file path, or other information
required by the connector to establish a connection to your data source. Some
connectors also enable specific subsections or advanced options to give you more
control and options when connecting to your data source.
Connection credentials
The first time that you use Power Query to connect to a specific data source, you're
required to create a new connection associated with that data source. A connection is
the full definition of the gateway, credentials, privacy levels, and other connector-
specific fields that make up the connection credentials required to establish a
connection to your data source.
Note
Some connectors offer specific fields inside the connection credentials section to
enable or define any sort of security related to the connection that needs to be
established. For example, the SQL Server connector offers the Use Encrypted
Connection field.
The primary information required by all connectors to define a connection is:
Connection name: This is the name that you can define to uniquely identify your
connections. Note that you can't duplicate the name of a connection in your
environment.
Data gateway: If your data source requires a data gateway, select the gateway
using the dropdown list from this field.
Authentication kind & credentials: Depending on the connector, you're presented
with multiple authentication kind options that are available to establish a
connection, as well as fields where you enter your credentials. For this example, the
Windows authentication kind has been selected and you can see the Username
and Password fields that need to be filled in to establish a connection.
Privacy level: You can define the privacy level for your data source to be either
None, Private, Organizational, or Public.
Note
To learn more about what data gateways are and how to register a new gateway for
your environment or tenant, go to Using on-premises data gateway.
Important
Once you've defined a connection in Power Query Online, you can reuse the same
connection later without reentering all this information. The Connection field
offers a dropdown menu where you select your already defined connections. Once
you've selected your already defined connection, you don't need to enter any other
details before selecting Next.
After you select a connection from this menu, you can also make changes to the
credentials, privacy level, and other connector-specific fields for your data source in your
project. Select Edit connection, and then change any of the provided fields.
2. Data preview
The goal of the data preview stage is to provide you with a user-friendly way to preview
and select your data.
Depending on the connector that you're using, you can preview data by using either:
Navigator window
Table preview dialog box
The object selection pane is displayed on the left side of the window. The user can
interact with and select these objects.
The data preview pane on the right side of the window shows a preview of the
data from the object you selected.
3. Query editor
For Power Query Online, you're required to load the data into the Power Query editor
where you can further transform and enrich the query if you choose to do so.
Additional information
To better understand how to get data using the different product integrations of Power
Query, go to Where to get data.
Where to get data
Article • 07/28/2023
Getting data from available data sources is usually the first encounter you have with
Power Query. This article provides basic steps for getting data from each of the
Microsoft products that include Power Query.
Note
Each of these Power Query get data experiences contains different feature sets.
More information: Where can you use Power Query?
In Power BI Desktop, you can also directly select an Excel worksheet, a Power BI dataset,
a SQL Server database, or Dataverse data without using the Get data option. You can
also enter data directly in a table, or select from a data source that was recently used.
2. Scroll through the category selections in the get data context menu, and select the
connector you want to use.
You'll then be asked to fill out information that's required for you to access the data. Go
to the individual connector articles for more information about this required
information.
In Excel, you can also directly select Text/CSV, Web, and Excel worksheet data without
using the Get Data option. You can also select from a data source that was recently used
and from existing connections or tables.
Note
Not all Excel versions support all of the same Power Query connectors. For a
complete list of the Power Query connectors supported by all versions of Excel for
Windows and Excel for Mac, go to Power Query data sources in Excel versions.
Get data in Power BI service
To get data in Power BI service:
1. On the left side of Power BI service, select Workspaces (but not My Workspace).
2. From the Workspace context menu, select the workspace you want to use.
3. From the workspace (in this example, TestWorkspace01), select the context menu
next to +New.
6. In the Choose data source page, use Search to search for the name of the
connector, or select View more on the right-hand side of the connector list to see a list
of all the connectors available in Power BI service.
7. If you choose to view more connectors, you can still use Search to search for the
name of the connector, or choose a category to see a list of connectors associated
with that category.
You can also choose to get data directly from an Excel worksheet without using the
Import data option.
3. In the Choose data source page, use Search to search for the name of the
connector, or select View more on the right-hand side of the connector list to see a list
of all the connectors available in Power BI service.
4. If you choose to view more connectors, you can still use Search to search for the
name of the connector, or choose a category to see a list of connectors associated
with that category.
Select a table in the Tables pane that you want to import data to, and then
select Import > Import data.
Open the table to its individual pane, and then select Import > Import data.
In either case, you can also choose to get data from an Excel worksheet without
using the Import data option.
b. In the New dataflow dialog box, enter a name for your new dataflow.
c. Select Create.
1. On the left side of Customer Insights, select Data > Data sources.
4. In Save data source as, enter a name for your data source.
5. Select Next.
6. In the Choose data source page, use Search to search for the name of the
connector, or select View more on the right-hand side of the connector list to see a list
of all the connectors available in Power BI service.
7. If you choose to view more connectors, you can still use Search to search for the
name of the connector, or choose a category to see a list of connectors associated
with that category.
1. On the left side of Data Factory, select Workspaces (but not My Workspace).
2. From your Data Factory workspace, select New > Dataflow Gen2 (Preview) to
create a new dataflow.
3. In Power Query, either select Get data in the ribbon or select Get data from
another source in the current view.
4. In the Choose data source page, use Search to search for the name of the
connector, or select View more on the right-hand side of the connector list to see a list
of all the connectors available in Power BI service.
5. If you choose to view more connectors, you can still use Search to search for the
name of the connector, or choose a category to see a list of connectors associated
with that category.
The Analysis Services documentation contains the following information that describes
the process for getting data:
To set up a Visual Studio solution with the Analysis Services projects extension:
Create a tabular model project
1. On the left side of Power Automate, select Data > Tables. At this point, a new tab
with Power Apps will open in your browser.
2. In the Power Apps tab, follow the instructions for importing data to either a new
table or to an existing table in the Power Apps section.
For information about how to get data in process advisor, go to Connect to a data
source.
However, Azure Data Factory does use Power Query to transform data in data wrangling.
The following Azure Data Factory articles describe how to use Power Query for data
wrangling:
Getting data from available data sources is usually the first encounter you have with
Power Query. This article provides an explanation of the different modules in the
modern get data experience.
Note
Each of these Power Query get data experiences contains different feature sets.
More information: Where can you use Power Query?
The procedures for where to start getting data in Power BI Desktop are described in
Data sources in Power BI Desktop.
The new modular experience in Power Query Online is separated into different modules
located on the left side navigation bar. These modules include:
Home (all)
Templates (Power BI service only)
OneLake Data Hub (Fabric only)
New (all)
Upload (all)
Blank Table (all)
Blank Query (all)
Home
The home page acts as a summary of all the modules and presents you with different
options to expedite the process and get you closer to your data. Typically, this module
contains any existing data sources and gives you the option to use a new data source,
table, and upload files. From the home page, you can select View more on the right side
of the New sources and OneLake data hub sections to visit those modules.
Templates
A dataflow template provides a predefined set of entities and field mappings to enable
flow of data from your source to the destination, in the Common Data Model. A
dataflow template commoditizes the movement of data, which in turn reduces overall
burden and cost for a business user. It provides you with a head start to ingest data
wherein you don’t need to worry about knowing and mapping the source and
destination entities and fields—we do it for you, through dataflow templates. For more
information about templates, go to Introducing dataflow templates; A quick and
efficient way to build your sales leaderboard and get visibility over your sales pipeline.
New
The new module provides a full list of connectors that you can select from. On this page,
you can search for a connector across all categories by using the search bar at the top of
the page. You can also navigate across the categories to find a specific connector to
integrate with. Selecting a connector here opens the connection settings window, which
begins the process of connecting. For more information on using connectors, go to
Getting data overview.
Upload
The Upload module lets you upload your files directly. The following connectors support
this capability:
Excel
JSON
PDF
Text/CSV
XML
This module is an extension of this capability and lets you select the browse button to
upload a local file, or even drag and drop a file. For more information on uploading files,
go to Upload a file.
Blank table
The Blank table module provides a quick start in creating a table in a dataflow.
Blank query
The Blank query module lets you write or paste your own M script to create a new
query.
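As an illustration of the kind of script you might paste into a blank query, here's a minimal sketch. The table contents and the 1.1 multiplier are hypothetical and chosen only so the query runs without connecting to a data source.

```powerquery
let
    // Hypothetical inline data; a blank query doesn't need a connector.
    Source = #table(
        type table [Product = text, Price = number],
        {{"Widget", 2.5}, {"Gadget", 4.0}}
    ),
    // Add a computed column based on an existing one.
    WithTax = Table.AddColumn(Source, "PriceWithTax", each [Price] * 1.1, type number)
in
    WithTax
```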
Authentication with a data source
Article • 02/17/2023
When you attempt to connect to a data source using a new connector for the first time,
you might be asked to select the authentication method to use when accessing the
data. After you've selected the authentication method, you won't be asked to select an
authentication method again for the connector using the specified connection parameters.
However, if you need to change the authentication method later, you can do so.
If you're using a connector from an online app, such as the Power BI service or Power
Apps, you'll see an authentication method dialog box for the OData Feed connector that
looks something like the following image.
As you can see, a different selection of authentication methods is presented from an
online app. Also, some connectors might ask you to enter the name of an on-premises
data gateway to be able to connect to your data.
The level you select for the authentication method you chose for this connector
determines what part of a URL will have the authentication method applied to it. If you
select the top-level web address, the authentication method you select for this
connector will be used for that URL address or any subaddress within that address.
However, you might not want to set the top-level address to a specific authentication
method because different subaddresses can require different authentication methods.
One example might be if you were accessing two separate folders of a single SharePoint
site and wanted to use different Microsoft accounts to access each one.
After you've set the authentication method for a connector's specific address, you won't
need to select the authentication method for that connector using that URL address or
any subaddress again. For example, let's say you select the https://contoso.com/
address as the level you want the Web connector URL settings to apply to. Whenever
you use a Web connector to access any webpage that begins with this address, you
won't be required to select the authentication method again.
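The "begins with this address" rule can be pictured as a simple prefix check. The following sketch is only an illustration of that idea, not a description of how Power Query stores or matches credentials internally, and the list of URLs is hypothetical.

```powerquery
let
    // The address level chosen for the stored authentication setting.
    Scope = "https://contoso.com/",
    // Hypothetical URLs to check against that level.
    Urls = {
        "https://contoso.com/reports/sales",
        "https://contoso.com/",
        "https://fabrikam.com/reports"
    },
    // A URL is covered when it begins with the scoped address.
    Covered = List.Select(Urls, each Text.StartsWith(_, Scope))
in
    Covered
```

In this sketch the two contoso.com addresses are covered by the setting, while the fabrikam.com address would need its own authentication method.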
In Power BI Desktop, on the File tab, select Options and settings > Data
source settings.
In Excel, on the Data tab, select Get Data > Data Source Settings.
2. In the Data source settings dialog box, select Global permissions, choose the
website where you want to change the permission setting, and then select Edit
Permissions.
You can also delete the credentials for a particular website in step 3 by selecting Clear
Permissions for a selected website, or by selecting Clear All Permissions for all of the
listed websites.
To edit the authentication method in online services, such as for dataflows in the
Power BI service and Microsoft Power Platform
This section outlines connection symptoms when the service isn't configured properly. It
also provides information on how Power Query interacts with the service when it's
properly configured.
1. Enter the Northwind endpoint in the "Get Data" experience using the OData
connector.
Supported workflow
One example of a supported service working properly with OAuth is CRM, such as
https://*.crm.dynamics.com/api/data/v8.2.
1. Enter the URL in the "Get Data" experience using the OData connector.
2. Select Organizational Account, and then select Sign-in to proceed to connect
using OAuth.
3. The request succeeds and the OAuth flow continues to allow you to authenticate
successfully.
When you select Sign-in in Step 2 above, Power Query sends a request to the provided
URL endpoint with an Authorization header with an empty bearer token.
The service is then expected to respond with a 401 response with a WWW-Authenticate
header indicating the Azure AD authorization URI to use. This response should include
the tenant to sign into, or /common/ if the resource isn’t associated with a specific
tenant.
HTTP/1.1 401 Unauthorized
Cache-Control: private
Content-Type: text/html
Server:
WWW-Authenticate: Bearer authorization_uri=https://login.microsoftonline.com/3df2eaf6-33d0-4a10-8ce8-7e596000ebe7/oauth2/authorize
Date: Wed, 15 Aug 2018 15:02:04 GMT
Content-Length: 49
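To make the structure of that header concrete, the following sketch pulls the authorization URI and tenant out of the sample header value using text functions. It assumes the header has exactly the simple form shown in the sample response; a real service might quote the value or send additional parameters, which this sketch doesn't handle.

```powerquery
let
    // Header value taken from the sample 401 response.
    Header = "Bearer authorization_uri=https://login.microsoftonline.com/3df2eaf6-33d0-4a10-8ce8-7e596000ebe7/oauth2/authorize",
    // Everything after "authorization_uri=" is the authorization URI.
    AuthorizationUri = Text.AfterDelimiter(Header, "authorization_uri="),
    // Split the URI into its parts to reach the path.
    Parts = Uri.Parts(AuthorizationUri),
    // The first path segment is the tenant to sign in to (or "common").
    Tenant = Text.Split(Text.TrimStart(Parts[Path], "/"), "/"){0}
in
    [AuthorizationUri = AuthorizationUri, Tenant = Tenant]
```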
Power Query can then initiate the OAuth flow against the authorization_uri. Power
Query requests an Azure AD Resource or Audience value equal to the domain of the
URL being requested. This value would be the value you use for your Azure Application
ID URL value in your API/service registration. For example, if accessing
https://api.myservice.com/path/to/data/api, Power Query would expect your
The following Azure Active Directory client IDs are used by Power Query. You might
need to explicitly allow these client IDs to access your service and API, depending on
your overall Azure Active Directory settings.
Client ID | Title | Description
a672d62c-fc7b-4e81-a576-e60dc46e951d | Power Query for Excel | Public client, used in Power BI Desktop and Gateway.
Go to step 8 of Add a scope for more details.
If you need more control over the OAuth flow (for example, if your service must respond
with a 302 rather than a 401), or if your application’s Application ID URL or Azure AD
Resource value doesn't match the URL of your service, then you’d need to use a custom
connector. For more information about using our built-in Azure AD flow, go to Azure
Active Directory authentication.
Connections and authentication in Power Query Online
Article • 02/17/2023
In Power Query Online, a connection refers to the unique identifier and associated
credentials used to establish a connection to a particular data source. One convenient
feature of connections in Power Query is that you can create multiple connections
against the same data source with different credentials.
Creating a connection
During the get data experience in Power Query Online, you'll find a dialog where you
enter information to create and establish a connection to your data source. The process
is standard for all connectors in Power Query, but some connectors might require more
information in order to create a connection.
After entering the values for the connector settings in the Connection settings section,
you can proceed with the Connection credentials section. In this section, you can create
a connection specific to the connection settings you previously entered.
The following table contains the fields and values used in the Connection settings
section.
Field | Description | Sample value
Connection Name | The name you can enter for your new connection. | Sample Connection
Data Gateway | An optional field that lets you bind a gateway to your connection. For cloud connections, there's no gateway binding to the connection. | none
Authentication Kind | The authentication kind you select to use that's supported by the connector. | Organizational account
Credentials | Depending on the authentication kind you select, you get a contextual set of fields to input your credentials, a button to launch an OAuth2 flow, or even no fields at all for an authentication kind such as Anonymous. | Derived from OAuth2 flow, but is shown as a Sign in button in the image
Note
When you create a connection, the connection name field is prefilled with a default
name derived from the connection settings.
After finishing the Connection settings section, you select the Next button to move
forward in the get data experience.
Tip
Some connectors provide an auto sign in experience. To learn more about this
feature, go to auto sign in.
Note
To create a gateway, go to the article on using an on-premises data
gateway in dataflows.
Using a local SQL Server database as an example, you enter the connector settings to
establish a connection. For the SQL Server connector, the required setting is just the
server name, but you can also enter the name of the database and select any other
advanced options available for the connector. For demonstration purposes, both the
server name and database have been entered.
After entering the values for the connector in Connection settings, you can proceed
with the Connection credentials section. In this section, you can create a connection
specific to the connection settings you previously entered.
The following table contains the fields and values used in the Connection settings
section.
Field | Description | Sample value
Connection Name | The name you can enter for your new connection. | localhost;AdventureWorks2019
Data Gateway | An optional field that lets you bind a gateway to your connection. | Mike Test
After finishing the Connection settings section, you select the Next button to move
forward within the get data experience.
Components of a connection
Each connection is made up of a set of components. The following table contains more
information for each component.
Component name | Required or optional | Description | Sample
Data Source kind | Required | The data source for which the connection is being established. | SQL Server, File, Folder, Azure Data Lake Storage
Data Source path | Required | A string that represents the values or parameters used to establish a connection to your data source. | Server Name, Database Name
Gateway | Optional | Used when a gateway is needed to establish the connection and execute any query evaluation. | Any gateway
Privacy level | Optional | Establishes the security for each connection, which is taken into consideration when queries from different connections are combined. | None, Public, Organizational, Private
Important
Currently, the privacy level is set to None for any new connections created. When
you try to combine multiple data sources, a new dialog prompts you to define the
data privacy levels of the data sources that you want to combine.
Known connections
When Power Query recognizes a set of connection settings, it looks in its credentials
storage for a connection that matches those settings and, if one exists, automatically
selects that connection.
To override this behavior, you can take either of the following two actions:
Display the dropdown menu to scan a list of available connections for the given
connection settings. You can then select the one that you'd like to use or create a
new one.
Select Edit connection to modify the existing connection or select Create new
connection from the dropdown menu to create a new named connection.
More resources
List of connectors in Power Query
On-premises data gateways documentation
Change the gateway used in a dataflow
Troubleshooting dataflow issues: Connection to the data source
Auto sign in for Azure Active Directory data sources
Article • 10/06/2022
The auto sign-in feature attempts to automatically sign you in as the current user when
connecting to data sources in Power Query that use Azure Active Directory as one of
their authentication kinds. It does this auto sign-in to expedite the authentication
process and minimize the time it takes to start working with your data.
More technically, the auto sign-in feature for Azure Active Directory data sources uses
the information derived from the currently authenticated user in the Power Query
Online experience. This information is then used to request a new access token for a
selected data source during the connection settings and authentication steps of the get
data process.
Note
This functionality is currently only available in Power Query Online and is enabled
by default for a select set of connectors. No configuration is needed to enable this
feature.
When selecting a connector that has this capability, it automatically signs you in with
Organizational account set as the authentication kind.
Tip
If you'd like to authenticate with a different account, select the Switch account link
shown in the dialog.
Further reading
Authentication in Power Query Online
Microsoft identity platform and OAuth 2.0 On-Behalf-Of flow
Upload a file (Preview)
Article • 02/17/2023
You can upload files to your Power Query project when using Power Query Online. The
following connectors currently support the upload a file feature:
Excel
JSON
PDF
Text / CSV
XML
Note
Only files with the following extensions are supported for upload: .csv, .json, .pdf,
.prn, .tsv, .txt, .xl, .xls, .xlsb, .xlsm, .xlsw, .xlsx, .xml.
After you've selected your file, a progress bar shows you how the upload process is
going. Once the upload process is finished, you'll be able to see a green check mark
underneath your file name, with the message Upload successful and the file size right
next to it.
Note
The files that are uploaded through this feature are stored in your personal
Microsoft OneDrive for Business account.
Before you select the Next button, you need to change the authentication kind from
Anonymous to Organizational account and go through the authentication process.
Start this process by selecting Sign in.
After going through the authentication process, a You are currently signed in message
underneath the Authentication Kind selection lets you know that you've successfully
signed in. After you've signed in, select Next. The file is then stored in your personal
Microsoft OneDrive for Business account, and a new query is created from the file that
you've uploaded.
Drag and drop experience in the query editor
When using the Power Query editor, you can drop a file on either the diagram view or
the queries pane to upload a file.
When dropping the file on either of the previously mentioned sections, a dialog with the
appropriate connector settings page will be shown, based on the file extension of the
file that's being uploaded.
SharePoint and OneDrive for Business files import
Article • 02/17/2023
Power Query offers a series of ways to gain access to files that are hosted on either
SharePoint or OneDrive for Business.
Browse files
Note
Currently, you can only browse for OneDrive for Business files of the authenticated
user inside of Power Query Online for Power Apps.
Warning
This feature requires your browser to allow third-party cookies. If your browser has
blocked third-party cookies, the Browse dialog appears but is completely blank, with no
option to close the dialog.
Power Query provides a Browse OneDrive button next to the File path or URL text box
when you create a dataflow in Power Apps using any of these connectors:
Excel
JSON
PDF
XML
TXT/CSV
When you select this button, you'll be prompted to go through the authentication
process. After completing this process, a new window appears with all the files inside the
OneDrive for Business of the authenticated user.
You can select the file of your choice, and then select the Open button. After selecting
Open, you'll be taken back to the initial connection settings page where you'll see that
the File path or URL text box now holds the exact URL to the file you've selected from
OneDrive for Business.
You can select the Next button at the bottom-right corner of the window to continue
the process and get your data.
Note
Your browser interface might not look exactly like the following image. There
are many ways to select Open in Excel for files in your OneDrive for Business
browser interface. You can use any option that allows you to open the file in
Excel.
2. In Excel, select File > Info, and then select the Copy path button.
To use the link you just copied in Power Query, take the following steps:
3. Remove the ?web=1 string at the end of the link so that Power Query can properly
navigate to your file, and then select OK.
4. If Power Query prompts you for credentials, choose either Windows (for on-premises
SharePoint sites) or Organizational Account (for Microsoft 365 or
OneDrive for Business sites). Then select Connect.
Caution
When working with files hosted on OneDrive for Home, the file that you want
to connect to needs to be publicly available. When setting the authentication
method for this connection, select the Anonymous option.
When the Navigator dialog box appears, you can select from the list of tables, sheets,
and ranges found in the Excel workbook. From there, you can use the OneDrive for
Business file just like any other Excel file. You can create reports and use it in datasets
like you would with any other data source.
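The steps above can be sketched in M as follows. This is only an illustration: the link is
a hypothetical example of what the Copy path button might return, and the query that the
connector dialog generates for you can look different.

```powerquery
let
    // Hypothetical link, as copied from Excel's Copy path button.
    CopiedLink = "https://contoso-my.sharepoint.com/personal/user123_contoso_com/Documents/Sales.xlsx?web=1",
    // Drop the trailing ?web=1 so that the URL points at the file itself.
    FileUrl = Text.BeforeDelimiter(CopiedLink, "?web=1"),
    // Read the workbook. The result lists the tables, sheets, and ranges
    // that the Navigator dialog box lets you choose from.
    Workbook = Excel.Workbook(Web.Contents(FileUrl), null, true)
in
    Workbook
```

In a real query you would usually paste the already trimmed URL as a literal, which keeps
the data source path easy to recognize when you configure credentials and refresh.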
Note
To use a OneDrive for Business file as a data source in the Power BI service, with
Service Refresh enabled for that file, make sure you select OAuth2 as the
Authentication method when configuring your refresh settings. Otherwise, you
may encounter an error (such as, Failed to update data source credentials) when you
attempt to connect or to refresh. Selecting OAuth2 as the authentication method
remedies that credentials error.
After successfully establishing the connection, you'll be prompted with a table preview
that shows the files in your SharePoint site. Select the Transform data button at the
bottom right of the window.
Selecting the Transform Data button will take you to a view of the data called the File
system view. Each of the rows in this table represents a file that was found in your
SharePoint site.
The table has a column named Content that contains your file in a binary format. The
values in the Content column have a different color than the rest of the values in the
other columns of the table, which indicates that they're selectable.
By selecting a Binary value in the Content column, Power Query will automatically add a
series of steps in your query to navigate to the file and interpret its contents where
possible.
For example, from the table shown in the previous image, you can select the second row
where the Name field has a value of 02-February.csv. Power Query will automatically
create a series of steps to navigate and interpret the contents of the file as a CSV file.
Note
You can interact with the table by applying filters, sorts, and other transforms
before navigating to the file of your choice. Once you've finished these transforms,
select the Binary value you want to view.
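As a rough sketch of what this navigation amounts to in M, the following query lists the
files in a site, filters them, picks the Binary value of one file, and interprets it as
CSV. The site URL is hypothetical, and the steps that Power Query generates when you
select the Binary value may differ in names and options.

```powerquery
let
    // Hypothetical site URL; use the root URL of your own SharePoint site.
    Source = SharePoint.Files("https://contoso.sharepoint.com/marketing", [ApiVersion = 15]),
    // Optional transform before navigating: keep only CSV files.
    CsvFiles = Table.SelectRows(Source, each [Extension] = ".csv"),
    // Take the Binary value in the Content column of the 02-February.csv row.
    FileContent = Table.SelectRows(CsvFiles, each [Name] = "02-February.csv"){0}[Content],
    // Interpret the binary as a comma-delimited UTF-8 file and promote the first row to headers.
    Imported = Csv.Document(FileContent, [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Imported, [PromoteAllScalars = true])
in
    Promoted
```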
https://contoso-my.sharepoint.com/personal/user123_contoso_com/_layouts/15/onedrive.aspx
You don't need the full URL, but only the first few parts. The URL you need to use in
Power Query will have the following format:
https://<unique_tenant_name>.sharepoint.com/personal/<user_identifier>
For example:
https://contoso-my.sharepoint.com/personal/user123_contoso_com
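To illustrate which part of the URL is needed, the following sketch trims the browser URL
from this article down to the root of the OneDrive for Business site and passes it to
SharePoint.Files. The use of Text.BeforeDelimiter on "/_layouts" is an assumption about
the URL shape shown here, not a documented rule.

```powerquery
let
    // Full URL as shown in the browser (example account from this article).
    BrowserUrl = "https://contoso-my.sharepoint.com/personal/user123_contoso_com/_layouts/15/onedrive.aspx",
    // Keep everything before /_layouts: tenant host, /personal/, and the user identifier.
    RootUrl = Text.BeforeDelimiter(BrowserUrl, "/_layouts"),
    // List all the files in that OneDrive for Business site.
    Source = SharePoint.Files(RootUrl, [ApiVersion = 15])
in
    Source
```

In practice, pasting the trimmed URL directly into the connector dialog gives the same
result and keeps the data source path static.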
SharePoint.Contents function
While the SharePoint folder connector offers you an experience where you can see all
the files available in your SharePoint or OneDrive for Business site at once, you can also
opt for a different experience. In this experience, you can navigate through your
SharePoint or OneDrive for Business folders and reach the folder or file(s) that you're
interested in.
SharePoint.Contents("https://contoso.sharepoint.com/marketing/data")
Note
[ApiVersion="Auto"]) .
3. Power Query will request that you add an authentication method for your
connection. Use the same authentication method that you'd use for the SharePoint
files connector.
4. Navigate through the different documents to the specific folder or file(s) that
you're interested in.
For example, imagine a SharePoint site with a Shared Documents folder. You can
select the Table value in the Content column for that folder and navigate directly
to that folder.
Inside this Shared Documents folder there's a folder where the company stores all
the sales reports. This folder is named Sales Reports. You can select the Table value
on the Content column for that row.
With all the files inside the Sales Reports folder, you could select the Combine files
button (see Combine files overview) to combine the data from all the files in this
folder to a single table. Or you could navigate directly to a single file of your
choice by selecting the Binary value from the Content column.
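The folder navigation described in this example can be written in M roughly as follows.
The site URL is the sample one from this section, and the folder names are the ones used
in the example above; your own site will have different names.

```powerquery
let
    // Sample site URL from this section.
    Source = SharePoint.Contents("https://contoso.sharepoint.com/marketing/data"),
    // Select the Table value in the Content column of the Shared Documents row.
    SharedDocuments = Source{[Name = "Shared Documents"]}[Content],
    // Inside Shared Documents, navigate to the Sales Reports folder the same way.
    SalesReports = SharedDocuments{[Name = "Sales Reports"]}[Content]
in
    SalesReports
```

Each navigation step returns the table of that folder's contents, so the final step holds
the files you can combine or open individually.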
Note
Connecting to Microsoft Graph REST APIs from Power Query isn't recommended or
supported. Instead, we recommend users explore alternative solutions for retrieving
analytics data based on Graph, such as Microsoft Graph data connect.
You might find you can make certain REST calls to Microsoft Graph API endpoints work
through the Web.Contents or OData.Feed functions, but these approaches aren't reliable
as long-term solutions.
This article outlines the issues associated with Microsoft Graph connectivity from Power
Query and explains why it isn't recommended.
Authentication
The built-in Organizational Account authentication flow for Power Query's Web.Contents
and OData.Feed functions isn't compatible with most Graph endpoints. Specifically,
Power Query's Azure Active Directory (Azure AD) client requests the user_impersonation
scope, which isn't compatible with Graph's security model. Graph uses a rich set of
permissions that aren't available through our generic Web and OData connectors.
Implementing your own Azure AD credential retrieval flows directly from your query, or
using hardcoded or embedded credentials, also isn't recommended for security reasons.
While Power BI Desktop offers out-of-box connectivity to over 150 data sources, there
might be cases where you want to connect to a data source for which no out-of-box
connector is available.
For example, the ODBC connector can connect to services with ODBC interfaces, and the
Web connector can connect to services with REST API interfaces.
Community members and organizations can also share custom connectors that they've
created. While Microsoft doesn't offer any support, ownership, or guarantees for these
custom connectors, you might be able to use them for your scenarios. The Power BI
Partner Program also includes many partners that can build custom connectors. To learn
more about the program or find a partner, go to Contact a Power BI Partner.
Users that own an end service or data source can create a custom connector and might
be eligible to certify the connector to have it made available publicly out-of-box within
Power BI Desktop.
Request the data source owner to build and certify a connector
As only the data source owner or an approved third party can build and certify a custom
connector for any service, end users are encouraged to share the demand for a
connector directly with the data source owner to encourage investment into creating
and certifying one.
You can connect to a multitude of different data sources using built-in connectors that
range from Access databases to Zendesk resources. You can also connect to all sorts of
other data sources to further expand your connectivity options, by using the generic
interfaces (such as ODBC or REST APIs) built into Power Query Desktop and Power
Query Online.
In addition, you can also connect to data sources that aren't identified in the get data
and choose data source lists by using one of the following generic data interfaces:
ODBC
OLE DB
OData
REST APIs
R Scripts
By providing the appropriate parameters in the connection windows that these generic
interfaces provide, the world of data sources you can access and use in Power Query
grows significantly.
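As a minimal sketch of a generic interface in M, the following query connects through
ODBC. The DSN name and the REST endpoint in the comment are hypothetical; the DSN must
already be configured on the machine or gateway that evaluates the query.

```powerquery
let
    // ODBC: list the catalog exposed by a preconfigured DSN (hypothetical name).
    Source = Odbc.DataSource("dsn=ContosoWarehouse")
    // REST alternative (hypothetical endpoint that returns JSON):
    // Source = Json.Document(Web.Contents("https://api.contoso.com/v1/orders"))
in
    Source
```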
In the following sections, you can find lists of data sources that can be accessed by
these generic interfaces.
Power Query generic connector | External data source | Link for more information
Power BI Desktop generic connector | External data source | Link for more information
Note
This feature is currently available only in Power Query Online and is in public
preview.
With Power Query, you can connect to a multitude of data sources. When you connect
to a data source, you fundamentally create a connection to it. This connection consists
of your data source, credentials, and more information, such as privacy levels and
optional parameters for each data source. The Manage connections dialog is one
centralized way in your Power Query project to manage the connections that are being
referenced by your project.
The entry to the Manage connections dialog is available in the Power Query Home tab
in the ribbon's Data sources group.
Manage connections displays a list of all connections being referenced in your Power
Query project. From this list, you can also unlink or edit any of the connections in
your project.
Unlink a connection
Next to the name of the connection, and before the Source type column, there's an
icon to unlink the connection.
When you unlink a connection, you're simply removing the reference of the connection
from your project. Unlinking means that the definition of the connection isn't removed
from the back-end, but it can't be used in your project.
A new prompt then asks you to confirm that you want to unlink the connection.
Note
If you want to delete a connection from within a specific product integration, such
as Microsoft Power BI or Microsoft Fabric, be sure to check out the documentation
for each product on how a centralized connections portal can be used.
Edit a connection
Next to the name of the connection, and before the unlink icon, there's an icon to
edit the connection.
Data gateway: If your data source uses a data gateway, you can modify the
gateway using the dropdown list from this field.
Authentication kind & credentials: Depending on the connector, you're presented
with multiple authentication kind options that are available to establish a
connection, and fields where you enter your credentials.
More Resources
Get Data experience in Power Query
Connectors available in Power Query
Change the gateway used in a dataflow project
Article • 05/25/2023
When creating a new dataflow project in Power Query Online, you can select the on-
premises data gateway used for your specific data sources during the get data
experience. This article showcases how you can modify or assign a gateway to an
existing dataflow project.
Note
Before being able to change a gateway, make sure that you have the needed
gateways already registered under your tenant and with access for the authors of
the dataflow project. You can learn more about data gateways from Using an on-
premises data gateway in Power Platform dataflows.
This query previously used a gateway named "Gateway A" to connect to the folder. But
"Gateway A" no longer has access to the folder due to new company policies. A new
gateway named "Gateway B" has been registered and now has access to the folder that
the query requires. The goal is to change the gateway used in this dataflow project so it
uses the new "Gateway B".
Tip
If there were recent changes to your gateways, select the small refresh icon to
the right of the drop-down menu to update the list of available gateways.
3. After selecting the correct gateway for the project, in this case Gateway B, select
OK to go back to the Power Query editor.
Note
The M engine identifies a data source using a combination of its kind and path.
The kind defines what connector or data source function is being used, such as SQL
Server, folder, Excel workbook, or others.
The path value is derived from the required parameters of your data source
function and, for this example, that would be the folder path.
The best way to validate the data source path is to go into the query where your data
source function is being used and check the parameters being used for it. For this
example, there's only one query that connects to a folder and this query has the Source
step with the data source path defined in it. You can double-click the Source step to get
the dialog that indicates the parameters used for your data source function. Make sure
that the folder path, or whichever parameters your data source function requires, is
correct for the gateway being used.
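For the Folder connector, the Source step might look like the following sketch. The folder
path is a hypothetical value; the point is only that the kind comes from the function
being called and the path comes from its required argument.

```powerquery
let
    // Kind: Folder (the Folder.Files data source function).
    // Path: the folder path argument. "C:\Sales Reports" is a hypothetical value
    // and must be reachable from the machine where the gateway runs.
    Source = Folder.Files("C:\Sales Reports")
in
    Source
```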
Modify authentication
To modify the credentials used against your data source, select Get data in the Power
Query editor ribbon to launch the Choose data source dialog box, then define a new or
existing connection to your data source. In this example, the connector that's used is the
Folder connector.
Once in Connection settings, create a new connection or select or modify a different
connection for your data source.
After defining the connection details, select Next at the bottom-right corner and
validate that your query is loading in the Power Query editor.
Note
This process is the same as if you were to connect again to your data source. But by
doing the process again, you're effectively re-defining what authentication method
and credentials to use against your data source.
The Power Query user interface
Article • 04/10/2023
With Power Query, you can connect to many different data sources and transform the
data into the shape you want.
In this article, you'll learn how to create queries with Power Query by discovering:
If you're new to Power Query, you can sign up for a free trial of Power BI before you
begin. You can use Power BI dataflows to try out the Power Query Online experiences
described in this article.
Examples in this article connect to and use the Northwind OData feed.
https://services.odata.org/V4/Northwind/Northwind.svc/
Note
To learn more about where to get data from each of the Microsoft products that
include Power Query, go to Where to get data.
To start, locate the OData feed connector from the "Get Data" experience. You can select
the Other category from the top, or search for OData in the search bar in the top-right
corner.
Once you select this connector, the screen displays the connection settings and
credentials.
For URL, enter the URL to the Northwind OData feed shown in the previous
section.
For On-premises data gateway, leave as none.
For Authentication kind, leave as anonymous.
The Navigator now opens, where you select the tables you want to connect to from the
data source. Select the Customers table to load a preview of the data, and then select
Transform data.
The dialog then loads the data from the Customers table into the Power Query editor.
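The query that results from these selections looks roughly like the following sketch. The
option record and step names are typical of what the connector generates, but they can
vary by product and version.

```powerquery
let
    // Connect to the Northwind OData feed anonymously.
    Source = OData.Feed(
        "https://services.odata.org/V4/Northwind/Northwind.svc/",
        null,
        [Implementation = "2.0"]
    ),
    // Navigator selection: the Customers table.
    Customers = Source{[Name = "Customers", Signature = "table"]}[Data]
in
    Customers
```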
The above experience of connecting to your data, specifying the authentication method,
and selecting the specific object or table to connect to is called the get data experience
and is documented with further detail in the Getting data article.
1. Ribbon: the ribbon navigation experience, which provides multiple tabs to add
transforms, select options for your query, and access different ribbon buttons to
complete various tasks.
2. Queries pane: a view of all your available queries.
3. Current view: your main working view, that by default, displays a preview of the
data for your query. You can also enable the diagram view along with the data
preview view. You can also switch between the schema view and the data preview
view while maintaining the diagram view.
4. Query settings: a view of the currently selected query with relevant information,
such as query name, query steps, and various indicators.
5. Status bar: a bar displaying relevant important information about your query, such
as execution time, total columns and rows, and processing status. This bar also
contains buttons to change your current view.
Note
The schema and diagram view are currently only available in Power Query Online.
The ribbon
The ribbon is the component where you'll find most of the transforms and actions that
you can do in the Power Query editor. It has multiple tabs, whose values depend on the
product integration. Each of the tabs provides specific buttons and options, some of
which might be redundant across the whole Power Query experience. These buttons and
options provide you with easy access to the transforms and actions that you may need.
The Power Query interface is responsive and tries to adjust to your screen resolution to
show you the best experience. In scenarios where you'd like to use a compact version of
the ribbon, there's also a collapse button at the bottom-right corner of the ribbon to
help you switch to the compact ribbon.
You can switch back to the standard ribbon view by selecting the expand icon
at the bottom-right corner of the ribbon.
You're encouraged to try all of these options to find the view and layout that you feel
most comfortable working with. As an example, select Schema view from the ribbon.
The right side of the status bar also contains icons for the diagram, data, and schema
views. You can use these icons to change between views. You can also use these icons to
enable or disable the view of your choice.
For example, in schema view, select the check mark next to the Orders and
CustomerDemographics columns, and from the ribbon select the Remove columns
action. This selection applies a transformation to remove these columns from your data.
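In M, this transformation is a single Table.RemoveColumns step. The following sketch uses
a small inline table as a stand-in for the Customers query so that it can run on its own;
the null values stand in for the nested Orders and CustomerDemographics data.

```powerquery
let
    // Stand-in for the Customers query; the real table has more columns and rows.
    Customers = #table(
        {"CustomerID", "Country", "Orders", "CustomerDemographics"},
        {
            {"ALFKI", "Germany", null, null},
            {"ANATR", "Mexico", null, null}
        }
    ),
    // Equivalent of selecting both columns in schema view and choosing Remove columns.
    #"Removed columns" = Table.RemoveColumns(Customers, {"Orders", "CustomerDemographics"})
in
    #"Removed columns"
```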
Select OK to perform the operation. Your data preview refreshes to show the total
number of customers by country.
An alternative way to launch the Group by dialog would be to use the Group by button
in the ribbon or by right-clicking the Country column.
For convenience, transforms in Power Query can often be accessed from multiple places,
so users can opt to use the experience they prefer.
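Whichever entry point you use, the Group by dialog produces a Table.Group step. The sketch
below counts rows per country on an illustrative inline table; the new column name
Customers is an assumption that matches how the column is referred to later in this
article.

```powerquery
let
    // Illustrative stand-in for the Customers query.
    Customers = #table(
        {"CustomerID", "Country"},
        {{"ALFKI", "Germany"}, {"ANATR", "Mexico"}, {"ANTON", "Mexico"}}
    ),
    // Group by Country and count the rows in each group.
    #"Grouped rows" = Table.Group(
        Customers,
        {"Country"},
        {{"Customers", each Table.RowCount(_), Int64.Type}}
    )
in
    // Two rows: Germany with 1 customer and Mexico with 2.
    #"Grouped rows"
```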
First, you'll need to add the Suppliers data. Select Get Data, and then from the
drop-down menu select OData.
The OData connection experience reappears. Enter the connection settings as described
in Connect to an OData feed to connect to the Northwind OData feed. In the Navigator
experience, search for and select the Suppliers table.
Select Create to add the new query to the Power Query editor. The queries pane should
now display both the Customers and the Suppliers query.
Open the Group by dialog again, this time by selecting the Group by button on the
ribbon under the Transform tab.
In the Group by dialog, set the Group by operation to group by the country and count
the number of supplier rows per country.
Referencing queries
Now that you have a query for customers and a query for suppliers, your next goal is to
combine these queries into one. There are many ways to accomplish this, including
using the Merge option in the Customers table, duplicating a query, or referencing a
query. For this example, you'll create a reference by right-clicking the Customers table
and selecting Reference, which effectively creates a new query that references the
Customers query.
After creating this new query, change the name of the query to Country Analysis and
disable the load of the Customers table by unmarking the Enable load option for that
query.
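A referenced query is very small in M: its Source step is just the name of the query it
points to. The following sketch shows what the Country Analysis query contains right
after you create the reference. It assumes a query named Customers already exists in the
same project.

```powerquery
// Query: Country Analysis
let
    // Refers to the existing Customers query by name. Any later change
    // to Customers flows through to this query.
    Source = Customers
in
    Source
```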
Merging queries
A merge queries operation joins two existing tables together based on matching values
from one or multiple columns. In this example, the goal is to join both the Customers
and Suppliers tables into one table only for the countries that have both Customers and
Suppliers.
Inside the Country Analysis query, select the Merge queries option from the Home tab
in the ribbon.
A new dialog for the Merge operation appears. You can then select the query to merge
with your current query. Select the Suppliers query and select the Country field from
both queries. Finally, select the Inner join kind, as you only want the countries where
you have Customers and Suppliers for this analysis.
After selecting the OK button, a new column is added to your Country Analysis query
that contains the data from the Suppliers query. Select the icon next to the Suppliers
field, which displays a menu where you can select which fields you want to expand.
Select only the Suppliers field, and then select the OK button.
The result of this expand operation is a table with only 12 rows. Rename the
Suppliers.Suppliers field to just Suppliers by double-clicking the field name and
entering the new name.
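The merge, expand, and rename steps can be sketched in M as follows. The inline tables
and their counts are illustrative stand-ins for the grouped Customers and Suppliers
queries, and the step names are assumptions rather than the exact names Power Query
generates.

```powerquery
let
    // Illustrative stand-ins for the two grouped queries.
    CustomersByCountry = #table({"Country", "Customers"}, {{"Germany", 11}, {"Mexico", 5}, {"USA", 13}}),
    SuppliersByCountry = #table({"Country", "Suppliers"}, {{"Germany", 3}, {"USA", 4}, {"Japan", 2}}),
    // Inner join on Country: only countries present in both tables survive.
    Merged = Table.NestedJoin(
        CustomersByCountry, {"Country"},
        SuppliersByCountry, {"Country"},
        "Suppliers", JoinKind.Inner
    ),
    // Expand only the Suppliers field from the nested table.
    Expanded = Table.ExpandTableColumn(Merged, "Suppliers", {"Suppliers"}, {"Suppliers.Suppliers"}),
    // Rename Suppliers.Suppliers to just Suppliers.
    Renamed = Table.RenameColumns(Expanded, {{"Suppliers.Suppliers", "Suppliers"}})
in
    // With this sample data, only the Germany and USA rows remain.
    Renamed
```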
Note
To learn more about the Merge queries feature, go to Merge queries overview.
Applied steps
Every transformation that is applied to your query is saved as a step in the Applied
steps section of the query settings pane. If you ever need to check how your query is
transformed from step to step, you can select a step and preview how your query
resolves at that specific point.
You can also right-click a step and select the Properties option to change the name of
the step or add a description for the step. For example, right-click the Merge queries
step from the Country Analysis query and change the name of the step to be Merge
with Suppliers and the description to be Getting data from the Suppliers query for
Suppliers by Country.
This change adds a new icon next to your step that you can hover over to read its
description.
Note
To learn more about Applied steps, go to Using the Applied Steps list.
Before moving on to the next section, disable the Diagram view to only use the Data
preview.
Adding a new column
With the data for customers and suppliers in a single table, you can now calculate the
ratio of customers-to-suppliers for each country. Select the last step of the Country
Analysis query, and then select both the Customers and Suppliers columns. In the Add
column tab in the ribbon and inside the From number group, select Standard, and then
Divide (Integer) from the dropdown.
This change creates a new column called Integer-division that you can rename to Ratio.
This change is the final step of your query, and provides the customer-to-supplier ratio
for the countries where the data has customers and suppliers.
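In M, Divide (Integer) corresponds to Number.IntegerDivide. The sketch below applies it to
an illustrative inline table that stands in for the Country Analysis query; the step names
are assumptions.

```powerquery
let
    // Illustrative stand-in for the Country Analysis query.
    CountryAnalysis = #table(
        type table [Country = text, Customers = Int64.Type, Suppliers = Int64.Type],
        {{"Germany", 11, 3}, {"USA", 13, 4}}
    ),
    // Divide (Integer): Customers divided by Suppliers, discarding the remainder.
    #"Inserted integer-division" = Table.AddColumn(
        CountryAnalysis,
        "Integer-division",
        each Number.IntegerDivide([Customers], [Suppliers]),
        Int64.Type
    ),
    // Rename the new column to Ratio.
    #"Renamed columns" = Table.RenameColumns(#"Inserted integer-division", {{"Integer-division", "Ratio"}})
in
    // Ratio is 3 for both sample rows (11 div 3 and 13 div 4).
    #"Renamed columns"
```

Because the division discards the remainder, countries with quite different proportions
can end up with the same Ratio value.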
Data profiling
Another Power Query feature that can help you better understand your data is Data
Profiling. By enabling the data profiling features, you'll get feedback about the data
inside your query fields, such as value distribution, column quality, and more.
We recommend that you use this feature throughout the development of your
queries, but you can always enable and disable the feature at your convenience. The
following image shows all the data profiling tools enabled for your Country Analysis
query.
Note
To learn more about Data profiling, go to Using the data profiling tools.
To access the inline Power Query help information in Excel, select the Help tab on the
Excel ribbon, and then enter Power Query in the search text box.
Note
Currently, Azure Analysis Services doesn't contain any inline Power Query help links.
However, you can get help for Power Query M functions. More information is
contained in the next section.
1. With the Power Query editor open, select the insert step (fx) button.
2. In the formula bar, enter the name of a function you want to check.
a. If you are using Power Query Desktop, enter an equal sign, a space, and the
name of a function.
b. If you are using Power Query Online, enter the name of a function.
3. Select the properties of the function.
a. If you are using Power Query Desktop, in the Query Settings pane, under
Properties, select All properties.
b. If you are using Power Query Online, in the Query Settings pane, select
Properties.
These steps will open the inline help information for your selected function, and let you
enter individual properties used by the function.
Summary
In this article, you created a series of queries with Power Query that provides a
customer-to-supplier ratio analysis at the country level for the Northwind corporation.
You learned the components of the Power Query user interface, how to create new
queries inside the query editor, reference queries, merge queries, understand the
applied steps section, add new columns, and how to use the data profiling tools to
better understand your data.
Power Query is a powerful tool used to connect to many different data sources and
transform the data into the shape you want. The scenarios outlined in this article are
examples to show you how you can use Power Query to transform raw data into
important actionable business insights.
Using the Applied Steps list
Article • 08/07/2023
The Applied steps list is part of the Query settings pane in Power Query. Any
transformation to your data is displayed in the Applied steps list. For instance, if you
change the first column name, the new column name is displayed in the Applied steps
list as Renamed columns.
Selecting any step displays the results of that particular step, so you can see exactly how
your data changes as you add steps to the query.
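Behind the list, each applied step is one named expression in the query's let block, and
each step normally builds on the one before it. A minimal sketch, using an illustrative
inline table as the source:

```powerquery
let
    // First entry in the Applied steps list.
    Source = #table({"Column1", "Country"}, {{"ALFKI", "Germany"}}),
    // Renaming the first column adds a step named Renamed columns.
    #"Renamed columns" = Table.RenameColumns(Source, {{"Column1", "CustomerID"}})
in
    // The query returns the result of the last step.
    #"Renamed columns"
```

Selecting a step in the list previews the value of that expression, which is why you can
inspect the data at any point in the sequence.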
If you're using Power Query Desktop (Excel, Power BI Desktop, Analysis Services) and the
Query Settings pane is closed, select the View tab from the ribbon, and then select
Query Settings.
The Query Settings pane then opens on the right side with the Applied Steps list.
If you're using Power Query Online (Power BI service, Power Apps, Data Factory
(preview), Microsoft 365 Customer Insights) and the Query settings pane is closed,
select the < icon above Query settings to open the pane.
The Query settings pane then opens on the right side with the Applied steps list.
The following image shows the different parts of the Applied steps list. Currently, not all
of these parts are found in Power Query Desktop. The Applied steps list in Power Query
Desktop only contains the delete step, step name, step description, and step settings
elements. The step icon and query folding indicator are found only in Power Query
Online.
In Power Query Online, if you hover the mouse cursor over one of the applied steps, an
informational display opens, listing the step name, step label, step description, step
settings, information about the step query folding, and a learn more link to the Query
folding indicators article. For more information about query folding, go to Power Query
query folding. The step label is automatically generated when the step is created, and
indicates the step type, as does the step icon. The step label and the step icon can't be
changed.
You can also edit the settings for any step that contains the step settings icon. The
icon appears in two places: next to the step in the Applied steps list, and in the
informational display for the step. Select the icon to display the settings page for that
particular step.
Rename step
To rename a step, right-click the step and select Rename.
Enter the name you want, and then either press Enter or select away from the step.
Delete step
To delete a step, right-click the step and select Delete.
Alternatively, select the X next to the step.
Delete until end
To delete a series of steps, right-click the first step of the series and select Delete until
end. This action deletes the selected step and all the subsequent steps.
Select Delete in the new window.
The following image shows the Applied steps list after using Delete until end.
Insert step after
To add a new step, right-click on the last step in the list and select Insert step after.
To insert a new intermediate step, right-click on a step and select Insert step after. Then
select Insert on the new window.
To set a transformation for the new step, select the new step in the list and make the
change to the data. It automatically links the transformation to the selected step.
Move step
To move a step up one position in the list, right-click the step and select Move before.
To move a step down one position in the list, right-click the step and select Move after.
Alternatively, or to move more than a single position, drag and drop the step to the
desired location.
Extract the previous steps into query
You can also separate a series of transformations into a different query. This allows the
query to be referenced for other sources, which can be helpful if you're trying to apply
the same transformation to multiple datasets. To extract all the previous steps into a
new query, right-click the first step you do not want to include in the query and select
Extract Previous.
Name the new query and select OK. To access the new query, navigate to the Queries
pane on the left side of the screen.
Global search box (Preview)
Article • 07/30/2022
The global search box offers you the ability to search for queries, actions, and get data
connectors.
The global search box is located at the top center of the Power Query editor. The search
box follows the same design principles that you find in Microsoft Search in Office, but
contextualized to Power Query.
Search results
To make use of the global search box, select the search box or press Alt + Q. Before you
enter anything, you'll be presented with some default options to choose from.
When you start entering something to search for, the results will be updated in real
time, displaying queries, actions, and get data connectors that match the text that
you've entered.
For scenarios where you'd like to see all available options for a given search query, you
can also select the See more results for option. This option is positioned as the last
result of the search box query when there are multiple matches to your query.
Overview of query evaluation and query
folding in Power Query
Article • 02/17/2023
This article provides a basic overview of how M queries are processed and turned into
data source requests.
Tip
You can think of the M script as a recipe that describes how to prepare your data.
The most common way to create an M script is by using the Power Query editor. For
example, when you connect to a data source, such as a SQL Server database, you'll
notice on the right-hand side of your screen that there's a section called applied steps.
This section displays all the steps or transforms used in your query. In this sense, the
Power Query editor serves as an interface to help you create the appropriate M script for
the transforms that you're after, and ensures that the code you use is valid.
Note
M script is used in the Power Query editor to:
Display the query as a series of steps and allow the creation or modification of
new steps.
Display a diagram view.
The previous image emphasizes the applied steps section, which contains the following
steps:
Source: Makes the connection to the data source. In this case, it's a connection to a
SQL Server database.
Navigation: Navigates to a specific table in the database.
Removed other columns: Selects which columns from the table to keep.
Sorted rows: Sorts the table using one or more columns.
Kept top rows: Filters the table to only keep a certain number of rows from the top
of the table.
This set of step names is a friendly way to view the M script that Power Query has
created for you. There are several ways to view the full M script. In Power Query, you can
select Advanced Editor in the View tab. You can also select Advanced Editor from the
Query group in the Home tab. In some versions of Power Query, you can also change
the view of the formula bar to show the query script by going into the View tab and
from the Layout group, select Script view > Query script.
Most of the names found in the Applied steps pane are also being used as is in the M
script. Steps of a query are named using something called identifiers in the M language.
Sometimes extra characters are wrapped around step names in M, but these characters
aren’t shown in the applied steps. An example is #"Kept top rows", which is categorized
as a quoted identifier because of these extra characters. A quoted identifier can be used
to allow any sequence of zero or more Unicode characters to be used as an identifier,
including keywords, whitespace, comments, operators, and punctuators. To learn more
about identifiers in the M language, go to lexical structure.
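The difference between a regular identifier and a quoted identifier can be seen in a small query. The following is a minimal sketch, not taken from the original article: the inline sample table and the step name SortedRows are hypothetical and only there so the query runs without a data source.

```powerquery
let
    // Hypothetical inline sample table so the sketch needs no data source
    Source = #table({"Sale Key", "Quantity"}, {{1, 10}, {2, 5}, {3, 7}}),
    // Regular identifier: no spaces or special characters, so no quoting is needed
    SortedRows = Table.Sort(Source, {{"Sale Key", Order.Descending}}),
    // Quoted identifier: the spaces in the step name require the #"..." wrapper
    #"Kept top rows" = Table.FirstN(SortedRows, 2)
in
    #"Kept top rows"
```

In the Applied steps list, the last step would be displayed as Kept top rows, without the # prefix or the quotation marks.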
Any changes that you make to your query through the Power Query editor will
automatically update the M script for your query. For example, using the previous image
as the starting point, if you change the Kept top rows step name to be Top 20 rows, this
change will automatically be updated in the script view.
While we recommend that you use the Power Query editor to create all or most of the
M script for you, you can manually add or modify pieces of your M script. To learn more
about the M language, go to the official docs site for the M language.
Note
M script, also referred to as M code, is a term used for any code that uses the M
language. In the context of this article, M script also refers to the code found inside
a Power Query query and accessible through the advanced editor window or
through the script view in the formula bar.
Note
While this example showcases a query with a SQL Database as a data source, the
concept applies to queries with or without a data source.
When Power Query reads your M script, it runs the script through an optimization
process to more efficiently evaluate your query. In this process, it determines which
steps (transforms) from your query can be offloaded to your data source. It also
determines which other steps need to be evaluated using the Power Query engine. This
optimization process is called query folding, where Power Query tries to push as much of
the execution as possible to the data source to optimize your query's execution.
Important
All rules from the Power Query M formula language (also known as the M
language) are followed. Most notably, lazy evaluation plays an important role
during the optimization process. In this process, Power Query understands what
specific transforms from your query need to be evaluated. Power Query also
understands what other transforms don't need to be evaluated because they're not
needed in the output of your query.
Furthermore, when multiple sources are involved, the data privacy level of each
data source is taken into consideration when evaluating the query. More
information: Behind the scenes of the Data Privacy Firewall
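Lazy evaluation can be illustrated with a query that contains a step nothing else references. The following is a minimal sketch with hypothetical step names. The Unused step would raise an error if it were evaluated, but because the output doesn't depend on it, the query is expected to return the sum without error.

```powerquery
let
    Source = {1, 2, 3},
    // This step would raise an error if evaluated, but no other step references it
    Unused = error "This step is never evaluated",
    // Only Source is needed to compute the output
    Total = List.Sum(Source)
in
    Total
```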
The following diagram demonstrates the steps that take place in this optimization
process.
1. The M script, found inside the advanced editor, is submitted to the Power Query
engine. Other important information is also supplied, such as credentials and data
source privacy levels.
2. The query folding mechanism submits metadata requests to the data source to
determine the capabilities of the data source, table schemas, relationships between
different entities at the data source, and more.
3. Based on the metadata received, the query folding mechanism determines what
information to extract from the data source and what set of transformations need
to happen inside the Power Query engine. It sends the instructions to two other
components that take care of retrieving the data from the data source and
transforming the incoming data in the Power Query engine if necessary.
4. Once the instructions have been received by the internal components of Power
Query, Power Query sends a request to the data source using a data source query.
5. The data source receives the request from Power Query and transfers the data to
the Power Query engine.
6. Once the data is inside Power Query, the transformation engine inside Power
Query (also known as mashup engine) does the transformations that couldn't be
folded back or offloaded to the data source.
7. The results derived from the previous point are loaded to a destination.
Note
Depending on the transformations and data source used in the M script, Power
Query determines if it will stream or buffer the incoming data.
The query folding mechanism offloads evaluation to the data source by translating your M script to a
language that can be interpreted and executed by your data source. It then pushes the
evaluation to your data source and sends the result of that evaluation to Power Query.
This operation often provides a much faster query execution than extracting all the
required data from your data source and running all transforms required in the Power
Query engine.
When you use the get data experience, Power Query guides you through the process
that ultimately lets you connect to your data source. When doing so, Power Query uses
a series of functions in the M language categorized as accessing data functions. These
specific functions use mechanisms and protocols to connect to your data source using a
language that your data source can understand.
However, the steps that follow in your query are the steps or transforms that the query
folding mechanism attempts to optimize. It then checks if they can be offloaded to your
data source instead of being processed using the Power Query engine.
Important
All data source functions, commonly shown as the Source step of a query, query
the data at the data source in its native language. The query folding mechanism is
applied to all transforms that follow your data source function so that they can be
translated and combined into a single data source query, or so that as many
transforms as possible can be offloaded to the data source.
Depending on how the query is structured, there could be three possible outcomes to
the query folding mechanism:
Full query folding: When all of your query transformations get pushed back to the
data source and minimal processing occurs at the Power Query engine.
Partial query folding: When only a few transformations in your query, and not all,
can be pushed back to the data source. In this case, only a subset of your
transformations is done at your data source and the rest of your query
transformations occur in the Power Query engine.
No query folding: When the query contains transformations that can't be
translated to the native query language of your data source, either because the
transformations aren't supported or the connector doesn't support query folding.
For this case, Power Query gets the raw data from your data source and uses the
Power Query engine to achieve the output you want by processing the required
transforms at the Power Query engine level.
Note
Leveraging a data source that has more processing resources and has query folding
capabilities can expedite your query loading times as the processing occurs at the
data source and not at the Power Query engine.
Next steps
For detailed examples of the three possible outcomes of the query folding mechanism,
go to Query folding examples.
For information about query folding indicators found in the Applied Steps pane, go to
Query folding indicators.
Power Query query folding
Article • 08/31/2022
This article targets data modelers developing models in Power Pivot or Power BI
Desktop. It describes what Power Query query folding is, and why it's important in your
data model designs. This article also describes the data sources and transformations that
can achieve query folding, and how to determine that your Power Query queries can be
folded—whether fully or partially.
Query folding is the ability for a Power Query query to generate a single query
statement to retrieve and transform source data. The Power Query mashup engine
strives to achieve query folding whenever possible for reasons of efficiency.
Query folding is an important topic for data modeling for several reasons:
Import model tables: Data refresh will take place efficiently for Import model
tables (Power Pivot or Power BI Desktop), in terms of resource utilization and
refresh duration.
DirectQuery and Dual storage mode tables: Each DirectQuery and Dual storage
mode table (Power BI only) must be based on a Power Query query that can be
folded.
Incremental refresh: Incremental data refresh (Power BI only) will be efficient, in
terms of resource utilization and refresh duration. In fact, the Power BI Incremental
Refresh configuration window will notify you of a warning should it determine that
query folding for the table can't be achieved. If it can't be achieved, the goal of
incremental refresh is defeated. The mashup engine would then be required to
retrieve all source rows, and then apply filters to determine incremental changes.
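For incremental refresh, the date filter is the step that needs to fold. The following is a minimal sketch under stated assumptions: Orders is a hypothetical table with a datetime OrderDate column, and ServerName and DatabaseName are hypothetical parameters. RangeStart and RangeEnd are the datetime parameters that Power BI incremental refresh relies on. They're defined inline here only to keep the sketch in one piece, whereas in a real model they're separate parameters.

```powerquery
let
    // Stand-ins for the RangeStart and RangeEnd parameters (assumed values)
    RangeStart = #datetime(2000, 1, 1, 0, 0, 0),
    RangeEnd = #datetime(2000, 2, 1, 0, 0, 0),
    // ServerName and DatabaseName are hypothetical parameters
    Source = Sql.Database(ServerName, DatabaseName),
    // Hypothetical table with a datetime OrderDate column
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // A filter on static or parameter values can typically fold to a WHERE clause
    #"Filtered rows" = Table.SelectRows(
        Orders,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    )
in
    #"Filtered rows"
```

If this filter folds, only the rows in the requested range need to leave the data source.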
Query folding may occur for an entire Power Query query, or for a subset of its steps.
When query folding cannot be achieved—either partially or fully—the Power Query
mashup engine must compensate by processing data transformations itself. This process
can involve retrieving source query results, which for large datasets is very resource
intensive and slow.
We recommend that you strive to achieve efficiency in your model designs by ensuring
query folding occurs whenever possible.
Generally, the following list describes transformations that can be query folded.
Removing columns.
Filtering rows, with static values or Power Query parameters (WHERE clause
predicates).
Expanding record columns (source foreign key columns) to achieve a join of two
source tables (JOIN clause).
Non-fuzzy merging of fold-able queries based on the same source (JOIN clause).
Appending fold-able queries based on the same source (UNION ALL operator).
Adding custom columns with simple logic (SELECT column expressions). Simple
logic implies uncomplicated operations, possibly including the use of M functions
that have equivalent functions in the SQL data source, like mathematic or text
manipulation functions. For example, the following expression returns the year
component of the OrderDate column value (to return a numeric value).
Power Query M
Date.Year([OrderDate])
In contrast, adding custom columns with complex logic generally prevents query folding.
Complex logic implies the use of M functions that have no equivalent functions in the
data source. For example, the following expression formats the OrderDate column value
(to return a text value).
Power Query M
Date.ToText([OrderDate], "yyyy")
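To show where these two expressions would sit in a full query, the following is a minimal sketch. The Orders table, its schema, the new column names, and the ServerName and DatabaseName parameters are all hypothetical. Whether each step actually folds depends on the connector and data source.

```powerquery
let
    // ServerName and DatabaseName are hypothetical parameters
    Source = Sql.Database(ServerName, DatabaseName),
    // Hypothetical table that contains an OrderDate column
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // Simple logic: Date.Year has an equivalent in the SQL data source
    #"Added order year" = Table.AddColumn(
        Orders, "Order Year", each Date.Year([OrderDate]), Int64.Type
    ),
    // Complex logic: the formatting call is described above as having no source equivalent
    #"Added year text" = Table.AddColumn(
        #"Added order year", "Order Year Text", each Date.ToText([OrderDate], "yyyy"), type text
    )
in
    #"Added year text"
```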
Note that when a Power Query query encompasses multiple data sources,
incompatibility of data source privacy levels can prevent query folding from taking
place. For more information, see the Power BI Desktop privacy levels article.
The View Native Query option is only available for certain relational DB/SQL
generating connectors. It doesn't work for OData based connectors, for example,
even though there is folding occurring on the backend. The Query Diagnostics
feature is the best way to see what folding has occurred for non-SQL connectors
(although the steps that fold aren't explicitly called out—you just see the resulting
URL that was generated).
To view the folded query, select the View Native Query option. You're then
presented with the native query that Power Query will use to source data.
If the View Native Query option isn't enabled (greyed out), this is evidence that not all
query steps can be folded. However, it could mean that a subset of steps can still be
folded. Working backwards from the last step, you can check each step to see if the
View Native Query option is enabled. If so, then you've learned where, in the sequence
of steps, that query folding could no longer be achieved.
Next steps
For more information about query folding and related articles, check out the following
resources:
This article provides some example scenarios for each of the three possible outcomes
for query folding. It also includes some suggestions on how to get the most out of the
query folding mechanism, and the effect that it can have in your queries.
The scenario
Imagine a scenario where, using the Wide World Importers database for Azure Synapse
Analytics SQL database, you're tasked with creating a query in Power Query that
connects to the fact_Sale table and retrieves the last 10 sales with only the following
fields:
Sale Key
Customer Key
Invoice Date Key
Description
Quantity
Note
For demonstration purposes, this article uses the database outlined in the tutorial
on loading the Wide World Importers database into Azure Synapse Analytics. The
main difference in this article is the fact_Sale table only holds data for the year
2000, with a total of 3,644,356 rows.
While the results might not exactly match the results that you get by following the
tutorial from the Azure Synapse Analytics documentation, the goal of this article is
to showcase the core concepts and impact that query folding can have in your
queries.
This article showcases three ways to achieve the same output with different levels of
query folding:
No query folding
Partial query folding
Full query folding
Important
Queries that rely solely on unstructured data sources or that don't have a compute
engine, such as CSV or Excel files, don't have query folding capabilities. This means
that Power Query evaluates all the required data transformations using the Power
Query engine.
No query folding example
After connecting to your database and navigating to the fact_Sale table, you select the
Keep bottom rows transform found inside the Reduce rows group of the Home tab.
After selecting this transform, a new dialog appears. In this new dialog, you can enter
the number of rows that you'd like to keep. For this case, enter the value 10, and then
select OK.
Tip
For this case, performing this operation yields the result of the last ten sales. In
most scenarios, we recommend that you provide a more explicit logic that defines
which rows are considered last by applying a sort operation on the table.
Next, select the Choose columns transform found inside the Manage columns group of
the Home tab. You can then select the columns you want to keep from your table and
remove the rest.
Lastly, inside the Choose columns dialog, select the Sale Key, Customer Key, Invoice
Date Key, Description, and Quantity columns, and then select OK.
The following code sample is the full M script for the query you created:
Power Query M
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    // Keep the last 10 rows of the table
    #"Kept bottom rows" = Table.LastN(Navigation, 10),
    // Keep only the five required columns
    #"Choose columns" = Table.SelectColumns(#"Kept bottom rows", {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"})
in
    #"Choose columns"
Each box in the previous image is called a node. A node represents the operation
breakdown to fulfill this query. Nodes that represent data sources, such as SQL Server in
the example above and the Value.NativeQuery node, represent which part of the query
is offloaded to the data source. The rest of the nodes, in this case Table.LastN and
Table.SelectColumns highlighted in the rectangle in the previous image, are evaluated
by the Power Query engine. These two nodes represent the two transforms that you
added, Kept bottom rows and Choose columns. The rest of the nodes represent
operations that happen at your data source level.
To see the exact request that is sent to your data source, select View details in the
Value.NativeQuery node.
This data source request is in the native language of your data source. For this case, that
language is SQL and this statement represents a request for all the rows and fields from
the fact_Sale table.
Consulting this data source request can help you better understand the story that the
query plan tries to convey:
Sql.Database: This node represents the data source access. Connects to the
database and sends metadata requests to understand its capabilities.
Value.NativeQuery: Represents the request that was generated by Power Query to
fulfill the query. Power Query submits the data requests in a native SQL statement
to the data source. In this case, that represents all records and fields (columns)
from the fact_Sale table. For this scenario, this case is undesirable, as the table
contains millions of rows and the interest is only in the last 10.
Table.LastN: Once Power Query receives all records from the fact_Sale table, it
uses the Power Query engine to filter the table and keep only the last 10 rows.
Table.SelectColumns: Power Query will use the output of the Table.LastN node
and apply a new transform called Table.SelectColumns, which selects the specific
columns that you want to keep from a table.
For its evaluation, this query had to download all rows and fields from the fact_Sale
table. This query took an average of 6 minutes and 1 second to be processed in a
standard instance of Power BI dataflows (which accounts for the evaluation and loading
of data to dataflows).
Partial query folding example
Inside the Choose columns dialog, select the Sale Key, Customer Key, Invoice Date
Key, Description, and Quantity columns and then select OK.
You now create logic that will sort the table to have the last sales at the bottom of the
table. Select the Sale Key column, which is the primary key and incremental sequence
or index of the table. Sort the table using only this field in ascending order from the
context menu for the column.
Next, select the table contextual menu and choose the Keep bottom rows transform.
In Keep bottom rows, enter the value 10, and then select OK.
The following code sample is the full M script for the query you created:
Power Query M
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    // Keep only the five required columns
    #"Choose columns" = Table.SelectColumns(Navigation, {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"}),
    // Sort ascending so that the last sales are at the bottom of the table
    #"Sorted rows" = Table.Sort(#"Choose columns", {{"Sale Key", Order.Ascending}}),
    // Keep the last 10 rows of the sorted table
    #"Kept bottom rows" = Table.LastN(#"Sorted rows", 10)
in
    #"Kept bottom rows"
You can right-click the last step of your query, the one named Kept bottom rows, and
select the Query plan option to better understand how your query might be evaluated.
Each box in the previous image is called a node. A node represents every process that
needs to happen (from left to right) in order for your query to be evaluated. Some of
these nodes can be evaluated at your data source while others, like the node for
Table.LastN, represented by the Kept bottom rows step, are evaluated using the Power
Query engine.
To see the exact request that is sent to your data source, select View details in the
Value.NativeQuery node.
This request is in the native language of your data source. For this case, that language is
SQL and this statement represents a request for all the rows, with only the requested
fields from the fact_Sale table ordered by the Sale Key field.
Consulting this data source request can help you better understand the story that the
full query plan tries to convey. The order of the nodes is a sequential process that starts
by requesting the data from your data source:
For its evaluation, this query had to download all rows and only the required fields from
the fact_Sale table. It took an average of 3 minutes and 4 seconds to be processed in a
standard instance of Power BI dataflows (which accounts for the evaluation and loading
of data to dataflows).
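One way to picture this split is to write the folded part as an explicit native query and leave the rest as a local step. The following is only a rough sketch. The SQL text is an assumption that approximates the request described above, and the statement that Power Query actually generates may differ. The step name Folded is hypothetical.

```powerquery
let
    Source = Sql.Database(ServerName, DatabaseName),
    // Assumed approximation of the folded part: column selection and sort run at the source
    Folded = Value.NativeQuery(
        Source,
        "select [Sale Key], [Customer Key], [Invoice Date Key], [Description], [Quantity] "
            & "from [wwi].[fact_Sale] "
            & "order by [Sale Key]"
    ),
    // Not folded: the Power Query engine receives every row and keeps the last 10
    #"Kept bottom rows" = Table.LastN(Folded, 10)
in
    #"Kept bottom rows"
```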
Full query folding example
After connecting to the database and navigating to the fact_Sale table, start by
selecting the columns that you want to keep from your table. Select the Choose
columns transform found inside the Manage columns group from the Home tab. This
transform helps you to explicitly select the columns that you want to keep from your
table and remove the rest.
In Choose columns, select the Sale Key, Customer Key, Invoice Date Key, Description,
and Quantity columns, and then select OK.
You now create logic that will sort the table to have the last sales at the top of the table.
Select the Sale Key column, which is the primary key and incremental sequence or
index of the table. Sort the table only using this field in descending order from the
context menu for the column.
Next, select the table contextual menu and choose the Keep top rows transform.
In Keep top rows, enter the value 10, and then select OK.
The following code sample is the full M script for the query you created:
Power Query M
let
    Source = Sql.Database(ServerName, DatabaseName),
    Navigation = Source{[Schema = "wwi", Item = "fact_Sale"]}[Data],
    // Keep only the five required columns
    #"Choose columns" = Table.SelectColumns(Navigation, {"Sale Key", "Customer Key", "Invoice Date Key", "Description", "Quantity"}),
    // Sort descending so that the last sales are at the top of the table
    #"Sorted rows" = Table.Sort(#"Choose columns", {{"Sale Key", Order.Descending}}),
    // Keep the first 10 rows of the sorted table
    #"Kept top rows" = Table.FirstN(#"Sorted rows", 10)
in
    #"Kept top rows"
You can right-click the last step of your query, the one named Kept top rows, and select
the option that reads Query plan.
This request is in the native language of your data source. For this case, that language is
SQL and this statement represents a request for only the top 10 rows, with only the
requested fields, from the fact_Sale table sorted in descending order by the Sale Key field.
Consulting this data source query can help you better understand the story that the full
query plan tries to convey:
Note
While there's no clause that can be used to SELECT the bottom rows of a table in
the T-SQL language, there's a TOP clause that retrieves the top rows of a table.
For its evaluation, this query only downloads 10 rows, with only the fields that you
requested from the fact_Sale table. This query took an average of 31 seconds to be
processed in a standard instance of Power BI dataflows (which accounts for the
evaluation and loading of data to dataflows).
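Written as an explicit native query, the fully folded request might look roughly like the following. This is only a sketch. The SQL text is an assumption based on the description above (required fields, TOP 10, descending sort on Sale Key), and the statement that Power Query actually generates may differ. The step name Folded is hypothetical.

```powerquery
let
    Source = Sql.Database(ServerName, DatabaseName),
    // Assumed approximation of the single request: selection, sort, and TOP all run at the source
    Folded = Value.NativeQuery(
        Source,
        "select top 10 [Sale Key], [Customer Key], [Invoice Date Key], [Description], [Quantity] "
            & "from [wwi].[fact_Sale] "
            & "order by [Sale Key] desc"
    )
in
    Folded
```

No local step follows the request, which matches the idea that the Power Query engine only receives the 10 rows it needs.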
Performance comparison
To better understand the effect that query folding has in these queries, you can refresh
your queries, record the time it takes to fully refresh each query, and compare them. For
simplicity, this article provides the average refresh timings captured using the Power BI
dataflows refresh mechanic while connecting to a dedicated Azure Synapse Analytics
environment with DW2000c as the service level.
It's often the case that a query that fully folds back to the data source outperforms
similar queries that don't completely fold back to the data source. There could be many
reasons why this is the case. These reasons range from the complexity of the transforms
that your query performs, to the query optimizations implemented at your data source,
such as indexes and dedicated computing, and network resources. Still, query folding
tries to minimize the effect that two specific processes have on Power Query:
Data in transit
Transforms executed by the Power Query engine
The following sections explain the effect that these two processes have in the previously
mentioned queries.
Data in transit
When a query gets executed, it tries to fetch the data from the data source as one of its
first steps. What data is fetched from the data source is defined by the query folding
mechanism. This mechanism identifies the steps from the query that can be offloaded to
the data source.
The following table lists the number of rows requested from the fact_Sale table of the
database. The table also includes a brief description of the SQL statement sent to
request such data from the data source.
Example | Label | Rows requested | Description
No query folding | None | 3644356 | Request for all fields and all records from the fact_Sale table
Partial query folding | Partial | 3644356 | Request for all records, but only required fields from the fact_Sale table after it was sorted by the Sale Key field
Full query folding | Full | 10 | Request for only the required fields and the TOP 10 records of the fact_Sale table after being sorted in descending order by the Sale Key field
When requesting data from a data source, the data source needs to compute the results
for the request and then send the data to the requestor. While the computing resources
have already been mentioned, moving the data over the network from the data source to
Power Query, and then having Power Query receive the data and prepare it for the
transforms that happen locally, can also take some time depending on the size of the data.
For the showcased examples, Power Query had to request over 3.6 million rows from the
data source for the no query folding and partial query folding examples. For the full
query folding example, it only requested 10 rows. For the fields requested, the no query
folding example requested all the available fields from the table. Both the partial query
folding and the full query folding examples only submitted a request for exactly the
fields that they needed.
Transforms executed by the Power Query engine
The following table showcases the nodes from the query plans of the previous queries
that would have been evaluated by the Power Query engine.
For the examples showcased in this article, the full query folding example doesn't
require any transforms to happen inside the Power Query engine as the required output
table comes directly from the data source. In contrast, the other two queries required
some computation to happen at the Power Query engine. Because of the amount of
data that needs to be processed by these two queries, the process for these examples
takes more time than the full query folding example.
Type of operator | Description
Remote | Operators that are data source nodes. The evaluation of these operators occurs outside of Power Query.
Streaming | Operators are pass-through operators. For example, Table.SelectRows with a simple filter can usually filter the results as they pass through the operator, and won’t need to gather all rows before moving the data. Table.SelectColumns and Table.ReorderColumns are other examples of these sort of operators.
Full scan | Operators that need to gather all the rows before the data can move on to the next operator in the chain. For example, to sort data, Power Query needs to gather all the data. Other examples of full scan operators are Table.Group, Table.NestedJoin, and Table.Pivot.
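The difference between streaming and full scan operators can be seen in a small query evaluated entirely by the Power Query engine. The following is a minimal sketch. The inline sample table and the step names are hypothetical, and because there's no data source, no remote operator is involved.

```powerquery
let
    // Hypothetical inline sample table so the sketch needs no data source
    Source = #table(
        {"Sale Key", "Quantity"},
        {{3, 7}, {1, 10}, {2, 5}, {4, 1}}
    ),
    // Streaming: each row can be tested and passed along as it arrives
    #"Filtered rows" = Table.SelectRows(Source, each [Quantity] > 2),
    // Streaming: dropping columns doesn't require seeing any other row
    #"Chose columns" = Table.SelectColumns(#"Filtered rows", {"Sale Key"}),
    // Full scan: sorting needs every row before the first sorted row can be returned
    #"Sorted rows" = Table.Sort(#"Chose columns", {{"Sale Key", Order.Ascending}})
in
    #"Sorted rows"
```

Placing the filter before the sort means that the full scan operator has fewer rows to gather.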
Tip
While not every transform is the same from a performance standpoint, in most
cases, having fewer transforms is usually better.
Note
Before reading this article, we recommend that you read Overview of query
evaluation and query folding in Power Query to better understand how folding
works in Power Query.
Query folding indicators help you understand the steps that fold or don't fold.
With query folding indicators, it becomes obvious when you make a change that breaks
folding. This feature helps you to more easily resolve issues quickly, avoid performance
issues in the first place, and have better insight into your queries. In most cases you run
into, steps will fold or won't fold. But there are many cases where the outcome isn't as
obvious, and these cases are discussed in Step diagnostics indicators (Dynamic, Opaque,
and Unknown).
Note
The query folding indicators feature is available only for Power Query Online.
This interpretation works even with a simple query against a SQL source. For example,
using the AdventureWorks sample database, connect to the Production.Product table
and load data. Loading this sample through the Power Query navigator gives the
following query:
Power Query M
let
Source = Sql.Database("ServerName", "AdventureWorks"),
Navigation = Source{[Schema = "Production", Item = "Product"]}[Data]
in
Navigation
If you examine how this code shows up in query folding indicators, you'll note that the
first step is inconclusive. But the second step does fold, which means that the query up
to that point does fold.
In this example, the initial steps can't be confirmed to fold (they're inconclusive), but the
final step generated when you initially load data does fold. How the first steps (Source, and
sometimes other Navigation steps) are handled depends on the connector. With SQL,
for example, it's handled as a catalog table value, which doesn't fold. However, as soon
as you select data for that connector, it will fold.
Conversely, your query might fold up to a point and then stop folding. A folding indicator
on a step means that everything up to that step folds. A not-folding indicator, however,
doesn't mean that nothing folds. Instead, it means that not everything folds. Generally,
everything up to the last folding indicator will fold, with more operations happening
after.
Modifying the previous example, you can add a transform that never folds: Capitalize
Each Word.
Power Query M
let
    Source = Sql.Database("ServerName", "AdventureWorks"),
    Navigation = Source{[Schema = "Production", Item = "Product"]}[Data],
    // Per the text, this transform never folds, so it's evaluated locally
    #"Capitalized each word" = Table.TransformColumns(
        Navigation,
        {{"Name", each Text.Proper(_), type text}}
    )
in
    #"Capitalized each word"
In the query folding indicators, you have the same indicators as above, except the final
step doesn't fold. Everything up to this final step will be performed on the data source,
while the final step will be performed locally.
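One way to keep as much work as possible on the data source is to place steps that can fold
ahead of the step that doesn't. The following is a minimal sketch, assuming that the
Production.Product table exposes a ListPrice column and that a simple comparison filter folds
for this connector. The filter step and its threshold are hypothetical additions to the
example.
Power Query M
let
    Source = Sql.Database("ServerName", "AdventureWorks"),
    Navigation = Source{[Schema = "Production", Item = "Product"]}[Data],
    // Assumption: ListPrice exists and this comparison can be translated by the source
    #"Filtered rows" = Table.SelectRows(Navigation, each [ListPrice] > 100),
    // Non-folding step placed last, so it only runs locally on the filtered rows
    #"Capitalized each word" = Table.TransformColumns(
        #"Filtered rows",
        {{"Name", each Text.Proper(_), type text}}
    )
in
    #"Capitalized each word"
With this ordering, the folding indicators would be expected to mark the filter as folding and
only the last step as not folding. Check the indicators in your own environment to confirm.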
Step diagnostics indicators
Query folding indicators rely on an underlying query plan to get the information about
the query that they report. Currently the query plan only supports tables,
so some cases (lists, records, primitives) won't report as folding or not. Similarly,
constant tables report as opaque.
Folding: The folding indicator tells you that the query up to this step will be evaluated
by the data source.
Not folding: The not-folding indicator tells you that some part of the query up to this
step will be evaluated outside the data source. You can compare it with the last
folding indicator, if there is one, to see if you can rearrange your query to be
more performant.
Might fold: Might fold indicators are uncommon. They mean that a query "might" fold.
They indicate that whether the query folds will be determined at runtime,
when pulling results from the query, and that the query plan is dynamic. These
indicators will likely only appear with ODBC or OData connections.
Opaque: Opaque indicators tell you that the resulting query plan is inconclusive for
some reason. It generally indicates that there's a true "constant" table, or that
the transform or connector isn't supported by the indicators and query plan
tool.
Example analysis
For an example analysis, start by connecting to the Production.Product table in
Adventure Works (SQL). The initial load, similar to the example above, looks like the
following image.
Adding more steps that fold will extend that green line on the right side. This extension
occurs because this step also folds.
Adding a step that doesn't fold displays a different indicator. For example, Capitalize
each word never folds. The indicator changes, showing that as of this step, it's stopped
folding. As mentioned earlier, the previous steps will still fold.
Steps that you add downstream and that depend on Capitalize each word also won't
fold.
However, if you remove the column you applied the capitalization to so that the
optimized query plan can all fold once more, you'll get a result like the following image.
However, something like this is uncommon. This image illustrates how it's not just the
order of steps, but the actual transformations that apply as well.
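The column-removal case can be sketched in M as follows. This is an illustration only: the
#"Removed columns" step name is an assumption, and whether the optimized plan actually folds
again depends on the connector and on the rest of the query.
Power Query M
let
    Source = Sql.Database("ServerName", "AdventureWorks"),
    Navigation = Source{[Schema = "Production", Item = "Product"]}[Data],
    #"Capitalized each word" = Table.TransformColumns(
        Navigation,
        {{"Name", each Text.Proper(_), type text}}
    ),
    // Dropping the capitalized column means its transform no longer affects the output
    #"Removed columns" = Table.RemoveColumns(#"Capitalized each word", {"Name"})
in
    #"Removed columns"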
Query plan for Power Query (Preview)
Article • 02/17/2023
Query plan for Power Query is a feature that provides a better view of your query's
evaluation. It's useful to help determine why a particular query might not fold at a
particular step.
Through a practical example, this article will demonstrate the main use case and
potential benefits of using the query plan feature to review your query steps. The
examples used in this article have been created using the AdventureWorksLT sample
database for Azure SQL Server, which you can download from AdventureWorks sample
databases.
Note
The query plan feature for Power Query is only available in Power Query Online.
This article is divided into a series of recommended steps for interpreting the
query plan.
Use the following steps to create the query in your own Power Query Online
environment.
Power Query M
let
    Source = Sql.Database("servername", "database"),
    Navigation = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data],
    #"Removed other columns" = Table.SelectColumns(
        Navigation,
        {"SalesOrderID", "OrderDate", "SalesOrderNumber",
        "PurchaseOrderNumber", "AccountNumber", "CustomerID", "TotalDue"}
    ),
    #"Filtered rows" = Table.SelectRows(#"Removed other columns", each [TotalDue] > 1000),
    #"Kept bottom rows" = Table.LastN(#"Filtered rows", 5)
in
    #"Kept bottom rows"
3. Replace servername and database with the correct names for your own
environment.
5. Select Next.
6. In the Power Query Editor, select Configure connection and provide the
credentials to your data source.
After following these steps, your query will look like the one in the following image.
This query connects to the SalesOrderHeader table, and selects a few columns from the
last five orders with a TotalDue value above 1000.
Note
This article uses a simplified example to showcase this feature, but the concepts
described in this article apply to all queries. We recommend that you have a good
knowledge of query folding before reading the query plan. To learn more about
query folding, go to Query folding basics.
Note
Before reading this section, we recommend that you review the article on Query
folding indicators.
Your first step in this process is to review your query and pay close attention to the
query folding indicators. The goal is to review the steps that are marked as not folded.
Then you can see if making changes to the overall query could make those
transformations fold completely.
For this example, the only step that can't be folded is Kept bottom rows, which is easy
to identify through the not folded step indicator. This step is also the last step of the
query.
The goal now is to review this step and understand what's being folded back to the data
source and what can't be folded.
Power Query tries to optimize your query by taking advantage of lazy evaluation and
query folding, as mentioned in Query folding basics. This query plan represents the
optimized translation of your M query into the native query that's sent to the data
source. It also includes any transforms that are performed by the Power Query Engine.
The order in which the nodes appear follows the order of your query starting from the
last step or output of your query, which is represented on the far left of the diagram and
in this case is the Table.LastN node that represents the Kept bottom rows step.
At the bottom of the dialog, there's a bar with icons that help you zoom in or out of the
query plan view, and other buttons to help you manage the view. For the previous
image, the Fit to view option from this bar was used to better appreciate the nodes.
Note
The query plan represents the optimized plan. When the engine is evaluating a
query, it tries to fold all operators into a data source. In some cases, it might even
do some internal reordering of the steps to maximize folding. With this in mind, the
nodes/operators left in this optimized query plan typically contain the "folded" data
source query and any operators that couldn't be folded and are evaluated locally.
Folded nodes: These nodes can be either Value.NativeQuery or "data source" nodes
such as Sql.Database . These can also be identified with the label remote under
their function name.
Non-folded nodes: Other table operators, such as Table.SelectRows ,
Table.SelectColumns , and other functions that couldn't be folded. These can also
The following image shows the folded nodes inside the red rectangle. The rest of the
nodes couldn't be folded back to the data source. You'll need to review the rest of the
nodes since the goal is to attempt to have those nodes fold back to the data source.
You can select View details at the bottom of some nodes to display extended
information. For example, the details of the Value.NativeQuery node show the native
query (in SQL) that will be sent to the data source.
The query shown here might not be exactly the same query sent to the data source, but
it's a good approximation. For this case, it tells you exactly what columns will be queried
from the SalesOrderHeader table and then how it will filter that table using the TotalDue
field to only get rows where the value for that field is larger than 1000. The node next to
it, Table.LastN, is calculated locally by the Power Query engine, as it can't be folded.
Note
The operators might not exactly match the functions used in the query's script.
The goal is to apply changes to your query so that the step can be folded. Some of the
changes you might implement could range from rearranging your steps to applying an
alternative logic to your query that's more explicit to the data source. This doesn't mean
that all queries and all operations are foldable by applying some changes. But it's a
good practice to determine through trial and error if your query could be folded back.
Since the data source is a SQL Server database, if the goal is to retrieve the last five
orders from the table, then a good alternative would be to take advantage of the TOP
and ORDER BY clauses in SQL. Since there's no BOTTOM clause in SQL, the Table.LastN
transform in Power Query can't be translated into SQL. You could remove the
Table.LastN step and replace it with:
A sort descending step by the SalesOrderID column in the table, since this column
determines which order goes first and which has been entered last.
A select top five rows step. Because the table has been sorted, this transform
accomplishes the same as a Kept bottom rows ( Table.LastN ) step.
This alternative is equivalent to the original query. While this alternative in theory seems
good, you need to make the changes to see if this alternative will make this node fully
fold back to the data source.
1. Close the query plan dialog and go back to the Power Query Editor.
4. Select the table icon on the top-left corner of the data preview view and select the
option that reads Keep top rows. In the dialog, pass the number five as the
argument and hit OK.
After implementing the changes, check the query folding indicators again and see if it's
giving you a folded indicator.
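In M, the rewritten query might look like the following sketch. The step names
#"Sorted rows" and #"Kept top rows" are assumptions about what the editor generates, and
servername and database are the same placeholders as in the original query.
Power Query M
let
    Source = Sql.Database("servername", "database"),
    Navigation = Source{[Schema = "Sales", Item = "SalesOrderHeader"]}[Data],
    #"Removed other columns" = Table.SelectColumns(
        Navigation,
        {"SalesOrderID", "OrderDate", "SalesOrderNumber",
        "PurchaseOrderNumber", "AccountNumber", "CustomerID", "TotalDue"}
    ),
    #"Filtered rows" = Table.SelectRows(#"Removed other columns", each [TotalDue] > 1000),
    // Replaces Table.LastN: sort descending, then keep the first five rows
    #"Sorted rows" = Table.Sort(#"Filtered rows", {{"SalesOrderID", Order.Descending}}),
    #"Kept top rows" = Table.FirstN(#"Sorted rows", 5)
in
    #"Kept top rows"
Because of the descending sort, this version lists the five rows with the highest SalesOrderID
values first.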
Now it's time to review the query plan of the last step, which is now Keep top rows.
Now there are only folded nodes. Select View details under Value.NativeQuery to verify
which query is being sent to the database.
While this article is suggesting what alternative to apply, the main goal is for you to
learn how to use the query plan to investigate query folding. This article also provides
visibility of what's being sent to your data source and what transforms will be done
locally.
You can adjust your code to see the impact that it has in your query. By using the query
folding indicators, you'll also have a better idea of which steps are preventing your
query from folding.
Query folding on native queries
Article • 02/17/2023
In Power Query, you're able to define a native query and run it against your data source.
The Import data from a database using native database query article explains how to do
this process with multiple data sources. But, by using the process described in that
article, your query won't take advantage of any query folding from subsequent query
steps.
This article showcases an alternative method to create native queries against your data
source using the Value.NativeQuery function and keep the query folding mechanism
active for subsequent steps of your query.
Note
We recommend that you read the documentation on query folding and the query
folding indicators to better understand the concepts used throughout this article.
Amazon Redshift
Dataverse (when using enhanced compute)
Google BigQuery
PostgreSQL
SAP HANA
Snowflake
SQL Server
Note
To showcase this process, this article uses the SQL Server connector and the
AdventureWorks2019 sample database. The experience may vary from connector
to connector, but this article showcases the fundamentals on how to enable query
folding capabilities over native queries for the supported connectors.
When connecting to the data source, it's important that you connect to the node or
level where you want to execute your native query. For the example in this article, that
node will be the database level inside the server.
After defining the connection settings and supplying the credentials for your
connection, you'll be taken to the navigation dialog for your data source. In that dialog,
you'll see all the available objects that you can connect to.
From this list, you need to select the object where the native query is run (also known as
the target). For this example, that object is the database level.
In the navigator window in Power Query, right-click the database node
and select the Transform Data option. Selecting this option creates a new query
of the overall view of your database, which is the target you need to run your native
query.
Once your query lands in the Power Query editor, only the Source step should show in
the Applied steps pane. This step contains a table with all the available objects in your
database, similar to how they were displayed in the Navigator window.
SQL
The first step was to define the correct target, which in this case is the database where
the SQL code will be run. Once a step has the correct target, you can select that step—in
this case, Source in Applied Steps—and then select the fx button in the formula bar to
add a custom step. In this example, replace the Source formula with the following
formula:
Power Query M
The most important component of this formula is the optional record passed as the
fourth parameter of the function, which has the EnableFolding record field set to true.
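A minimal sketch of such a formula follows, written as a complete query. The SQL text is an
assumption chosen to be consistent with the rest of this article (the AdventureWorks2019
sample database and a DepartmentID column), not necessarily the exact statement used
originally. ServerName is a placeholder.
Power Query M
let
    // Target: the database level, as selected in the navigator
    Source = Sql.Database("ServerName", "AdventureWorks2019"),
    // Third argument (parameters) is null; the fourth is the options record
    NativeQuery = Value.NativeQuery(
        Source,
        "SELECT DepartmentID, Name FROM HumanResources.Department WHERE GroupName = 'Research and Development'",
        null,
        [EnableFolding = true]
    )
in
    NativeQuery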
Note
You can read more about the Value.NativeQuery function from the official
documentation article.
After you have entered the formula, a warning will be shown that will require you to
enable native queries to run for your specific step. You can click continue for this step to
be evaluated.
This SQL statement yields a table with only three rows and two columns.
Test query folding
To test the query folding of your query, you can try to apply a filter to any of your
columns and see if the query folding indicator in the applied steps section shows the
step as folded. For this case, you can filter the DepartmentID column to have values that
are not equal to two.
After adding this filter, you can check that the query folding indicators still show the
query folding happening at this new step.
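The filter corresponds to a Table.SelectRows step added after the native query. The following
sketch assumes a native query that returns a DepartmentID column; the SQL text and ServerName
are illustrative assumptions.
Power Query M
let
    Source = Value.NativeQuery(
        Sql.Database("ServerName", "AdventureWorks2019"),
        "SELECT DepartmentID, Name FROM HumanResources.Department WHERE GroupName = 'Research and Development'",
        null,
        [EnableFolding = true]
    ),
    // Keep rows whose DepartmentID is not equal to two
    #"Filtered rows" = Table.SelectRows(Source, each [DepartmentID] <> 2)
in
    #"Filtered rows"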
To further validate what query is being sent to the data source, you can right-click the
Filtered rows step and select the option that reads View query plan to check the query
plan for that step.
In the query plan view, you can see a node with the name Value.NativeQuery at the
left side of the screen that has hyperlink text that reads View details. You can select this
hyperlink text to view the exact query that is being sent to the SQL Server database.
The native query is wrapped in another SELECT statement to create a subquery of
the original. Power Query will do its best to create the most optimal query given the
transforms used and the native query provided.
Tip
For scenarios where you get errors because query folding wasn't possible, it is
recommended that you try validating your steps as a subquery of your original
native query to check if there might be any syntax or context conflicts.
Using the data profiling tools
Article • 08/14/2023
The data profiling tools provide new and intuitive ways to clean, transform, and
understand data in Power Query Editor. They include:
Column quality
Column distribution
Column profile
To enable the data profiling tools, go to the View tab on the ribbon. In Power Query
Desktop, enable the options you want in the Data preview group, as shown in the
following image.
In Power Query Online, select Data view, then enable the options you want in the drop-
down list.
After you enable the options, you'll see something like the following image in Power
Query Editor.
Note
By default, Power Query performs this data profiling over the first 1,000 rows of
your data. To have it operate over the entire dataset, select the Column profiling
based on top 1000 rows message in the lower-left corner of your editor window to
change column profiling to Column profiling based on entire dataset.
Column quality
The column quality feature labels values in rows in five categories:
Unknown, shown in dashed green. Indicates that when there are errors in a column, the
quality of the remaining data is unknown.
These indicators are displayed directly underneath the name of the column as part of a
small bar chart, as shown in the following image.
The number of records in each column quality category is also displayed as a
percentage.
By hovering over any of the columns, you are presented with the numerical distribution
of the quality of values throughout the column. Additionally, selecting the ellipsis button
(...) opens some quick action buttons for operations on the values.
Column distribution
This feature provides a set of visuals underneath the names of the columns that
showcase the frequency and distribution of the values in each of the columns. The data
in these visualizations is sorted in descending order from the value with the highest
frequency.
By hovering over the distribution data in any of the columns, you get information about
the overall data in the column (with distinct count and unique values). You can also
select the ellipsis button and choose from a menu of available operations.
Column profile
This feature provides a more in-depth look at the data in a column. Apart from the
column distribution chart, it contains a column statistics chart. This information is
displayed underneath the data preview section, as shown in the following image.
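Outside the editor UI, similar per-column statistics can be computed in M with the
Table.Profile function. The following is a minimal sketch on a hypothetical inline table. The
values it returns aren't guaranteed to match the column statistics chart exactly.
Power Query M
let
    // Hypothetical sample data, including one null value
    Source = #table(
        type table [Product = text, Price = number],
        {{"Bike", 100}, {"Helmet", 25}, {"Bike", null}}
    ),
    // One row per column, with statistics such as Min, Max, Count, NullCount, and DistinctCount
    Profile = Table.Profile(Source)
in
    Profile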
Filter by value
You can interact with the value distribution chart on the right side and select any of the
bars by hovering over the parts of the chart.
Copy data
In the upper-right corner of both the column statistics and value distribution sections,
you can select the ellipsis button (...) to display a Copy shortcut menu. Select it to copy
the data displayed in either section to the clipboard.
Group by value
When you select the ellipsis button (...) in the upper-right corner of the value
distribution chart, in addition to Copy you can select Group by. This feature groups the
values in your chart by a set of available options.
The image below shows a column of product names that have been grouped by text
length. After the values have been grouped in the chart, you can interact with individual
values in the chart as described in Filter by value.
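A comparable grouping by text length can be expressed in M. This sketch uses a hypothetical
inline table of product names and isn't what the chart runs internally.
Power Query M
let
    Source = #table(type table [Name = text], {{"Chain"}, {"Helmet"}, {"Pedal"}}),
    // Add the length of each product name
    #"Added length" = Table.AddColumn(Source, "Length", each Text.Length([Name]), Int64.Type),
    // Count how many names share each length
    #"Grouped rows" = Table.Group(
        #"Added length",
        {"Length"},
        {{"Count", each Table.RowCount(_), Int64.Type}}
    )
in
    #"Grouped rows"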
Using the Queries pane
Article • 02/17/2023
In Power Query, you'll be creating many different queries. Whether you get
data from many tables or duplicate the original query, the number of queries will
increase.
Note
Some actions in the Power Query Online editor may be different than actions in the
Power Query Desktop editor. These differences will be noted in this article.
Rename a query
To directly change the name of the query, double-select on the name of the query. This
action will allow you to immediately change the name.
Go to Query Settings and enter in a different name in the Name input field.
Delete a query
To delete a query, open the context pane on the query and select Delete. There will be
an additional pop-up confirming the deletion. To complete the deletion, select the
Delete button.
Duplicating a query
Duplicating a query will create a copy of the query you're selecting.
To duplicate your query, open the context pane on the query and select Duplicate. A
new duplicate query will pop up on the side of the query pane.
Referencing a query
Referencing a query will create a new query. The new query uses the steps of a previous
query without having to duplicate the query. Additionally, any changes on the original
query will transfer down to the referenced query.
To reference your query, open the context pane on the query and select Reference. A
new referenced query will pop up on the side of the query pane.
Note
To learn more about how to copy and paste queries in Power Query, go to Sharing
a query.
For the sake of being more comprehensive, we'll once again describe all of the context
menu actions that are relevant for either.
New query
You can import data into the Power Query editor as an option from the context menu.
To learn about how to get data into Power Query, go to Getting data.
Merge queries
When you select the Merge queries option from the context menu, the Merge queries
input screen opens.
This option functions the same as the Merge queries feature located on the ribbon and
in other areas of the editor.
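In M, a merge is typically expressed with Table.NestedJoin followed by an expand step. The
following sketch uses two hypothetical inline tables; the table names, column names, and join
kind are assumptions for illustration.
Power Query M
let
    Orders = #table(type table [OrderID = number, CustomerID = number], {{1, 10}, {2, 20}}),
    Customers = #table(type table [CustomerID = number, Name = text], {{10, "Ana"}, {20, "Ben"}}),
    // Left outer join on CustomerID; matches land in a nested table column
    Merged = Table.NestedJoin(
        Orders, {"CustomerID"},
        Customers, {"CustomerID"},
        "Customers", JoinKind.LeftOuter
    ),
    // Expand the nested column to bring in the customer name
    Expanded = Table.ExpandTableColumn(Merged, "Customers", {"Name"})
in
    Expanded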
Note
To learn more about how to use the Merge queries feature, go to Merge queries
overview.
New parameter
When you select the New parameter option from the context menu, the New
parameter input screen opens.
This option functions the same as the New parameter feature located on the ribbon.
To move the query into a group, open the context menu on the specific query.
Then, select the group you want to put the query in.
The move will look like the following image. Using the same steps as above, you can
also move the query out of the group by selecting Queries (root) or another group.
In desktop versions of Power Query, you can also drag and drop the queries into the
folders.
Diagram view
Article • 02/17/2023
Diagram view offers a visual way to prepare data in the Power Query editor. With this
interface, you can easily create queries and visualize the data preparation process.
Diagram view simplifies the experience of getting started with data wrangling. It speeds
up the data preparation process and helps you quickly understand the dataflow, both
the "big picture view" of how queries are related and the "detailed view" of the specific
data preparation steps in a query.
This feature is enabled by selecting Diagram view in the View tab on the ribbon. With
diagram view enabled, the steps pane and queries pane will be collapsed.
Note
Diagram view is also connected to the Data Preview and the ribbon so that you can
select columns in the Data Preview.
You can add a new step within a query, after the currently selected step, by selecting the
+ button, and then either search for the transform or choose the item from the shortcut
menu. These are the same transforms you'll find in the Power Query editor ribbon.
By searching and selecting the transform from the shortcut menu, the step gets added
to the query, as shown in the following image.
Note
To learn more about how to author queries in the Query editor using the Power
Query editor ribbon or data preview, go to Power Query Quickstart.
Query level actions
You can perform two quick actions on a query—expand/collapse a query and highlight
related queries. These quick actions show up on an active selected query or when
hovering over a query.
You can perform more query level actions such as duplicate, reference, and so on, by
selecting the query level context menu (the three vertical dots). You can also right-click
in the query and get to the same context menu.
Delete query
To delete a query, right-click in a query and select Delete from the context menu. There
will be an additional pop-up to confirm the deletion.
Rename query
To rename a query, right-click in a query and select Rename from the context menu.
Enable load
To ensure that the results provided by the query are available for downstream use such
as report building, by default Enable load is set to true. In case you need to disable load
for a given query, right-click in a query and select Enable load. The queries where
Enable load is set to false will be displayed with a grey outline.
Duplicate
To create a copy of a given query, right-click in the query and select Duplicate. A new
duplicate query will appear in the diagram view.
Reference
Referencing a query will create a new query. The new query will use the steps of the
previous query without having to duplicate the query. Additionally, any changes on the
original query will transfer down to the referenced query. To reference a query, right-
click in the query and select Reference.
Move to group
You can make folders and move the queries into these folders for organizational
purposes. These folders are called groups. To move a given query to a Query group,
right-click in a query and select Move to group. You can choose to move the queries to
an existing group or create a new query group.
You can view the query groups above the query box in the diagram view.
Create function
When you need to apply the same set of transformations in different queries or values,
creating custom Power Query functions can be valuable. To learn more about custom
functions, go to Using custom functions. To convert a query into a reusable function,
right-click in a given query and select Create function.
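As an illustration of what a custom function looks like in M, the following sketch defines a
function and invokes it on an inline table. The function name, its parameters, and the sample
table are hypothetical. The Create function command generates its own function from your
query.
Power Query M
let
    // Hypothetical function: capitalizes each word in the given text column
    CapitalizeColumn = (inputTable as table, columnName as text) as table =>
        Table.TransformColumns(inputTable, {{columnName, Text.Proper, type text}}),
    Sample = #table(type table [Name = text], {{"mountain bike"}, {"road frame"}}),
    // Reuse the same transformation on any table that has a text column
    Result = CapitalizeColumn(Sample, "Name")
in
    Result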
Convert to parameter
A parameter provides the flexibility to dynamically change the output of your queries
depending on their value and promotes reusability. To convert a non-structured value
such as date, text, number, and so on, to a parameter, right-click in the query and select
Convert to Parameter.
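In the advanced editor, a parameter query is usually a literal value with metadata attached.
The following sketch shows what a text parameter might look like. The sample value is
hypothetical, and the exact set of metadata fields should be treated as an assumption.
Power Query M
// A text value marked as a required parameter through its metadata record
"AdventureWorks" meta [IsParameterQuery = true, Type = "Text", IsParameterQueryRequired = true]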
Advanced editor
With the advanced editor, you can see the code that Power Query editor is creating with
each step. To view the code for a given query, right-click in the query and select
Advanced editor.
Note
To learn more about the code used in the advanced editor, go to Power Query M
language specification.
Note
To learn more about how to merge queries in Power Query, go to Merge queries
overview.
Edit settings
To edit the step level settings, right-click the step and choose Edit settings. Alternatively,
you can double-click the step (that has step settings) and directly get to the settings dialog
box. In the settings dialog box, you can view or change the step level settings. For
example, the following image shows the settings dialog box for the Split column step.
Rename step
To rename a step, right-click the step and select Rename. This action opens the Step
properties dialog. Enter the name you want, and then select OK.
Delete step
To delete a step, right-click the step and select Delete. To delete a series of steps until
the end, right-click the step and select Delete until end.
This action will open a dialog box where you can add the step description. This step
description will come in handy when you come back to the same query after a few days or
when you share your queries or dataflows with other users.
By hovering over each step, you can view a call out that shows the step label, step name,
and step descriptions (that were added).
By selecting each step, you can see the corresponding data preview for that step.
You can also expand or collapse a query by selecting the query level actions from the
query's context menu.
To expand all or collapse all queries, select the Expand all/Collapse all button next to
the layout options in the diagram view pane.
You can also right-click any empty space in the diagram view pane and see a context
menu to expand all or collapse all queries.
In the collapsed mode, you can quickly look at the steps in the query by hovering over
the number of steps in the query. You can select these steps to navigate to that specific
step within the query.
Layout Options
There are five layout options available in the diagram view: zoom out/zoom in, mini-map,
full screen, fit to view, and reset.
Zoom out/zoom in
With this option, you can adjust the zoom level and zoom out or zoom in to view all the
queries in the diagram view.
Mini-map
With this option, you can turn the diagram view mini-map on or off. More information:
Show mini-map
Full screen
With this option, you can view all the queries and their relationships through the Full
screen mode. The diagram view pane expands to full screen and the data preview pane,
queries pane, and steps pane remain collapsed.
Fit to view
With this option, you can adjust the zoom level so that all the queries and their
relationships can be fully viewed in the diagram view.
Reset
With this option, you can reset the zoom level back to 100% and also reset the pane to
the top-left corner.
Similarly, you can select the right dongle to view direct and indirect dependent queries.
You can also hover on the link icon below a step to view a callout that shows the query
relationships.
The second way to modify diagram view settings is to right-click over a blank part of the
diagram view background.
You can change diagram view settings to show step names to match the applied steps
within the query settings pane.
Compact view
When you have queries with multiple steps, it can be challenging to scroll horizontally to
view all your steps within the viewport.
To address this, diagram view offers Compact view, which compresses the steps from
top to bottom instead of left to right. This view can be especially useful when you have
queries with multiple steps, so that you can see as many queries as possible within the
viewport.
To enable this view, navigate to diagram view settings and select Compact view inside
the View tab in the ribbon.
Show mini-map
Once the number of queries begins to overflow the diagram view, you can use the scroll
bars at the bottom and right side of the diagram view to scroll through the queries. One
other method of scrolling is to use the diagram view mini-map control. The mini-map
control lets you keep track of the overall dataflow "map", and quickly navigate, while
looking at a specific area of the map in the main diagram view area.
To open the mini-map, either select Show mini-map from the diagram view menu or
select the mini-map button in the layout options.
Right-click and hold the rectangle on the mini-map, then move the rectangle to move
around in the diagram view.
Show animations
When the Show animations menu item is selected, the transitions of the sizes and
positions of the queries are animated. These transitions are easiest to see when collapsing
or expanding the queries or when changing the dependencies of existing queries. When
cleared, the transitions will be immediate. Animations are turned on by default.
You can also expand or collapse related queries from the query level context menu.
Multi-select queries
You can select multiple queries within the diagram view by holding down the Ctrl key and
clicking queries. Once you multi-select, right-clicking will show a context menu that
allows performing operations such as merge, append, move to group, expand/collapse
and more.
Inline rename
You can double-click the query name to rename the query.
Double-clicking the step name allows you to rename the step, provided the diagram
view setting is showing step names.
When step labels are displayed in diagram view, double-clicking the step label shows
the dialog box to rename the step name and provide a description.
Accessibility
Diagram view supports accessibility features such as keyboard navigation, high-contrast
mode, and screen reader support. The following table describes the keyboard shortcuts
that are available within diagram view. To learn more about keyboard shortcuts available
within Power Query Online, go to keyboard shortcuts in Power Query.
Move focus from query level to step level: Alt+Down arrow key
Schema view is designed to optimize your flow when working on schema level
operations by putting your query's column information front and center. Schema view
provides contextual interactions to shape your data structure, and lower latency
operations as it only requires the column metadata to be computed and not the
complete data results.
This article walks you through schema view and the capabilities it offers.
Note
The Schema view feature is available only for Power Query Online.
Overview
When working on data sets with many columns, simple tasks can become incredibly
cumbersome because even finding the right column by horizontally scrolling and
parsing through all the data is inefficient. Schema view displays your column
information in a list that's easy to parse and interact with, making it easier than ever to
work on your schema.
In addition to an optimized column management experience, another key benefit of
schema view is that transforms tend to yield results faster. These results are faster
because this view only requires the column information to be computed instead of a
preview of the data. So even long-running queries with only a few columns
benefit from using schema view.
You can turn on schema view by selecting Schema view in the View tab. When you're
ready to work on your data again, you can select Data view to go back.
Reordering columns
One common task when working on your schema is reordering columns. In Schema
View this can easily be done by dragging columns in the list and dropping in the right
location until you achieve the desired column order.
Applying transforms
For more advanced changes to your schema, you can find the most used column-level
transforms right at your fingertips directly in the list and in the Schema tools tab. Plus,
you can also use transforms available in other tabs on the ribbon.
Share a query
Article • 12/17/2022
You can use Power Query to extract and transform data from external data sources.
These extraction and transformation steps are represented as queries. Queries created
with Power Query are expressed using the M language and executed through the M
Engine.
You can easily share and reuse your queries across projects, and also across Power
Query product integrations. This article covers the general mechanisms to share a query
in Power Query.
Copy / Paste
In the queries pane, right-click the query you want to copy. From the dropdown menu,
select the Copy option. The query and its definition will be added to your clipboard.
Note
The copy feature is currently not available in Power Query Online instances.
To paste the query from your clipboard, go to the queries pane and right-click on any
empty space in it. From the menu, select Paste.
When pasting this query on an instance that already has the same query name, the
pasted query will have a suffix added with the format (#), where the pound sign is
replaced with a number to distinguish the pasted queries.
You can also paste queries between multiple instances and product integrations. For
example, you can copy the query from Power BI Desktop, as shown in the previous
images, and paste it in Power Query for Excel as shown in the following image.
Warning
Copying and pasting queries between product integrations doesn't guarantee that
all functions and functionality found in the pasted query will work on the
destination. Some functionality might only be available in the origin product
integration.
With the code of your query in your clipboard, you can share this query through the
means of your choice. The recipient of this code needs to create a blank query and
follow the same steps as described above. But instead of copying the code, the recipient
will replace the code found in their blank query with the code that you provided.
Note
To create a blank query, go to the Get Data window and select Blank query from
the options.
If you find yourself in a situation where you need to apply the same set of
transformations to different queries or values, creating a Power Query custom function
that can be reused as many times as you need could be beneficial. A Power Query
custom function is a mapping from a set of input values to a single output value, and is
created from native M functions and operators.
While you can manually create your own Power Query custom function using code as
shown in Understanding Power Query M functions, the Power Query user interface
offers you features to speed up, simplify, and enhance the process of creating and
managing a custom function.
This article focuses on this experience, provided only through the Power Query user
interface, and how to get the most out of it.
Important
This article outlines how to create a custom function with Power Query using
common transforms accessible in the Power Query user interface. It focuses on the
core concepts to create custom functions, and links to additional articles in Power
Query documentation for more information on specific transforms that are
referenced in this article.
Note
The following example was created using the desktop experience found in Power BI
Desktop and can also be followed using the Power Query experience found in Excel
for Windows.
You can follow along with this example by downloading the sample files used in this
article from the following download link. For simplicity, this article will be using the
Folder connector. To learn more about the Folder connector, go to Folder. The goal of
this example is to create a custom function that can be applied to all the files in that
folder before combining all of the data from all files into a single table.
Start by using the Folder connector experience to navigate to the folder where your files
are located and select Transform Data or Edit. This will take you to the Power Query
experience. Right-click on the Binary value of your choice from the Content field and
select the Add as New Query option. For this example, you'll see that the selection was
made for the first file from the list, which happens to be the file April 2019.csv.
This option will effectively create a new query with a navigation step directly to that file
as a Binary, and the name of this new query will be the file path of the selected file.
Rename this query to be Sample File.
Create a new parameter with the name File Parameter. Use the Sample File query as the
Current Value, as shown in the following image.
Note
We recommend that you read the article on Parameters to better understand how
to create and manage parameters in Power Query.
The binary parameter type is only displayed inside the Parameters dialog Type
dropdown menu when you have a query that evaluates to a binary.
Rename the newly created query from File Parameter (2) to Transform Sample file.
Right-click this new Transform Sample file query and select the Create Function option.
This operation will effectively create a new function that will be linked with the
Transform Sample file query. Any changes that you make to the Transform Sample file
query will be automatically replicated to your custom function. During the creation of
this new function, use Transform file as the Function name.
After creating the function, you'll notice that a new group will be created for you with
the name of your function. This new group will contain:
All parameters that were referenced in your Transform Sample file query.
Your Transform Sample file query, commonly known as the sample query.
Your newly created function, in this case Transform file.
The first transformation that needs to happen to this query is one that will interpret the
binary. You can right-click the binary from the preview pane and select the CSV option
to interpret the binary as a CSV file.
The format of all the CSV files in the folder is the same. They all have a header that
spans the top four rows. The column headers are located in row five and the data
starts from row six downwards, as shown in the next image.
The next set of transformation steps that need to be applied to the Transform Sample
file are:
1. Remove the top four rows—This action will get rid of the rows that are considered
part of the header section of the file.
Note
To learn more about how to remove rows or filter a table by row position, go
to Filter by row position.
2. Promote headers—The headers for your final table are now in the first row of the
table. You can promote them as shown in the next image.
By default, Power Query adds a new Changed Type step after you promote your column
headers. This step automatically detects the data type of each column. Your Transform
Sample file query will look like the next image.
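The steps above correspond roughly to the following M, shown as a minimal sketch. The inline table and its column names (Date, Country, Units) are hypothetical stand-ins for the interpreted CSV file, not the article's sample data.

```powerquery
let
    // Hypothetical stand-in for the interpreted CSV: four banner rows,
    // a header row in row five, and data from row six downwards
    Source = #table(
        {"Column1", "Column2", "Column3"},
        {
            {"Sales report", null, null},
            {"Generated by: system", null, null},
            {"Period: April 2019", null, null},
            {null, null, null},
            {"Date", "Country", "Units"},
            {"2019-04-01", "Panama", "10"},
            {"2019-04-02", "USA", "5"}
        }
    ),
    // 1. Remove the top four rows
    RemovedTopRows = Table.Skip(Source, 4),
    // 2. Promote the first remaining row to column headers
    PromotedHeaders = Table.PromoteHeaders(RemovedTopRows, [PromoteAllScalars = true]),
    // Equivalent of the automatic Changed Type step, written by hand here
    ChangedType = Table.TransformColumnTypes(
        PromotedHeaders,
        {{"Date", type date}, {"Country", type text}, {"Units", Int64.Type}},
        "en-US"
    )
in
    ChangedType
```

In the real query, the Source step would be the CSV interpretation of the File Parameter binary rather than an inline table.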
Note
Your Transform file function relies on the steps performed in the Transform
Sample file query. However, if you try to manually modify the code for the
Transform file function, you'll be greeted with a warning that reads The definition
of the function 'Transform file' is updated whenever query 'Transform Sample
file' is updated. However, updates will stop if you directly modify function
'Transform file'.
Note
To learn more about how to choose or remove columns from a table, go to Choose
or remove columns.
Your function was applied to every single row from the table using the values from the
Content column as the argument for your function. Now that the data has been
transformed into the shape that you're looking for, you can expand the Output Table
column, as shown in the image below, without using any prefix for the expanded
columns.
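Invoking a function for every row and then expanding the result can be sketched in M as follows. The TransformFile function, the two inline "files", and the Date and Units columns are hypothetical simplifications of the Transform file function and the folder contents.

```powerquery
let
    // Hypothetical stand-in for the Transform file function:
    // interpret a binary as CSV and promote the first row to headers
    TransformFile = (FileContent as binary) as table =>
        Table.PromoteHeaders(
            Csv.Document(FileContent, [Delimiter = ",", Columns = 2, Encoding = 65001]),
            [PromoteAllScalars = true]
        ),
    // Hypothetical stand-in for the Folder connector output
    Files = #table(
        {"Name", "Content"},
        {
            {"April 2019.csv", Text.ToBinary("Date,Units#(lf)2019-04-01,10")},
            {"May 2019.csv", Text.ToBinary("Date,Units#(lf)2019-05-01,7")}
        }
    ),
    // Invoke the function once per row, passing the Content value as the argument
    Invoked = Table.AddColumn(Files, "Output Table", each TransformFile([Content])),
    KeptColumns = Table.SelectColumns(Invoked, {"Name", "Output Table"}),
    // Expand the nested tables without adding a prefix to the column names
    Expanded = Table.ExpandTableColumn(KeptColumns, "Output Table", {"Date", "Units"})
in
    Expanded
```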
You can verify that you have data from all files in the folder by checking the values in the
Name or Date column. For this case, you can check the values from the Date column, as
each file only contains data for a single month from a given year. If you see more than
one month, it means that you've successfully combined data from multiple files into a single
table.
Note
What you've read so far is fundamentally the same process that happens during the
Combine files experience, but done manually.
We recommend that you also read the article on Combine files overview and
Combine CSV files to further understand how the combine files experience works
in Power Query and the role that custom functions play.
To make this requirement happen, create a new parameter called Market with the text
data type. For the Current Value, enter the value Panama.
With this new parameter, select the Transform Sample file query and filter the Country
field using the value from the Market parameter.
Note
Applying this new step to your query will automatically update the Transform file
function, which will now require two parameters based on the two parameters that your
Transform Sample file uses.
But the CSV files query has a warning sign next to it. Now that your function has been
updated, it requires two parameters. So the step where you invoke the function results
in error values, since only one of the arguments was passed to the Transform file
function during the Invoked Custom Function step.
To fix the errors, double-click Invoked Custom Function in the Applied Steps to open
the Invoke Custom Function window. In the Market parameter, manually enter the
value Panama.
You can now check your query to validate that only rows where Country is equal to
Panama show up in the final result set of the CSV Files query.
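The effect of the second parameter can be sketched in M as follows. This is a simplified illustration: the hypothetical TransformFile function takes an already-interpreted table instead of a binary, and the sample rows are invented.

```powerquery
let
    // Hypothetical stand-in for the data produced by the earlier steps
    Sample = #table(
        type table [Date = date, Country = text, Units = number],
        {
            {#date(2019, 4, 1), "Panama", 10},
            {#date(2019, 4, 2), "USA", 5}
        }
    ),
    // The function now declares two parameters; Market drives the filter step
    TransformFile = (FileTable as table, Market as text) as table =>
        Table.SelectRows(FileTable, each [Country] = Market),
    // Both arguments are supplied. Calling TransformFile(Sample) alone
    // raises an error because Market is a required parameter.
    Result = TransformFile(Sample, "Panama")
in
    Result
```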
For example, imagine a query that has several codes as a text string and you want to
create a function that will decode those values, as in the following sample table:
code
PTY-CM1090-LAX
LAX-CM701-PTY
PTY-CM4441-MIA
MIA-UA1257-LAX
LAX-XY2842-MIA
You start by having a parameter that has a value that serves as an example. For this case,
it will be the value PTY-CM1090-LAX.
From that parameter, you create a new query where you apply the transformations that
you need. For this case, you want to split the code PTY-CM1090-LAX into multiple
components:
Origin = PTY
Destination = LAX
Airline = CM
FlightID = 1090
The M code for that set of transformations is shown below.
Power Query M
let
    Source = code,
    SplitValues = Text.Split(Source, "-"),
    CreateRow = [
        Origin = SplitValues{0},
        Destination = SplitValues{2},
        Airline = Text.Start(SplitValues{1}, 2),
        FlightID = Text.End(SplitValues{1}, Text.Length(SplitValues{1}) - 2)
    ],
    RowToTable = Table.FromRecords({CreateRow}),
    #"Changed Type" = Table.TransformColumnTypes(
        RowToTable,
        {{"Origin", type text}, {"Destination", type text}, {"Airline", type text}, {"FlightID", type text}}
    )
in
    #"Changed Type"
Note
To learn more about the Power Query M formula language, go to Power Query M
formula language.
You can then transform that query into a function by doing a right-click on the query
and selecting Create Function. Finally, you can invoke your custom function into any of
your queries or values, as shown in the next image.
After a few more transformations, you can see that you've reached your desired output
and leveraged the logic for such a transformation from a custom function.
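As an illustration, the same logic written directly as a function and invoked on each row might look like the following sketch. The name DecodeFlight is hypothetical, and this variant returns a record rather than a one-row table so that it can be expanded in a single step.

```powerquery
let
    // Hypothetical hand-written equivalent of the function created from the query
    DecodeFlight = (code as text) as record =>
        let
            SplitValues = Text.Split(code, "-"),
            FlightPart = SplitValues{1}
        in
            [
                Origin = SplitValues{0},
                Destination = SplitValues{2},
                // The first two characters are the airline; the rest is the flight ID
                Airline = Text.Start(FlightPart, 2),
                FlightID = Text.Range(FlightPart, 2)
            ],
    Codes = #table(
        type table [code = text],
        {
            {"PTY-CM1090-LAX"},
            {"LAX-CM701-PTY"},
            {"PTY-CM4441-MIA"},
            {"MIA-UA1257-LAX"},
            {"LAX-XY2842-MIA"}
        }
    ),
    // Invoke the function for every row, then expand the resulting records
    Invoked = Table.AddColumn(Codes, "Decoded", each DecodeFlight([code])),
    Expanded = Table.ExpandRecordColumn(
        Invoked,
        "Decoded",
        {"Origin", "Destination", "Airline", "FlightID"}
    )
in
    Expanded
```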
Promote or demote column headers
Article • 12/17/2022
When creating a new query from unstructured data sources such as text files, Power
Query analyzes the contents of the file. If Power Query identifies a different pattern for
the first row, it will try to promote the first row of data to be the column headings for
your table. However, Power Query might not identify the pattern correctly 100 percent
of the time, so this article explains how you can manually promote or demote column
headers from rows.
Before you can promote the headers, you need to remove the first four rows of the
table. To make that happen, select the table menu in the upper-left corner of the
preview window, and then select Remove top rows.
In the Remove top rows window, enter 4 in the Number of rows box.
Note
To learn more about Remove top rows and other table operations, go to Filter by
row position.
The result of that operation will leave the headers as the first row of your table.
Locations of the promote headers operation
From here, you have a number of places where you can select the promote headers
operation:
Note
Table column names must be unique. If the row you want to promote to a header
row contains multiple instances of the same text string, Power Query will
disambiguate the column headings by adding a numeric suffix preceded by a dot
to every text string that isn't unique.
To demote column headers to rows
In the following example, the column headers are incorrect: they're actually part of the
table's data. You need to demote the headers to be part of the rows of the table.
After you do this operation, your table will look like the following image.
As a last step, select each column and type a new name for it. The end result will
resemble the following image.
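In M, demoting headers and renaming the resulting columns can be sketched as follows. The sample values and the new column names (Date, Country, Units) are hypothetical.

```powerquery
let
    // Hypothetical table whose "headers" are really the first data row
    Source = #table(
        {"2019-04-01", "Panama", "10"},
        {{"2019-04-02", "USA", "5"}}
    ),
    // Push the current headers down into the first row;
    // the columns become Column1, Column2, Column3
    Demoted = Table.DemoteHeaders(Source),
    // Give each column a meaningful name
    Renamed = Table.RenameColumns(
        Demoted,
        {{"Column1", "Date"}, {"Column2", "Country"}, {"Column3", "Units"}}
    )
in
    Renamed
```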
See also
Filter by row position
Filter a table by row position
Article • 12/17/2022
Power Query has multiple options to filter a table based on the positions of its rows,
either by keeping or removing those rows. This article covers all the available methods.
Keep rows
The keep rows set of functions will select a set of rows from the table and remove any
other rows that don't meet the criteria.
There are two places where you can find the Keep rows buttons:
In the data preview section in the middle of the Power Query window, you can see
the position of your rows on the left side of the table. Each row position is
represented by a number. The top row starts with position 1.
The result of that change will give you the output table you're looking for. After you set
the data types for your columns, your table will look like the following image.
Keep bottom rows
Imagine the following table that comes out of a system with a fixed layout.
This report always contains seven rows of data at the end of the report page. Above the
data, the report has a section for comments with an unknown number of rows. In this
example, you only want to keep those last seven rows of data and the header row.
To do that, select Keep bottom rows from the table menu. In the Keep bottom rows
dialog box, enter 8 in the Number of rows box.
The result of that operation will give you eight rows, but now your header row is part of
the table.
You need to promote the column headers from the first row of your table. To do this,
select Use first row as headers from the table menu. After you define data types for
your columns, you'll create a table that looks like the following image.
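A minimal M sketch of the same two steps, using a hypothetical report with two comment rows, one header row, and seven data rows:

```powerquery
let
    // Hypothetical report layout: comments first, then header and seven data rows
    Rows =
        {{"Comment: generated nightly", null}, {"Comment: figures are provisional", null}}
        & {{"Item", "Units"}}
        & List.Transform({1..7}, each {"Item " & Text.From(_), Text.From(_ * 10)}),
    Source = Table.FromRows(Rows, {"Column1", "Column2"}),
    // Keep the header row plus the seven data rows at the bottom
    KeptBottomRows = Table.LastN(Source, 8),
    // Equivalent of Use first row as headers
    PromotedHeaders = Table.PromoteHeaders(KeptBottomRows, [PromoteAllScalars = true])
in
    PromotedHeaders
```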
To do that, select Keep range of rows from the table menu. In the Keep range of rows
dialog box, enter 6 in the First row box and 8 in the Number of rows box.
Similar to the previous example for keeping bottom rows, the result of this operation
gives you eight rows with your column headers as part of the table. Any rows above the
First row that you defined (row 6) are removed.
You can perform the same operation as described in Keep bottom rows to promote the
column headers from the first row of your table. After you set data types for your
columns, your table will look like the following image.
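In M, keeping a range of rows maps to Table.Range, which takes a zero-based offset. A First row value of 6 in the dialog therefore corresponds to an offset of 5. The report layout below is a hypothetical example.

```powerquery
let
    // Hypothetical layout: five leading rows, a header row, seven data rows, three footer rows
    Rows =
        List.Transform({1..5}, each {"Header line " & Text.From(_), null})
        & {{"Item", "Units"}}
        & List.Transform({1..7}, each {"Item " & Text.From(_), Text.From(_ * 10)})
        & List.Transform({1..3}, each {"Footer line " & Text.From(_), null}),
    Source = Table.FromRows(Rows, {"Column1", "Column2"}),
    // First row 6 in the dialog is offset 5 here; keep eight rows from that point
    KeptRange = Table.Range(Source, 5, 8),
    PromotedHeaders = Table.PromoteHeaders(KeptRange, [PromoteAllScalars = true])
in
    PromotedHeaders
```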
Remove rows
This set of functions will select a set of rows from the table, remove them, and keep the
rest of the rows in the table.
There are two places where you can find the Remove rows buttons:
To do that, select Remove top rows from the table menu. In the Remove top rows
dialog box, enter 5 in the Number of rows box.
In the same way as the previous examples for "Keep bottom rows" and "Keep a range of
rows," the result of this operation gives you eight rows with your column headers as part
of the table.
You can perform the same operation as described in previous examples to promote the
column headers from the first row of your table. After you set data types for your
columns, your table will look like the following image.
Remove bottom rows
Imagine the following table that comes out of a system with a fixed layout.
This report always contains a fixed section or footer that occupies the last five rows of
the table. In this example, you want to remove those last five rows and keep the rest of
the data.
To do that, select Remove bottom rows from the table menu. In the Remove bottom rows
dialog box, enter 5 in the Number of rows box.
The result of that change will give you the output table that you're looking for. After you
set data types for your columns, your table will look like the following image.
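The equivalent M step is Table.RemoveLastN. The following sketch assumes a hypothetical table with four data rows followed by a five-row footer.

```powerquery
let
    // Hypothetical layout: four data rows followed by a five-row footer
    Rows =
        List.Transform({1..4}, each {"Item " & Text.From(_), _ * 10})
        & List.Transform({1..5}, each {"Footer line " & Text.From(_), null}),
    Source = Table.FromRows(Rows, {"Item", "Units"}),
    // Drop the last five rows and keep everything above them
    RemovedBottomRows = Table.RemoveLastN(Source, 5)
in
    RemovedBottomRows
```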
Remove alternate rows
Imagine the following table that comes out of a system with a dynamic layout.
The way this report is structured is that you have elements in pairs of rows. Every odd
row (1, 3, 5...) contains the data that you need. Every even row, directly underneath each
odd row, contains comments about each of those records. You don't need the
comments, and you want to remove all of them.
To do that, select Remove alternate rows from the table menu. In the Remove alternate
rows dialog box, enter the following values:
Here you start defining the pattern for removing rows. After you find the second
row, you only want to remove that specific row, so you specify that you only need
to remove one row.
After you remove one row, you keep the next row. The process starts again for the
next row.
The result of that selection will give you the output table that you're looking for. After
you set the data types to your columns, your table will look like the following image.
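This pattern corresponds to Table.AlternateRows in M. In the sketch below, the three arguments after the table are the number of rows to keep before the pattern starts, the number of rows to remove, and the number of rows to keep. The sample rows are hypothetical.

```powerquery
let
    // Hypothetical report: each record row is followed by a comment row
    Source = #table(
        {"Column1"},
        {
            {"Record 1"},
            {"Comment on record 1"},
            {"Record 2"},
            {"Comment on record 2"},
            {"Record 3"},
            {"Comment on record 3"}
        }
    ),
    // Keep one row, then repeatedly remove one row and keep one row
    RemovedAlternateRows = Table.AlternateRows(Source, 1, 1, 1)
in
    RemovedAlternateRows
```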
Filter by values in a column
Article • 12/17/2022
In Power Query, you can include or exclude rows according to a specific value in a
column. You can choose from three methods to filter the values in your column:
After you apply a filter to a column, a small filter icon appears in the column heading, as
shown in the following illustration.
In this article, we'll focus on aspects related to filtering data. To learn more about
the sort options and how to sort columns in Power Query, go to Sort columns.
Remove empty
The Remove empty command applies two filter rules to your column. The first rule gets
rid of any null values. The second rule gets rid of any blank values. For example, imagine
a table with just one text column with five rows, where you have one null value and one
blank cell.
Note
A null value is a specific value in the Power Query language that represents no
value.
You then select Remove empty from the sort and filter menu, as shown in the following
image.
You can also select this option from the Home tab in the Reduce Rows group in the
Remove Rows drop-down options, as shown in the next image.
The result of the Remove empty operation gives you the same table without the empty
values.
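The two filter rules can be sketched in M as a single Table.SelectRows step. The column name Product and its values are hypothetical.

```powerquery
let
    // Hypothetical single-column table with one null value and one blank value
    Source = #table(
        type table [Product = nullable text],
        {{"Apple"}, {null}, {"Pear"}, {""}, {"Plum"}}
    ),
    // Rule 1 removes nulls; rule 2 removes blank text values
    RemovedEmpty = Table.SelectRows(Source, each [Product] <> null and [Product] <> "")
in
    RemovedEmpty
```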
Clear filter
When a filter is applied to a column, the Clear filter command appears on the sort and
filter menu.
Auto filter
The list in the sort and filter menu is called the auto filter list, which shows the unique
values in your column. You can manually select or deselect which values to include in the
list. Any selected values will be taken into consideration by the filter; any values that
aren't selected will be ignored.
This auto filter section also has a search bar to help you find any values from your list.
Note
When you load the auto filter list, only the top 1,000 distinct values in the column
are loaded. If there are more than 1,000 distinct values in the column that
you're filtering, a message will appear indicating that the list of values in the filter
list might be incomplete, and the Load more link appears. Select the Load more
link to load another 1,000 distinct values.
If exactly 1,000 distinct values are found again, the list is displayed with a
message stating that the list might still be incomplete.
If fewer than 1,000 distinct values are found, the full list of values is shown.
Power Query displays a type-specific filter based on the data type of the column.
Type-specific filters
Depending on the data type of your column, you'll see different commands in the sort
and filter menu. The following images show examples for date, text, and numeric
columns.
Filter rows
When selecting any of the type-specific filters, you'll use the Filter rows dialog box to
specify filter rules for the column. This dialog box is shown in the following image.
The Filter rows dialog box has two modes: Basic and Advanced.
Basic
With basic mode, you can implement up to two filter rules based on type-specific filters.
In the preceding image, notice that the name of the selected column is displayed after
the label Keep rows where, to let you know which column these filter rules are being
implemented on.
For example, imagine that in the following table, you want to filter the Account Code by
all values that start with either PA or PTY.
To do that, you can go to the Filter rows dialog box for the Account Code column and
specify the set of filter rules you want.
In this example, first select the Basic button. Then under Keep rows where "Account
Code", select begins with, and then enter PA. Then select the or button. Under the or
button, select begins with, and then enter PTY. Then select OK.
The result of that operation will give you the set of rows that you're looking for.
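A basic-mode filter with two rules joined by or can be sketched in M as follows. The account codes and sales figures are hypothetical.

```powerquery
let
    // Hypothetical sample data
    Source = #table(
        {"Account Code", "Sales"},
        {{"PA-1001", 120}, {"PTY-2002", 80}, {"US-3003", 200}, {"PA-4004", 150}}
    ),
    // Keep rows where Account Code begins with PA or begins with PTY
    Filtered = Table.SelectRows(
        Source,
        each Text.StartsWith([Account Code], "PA") or Text.StartsWith([Account Code], "PTY")
    )
in
    Filtered
```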
Advanced
With advanced mode, you can implement as many type-specific filters as necessary from
all the columns in the table.
For example, imagine that instead of applying the previous filter in basic mode, you
wanted to implement a filter to Account Code to show all values that end with 4. Also,
you want to show values over $100 in the Sales column.
In this example, first select the Advanced button. In the first row, select Account Code
under Column name, ends with under Operator, and select 4 for the Value. In the
second row, select and, and then select Sales under Column Name, is greater than
under Operator, and 100 under Value. Then select OK.
The result of that operation will give you just one row that meets both criteria.
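An advanced-mode filter that combines clauses from two columns can be sketched in M as follows, again with hypothetical sample data.

```powerquery
let
    // Hypothetical sample data
    Source = #table(
        {"Account Code", "Sales"},
        {{"PA-1001", 120}, {"PTY-2004", 80}, {"US-3003", 200}, {"PA-4004", 150}}
    ),
    // Keep rows where Account Code ends with 4 and Sales is greater than 100
    Filtered = Table.SelectRows(
        Source,
        each Text.EndsWith([Account Code], "4") and [Sales] > 100
    )
in
    Filtered
```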
Note
You can add as many clauses as you'd like by selecting Add clause. All clauses act
at the same level, so you might want to consider creating multiple filter steps if you
need to implement filters that rely on other filters.
Choose or remove columns
Article • 12/17/2022
Choose columns and Remove columns are operations that help you define what
columns your table needs to keep and which ones it needs to remove. This article will
showcase how to use the Choose columns and Remove columns commands by using
the following sample table for both operations.
The goal is to create a table that looks like the following image.
Choose columns
On the Home tab, in the Manage columns group, select Choose columns.
The Choose columns dialog box appears, containing all the available columns in your
table. You can select all the fields that you want to keep and remove specific fields by
clearing their associated check box. For this example, you want to remove the GUID and
Report created by columns, so you clear the check boxes for those fields.
After selecting OK, you'll create a table that only contains the Date, Product,
SalesPerson, and Units columns.
Remove columns
When you select Remove columns from the Home tab, you have two options:
After selecting Remove columns, you'll create a table that only contains the Date,
Product, SalesPerson, and Units columns.
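Both commands have direct M counterparts. The sketch below uses the column names from this example with hypothetical row values, and shows that either approach yields the same four columns.

```powerquery
let
    // Column names follow the example; the row values are hypothetical
    Source = #table(
        {"GUID", "Report created by", "Date", "Product", "SalesPerson", "Units"},
        {{"a1b2", "Admin", #date(2020, 1, 1), "Shirt", "Fabrikam", 10}}
    ),
    // Choose columns: list the columns to keep
    Chosen = Table.SelectColumns(Source, {"Date", "Product", "SalesPerson", "Units"}),
    // Remove columns: list the columns to drop
    Removed = Table.RemoveColumns(Source, {"GUID", "Report created by"}),
    // Both steps leave the same set of columns
    SameColumns = Table.ColumnNames(Chosen) = Table.ColumnNames(Removed),
    Result = if SameColumns then Chosen else error "Column sets differ"
in
    Result
```

Choose columns is usually the safer option when new, unwanted columns may appear in the source later, because only the listed columns are kept.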
In Power Query, you can group values in various rows into a single value by grouping
the rows according to the values in one or more columns. You can choose from two
types of grouping operations:
Column groupings.
Row groupings.
Operations available
With the Group by feature, the available operations can be categorized in two ways:
Percentile | Column operation | Calculates the percentile, using an input value from 0 to 100, from a column
Count distinct values | Column operation | Calculates the number of distinct values from a column
Count rows | Row operation | Calculates the total number of rows from a given group
Count distinct rows | Row operation | Calculates the number of distinct rows from a given group
All rows | Row operation | Outputs all grouped rows in a table value with no aggregations
Note
The Count distinct values and Percentile operations are only available in Power
Query Online.
Country
Sales Channel
After that operation is complete, notice how the Products column has [Table] values
inside each cell. Each [Table] value contains all the rows that were grouped by the
Country and Sales Channel columns from your original table. You can select the white
space inside the cell to see a preview of the contents of the table at the bottom of the
dialog box.
Note
The details preview pane might not show all the rows that were used for the group-
by operation. You can select the [Table] value to see all rows pertaining to the
corresponding group-by operation.
Next, you need to extract the row that has the highest value in the Units column of the
tables inside the new Products column, and call that new column Top performer
product.
Name your new column Top performer product. Enter the formula
Table.Max([Products], "Units" ) under Custom column formula.
The result of that formula creates a new column with [Record] values. These record
values are essentially a table with just one row. These records contain the row with the
maximum value for the Units column of each [Table] value in the Products column.
With this new Top performer product column that contains [Record] values, you can
select the expand icon, select the Product and Units fields, and then select OK.
After removing your Products column and setting the data type for both newly
expanded columns, your result will resemble the following image.
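The full sequence (group with All rows, add the custom column, expand, and remove the Products column) can be sketched in M as follows. The sample rows are hypothetical.

```powerquery
let
    // Hypothetical sample data
    Source = #table(
        type table [Country = text, #"Sales Channel" = text, Product = text, Units = number],
        {
            {"USA", "Online", "Shirt", 55},
            {"USA", "Online", "Shorts", 25},
            {"Panama", "Reseller", "Shirt", 10},
            {"Panama", "Reseller", "Shorts", 70}
        }
    ),
    // Group by Country and Sales Channel, keeping all rows of each group in a nested table
    Grouped = Table.Group(
        Source,
        {"Country", "Sales Channel"},
        {{"Products", each _, type table}}
    ),
    // For each group, pick the row with the highest Units value (returned as a record)
    AddedTop = Table.AddColumn(
        Grouped,
        "Top performer product",
        each Table.Max([Products], "Units")
    ),
    // Expand only the Product and Units fields of that record
    Expanded = Table.ExpandRecordColumn(AddedTop, "Top performer product", {"Product", "Units"}),
    RemovedProducts = Table.RemoveColumns(Expanded, {"Products"})
in
    RemovedProducts
```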
Fuzzy grouping
To demonstrate how to do "fuzzy grouping," consider the sample table shown in the
following image.
The goal of fuzzy grouping is to do a group-by operation that uses an approximate
match algorithm for text strings. Power Query uses the Jaccard similarity algorithm to
measure the similarity between pairs of instances. Then it applies agglomerative
hierarchical clustering to group instances together. The following image shows the
output that you expect, where the table will be grouped by the Person column.
To do the fuzzy grouping, you perform the same steps previously described in this
article. The only difference is that this time, in the Group by dialog box, you select the
Use fuzzy grouping check box.
For each group of rows, Power Query will pick the most frequent instance as the
"canonical" instance. If multiple instances occur with the same frequency, Power Query
will pick the first one. After you select OK in the Group by dialog box, you'll get the
result that you were expecting.
However, you have more control over the fuzzy grouping operation by expanding Fuzzy
group options.
The following options are available for fuzzy grouping:
Similarity threshold (optional): This option indicates how similar two values must
be to be grouped together. The minimum setting of 0 will cause all values to be
grouped together. The maximum setting of 1 will only allow values that match
exactly to be grouped together. The default is 0.8.
Ignore case: When comparing text strings, case will be ignored. This option is
enabled by default.
Group by combining text parts: The algorithm will try to combine text parts (such
as combining Micro and soft into Microsoft) to group values.
Show similarity scores: Show similarity scores between the input values and the
computed representative values after fuzzy grouping. Requires the addition of an
operation such as All rows to showcase this information on a row-by-row level.
Transformation table (optional): You can select a transformation table that will
map values (such as mapping MSFT to Microsoft) to group them together.
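In M, fuzzy grouping is exposed through Table.FuzzyGroup. The sketch below is illustrative only: it assumes the host product supports this function, the sample names are hypothetical, and the exact groups produced depend on the similarity algorithm, so no output is shown.

```powerquery
let
    // Hypothetical sample data with spelling and casing variations
    Source = #table(
        type table [Person = text],
        {{"Miguel"}, {"miguel"}, {"Migueel"}, {"William"}, {"Willliam"}, {"Bill"}}
    ),
    // Group similar Person values and count the rows in each group.
    // IgnoreCase and Threshold mirror the Ignore case and Similarity threshold options.
    Grouped = Table.FuzzyGroup(
        Source,
        {"Person"},
        {{"Frequency", each Table.RowCount(_), Int64.Type}},
        [IgnoreCase = true, Threshold = 0.8]
    )
in
    Grouped
```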
For this example, a transformation table will be used to demonstrate how values can be
mapped. The transformation table has two columns:
Important
It's important that the transformation table has the same columns and column
names as shown above (they have to be "From" and "To"), otherwise Power Query
will not recognize these.
Return to the Group by dialog box, expand Fuzzy group options, change the operation
from Count rows to All rows, enable the Show similarity scores option, and then select
the Transformation table drop-down menu.
After you select the transformation table, select OK. The result of that operation gives
you the following information:
In this example, the Ignore case option was enabled, so the values in the From column
of the Transformation table are used to look for the text string without considering the
case of the string. This transformation operation occurs first, and then the fuzzy
grouping operation is performed.
The similarity score is also shown in the table value next to the person column, which
reflects exactly how the values were grouped and their respective similarity scores. You
can expand this column if needed or use the values from the new Frequency columns
for other sorts of transformations.
Note
When grouping by multiple columns, the transformation table performs the replace
operation in all columns if replacing the value increases the similarity score.
See also
Add a custom column
Remove duplicates
Unpivot columns
Article • 12/17/2022
In Power Query, you can transform columns into attribute-value pairs, where columns
become rows.
For example, given a table like the following, where country rows and date columns
create a matrix of values, it's difficult to analyze the data in a scalable way.
Instead, you can transform the table into a table with unpivoted columns, as shown in
the following image. In the transformed table, it's easier to use the date as an attribute
to filter on.
The key in this transformation is that you have a set of dates in the table that should all
be part of a single column. The respective value for each date and country should be in
a different column, effectively creating an attribute-value pair.
Power Query will always create the attribute-value pair by using two columns:
There are multiple places in the user interface where you can find Unpivot columns. You
can right-click the columns that you want to unpivot, or you can select the command
from the Transform tab in the ribbon.
There are three ways that you can unpivot columns from a table:
Unpivot columns
Unpivot other columns
Unpivot only selected columns
Unpivot columns
For the scenario described above, you first need to select the columns you want to
unpivot. You can hold down Ctrl as you select as many columns as you need. For this
scenario, you want to select all the columns except the one named Country. After
selecting the columns, right-click any of the selected columns, and then select Unpivot
columns.
That operation yields the result shown in the following image.
Special considerations
After creating your query from the steps above, imagine that your initial table gets
updated to look like the following screenshot.
Notice that you've added a new column for the date 9/1/2020 (September 1, 2020), and
two new rows for the countries UK and Mexico.
If you refresh your query, the operation is applied to every column, including the new
one, except the column that wasn't originally selected (Country, in this example). This
means that any new column that's added to the source table will be unpivoted as well.
The following image shows what your query will look like after the refresh with the new
updated source table.
Unpivot other columns
You can also select the columns that you don't want to unpivot and unpivot the rest of
the columns in the table. This operation is where Unpivot other columns comes into
play.
The result of that operation will yield exactly the same result as the one you got from
Unpivot columns.
Note
This transformation is crucial for queries that have an unknown number of columns.
The operation will unpivot all columns from your table except the ones that you've
selected. This is an ideal solution if the data source of your scenario got new date
columns in a refresh, because those will get picked up and unpivoted.
Special considerations
Similar to the Unpivot columns operation, if your query is refreshed and more data is
picked up from the data source, all the columns will be unpivoted except the ones that
were previously selected.
To illustrate this, say that you have a new table like the one in the following image.
You can select the Country column, and then select Unpivot other columns, which
yields the following result.
Unpivot only selected columns
The purpose of this last option is to only unpivot specific columns from your table. This
is important for scenarios where you're dealing with an unknown number of columns
from your data source and you only want to unpivot the selected columns.
To perform this operation, select the columns to unpivot, which in this example is all the
columns except the Country column. Then right-click any of the columns you selected,
and then select Unpivot only selected columns.
Notice how this operation will yield the same output as the previous examples.
Special considerations
After doing a refresh, if our source table changes to have a new 9/1/2020 column and
new rows for UK and Mexico, the output of the query will be different from the previous
examples. Say that our source table, after a refresh, changes to the table in the following
image.
The output of our query will look like the following image.
It looks like this because the unpivot operation was applied only on the 6/1/2020,
7/1/2020, and 8/1/2020 columns, so the column with the header 9/1/2020 remains
unchanged.
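The difference between the unpivot options after a refresh can be sketched outside Power Query. The following Python snippet is only an illustration of the logic, not Power Query M; the function names and the sample values are hypothetical. It assumes that "unpivot other columns" decides what to unpivot at refresh time, while "unpivot only selected columns" works from a fixed list.

```python
def unpivot(rows, columns):
    """Turn each listed column into an Attribute/Value row; keep other columns as-is."""
    out = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in columns}
        for col in columns:
            if col in row:
                out.append({**kept, "Attribute": col, "Value": row[col]})
    return out


def unpivot_other(rows, keep):
    """Unpivot every column except those in `keep`, recomputed on each call."""
    if not rows:
        return []
    columns = [c for c in rows[0] if c not in keep]
    return unpivot(rows, columns)


# Hypothetical source table after a refresh that added the 9/1/2020 column.
refreshed = [
    {"Country": "USA", "6/1/2020": 785, "7/1/2020": 450, "8/1/2020": 567, "9/1/2020": 300},
]

# Unpivot other columns: the new date column is picked up.
other = unpivot_other(refreshed, keep=["Country"])

# Unpivot only selected columns: the list was fixed when the query was built,
# so 9/1/2020 stays behind as a regular column.
only_selected = unpivot(refreshed, ["6/1/2020", "7/1/2020", "8/1/2020"])
```

In this sketch, `other` has four rows with no 9/1/2020 column left over, while `only_selected` has three rows that each still carry the 9/1/2020 column.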
Pivot columns
Article • 12/17/2022
In Power Query, you can create a table that contains an aggregate value for each unique
value in a column. Power Query groups each unique value, does an aggregate
calculation for each value, and pivots the column into a new table.
This table contains values by country and date in a simple table. In this example, you
want to transform this table into the one where the date column is pivoted, as shown in
the following image.
Note
During the pivot columns operation, Power Query will sort the table based on the
values found on the first column—at the left side of the table—in ascending order.
To pivot a column
1. Select the column that you want to pivot.
2. On the Transform tab in the Any column group, select Pivot column.
3. In the Pivot column dialog box, in the Value column list, select Value.
By default, Power Query will try to do a sum as the aggregation, but you can select
the Advanced option to see other available aggregations.
The available options are:
Don't aggregate
Count (all)
Count (not blank)
Minimum
Maximum
Median
Sum
Average
In the Pivot column dialog box, select the Product column as the value column. Select
the Advanced option button in the Pivot columns dialog box, and then select Don't
aggregate.
This operation yields the result shown in the following image.
You want to pivot that table by using the Date column, and you want to use the values
from the Value column. Because this pivot would make your table have just the Country
values on rows and the Dates as columns, you'd get an error for every single cell value
because there are multiple rows for every combination of Country and Date. The
Pivot column operation yields the results shown in the following image.
Notice the error message "Expression.Error: There were too many elements in the
enumeration to complete the operation." This error occurs because the Don't aggregate
operation only expects a single value for the country and date combination.
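To make the aggregation behavior concrete, here is a minimal Python sketch of a pivot. It is not Power Query M; the `pivot` function and the sample rows are hypothetical. Passing `aggregate=None` mimics the Don't aggregate option, and a `ValueError` object stored in a cell stands in for the cell-level error described above.

```python
def pivot(rows, row_key, pivot_col, value_col, aggregate=sum):
    """Pivot `pivot_col` into columns; `aggregate=None` mimics Don't aggregate."""
    cells = {}    # row key -> {pivoted column -> list of values}
    columns = []  # pivoted column names in first-seen order
    for r in rows:
        if r[pivot_col] not in columns:
            columns.append(r[pivot_col])
        cells.setdefault(r[row_key], {}).setdefault(r[pivot_col], []).append(r[value_col])

    result = []
    for key in sorted(cells):  # ascending by the first column
        out = {row_key: key}
        for col in columns:
            values = cells[key].get(col, [])
            if aggregate is not None:
                out[col] = aggregate(values) if values else None
            elif len(values) > 1:
                # More than one value for this combination: record an error in the cell.
                out[col] = ValueError("too many elements in the enumeration")
            else:
                out[col] = values[0] if values else None
        result.append(out)
    return result


# Hypothetical data: USA has two rows for the same date.
rows = [
    {"Country": "USA", "Date": "6/1/2020", "Value": 785},
    {"Country": "USA", "Date": "6/1/2020", "Value": 15},
    {"Country": "Canada", "Date": "6/1/2020", "Value": 357},
    {"Country": "Canada", "Date": "7/1/2020", "Value": 421},
]
summed = pivot(rows, "Country", "Date", "Value")               # sum is the default
raw = pivot(rows, "Country", "Date", "Value", aggregate=None)  # Don't aggregate
```

With the default sum, the two USA rows collapse into a single cell. Without aggregation, that same cell holds an error, while the Canada cells, which have one value each, come through unchanged.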
Transpose a table
Article • 12/17/2022
The transpose table operation in Power Query rotates your table 90 degrees, turning
your rows into columns and your columns into rows.
Imagine a table like the one in the following image, with three rows and four columns.
The goal of this example is to transpose that table so you end up with four rows and
three columns.
The result of that operation will look like the following image.
Note
Only the contents of the table will be transposed during the transpose operation;
the column headers of the initial table will be lost. The new columns will have the
name Column followed by a sequential number.
The headers you need in this example are in the first row of the table. To promote the
first row to headers, select the table icon in the upper-left corner of the data preview,
and then select Use first row as headers.
The result of that operation will give you the output that you're looking for.
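The two steps (transpose, then promote the first row to headers) can be sketched in Python as follows. This is an illustration of the logic rather than Power Query M; the function names and sample values are hypothetical.

```python
def transpose(headers, rows):
    """Rotate the contents only; the original headers are dropped."""
    transposed = [list(col) for col in zip(*rows)]
    new_headers = [f"Column{i}" for i in range(1, len(rows) + 1)]
    return new_headers, transposed


def promote_first_row(headers, rows):
    """Use the first row as the column headers and remove it from the data."""
    return [str(v) for v in rows[0]], rows[1:]


# Hypothetical table with three rows and four columns.
headers = ["Column1", "Column2", "Column3", "Column4"]
rows = [
    ["Events", "Event 1", "Event 2", "Event 3"],
    ["Participants", 150, 450, 1250],
    ["Funds", 4000, 10000, 15000],
]

t_headers, t_rows = transpose(headers, rows)              # four rows, three columns
final_headers, final_rows = promote_first_row(t_headers, t_rows)
```

After the transpose, the generic names Column1 to Column3 replace the original headers. Promoting the first row then yields the headers Events, Participants, and Funds.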
Note
To learn more about the promote headers operation, also known as Use first row
as headers, go to Promote or demote column headers.
Reverse rows
Article • 12/17/2022
With Power Query, it's possible to reverse the order of rows in a table.
Imagine a table with two columns, ID and Country, as shown in the following image.
Data types in Power Query are used to classify values to have a more structured dataset.
Data types are defined at the field level—values inside a field are set to conform to the
data type of the field.
The data type of a column is displayed on the left side of the column heading with an
icon that symbolizes the data type.
The most common data types used in Power Query are listed in the following table.
Although beyond the scope of this article, you can find the complete list of data types in
the Power Query M formula language Types article.
Data type | Description
Fixed decimal number | Also known as the Currency type, this data type has a fixed location for the decimal separator. The decimal separator always has four digits to its right and allows for 19 digits of significance. The largest value it can represent is 922,337,203,685,477.5807 (positive or negative). Unlike Decimal Number, the Fixed Decimal Number type is always precise and is thus useful in cases where the imprecision of floating-point notation might introduce errors.
Date/Time | Represents both a date and time value. Underneath the covers, the Date/Time value is stored as a Decimal Number type, so you can actually convert between the two. The time portion of a date is stored as a fraction to whole multiples of 1/300 seconds (3.33 ms). Dates between the years 1900 and 9999 are supported.
Date | Represents just a date (no time portion). When converted into the model, a Date is the same as a Date/Time value with zero for the fractional value.
Time | Represents just time (no date portion). When converted into the model, a Time value is the same as a Date/Time value with no digits to the left of the decimal place.
Binary | The Binary data type can be used to represent any other data with a binary format.
Any | The Any data type is the status given to a column that doesn't have an explicit data type definition. Any is the data type that classifies all values. We recommend that you always explicitly define the column data types for your queries from unstructured sources, and avoid having any columns with the Any data type as the output of your query.
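The precision point made for the Fixed decimal number type can be illustrated with Python's `decimal` module. This is only an analogy and not how Power Query is implemented. The observation that the stated maximum equals the largest signed 64-bit integer divided by 10,000 is arithmetic on the figure quoted above, offered here as an assumption about the underlying storage.

```python
from decimal import Decimal

# Binary floating point: ten additions of 0.1 don't land exactly on 1.
float_total = sum([0.1] * 10)

# Fixed four-decimal arithmetic: the same sum is exact.
fixed_total = sum([Decimal("0.1000")] * 10)

# The documented maximum, compared with (2**63 - 1) scaled down by four decimal places.
documented_max = Decimal("922337203685477.5807")
scaled_int64_max = Decimal(2**63 - 1) / Decimal(10**4)
```

Here `float_total` differs slightly from 1.0, `fixed_total` equals exactly 1, and the two maximum values match.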
For structured data sources such as databases, Power Query reads the table schema
from the data source and automatically displays the data by using the correct data
type for each column.
For unstructured sources such as Excel, CSV, and text files, Power Query automatically
detects data types by inspecting the values in the table. By default, automatic data
type detection is enabled in Power Query for unstructured sources.
You can also use the Detect data type command in the Any column group on the
Transform tab to automatically detect the data types of the columns in your table.
How to define a column data type
You can define or change the data type of a column in any of four places:
On the Home tab, in the Transform group, on the Data type drop-down menu.
On the Transform tab, in the Any column group, on the Data type drop-down
menu.
By selecting the icon on the left side of the column heading.
On the column shortcut menu, under Change type.
Promote column headers: Promotes the first row of the table to be the column
header.
Changed type: Converts the values from the Any data type to a data type based
on the inspection of the values from each column.
By default, this setting is enabled. To disable or enable this setting, follow the steps that
apply to your Power Query experience.
You can define this behavior both at the global and per-file level in the Options window
(in the Power Query Editor, on the File tab, select Options and settings > Options).
Global: On the left pane under Global, select Data load. On the right pane under
Type detection, you can select any of three type detection configurations that will
be applied to every new file created in your application:
Always detect column types and headers for unstructured sources
Detect column types and headers for unstructured sources according to each
file's setting
Never detect column types and headers for unstructured sources
Current file: On the left pane under Current file, select Data load. On the right
pane under Type detection, select whether you want to enable or disable type
detection for the current file.
Document or project locale
Power Query handles two distinct components that manage the way that things look
and are interpreted:
Localization: the component that tells Power Query in what language it should be
displayed.
Globalization: the component that handles the formatting of the values, in addition
to the interpretation of text values.
Locale is a single value that holds both the localization and globalization components.
Locale is used to interpret text values and convert them into other data types. For
example, the locale English (United States) means that the localization is in United
States English and the globalization, or format of the value, is based on the standards
used in the United States.
When Power Query defines a column data type or converts from one data type to
another, it has to interpret the values to be converted before it can transform them to a
different data type.
In Power Query for Desktop, Power Query automatically recognizes your operating
system regional format and uses that to interpret the values for data type
conversion. To override this locale configuration, open the query Options window,
and in the left pane under Current file, select Regional settings. From here, you
can change the locale to the setting you want.
This locale setting is important for interpreting text values into a specific data type. For
example, imagine that you have your locale set as English (United States), but a column
in one of your CSV files has dates formatted in the United Kingdom format of
day/month/year.
When you try setting the data type of the Date column to be Date, you get error values.
These errors occur because the locale being used is trying to interpret the date in the
English (United States) format, which is month/day/year. Because there's no month 22 in
the calendar, it causes an error.
Instead of trying to just select the Date data type, you can right-click the column
heading, select Change type, and then select Using locale.
In the Change column type with locale dialog box, you select the data type that you
want to set, but you also select which locale to use, which in this case needs to be
English (United Kingdom).
Using this locale, Power Query will be able to interpret values correctly and convert
those values to the right data type.
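The effect of the locale on text-to-date conversion can be sketched in Python. This is an illustration only, not Power Query M; the `FORMATS` mapping, the `to_date` helper, and the sample strings are hypothetical, and each locale is reduced to a single date pattern.

```python
from datetime import date, datetime

# Simplifying assumption: one date pattern per locale.
FORMATS = {"en-US": "%m/%d/%Y", "en-GB": "%d/%m/%Y"}


def to_date(text, locale):
    """Convert text to a date using the locale's pattern; return the error on failure."""
    try:
        return datetime.strptime(text, FORMATS[locale]).date()
    except ValueError as exc:
        return exc  # stand-in for a cell-level error value


us_attempt = to_date("22/01/2020", "en-US")  # fails: there is no month 22
uk_attempt = to_date("22/01/2020", "en-GB")  # 22 January 2020

# A value that is valid in both locales converts without error, but to different dates.
ambiguous_us = to_date("01/02/2020", "en-US")  # January 2
ambiguous_uk = to_date("01/02/2020", "en-GB")  # February 1
```

The last two lines show why it's worth verifying converted dates: a wrong locale doesn't always produce an error, and can silently swap the day and month.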
The formatting of the values is driven by the globalization value. If you have any doubts
about the value displayed by Power Query, you can verify the conversion of date values
by adding new columns for the day, month, and year from the value. To do this, select
the Date column and go to the Add column tab on the ribbon. In the Date and time
column group, you'll see the options for a date column.
From here, you can extract parts of the date value, such as the year number, the month
number, the day number, or even more columns extracted from the Date column.
By using these columns, you can verify that your date value has been converted
correctly.
Conversion in this matrix starts with the original data type in the Data types
column. Each result of a conversion to the new type is shown in the original data
type’s row.
[Data type conversion matrix. Rows and columns: Decimal number, Currency, Whole number, Percentage, Date/Time, Date, Time, Date/Time/Timezone, Duration, Text, True/False. Each cell carries an icon that marks the conversion as Possible or Not possible.]
Step-level errors
Cell-level errors
This article provides suggestions for how to fix the most common errors you might find
at each level, and describes the error reason, error message, and error detail for each.
Step-level error
A step-level error prevents the query from loading and displays the error components in
a yellow pane.
Error reason: The first section before the colon. In the example above, the error
reason is Expression.Error.
Error message: The section directly after the reason. In the example above, the
error message is The column 'Column' of the table wasn't found.
Error detail: The section directly after the Details: string. In the example above, the
error detail is Column.
Example: You have a query from a text file that was located in drive D and created by
user A. User A shares the query with user B, who doesn't have access to drive D. When
this person tries to execute the query, they get a DataSource.Error because there's no
drive D in their environment.
Possible solutions: You can change the file path of the text file to a path that both users
have access to. As user B, you can change the file path to be a local copy of the same
text file. If the Edit settings button is available in the error pane, you can select it and
change the file path.
Example: You have a query from a text file where one of the column names was Column.
In your query, you have a step that renames that column to Date. But there was a
change in the original text file, and it no longer has a column heading with the name
Column because it was manually changed to Date. Power Query is unable to find a
column heading named Column, so it can't rename any columns. It displays the error
shown in the following image.
Possible solutions: There are multiple solutions for this case, but they all depend on
what you'd like to do. For this example, because the correct Date column header already
comes from your text file, you can just remove the step that renames the column. This
will allow your query to run without this error.
This error can be caused by a number of reasons, such as the data privacy levels
between data sources or the way that these data sources are being combined or
merged. For more information about how to diagnose this issue, go to Data privacy
firewall.
Cell-level error
A cell-level error won't prevent the query from loading, but displays error values as Error
in the cell. Selecting the white space in the cell displays the error pane underneath the
data preview.
Note
The data profiling tools can help you more easily identify cell-level errors with the
column quality feature. More information: Data profiling tools
Remove errors
To remove rows with errors in Power Query, first select the column that contains errors.
On the Home tab, in the Reduce rows group, select Remove rows. From the drop-down
menu, select Remove errors.
The result of that operation will give you the table that you're looking for.
Replace errors
If instead of removing rows with errors, you want to replace the errors with a fixed value,
you can do so as well. To replace rows that have errors, first select the column that
contains errors. On the Transform tab, in the Any column group, select Replace values.
From the drop-down menu, select Replace errors.
In the Replace errors dialog box, enter the value 10 because you want to replace all
errors with the value 10.
The result of that operation will give you the table that you're looking for.
Keep errors
Power Query can serve as a good auditing tool to identify any rows with errors even if
you don't fix the errors. This is where Keep errors can be helpful. To keep rows that have
errors, first select the column that contains errors. On the Home tab, in the Reduce rows
group, select Keep rows. From the drop-down menu, select Keep errors.
The result of that operation will give you the table that you're looking for.
Commonly triggered when changing the data type of a column in a table. Some values
found in the column could not be converted to the desired data type.
Example: You have a query that includes a column named Sales. One cell in that column
has NA as a cell value, while the rest have whole numbers as values. You decide to
convert the data type of the column from text to whole number, but the cell with the
NA value causes an error.
Possible solutions: After identifying the row with the error, you can either modify the
data source to reflect the correct value rather than NA, or you can apply a Replace errors
operation to provide a value for any NA values that cause an error.
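The conversion error and the three ways of handling it (remove, replace, keep) can be sketched in Python. This is an illustration of the logic, not Power Query M; the helper names and sample values are hypothetical, and an exception object stored in a cell stands in for a cell-level error.

```python
def to_whole_number(text):
    """Convert text to an integer; return the error instead of raising it."""
    try:
        return int(text)
    except ValueError as exc:
        return exc


def is_error(value):
    return isinstance(value, Exception)


# Hypothetical Sales column stored as text, with one NA value.
rows = [{"Sales": to_whole_number(v)} for v in ["100", "NA", "250"]]

# Remove errors: drop rows whose Sales cell holds an error.
removed = [r for r in rows if not is_error(r["Sales"])]

# Replace errors: substitute a fixed value (10, as in the Replace errors example).
replaced = [{**r, "Sales": 10 if is_error(r["Sales"]) else r["Sales"]} for r in rows]

# Keep errors: retain only the rows with errors, for auditing.
kept = [r for r in rows if is_error(r["Sales"])]
```

Note that the query still produces a result in all three cases; only the single NA cell is affected, which is what distinguishes a cell-level error from a step-level one.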
Operation errors
When trying to apply an operation that isn't supported, such as multiplying a text value
by a numeric value, an error occurs.
Example: You want to create a custom column for your query by creating a text string
that contains the phrase "Total Sales: " concatenated with the value from the Sales
column. An error occurs because the concatenation operation only supports text
columns and not numeric ones.
Possible solutions: Before creating this custom column, change the data type of the
Sales column to be text.
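Python happens to behave the same way for this operation, which makes for a compact illustration. The function names and the value 1500 are hypothetical. Unlike the solution above, which changes the data type of the whole column, this sketch converts the value at the point of concatenation.

```python
def naive_label(sales):
    """Concatenate without converting; fails when `sales` is numeric."""
    return "Total Sales: " + sales


def total_sales_label(sales):
    """Convert the number to text first, then concatenate."""
    return "Total Sales: " + str(sales)


try:
    naive_label(1500)
    failed = False
except TypeError:
    failed = True  # text + number isn't a supported operation

label = total_sales_label(1500)
```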
When working with data that contains nested structured values (such as tables, lists, or
records), you may sometimes encounter the following error:
Expression.Error: We cannot return a value of type {value} in this context
Details: In the past we would have returned a text value of {value}, but we
now return this error. Please see https://go.microsoft.com/fwlink/?linkid=2099726
for more information.
When the Data Privacy Firewall buffers a data source, nested non-scalar values are
automatically converted to errors.
When a column defined with the Any data type contains non-scalar values, such
values will be reported as errors during load (such as in a Workbook in Excel or the
data model in Power BI Desktop).
Possible solutions:
Remove the column that contains the error, or set a non-Any data type for such a
column.
Change the privacy levels of the data sources involved to one that allows them to
be combined without being buffered.
Flatten the tables before doing a merge to eliminate columns that contain nested
structured values (such as table, record, or list).
Working with duplicate values
Article • 12/17/2022
You can work with duplicate sets of values through transformations that can remove
duplicates from your data or filter your data to show duplicates only, so you can focus
on them.
Warning
Power Query is case-sensitive. When working with duplicate values, Power Query
considers the case of the text, which might lead to undesired results. As a
workaround, users can apply an uppercase or lowercase transform prior to
removing duplicates.
For this article, the examples use the following table with id, Category, and Total
columns.
Remove duplicates
One of the operations that you can perform is to remove duplicate values from your
table.
There's no guarantee that the first instance in a set of duplicates will be chosen
when duplicates are removed. To learn more about how to preserve sorting, go to
Preserve sort.
You have four rows that are duplicates. Your goal is to remove those duplicate rows so
there are only unique rows in your table. Select all columns from your table, and then
select Remove duplicates.
The result of that operation will give you the table that you're looking for.
You want to remove those duplicates and only keep unique values. To remove duplicates
from the Category column, select it, and then select Remove duplicates.
The result of that operation will give you the table that you're looking for.
Keep duplicates
Another operation you can perform with duplicates is to keep only the duplicates found
in your table.
You have four rows that are duplicates. Your goal in this example is to keep only the
rows that are duplicated in your table. Select all the columns in your table, and then
select Keep duplicates.
The result of that operation will give you the table that you're looking for.
In this example, you have multiple duplicates and you want to keep only those
duplicates from your table. To keep duplicates from the id column, select the id column,
and then select Keep duplicates.
The result of that operation will give you the table that you're looking for.
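The remove and keep operations, along with the case-sensitivity caveat, can be sketched in Python. This is an illustration only; the function names and sample rows are hypothetical. As a simplification, this sketch keeps the first instance of each duplicate, which Power Query doesn't guarantee.

```python
from collections import Counter


def remove_duplicates(rows, columns):
    """Keep one row per distinct combination of values in `columns`."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out


def keep_duplicates(rows, columns):
    """Keep only rows whose combination of values in `columns` occurs more than once."""
    counts = Counter(tuple(r[c] for c in columns) for r in rows)
    return [r for r in rows if counts[tuple(r[c] for c in columns)] > 1]


rows = [
    {"id": 1, "Category": "A", "Total": 10},
    {"id": 1, "Category": "A", "Total": 10},
    {"id": 2, "Category": "a", "Total": 20},
    {"id": 3, "Category": "B", "Total": 30},
]

unique_rows = remove_duplicates(rows, ["id", "Category", "Total"])
duplicate_rows = keep_duplicates(rows, ["id", "Category", "Total"])

# Comparison is case-sensitive, so "A" and "a" count as different categories...
by_category = remove_duplicates(rows, ["Category"])

# ...unless the text is lowercased first.
lowered = [{**r, "Category": r["Category"].lower()} for r in rows]
by_category_ci = remove_duplicates(lowered, ["Category"])
```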
See also
Data profiling tools
Fill values in a column
Article • 12/17/2022
You can use fill up and fill down to replace null values with the last non-empty value in a
column. For example, imagine the following table where you'd like to fill down in the
Date column and fill up in the Comments column.
Fill down
The fill down operation takes a column and traverses through the values in it to fill any
null values in the next rows until it finds a new value. This process continues on a row-
by-row basis until there are no more values in that column.
In the following example, you want to fill down on the Date column. To do that, you can
right-click to select the Date column, and then select Fill > Down.
The result of that operation will look like the following image.
Fill up
In the same way as the fill down operation, fill up works on a column. But by contrast, fill
up finds the last value of the column and fills any null values in the previous rows until it
finds a new value. Then the same process occurs for that value. This process continues
until there are no more values in that column.
In the following example, you want to fill the Comments column from the bottom up.
You'll notice that your Comments column doesn't have null values. Instead it has what
appears to be empty cells. Before you can do the fill up operation, you need to
transform those empty cells into null values: select the column, go to the Transform tab,
and then select Replace values.
In the Replace values dialog box, leave Value to find blank. For Replace with, enter null.
After all empty cells are replaced with null, select the Comments column, go to the
Transform tab, and then select Fill > Up.
The result of that operation will look like the following image.
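Both fill directions, and the reason empty cells must become null first, can be sketched in Python. This is an illustration only; `None` plays the role of null, and the function names and sample values are hypothetical.

```python
def fill_down(values):
    """Replace each None with the closest non-None value above it."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def fill_up(values):
    """Replace each None with the closest non-None value below it."""
    return fill_down(values[::-1])[::-1]


def blanks_to_null(values):
    """Turn empty strings into None so that fill operations can see them."""
    return [None if v == "" else v for v in values]


dates = ["1/1/2020", None, None, "1/2/2020", None]
comments = ["", "", "Late", "", "On time"]

filled_dates = fill_down(dates)
not_filled = fill_up(comments)                      # empty strings aren't null: no change
filled_comments = fill_up(blanks_to_null(comments))
```

The `not_filled` result is identical to the input, which mirrors why the Replace values step is needed before Fill > Up.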
3. Remove the Sales Person: values from the Sales Person column so you only get
the names of the salespeople.
Now you should have exactly the table you were looking for.
See also
Replace values
Sort columns
Article • 08/09/2023
You can sort a table in Power Query by one column or multiple columns. For example,
take the following table with the columns named Competition, Competitor, and
Position.
For this example, the goal is to sort this table by the Competition and Position fields in
ascending order.
When sorted using sort descending, an alphabetical column is sorted in the following
way:
To sort a table by using columns
To sort the table, first select the column to be sorted. After the column has been
selected, you can select the sort operation from one of two places:
On the Home tab, in the Sort group, there are icons to sort your column in either
ascending or descending order.
From the column heading drop-down menu. Next to the name of the column
there's a drop-down menu indicator. When you select the icon, you'll see the
option to sort the column.
In this example, first you need to sort the Competition column. You'll perform the
operation by using the buttons in the Sort group on the Home tab. This action creates a
new step in the Applied steps section named Sorted rows.
A visual indicator, displayed as an arrow pointing up, gets added to the Competition
drop-down menu icon to show that the column is being sorted in ascending order.
Now you'll sort the Position field in ascending order as well, but this time you'll use the
Position column heading drop-down menu.
Notice that this action doesn't create a new Sorted rows step, but modifies it to perform
both sort operations in one step. When you sort multiple columns, the order that the
columns are sorted in is based on the order the columns were selected in. A visual
indicator, displayed as a number to the left of the drop-down menu indicator, shows the
place each column occupies in the sort order.
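A multi-column sort performed as a single step can be sketched in Python with a tuple key, where the position of each column in the tuple corresponds to its place in the sort order. This is an illustration only, and the sample rows are hypothetical.

```python
rows = [
    {"Competition": "Regional", "Competitor": "Ana", "Position": 2},
    {"Competition": "Local", "Competitor": "Ben", "Position": 3},
    {"Competition": "Regional", "Competitor": "Cy", "Position": 1},
    {"Competition": "Local", "Competitor": "Dee", "Position": 1},
]

# Competition is the first sort key and Position the second, both ascending.
sorted_rows = sorted(rows, key=lambda r: (r["Competition"], r["Position"]))
competitors = [r["Competitor"] for r in sorted_rows]
```

Swapping the two entries in the tuple would sort by Position first, which is why the order in which the columns are selected matters.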
Select the down arrow next to the column heading, and then select Clear sort.
In Applied steps on the Query Settings pane, delete the Sorted rows step.
Rename columns
Article • 12/17/2022
In Power Query, you can rename columns to format the dataset in a clear and concise
way.
Column 1 Column 2
Panama Panama
Canada Toronto
The column headers are Column 1 and Column 2, but you want to change those names
to more friendly names for your columns.
The end result that you want in Power Query looks like the following table.
Double-click the column header: The double-click action immediately lets you
rename the column.
Right-click the column of your choice: A contextual menu is displayed and you
can select the Rename option to rename the selected column.
Rename option in the Transform tab: In the Transform tab, under the Any column
group, select the Rename option.
For example, for the first sample table provided in this article, imagine that you try to
rename both Column 1 and Column 2 to "Geography". An error message pops up that
prevents you from renaming the second column to "Geography".
Promoting your column headers from your first row: For example, if you tried
promoting the first row of the sample table in this article, Power Query renames
the columns to Panama and Panama_1.
Note
To learn more about how to promote headers from your first row, go to
Promote or demote column headers.
Expanding a column with a field name that also exists in the current table: This
can happen, for example, when you perform a merge operation and the column
with the merged table has field names that also exist in the table. When you try to
expand the fields from that column, Power Query automatically tries to
disambiguate to prevent Column Name Conflict errors.
Move columns
Article • 02/17/2023
To accomplish this move, you can either select the Move option or drag and drop the
column.
Move option
The following example shows the different ways of moving columns. This example
focuses on moving the Contact Name column.
You move the column using the Move option. This option is located in the Any column
group under the Transform tab. In the Move option, the available choices are:
Before
After
To beginning
To end
You can also find this option when you right-click a column.
If you want to move one column to the left, then select Before.
The new location of the column is now one column to the left of its original location.
If you want to move one column to the right, then select After.
The new location of the column is now one column to the right of its original location.
If you want to move the column to the leftmost position in the dataset, then select To
beginning.
The new location of the column is now on the far left side of the table.
If you want to move the column to the rightmost position in the dataset, then select To
end.
The new location of the column is now on the far right side of the table.
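The four Move choices amount to reordering a list of column names, which can be sketched in Python. This is an illustration only; the `move` function and all column names other than Contact Name are hypothetical.

```python
def move(columns, name, where):
    """Return a new column order with `name` moved as requested."""
    cols = list(columns)
    i = cols.index(name)
    cols.pop(i)
    if where == "before":
        j = max(i - 1, 0)          # one position to the left
    elif where == "after":
        j = min(i + 1, len(cols))  # one position to the right
    elif where == "to_beginning":
        j = 0
    elif where == "to_end":
        j = len(cols)
    else:
        raise ValueError(f"unknown move option: {where}")
    cols.insert(j, name)
    return cols


columns = ["Company ID", "Company Name", "Contact Name", "City"]

moved_before = move(columns, "Contact Name", "before")
moved_after = move(columns, "Contact Name", "after")
moved_first = move(columns, "Contact Name", "to_beginning")
moved_last = move(columns, "Contact Name", "to_end")
```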
From there, you can specifically select the column you would like to view, which is
especially useful if there are many columns.
Replace values and errors
Article • 12/17/2022
With Power Query, you can replace one value with another value wherever that value is
found in a column. The Replace values command can be found:
On the cell shortcut menu. Right-click the cell to replace the selected value in the
column with another value.
Replace entire cell contents: This is the default behavior for non-text columns,
where Power Query searches for and replaces the full contents of a cell. You can
enable this mode for text columns by selecting Advanced options, and then
selecting the Match entire cell contents check box.
Replace instances of a text string: This is the default behavior for text columns,
where Power Query will search for a specific text string in all rows of a column and
replace as many instances of the text string that it finds.
Advanced options are only available in columns of the Text data type. Within that set of
options is the Replace using special characters option.
Replace entire cell contents
Imagine a table like the following, where you have columns for Account ID, Category
Name, and Sales Goal.
The value of -1 in the Sales Goal column is an error in the source and needs to be
replaced with the standard sales goal defined by the business for these instances, which
is 250,000. To do that, right-click the -1 value, and then select Replace values. This action
will bring up the Replace values dialog box with Value to find set to -1. Now all you
need to do is enter 250000 in the Replace with box.
The outcome of that operation will give you the result that you're looking for.
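The two replace modes can be sketched in Python. This is an illustration only; the `replace_value` function and the sample values, other than -1 and 250000, are hypothetical.

```python
def replace_value(values, find, replace_with, match_entire_cell):
    """Replace whole cells that equal `find`, or every occurrence inside text cells."""
    out = []
    for v in values:
        if match_entire_cell:
            out.append(replace_with if v == find else v)
        elif isinstance(v, str):
            out.append(v.replace(find, replace_with))
        else:
            out.append(v)
    return out


# Entire cell contents: the default for non-text columns such as Sales Goal.
sales_goal = replace_value([300000, -1, 100000], -1, 250000, match_entire_cell=True)

# Instances of a text string: the default for text columns.
categories = replace_value(
    ["Category Name: Bikes", "Category Name: Parts"],
    "Category Name: ", "", match_entire_cell=False,
)

# Matching the entire cell on a text column leaves partial matches alone.
untouched = replace_value(["Category Name: Bikes"], "Bikes", "Cars", match_entire_cell=True)
```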
In Power Query, you can parse the contents of a column with text strings by identifying
the contents as either a JSON or XML text string.
You can perform this parse operation by selecting the Parse button found inside the
following places in the Power Query Editor:
Transform tab—This button will transform the existing column by parsing its
contents.
Add column tab—This button will add a new column to the table parsing the
contents of the selected column.
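For this article, you'll be using a sample table that contains the following
columns that you need to parse:
SalesPerson: Contains unparsed JSON text strings with information about the
FirstName and LastName of the salesperson, as in the following example.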
JSON
{
"id" : 249319,
"FirstName": "Lesa",
"LastName": "Byrd"
}
Country—Contains unparsed XML text strings with information about the Country
and the Division that the account has been assigned to, as in the following
example.
XML
<root>
<id>1</id>
<Country>USA</Country>
<Division>BI-3316</Division>
</root>
The goal is to parse the above-mentioned columns and expand the contents of those
columns to get this output.
As JSON
Select the SalesPerson column. Then select JSON from the Parse dropdown menu inside
the Transform tab. These steps will transform the SalesPerson column from having text
strings to having Record values, as shown in the next image. You can select anywhere in
the whitespace inside the cell of the Record value to get a detailed preview of the
record contents at the bottom of the screen.
Select the expand icon next to the SalesPerson column header. From the expand
columns menu, select only the FirstName and LastName fields, as shown in the
following image.
The result of that operation will give you the following table.
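As a rough analogy for the two steps above (parse the text into a record, then expand selected fields into columns), here is a minimal Python sketch. The helper name, the Account column, and the `SalesPerson.FirstName` naming of the expanded columns are assumptions made for illustration, not Power Query behavior.

```python
import json


def parse_and_expand_json(rows, column, fields):
    """Parse JSON text in `column` and expand the chosen fields into new columns."""
    result = []
    for row in rows:
        record = json.loads(row[column])  # text string -> record
        expanded = {k: v for k, v in row.items() if k != column}
        for field in fields:
            # Fields missing from a record become None, similar to a null cell.
            expanded[f"{column}.{field}"] = record.get(field)
        result.append(expanded)
    return result


rows = [
    {"Account": 1, "SalesPerson": '{"id": 249319, "FirstName": "Lesa", "LastName": "Byrd"}'},
]
parsed = parse_and_expand_json(rows, "SalesPerson", ["FirstName", "LastName"])
```

The `id` field is parsed but not expanded, which mirrors selecting only FirstName and LastName in the expand columns menu.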
As XML
Select the Country column. Then select the XML button from the Parse dropdown menu
inside the Transform tab. These steps will transform the Country column from having
text strings to having Table values as shown in the next image. You can select anywhere
in the whitespace inside the cell of the Table value to get a detailed preview of the
contents of the table at the bottom of the screen.
Select the expand icon next to the Country column header. From the expand columns
menu, select only the Country and Division fields, as shown in the following image.
You can define all the new columns as text columns. The result of that operation will
give you the output table that you're looking for.
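The XML case follows the same parse-then-expand pattern. The sketch below uses Python's standard `xml.etree.ElementTree` module as a stand-in. The helper name, the Account column, and the naming of the expanded columns are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_and_expand_xml(rows, column, fields):
    """Parse XML text in `column` and expand the chosen child elements into new columns."""
    result = []
    for row in rows:
        root = ET.fromstring(row[column])  # text string -> element tree
        expanded = {k: v for k, v in row.items() if k != column}
        for field in fields:
            # findtext returns the text of the first matching child, or None.
            expanded[f"{column}.{field}"] = root.findtext(field)
        result.append(expanded)
    return result


xml_text = "<root><id>1</id><Country>USA</Country><Division>BI-3316</Division></root>"
rows = [{"Account": 1, "Country": xml_text}]
parsed = parse_and_expand_xml(rows, "Country", ["Country", "Division"])
```

All expanded values stay as text here, which corresponds to defining the new columns as text columns.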
Add a column from examples
Article • 12/17/2022
When you add columns from examples, you can quickly and easily create new columns
that meet your needs. This is useful for the following situations:
You know the data you want in your new column, but you're not sure which
transformation, or collection of transformations, will get you there.
You already know which transformations you need, but you're not sure what to
select in the UI to make them happen.
You know all about the transformations you need by using a custom column
expression in the M language, but one or more of those transformations aren't
available in the UI.
The Column from examples command is located on the Add column tab, in the
General group.
Range: Create bins for the Monthly Income column in discrete increments of
5,000.
Full Name: Concatenate the Last Name and First Name columns to a single
column.
To do this, select the Monthly Income column, select the Column from examples
command, and then select From selection.
The preview pane displays a new, editable column where you can enter your examples.
For the first example, the value from the selected column is 19500. So in your new
column, enter the text 15000 to 20000, which is the bin where that value falls.
When Power Query finds a matching transformation, it fills the transformation results
into the remaining rows using light-colored text. You can also see the M formula text for
the transformation above the table preview.
After you select OK, you'll see your new column as part of your query. You'll also see a
new step added to your query.
To do this, select the Column from examples command, and then select From all
columns.
Now you'll enter your first Full Name example as Enders, Maria.
After you select OK, you'll see your new column as part of your query. You'll also see a
new step added to your query.
Your last step is to remove the First Name, Last Name, and Monthly Income columns.
Your final table now contains the Range and Full Name columns with all the data you
produced in the previous steps.
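The two transformations inferred from the examples can be written out explicitly. This Python sketch is only an illustration of the logic behind the Range and Full Name columns. It isn't the M formula that Power Query generates, and the treatment of values that fall exactly on a bin boundary is an assumption.

```python
def income_range(monthly_income, bin_size=5000):
    """Return the bin label for a value, using discrete increments of `bin_size`."""
    lower = (monthly_income // bin_size) * bin_size
    # Assumption: a value on a boundary (for example, 20000) starts the next bin.
    return f"{lower} to {lower + bin_size}"


def full_name(last_name, first_name):
    """Concatenate last name and first name, separated by a comma and a space."""
    return f"{last_name}, {first_name}"


range_example = income_range(19500)
name_example = full_name("Enders", "Maria")
```

With the example values from this article, `income_range(19500)` yields the text `15000 to 20000`, and `full_name("Enders", "Maria")` yields `Enders, Maria`.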
Tips and considerations
When providing examples, Power Query offers a helpful list of available fields, values,
and suggested transformations for the selected columns. You can view this list by
selecting any cell of the new column.
It's important to note that the Column from examples experience works only on the top
100 rows of your data preview. You can apply steps before the Column from examples
step to create your own data sample. After the Column from examples column has
been created, you can delete those prior steps; the newly created column won't be
affected.
General
Conditional Column
Reference
Text transformations
Note
All Text transformations take into account the potential need to trim, clean, or apply
a case transformation to the column value.
Date transformations
Day
Day of Week
Day of Week Name
Day of Year
Month
Month Name
Quarter of Year
Week of Month
Week of Year
Year
Age
Start of Year
End of Year
Start of Month
End of Month
Start of Quarter
Days in Month
End of Quarter
Start of Week
End of Week
Day of Month
Start of Day
End of Day
Time transformations
Hour
Minute
Second
To Local Time
Note
All Date and Time transformations take into account the potential need to convert
the column value to Date, Time, or DateTime.
Number transformations
Absolute Value
Arccosine
Arcsine
Arctangent
Convert to Number
Cosine
Cube
Divide
Exponent
Factorial
Integer Divide
Is Even
Is Odd
Ln
Base-10 Logarithm
Modulo
Multiply
Round Down
Round Up
Sign
Sine
Square Root
Square
Subtract
Sum
Tangent
Bucketing/Ranges
Add an index column
Article • 12/17/2022
The Index column command adds a new column to the table with explicit position
values, and is usually created to support other transformation patterns.
By default, the index starts at 0 and increments by 1 per row.
You can also configure the behavior of this step by selecting the Custom option and
configuring two parameters: the starting index and the increment.
For the example in this article, you start with the following table that has only one
column, but notice the data pattern in the column.
Let's say that your goal is to transform that table into the one shown in the following
image, with the columns Date, Account, and Sale.
In the Modulo dialog box, enter the number from which to find the remainder for each
value in the column. In this case, your pattern repeats itself every three rows, so you'll
enter 3.
The result of that operation will give you a new column named Modulo.
Remove the Index column, because you no longer need it. Your table now looks like the
following image.
Step 4. Pivot a column
Your table now has three columns where:
To achieve the table you want, you need to pivot the Modulo column by using the
values from Column1 where these values don't get aggregated. On the Transform tab,
select the Modulo column, and then select Pivot column from the Any column group.
In the Pivot column dialog box, select the Advanced option button. Make sure Value
column is set to Column1 and Aggregate values function is set to Don't aggregate.
The result of that operation will give you a table with four columns, as shown in the
following image.
After defining the correct data types for your columns, you'll create a table that looks
like the following table, with exactly the three columns that you needed and the shape
that you were looking for.
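The index, modulo, and pivot pattern can be condensed into a few lines of Python. This is a sketch of the idea only: the sample values are hypothetical, and the grouping key (the integer division of the index by the pattern length) is an assumption about how each set of three rows is kept together during the pivot.

```python
def reshape_repeating_rows(values, pattern_length=3):
    """Turn a single column whose pattern repeats every `pattern_length` rows into a table."""
    records = {}
    for index, value in enumerate(values):   # index column, starting at 0
        position = index % pattern_length    # Modulo column: position inside the pattern
        group = index // pattern_length      # assumed grouping key: which output row
        records.setdefault(group, {})[position] = value
    # Pivot: one row per group, one column per position, no aggregation.
    return [
        [records[group].get(position) for position in range(pattern_length)]
        for group in sorted(records)
    ]


# Hypothetical single-column data in the order Date, Account, Sale.
column1 = ["2020-01-01", "Account A", 100, "2020-01-02", "Account B", 250]
table = reshape_repeating_rows(column1)
```

Each inner list of `table` holds one Date, Account, and Sale triple.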
Add a custom column
Article • 12/17/2022
If you need more flexibility for adding new columns than the ones provided out of the
box in Power Query, you can create your own custom column using the Power Query M
formula language.
Imagine that you have a table with the following set of columns.
Using the Units, Unit Price, and Discount columns, you'd like to create two new
columns:
Total Sale before Discount: Calculated by multiplying the Units column by the
Unit Price column.
Total Sale after Discount: Calculated by multiplying the Total Sale before Discount
column by the net percentage value (one minus the discount value).
The goal is to create a table with new columns that contain the total sales before the
discount and the total sales after the discount.
The initial name of your custom column in the New column name box. You can
rename this column.
A dropdown menu where you can select the data type for your new column.
An Available columns list on the right underneath the Data type selection.
A Custom column formula box where you can enter a Power Query M formula.
To add a new custom column, select a column from the Available columns list. Then,
select the Insert column button below the list to add it to the custom column formula.
You can also add a column by selecting it in the list. Alternatively, you can write your
own formula by using the Power Query M formula language in Custom column formula.
Note
If a syntax error occurs when you create your custom column, you'll see a yellow
warning icon, along with an error message and reason.
The result of that operation adds a new Total Sale before Discount column to your
table.
Note
If you're using Power Query Desktop, you'll notice that the Data type field isn't
available in Custom column. This means that you'll need to define a data type for
any custom columns after creating the columns. More information: Data types in
Power Query
The result of that operation adds a new Total Sale after Discount column to your table.
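The arithmetic behind the two custom columns is shown in the following Python sketch. It's an illustration with a hypothetical sample row, not the M formula you enter in the Custom column formula box.

```python
def add_sale_columns(rows):
    """Add the totals before and after discount to each row."""
    result = []
    for row in rows:
        before = row["Units"] * row["Unit Price"]
        after = before * (1 - row["Discount"])  # net percentage: one minus the discount
        result.append({
            **row,
            "Total Sale before Discount": before,
            "Total Sale after Discount": after,
        })
    return result


# Hypothetical sample row: 10 units at 20.0 each with a 25% discount.
orders = [{"Units": 10, "Unit Price": 20.0, "Discount": 0.25}]
totals = add_sale_columns(orders)
```

For the sample row, the total before discount is 200.0 and the total after discount is 150.0.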
The Custom column dialog box appears with the custom column formula you created.
Note
Depending on the formula you've used for your custom column, Power Query might
replace the standard Custom column dialog box with a simpler, native experience
for that step. For this example, the Added custom step changed its behavior from a
standard custom column step to a Multiplication experience because the formula
from that step only multiplies the values from two columns.
Next steps
You can create a custom column in other ways, such as creating a column based on
examples you provide to Power Query Editor. More information: Add a column
from an example
For Power Query M reference information, go to Power Query M function
reference.
Add a conditional column
Article • 12/17/2022
With Power Query, you can create new columns whose values will be based on one or
more conditions applied to other columns in your table.
The Conditional column command is located on the Add column tab, in the General
group.
In this table, you have a field that gives you the CustomerGroup. You also have different
prices applicable to that customer in the Tier 1 Price, Tier 2 Price, and Tier 3 Price fields.
In this example, your goal is to create a new column with the name Final Price based on
the value found in the CustomerGroup field. If the value in the CustomerGroup field is
equal to 1, you'll want to use the value from the Tier 1 Price field; otherwise, you'll use
the value from the Tier 3 Price.
To add this conditional column, select Conditional column. In the Add conditional
column dialog box, you can define three sections numbered in the following image.
1. New column name: You can define the name of your new column. In this example,
you'll use the name Final Price.
2. Conditional clauses: Here you define your conditional clauses. You can add more
clauses by selecting Add clause. Each conditional clause will be tested in the order
shown in the dialog box, from top to bottom. Each clause has four parts:
Column name: In the drop-down list, select the column to use for the
conditional test. For this example, select CustomerGroup.
Operator: Select the type of test or operator for the conditional test. In this
example, the value from the CustomerGroup column has to be equal to 1, so
select equals.
Value: You can enter a value or select a column to be used for the conditional
test. For this example, enter 1.
Output: If the test is positive, the value entered here or the column selected
will be the output. For this example, if the CustomerGroup value is equal to 1,
your Output value should be the value from the Tier 1 Price column.
3. Final Else clause: If none of the clauses above yield a positive test, the output of
this operation will be the one defined here, as a manually entered value or a value
from a column. In this case, the output will be the value from the Tier 3 Price
column.
The result of that operation will give you a new Final Price column.
Note
New conditional columns won't have a data type defined. You can add a new step
to define a data type for this newly created column by following the steps
described in Data types in Power Query.
If the value from the CustomerGroup column is equal to 1, the Output will be the
value from the Tier 1 Price column.
If the value from the CustomerGroup column is equal to 2, the Output will be the
value from the Tier 2 Price column.
If none of the previous tests are positive, the Output will be the value from the Tier
3 Price column.
Note
At the end of each clause, you can select the ellipsis button (...) to delete, move up,
or move down the clause.
The result of that operation will give you the result that you're looking for.
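The clauses are evaluated from top to bottom, and the first clause that tests positive determines the output. The three-clause version can be sketched in Python as follows. The function name and sample prices are hypothetical.

```python
def final_price(row):
    """Pick the price tier based on CustomerGroup, testing clauses in order."""
    if row["CustomerGroup"] == 1:
        return row["Tier 1 Price"]
    if row["CustomerGroup"] == 2:
        return row["Tier 2 Price"]
    return row["Tier 3 Price"]  # final Else clause


# Hypothetical sample rows.
customers = [
    {"CustomerGroup": 1, "Tier 1 Price": 100, "Tier 2 Price": 90, "Tier 3 Price": 80},
    {"CustomerGroup": 2, "Tier 1 Price": 100, "Tier 2 Price": 90, "Tier 3 Price": 80},
    {"CustomerGroup": 3, "Tier 1 Price": 100, "Tier 2 Price": 90, "Tier 3 Price": 80},
]
prices = [final_price(row) for row in customers]
```

Any CustomerGroup value other than 1 or 2 falls through to the Tier 3 Price.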
Rank column (Preview)
Article • 07/30/2022
The Rank column command adds a new column to a table with the ranking defined by
one or more other columns from the table. A Rank method option can be used to
define how ties should be handled.
Note
Currently, the rank column feature is only available in Power Query Online.
Team Total Points Bonus modifier
A 20 0.5
B 30 0.8
C 40 0.2
D 10 0.45
E 20 0.75
The teams have shared a list of ways that they want to rank each other:
Using only the values from the Total Points field where higher values rank higher
using standard competition as the rank method
Using only the values from the Total Points field where higher values rank higher
using dense as the rank method
Ranking first by the Total Points and then by Bonus modifier where higher values
rank higher using the standard competition as rank method
1. With the original table already in Power Query, select the Total Points column.
Then from the Power Query Add column tab, select Rank column.
2. In Rank, Rank by will be the field selected ( Total Points ) and the Rank criteria will
be Higher value ranks higher.
3. By default, the rank method for this dialog is standard competition, so just select
OK. This action will give you a new step with the added Rank column.
1. With the original table already in Power Query, select the Total Points column.
Then from the Power Query Add column tab, select Rank column.
2. In Rank, Rank by will be the field selected ( Total Points ) and the Rank criteria will
be Higher value ranks higher.
3. Select Advanced at the top of the dialog box. This selection enables the advanced
section. In Rank method, change the value from Standard competition to Dense.
4. After selecting the rank method, select OK. This action will give you a new step
with the added Rank column.
2. The rank dialog appears with its advanced section open, with both fields selected
in the Rank by column. Total Points is in the first row and then Bonus modifier
below it. Both rows use the Rank criteria of Higher value ranks higher.
3. Make sure that Rank method at the bottom is set to Standard competition.
4. After verifying the above, select OK. This action will give you a new step with the
added Rank column.
Rank methods
A rank method establishes the strategy in which ties are handled by the ranking
algorithm. This option is only available in the advanced section of the Rank dialog.
The following table lists all three available rank methods and provides a description for
each.
Rank method: Description
Standard competition: Items that compare equally receive the same ranking number, and
then a gap is left in the ranking numbers. For example, 1224.
Dense: Items that compare equally receive the same ranking number, and the next items
receive the immediately following ranking number. For example, 1223.
Ordinal: All items receive distinct ordinal numbers, including items that compare equally.
For example, 1234.
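To make the difference between the three methods concrete, here is a minimal Python sketch that ranks the sample teams with higher values ranking higher. It isn't Power Query's implementation: the function name and method labels are hypothetical, and breaking ordinal ties by original row order is an assumption.

```python
def rank_rows(rows, keys, method="competition"):
    """Rank rows by `keys` (higher ranks higher) using competition, dense, or ordinal."""
    # Python's sort is stable, so tied rows keep their original relative order.
    ordered = sorted(rows, key=lambda r: tuple(r[k] for k in keys), reverse=True)
    ranks = {}
    previous_key = None
    current_rank = 0
    for position, row in enumerate(ordered, start=1):
        key = tuple(row[k] for k in keys)
        if method == "ordinal":
            current_rank = position              # always distinct: 1234
        elif key != previous_key:
            if method == "competition":
                current_rank = position          # leaves a gap after ties: 1224
            else:
                current_rank += 1                # dense, no gap: 1223
        ranks[row["Team"]] = current_rank
        previous_key = key
    return ranks


teams = [
    {"Team": "A", "Total Points": 20, "Bonus modifier": 0.5},
    {"Team": "B", "Total Points": 30, "Bonus modifier": 0.8},
    {"Team": "C", "Total Points": 40, "Bonus modifier": 0.2},
    {"Team": "D", "Total Points": 10, "Bonus modifier": 0.45},
    {"Team": "E", "Total Points": 20, "Bonus modifier": 0.75},
]

competition = rank_rows(teams, ["Total Points"])
dense = rank_rows(teams, ["Total Points"], method="dense")
by_both = rank_rows(teams, ["Total Points", "Bonus modifier"])
```

Teams A and E tie on Total Points, so they share rank 3 under both standard competition and dense. Team D then gets rank 5 under standard competition and rank 4 under dense. Adding Bonus modifier as a second criterion breaks the tie in favor of team E.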
Cluster values
Article • 12/17/2022
Cluster values automatically creates groups of similar values using a fuzzy matching
algorithm, and then maps each column's value to the best-matched group. This
transform is useful when you're working with data that has many different
variations of the same value and you need to combine values into consistent groups.
Consider a sample table with an id column that contains a set of IDs and a Person
column containing a set of variously spelled and capitalized versions of the names
Miguel, Mike, William, and Bill.
In this example, the outcome you're looking for is a table with a new column that shows
the right groups of values from the Person column and not all the different variations of
the same words.
Note
The Cluster values feature is available only for Power Query Online.
In the Cluster values dialog box, confirm the column that you want to use to create the
clusters from, and enter the new name of the column. For this case, name this new
column Cluster.
The result of that operation yields the result shown in the next image.
Note
For each cluster of values, Power Query picks the most frequent instance from the
selected column as the "canonical" instance. If multiple instances occur with the
same frequency, Power Query picks the first one.
Similarity threshold (optional): This option indicates how similar two values must
be to be grouped together. The minimum setting of 0 causes all values to be
grouped together. The maximum setting of 1 only allows values that match exactly
to be grouped together. The default is 0.8.
Ignore case: When comparing text strings, case is ignored. This option is enabled
by default.
Group by combining text parts: The algorithm tries to combine text parts (such as
combining Micro and soft into Microsoft) to group values.
Show similarity scores: Shows similarity scores between the input values and
computed representative values after fuzzy clustering.
Transformation table (optional): You can select a transformation table that maps
values (such as mapping MSFT to Microsoft) to group them together.
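The effect of the similarity threshold and the Ignore case option can be explored with a small Python sketch. This article doesn't describe the similarity measure that Power Query uses, so the sketch substitutes `difflib.SequenceMatcher` from the Python standard library as a stand-in. The greedy grouping strategy and the function name are also assumptions. The canonical value is chosen as the most frequent instance, with ties going to the first one.

```python
from collections import Counter
from difflib import SequenceMatcher


def cluster_values(values, threshold=0.8, ignore_case=True):
    """Map each value to the canonical value of its cluster (illustrative only)."""
    def normalize(text):
        return text.lower() if ignore_case else text

    clusters = []  # each cluster is a list of original values
    for value in values:
        for members in clusters:
            # Compare against the first member of the cluster (a simplification).
            score = SequenceMatcher(None, normalize(value), normalize(members[0])).ratio()
            if score >= threshold:
                members.append(value)
                break
        else:
            clusters.append([value])

    mapping = {}
    for members in clusters:
        # Most frequent instance wins; equal counts keep first-seen order.
        canonical = Counter(members).most_common(1)[0][0]
        for member in members:
            mapping[member] = canonical
    return [mapping[value] for value in values]


people = ["Miguel", "miguel", "MIGUEL", "William", "william", "Miguel"]
clustered = cluster_values(people)
```

A threshold of 0 places every value in a single group, and a threshold of 1 with case sensitivity groups only exact matches, which matches the option descriptions above.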
For this example, a new transformation table with the name My transform table is used
to demonstrate how values can be mapped. This transformation table has two columns:
It's important that the transformation table has the same columns and column
names as shown in the previous image (they have to be named "From" and "To"),
otherwise Power Query won't recognize this table as a transformation table, and no
transformation will take place.
Using the previously created query, double-click the Clustered values step, then in the
Cluster values dialog box, expand Fuzzy cluster options. Under Fuzzy cluster options,
enable the Show similarity scores option. For Transformation table (optional), select
the query that has the transform table.
After selecting your transformation table and enabling the Show similarity scores
option, select OK. The result of that operation will give you a table that contains the
same id and Person columns as the original table, but also includes two new columns
on the right called Cluster and Person_Cluster_Similarity. The Cluster column contains
the properly spelled and capitalized versions of the names Miguel for versions of Miguel
and Mike, and William for versions of Bill, Billy, and William. The
Person_Cluster_Similarity column contains the similarity scores for each of the names.
Append queries
Article • 12/17/2022
The append operation creates a single table by adding the contents of one or more
tables to another, and aggregates the column headers from the tables to create the
schema for the new table.
Note
When tables that don't have the same column headers are appended, all column
headers from all tables are appended to the resulting table. If one of the appended
tables doesn't have a column header from other tables, the resulting table shows
null values in the respective column, as shown in the previous image in columns C
and D.
You can find the Append queries command on the Home tab in the Combine group.
On the drop-down menu, you'll see two options:
Append queries displays the Append dialog box to add additional tables to the
current query.
Append queries as new displays the Append dialog box to create a new query by
appending multiple tables.
The append operation requires at least two tables. The Append dialog box has two
modes:
Two tables: Combine two table queries together. This mode is the default mode.
Three or more tables: Allow an arbitrary number of table queries to be combined.
Note
The tables will be appended in the order in which they're selected, starting with the
Primary table for the Two tables mode and from the primary table in the Tables to
append list for the Three or more tables mode.
To append these tables, first select the Online Sales table. On the Home tab, select
Append queries, which creates a new step in the Online Sales query. The Online Sales
table will be the primary table. The table to append to the primary table will be Store
Sales.
Power Query performs the append operation based on the names of the column
headers found on both tables, and not based on their relative position in the headers
sections of their respective tables. The final table will have all columns from all tables
appended.
In the event that one table doesn't have columns found in another table, null values will
appear in the corresponding column, as shown in the Referer column of the final query.
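Matching columns by name rather than by position, and filling missing columns with null values, can be sketched in Python as follows. The function name and the sample rows are hypothetical, and `None` stands in for null.

```python
def append_tables(*tables):
    """Append tables by column name; columns missing from a table are filled with None."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)  # keep first-seen column order
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]


# Hypothetical rows: the columns appear in a different order, and only one table has Referer.
online_sales = [{"Date": "2020-01-01", "Units": 5, "Referer": "Search"}]
store_sales = [{"Units": 3, "Date": "2020-01-02"}]

appended = append_tables(online_sales, store_sales)
```

The Store Sales row lines up under Date and Units despite the different column order, and its Referer value is `None`.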
The new approach for this example is to select Append queries as new, and then in the
Append dialog box, select the Three or more tables option button. In the Available
table(s) list, select each table you want to append, and then select Add. After all the
tables you want appear in the Tables to append list, select OK.
After selecting OK, a new query will be created with all your tables appended.
Combine files overview
Article • 02/17/2023
With Power Query, you can combine multiple files that have the same schema into a
single logical table.
This feature is useful when you want to combine all the files you have in the same folder.
For example, if you have a folder that contains monthly files with all the purchase orders
for your company, you can combine these files to consolidate the orders into a single
view.
Files can come from a variety of sources, such as (but not limited to):
Local folders
SharePoint sites
Azure Blob storage
Azure Data Lake Storage (Gen1 and Gen2)
When working with these sources, you'll notice that they share the same table schema,
commonly referred to as the file system view. The following screenshot shows an
example of the file system view.
In the file system view, the Content column contains the binary representation of each
file.
Note
You can filter the list of files in the file system view by using any of the available
fields. It's good practice to filter this view to show only the files you need to
combine, for example by filtering fields such as Extension or Folder Path. More
information: Folder
Selecting any of the [Binary] values in the Content column automatically creates a series
of navigation steps to that specific file. Power Query will try to interpret the binary by
using one of the available connectors, such as Text/CSV, Excel, JSON, or XML.
Table preview
Combine files dialog box
Combined files output
Table preview
When you connect to a data source by using any of the previously mentioned
connectors, a table preview opens. If you're certain that you want to combine all the files
in the folder, select Combine in the lower-right corner of the screen.
Alternatively, you can select Transform data to access the Power Query Editor and
create a subset of the list of files (for example, by using filters on the folder path column
to only include files from a specific subfolder). Then combine files by selecting the
Content column, which contains the binaries, and then selecting either:
The Combine files command in the Combine group on the Home tab.
The Combine files icon in the column header of the column that contains [Binary]
values.
1. Power Query analyzes the example file (by default, the first file in the list) and
determines the correct file connector to use to open that file.
2. The dialog box provides the file connector experience exactly as if you were to
connect directly to that example file.
If you want to use a different file for the example file, you can choose it from
the Example file drop-down menu.
Optional: You can select Skip files with errors to exclude from the final
output any files that result in errors.
In the following image, Power Query has detected that the first file has a .csv file name
extension, so it uses the Text/CSV connector to interpret the file.
Combined files output
After the Combine files process is finished, Power Query automatically performs the
following actions:
1. Creates an example query that performs all the required extraction steps for a
single file. It uses the file that was selected as the example file in the Combine files
dialog box.
This example query has the name Transform Sample file in the Queries pane.
2. Creates a function query that parameterizes the file/binary input to the example
query. The example query and the function query are linked, so that changes to
the example query are reflected in the function query.
3. Applies the function query to the original query with input binaries (for example,
the folder query) so it applies the function query for binary inputs on each row,
and then expands the resulting data extraction as top-level columns.
4. Creates a new group with the prefix Transform file from and the initial query as
the suffix, and organizes all the components used to create these combined files in
that group.
You can easily combine all files within a given folder, as long as they have the same file
type and structure (including the same columns). You can also apply additional
transformation or extraction steps by modifying the automatically generated example
query, without having to worry about modifying or creating additional function query
steps.
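The structure that Power Query generates (one function applied to the binary of every file, with the results expanded into a single table) can be sketched in Python. This is an analogy under stated assumptions: the function names are hypothetical, and files are modeled as dictionaries with Name and Content keys that mimic the file system view.

```python
import csv
import io


def read_csv_rows(content):
    """Stand-in for the function query: turn one file's binary content into rows."""
    text = content.decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def combine_files(files, transform_file, skip_errors=False):
    """Apply `transform_file` to each file's Content and expand the rows into one table."""
    combined = []
    for file in files:
        try:
            rows = transform_file(file["Content"])
        except Exception:
            if skip_errors:  # comparable to the Skip files with errors option
                continue
            raise
        for row in rows:
            combined.append({"Source.Name": file["Name"], **row})
    return combined


files = [
    {"Name": "January.csv", "Content": b"Country,Units\nUSA,10\n"},
    {"Name": "February.csv", "Content": b"Country,Units\nCanada,7\n"},
]
combined = combine_files(files, read_csv_rows)
```

Because every file goes through the same function, changing `read_csv_rows` changes how all files are read, which is the role the example query plays for the function query.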
Note
You can modify the steps inside the example query to change the function applied
to each binary in your query. The example query is linked to the function, so any
changes made to the example query will be reflected in the function query.
If any of the changes affect column names or column data types, be sure to check
the last step of your output query. Adding a Change column type step can
introduce a step-level error that prevents you from visualizing your table. More
information: Dealing with errors
See also
Combine CSV files
Combine CSV files
Article • 02/17/2023
In Power Query, you can combine multiple files from a given data source. This article
describes how the experience works when the files that you want to combine are CSV
files. More information: Combine files overview
Tip
You can follow along with this example by downloading the sample files used in
this article from this download link. You can place those files in the data source
of your choice, such as a local folder, SharePoint folder, Azure Blob storage, Azure
Data Lake Storage, or other data source that provides the file system view.
For simplicity, the example in this article uses the Folder connector. More information:
Folder
There are 12 CSV files, one for each month of the calendar year 2019. The following
image shows the first 15 rows of the file for the month of January.
The number of rows varies from file to file, but all files have a header section in the first
four rows. They have column headers in the fifth row, and the data for the table begins
in the sixth row and continues through all subsequent rows.
The goal is to combine all 12 files into a single table. This combined table contains the
header row at the top of the table, and includes the source name, date, country, units,
and revenue data for the entire year in separate columns after the header row.
Table preview
When connecting to the folder that hosts the files that you want to combine—in this
example, the name of that folder is CSV Files—you're shown the table preview dialog
box, which displays your folder path in the upper-left corner. The data preview shows
the file system view.
Note
In a different situation, you might select Transform data to further filter and
transform your data before combining the files. Selecting Combine is only
recommended when you're certain that the folder contains only the files that you
want to combine.
Note
Power Query automatically detects what connector to use based on the first file
found in the list. To learn more about the CSV connector, go to Text/CSV.
For this example, leave all the default settings (Example file set to First file, and the
default values for File origin, Delimiter, and Data type detection).
Now select Transform data in the lower-right corner to go to the output query.
Output query
After selecting Transform data in the Combine files dialog box, you'll be taken back to
the Power Query Editor in the query that you initially created from the connection to the
local folder. The output query now contains the source file name in the left-most
column, along with the data from each of the source files in the remaining columns.
However, the data isn't in the correct shape. You need to remove the top four rows from
each file before combining them. To make this change in each file before you combine
them, select the Transform Sample file query in the Queries pane on the left side of
your screen.
The transformations that need to be added to the Transform Sample file query are:
1. Remove top rows: To perform this operation, select the table icon menu in the
upper-left corner of the table, and then select Remove top rows.
In the Remove top rows dialog box, enter 4, and then select OK.
After selecting OK, your table will no longer have the top four rows.
2. Use first row as headers: Select the table icon again, and then select Use first row
as headers.
The result of that operation will promote the first row of the table to the new
column headers.
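The two transformations added to the Transform Sample file query amount to skipping the four-row header section and promoting the next row to column headers. The following Python sketch shows the same logic for a single file. The function name and the sample file content are hypothetical.

```python
import csv
import io


def transform_sample_file(content, rows_to_skip=4):
    """Remove the top rows, then use the first remaining row as headers."""
    lines = list(csv.reader(io.StringIO(content.decode("utf-8"))))
    remaining = lines[rows_to_skip:]             # Remove top rows
    headers, data = remaining[0], remaining[1:]  # Use first row as headers
    return [dict(zip(headers, row)) for row in data]


# Hypothetical file: four header-section rows, then column headers, then data.
january = (
    b"Report: Sales\n"
    b"Year: 2019\n"
    b"Month: January\n"
    b"Generated by: Finance\n"
    b"Date,Country,Units,Revenue\n"
    b"2019-01-01,USA,10,100.50\n"
    b"2019-01-02,Canada,7,70.00\n"
)
rows = transform_sample_file(january)
```

All values are still text at this point. Assigning data types is a separate step.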
After this operation is completed, Power Query by default will try to automatically detect
the data types of the columns and add a new Changed column type step.
However, notice that none of the columns derived from the files (Date, Country, Units,
Revenue) have a specific data type assigned to them. Assign the correct data type to
each column by using the following table.
Column name: Data type
Date: Date
Country: Text
Revenue: Currency
After defining the data types for each column, you'll be ready to load the table.
Verification
To validate that all files have been combined, you can select the filter icon on the
Source.Name column heading, which will display all the names of the files that have
been combined. If you get the warning "List may be incomplete," select Load more at
the bottom of the menu to display more available values in the column.
After you select Load more, all available file names will be displayed.
Merge queries overview
Article • 08/30/2023
A merge queries operation joins two existing tables together based on matching values
from one or multiple columns. You can choose to use different types of joins, depending
on the output you want.
Merging queries
You can find the Merge queries command on the Home tab, in the Combine group.
From the drop-down menu, you'll see two options:
Merge queries: Displays the Merge dialog box, with the selected query as the left
table of the merge operation.
Merge queries as new: Displays the Merge dialog box without any preselected
tables for the merge operation.
Left table for merge: The first selection, from top to bottom of your screen.
Right table for merge: The second selection, from top to bottom of your screen.
Note
The position—left or right—of the tables becomes very important when you select
the correct join kind to use.
Sales: The CountryID field is a key or an identifier from the Countries table.
Countries: This table contains the CountryID and the name of the country.
The goal is to join these tables by using the CountryID column from both tables, so you
select the CountryID column from each table. After you make the selections, a message
appears with an estimated number of matches at the bottom of the dialog box.
Note
Although this example shows the same column header for both tables, this isn't a
requirement for the merge operation. Column headers don't need to match
between tables. However, it's important to note that the columns must be of the
same data type, otherwise the merge operation might not yield correct results.
You can also select multiple columns to perform the join by holding Ctrl as you select
the columns. When you do so, the order in which the columns were selected is
displayed in small numbers next to the column headings, starting with 1.
For this example, you have the Sales and Countries tables. Each of the tables has
CountryID and StateID columns, which you need to pair for the join between both
tables.
First select the CountryID column in the Sales table, hold Ctrl, and then select the
StateID column. (This will show the small numbers in the column headings.) Next,
perform the same selections in the Countries table. The following image shows the
result of selecting those columns.
Note
When selecting multiple columns for a join, the order you select the columns in
each table must match. For example, the first column selected in the left table is
matched with the first column selected in the right table, and so on. Otherwise,
you'll observe incorrect join results.
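The effect of selection order can be sketched outside Power Query. The following Python snippet is an illustration only, not Power Query's implementation. The rows and the State field are hypothetical; only the CountryID and StateID column names come from this article. Each row's key is built as a tuple in the order the columns were selected, so reversing the order on one side compares CountryID with StateID.

```python
# Hypothetical rows; only the CountryID and StateID column names come from the text.
sales = [
    {"CountryID": 1, "StateID": 10, "Units": 40},
    {"CountryID": 2, "StateID": 20, "Units": 25},
]
countries = [
    {"CountryID": 1, "StateID": 10, "State": "State A"},
    {"CountryID": 2, "StateID": 20, "State": "State B"},
]

def count_matches(left, right, left_cols, right_cols):
    """Count left rows whose key tuple, built in selection order, exists on the right."""
    right_keys = {tuple(row[c] for c in right_cols) for row in right}
    return sum(tuple(row[c] for c in left_cols) in right_keys for row in left)

# Same selection order in both tables: keys line up pair by pair.
aligned = count_matches(sales, countries, ["CountryID", "StateID"], ["CountryID", "StateID"])
# Order reversed on the right: CountryID is now compared with StateID.
swapped = count_matches(sales, countries, ["CountryID", "StateID"], ["StateID", "CountryID"])
print(aligned, swapped)
```

With matching order every row finds its pair. With the reversed order none do, even though the data is identical.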
The feature can offer several suggestions but, for this scenario, it offers only one: a
mapping of the CountryID column from the Sales table to the CountryID column from
the Countries table. You can select it and the column-pair-mapping will be
automatically applied to your dialog.
Note
Only the column-pair-mapping is suggested. Other options in this dialog, such as
the join kind or fuzzy matching configuration, are out of scope for this suggestion.
From here, you can choose to expand or aggregate the fields from this new table
column, which will be the fields from your right table.
Note
Currently, the Power Query Online experience only provides the expand operation
in its interface. The option to aggregate will be added later this year.
Join kinds
A join kind specifies how a merge operation will be performed. The following table
describes the available join kinds in Power Query.
Join kind      Description
Left outer     All rows from the left table, matching rows from the right table
Right outer    All rows from the right table, matching rows from the left table
Fuzzy matching
You use fuzzy merge to apply fuzzy matching algorithms when comparing columns, to
try to find matches across the tables you're merging. You can enable this feature by
selecting the Use fuzzy matching to perform the merge check box in the Merge dialog
box. Expand Fuzzy matching options to view all available configurations.
Note
Fuzzy matching is only supported for merge operations over text columns.
Left outer join
Article • 12/17/2022
One of the join kinds available in the Merge dialog box in Power Query is a left outer
join, which keeps all the rows from the left table and brings in any matching rows from
the right table. More information: Merge operations overview
This article uses sample data to show how to do a merge operation with the left outer
join. The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This table is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where the name of the country appears
as a new Country column in the Sales table as long as the CountryID exists in the
Countries table. If there are no matches between the left and right tables, a null value is
the result of the merge for that row. In the following image, this is shown to be the case
for CountryID 4, which was brought in from the Sales table.
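As a rough illustration of the left outer join logic (not Power Query's own code), the following Python sketch uses hypothetical sample rows, because the article's sample tables appear only as images. It assumes that id is unique in the Countries table, as the article states.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 3, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 4, "Units": 35},
]
countries = [
    {"id": 1, "Country": "USA"},
    {"id": 2, "Country": "Canada"},
    {"id": 3, "Country": "Panama"},
]

def left_outer_join(left, right):
    """Keep every left row; add Country from the right table, or None if no match."""
    names = {c["id"]: c["Country"] for c in right}  # assumes id is unique
    return [{**s, "Country": names.get(s["CountryID"])} for s in left]

merged = left_outer_join(sales, countries)
for row in merged:
    print(row)
```

The row with CountryID 4 is kept, and its Country value is None, which corresponds to the null value described above.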
Right outer join
One of the join kinds available in the Merge dialog box in Power Query is a right outer
join, which keeps all the rows from the right table and brings in any matching rows from
the left table. More information: Merge operations overview
This article uses sample data to show how to do a merge operation with the right outer
join. The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. The CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This table is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where the name of the country appears
as a new Country column in the Sales table. Because of how the right outer join works,
all rows from the right table will be brought in, but only matching rows from the left
table will be kept.
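The right outer join behavior can be sketched in Python as follows. This is an illustration with hypothetical sample rows, not Power Query's implementation. Every row of the right table appears in the output, and left rows appear only when they match.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 3, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 4, "Units": 35},
]
countries = [
    {"id": 3, "Country": "Panama"},
    {"id": 5, "Country": "Spain"},
]

def right_outer_join(left, right):
    """Keep every right row; attach matching left rows, or None-filled left fields."""
    by_id = {}
    for s in left:
        by_id.setdefault(s["CountryID"], []).append(s)
    blank = dict.fromkeys(left[0])  # all left fields set to None
    result = []
    for c in right:
        matches = by_id.get(c["id"], [blank])
        result.extend({**s, "Country": c["Country"]} for s in matches)
    return result

merged = right_outer_join(sales, countries)
for row in merged:
    print(row)
```

Sales rows with CountryID 1 and 4 have no match in this hypothetical Countries table, so they're dropped, while the unmatched Spain row is kept with empty Sales fields.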
Full outer join
One of the join kinds available in the Merge dialog box in Power Query is a full outer
join, which brings in all the rows from both the left and right tables. More information:
Merge operations overview
This article uses sample data to show how to do a merge operation with the full outer
join. The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where the name of the country appears
as a new Country column in the Sales table. Because of how the full outer join works, all
rows from both the left and right tables will be brought in, regardless of whether they
only appear in one of the tables.
You can merge on more than one column by selecting and holding Ctrl and then
selecting the columns.
Tip
Take a closer look at the message at the bottom of the dialog box that reads "The
selection matches 4 of 4 rows from the first table, and 3 of 4 rows from the second
table." This message is crucial for understanding the result that you get from this
operation.
In the Countries table, you have the Country Spain with id of 4, but there are no records
for CountryID 4 in the Sales table. That's why only three of four rows from the right
table found a match. All rows from the right table that didn't have matching rows from
the left table will be grouped and shown in a new row in the output table with no values
for the fields from the left table.
From the newly created Countries column after the merge operation, expand the
Country field. Don't select the Use original column name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
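The full outer join and the match message can be approximated with a short Python sketch. The rows are hypothetical, chosen so that the counts agree with the message quoted above (4 of 4 rows from the first table, 3 of 4 from the second). The code assumes id is unique in the Countries table and is not Power Query's implementation.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 2, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 3, "Units": 35},
]
countries = [
    {"id": 1, "Country": "USA"},
    {"id": 2, "Country": "Canada"},
    {"id": 3, "Country": "Panama"},
    {"id": 4, "Country": "Spain"},
]

def full_outer_join(left, right):
    """Keep all rows from both tables; unmatched sides are filled with None."""
    names = {c["id"]: c["Country"] for c in right}  # assumes id is unique
    used = {s["CountryID"] for s in left}
    blank = dict.fromkeys(left[0])  # all left fields set to None
    result = [{**s, "Country": names.get(s["CountryID"])} for s in left]
    result += [{**blank, "Country": c["Country"]} for c in right if c["id"] not in used]
    return result

merged = full_outer_join(sales, countries)
left_matched = sum(s["CountryID"] in {c["id"] for c in countries} for s in sales)
right_matched = sum(c["id"] in {s["CountryID"] for s in sales} for c in countries)
print(f"{left_matched} of {len(sales)} left rows, {right_matched} of {len(countries)} right rows")
```

The Spain row has no Sales counterpart, so it appears as an extra row whose Sales fields are empty.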
Inner join
Article • 12/17/2022
One of the join kinds available in the Merge dialog box in Power Query is an inner join,
which brings in only matching rows from both the left and right tables. More
information: Merge operations overview
This article uses sample data to show how to do a merge operation with the inner join.
The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where the name of the country appears
as a new Country column in the Sales table. Because of how the inner join works, only
matching rows from both the left and right tables will be brought in.
Take a closer look at the message at the bottom of the dialog box that reads "The
selection matches 1 of 4 rows from the first table, and 1 of 2 rows from the second
table." This message is crucial to understanding the result that you get from this
operation.
In the Sales table, you have a CountryID of 1 and 2, but neither of these values is
found in the Countries table. That's why the match only found one of four rows in the
left (first) table.
In the Countries table, you have the Country Spain with the id 4, but there are no
records for a CountryID of 4 in the Sales table. That's why only one of two rows from
the right (second) table found a match.
From the newly created Countries column, expand the Country field. Don't select the
Use original column name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
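A minimal Python sketch of the inner join follows. The rows are hypothetical but chosen to agree with the message described above (1 of 4 rows from the first table, 1 of 2 from the second). It assumes id is unique in the Countries table and isn't Power Query's own code.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 2, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 3, "Units": 35},
]
countries = [
    {"id": 3, "Country": "Panama"},
    {"id": 4, "Country": "Spain"},
]

names = {c["id"]: c["Country"] for c in countries}  # assumes id is unique

# Inner join: keep only the Sales rows whose CountryID exists in Countries.
inner = [{**s, "Country": names[s["CountryID"]]} for s in sales if s["CountryID"] in names]

left_matched = len(inner)
right_matched = sum(c["id"] in {s["CountryID"] for s in sales} for c in countries)
print(inner)
print(f"{left_matched} of {len(sales)} left rows, {right_matched} of {len(countries)} right rows")
```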
Left anti join
Article • 12/17/2022
One of the join kinds available in the Merge dialog box in Power Query is a left anti join,
which brings in only rows from the left table that don't have any matching rows from
the right table. More information: Merge operations overview
This article uses sample data to show how to do a merge operation with the left anti
join. The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This table is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where only the rows from the left table
that don't match any from the right table are kept.
Take a closer look at the message at the bottom of the dialog box that reads "The
selection excludes 1 of 4 rows from the first table." This message is crucial to
understanding the result that you get from this operation.
In the Sales table, you have a CountryID of 1 and 2, but neither of them is found in the
Countries table. That's why the match only found one of four rows in the left (first) table.
In the Countries table, you have the Country Spain with an id of 4, but there are no
records for CountryID 4 in the Sales table. That's why only one of two rows from the
right (second) table found a match.
From the newly created Countries column, expand the Country field. Don't select the
Use original column name as prefix check box.
After doing this operation, you'll create a table that looks like the following image. The
newly expanded Country field doesn't have any values. That's because the left anti join
doesn't bring any values from the right table—it only keeps rows from the left table.
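The left anti join can be sketched in Python as a filter on the left table. The rows are hypothetical, chosen to agree with the message quoted above (1 of 4 rows excluded). This is an illustration, not Power Query's implementation.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 2, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 3, "Units": 35},
]
countries = [
    {"id": 3, "Country": "Panama"},
    {"id": 4, "Country": "Spain"},
]

country_ids = {c["id"] for c in countries}

# Left anti join: keep only the Sales rows with no match in Countries.
# The expanded Country field is always None because nothing matched.
left_anti = [{**s, "Country": None} for s in sales if s["CountryID"] not in country_ids]
excluded = len(sales) - len(left_anti)
print(left_anti)
print(f"excludes {excluded} of {len(sales)} rows from the first table")
```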
Right anti join
Article • 12/17/2022
One of the join kinds available in the Merge dialog box in Power Query is a right anti
join, which brings in only rows from the right table that don't have any matching rows
from the left table. More information: Merge operations overview
This article uses sample data to show how to do a merge operation with the right anti
join. The sample source tables for this example are:
Sales: This table includes the fields Date, CountryID, and Units. CountryID is a
whole number value that represents the unique identifier from the Countries table.
Countries: This is a reference table with the fields id and Country. The id field
represents the unique identifier for each record.
In this example, you'll merge both tables, with the Sales table as the left table and the
Countries table as the right one. The join will be made between the following columns.
Field from the Sales table    Field from the Countries table
CountryID                     id
The goal is to create a table like the following, where only the rows from the right table
that don't match any from the left table are kept. As a common use case, you can find
all the rows that are available in the right table but aren't found in the left table.
Take a closer look at the message at the bottom of the dialog box that reads "The
selection excludes 1 of 2 rows from the second table." This message is crucial to
understanding the result that you get from this operation.
In the Countries table, you have the Country Spain with an id of 4, but there are no
records for CountryID 4 in the Sales table. That's why only one of two rows from the
right (second) table found a match. Because of how the right anti join works, you'll never
see any rows from the left (first) table in the output of this operation.
From the newly created Countries column, expand the Country field. Don't select the
Use original column name as prefix check box.
After performing this operation, you'll create a table that looks like the following image.
The fields that come from the left table don't have any values. That's because the right
anti join doesn't bring any values from the left table. It only keeps rows from the right table.
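As an illustration of the right anti join (not Power Query's own code), the following Python sketch uses hypothetical rows that agree with the message quoted above (1 of 2 rows excluded from the second table). Only the unmatched Countries row survives, and the Sales fields are empty.

```python
# Hypothetical sample rows; the field names follow the article.
sales = [
    {"Date": "2020-01-01", "CountryID": 1, "Units": 40},
    {"Date": "2020-01-02", "CountryID": 1, "Units": 25},
    {"Date": "2020-01-03", "CountryID": 2, "Units": 30},
    {"Date": "2020-01-04", "CountryID": 3, "Units": 35},
]
countries = [
    {"id": 3, "Country": "Panama"},
    {"id": 4, "Country": "Spain"},
]

sales_ids = {s["CountryID"] for s in sales}
blank = dict.fromkeys(sales[0])  # all Sales fields set to None

# Right anti join: keep only the Countries rows with no match in Sales.
right_anti = [{**blank, "Country": c["Country"]} for c in countries if c["id"] not in sales_ids]
excluded = len(countries) - len(right_anti)
print(right_anti)
print(f"excludes {excluded} of {len(countries)} rows from the second table")
```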
Fuzzy merge
Article • 12/17/2022
Fuzzy merge is a smart data preparation feature you can use to apply fuzzy matching
algorithms when comparing columns, to try to find matches across the tables that are
being merged.
You can enable fuzzy matching at the bottom of the Merge dialog box by selecting the
Use fuzzy matching to perform the merge option button. More information: Merge
operations overview
Note
Fuzzy matching is only supported on merge operations over text columns. Power
Query uses the Jaccard similarity algorithm to measure the similarity between pairs
of instances.
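The Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The following Python sketch shows the general idea only. It assumes that each text value is broken into lowercase two-character fragments, which is an assumption made for illustration. This article doesn't describe how Power Query tokenizes text, so the scores from this sketch shouldn't be expected to match the similarity scores that Power Query reports.

```python
def bigrams(text):
    """Set of lowercase two-character fragments (an assumed tokenization)."""
    text = text.lower()
    return {text[i:i + 2] for i in range(len(text) - 1)}

def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union| of the two fragment sets."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0  # two empty sets are treated as identical
    return len(x & y) / len(x | y)

print(jaccard("Apple", "apple"))
print(jaccard("Grapes", "Graes"))
```

Identical values score 1.0, and the score drops as the two values share fewer fragments.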
Sample scenario
A common use case for fuzzy matching is with freeform text fields, such as in a survey.
For this article, the sample table was taken directly from an online survey sent to a
group with only one question: What is your favorite fruit?
To help standardize these values, in this example you have a Fruits reference table.
Note
For simplicity, this Fruits reference table only includes the name of the fruits that
will be needed for this scenario. Your reference table can have as many rows as you
need.
The goal is to create a table like the following, where you've standardized all these
values so you can do more analysis.
Similarity threshold (optional): A value between 0.00 and 1.00 that provides the
ability to match records above a given similarity score. A threshold of 1.00 is the
same as specifying an exact match criteria. For example, Grapes matches with
Graes (missing the letter p) only if the threshold is set to less than 0.90. By default,
this value is set to 0.80.
Ignore case: Allows matching records no matter what the case of the text.
Match by combining text parts: Allows combining text parts to find matches. For
example, Micro soft is matched with Microsoft if this option is enabled.
Show similarity scores: Shows similarity scores between the input and the matched
values after fuzzy matching.
Number of matches (optional): Specifies the maximum number of matching rows
that can be returned for every input row.
Transformation table (optional): Allows matching records based on custom value
mappings. For example, Grapes is matched with Raisins if a transformation table is
provided where the From column contains Grapes and the To column contains
Raisins.
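How the threshold, case handling, and transformation table options interact can be sketched in Python. This sketch is a simplification under stated assumptions. It uses difflib.SequenceMatcher from the Python standard library as a stand-in similarity measure, which is not the algorithm Power Query uses. The fuzzy_lookup function and the fruit names are hypothetical, and only the single best match is returned (similar to setting Number of matches to 1).

```python
from difflib import SequenceMatcher

def fuzzy_lookup(value, reference, threshold=0.80, ignore_case=True, transformation=None):
    """Return (best_match, score); best_match is None when score < threshold."""
    # A transformation table maps a From value to a To value before matching.
    candidate = (transformation or {}).get(value, value)

    def norm(text):
        return text.lower() if ignore_case else text

    best, best_score = None, 0.0
    for ref in reference:
        # Stand-in similarity measure, not Power Query's algorithm.
        score = SequenceMatcher(None, norm(candidate), norm(ref)).ratio()
        if score > best_score:
            best, best_score = ref, score
    return (best, best_score) if best_score >= threshold else (None, best_score)

fruits = ["Apple", "Grapes", "Pineapple", "Watermelon"]  # hypothetical reference table

print(fuzzy_lookup("APPLE", fruits))                                   # case ignored
print(fuzzy_lookup("apls", fruits))                                    # below threshold
print(fuzzy_lookup("apls", fruits, transformation={"apls": "Apple"}))  # mapped first
```

Lowering the threshold accepts weaker matches. A transformation table handles values that no reasonable threshold would catch.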
Transformation table
For the example in this article, you can use a transformation table to map the value that
has a missing pair. That value is apls, which needs to be mapped to Apple. Your
transformation table has two columns:
From To
apls Apple
You can go back to the Merge dialog box, and in Fuzzy matching options under
Number of matches, enter 1. Enable the Show similarity scores option, and then, under
Transformation table, select Transform Table from the drop-down menu.
After you select OK, you can go to the merge step. When you expand the column with
table values, you'll notice that besides the Fruit field you'll also see the Similarity score
field. Select both and expand them without adding a prefix.
After expanding these two fields, they'll be added to your table. Note the values you get
for the similarity scores of each value. These scores can help you with further
transformations if needed to determine if you should lower or raise your similarity
threshold.
For this example, the Similarity score serves only as additional information and isn't
needed in the output of this query, so you can remove it. Note how the example started
with nine distinct values, but after the fuzzy merge, there are only four distinct values.
Cross join
Article • 12/17/2022
A cross join is a type of join that returns the Cartesian product of rows from the tables in
the join. In other words, it combines each row from the first table with each row from
the second table.
This article demonstrates, with a practical example, how to do a cross join in Power
Query.
Product: A table with all the generic products that you sell.
Colors: A table with all the product variations, as colors, that you can have in your
inventory.
The goal is to perform a cross-join operation with these two tables to create a list of all
unique products that you can have in your inventory, as shown in the following table.
This operation is necessary because the Product table only contains the generic product
name, and doesn't give the level of detail you need to see what product variations (such
as color) there are.
Perform a cross join
To do a cross-join operation in Power Query, first go to the Product table. From the Add
column tab on the ribbon, select Custom column. More information: Add a custom
column
In the Custom column dialog box, enter whatever name you like in the New column
name box, and enter Colors in the Custom column formula box.
Important
If your query name has spaces in it, such as Product Colors, the text that you need
to enter in the Custom column formula section has to follow the syntax #"Query
name" . For Product Colors, you need to enter #"Product Colors" .
You can check the name of your queries in the Query settings pane on the right
side of your screen or in the Queries pane on the left side.
After you select OK in the Custom column dialog box, a new column is added to the
table. In the new column heading, select Expand to expand the contents of this newly
created column, and then select OK.
After you select OK, you'll reach your goal of creating a table with all possible
combinations of Product and Colors.
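For comparison, the same Cartesian product can be sketched in Python with itertools.product. The product and color values are hypothetical, because the article's sample tables appear only as images.

```python
from itertools import product

# Hypothetical sample values for the Product and Colors tables.
products = ["Shirt", "Jeans", "Leggings"]
colors = ["Red", "Blue", "Black", "White"]

# Cross join: pair every product with every color.
combinations = [{"Product": p, "Color": c} for p, c in product(products, colors)]

for row in combinations:
    print(row)
```

The number of output rows is the number of products multiplied by the number of colors, so cross joins over large tables grow quickly.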
Split columns by delimiter
Article • 12/17/2022
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by a delimiter.
Home tab—under the Split column dropdown menu inside the Transform group.
Transform tab—under the Split column dropdown menu inside the Text column
group.
Account number
Account name
In this example, you want to split this column into two columns. The values are
delimited by a space—the first space from left to right. To do this split, select the
column, and then select the option to split the column by a delimiter. In Split Column
by Delimiter, apply the following configuration:
Note
Power Query will split the column into as many columns as needed. The name of
the new columns will contain the same name as the original column. A suffix that
includes a dot and a number that represents the split sections of the original
column will be appended to the name of the new columns.
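Splitting at the first space, and the naming of the resulting columns, can be sketched in Python as follows. The sample values and the column name Accounts are hypothetical, and the code illustrates the behavior rather than Power Query's implementation.

```python
# Hypothetical values: an account number, a space, then an account name.
values = ["1001 Contoso Ltd", "1002 Fabrikam Inc"]

def split_at_first(text, delimiter, column="Accounts"):
    """Split once at the left-most delimiter; name the parts column.1 and column.2."""
    left, _, right = text.partition(delimiter)
    return {f"{column}.1": left, f"{column}.2": right}

rows = [split_at_first(v, " ") for v in values]
for row in rows:
    print(row)
```

Because the split happens only at the first space, any later spaces stay inside the account name.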
The Accounts column has values in pairs separated by a comma. These pairs are
separated by a semicolon. The goal of this example is to split this column into new rows
by using the semicolon as the delimiter.
To do that split, select the Accounts column. Select the option to split the column by a
delimiter. In Split Column by Delimiter, apply the following configuration:
Final Split
Your table still requires one last split column operation. You need to split the Accounts
column by the first comma that it finds. This split will create a column for the account
name and another one for the account number.
To do that split, select the Accounts column and then select Split Column > By
Delimiter. Inside the Split column window, apply the following configuration:
The result of that operation will give you a table with the three columns that you're
expecting. You then rename the columns as follows:
Your final table looks like the one in the following image.
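The two steps above, splitting into rows by semicolon and then into columns by the first comma, can be sketched in Python. The Cost Center column and all sample values are hypothetical, and the code is an illustration rather than Power Query's implementation.

```python
# Hypothetical table: each Accounts cell holds "name,number" pairs separated by ";".
table = [
    {"Cost Center": "CC-1", "Accounts": "Sales,1001;Rent,1002"},
    {"Cost Center": "CC-2", "Accounts": "Travel,2001"},
]

def split_into_rows(rows, column, delimiter):
    """Emit one output row per delimited fragment, repeating the other columns."""
    return [{**row, column: part} for row in rows for part in row[column].split(delimiter)]

def split_at_first(rows, column, delimiter):
    """Replace column with column.1 and column.2, split at the first delimiter."""
    result = []
    for row in rows:
        first, _, rest = row[column].partition(delimiter)
        others = {k: v for k, v in row.items() if k != column}
        result.append({**others, f"{column}.1": first, f"{column}.2": rest})
    return result

by_rows = split_into_rows(table, "Accounts", ";")
final = split_at_first(by_rows, "Accounts", ",")
for row in final:
    print(row)
```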
Split columns by number of characters
Article • 12/17/2022
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by the number of characters.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
In this example, you want to split this column into three columns containing the values
described in the list above.
To do this split, select the column and then select the option to split the column by the
number of characters. In Split column by Number of Characters, apply the following
configuration:
Number of characters: 6
Split: Once, as far left as possible
The result of that operation will give you a table with two columns: one for the account
name and another that contains the combined values for the date and units.
Note
Power Query will split the column into only two columns. The name of the new
columns will contain the same name as the original column. A suffix containing a
dot and a number that represents the split section of the column will be appended
to the names of the new columns.
Now continue to do the same operation over the new Column1.2 column, but with the
following configuration:
Number of characters: 8
Split: Once, as far left as possible
The result of that operation will yield a table with three columns. Notice the new names
of the two columns on the far right. Column1.2.1 and Column1.2.2 were automatically
created by the split column operation.
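The two consecutive splits can be sketched in Python with string slicing. The sample value is hypothetical (a six-character account name, an eight-character date, then the units), and the code illustrates the behavior rather than Power Query's implementation.

```python
def split_once_left(text, count):
    """Split once, as far left as possible: the first `count` characters, then the rest."""
    return text[:count], text[count:]

value = "ACCT0120200115250"  # hypothetical: account name + date + units

# First split: 6 characters, producing Column1.1 and Column1.2.
column1_1, column1_2 = split_once_left(value, 6)
# Second split on Column1.2: 8 characters, producing Column1.2.1 and Column1.2.2.
column1_2_1, column1_2_2 = split_once_left(column1_2, 8)

print(column1_1, column1_2_1, column1_2_2)
```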
You can now change the name of the columns and also define the data types of each
column as follows:
Your final table will look like the one in the following image.
Split columns by number of characters into rows
The initial table for this example will be the one below, with the columns Group and
Account.
The Account column can hold multiple values in the same cell. Each value has the same
length in characters, with a total of six characters. In this example, you want to split
these values so you can have each account value in its own row.
To do that, select the Account column and then select the option to split the column by
the number of characters. In Split column by Number of Characters, apply the following
configuration:
Number of characters: 6
Split: Repeatedly
Split into: Rows
The result of that operation will give you a table with the same number of columns, but
many more rows because the fragments inside the original cell values in the Account
column are now split into multiple rows.
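Splitting repeatedly into rows can be sketched in Python by cutting each cell into fixed-size chunks and emitting one row per chunk. The sample rows are hypothetical, and the code is an illustration rather than Power Query's implementation.

```python
# Hypothetical table: each Account cell concatenates six-character account values.
table = [
    {"Group": "A", "Account": "ACC001ACC002ACC003"},
    {"Group": "B", "Account": "ACC004"},
]

def split_repeatedly_into_rows(rows, column, count):
    """Cut the column into chunks of `count` characters, one output row per chunk."""
    result = []
    for row in rows:
        text = row[column]
        for start in range(0, len(text), count):
            result.append({**row, column: text[start:start + count]})
    return result

expanded = split_repeatedly_into_rows(table, "Account", 6)
for row in expanded:
    print(row)
```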
Split columns by positions
Article • 12/17/2022
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by positions.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
In this example, you want to split this column into the three columns made from the
values in the list above. To do this split, select the column and then select the option to
split the column by positions. In Split Column by Positions, apply the following
configuration:
Positions: 0,6,14
Positions are zero-based and comma-separated, where position zero is the start
of the string.
Note
This operation will first start creating a column from position 0 to position 6, then
from position 7 to position 14. There will be another column should there be values
with a length of 16 or more characters in the current data preview contents.
The result of that operation will give you a table with three columns.
Note
In this example, Power Query splits the column into three columns. The name of the new
columns will contain the same name as the original column. A suffix created by a
dot and a number that represents the split section of the column will be appended
to the name of the new columns.
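A split by positions can be sketched in Python with slicing. The sketch assumes that each listed position is the zero-based index at which a new column starts and that the last column runs to the end of the value. The sample value is hypothetical, and the code isn't Power Query's implementation.

```python
def split_by_positions(text, positions):
    """Slice text so that each zero-based position starts a new fragment."""
    bounds = list(positions) + [len(text)]
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]

value = "ACCT0120200115250"  # hypothetical: account name + date + units
parts = split_by_positions(value, [0, 6, 14])
print(parts)
```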
You can now change the name of the columns, and also define the data types of each
column as follows:
Your final table will look like the one in the following image.
The Account column can only hold two values in the same cell. Each value has the same
length in characters, with a total of six characters. In this example, you want to split
these values so you can have each account value in its own row. To do that, select the
Account column and then select the option to split the column by positions. In Split
Column by Positions, apply the following configuration:
Positions: 0, 6
Split into: Rows
Note
This operation will first start creating a column from position 0 to position 6. There
will be another column should there be values with a length of 8 or more
characters in the current data preview contents.
The result of that operation will give you a table with the same number of columns, but
many more rows because the values inside the cells are now in their own cells.
Split columns by lowercase to uppercase
Article • 02/17/2023
In Power Query, you can split a column through different methods. If your data contains
CamelCased text or a similar pattern, you can split the selected column(s) at every
transition from a lowercase letter to the next uppercase letter.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
This single column will split into multiple columns, given every instance of the last
lowercase letter to the next uppercase letter. In this case, it only splits into two columns.
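The lowercase-to-uppercase transition can be expressed in Python as a regular expression that matches the empty position between a lowercase letter and an uppercase letter. The sample values are hypothetical, the pattern covers only the letters a to z and A to Z, and the code illustrates the idea rather than Power Query's implementation.

```python
import re

# Empty-width match: preceded by a lowercase letter, followed by an uppercase letter.
LOWER_TO_UPPER = re.compile(r"(?<=[a-z])(?=[A-Z])")

values = ["GreenApple", "RedCherry"]  # hypothetical CamelCased values
split_values = [LOWER_TO_UPPER.split(v) for v in values]
print(split_values)
```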
Split columns by uppercase to lowercase
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by every instance of the last uppercase letter to the next
lowercase letter.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
This single column will split into multiple columns, given every instance of the last
uppercase letter to the next lowercase letter. In this case, it only splits into two columns.
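The uppercase-to-lowercase transition can be sketched in Python the same way, with a regular expression that matches the empty position between an uppercase letter and a lowercase letter. The sample values are hypothetical, the pattern covers only the letters A to Z and a to z, and the code is an illustration rather than Power Query's implementation.

```python
import re

# Empty-width match: preceded by an uppercase letter, followed by a lowercase letter.
UPPER_TO_LOWER = re.compile(r"(?<=[A-Z])(?=[a-z])")

values = ["IDcard", "USBcable"]  # hypothetical values
split_values = [UPPER_TO_LOWER.split(v) for v in values]
print(split_values)
```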
Split columns by digit to non-digit
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by every instance of a digit followed by a non-digit.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
This single column will split into multiple columns, given every instance of a digit
followed by a non-digit. In this case, it only splits into two columns.
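A digit-to-non-digit split can be sketched in Python with a regular expression that matches the empty position between a digit and a non-digit character. The sample values are hypothetical, and the code illustrates the idea rather than Power Query's implementation.

```python
import re

# Empty-width match: preceded by a digit, followed by a non-digit.
DIGIT_TO_NON_DIGIT = re.compile(r"(?<=\d)(?=\D)")

values = ["10apples", "250grams"]  # hypothetical values
split_values = [DIGIT_TO_NON_DIGIT.split(v) for v in values]
print(split_values)
```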
Split columns by non-digit to digit
In Power Query, you can split a column through different methods. In this case, the
column(s) selected can be split by every instance of a non-digit followed by a digit.
Home tab—under the Split Column dropdown menu inside the Transform group.
Transform tab—under the Split Column dropdown menu inside the Text Column
group.
In this example, you want to split this column into the two columns described in the list
above. Select the column and then select the option to split the column by non-digit to
digit.
This single column will split into multiple columns, given every instance of a non-digit
followed by a digit. In this case, it only splits into two columns.
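A non-digit-to-digit split can be sketched in Python with a regular expression that matches the empty position between a non-digit character and a digit. The sample values are hypothetical, and the code illustrates the idea rather than Power Query's implementation.

```python
import re

# Empty-width match: preceded by a non-digit, followed by a digit.
NON_DIGIT_TO_DIGIT = re.compile(r"(?<=\D)(?=\d)")

values = ["Pets25", "Rack7"]  # hypothetical values
split_values = [NON_DIGIT_TO_DIGIT.split(v) for v in values]
print(split_values)
```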
Tip
Try out Dataflow Gen2 in Data Factory in Microsoft Fabric, an all-in-one analytics
solution for enterprises. Microsoft Fabric covers everything from data movement to
data science, real-time analytics, business intelligence, and reporting. Learn how to
start a new trial for free!
Benefits of dataflows
The following list highlights some of the benefits of using dataflows:
A dataflow decouples the data transformation layer from the modeling and
visualization layer in a Power BI solution.
The data transformation code can reside in a central location, a dataflow, rather
than be spread out among multiple artifacts.
A dataflow creator only needs Power Query skills. In an environment with multiple
creators, the dataflow creator can be part of a team that together builds the entire
BI solution or operational application.
A dataflow is product-agnostic. It's not a component of Power BI only; you can get
its data in other tools and services.
Next steps
The following articles provide further study materials for dataflows.
Dataflows are used to extract, transform, and load data to a storage destination where it
can be leveraged for different scenarios. Because not all storage destinations share the
same characteristics, some dataflow features and behaviors differ depending on the
storage destination the dataflow loads data into. Before you create a dataflow, it's
important to understand how the data is going to be used, and choose the storage
destination according to the requirements of your solution.
Standard dataflows
A standard dataflow loads data to Dataverse tables. Standard dataflows can only be
created in Power Apps. One benefit of creating this type of dataflow is that any
application that depends on data in Dataverse can work with the data created by
standard dataflows. Typical applications that leverage Dataverse tables are Power Apps,
Power Automate, AI Builder and Power Virtual Agents.
Standard dataflows versions
We've been working on significant updates to standard dataflows to improve their
performance and reliability. These improvements will eventually be available to all
standard dataflows. But in the interim, we'll differentiate between existing standard
dataflows (version 1) and new standard dataflows (version 2) by adding a version
indicator in Power Apps.
Standard dataflow versions feature comparison
The following table lists the major feature differences between standard dataflows V1
and V2, and provides information about each feature's behavior in each version.
Analytical dataflows
An analytical dataflow loads data to storage types optimized for analytics—Azure Data
Lake Storage. Microsoft Power Platform environments and Power BI workspaces provide
customers with a managed analytical storage location that's bundled with those product
licenses. In addition, customers can link their organization’s Azure Data Lake Storage
account as a destination for dataflows.
Analytical dataflows are capable of additional analytical features, such as integration
with Power BI’s AI features or the use of computed entities, which are discussed later.
You can create analytical dataflows in Power BI. By default, they'll load data to Power BI’s
managed storage. But you can also configure Power BI to store the data in the
organization’s Azure Data Lake Storage.
You can also create analytical dataflows in the Power Apps and Dynamics 365 customer
insights portals. When you're creating a dataflow in the Power Apps portal, you can choose
between Dataverse managed analytical storage and your organization’s Azure Data
Lake Storage account.
AI Integration
Sometimes, depending on the requirement, you might need to apply some AI and
machine learning functions on the data through the dataflow. These functionalities are
available in Power BI dataflows and require a Premium workspace.
Note that the features listed above are Power BI specific and are not available when
creating a dataflow in the Power Apps or Dynamics 365 customer insights portals.
Computed tables
One of the reasons to use a computed table is the ability to process large amounts of
data. The computed table helps in those scenarios. If you have a table in a dataflow,
and another table in the same dataflow uses the first table's output, this action creates a
computed table.
The computed table helps with the performance of the data transformations. Instead of
re-doing the transformations needed in the first table multiple times, the transformation
is done only once in the computed table. Then the result is used multiple times in other
tables.
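The reuse described above can be sketched in a few lines of Python. This is only an illustration of the "transform once, consume many times" idea, not how the dataflow engine is implemented; the function names and the sample rows are hypothetical.

```python
from functools import lru_cache

# Counts how many times the expensive transformation actually runs.
calls = {"count": 0}

@lru_cache(maxsize=None)
def cleaned_orders():
    """Stands in for the first table: the transformation is evaluated once."""
    calls["count"] += 1
    raw = [("A", 10), ("B", None), ("A", 5)]  # hypothetical source rows
    # Drop rows with a missing amount; return a tuple so the result is immutable.
    return tuple((customer, amount) for customer, amount in raw if amount is not None)

def total_by_customer():
    """A downstream table that consumes the stored result."""
    totals = {}
    for customer, amount in cleaned_orders():
        totals[customer] = totals.get(customer, 0) + amount
    return totals

def order_count():
    """A second downstream table that reuses the same stored result."""
    return len(cleaned_orders())
```

Both downstream functions read the cached result, so the cleaning step runs a single time no matter how many consumers there are.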
AI functions No Yes
AI features—analytical dataflow
If you're planning to use any AI functionality through the data transformation stage,
you'll find it helpful to use an analytical dataflow because you can use all the supported
AI features with this type of dataflow.
Create and use dataflows in Microsoft
Power Platform
Article • 04/06/2023
Using dataflows with Microsoft Power Platform makes data preparation easier, and lets
you reuse your data preparation work in subsequent reports, apps, and models.
In the world of ever-expanding data, data preparation can be difficult and expensive,
consuming as much as 60 to 80 percent of the time and cost for a typical analytics
project. Such projects can require wrangling fragmented and incomplete data, complex
system integration, data with structural inconsistency, and a high skillset barrier.
To make data preparation easier and to help you get more value out of your data, Power
Query and Power Platform dataflows were created.
With dataflows, Microsoft brings the self-service data preparation capabilities of Power
Query into the Power BI and Power Apps online services, and expands existing
capabilities in the following ways:
Self-service data prep for big data with dataflows: Dataflows can be used to
easily ingest, cleanse, transform, integrate, enrich, and schematize data from a
large and ever-growing array of transactional and observational sources,
encompassing all data preparation logic. Previously, extract, transform, load (ETL)
logic could only be included within datasets in Power BI, copied over and over
between datasets, and bound to dataset management settings.
With dataflows, ETL logic is elevated to a first-class artifact within Microsoft Power
Platform services, and includes dedicated authoring and management experiences.
Business analysts, BI professionals, and data scientists can use dataflows to handle
the most complex data preparation challenges and build on each other's work,
thanks to a revolutionary model-driven calculation engine, which takes care of all
the transformation and dependency logic—cutting time, cost, and expertise to a
fraction of what's traditionally been required for those tasks. You can create
dataflows by using the well-known, self-service data preparation experience of
Power Query. Dataflows are created and easily managed in app workspaces or
environments, in Power BI or Power Apps, respectively, enjoying all the capabilities
these services have to offer, such as permission management and scheduled
refreshes.
Load data to Dataverse or Azure Data Lake Storage: Depending on your use case,
you can store data prepared by Power Platform dataflows in the Dataverse or your
organization's Azure Data Lake Storage account:
Dataverse lets you securely store and manage data that's used by business
applications. Data within Dataverse is stored in a set of tables. A table is a set of
rows (formerly referred to as records) and columns (formerly referred to as
fields/attributes). Each column in the table is designed to store a certain type of
data, for example, name, age, salary, and so on. Dataverse includes a base set of
standard tables that cover typical scenarios, but you can also create custom
tables specific to your organization and populate them with data by using
dataflows. App makers can then use Power Apps and Power Automate to build
rich applications that use this data.
Azure Data Lake Storage lets you collaborate with people in your organization
using Power BI, Azure Data, and AI services, or using custom-built Line of
Business Applications that read data from the lake. Dataflows that load data to
an Azure Data Lake Storage account store data in Common Data Model folders.
Common Data Model folders contain schematized data and metadata in a
standardized format, to facilitate data exchange and to enable full
interoperability across services that produce or consume data stored in an
organization’s Azure Data Lake Storage account as the shared storage layer.
Advanced Analytics and AI with Azure: Power Platform dataflows store data in
Dataverse or Azure Data Lake Storage—which means that data ingested through
dataflows is now available to data engineers and data scientists to leverage the full
power of Azure Data Services, such as Azure Machine Learning, Azure Databricks,
and Azure Synapse Analytics for advanced analytics and AI. This enables business
analysts, data engineers, and data scientists to collaborate on the same data within
their organization.
Support for Common Data Model: Common Data Model is a set of standardized
data schemas and a metadata system to allow consistency of data and its meaning
across applications and business processes. Dataflows support Common Data
Model by offering easy mapping from any data in any shape into the standard
Common Data Model entities, such as Account and Contact. Dataflows also land
the data, both standard and custom entities, in schematized Common Data Model
form. Business analysts can take advantage of the standard schema and its
semantic consistency, or customize their entities based on their unique needs.
Common Data Model continues to evolve as part of the Open Data Initiative.
Dataflows Data Connector in Power BI Desktop | For dataflows with Azure Data Lake Storage as the destination | Yes
Dataflow linked entities | For dataflows with Azure Data Lake Storage as the destination | Yes
Computed Entities (in-storage transformations using M) | For dataflows with Azure Data Lake Storage as the destination | Power BI Premium only
Dataflow incremental refresh | For dataflows with Azure Data Lake Storage as the destination, requires Power Apps Plan2 | Power BI Premium only
Known limitations
Copying dataflows as part of a Power Platform environment copy operation is not
supported.
Next steps
More information about dataflows in Power Apps:
The following articles go into more detail about common usage scenarios for dataflows.
For more information about Common Data Model and the Common Data Model folder
standard, read the following articles:
Microsoft Dataverse for Teams delivers a built-in, low-code data platform for Microsoft
Teams. It provides relational data storage, rich data types, enterprise-grade governance,
and one-click solution deployment. Dataverse for Teams enables everyone to easily
build and deploy apps.
Before today, the way to get data into Dataverse for Teams was by manually adding data
directly into a table. This process can be prone to errors and isn't scalable. But now, with
self-service data prep you can find, clean, shape, and import your data into Dataverse
for Teams.
With your organizational data already sitting in a different location, you can use Power
Query dataflows to directly access your data through the connectors and load the data
into Dataverse for Teams. When you update your organizational data, you can refresh
your dataflows with just one click and the data in Dataverse for Teams is updated too. You
can also use the Power Query data transformations to easily validate and clean your
data and enforce data quality for your Apps.
Dataflows were introduced to help organizations retrieve data from disparate sources
and prepare it for consumption. You can easily create dataflows using the familiar, self-
service Power Query experience to ingest, transform, integrate, and enrich data. When
creating a dataflow, you'll connect to data, transform the data, and load data into
Dataverse for Teams tables. Once the dataflow is created, it begins the process of
importing data into the Dataverse table. Then you can start building apps to leverage
that data.
1. Sign in to Teams web version, and then select the link for Power Apps.
6. In Navigator, select the tables that are present in your Excel file. If your Excel file
has multiple sheets and tables, select only the tables you're interested in. When
you're done, select Transform data.
7. Clean and transform your data using Power Query. You can use the out-of-the-box
transformations to delete missing values, delete unnecessary columns, or to filter
your data. With Power Query, you can apply more than 300 different
transformations on your data. To learn more about Power Query transformations,
see Use Power Query to transform data. After you're finished with preparing your
data, select Next.
8. In Map tables, select Load to new table to create a new table in Dataverse for
Teams. You can also choose to load your data into an existing table. In the Map
tables screen, you can also specify a Unique primary name column and an
Alternate key column (optional). In this example, leave these selections with the
default values. To learn more about mapping your data and the different settings,
see Field mapping considerations for standard dataflows.
9. Select Create to finish your dataflow. Once you’ve created your dataflow, data
begins loading into Dataverse for Teams. This process can take some time and you
can use the management page to check the status. When a dataflow completes a
run, its data is available to use.
In the Last Refresh column, you can see when your data was last refreshed. If your
refresh failed, an error indication appears. If you select the error indication, the details of
the error and recommended steps to address it appear.
In the Status column, you can see the current status of the dataflow. Possible states are:
Refresh in progress: the dataflow is extracting, transforming, and loading your
data from the source to the Dataverse Tables. This process can take several
minutes depending on the complexity of transformations and data source's
performance. We recommend that you check the status of the dataflow frequently.
To navigate to the action bar, select the three dots “…” next to your dataflow.
Edit your dataflow if you want to change your transformation logic or mapping.
Rename your dataflow. At creation, an autogenerated name is assigned.
Refresh your dataflow. When you refresh your dataflows, the data will be updated.
Delete your dataflow.
Show refresh history. This gives you the results from the last refresh.
Select Show refresh history to see information about the last refresh of your dataflow.
When the dataflow refresh is successful, you can see how many rows were added or
updated in Dataverse. When your dataflow refresh wasn't successful, you can investigate
why with the help of the error message.
Note
The following table lists the major feature differences between dataflows for Dataverse
in Teams and dataflows for Dataverse.
1
Although there's no limitation on the amount of data you can load into Dataverse for
Teams, for better performance in loading larger amounts of data, we recommend a
Dataverse environment.
Consume data from dataflows
Article • 08/04/2023
The ways you can consume data from Microsoft dataflows depend on several factors,
like storage and type of dataflow. In this article, you learn how to choose the right
dataflow for your needs.
Type of dataflow
There are multiple types of dataflows available for you to create. You can choose
between a Power BI dataflow, standard dataflow, or an analytical dataflow. To learn more
about the differences and how to select the right type based on your needs, go to
Understanding the differences between dataflow types.
Storage type
A dataflow can write to multiple output destination types. In short, you should be using
the Dataflows connector unless your destination is a Dataverse table. Then you use the
Dataverse/CDS connector.
When you've connected your data lake, you should still use the Dataflows connector. If
this connector doesn't meet your needs, you could consider using the Azure Data Lake
connector instead.
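The rule of thumb above can be written down as a small decision function. This is a sketch only: the destination labels ("dataverse", "managed_lake", "connected_lake") are hypothetical names chosen for the example, and the returned strings simply echo the connector names used in this article.

```python
def pick_connector(destination: str) -> str:
    """Return the connector suggested for a given output destination.

    The destination labels are hypothetical; they are not product identifiers.
    """
    if destination == "dataverse":
        # Standard dataflows that load into Dataverse tables.
        return "Dataverse"
    # Managed storage and a connected data lake both start with Dataflows.
    return "Dataflows"

def fallback_connector(destination: str):
    """Alternative to consider when the first choice doesn't meet your needs."""
    return "Azure Data Lake" if destination == "connected_lake" else None
```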
Dataverse
A standard dataflow writes the output data to a Dataverse table. Dataverse lets you
securely store and manage data that's used by business applications. After you load
data in the Dataverse table, you can consume the data using the Dataverse connector.
Dataflows can get data from other dataflows
If you'd like to reuse data created by one dataflow in another dataflow, you can do so by
using the Dataflows connector in the Power Query editor when you create the new
dataflow.
When you get data from the output of another dataflow, a linked table is created.
Linked tables provide a way to make data created in an upstream dataflow available in a
downstream dataflow, without copying the data to the downstream dataflow. Because
linked tables are just pointers to tables created in other dataflows, they're kept up to
date by the refresh logic of the upstream dataflow. If both dataflows reside in the same
workspace or environment, those dataflows are refreshed together, to keep data in both
dataflows always up to date. More information: Link tables between dataflows
You're probably using a Dataverse table as the destination for your standard dataflow.
Use the Dataverse/CDS connector instead or consider switching to an analytical
dataflow.
There's a difference in the data when I remove duplicates in dataflows—how can I
resolve this?
Next Steps
The following articles provide more details about related topics.
When you include your dataflows in a solution, their definitions become portable,
making it easier to move them from one environment to another, saving time required
to author the dataflow.
A typical use case is for an independent software vendor (ISV) to develop a solution
containing a dataflow that extracts and transforms data from a data source to Dataverse
tables, in a sandbox environment. The ISV would then move that dataflow and
destination tables to a test environment to test with their test data source to validate
that the solution works well and is ready for production. After testing completes, the ISV
would provide the dataflow and tables to clients who will import them into their
production environment to operate on the client's data. This process is much easier when
you add both the dataflows and tables they load data to into solutions, and then move
the solutions and their contents between environments.
Dataflows added to a solution are known as solution-aware dataflows. You can add
multiple dataflows to a single solution.
Prerequisites
You need to have created a solution before you can add a dataflow to it. More
information: Create solutions
You need to be the owner of at least one dataflow in the environment. More
information: Create dataflows
Add the dataflow
1. Sign in to Power Apps.
3. Select the solution you'll add your dataflow to, and from the context menu select
Edit.
5. Optional: If your dataflow loads data into a custom Dataverse table, add the
custom table to the solution as well.
In this example, the dataflow you added to the solution loads data into a custom
table called Full Order Details, which you want to also include in the solution with
the dataflow.
Once both the dataflow and the table it loads data to are added, the solution
contains two artifacts. In this case, the artifacts are
cr0c8_FullOrderDetails and Import Sales Data.
To save your work, be sure to publish all customizations. Now, the solution is ready
for you to export from the source environment and import to the destination
environment.
1. On the left navigation pane, select the down arrow next to Dataverse and select
Dataflows. Identify the dataflow that was imported, and select Edit from the
context menu.
2. In the Dataflow list, locate and double-click the dataflow that was added as part of
the solution you’ve imported.
Once the credentials for the connection have been updated, all queries that use
that connection automatically load.
4. If your dataflow loads data in Dataverse tables, select Next to review the mapping
configuration.
5. The mapping configuration is also saved as part of the solution. Since you also
added the destination table to the solution, there's no need to recreate the table
in this environment and you can publish the dataflow.
That's it. Your dataflow now refreshes and loads data to the destination table.
Known limitations
Dataflows can't be created from within solutions. To add a dataflow to a solution,
follow the steps outlined in this article.
Dataflows can't be edited directly from within solutions. Instead, the dataflow must
be edited in the dataflows experience.
Dataflows can't use connection references for any connector.
Environment variables can't be used by dataflows.
Dataflows don't support adding required components, such as custom tables they
load data to. Instead, the custom table should be manually added to the solution.
Dataflows can't be deployed by application users (service principals).
Incremental refresh configuration isn't supported when deploying solutions. After
deployment of the dataflow via solution, the incremental refresh configuration
should be reapplied.
Linked tables to other dataflows aren't supported when deploying solutions. After
deployment of the dataflow via solution, please edit the dataflow and edit the
connection to the linked dataflow.
Using incremental refresh with
dataflows
Article • 08/04/2023
With dataflows, you can bring large amounts of data into Power BI or your
organization's provided storage. In some cases, however, it's not practical to update a
full copy of source data in each refresh. A good alternative is incremental refresh, which
provides the following benefits for dataflows:
Refresh occurs faster: Only data that's changed needs to be refreshed. For
example, refresh only the last five days of a 10-year dataflow.
Refresh is more reliable: For example, it's not necessary to maintain long-running
connections to volatile source systems.
Resource consumption is reduced: Less data to refresh reduces overall
consumption of memory and other resources.
Note
When the schema for a table in an analytical dataflow changes, a full refresh takes
place to ensure that all the resulting data matches the new schema. As a result, any
data stored incrementally is refreshed and in some cases, if the source system
doesn't retain historic data, is lost.
Using incremental refresh in dataflows created in Power BI requires that the dataflow
reside in a workspace in Premium capacity. Incremental refresh in Power Apps requires
Power Apps per-app or per-user plans, and is only available for dataflows with Azure
Data Lake Storage as the destination.
In either Power BI or Power Apps, using incremental refresh requires that source data
ingested into the dataflow have a DateTime field on which incremental refresh can filter.
When you select the icon, the Incremental refresh settings window appears. Turn on
incremental refresh.
The following list explains the settings in the Incremental refresh settings window.
Incremental refresh on/off toggle: Turns the incremental refresh policy on or off
for the table.
Filter field drop-down: Selects the query field on which the table should be filtered
for increments. This field only contains DateTime fields. You can't use incremental
refresh if your table doesn't contain a DateTime field.
Store/refresh rows from the past: The example in the previous image illustrates
these next few settings.
In this example, we define a refresh policy to store five years of data in total and
incrementally refresh 10 days of data. Assuming that the table is refreshed daily,
the following actions are carried out for each refresh operation:
Remove calendar years that are older than five years before the current date.
For example, if the current date is January 1, 2019, the year 2013 is removed.
The first dataflow refresh might take a while to import all five years, but
subsequent refreshes are likely to be completed much more quickly.
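As a rough illustration of the policy in this example (store five years, refresh 10 days), the following Python sketch computes the first calendar year that is kept and the start of the refresh window. The function name and the assumption that the refresh window is counted back from the current date are illustrative only; the service's exact boundary handling may differ.

```python
from datetime import date, timedelta

def refresh_plan(current: date, store_years: int = 5, refresh_days: int = 10):
    """Return (first_kept_year, refresh_start) for a store/refresh policy.

    Calendar years earlier than first_kept_year are removed.
    Data from refresh_start onward is reloaded on each refresh (assumed).
    """
    first_kept_year = current.year - store_years
    refresh_start = current - timedelta(days=refresh_days)
    return first_kept_year, refresh_start

first_kept_year, refresh_start = refresh_plan(date(2019, 1, 1))
# 2013 is earlier than the first kept year, so it is removed,
# which matches the January 1, 2019 example above.
year_2013_removed = 2013 < first_kept_year
```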
Tip
The current design requires that the column used to detect data changes be
persisted and cached into memory. You might want to consider one of the
following techniques to reduce cardinality and memory consumption:
Persist only the maximum value of this column at time of refresh, perhaps
by using a Power Query function.
Reduce the precision to a level that's acceptable given your refresh-
frequency requirements.
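Both techniques in this tip can be illustrated in Python. The sketch below uses made-up timestamps and a hypothetical helper, truncate_to_hour, to show how lowering precision reduces the number of distinct values, and how keeping only the maximum value reduces the column to a single value. In a dataflow you would express the same idea with Power Query functions.

```python
from datetime import datetime

def truncate_to_hour(ts: datetime) -> datetime:
    """Reduce precision from seconds to whole hours."""
    return ts.replace(minute=0, second=0, microsecond=0)

# Hypothetical "last modified" values from a source table.
stamps = [
    datetime(2023, 8, 4, 9, 15, 2),
    datetime(2023, 8, 4, 9, 47, 30),
    datetime(2023, 8, 4, 10, 1, 0),
]

distinct_raw = len(set(stamps))                               # full precision
distinct_hourly = len({truncate_to_hour(t) for t in stamps})  # reduced precision
latest = max(stamps)                                          # single persisted value
```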
Only refresh complete periods: Imagine that your refresh is scheduled to run at
4:00 AM every day. If data appears in the source system during those first four
hours of that day, you might not want to account for it. Some business metrics,
such as barrels per day in the oil and gas industry, aren't practical or sensible to
account for based on partial days.
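One way to picture "complete periods" for a daily policy is as a cutoff at midnight of the run date: rows stamped on or after the cutoff belong to a partial day and are left out. The Python sketch below assumes day-level periods and uses hypothetical names; it is not the query that the service generates.

```python
from datetime import datetime, time

def last_complete_day_cutoff(run_time: datetime) -> datetime:
    """Midnight at the start of the run date; later rows belong to a partial day."""
    return datetime.combine(run_time.date(), time.min)

run_time = datetime(2023, 8, 4, 4, 0)  # refresh scheduled for 4:00 AM
cutoff = last_complete_day_cutoff(run_time)

rows = [
    datetime(2023, 8, 3, 23, 59),  # previous day: complete period
    datetime(2023, 8, 4, 2, 30),   # first four hours of the run date: partial
]
included = [r for r in rows if r < cutoff]
```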
After incremental refresh is configured, the dataflow automatically alters your query to
include filtering by date. If the dataflow was created in Power BI, you can also edit the
automatically generated query by using the advanced editor in Power Query to fine-
tune or customize your refresh. Read more about incremental refresh and how it works
in the following sections.
Computed tables are based on queries running over a data store, which can be another
dataflow. As such, computed tables behave the same way as linked tables.
Because computed tables and linked tables behave similarly, the requirements and
configuration steps are the same for both. One difference is that for computed tables, in
certain configurations, incremental refresh can't run in an optimized fashion because of
the way partitions are built.
When moving a dataflow from full refresh to incremental, the new refresh logic updates
the dataflow by adhering to the refresh window and increment as defined in the
incremental refresh settings.
When moving a dataflow from incremental to full refresh, all data accumulated in the
incremental refresh is overwritten by the policy defined in the full refresh. You must
approve this action.
In the case where a scheduled refresh is defined in the system, incremental refresh
uses the time-zone settings from the scheduled refresh section. This ensures that
whatever time zone the person refreshing the dataflow is in, it will always be
consistent with the system's definition.
If no scheduled refresh is defined, dataflows use the time zone from the computer
of the user who's performing the refresh.
Incremental refresh can also be invoked by using APIs. In this case, the API call can hold
a time-zone setting that's used in the refresh. Using APIs can be helpful for testing and
validation purposes.
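The time-zone rules above amount to a simple fallback chain, sketched here in Python. The function and parameter names are hypothetical, and the assumption that a time zone supplied through an API call takes priority over the other two is ours; the text only says the API call can hold a time-zone setting that's used in the refresh.

```python
from datetime import timedelta, timezone

def resolve_time_zone(scheduled_tz, user_tz, api_tz=None):
    """Pick the time zone used to evaluate the incremental refresh window."""
    if api_tz is not None:
        return api_tz          # assumed: an explicit API setting wins
    if scheduled_tz is not None:
        return scheduled_tz    # scheduled refresh settings, when defined
    return user_tz             # otherwise the refreshing user's computer

utc = timezone.utc
pacific = timezone(timedelta(hours=-8))  # fixed offset used only for the example
```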
Merge partitions
In this example, day partitions are automatically merged to the month level after they
go outside the incremental range. Partitions in the incremental range need to be
maintained at daily granularity to allow only those days to be refreshed. The refresh
operation with Run Date 12/11/2016 merges the days in November, because they fall
outside the incremental range.
The next refresh operation, with Run Date 1/16/2017, takes the opportunity to merge
the days in December and the months in Q4 of 2016.
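The day-to-month merge can be sketched as follows in Python. The sketch assumes a 10-day incremental range (the value is not stated in this example) and treats a month as mergeable only when all of its days fall before the start of that range; names and partition labels are illustrative, and merging months into quarters is omitted for brevity.

```python
from datetime import date, timedelta

def merge_partitions(days, run_date, incremental_days):
    """Label each day with its partition: a month once it is entirely
    outside the incremental range, otherwise the individual day."""
    range_start = run_date - timedelta(days=incremental_days)
    partitions = []
    for d in sorted(days):
        # First day of the month after d's month.
        next_month = date(d.year + (d.month == 12), d.month % 12 + 1, 1)
        if next_month <= range_start:
            label = d.strftime("%Y-%m")   # whole month is outside the range
        else:
            label = d.isoformat()         # keep daily granularity
        if label not in partitions:
            partitions.append(label)
    return partitions

days = [date(2016, 11, 29), date(2016, 11, 30), date(2016, 12, 1), date(2016, 12, 2)]
result = merge_partitions(days, run_date=date(2016, 12, 11), incremental_days=10)
```

With a run date of 12/11/2016, the November days collapse into a single month partition while the December days stay at daily granularity.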
Both approaches work according to your specified definitions in the refresh settings.
More information: Incremental refresh in Power BI Premium
See also
This article described incremental refresh for dataflows. Here are some more articles that
might be useful:
For more information about Common Data Model, you can read its overview article:
With Microsoft Power BI and Power Platform dataflows, you can connect to many
different data sources to create new dataflows, or add new entities to an existing
dataflow.
This article describes how to create dataflows by using these data sources. For an
overview of how to create and use dataflows, go to Creating a dataflow for Power BI
service and Create and use dataflows in Power Apps.
Power BI service
1. Open a workspace.
2. Select New.
3. Select Dataflow from the drop-down menu.
4. Under Define new tables, select Add new tables.
All categories
File
Database
Power Platform
Azure
Online Services
Other
For a list of all of the supported data sources in Power Query, go to Connectors in Power
Query.
After the server URL or resource connection information is provided, enter the
credentials to use for access to the data. You may also need to enter the name of an on-
premises data gateway. Then select Next.
Power Query Online initiates and establishes the connection to the data source. It then
presents the available tables from that data source in the Navigator window.
You can select tables and data to load by selecting the check box next to each in the left
pane. To transform the data you've chosen, select Transform data from the bottom of
the Navigator window. A Power Query Online dialog box appears, where you can edit
queries and perform any other transformations you want to the selected data.
2. Open Power Query Editor in Power BI Desktop, right-click the relevant query, and
then select Advanced Editor, as shown in the following image. From there, you can
copy the M script that appears in the Advanced Editor window.
3. Open the Power BI dataflow, and then select Get data for a blank query.
4. Paste the copied query into the blank query for the dataflow.
Your script then connects to the data source you specified.
The following list shows which connectors you can currently use by copying and pasting
the M query into a blank query:
Next steps
This article showed which data sources you can connect to for dataflows. The following
articles go into more detail about common usage scenarios for dataflows:
For information about individual Power Query connectors, go to the connector reference
list of Power Query connectors, and select the connector you want to learn more about.
Additional information about dataflows and related information can be found in the
following articles:
For more information about Power Query and scheduled refresh, you can read these
articles:
For more information about Common Data Model, you can read its overview article:
Dataflows can be created in different portals, such as Power BI and Power Apps, and
can be of either analytical or standard type. In addition, some dataflow features are only
available as Premium features. Considering the wide range of products that can use
dataflows, and feature availability in each product or dataflow type, it's important to
know what licensing options you need to use dataflows.
If you want to create analytical dataflows that store data in your organization's Azure
Data Lake Storage Gen2 account, you or your administrator need access to an Azure
subscription and an Azure Data Lake Storage Gen2 account.
Premium features
Some of the dataflow features are limited to premium licenses. If you want to use the
enhanced compute engine to speed up your dataflow queries' performance over
computed tables, or have the DirectQuery connection option to the dataflow, you need
to have Power BI P1 or A3 or higher capacities.
AI capabilities in Power BI, linked tables, and computed tables are all premium features
that aren't available with a Power BI Pro account.
List of features
The following table contains a list of features and the license needed for them to be
available.
Store data in Dataverse tables (standard dataflow) | N/A | Per app plan, Per user plan
Store data in Azure Data Lake Storage (analytical dataflow) | Yes | Yes, using analytical dataflows
Store data in customer provided Azure Data Lake Storage (analytical dataflow; bring your own Azure Data Lake Storage) | Yes | Per app plan, Per user plan
The Power Apps per-app plan covers up to a 50-MB database capacity. The Power Apps
per-user plan allows you to have a database of 250-MB capacity.
Power BI Pro
Power BI Pro gives you the ability to create analytical dataflows, but not use any of the
premium features. With a Power BI Pro account, you can't use linked or computed
tables, you can't use AI capabilities in Power BI, and you can't use DirectQuery to
connect to the dataflow. The storage for your dataflows is limited to the space left under
your Power BI Pro account, which is a subset of 10-GB storage for all Power BI content.
Currently, we don't report the current storage usage of dataflows in the Power BI portal.
You'll be notified if you've almost reached the limit of the leftover capacity.
Power BI Premium
If you use Power BI Premium (capacity-based licensing), you can use all the AI
capabilities in Power BI, computed tables and linked tables, with the ability to have a
DirectQuery connection to the dataflow. You can also use the enhanced compute
engine. However, the dataflow created under a premium capacity license uses only the
internal Azure Data Lake Storage, and isn't accessible by other platforms except Power
BI itself. You can't create external dataflows just by having a Power BI Premium license;
you need to have an Azure subscription for Azure Data Lake Storage as well.
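The Pro and Premium differences described in this article can be summarized as a lookup, shown here as a small Python sketch. The license keys and feature names are hypothetical labels chosen for the example, not product identifiers, and the table covers only the features named in the text.

```python
# Features named in this article, keyed by hypothetical license labels.
FEATURES = {
    "pro": {"analytical_dataflows"},
    "premium": {
        "analytical_dataflows",
        "linked_tables",
        "computed_tables",
        "ai_capabilities",
        "directquery",
        "enhanced_compute_engine",
    },
}

def can_use(license_name: str, feature: str) -> bool:
    """Return True when the feature is listed for the given license."""
    return feature in FEATURES.get(license_name, set())

def missing_for_pro():
    """Features that require moving from Pro to Premium."""
    return sorted(FEATURES["premium"] - FEATURES["pro"])
```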
Dataflows that are using a premium capacity to refresh the data are limited to the
maximum number of parallel tasks they can perform at a given time. The maximum
number of parallel tasks depends on the type of premium capacity you're using. This
table represents the maximum number of parallel tasks that can be run at a given time
by all dataflows in a workspace mapped to the capacity.
Parallel tasks
A premium capacity can run multiple evaluations in parallel. For example, you have a P4
capacity and a dataflow that consists of 84 tasks. You refresh your dataflow and the first
64 tasks are allocated for the refresh. The 20 leftover evaluations for this dataflow are
parked in a queue. Once one of the evaluations is finished, it starts with the next
evaluation from the queue. If you start another dataflow in your workspace on the same
premium capacity while the other is still running, it gets parked in the same queue of
the premium capacity and needs to wait on the other dataflows in the workspace to
start the refresh of your data.
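The arithmetic in this example can be checked with a short Python sketch. The limit of 64 parallel tasks comes from the P4 example above; the function names are hypothetical, and the "waves" estimate assumes, for simplicity, that all tasks take the same time.

```python
import math

def allocate(total_tasks: int, max_parallel: int):
    """Split tasks into those started immediately and those parked in the queue."""
    running = min(total_tasks, max_parallel)
    queued = total_tasks - running
    return running, queued

def waves_needed(total_tasks: int, max_parallel: int) -> int:
    """Rough count of rounds needed if every task took equally long (assumed)."""
    return math.ceil(total_tasks / max_parallel)

running, queued = allocate(84, 64)
```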
You can use the following pointers to estimate the number of tasks of your dataflow
refresh:
The number of queries executed in the refresh (don't forget the upstream linked
tables).
The partitions in an incremental refresh query, which count as extra
tasks.
To lower the number of tasks or improve the efficiency of your tasks, you can use the
following strategies:
Lower the number of queries in your dataflow by combining queries where
possible and only "enable load" for queries that are used downstream.
Evaluate if you really need the upstream linked tables to refresh automatically.
Strategically schedule your dataflow refreshes based on the number of tasks.
Make sure your query returns the minimum set of columns and rows to satisfy
your data need. The faster and more efficiently the task executes, the sooner the
next task can start.
Next step
If you want to read more details about the concepts discussed in this article, go to any
of the following links.
Pricing
Power BI pricing
Power Apps pricing
Azure Data Lake Storage Gen 2 pricing
Features
Computed tables
Linked tables
AI capabilities in Power BI dataflows
Standard vs. analytical dataflows
The enhanced compute engine
How to migrate queries from Power
Query in the desktop (Power BI and
Excel) to dataflows
Article • 02/17/2023
If you already have queries in Power Query, either in Power BI Desktop or in Excel, you
might want to migrate the queries into dataflows. The migration process is simple and
straightforward. In this article, you'll learn the steps to do so.
To learn how to create a dataflow in Microsoft Power Platform, go to Create and use
dataflows in Power Platform. To learn how to create a dataflow in Power BI, go to
Creating and using dataflows in Power BI.
In Excel on the Data tab, select Get Data > Launch Power Query Editor.
2. Copy the queries:
If you've organized your queries into folders (called groups in Power Query):
a. In the Queries pane, select Ctrl as you select the folders you want to migrate to
the dataflow.
b. Select Ctrl+C.
If you don't have folders:
a. In the Queries pane, hold down Ctrl as you select the queries you want to migrate.
b. Select Ctrl+C.
3. Paste the copied queries into a dataflow:
b. Open the dataflow in Power Query Editor, and in the Queries pane, select
Ctrl+V to paste the copied folders or queries.
The image below shows an example of successfully copied folders.
If your data source is an on-premises source, you need to perform an extra step.
Examples of on-premises sources can be Excel files in a shared folder in a local
domain, or a SQL Server database hosted in an on-premises server.
The gateway isn't needed for data sources residing in the cloud, such as an Azure
SQL database.
If you've done all the steps successfully, you'll see a preview of the data in the
Power Query Editor.
Install an on-premises data gateway to transfer data quickly and securely between a
Power Platform dataflow and a data source that isn't in the cloud, such as an on-
premises SQL Server database or an on-premises SharePoint site. You can view all
gateways for which you have administrative permissions and manage permissions and
connections for those gateways.
Prerequisites
Power BI service
A Power BI service account. Don't have one? Sign up for 60 days free.
Power Apps
A Power Apps account. Don't have one? Sign up for 30 days free.
Install a gateway
You can install an on-premises data gateway directly from the online service.
Note
It's a good general practice to make sure you're using a supported version of
the on-premises data gateway. We release a new update of the on-premises
data gateway every month. Currently, Microsoft actively supports only the last
six releases of the on-premises data gateway.
Starting April 2022, the minimum required gateway version will be February
2021. Dataflows that refresh using an earlier version of the gateway might
stop refreshing.
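The support window described in this note can be sketched as a small check. This Python example is illustrative only: the function names are hypothetical, and it assumes exactly one gateway release per calendar month, so "the last six releases" maps to the last six months.

```python
from datetime import date

SUPPORTED_RELEASES = 6  # the note says only the last six monthly releases are supported


def months_between(older, newer):
    """Whole calendar months from `older` to `newer` (days are ignored)."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def is_supported_release(release, latest):
    """True if `release` is one of the last six monthly releases up to `latest`."""
    age = months_between(release, latest)
    return 0 <= age < SUPPORTED_RELEASES


# Hypothetical example: the latest release is June 2023.
print(is_supported_release(date(2023, 1, 1), date(2023, 6, 1)))
print(is_supported_release(date(2022, 12, 1), date(2023, 6, 1)))
```

Under these assumptions, the January 2023 release is still inside the window and the December 2022 release is not.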
1. Select the downloads button in the upper right corner of Power BI service, and
choose Data Gateway.
2. Install the gateway using the instructions provided in Install an on-premises data
gateway.
3. Provide the connection details for the enterprise gateway that will be used to
access the on-premises data. You must select the gateway itself, and provide
credentials for the selected gateway. Only gateways for which you're an
administrator appear in the list.
You can change the enterprise gateway used for a given dataflow and change the
gateway assigned to all of your queries using the dataflow authoring tool.
Note
The dataflow will try to find or create the required data sources using the new
gateway. If it can't do so, you won't be able to change the gateway until all needed
dataflows are available from the selected gateway.
1. If we detect that an existing data source is available for the selected gateway, the
Username and Password fields will be pre-populated.
a. If you select Next at this point, you're considered to be using that existing data
source, and so you only need to have permissions to that data source.
b. If you edit any of the credential fields and select Next, then you're considered to
be editing that existing data source, at which point you need to be an admin of
the gateway.
2. If we don't detect that an existing data source is available for the selected gateway,
the Username and Password fields will be blank, and if you edit the credential
fields and select Next, then you're considered to be creating a new data source on
the gateway, at which point you need to be an admin of the gateway.
If you only have data source user permission on the gateway, then 1.b and 2 can't be
achieved and the dataflow can't be created.
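The cases above reduce to a small decision rule. The following Python sketch restates it. The function name and the returned labels are hypothetical, and it assumes that when no existing data source is detected, continuing always means creating a new data source.

```python
def required_gateway_permission(existing_source_detected, credentials_edited):
    """Return the gateway permission needed for the cases described above."""
    if existing_source_detected and not credentials_edited:
        # Case 1.a: the existing data source is reused as-is.
        return "data source user"
    # Case 1.b (editing an existing data source) and case 2 (creating a new
    # data source) both require admin rights on the gateway.
    return "gateway admin"


for detected, edited in [(True, False), (True, True), (False, True)]:
    print(detected, edited, required_gateway_permission(detected, edited))
```

Only the first combination works for a user who has data source user permission alone.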
2. To add a user to a gateway, select Users, specify a user or group, and then specify
a permission level. Creating new data sources with a gateway in dataflows requires
Admin permission on the gateway. Admins have full control of the gateway,
including adding users, setting permissions, creating connections to all available
data sources, and deleting the gateway.
To view details and edit the settings, select Gateway Cluster Settings.
To add users as administrators of the gateway, select Administrators.
To add a data source to the gateway, select Add Data Source, enter a data
source name and choose the data source type under Data Source Settings,
and then enter the email address of the person who will use the data source.
To delete a gateway, select the ellipsis to the right of the gateway name and
then select Remove.
To view details, edit the settings, or delete a gateway, select Connections, and
then select a connection.
You can only share some types of connections, such as a SQL Server
connection. For more information, see Share canvas-app resources in
Power Apps.
Limitations
There are a few known limitations when using enterprise gateways and dataflows.
Dataflow refresh might fail if an out-of-date data gateway is used. Starting April
2022, the minimum required data gateway version is February 2021.
Each dataflow can use only one gateway. As such, all queries should be configured
using the same gateway.
If several gateways are needed, the best practice is to build several dataflows (one
for each gateway). Then use the compute or table reference capabilities to unify
the data.
Dataflows are only supported using enterprise gateways. Personal gateways won't
be available for selection in the drop-down lists and settings screens.
Creating new data sources with a gateway in dataflows is only supported for
people with Admin permissions.
Users with Can Use or Can Use + Share permissions can use existing connections
when creating dataflows.
Troubleshooting
When you attempt to use an on-premises data source to publish a dataflow, you might
come across the following MashupException error:
This error usually occurs because you're attempting to connect to an Azure Data Lake
Storage endpoint through a proxy, but you haven't properly configured the proxy
settings for the on-premises data gateway. To learn more about how to configure these
proxy settings, go to Configure proxy settings for the on-premises data gateway.
For more information about troubleshooting issues with gateways, or configuring the
gateway service for your network, go to the On-premises data gateway documentation.
If you're experiencing issues with the gateway version you're using, try updating to the
latest version as your issue might have been resolved in the latest version. For more
information about updating your gateway, go to Update an on-premises data gateway.
Next steps
Create and use dataflows in Power Apps
Fabric dataflows and Power Platform dataflows are Microsoft 365 services that enable
users to easily connect to, extract, move, and transform data across hundreds of
supported data sources. Dataflows build upon an underlying service called Power Query
Online, which hosts the data movement engine (Mashup Engine) as a cloud service. By
default, connectivity originates from this cloud location and has unrestricted access to
the internet. Therefore, when using dataflows to access and move sensitive data,
organizations should consider strategies to deter insiders from accidental or malicious
data exfiltration. This article outlines known risk factors and best practices for
safeguards.
Considerations
A trusted user who has access to sensitive data can author a program to push the data
to an untrusted data store. Since the Mashup Engine runs entirely in the cloud, it doesn't
go through corporate firewalls and proxy servers. So, it isn't subject to any data loss
prevention (DLP) policies that might be enforced by these networks. Since the point of
access is on the public internet, data can travel to any destination that the user has
access to—either through authentication or anonymous access. Here are some
examples of ways in which these programs can exfiltrate sensitive data:
Anonymous web requests: By using Web.Contents, users can make web requests
with sensitive data in the body of the request.
Cross data source filtering and joins: Sensitive data can be used as filtering or join
conditions against another untrusted data source. Specifically, data can travel to
the untrusted data source in the form of query strings or parameters.
Output destinations: By using Fabric dataflows, users can specify output
destinations for their queries, thereby transferring data to a list of supported data
sinks, which includes Azure SQL databases and data warehouses, Fabric
Lakehouses, Warehouses, and KQL databases.
Network isolation
We recommend that all data stores containing sensitive data be network isolated to
permit access only from selected networks. This isolation restriction must be defined
and operated at the network layer or lower. For example, layer 3 firewalls, Network
Security Groups (NSGs), and Azure Private Links are good examples of mechanisms that
can be used. However, location-based conditional access policies in Azure Active
Directory (Azure AD) operate at the application layer and are considered insufficient for
this purpose.
These network isolation policies must obstruct line of sight from dataflows' cloud
execution engine to sensitive data stores (since the cloud engine runs on the public
internet). Dataflows' connectivity to these data stores is then forced to originate from
within one of the permitted networks by binding connections to an on-premises data
gateway or VNet data gateway. An important execution characteristic of dataflows is
that cloud-based evaluation and gateway-based evaluation are never blended. If a
dataflow needs to access a network isolated data store (and is therefore bound to a
gateway), all data access is then required to flow through the gateway. Additionally,
since gateways physically reside in networks controlled by the user tenant, they comply
with network level restrictions such as firewalls and DLP protection solutions. These
restrictions make gateway environments as secure and safe as any corporate managed
devices and mitigate risks associated with arbitrary code execution in a cloud
environment.
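The rule that cloud-based and gateway-based evaluation are never blended can be pictured with a short sketch. This Python example is a simplified model, not the service's implementation; the function name and the source names are hypothetical.

```python
def evaluation_route(sources):
    """Decide where each data source of one dataflow is evaluated.

    `sources` maps a data source name to True if that source is bound
    to a gateway (for example, because it is network isolated).
    """
    if any(sources.values()):
        # One gateway-bound source forces all data access through the gateway.
        return {name: "gateway" for name in sources}
    return {name: "cloud" for name in sources}


routes = evaluation_route({"isolated-sql": True, "public-web-api": False})
print(routes)
```

In this model, binding a single source to a gateway also moves access to the public source onto the gateway, where network-level restrictions apply.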
It's worth noting that network isolation must be applied to all data stores that might
contain sensitive data. Consider an example where a user creates a dataflow to read
data from OneDrive for Business into Power BI. Then the user later creates a linked
dataflow to transform the data in Power BI into downstream entities. In this scenario, it's
not sufficient to just isolate OneDrive for Business to trusted networks. Since sensitive
data might also reside within Power BI, it's important to isolate such data by enabling
private links and disabling public Internet access for Power BI. Learn more about secure
access to Power BI using private endpoints.
Force gateway execution
The goal for isolating sensitive data stores to selected networks is to force the origin of
access back to trusted networks, so that existing policies governing managed devices
can be used to govern data movement from dataflows. In certain cases, a full network
isolation solution might take time to develop, test, and deploy. As an alternative, you
can file a dataflows support ticket to apply a tenant-wide policy that turns off the
Mashup Engine. This policy affects all query evaluations that use the Power Query
Online Mashup Engine. Impacted capabilities include:
Fabric dataflows
Power Platform dataflows
Azure Data Factory wrangling dataflows
Dataflows in Dynamics 365 (Customer Insights, Intelligent Order Management, and
so on)
Power BI Datamart
Power BI Quick Import from SharePoint
After application of the policy, all cloud-based execution fails with the following error:
"Cloud evaluation request denied based on tenant policies. Please use a data gateway
and try again." This error effectively forces all query evaluations in the tenant to occur
on gateways, without first rolling out a full network isolation solution. Note that the
policy is applied to the entire tenant and not a subset of workloads. This policy means
existing workloads fail immediately and require manual intervention to convert to run
on gateways. Organizations applying this policy should also ensure that they have
enough capacity in their gateway clusters to accommodate all their workloads.
Tenant isolation
For most software-as-a-service (SaaS) layer data stores, such as Fabric Lakehouse and
Power Platform Dataverse, there's usually a multi-tenant endpoint that one
communicates with to gain access to the data. These endpoints are common across all
users of the service, so they can be difficult to isolate and protect solely using network
(Layer 3) isolation techniques. The recommended approach for this kind of data store is
to use Layer 7 policies, typically provided by Azure Active Directory:
This approach restricts access to the tenant’s sensitive data stores to a set of managed
devices where signing into another tenant isn't permitted, effectively isolating data
movement across the tenant.
Roadmap
The following list contains some of the features that are currently planned to help
organizations better manage data exfiltration risks in Fabric:
Data source connection allowlisting: Allows Fabric tenant admins to control the
kinds of connectors that can be used within the tenant, and the endpoints the
connectors can connect to.
Connection usage auditing: Support for auditing logs that track connection
creation, updating, deletion, and usage.
Creating computed tables in dataflows
Article • 08/04/2023
You can perform in-storage computations when using dataflows with a Power BI
Premium subscription. This lets you do calculations on your existing dataflows, and
return results that enable you to focus on report creation and analytics.
To perform in-storage computations, you first must create the dataflow and bring data
into that Power BI dataflow storage. After you have a dataflow that contains data, you
can create computed tables, which are tables that do in-storage computations.
There are two ways you can connect dataflow data to Power BI:
The following sections describe how to create computed tables on your dataflow data.
In the dataflow authoring tool in the Power BI service, select Edit tables. Then right-click
the table you want to use as the basis for your computed table and on which you want
to perform calculations. On the shortcut menu, select Reference.
For the table to be eligible as a computed table, Enable load must be selected, as shown
in the following image. Right-click the table to display this shortcut menu.
By selecting Enable load, you create a new table whose source is the referenced table.
The icon changes to the computed icon, as shown in the following image.
Any transformation you do on this newly created table will be run on the data that
already resides in Power BI dataflow storage. That means that the query won't run
against the external data source from which the data was imported (for example, the
SQL database from which the data was pulled).
Consider the following example. You have an Account table that contains the raw data
for all the customers from your Dynamics 365 subscription. You also have ServiceCalls
raw data from the service center, with data from the support calls that were performed
from the different accounts on each day of the year.
Imagine you want to enrich the Account table with data from ServiceCalls.
First you would need to aggregate the data from the ServiceCalls to calculate the
number of support calls that were done for each account in the last year.
Next, you merge the Account table with the ServiceCallsAggregated table to calculate
the enriched Account table.
Then you can see the results, shown as EnrichedAccount in the following image.
And that's it—the transformation is done on the data in the dataflow that resides in your
Power BI Premium subscription, not on the source data.
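The logic of the two steps in this example (aggregate, then merge) can be sketched outside of Power Query. The following Python snippet is an illustration only: the column names and sample rows are hypothetical, and the "last year" filter is omitted because every sample row is assumed to fall within it.

```python
from collections import Counter

# Hypothetical sample rows standing in for the Account and ServiceCalls tables.
accounts = [
    {"AccountId": 1, "Name": "Contoso"},
    {"AccountId": 2, "Name": "Fabrikam"},
    {"AccountId": 3, "Name": "Northwind"},
]
service_calls = [
    {"AccountId": 1, "Date": "2023-01-05"},
    {"AccountId": 1, "Date": "2023-02-11"},
    {"AccountId": 2, "Date": "2023-03-02"},
]

# Step 1: aggregate ServiceCalls into a call count per account.
service_calls_aggregated = Counter(call["AccountId"] for call in service_calls)

# Step 2: merge the aggregate into Account, keeping every account
# (accounts with no calls get a count of 0).
enriched_account = [
    {**account, "SupportCalls": service_calls_aggregated.get(account["AccountId"], 0)}
    for account in accounts
]

for row in enriched_account:
    print(row)
```

In a dataflow, both steps would run against the data already held in dataflow storage instead of in-memory lists.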
When working with dataflows specifically created in an organization's Azure Data Lake
Storage account, linked tables and computed tables only work properly when the tables
reside in the same storage account. More information: Connect Azure Data Lake Storage
Gen2 for dataflow storage
Linked tables are only available for dataflows created in Power BI and Power Apps. As a
best practice, when doing computations on data joined by on-premises and cloud data,
create a new table to perform such computations. This provides a better experience
than using an existing table for computations, such as a table that is also querying data
from both sources and doing in-storage transformations.
See also
Computed table scenarios and use cases
This article described computed tables and dataflows. Here are some more articles that
might be useful:
The following links provide additional information about dataflows in Power BI and
other resources:
For more information about Power Query and scheduled refresh, you can read these
articles:
For more information about Common Data Model, you can read its overview article:
With dataflows in Microsoft Power Platform, you can have a single organizational data
storage source where business analysts can prep and manage their data once, and then
reuse it between different analytics apps in the organization.
When you link tables between dataflows, you can reuse tables that have already been
ingested, cleansed, and transformed by dataflows that are owned by others, without the
need to maintain that data. The linked tables simply point to the tables in other
dataflows, and don't copy or duplicate the data.
Linked tables are read-only, so if you want to create transformations for a linked table,
you must create a new computed table with a reference to the linked table.
Linked tables only work properly in new Power BI workspaces, and, likewise, all linked
dataflows must be located in new workspaces. More information: Create the new
workspaces in Power BI
Note
If your dataflow isn't located in a Premium capacity workspace, you can still
reference a single query—or combine two or more queries—as long as the
transformations aren't defined as in-storage transformations. Such references are
considered standard tables. To do this, turn off the Enable load option for the
referenced queries to prevent the data from being materialized and ingested into
storage. From there, you can reference those Enable load = false queries, and set
Enable load to On only for the resulting queries that you want to materialize.
You can select Get data from the dataflow authoring tool, which displays a dialog box
for selecting the categories and each data source. Then select the Power Platform
Dataflows connector.
A connection window for the selected data connection is displayed. If credentials are
required, you're prompted to provide them.
In Power BI, you can select Add linked tables from the dataflow authoring tool.
You can also select Add linked tables from the Add tables menu in the Power BI service.
A Navigator window opens, and you can choose a set of tables you can connect to. The
window displays tables for which you have permissions across all workspaces and
environments in your organization.
After you select your linked tables, they appear in the list of tables for your dataflow in
the authoring tool, with a special icon identifying them as linked tables.
You can also view the source dataflow from the dataflow settings of your linked table.
Links in the same workspace: When data refresh occurs for a source dataflow, that
event automatically triggers a refresh process for dependent tables in all
destination dataflows in the same workspace, including any calculated tables
based on them. All other tables in the destination dataflow are refreshed according
to the dataflow schedule. Tables that depend on more than one source refresh
their data whenever any of their sources are refreshed successfully.
Note
The entire refresh process is committed at once. Because of this, if the data
refresh for the destination dataflow fails, the data refresh for the source
dataflow fails as well.
A table can be referenced by another dataflow. That referenced table can also be
referenced by other dataflows, and so on, up to five times.
Cyclical dependencies of linked tables aren't allowed.
The dataflow must be in a new Power BI workspace or a Power Apps environment.
A linked table can't be joined with a regular table that gets its data from an on-
premises data source.
When using M parameters to address linked tables, if the source dataflow is
refreshed, it doesn't automatically affect the data in the destination dataflow.
Attempting to connect two dataflow tables between two workspaces of different
storage types—Bring Your Own Storage Account (BYOSA) and Internal—isn't
supported.
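Two of the limitations above (the reference depth and the ban on cyclical dependencies) can be checked on a simple map of links. This Python sketch is illustrative: the function and table names are hypothetical, and it assumes "up to five times" means a chain of at most five references.

```python
MAX_REFERENCE_DEPTH = 5  # assumed reading of "up to five times"


def check_link_chain(references, start):
    """Follow linked-table references from `start` and report the first problem.

    `references` maps a table name to the table it links to. Tables that
    aren't linked tables don't appear as keys.
    """
    seen = [start]
    current = start
    while current in references:
        current = references[current]
        if current in seen:
            return "cyclical dependency"
        seen.append(current)
        if len(seen) - 1 > MAX_REFERENCE_DEPTH:
            return "reference chain too deep"
    return "ok"


print(check_link_chain({"Sales Linked": "Sales"}, "Sales Linked"))
print(check_link_chain({"A": "B", "B": "A"}, "A"))
```

The first call follows a single link and reports no problem. The second call detects that A and B reference each other.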
Next steps
The following articles might be useful as you create or work with dataflows:
The following articles provide more information about dataflows and Power BI:
For more information about Power Query and scheduled refresh, you can read these
articles:
For more information about Common Data Model, you can read its overview article:
You can configure dataflows to store their data in your organization's Azure Data Lake
Storage Gen2 account. This article describes the general steps necessary to do so, and
provides guidance and best practices along the way.
Important
The dataflow with analytical tables feature uses the Azure Synapse Link for Dataverse
service, which may offer varying levels of compliance, privacy, security, and data
location commitments. For more information about Azure Synapse Link for
Dataverse, go to What is Azure Synapse Link for Dataverse?.
There are some advantages to configuring dataflows to store their definitions and
datafiles in your data lake, such as:
Azure Data Lake Storage Gen2 provides an enormously scalable storage facility for
data.
Dataflow data and definition files can be used by your IT department's
developers to leverage Azure data and artificial intelligence (AI) services as
demonstrated in the GitHub samples from Azure data services.
It enables developers in your organization to integrate dataflow data into internal
applications and line-of-business solutions, using developer resources for
dataflows and Azure.
Requirements
To use Azure Data Lake Storage Gen2 for dataflows, you need the following:
A Power Apps environment. Any Power Apps plan will allow you to create
dataflows with Azure Data Lake Storage Gen2 as a destination. You'll need to be
authorized in the environment as a maker.
An Azure subscription. You need an Azure subscription to use Azure Data Lake
Storage Gen2.
A resource group. Use a resource group you already have, or create a new one.
An Azure storage account. The storage account must have the Data Lake Storage
Gen2 feature enabled.
Tip
If you don't have an Azure subscription, create a free trial account before you
begin.
1. The storage account must be created in the same Azure Active Directory tenant as
your Power Apps tenant.
2. We recommend that the storage account is created in the same region as the
Power Apps environment you plan to use it in. To determine where your Power
Apps environment is, contact your environment admin.
3. The storage account must have the Hierarchical Name Space feature enabled.
4. You must be granted an Owner role on the storage account.
The following sections walk through the steps necessary to configure your Azure Data
Lake Storage Gen2 account.
1. Make sure you select the same region as your environment and set your storage as
StorageV2 (general purpose v2).
2. Make sure you enable the hierarchical namespace feature.
3. We recommend that you set the replication setting to Read-access geo-redundant
storage (RA-GRS).
To connect your Azure Data Lake Storage Gen2 account with the dataflow, follow these
steps:
1. Sign in to Power Apps, and verify which environment you're in. The environment
switcher is located on the right side of the header.
2. On the left navigation pane, select the down arrow next to Data.
3. In the list that appears, select Dataflows and then on the command bar select New
dataflow.
4. Select the analytical tables you want. These tables indicate what data you want to
store in your organization's Azure Data Lake Storage Gen2 account.
Select the storage account to use for dataflow storage
If a storage account hasn't yet been associated with the environment, a Link to data
lake dialog box appears. You'll need to sign in and find the data lake you created in the
previous steps. In this example, no data lake is associated with the environment and so a
prompt occurs to add one.
5. Select Save.
Once these steps are successfully completed, your Azure Data Lake Storage Gen2
account is connected to Power Platform Dataflows and you can continue to create a
dataflow.
Linking an Azure Data Lake Storage Gen2 account for dataflow storage isn't
supported in the default environment.
Once a dataflow storage location is configured for a dataflow, it can't be changed.
Once a storage account is linked, changing the environment's name isn't
supported and dataflows linked to that storage account will fail. Changing back
the environment's name will re-enable those dataflows.
By default, any member of the environment can access dataflow data using the
Power Platform Dataflows Connector. However, only the owners of a dataflow can
access its files directly in Azure Data Lake Storage Gen2. To authorize more people
to access the dataflows data directly in the lake, you must authorize them to the
dataflow's CDM Folder in the data lake or the data lake itself.
When a dataflow is deleted, its CDM Folder in the lake will also be deleted.
Attempting to connect two dataflow entities between two workspaces of different
storage types—Bring Your Own Storage Account (BYOSA) and Internal—isn't
supported.
Important
You shouldn't change files created by dataflows in your organization's lake or add
files to a dataflow's CDM Folder. Changing files might damage dataflows or alter
their behavior and is not supported. Power Platform Dataflows only grants read
access to files it creates in the lake. If you authorize other people or services to the
filesystem used by Power Platform Dataflows, only grant them read access to files
or folders in that filesystem.
Privacy notice
By enabling the creation of dataflows with Analytical tables in your organization, via the
Azure Synapse Link for Dataverse service, details about the Azure Data Lake storage
account, such as the name of the storage account, will be sent to and stored in the
Azure Synapse Link for Dataverse service, which is currently located outside the
Power Apps compliance boundary and may employ lesser or different privacy and
security measures than those typically in Power Apps. Note that you may remove the
data lake association at any time to discontinue use of this functionality and your Azure
Data Lake storage account details will be removed from the Azure Synapse Link for
Dataverse service. Further information about Azure Synapse Link for Dataverse is
available in this article.
You can't change the storage location of a dataflow after it was created.
Next steps
This article provided guidance about how to connect an Azure Data Lake Storage Gen2
account for dataflow storage.
For more information about dataflows, the Common Data Model, and Azure Data Lake
Storage Gen2, go to these articles:
For more information about the Common Data Model, go to these articles:
Analytical dataflows store both data and metadata in Azure Data Lake Storage.
Dataflows leverage a standard structure to store and describe data created in the lake,
which is called Common Data Model folders. In this article, you'll learn more about the
storage standard that dataflows use behind the scenes.
However, when the dataflow is analytical, the data is stored in Azure Data Lake Storage.
A dataflow’s data and metadata are stored in a Common Data Model folder. Since a
storage account might have multiple dataflows stored in it, a hierarchy of folders and
subfolders has been introduced to help organize the data. Depending on the product
the dataflow was created in, the folders and subfolders may represent workspaces (or
environments), and then the dataflow’s Common Data Model folder. Inside the
Common Data Model folder, both schema and data of the dataflow tables are stored.
This structure follows the standards defined for Common Data Model.
What is the Common Data Model storage structure?
Common Data Model is a metadata structure defined to bring conformity and
consistency for using data across multiple platforms. Common Data Model isn't data
storage, it's the way that data is stored and defined.
Common Data Model folders define how a table's schema and its data should be stored.
In Azure Data Lake Storage, data is organized in folders. Folders can represent a
workspace or environment. Under those folders, subfolders for each dataflow are
created.
You can use this JSON file to migrate (or import) your dataflow into another workspace
or environment.
To learn exactly what the model.json metadata file includes, go to The metadata file
(model.json) for Common Data Model.
Data files
In addition to the metadata file, the dataflow folder includes other subfolders. A
dataflow stores the data for each table in a subfolder with the table's name. Data for a
table might be split into multiple data partitions, stored in CSV format.
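To make the folder layout concrete, the following Python sketch reads every CSV partition of one table from a local copy of a Common Data Model folder. It is a minimal sketch under stated assumptions: the function name is hypothetical, partitions are assumed to be *.csv files directly inside the table's subfolder with no header row, and column names are taken from the attributes listed for the table in model.json.

```python
import csv
import json
from pathlib import Path


def read_table(cdm_folder, table_name):
    """Return the rows of one table as dicts, reading each CSV partition in turn."""
    cdm_folder = Path(cdm_folder)
    model = json.loads((cdm_folder / "model.json").read_text(encoding="utf-8"))

    # Find the table's schema entry and take its column names.
    entity = next(e for e in model["entities"] if e["name"] == table_name)
    columns = [attribute["name"] for attribute in entity["attributes"]]

    rows = []
    # Assumed layout: <cdm_folder>/<table_name>/*.csv, files without a header row.
    for partition in sorted((cdm_folder / table_name).glob("*.csv")):
        with partition.open(newline="", encoding="utf-8") as handle:
            for record in csv.reader(handle):
                rows.append(dict(zip(columns, record)))
    return rows
```

Every value comes back as a string. A fuller reader would also apply the data types recorded in the metadata file.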
If your organization enabled dataflows to take advantage of its Data Lake Storage
account, and that account was selected as a load target for dataflows, you can still get data from the
dataflow by using the Power Platform dataflow connector as mentioned above. But you
can also access the dataflow's Common Data Model folder directly through the lake,
even outside of Power Platform tools and services. Access to the lake is possible through
the Azure portal, Microsoft Azure Storage Explorer, or any other service or experience
that supports Azure Data Lake Storage. More information: Connect Azure Data Lake
Storage Gen2 for dataflow storage
Next steps
Use the Common Data Model to optimize Azure Data Lake Storage Gen2
Standard dataflows always load data into Dataverse tables in an environment. Analytical
dataflows always load data into Azure Data Lake Storage accounts. For both dataflow
types, there's no need to provision or manage the storage. Dataflow storage, by default,
is provided and managed by products the dataflow is created in.
Analytical dataflows allow an additional storage option: your organization's Azure Data
Lake Storage account. This option enables access to the data created by a dataflow
directly through Azure Data Lake Storage interfaces. Providing your own storage
account for analytical dataflows enables other Azure or line-of-business applications to
leverage the data by connecting to the lake directly.
Known limitations
After a dataflow is created, its storage location can't be changed.
Linked and computed entities features are only available when both dataflows are
in the same storage account.
Next steps
The articles below provide further information that can be helpful.
Connect Azure Data Lake Storage Gen2 for dataflow storage (Power BI dataflows)
Connect Azure Data Lake Storage Gen2 for dataflow storage (Power Platform
dataflows)
Creating computed entities in dataflows
The enhanced compute engine
Understanding the differences between standard and analytical dataflows
Computed table scenarios and use cases
Article • 08/04/2023
There are benefits to using computed tables in a dataflow. This article describes use
cases for computed tables and describes how they work behind the scenes.
Although it's possible to repeat the queries that created a table and apply new
transformations to them, this approach has drawbacks: data is ingested twice, and the
load on the data source is doubled.
Computed tables solve both problems. Computed tables are similar to other tables in
that they get data from a source and you can apply further transformations to create
them. But their data originates from the storage that the dataflow uses, and not from the
original data source. That is, the data was previously loaded by a dataflow and is then reused.
For example, if two tables share even a part of their transformation logic, without a
computed table, the transformation has to be done twice.
However, if a computed table is used, then the common (shared) part of the
transformation is processed once and stored in Azure Data Lake Storage. The remaining
transformations are then processed from the output of the common transformation.
Overall, this processing is much faster.
A computed table provides one place as the source code for the transformation and
speeds up the transformation because it only needs to be done once instead of multiple
times. The load on the data source is also reduced.
Using a reference from this table, you can build a computed table.
The computed table can have further transformations. For example, you can use Group
By to aggregate the data at the customer level.
This means that the Orders Aggregated table is getting data from the Orders table, and
not from the data source again. Because some of the transformations that need to be
done have already been done in the Orders table, performance is better and data
transformation is faster.
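The reuse pattern described above can be sketched as follows. This is only an illustration in Python, not Power Query code; the function names, column names, and sample rows are hypothetical. The point is that the source is read once, and the aggregated table is built from the stored rows.

```python
from collections import defaultdict


def load_orders_from_source():
    """Stand-in for the single, expensive read from the data source."""
    load_orders_from_source.calls += 1
    return [
        {"CustomerID": "C1", "Amount": 120.0},
        {"CustomerID": "C2", "Amount": 75.5},
        {"CustomerID": "C1", "Amount": 30.0},
    ]


load_orders_from_source.calls = 0


def aggregate_by_customer(orders):
    """Computed-table step: group the stored Orders rows by customer."""
    totals = defaultdict(float)
    for row in orders:
        totals[row["CustomerID"]] += row["Amount"]
    return dict(totals)


# Orders is ingested once and stored. Orders Aggregated references the
# stored rows instead of querying the source again.
orders = load_orders_from_source()
orders_aggregated = aggregate_by_customer(orders)
```

Any further table that references `orders` adds no extra read against the source.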
You might ask, what's the value of creating a source table that only ingests data? Such a
table can still be useful, because if the data from the source is used in more than one
table, it reduces the load on the data source. In addition, data can now be reused by
other people and dataflows. Computed tables are especially useful in scenarios where
the data volume is large, or when a data source is accessed through an on-premises
data gateway, because they reduce the traffic from the gateway and the load on data
sources behind them.
If the dataflow you're developing is getting bigger and more complex, here are some
things you can do to improve on your original design.
Having a custom function helps by having only a single version of the source code, so
you don't have to duplicate the code. As a result, maintaining the Power Query
transformation logic and the whole dataflow is much easier. For more information, go to
the following blog post: Custom Functions Made Easy in Power BI Desktop.
Note
Sometimes you might receive a notification that tells you a premium capacity is
required to refresh a dataflow with a custom function. You can ignore this message
and reopen the dataflow editor. This usually solves your problem unless your
function refers to a "load enabled" query.
If you set up a separate schedule for the linked dataflow, dataflows can be refreshed
unnecessarily and block you from editing the dataflow. There are two recommendations
to avoid this problem:
Don't set a refresh schedule for a linked dataflow in the same workspace as the
source dataflow.
If you want to configure a refresh schedule separately and want to avoid the
locking behavior, move the dataflow to a separate workspace.
Best practices for reusing dataflows
across environments and workspaces
Article • 08/04/2023
This article discusses a collection of best practices for reusing dataflows effectively and
efficiently. Read this article to avoid design pitfalls and potential performance issues as
you develop dataflows for reuse.
If you have data transformation dataflows, you can split them into dataflows that do
common transformations. Each dataflow can do just a few actions. These few actions per
dataflow ensure that the output of that dataflow is reusable by other dataflows.
These levels of endorsement help users find reliable dataflows more easily and quickly. The
dataflow with a higher endorsement level appears first. The Power BI administrator can
delegate the ability to endorse dataflows to the certified level to other people. More
information: Endorsement - Promoting and certifying Power BI content
Separate tables in multiple dataflows
You can have multiple tables in one dataflow. One of the reasons you might split tables
in multiple dataflows is what you learned earlier in this article about separating the data
ingestion and data transformation dataflows. Another good reason to have tables in
multiple dataflows is when you want a different refresh schedule than other tables.
In the example shown in the following image, the sales table needs to be refreshed
every four hours. The date table needs to be refreshed only once a day to keep the
current date record updated. And a product-mapping table just needs to be refreshed
once a week. If you have all of these tables in one dataflow, you have only one refresh
option for them all. However, if you split these tables into multiple dataflows, you can
schedule the refresh of each dataflow separately.
Good table candidates for dataflow tables
When you develop solutions using Power Query in the desktop tools, you might ask
yourself: which of these tables are good candidates to be moved to a dataflow? The best
tables to be moved to the dataflow are those tables that need to be used in more than
one solution, or more than one environment or service. For example, the Date table
shown in the following image needs to be used in two separate Power BI files. Instead of
duplicating that table in each file, you can build the table once in a dataflow, and
reuse it in those Power BI files.
Best practices for creating a dimensional
model using dataflows
Article • 08/04/2023
Designing a dimensional model is one of the most common tasks you can do with a
dataflow. This article highlights some of the best practices for creating a dimensional
model using a dataflow.
Staging dataflows
One of the key points in any data integration system is to reduce the number of reads
from the source operational system. In the traditional data integration architecture, this
reduction is done by creating a new database called a staging database. The purpose of
the staging database is to load data as-is from the data source into the staging
database on a regular schedule.
The rest of the data integration will then use the staging database as the source for
further transformation and converting it to the dimensional model structure.
We recommend that you follow the same approach using dataflows. Create a set of
dataflows that are responsible for just loading data as-is from the source system (and
only for the tables you need). The result is then stored in the storage structure of the
dataflow (either Azure Data Lake Storage or Dataverse). This change ensures that the
read operation from the source system is minimal.
Next, you can create other dataflows that source their data from staging dataflows. The
benefits of this approach include:
Reducing the number of read operations from the source system, and reducing the
load on the source system as a result.
Reducing the load on data gateways if an on-premises data source is used.
Having an intermediate copy of the data for reconciliation purposes, in case the
source system data changes.
Making the transformation dataflows source-independent.
Transformation dataflows
When you've separated your transformation dataflows from the staging dataflows, the
transformation will be independent from the source. This separation helps if you're
migrating the source system to a new system. All you need to do in that case is to
change the staging dataflows. The transformation dataflows are likely to work without
any problem, because they're sourced only from the staging dataflows.
This separation also helps in case the source system connection is slow. The
transformation dataflow won't need to wait for a long time to get records coming
through a slow connection from the source system. The staging dataflow has already
done that part, and the data will be ready for the transformation layer.
Layered Architecture
A layered architecture is an architecture in which you perform actions in separate layers.
The staging and transformation dataflows can be two layers of a multi-layered dataflow
architecture. Performing actions in layers keeps the required maintenance to a minimum.
When you want to change something, you just need to change it in the layer in which
it's located. The other layers should all continue to work fine.
The following image shows a multi-layered architecture for dataflows in which their
tables are then used in Power BI datasets.
Use a computed table as much as possible
When you use the result of a dataflow in another dataflow, you're using the concept of
the computed table, which means getting data from an "already-processed-and-stored"
table. The same thing can happen inside a dataflow. When you reference a table from
another table, you can use the computed table. This is helpful when you have a set of
transformations that need to be done in multiple tables, which are called common
transformations.
In the previous image, the computed table gets the data directly from the source.
However, in the architecture of staging and transformation dataflows, it's likely that the
computed tables are sourced from the staging dataflows.
Build a star schema
The best dimensional model is a star schema model, in which dimension and fact tables are
designed to minimize the time needed to query the data from the model,
and to make the model easy to understand for the data visualizer.
It isn't ideal to bring data in the same layout of the operational system into a BI system.
The data tables should be remodeled. Some of the tables should take the form of a
dimension table, which keeps the descriptive information. Some of the tables should
take the form of a fact table, to keep the aggregatable data. The best layout for fact
tables and dimension tables to form is a star schema. More information: Understand star
schema and the importance for Power BI
Use a unique key value for dimensions
When building dimension tables, make sure you have a key for each one. This key
ensures that there are no many-to-many (or in other words, "weak") relationships
among dimensions. You can create the key by applying some transformation to make
sure a column or a combination of columns is returning unique rows in the dimension.
Then that combination of columns can be marked as a key in the table in the dataflow.
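Before marking a column or a combination of columns as the key, it helps to confirm that it really is unique. The following Python sketch shows one way to express that check. The table, its column names, and the `duplicate_keys` helper are hypothetical and aren't part of any dataflow API.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return key values that occur more than once for the given columns."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


products = [
    {"Category": "Bikes", "Code": 1, "Name": "Road"},
    {"Category": "Bikes", "Code": 2, "Name": "Mountain"},
    {"Category": "Helmets", "Code": 1, "Name": "Sport"},
]

# Code alone repeats, so it can't be the key on its own.
code_duplicates = duplicate_keys(products, ["Code"])
# Category + Code identifies each row uniquely.
composite_duplicates = duplicate_keys(products, ["Category", "Code"])
```

An empty result means the chosen columns return unique rows and are a reasonable key candidate.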
Do an incremental refresh for large fact tables
Fact tables are typically the largest tables in the dimensional model. We recommend that
you reduce the number of rows transferred for these tables. If you have a very large fact
table, ensure that you use incremental refresh for that table. An incremental refresh can
be done in the Power BI dataset, and also the dataflow tables.
You can use incremental refresh to refresh only part of the data, the part that has
changed. There are multiple options to choose which part of the data to be refreshed
and which part to be persisted. More information: Using incremental refresh with Power
BI dataflows
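As a rough illustration of the idea, and not of how the service implements incremental refresh, the following Python sketch splits rows into a part that's refreshed and a part that's persisted. The `OrderDate` column, the seven-day window, and the function name are assumptions made for the example.

```python
from datetime import date, timedelta


def split_for_incremental_refresh(rows, today, refresh_days, date_column="OrderDate"):
    """Split rows into those inside the refresh window and those kept as-is."""
    cutoff = today - timedelta(days=refresh_days)
    to_refresh = [r for r in rows if r[date_column] >= cutoff]
    to_persist = [r for r in rows if r[date_column] < cutoff]
    return to_refresh, to_persist


sales = [
    {"OrderID": 1, "OrderDate": date(2023, 1, 15)},
    {"OrderID": 2, "OrderDate": date(2023, 7, 30)},
    {"OrderID": 3, "OrderDate": date(2023, 8, 3)},
]

# Only rows from the last seven days are read from the source again.
to_refresh, to_persist = split_for_incremental_refresh(
    sales, today=date(2023, 8, 4), refresh_days=7
)
```

Only the rows in `to_refresh` need to be transferred from the source, which is what reduces the number of rows moved for a large fact table.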
One of the best practices for dataflow implementations is separating the responsibilities
of dataflows into two layers: data ingestion and data transformation. This pattern is
specifically helpful when dealing with multiple queries of slower data sources in one
dataflow, or multiple dataflows querying the same data sources. Instead of getting data
from a slow data source again and again for each query, the data ingestion process can
be done once, and the transformation can be done on top of that process. This article
explains the process.
This separation isn't only useful because of the performance improvement; it's also
helpful for the scenarios where an old legacy data source system has been migrated to a
new system. In those cases, only the data ingestion dataflows need to be changed. The
data transformation dataflows remain intact for this type of change.
Reuse in other tools and services
Separation of data ingestion dataflows from data transformation dataflows is helpful in
many scenarios. Another use case scenario for this pattern is when you want to use this
data in other tools and services. For this purpose, it's better to use analytical dataflows
and use your own Data Lake Storage as the storage engine. More information: Analytical
dataflows
Depending on the storage for the output of the Microsoft Power Platform dataflows,
you can use that output in other Azure services.
Azure Machine Learning can consume the output of dataflows and use it for
machine learning scenarios (for example, predictive analysis).
Azure Data Factory can get the output of dataflows on a much larger scale,
combined with the data from big data sources, for advanced data integration
solutions.
Azure Databricks can consume the output of dataflows for applied data science
algorithms and further AI with the big data scale in the Apache Spark back end.
Other Azure data services can use the output of Power Platform dataflows to do
further actions on that data.
In any of these services, use Azure Data Lake Storage as the source. You'll be able to
enter the details of your storage and connect to the data in it. The data is stored in CSV
format, and is readable through any of these tools and services. The following
screenshot shows how Azure Data Lake Storage is a source option for Azure Data
Factory.
In the standard dataflow, you can easily map fields from the dataflow query into
Dataverse tables. However, if the Dataverse table has lookup or relationship fields,
additional consideration is required to make sure this process works.
Tables and their relationships are fundamental concepts in database design. A full
treatment of relationships is beyond the scope of this article. However, we'll
discuss them in a general way here.
Let's say you want to store information about customers and their details, including
region, in Dataverse. You can keep everything in one table. Your table can be called
Customers, and it can contain fields, such as CustomerID, Name, Birthdate, and Region.
Now imagine that you have another table that also has the store's information. This
table can have fields, such as Store ID, Name, and Region. As you can see, the region is
repeated in both tables. There's no single place where you can get all regions; some of
the region's data is in the Customers table, and some of it is in the Stores table. If you
ever build an application or a report from this information, you always have to combine
the two regions' information into one.
The usual database design practice is to create a table for Region in scenarios
like the one described above. This Region table would have a Region ID, Name, and
other information about the region. The other two tables (Customers and Stores) will
have links to this table using a field (which can be Region ID if we have the ID in both
tables, or Name if it's unique enough to determine a region). This means having a
relationship from the Stores and Customers table to the Region table.
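The Region example can be sketched in Python as follows. The sample rows, the generated Region ID values, and the `with_region_lookup` helper are hypothetical. In Dataverse, the link would be a lookup field rather than a plain integer column.

```python
customers = [
    {"CustomerID": 1, "Name": "Ana", "Region": "West"},
    {"CustomerID": 2, "Name": "Ben", "Region": "East"},
]
stores = [
    {"StoreID": 10, "Name": "Downtown", "Region": "West"},
    {"StoreID": 11, "Name": "Harbor", "Region": "North"},
]

# Build one Region table from every region name seen in either table.
region_names = sorted({row["Region"] for row in customers + stores})
region_ids = {name: i for i, name in enumerate(region_names, start=1)}
regions = [{"RegionID": i, "Name": name} for name, i in region_ids.items()]


def with_region_lookup(rows):
    """Replace the repeated Region text with a RegionID lookup value."""
    return [
        {**{k: v for k, v in row.items() if k != "Region"},
         "RegionID": region_ids[row["Region"]]}
        for row in rows
    ]


customers_linked = with_region_lookup(customers)
stores_linked = with_region_lookup(stores)
```

After this step, the full list of regions lives in one place, and both tables point to it.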
In Dataverse, there are a number of ways to create a relationship. One way is to create a
table, and then create a field in one table that's a relationship (or lookup) to another
table, as described in the next section.
After setting the key field, you can see the field in the mapping of the dataflow.
Known limitations
Mapping to polymorphic lookup fields is currently not supported.
Mapping to a multi-level lookup field, a lookup that points to another table's
lookup field, is currently not supported.
Lookup fields for Standard Tables, unless they contain alternate key fields as
described in this document, won't show up in the Map Tables dialog.
Dataflows don't guarantee correct loading order when loading data to tables
configured as hierarchical data structures.
The order of query execution, or the loading order to Dataverse tables, isn't
guaranteed. We recommend that you separate child and parent tables into two
dataflows, and first refresh the dataflow containing child artifacts.
Field mapping considerations for
standard dataflows
Article • 06/20/2023
When loading data into Dataverse tables, you'll need to map the source query's columns
in the dataflow's editing experience to the destination Dataverse table columns. Beyond
mapping of data, there are other considerations and best practices to take into account.
In this article, we cover the different dataflow settings that control the behavior of
dataflow refresh and as a result, the data in the destination table.
Create new records each dataflow refresh, even if such records already exist in the
destination table.
Create new records if they don't already exist in the table, or update existing
records if they already exist in the table. This behavior is called upsert.
Using a key column will indicate to the dataflow to upsert records into the destination
table, while not selecting a key will always create new records in the destination table.
A key column is a column whose value uniquely identifies a data row in the table. For
example, in an Orders table, if the Order ID is a key column, you shouldn't have two
rows with the same Order ID. Also, one Order ID—let's say an order with the ID 345—
should only represent one row in the table. To choose the key column for the table in
Dataverse from the dataflow, you need to set the key field in the Map Tables experience.
Having a primary key in the table ensures that even if the incoming data contains duplicate
rows with the same value in the field that's mapped to the primary key, the duplicate entries
aren't loaded into the table, which helps keep the quality of the data in the table high.
Having a table with a high quality of data is essential in building reporting solutions
based on the table.
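The difference between the two behaviors can be sketched in Python as follows. This is a simplified model and not the dataflow engine's implementation. The `load_rows` function and the sample rows are hypothetical.

```python
def load_rows(destination, incoming, key_column=None):
    """Load incoming rows; upsert when a key column is given, else append."""
    if key_column is None:
        # No key selected: every refresh creates new records.
        return destination + incoming
    merged = {row[key_column]: row for row in destination}
    for row in incoming:
        # Update the row with this key if it exists, otherwise insert it.
        merged[row[key_column]] = row
    return list(merged.values())


existing = [{"OrderID": 345, "Status": "Open"}]
refresh_output = [
    {"OrderID": 345, "Status": "Shipped"},
    {"OrderID": 346, "Status": "Open"},
]

appended = load_rows(existing, refresh_output)
upserted = load_rows(existing, refresh_output, key_column="OrderID")
```

Without a key, order 345 ends up in the table twice. With OrderID as the key, it's updated in place and only order 346 is added.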
The following image shows how you can choose the key column to be used when
upserting records to an existing Dataverse table:
Setting a table’s Unique ID column and using it as a key
field for upserting records into existing Dataverse tables
All Microsoft Dataverse table rows have unique identifiers defined as GUIDs. These
GUIDs are the primary key for each table. By default, a table's primary key can't be set by
dataflows, and is auto-generated by Dataverse when a record is created. There are
advanced use cases where leveraging the primary key of a table is desirable, for
example, integrating data with external sources while keeping the same primary key
values in both the external table and Dataverse table.
Note
To take advantage of a table’s unique identifier field, select Load to existing table in the
Map Tables page while authoring a dataflow. In the example shown in the image below,
we would like to load data into the CustomerTransactions table, and use the
TransactionID column from the data source as the unique identifier of the table.
You'll notice that in the Select key dropdown, the unique identifier of the table (which is
always named "tablename + id") can be selected. Since the table name is
"CustomerTransactions", the unique identifier field will be named
"CustomerTransactionId".
Once selected, the column mapping section is updated to include the unique identifier
as a destination column. You can then map the source column representing the unique
identifier for each record.
The primary key in the source system (such as OrderID in the example above).
Having this option checked means that if there's a data row in the table that doesn't
exist in the next dataflow refresh's query output, that row will be removed from the
table.
Note
Standard V2 dataflows rely on the createdon and modifiedon fields in order to
remove rows that don't exist in the dataflow's output from the destination table. If
those columns don't exist in the destination table, records aren't deleted.
Known limitations
Mapping to polymorphic lookup fields is currently not supported.
Mapping to a multi-level lookup field, a lookup that points to another table's
lookup field, is currently not supported.
Mapping to Status and Status Reason fields is currently not supported.
Mapping data into multi-line text that includes line break characters isn't
supported and the line breaks will be removed. Instead, you could use the line
break tag <br> to load and preserve multi-line text.
Mapping to Choice fields configured with the multiple select option enabled is
only supported under certain conditions. The dataflow only loads data to Choice
fields with the multiple select option enabled when a comma-separated list of the
labels' integer values is used. For example, if the labels are "Choice1,
Choice2, Choice3" with corresponding integer values of "1, 2, 3", then the column
values should be "1,3" to select the first and last choices.
Standard V2 dataflows rely on the createdon and modifiedon fields in order to
remove rows that don't exist in the dataflow's output from the destination table. If
those columns don't exist in the destination table, records aren't deleted.
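Two of the limitations above describe a value format that the source column has to follow. The following Python sketch shows how such values could be prepared before loading. The helper names are hypothetical, and the label-to-integer mapping is the one from the example in the list.

```python
def choice_column_value(selected_labels, label_to_value):
    """Build the comma-separated integer list for a multi-select Choice column."""
    return ",".join(str(label_to_value[label]) for label in selected_labels)


def preserve_line_breaks(text):
    """Replace line break characters with the <br> tag before loading."""
    return text.replace("\r\n", "<br>").replace("\n", "<br>")


label_to_value = {"Choice1": 1, "Choice2": 2, "Choice3": 3}

# Selecting the first and last choices gives the value "1,3".
value = choice_column_value(["Choice1", "Choice3"], label_to_value)
notes = preserve_line_breaks("Line 1\nLine 2")
```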
Security roles and permission levels
required to create standard dataflows
Article • 10/03/2023
Dataflows are created within an environment, and standard dataflows load data to new,
existing, or standard Dataverse tables that also reside in the environment. Depending on
the scenario, a dataflow creator might need different or multiple roles to create and
refresh a dataflow successfully. This article walks you through the roles and permission
levels related to standard dataflows, and provides links to articles to learn how to
manage them.
Security role | Privileges | Description
Basic User | Write to non-custom tables | Has all the rights to work with non-custom tables.
System Customizer | Create custom tables | Custom tables this user creates are visible to this user only.
Verify that the user you want to assign a security role to is present in the environment. If
not, add the user to the environment. You can assign a security role as part of the
process of adding the user. More information: Add users to an environment
In general, a security role can only be assigned to users who are in the Enabled state.
But if you need to assign a security role to users in the Disabled state, you can do so by
enabling allowRoleAssignmentOnDisabledUsers in OrgDBOrgSettings.
To add a security role to a user who is already present in an environment:
2. Select Environments > [select an environment] > Settings > Users + permissions
> Users.
4. Select the user from the list of users in the environment, and then select Manage
roles.
If you haven't heard of row-level security before, here's a quick introduction. If you have
users with different levels of access to the same table, you can filter the data at the row
level. For example, in the Orders table, you might have a SalesTerritory column. Also, you
might want to filter the data so that users from California can see only the records
from the Orders table that belong to California. This filtering is possible through
row-level security.
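Conceptually, row-level security behaves like a filter applied on behalf of each user. The following Python sketch only illustrates that idea. Real row-level security is configured in the service rather than written as user code, and the function name and sample rows here are hypothetical.

```python
def rows_visible_to(user_territory, rows, column="SalesTerritory"):
    """Return only the rows whose territory matches the user's territory."""
    return [row for row in rows if row[column] == user_territory]


orders = [
    {"OrderID": 1, "SalesTerritory": "California"},
    {"OrderID": 2, "SalesTerritory": "Texas"},
    {"OrderID": 3, "SalesTerritory": "California"},
]

california_view = rows_visible_to("California", orders)
```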
One of the common scenarios that happens when you integrate data into Dataverse is
keeping it synchronized with the source. Using the standard dataflow, you can load data
into Dataverse. This article explains how you can keep the data synchronized with the
source system.
Having a key column is important for the table in Dataverse. The key column is the row
identifier; this column contains unique values in each row. Having a key column helps in
avoiding duplicate rows, and it also helps in synchronizing the data with the source
system. If a row is removed from the source system, having a key column is helpful to
find it and remove it from Dataverse as well.
The first step to create the key column is to remove all unnecessary rows, clean the
data, remove empty rows, and remove any possible duplicates.
2. Add an index column.
After the data is cleaned, the next step is to assign a key column to it. You can use
Add Index Column from the Add Column tab for this purpose.
When you add the index column, you have some options to customize it, for example,
customizations on the starting number or the number of values to jump each time. The
default start value is zero, and it increments by one each time.
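The effect of the starting number and increment options can be sketched in Python as follows. The `add_index_column` function is a hypothetical stand-in for the Add Index Column command, with the same defaults described above.

```python
def add_index_column(rows, name="Index", start=0, increment=1):
    """Add a sequential index column to each row."""
    return [{**row, name: start + i * increment} for i, row in enumerate(rows)]


rows = [{"City": "Oslo"}, {"City": "Lima"}, {"City": "Cairo"}]

default_index = add_index_column(rows)
custom_index = add_index_column(rows, start=1, increment=10)
```

With the defaults, the index values are 0, 1, and 2. Starting at 1 with an increment of 10 gives 1, 11, and 21.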
The setting is simple: you just need to set the alternate key. However, if you have
multiple files or tables, there's one other step to consider.
If you're getting data from multiple Excel files, then the Combine Files option of Power
Query will automatically append all the data together, and your output will look like the
following image.
As shown in the preceding image, besides the append result, Power Query also brings in
the Source.Name column, which contains the file name. The Index value in each file
might be unique, but it's not unique across multiple files. However, the combination of
the Index column and the Source.Name column is a unique combination. Choose a
composite alternate key for this scenario.
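The following Python sketch shows why the composite key is needed in this scenario. The file names and amounts are made up for the example. Only the Source.Name and Index column names come from the text above.

```python
# Rows appended from two files. Each file has its own Index starting at zero.
combined = [
    {"Source.Name": "Jan.xlsx", "Index": 0, "Amount": 10},
    {"Source.Name": "Jan.xlsx", "Index": 1, "Amount": 20},
    {"Source.Name": "Feb.xlsx", "Index": 0, "Amount": 5},
    {"Source.Name": "Feb.xlsx", "Index": 1, "Amount": 8},
]

index_values = {row["Index"] for row in combined}
composite_values = {(row["Source.Name"], row["Index"]) for row in combined}

# A key is usable only if it yields one distinct value per row.
index_is_unique = len(index_values) == len(combined)
composite_is_unique = len(composite_values) == len(combined)
```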
Delete rows that no longer exist in the query
output
The last step is to select the Delete rows that no longer exist in the query output option. This
option compares the data in the Dataverse table with the data coming from the source
based on the alternate key (which might be a composite key), and removes the rows that
no longer exist. As a result, your data in Dataverse will always be synchronized with your
data source.
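The comparison can be sketched in Python as follows. This is a simplified model of the behavior described above and not the actual implementation. The `sync_with_delete` function and the sample rows are hypothetical.

```python
def sync_with_delete(destination, query_output, key_columns):
    """Upsert the query output and drop rows whose key no longer appears in it."""
    def key(row):
        return tuple(row[c] for c in key_columns)

    current_keys = {key(row) for row in query_output}
    # Keep only destination rows that still exist in the query output.
    synced = {key(row): row for row in destination if key(row) in current_keys}
    for row in query_output:
        synced[key(row)] = row  # update existing rows, insert new ones
    return list(synced.values())


destination = [
    {"Source.Name": "Jan.xlsx", "Index": 0, "Amount": 10},
    {"Source.Name": "Jan.xlsx", "Index": 1, "Amount": 20},
    {"Source.Name": "Feb.xlsx", "Index": 0, "Amount": 5},
]
query_output = [
    {"Source.Name": "Jan.xlsx", "Index": 0, "Amount": 10},
    {"Source.Name": "Feb.xlsx", "Index": 0, "Amount": 7},
    {"Source.Name": "Feb.xlsx", "Index": 1, "Amount": 9},
]

synced = sync_with_delete(destination, query_output, ["Source.Name", "Index"])
```

In this example, the second January row is removed, the first February row is updated, and the new February row is added.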
Add data to a table in Microsoft
Dataverse by using Power Query
Article • 02/17/2023
In this procedure, you'll create a table in Dataverse and fill that table with data from an
OData feed by using Power Query. You can use the same techniques to integrate data
from these online and on-premises sources, among others:
SQL Server
Salesforce
IBM DB2
Access
Excel
Web APIs
OData feeds
Text files
You can also filter, transform, and combine data before you load it into a new or existing
table.
If you don't have a license for Power Apps, you can sign up for free.
Prerequisites
Before you start to follow this article:
2. In the navigation pane, select Dataverse to expand it, and then select Tables.
3. In the command menu, select Data > Get data.
5. Under Connection settings, type or paste this URL, and then select Next:
https://services.odata.org/V4/Northwind/Northwind.svc/
6. In the list of tables, select the Customers check box, and then select Next.
7. (optional) Modify the schema to suit your needs by choosing which columns to
include, transforming the table in one or more ways, adding an index or
conditional column, or making other changes.
2. In the Unique primary name column list, select ContactName, and then select
Next.
You can specify a different primary-name column, map a different column in the
source table to each column in the table that you're creating, or both. You can also
specify whether Text columns in your query output should be created as either
Multiline Text or Single-Line Text in Dataverse. To follow this tutorial exactly,
leave the default column mapping.
3. Select Refresh manually for Power Query - Refresh Settings, and then select
Publish.
4. Under Dataverse (near the left edge), select Tables to show the list of tables in
your database.
The Customers table that you created from an OData feed appears as a custom
table.
Warning
Existing data might be altered or deleted when loading data to a Dataverse table
while having the Delete rows that no longer exist in the query output enabled or
a primary key column defined.
If you select Load to existing table, you can specify a table into which you add data
from the Customers table. You could, for example, add the data to the Account table
that ships with Dataverse. Under Column mapping, you can further specify that
data in the ContactName column from the Customers table should be added to the
Name column in the Account table.
If an error message about permissions appears, contact your administrator.
How Microsoft Power Platform
dataflows and Azure Data Factory
wrangling dataflows relate to each other
Article • 02/17/2023
Microsoft Power Platform dataflows and Azure Data Factory dataflows are often
considered to be doing the same thing: extracting data from source systems,
transforming the data, and loading the transformed data into a destination. However,
there are differences in these two types of dataflows, and you can have a solution
implemented that works with a combination of these technologies. This article describes
this relationship in more detail.
Features | Power Platform dataflows | Data Factory wrangling dataflows
Destinations | Dataverse or Azure Data Lake Storage | Many destinations (go to the list here)
Power Query transformation | All Power Query functions are supported | A limited set of functions is supported (go to the list here)
Sources | Many sources are supported | Only a few sources (go to the list here)
If you're a data developer who's dealing with big data and huge datasets, with a large
number of rows to be ingested every time, you'll find the Data Factory wrangling
dataflows a better tool for the job. Wrangling dataflows translate the M generated by the
Power Query Online Mashup Editor into Spark code for cloud scale execution. Working
with the Azure portal to author, monitor, and edit wrangling dataflows involves a steeper
learning curve for developers than the experience in Power Platform dataflows. Wrangling
dataflows are best suited for this type of audience.
Power Automate templates for the
dataflows connector
Article • 07/30/2022
This section discusses some use cases with provided tutorials to help you quickstart the
use of this connector:
Send notifications:
When a dataflow refresh fails, send a message to Azure Service Bus queue to open
a support ticket.
When your dataflow refresh completes, you or others who manage or depend on the
dataflow might want to receive a notification to alert you of the dataflow refresh status.
This way, you know your data is up to date and you can start getting new insights.
Another common scenario addressed by this tutorial is notification after a dataflow fails.
A notification allows you to start investigating the problem and alert people that
depend on the data being successfully refreshed.
To set up a Power Automate notification that will be sent when a dataflow fails:
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
9. Search for the "Send an email notification (V3)" connector, and then select it.
11. Inside the body of the email, select the field next to Body and use Dynamic
content to add dataflow information to the content of your email.
Open a ticket when a dataflow refresh
fails
Article • 02/17/2023
When your dataflow refresh fails or has been taking longer than expected, you
might want your support team to investigate. With this tutorial, you can automatically
open a support ticket, create a message in a queue or Service Bus, or add an item to
Azure DevOps to notify your support team.
In this tutorial, we make use of Azure Service Bus. For instructions on how to set up an
Azure Service Bus and create a queue, go to Use Azure portal to create a Service Bus
namespace and a queue.
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
9. Search for the "Send message" connector from Service Bus, and then select it.
10. Enter a Connection name for this message. In Connection string, enter the
connection string that was generated when you created the Service Bus
namespace. Then select Create.
11. Add dataflow information to the content of your message by selecting the field
next to Content, and then select the dynamic content you want to use from
Dynamic content.
Trigger dataflows and Power BI datasets
sequentially
Article • 02/17/2023
There are two common scenarios for how you can use this connector to trigger multiple
dataflows and Power BI datasets sequentially.
If a single dataflow does every action, then it's hard to reuse its entities in other
dataflows or for other purposes. The best dataflows to reuse are dataflows doing
only a few actions, specializing in one specific task. If you have a set of dataflows
as staging dataflows, and their only action is to extract data "as is" from the source
system, these dataflows can be reused in multiple other dataflows. More
information: Best practices for reusing dataflows across environments and
workspaces
If you want to ensure that your dashboard is up to date after a dataflow refreshes
your data, you can use the connector to trigger the refresh of a Power BI dataset
after your dataflow refreshes successfully.
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
9. Search for the "Refresh a dataflow" connector, and then select it.
This tutorial demonstrates how to load data in a Dataverse table to create a dataflows
monitoring report in Power BI.
You can use this dashboard to monitor your dataflows' refresh duration and failure
count. With this dashboard, you can track any issues with your dataflows' performance
and share the data with others.
First, you'll create a new Dataverse table that stores all the metadata from the dataflow
run. For every refresh of a dataflow, a record is added to this table. You can also store
metadata for multiple dataflow runs in the same table. After the table is created, you'll
connect the Power BI file to the Dataverse table.
Prerequisites
Power BI Desktop.
2. On the left navigation pane expand Data, select Tables, and then select New table.
4. Select Add column to repeat adding columns for the following values:
Create a dataflow
If you don't already have one, create a dataflow. You can create a dataflow in either
Power BI dataflows or Power Apps dataflows.
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
6. Search for the "Add a new row" connector from Dataverse, and then select it.
7. In Add a new row, select Choose a table and then choose Dataflows Monitoring
from the list.
8. For every required field, you need to add a dynamic value. This value is the output
of the metadata of the dataflow that's run.
a. Select the field next to Dataflow Name and then select Dataflow Name from
the dynamic content.
In this dashboard, for every dataflow in your specified time interval, you can monitor:
The dataflow duration
The dataflow count
The dataflow failure count
The unique ID for every dataflow is generated by a merge between the dataflow name
and the dataflow start time.
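The metrics above can be sketched in a few lines of Python. This is only an illustration of the logic, not the report's actual implementation: the record layout, the field names, and the status value "Failed" are assumptions made for the example.

```python
from datetime import datetime

# Hypothetical run records; field names and status values are illustrative,
# not the schema produced by the dataflows connector.
runs = [
    {"dataflow_name": "Sales", "start": datetime(2023, 2, 1, 8, 0),
     "end": datetime(2023, 2, 1, 8, 5), "status": "Success"},
    {"dataflow_name": "Sales", "start": datetime(2023, 2, 2, 8, 0),
     "end": datetime(2023, 2, 2, 8, 9), "status": "Failed"},
    {"dataflow_name": "Stock", "start": datetime(2023, 2, 1, 9, 0),
     "end": datetime(2023, 2, 1, 9, 2), "status": "Success"},
]


def unique_id(run):
    """Merge the dataflow name with the start time to identify one run."""
    return f"{run['dataflow_name']}-{run['start'].isoformat()}"


def summarize(runs):
    """Return run count, failure count, and total duration per dataflow."""
    summary = {}
    for run in runs:
        stats = summary.setdefault(
            run["dataflow_name"],
            {"count": 0, "failures": 0, "total_seconds": 0.0},
        )
        stats["count"] += 1
        if run["status"] == "Failed":
            stats["failures"] += 1
        stats["total_seconds"] += (run["end"] - run["start"]).total_seconds()
    return summary


summary = summarize(runs)
```

Because the ID combines name and start time, two runs of the same dataflow stay distinct as long as they don't start at the same instant.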
Load data in Excel Online and build a dataflows monitoring report with Power BI
Article • 02/17/2023
This tutorial demonstrates how to use an Excel file and the dataflows connector in Power
Automate to create a dataflows monitoring report in Power BI.
First, you'll download the Excel file and save it in OneDrive for Business or SharePoint.
Next, you'll create a Power Automate connector that loads metadata from your dataflow
to the Excel file in OneDrive for Business or SharePoint. Lastly, you'll connect a Power BI
file to the Excel file to visualize the metadata and start monitoring the dataflows.
You can use this dashboard to monitor your dataflows' refresh duration and failure
count. With this dashboard, you can track any issues with your dataflows' performance
and share the data with others.
Prerequisites
Microsoft Excel
Power BI Desktop.
Create a dataflow
If you don't already have one, create a dataflow. You can create a dataflow in either
Power BI dataflows or Power Apps dataflows.
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
6. Search for the "Add a row into a table" connector from Excel Online (Business), and
then select it.
7. Customize the connector. Enter the Location of the Excel file and the specific Table
the data loads to.
Location: Select the location of the Excel file on OneDrive for Business or
SharePoint.
Document Library: Select the library of the Excel file.
File: Select the file path to the Excel file.
Table: Select "Dataflow_monitoring".
For every required field, you need to add a dynamic value. This value is the output
of the metadata of the dataflow run.
In this dashboard, for every dataflow in your specified time interval, you can monitor:
The unique ID for every dataflow is generated by a merge between the dataflow name
and the dataflow start time.
Load data in a Power BI streaming dataset and build a dataflows monitoring report with Power BI
Article • 02/17/2023
This tutorial demonstrates how to load data in a Power BI streaming dataset to create a
dataflows monitoring report in Power BI.
First, you'll create a new streaming dataset in Power BI. This dataset collects all the
metadata from the dataflow run, and for every refresh of a dataflow, a record is added
to this dataset. You can send metadata from multiple dataflows to the same dataset. Lastly, you can
build a Power BI report on the data to visualize the metadata and start monitoring the
dataflows.
You can use this dashboard to monitor your dataflows' refresh duration and failure
count. With this dashboard, you can track any issues with your dataflows' performance
and share the data with others.
Prerequisites
A Power BI Pro license.
2. Open a workspace.
3. Enter a flow name, and then search for the "When a dataflow refresh completes"
connector. Select this connector from the list, and then select Create.
6. Search for the connector "Add rows to a dataset" from Power BI, and then select it.
Workspace ID: Select the Power BI workspace that contains your streaming
dataset.
Dataset: Select the streaming dataset Dataflow Monitoring that you
previously created in Create a new streaming dataset in Power BI.
Table: Select RealTimeData.
For every required field, you need to add a dynamic value. This value is the output
of the metadata of the dataflow run.
a. Select the field next to Dataflow Name and then select the lightning button.
In the scenario where you want to automatically retry a dataflow when the refresh fails,
the Power Automate Connector is probably the way to go. In this tutorial, we'll guide
you step by step in setting up your Power Automate flow.
3. Enter a flow name, and then search for the When a dataflow refresh completes
connector. Select this connector from the list, and then select Create.
9. Search for the Refresh a dataflow connector, and then select it.
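The decision the flow makes can be sketched as plain logic. This Python sketch is not Power Automate code: the function name, the status value "Failed", and the max_attempts cap are all assumptions added for illustration (the tutorial itself doesn't describe a cap, but without one a dataflow that always fails would be retried indefinitely).

```python
def handle_refresh_completed(status, attempt, refresh, max_attempts=3):
    """Decide what to do when a 'refresh completed' event arrives.

    status       -- refresh outcome reported by the trigger (assumed values)
    attempt      -- how many times this dataflow has been refreshed so far
    refresh      -- callable standing in for the 'Refresh a dataflow' action
    max_attempts -- hypothetical safety cap, not part of the tutorial
    """
    if status != "Failed":
        return "done"
    if attempt >= max_attempts:
        return "gave up"
    refresh()  # trigger the dataflow again
    return "retried"
```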
When working with any kind of dataflow other than Power BI dataflows, you have the
ability to monitor dataflow refreshes using Power BI. This article includes step-by-step
instructions on how to set up your own dashboard to share with everyone on your team.
This dashboard provides insights into the success rate of refreshes, duration, and much
more.
6. Open the template file with Power BI Desktop and provide your instance URL.
7. Select Load.
8. If this is the first time you've used this dashboard, you might need to enter your
credentials to sign in.
9. Inside the dashboard, you'll find two tabs with information about errors, duration,
and the count of rows that were inserted, upserted, or failed:
Dataflow monitoring
Table monitoring
10. From this point on, you can change the dashboard however you like and publish it
to a workspace of your choice.
These tables will store history for at least the last 50 refreshes. Refresh history records
older than 90 days may be removed by the system. To use these tables, we suggest that
you use Power BI to get data through the Dataverse connector. We also suggest that
you extract this data into a self-managed table if you would like to do analysis over a
longer period of time.
Known issues
In some cases when you try to connect to the Dataverse tables manually through Power
BI, the tables might appear to be empty. To solve this issue, just refresh the preview and
you should be good to go.
Troubleshoot dataflow issues: Creating dataflows
Article • 03/14/2023
This article explains some of the most common errors and issues you might get when
you want to create a dataflow, and how to fix them.
Reason:
Resolution:
Reason:
Resolution:
Reason:
Resolution:
Ask the Power BI tenant administrator to enable access for you by following these steps:
2. On the left pane, select Tenant settings, and in the Dataflow settings section, turn
on the toggle for Enabled. Then select Apply.
I only see limited options when I create a dataflow
When creating a dataflow, sometimes you don't see all the options that are available.
For example, you might see only the options shown in the following image.
However, more options are actually available, as shown in the following image.
Reason:
You're creating the dataflow in an old version of the Power BI workspace, called V1.
Resolution:
Upgrade your Power BI workspace to the new version (v2). More information: Upgrade
classic workspaces to the new workspaces in Power BI
Dataflow name exists already, but I deleted the old one
This problem happens when you try to create a dataflow with a name that already exists
or use the name of a recently deleted dataflow.
Reason:
It can take up to 48 hours for the backend systems to delete all the files and references
to the deleted dataflow.
Resolution:
You can either wait 48 hours before publishing the dataflow or create it now under a
different name and rename it later on.
Troubleshooting dataflow issues: Get data from a dataflow
Article • 02/17/2023
You might have created a dataflow but then had difficulty getting data from it (either by
using Power Query in Power BI Desktop or from other dataflows). This article explains
some of the most common problems with getting data from a dataflow.
Reason:
Resolution:
In the desktop tools, such as Power Query in Excel and Power Query in Power BI
Desktop, the loading of data into tables happens automatically (unless you disable it).
This behavior is a bit different in Power Query in dataflows. In dataflow entities, the data
won't be loaded unless you refresh the data.
You have to set up a scheduled refresh for a dataflow, or—if you want to just have a
single refresh—use the manual refresh option.
After a dataflow is refreshed, the data in entities will be visible in the Navigator window
of other tools and services.
More information: Refreshing a dataflow in Power BI and Set the refresh frequency in
Power Apps
You might receive the error message "We reached the end of the buffer" or
"DataFormat.Error: We reached the end of the buffer".
Reason:
Only analytical dataflows can be used in a Get data operation from a dataflow.
Resolution:
If you've created a dataflow that stores data in Dataverse—that is, a standard dataflow—
you can't see it by using the Get data operation from a dataflow. However, you can use
Get data from Dataverse to access it. Or you can create an analytical dataflow instead,
and then access it by using Get data from a dataflow.
I can't make a DirectQuery connection to the dataflow
If you intend to use the dataflow as a DirectQuery source, you might need to enable it
first.
Reason:
Resolution:
Enable the enhanced compute engine, and then you'll have the option to connect to the
dataflow by using DirectQuery.
Troubleshooting dataflow issues: Connection to the data source
Article • 08/11/2023
When you create a dataflow, sometimes you get an error connecting to the data source.
This error can be caused by the gateway, credentials, or other reasons. This article
explains the most common connection errors and problems, and their resolution.
Reason:
When your table in the dataflow gets data from an on-premises data source, a gateway
is needed for the connection, but the gateway hasn't been selected.
Resolution:
Select Select gateway. If the gateway hasn't been set up yet, go to Install an on-
premises data gateway.
Error: Please specify how to connect
This problem happens when you're connected to a data source, but haven't set up the
credentials or connection details yet. It can happen when you migrate queries into a
dataflow.
Reason:
Resolution:
Reason:
Disabled modules are related to functions that require an on-premises data gateway
connection to work. Even if the function is getting data from a webpage, because of
some security compliance requirements, it needs to go through a gateway connection.
Resolution:
First, install and set up an on-premises gateway. Then add a web data source for the
web URL you're connecting to.
After adding the web data source, you can select the gateway in the dataflow from
Options > Project options.
You might be asked to set up credentials. When you've set up the gateway and your
credentials successfully, the modules will no longer be disabled.
A dataflow maintains its association with deleted dataflow data sources and doesn't
delete them automatically. This requires a trim initiated by the user.
Resolution:
In order to trim the data sources, you'll need to take the following steps:
2. Select Options.
4. Change the gateway to another gateway. It doesn't matter which one, as long as
it's a different gateway.
5. After you apply the change by selecting OK, repeat steps 1 through 4 to select the
original gateway again.
These steps essentially delete all the data source bindings for the dataflow. After
finishing these steps, you might be asked to set up credentials. When you've set up the
gateway and your credentials successfully, you effectively "trimmed" the data source
bindings for the dataflow to just the ones that the dataflow is actually using.
Note
For dataflows with cloud data sources without an on-premises data gateway, when
the data source name changes, an old data source connection still exists. To remove
this connection, in the Power BI service, select the settings icon, go to Manage
Connections and Gateways > Connections, select the ellipsis menu by the
connection name, then select Remove to remove the old/unused data source
connection. Then go to the Home page, select the workspace, select the dataflow
to open it, and then in the Table name page, select Edit tables. Then in the Power
Query editor, select Save & Close for the dataflows to be updated to the current
data source connection and to remove the older connection.
Power Query template (preview)
Article • 06/20/2023
A Power Query template is a file that encompasses the Power Query script and its
associated metadata for a single Power Query project.
It's meant to be used as a simple means to package your entire Power Query project in a
single file. You can then share the file or import it to other Power Query integrations.
Note
A Power Query project can be defined as a single instance of the Power Query
editor. This instance could contain any number of parameters, queries, and specific
settings, such as the locale that should be used to interpret data types.
Important
Concepts that are outside of the Power Query script or its metadata are outside of
the scope of what to find in a Power Query template. Some examples are dataflow
related concepts such as scheduled refresh definition, refresh history, dataflow IDs,
connections referenced, and other concepts that aren't stored in the Power Query
script or the metadata of a Power Query project.
Export a template
Note
For Power Query in Excel for Windows, follow the guide in exporting queries in
Excel to a Power Query template.
You can find the Export template button inside of the Home tab of the Power Query
ribbon, under the Share group.
When you select this button, the Export template dialog appears. In this dialog, you can
enter the name of the template and the description that is used for the template.
This operation downloads a file with the extension .pqt, which is your Power Query
template file.
Import a template
Note
This functionality is only available inside of the Dataflows Gen2 experience for
Microsoft Fabric.
The option to import from a template is only available in Power Query projects with no
queries or parameters set. In the home screen of the Power Query editor, there's a link
that reads Import from Power Query template.
Selecting the Import from Power Query template button triggers an experience to help
you upload your Power Query template and immediately load the project with all its
queries, parameters, and settings.
Keyboard shortcuts provide a quick way to navigate and allow users to work more
efficiently. For users with mobility or vision disabilities, keyboard shortcuts can be easier
than using the touchscreen, and are an essential alternative to using the mouse. The
table in this article lists all the shortcuts available in Power Query Online.
When using the Query Editor in Power Query Online, you can navigate to the Keyboard
shortcuts button in the Help tab to view the list of keyboard shortcuts.
Note
These shortcuts have been tested with Microsoft Edge on Windows and MacOS.
While we try to provide support for all browsers, other browsers can make or
implement their own shortcuts. Because we can't control how other browsers
behave, we can't guarantee that all the shortcuts in this list will work on all
browsers.
Query Editor
Action | Windows keyboard shortcut | MacOS keyboard shortcut
Move focus to column header on the left | Ctrl+Left arrow key | Command+Left arrow key
Move focus to column header on the right | Ctrl+Right arrow key | Command+Right arrow key
Select last cell of the last row | Ctrl+End | Fn+Command+Right arrow key
Select the cell one page down | Page down | Fn+Down arrow key
Diagram View
Action | Windows keyboard shortcut | MacOS keyboard shortcut
Move focus from query level to step level | Alt+Down arrow key | Option+Down arrow key
Queries pane
Action | Windows keyboard shortcut | MacOS keyboard shortcut
Select multiple consecutive queries | Ctrl+Up arrow key and Ctrl+Down arrow key | Command+Up arrow key and Command+Down arrow key
Best practices when working with Power Query
Article • 04/10/2023
This article contains some tips and tricks to make the most out of your data wrangling
experience in Power Query.
Using the best connector for the task will provide you with the best experience and
performance. For example, using the SQL Server connector instead of the ODBC
connector when connecting to a SQL Server database not only provides you with a
much better Get Data experience, but the SQL Server connector also offers you features
that can improve your experience and performance, such as query folding. To read more
about query folding, go to Power Query query folding.
Each data connector follows a standard experience as explained in Getting data. This
standardized experience has a stage called Data Preview. In this stage, you're provided
with a user-friendly window to select the data that you want to get from your data
source, if the connector allows it, and a simple data preview of that data. You can even
select multiple datasets from your data source through the Navigator window, as shown
in the following image.
Filter early
It's always recommended to filter your data in the early stages of your query or as early
as possible. Some connectors will take advantage of your filters through query folding,
as described in Power Query query folding. It's also a best practice to filter out any data
that isn't relevant for your case. This will let you better focus on your task at hand by
only showing data that’s relevant in the data preview section.
You can use the auto filter menu that displays a distinct list of the values found in your
column to select the values that you want to keep or filter out. You can also use the
search bar to help you find the values in your column.
You can also take advantage of the type-specific filters such as In the previous for a
date, datetime, or even date timezone column.
These type-specific filters can help you create a dynamic filter that will always retrieve
data that's in the previous x number of seconds, minutes, hours, days, weeks, months,
quarters, or years as showcased in the following image.
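The idea behind such a dynamic filter can be sketched in Python (used here only to illustrate the logic, not as what Power Query generates). The function and column names are hypothetical, and excluding the current day from "the previous N days" is an assumption about the filter's behavior.

```python
from datetime import datetime, timedelta


def in_the_previous_days(rows, column, days, now):
    """Keep rows dated within the `days` full days before today.

    Assumption: the current day itself is excluded.
    """
    today = now.date()
    first_day = today - timedelta(days=days)
    return [row for row in rows if first_day <= row[column].date() < today]


# `now` is passed in so the example is repeatable; a real refresh would
# use the current date and time, which is what makes the filter dynamic.
now = datetime(2023, 4, 10, 12, 0)
orders = [
    {"id": 1, "ordered": datetime(2023, 4, 10, 9, 0)},   # today: excluded
    {"id": 2, "ordered": datetime(2023, 4, 9, 23, 0)},
    {"id": 3, "ordered": datetime(2023, 4, 3, 0, 0)},
    {"id": 4, "ordered": datetime(2023, 4, 2, 23, 59)},  # older than 7 days
]
recent = in_the_previous_days(orders, "ordered", 7, now)
```

Because the cutoff is computed from the current date rather than stored as a fixed value, the same query keeps returning a moving window of data on every refresh.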
Note
To learn more about filtering your data based on values from a column, go to Filter
by values.
Do expensive operations last
Certain operations require reading the full data source in order to return any results, and
will thus be slow to preview in the Power Query Editor. For example, if you perform a
sort, it's possible that the first few sorted rows are at the end of the source data. So in
order to return any results, the sort operation must first read all the rows.
Other operations (such as filters) do not need to read all the data before returning any
results. Instead, they operate over the data in what's called a "streaming" fashion. The
data "streams" by, and results are returned along the way. In the Power Query Editor,
such operations only need to read enough of the source data to populate the preview.
When possible, perform such streaming operations first, and do any more expensive
operations last. This will help minimize the amount of time you spend waiting for the
preview to render each time you add a new step to your query.
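The difference between the two kinds of operations can be demonstrated with a small Python sketch. It's an analogy built on Python generators, not a description of the Power Query engine; the source, the row shape, and the preview size of five rows are made up for the example.

```python
from itertools import islice


def counting_source(n, counter):
    """Yield rows one at a time and record how many were actually read."""
    for i in range(n):
        counter["read"] += 1
        yield {"id": i, "value": (i * 37) % 101}


PREVIEW_ROWS = 5  # hypothetical preview size

# Streaming: the filter pulls rows only until the preview is full.
filter_reads = {"read": 0}
source = counting_source(1000, filter_reads)
preview = list(islice((row for row in source if row["value"] > 50), PREVIEW_ROWS))

# Sorting: every row must be read before the first sorted row is known.
sort_reads = {"read": 0}
sorted_preview = sorted(
    counting_source(1000, sort_reads), key=lambda row: row["value"]
)[:PREVIEW_ROWS]
```

After this runs, filter_reads shows that only a small part of the source was read to fill the filtered preview, while sort_reads shows that the sort consumed all 1,000 rows before returning anything.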
It's crucial that you always work with the correct data types for your columns. When
working with structured data sources such as databases, the data type information will
be brought from the table schema found in the database. But for unstructured data
sources such as TXT and CSV files, it's important that you set the correct data types for
the columns coming from that data source. By default, Power Query offers an automatic
data type detection for unstructured data sources. You can read more about this feature
and how it can help you in Data types.
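As a rough illustration of why explicit types matter for text-based sources, the following Python sketch reads a small CSV. Every value arrives as text until a type is assigned per column. The column names and sample data are invented for the example.

```python
import csv
import io
from datetime import date

# A CSV file carries no type information: every value is read as text.
raw = "OrderDate,Units,Price\n2023-04-01,3,9.50\n2023-04-02,10,2.25\n"

# Explicit type per column, playing the role of a "change type" step.
column_types = {"OrderDate": date.fromisoformat, "Units": int, "Price": float}

rows = [
    {name: column_types[name](value) for name, value in record.items()}
    for record in csv.DictReader(io.StringIO(raw))
]

# With real numbers, arithmetic works; with text it would concatenate or fail.
total_units = sum(row["Units"] for row in rows)
```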
Note
To learn more about the importance of data types and how to work with them, see
Data types.
These data profiling tools help you better understand your data. The tools provide you
with small visualizations that show you information on a per column basis, such as:
Column quality—Provides a small bar chart and three indicators with the
representation of how many values in the column fall under the categories of valid,
error, or empty values.
Column distribution—Provides a set of visuals underneath the names of the
columns that showcase the frequency and distribution of the values in each of the
columns.
Column profile—Provides a more thorough view of your column and the statistics
associated to it.
You can also interact with these features, which will help you prepare your data.
Note
To learn more about the data profiling tools, go to Data profiling tools.
While Power Query automatically creates a step name for you in the applied steps pane,
you can also rename your steps or add a description to any of them.
Note
To learn more about all the available features and components found inside the
applied steps pane, go to Using the Applied steps list.
For example, say you have a query with the nine steps shown in the following image.
You could split this query into two at the Merge with Prices table step. That way it's
easier to understand the steps that were applied to the sales query before the merge. To
do this operation, you right-click the Merge with Prices table step and select the Extract
Previous option.
You'll then be prompted with a dialog to give your new query a name. This will
effectively split your query into two queries. One query will have all the steps before
the merge. The other query will have an initial step that will reference your new query
and the rest of the steps that you had in your original query from the Merge with Prices
table step downward.
You could also leverage the use of query referencing as you see fit. But it's a good idea
to keep your queries at a level that doesn't seem daunting at first glance with so many
steps.
Create groups
A great way to keep your work organized is by leveraging the use of groups in the
queries pane.
The sole purpose of groups is to help you keep your work organized by serving as
folders for your queries. You can create groups within groups should you ever need to.
Moving queries across groups is as easy as drag and drop.
Try to give your groups a meaningful name that makes sense to you and your case.
Note
To learn more about all the available features and components found inside the
queries pane, go to Understanding the queries pane.
Future-proofing queries
Making sure that you create a query that won't have any issues during a future refresh is
a top priority. There are several features in Power Query to make your query resilient to
changes and able to refresh even when some components of your data source change.
It's a best practice to define the scope of your query as to what it should do and what it
should account for in terms of structure, layout, column names, data types, and any
other component that you consider relevant to the scope.
Some examples of transformations that can help you make your query resilient to
changes are:
If your query has a dynamic number of rows with data, but a fixed number of rows
that serve as the footer that should be removed, you can use the Remove bottom
rows feature.
Note
To learn more about filtering your data by row position, go to Filter a table by
row position.
If your query has a dynamic number of columns, but you only need to select
specific columns from your dataset, you can use the Choose columns feature.
If your query has a dynamic number of columns and you need to unpivot only a
subset of your columns, you can use the unpivot only selected columns feature.
If your query has a step that changes the data type of a column, but some cells
yield errors as the values don't conform to the desired data type, you could
remove the rows that yielded error values.
Note
To learn more about working with and dealing with errors, go to Dealing with errors.
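The four transformations in the list above share one idea: each refers only to what the query needs (a fixed footer size, named columns, selected columns, convertible values) instead of to the exact shape of today's data. The following Python sketch mimics that idea on a list of rows. The function names and sample data are hypothetical, and this isn't how Power Query implements these features.

```python
def remove_bottom_rows(rows, count):
    """Drop a fixed-size footer, however many data rows precede it."""
    return rows[: max(len(rows) - count, 0)]


def choose_columns(rows, wanted):
    """Keep only the named columns; new columns in the source are ignored."""
    return [{name: row[name] for name in wanted} for row in rows]


def unpivot_only_selected(rows, selected):
    """Turn the selected columns into Attribute/Value pairs."""
    result = []
    for row in rows:
        fixed = {k: v for k, v in row.items() if k not in selected}
        for name in selected:
            result.append({**fixed, "Attribute": name, "Value": row[name]})
    return result


def remove_error_rows(rows, column, convert):
    """Convert a column's type and drop the rows where conversion fails."""
    kept = []
    for row in rows:
        try:
            kept.append({**row, column: convert(row[column])})
        except ValueError:
            continue
    return kept


source = [
    {"Region": "North", "Jan": "10", "Feb": "12", "Notes": "a"},
    {"Region": "South", "Jan": "n/a", "Feb": "7", "Notes": "b"},
    {"Region": "Total", "Jan": "", "Feb": "", "Notes": ""},  # footer row
]
body = remove_bottom_rows(source, 1)
january = remove_error_rows(choose_columns(body, ["Region", "Jan"]), "Jan", int)
long_form = unpivot_only_selected(choose_columns(body, ["Region", "Jan", "Feb"]), ["Jan", "Feb"])
```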
Use parameters
Creating queries that are dynamic and flexible is a best practice. Parameters in Power
Query help you make your queries more dynamic and flexible. A parameter serves as a
way to easily store and manage a value that can be reused in many different ways. But
it's more commonly used in two scenarios:
Step argument—You can use a parameter as the argument of multiple
transformations driven from the user interface.
Custom Function argument—You can create a new function from a query, and
reference parameters as the arguments of your custom function.
Centralized view of all your parameters through the Manage Parameters window.
Reusability of the parameter in multiple steps or queries.
You can even use parameters in some of the arguments of the data connectors. For
example, you could create a parameter for your server name when connecting to your
SQL Server database. Then you could use that parameter inside the SQL Server database
dialog.
If you change your server location, all you need to do is update the parameter for your
server name and your queries will be updated.
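The benefit of a shared parameter can be sketched as follows. This Python analogy isn't Power Query syntax; the parameter name, the server addresses, and the two "queries" are invented for illustration.

```python
# Hypothetical parameter store; the name and values are illustrative only.
parameters = {"ServerName": "sql-old.example.com"}


def sales_source(params):
    # Each query reads the server name from the shared parameter.
    return {"server": params["ServerName"], "database": "Sales"}


def inventory_source(params):
    return {"server": params["ServerName"], "database": "Inventory"}


# Moving to a new server means changing one value, not every query.
parameters["ServerName"] = "sql-new.example.com"
sources = [sales_source(parameters), inventory_source(parameters)]
```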
For example, say you have multiple queries or values that require the same set of
transformations. You could create a custom function that later could be invoked against
the queries or values of your choice. This custom function would save you time and help
you in managing your set of transformations in a central location, which you can modify
at any moment.
Power Query custom functions can be created from existing queries and parameters. For
example, imagine a query that has several codes as a text string and you want to create
a function that will decode those values.
You start by having a parameter that has a value that serves as an example.
From that parameter, you create a new query where you apply the transformations that
you need. For this case, you want to split the code PTY-CM1090-LAX into multiple
components:
Origin = PTY
Destination = LAX
Airline = CM
FlightID = 1090
You can then transform that query into a function by doing a right-click on the query
and selecting Create Function. Finally, you can invoke your custom function into any of
your queries or values, as shown in the following image.
After a few more transformations, you can see that you've reached your desired output
and leveraged the logic for such a transformation from a custom function.
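The decoding logic of such a function can be sketched in Python. This is an illustration of the transformation, not the M code that Power Query generates. The pattern assumes that every code looks like the example: a three-letter origin, an airline made of letters, a numeric flight ID, and a three-letter destination, separated by hyphens.

```python
import re

# Assumed layout, inferred from the single example PTY-CM1090-LAX.
CODE_PATTERN = re.compile(
    r"^(?P<origin>[A-Z]{3})-(?P<airline>[A-Z]+)(?P<flight_id>\d+)-(?P<destination>[A-Z]{3})$"
)


def decode_flight_code(code):
    """Split a code such as PTY-CM1090-LAX into its components."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"Unrecognized code: {code!r}")
    return {
        "Origin": match["origin"],
        "Destination": match["destination"],
        "Airline": match["airline"],
        "FlightID": match["flight_id"],  # kept as text to preserve leading zeros
    }


decoded = decode_flight_code("PTY-CM1090-LAX")
```

Defining the logic once and invoking it for every value is the same benefit a custom function gives you: if the code format changes, there's a single place to update.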
Note
To learn more about how to create and use custom functions in Power Query, go to
Custom Functions.
Power Query feedback
Article • 02/17/2023
This article describes how to get support or submit feedback for Power Query.
For Power Query connectors, go to Feedback and support for Power Query connectors.
For Power Query documentation, you can submit feedback through the Submit and
view feedback for - This page link at the bottom of each article.
Community forums for the product you're using Power Query in. For example, for
Power BI, this forum would be the Power BI Community
Power Query website resources
For information about the built-in Power Query help support links, go to Getting Power
Query help.
Submitting feedback
To submit feedback about Power Query, provide the feedback to the "ideas" forum for
the product you're using Power Query in. For example, for Power BI, visit the Power BI
ideas forum. If you have one, you can also provide feedback directly to your Microsoft
account contact.
How fuzzy matching works in Power Query
Article • 12/17/2022
Power Query features such as fuzzy merge, cluster values, and fuzzy grouping use the
same mechanisms to work as fuzzy matching.
This article goes over many scenarios that demonstrate how to take advantage of the
options that fuzzy matching has, with the goal of making 'fuzzy' clear.
Because the word Apples in the second string is only a small part of the whole text
string, that comparison yields a lower similarity score.
For example, the following dataset consists of responses from a survey that had only
one question—"What is your favorite fruit?"
Fruit
Blueberries
Strawberries
Strawberries = <3
Apples
'sples
4ppl3s
Bananas
Banas
The survey provided one single textbox to input the value and had no validation.
Now you're tasked with clustering the values. To do that task, load the previous table of
fruits into Power Query, select the column, and then select the Cluster values option in
the Add column tab in the ribbon.
The Cluster values dialog box appears, where you can specify the name of the new
column. Name this new column Cluster and select OK.
By default, Power Query uses a similarity threshold of 0.8 (or 80%) and the result of the
previous operation yields the following table with a new Cluster column.
While the clustering has been done, it's not giving you the expected results for all the
rows. Row number two (2) still has the value "Blue berries are simply the best", but it
should be clustered to "Blueberries", and something similar happens to the text strings
"Strawberries = <3", "fav fruit is bananas", and "My favorite fruit, by far, is Apples. I simply love them!".
Upon closer inspection, Power Query couldn't find any other values within the similarity
threshold for the text strings "Blue berries are simply the best", "Strawberries = <3", "fav
fruit is bananas", and "My favorite fruit, by far, is Apples. I simply love them!".
Go back to the Cluster values dialog box one more time by double-clicking Clustered
values in the Applied steps panel. Change the Similarity threshold from 0.8 to 0.6, and
then select OK.
This change gets you closer to the result that you're looking for, except for the text
string "My favorite fruit, by far, is Apples. I simply love them!". When you changed
the Similarity threshold value from 0.8 to 0.6, Power Query was now able to use the
values with a similarity score that starts from 0.6 all the way up to 1.
Note
Power Query always uses the value closest to the threshold to define the clusters.
The threshold defines the lower limit of the similarity score that's acceptable to
assign the value to a cluster.
You can try again by changing the Similarity threshold from 0.6 to a lower number until you
get the results that you're looking for. In this case, change the Similarity threshold to 0.5.
This change yields the exact result that you're expecting with the text string "My favorite
fruit, by far, is Apples. I simply love them!" now assigned to the cluster Apples.
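The effect of the threshold can be explored with a small Python sketch. It's a stand-in only: the similarity measure comes from Python's difflib module, and the greedy clustering is a simplification, so neither is the algorithm Power Query uses and the scores won't match the ones Power Query reports. What carries over is the role of the threshold as the lowest score at which a value may join an existing cluster.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between 0 and 1 (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_values(values, threshold):
    """Greedy clustering: join the most similar existing cluster if its
    score reaches the threshold; otherwise start a new cluster."""
    representatives = []
    clusters = {}
    for value in values:
        best = max(
            representatives,
            key=lambda rep: similarity(value, rep),
            default=None,
        )
        if best is not None and similarity(value, best) >= threshold:
            clusters[value] = best
        else:
            representatives.append(value)
            clusters[value] = value
    return clusters


fruits = ["Apples", "4ppl3s", "Bananas", "Banas"]
strict = cluster_values(fruits, 0.8)
relaxed = cluster_values(fruits, 0.6)
```

With this stand-in measure, "Banas" joins "Bananas" at either threshold, while "4ppl3s" joins "Apples" only once the threshold is lowered to 0.6, which mirrors the way lowering the threshold in the Cluster values dialog lets more distant values join a cluster.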
Note
Currently, only the Cluster values feature in Power Query Online provides a new
column with the similarity score.
Important
When the transformation table is used, the maximum similarity score for the values
from the transformation table is 0.95. This deliberate penalty of 0.05 is in place to
distinguish that the original value from such column isn't equal to the values that it
was compared to since a transformation occurred.
For scenarios where you first want to map your values and then perform the fuzzy
matching without the 0.05 penalty, we recommend that you replace the values
from your column and then perform the fuzzy matching.
Behind the scenes of the Data Privacy Firewall
Article • 02/17/2023
Note
Privacy levels are currently unavailable in Power Platform dataflows. The product
team is working towards re-enabling this functionality in the coming weeks.
If you've used Power Query for any length of time, you've likely experienced it. There
you are, querying away, when you suddenly get an error that no amount of online
searching, query tweaking, or keyboard bashing can remedy. An error like:
Formula.Firewall: Query 'Query1' (step 'Source') references other queries or steps,
so it may not directly access a data source. Please rebuild this data combination.
Or maybe:
Formula.Firewall: Query 'Query1' (step 'Source') is accessing data sources that
have privacy levels which cannot be used together. Please rebuild this data
combination.
These Formula.Firewall errors are the result of Power Query's Data Privacy Firewall (also
known as the Firewall), which at times may seem like it exists solely to frustrate data
analysts the world over. Believe it or not, however, the Firewall serves an important
purpose. In this article, we'll delve under the hood to better understand how it works.
Armed with greater understanding, you'll hopefully be able to better diagnose and fix
Firewall errors in the future.
What is it?
The purpose of the Data Privacy Firewall is simple: it exists to prevent Power Query from
unintentionally leaking data between sources.
Why is this needed? I mean, you could certainly author some M that would pass a SQL
value to an OData feed. But this would be intentional data leakage. The mashup author
would (or at least should) know they were doing this. Why then the need for protection
against unintentional data leakage?
As part of folding, PQ sometimes may determine that the most efficient way to execute
a given mashup is to take data from one source and pass it to another. For example, if
you're joining a small CSV file to a huge SQL table, you probably don't want PQ to read
the CSV file, read the entire SQL table, and then join them together on your local
computer. You probably want PQ to inline the CSV data into a SQL statement and ask
the SQL database to perform the join.
Imagine if you were joining SQL data that included employee Social Security Numbers
with the results of an external OData feed, and you suddenly discovered that the Social
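Security Numbers from SQL were being sent to the OData service. Bad news, right?
The Firewall prevents this by dividing your M queries into something called partitions, and
then enforcing the following rule:
A partition may either access compatible data sources, or reference other partitions,
but not both.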
Simple…yet confusing. What's a partition? What makes two data sources "compatible"?
And why should the Firewall care if a partition wants to access a data source and
reference a partition?
Let's break this down and look at the above rule one piece at a time.
What's a partition?
At its most basic level, a partition is just a collection of one or more query steps. The
most granular partition possible (at least in the current implementation) is a single step.
The largest partitions can sometimes encompass multiple queries. (More on this later.)
If you're not familiar with steps, you can view them on the right of the Power Query
Editor window after selecting a query, in the Applied Steps pane. Steps keep track of
everything you've done to transform your data into its final shape.
Assume you have a query called Employees, which pulls some data from a SQL
database. Assume you also have another query (EmployeesReference), which simply
references Employees.
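Power Query M
shared Employees = let
    Source = Sql.Database(…),
    EmployeesTable = …
in
    EmployeesTable;
shared EmployeesReference = let
    Source = Employees
in
    Source;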
These queries will end up divided into two partitions: one for the Employees query, and
one for the EmployeesReference query (which will reference the Employees partition).
When evaluated with the Firewall on, these queries will be rewritten like so:
Power Query M
shared Employees = let
    Source = Sql.Database(…),
    EmployeesTable = …
in
    EmployeesTable;
shared EmployeesReference = let
    Source = Value.Firewall("Section1/Employees")
in
    Source;
Notice that the simple reference to the Employees query has been replaced by a call to
Value.Firewall , which is provided the full name of the Employees query.
This is how the Firewall maintains control over the data flowing between partitions.
What happens if you try to access incompatible data sources in the same partition?
Formula.Firewall: Query 'Query1' (step 'Source') is accessing data sources that
have privacy levels which cannot be used together. Please rebuild this data
combination.
Hopefully you now better understand one of the error messages listed at the beginning
of this article.
Note that this compatibility requirement only applies within a given partition. If a
partition is referencing other partitions, the data sources from the referenced partitions
don't have to be compatible with one another. This is because the Firewall can buffer
the data, which will prevent any further folding against the original data source. The
data will be loaded into memory and treated as if it came from nowhere.
As you saw earlier, when one partition references another partition, the Firewall acts as
the gatekeeper for all the data flowing into the partition. To do so, it must be able to
control what data is allowed in. If there are data sources being accessed within the
partition, and data flowing in from other partitions, it loses its ability to be the
gatekeeper, since the data flowing in could be leaked to one of the internally accessed
data sources without it knowing about it. Thus the Firewall prevents a partition that
accesses other partitions from being allowed to directly access any data sources.
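So what happens if a partition tries to reference other partitions and also directly access
data sources?
Formula.Firewall: Query 'Query1' (step 'Source') references other queries or steps,
so it may not directly access a data source. Please rebuild this data combination.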
Now you hopefully better understand the other error message listed at the beginning of
this article.
Partitions in-depth
As you can probably guess from the above information, how queries are partitioned
ends up being incredibly important. If you have some steps that are referencing other
queries, and other steps that access data sources, you now hopefully recognize that
drawing the partition boundaries in certain places will cause Firewall errors, while
drawing them in other places will allow your query to run just fine.
This section is probably the most important for understanding why you're seeing
Firewall errors, and understanding how to resolve them (where possible).
Initial Partitioning
    Creates a partition for each step in each query
Static Phase
    This phase doesn't depend on evaluation results. Instead, it relies on how the
    queries are structured.
    Parameter Trimming
        Trims parameter-esque partitions, that is, any one that:
            Doesn't reference any other partitions
            Doesn't contain any function invocations
            Isn't cyclic (that is, it doesn't refer to itself)
        Note that "removing" a partition effectively includes it in whatever other
        partitions reference it.
        Trimming parameter partitions allows parameter references used within
        data source function calls (for example, Web.Contents(myUrl) ) to work,
        instead of throwing "partition can't reference data sources and other
        steps" errors.
    Grouping (Static)
        Partitions are merged, while maintaining separation between:
            Partitions in different queries
            Partitions that reference other partitions vs. those that don't
Dynamic Phase
    This phase depends on evaluation results, including information about data
    sources accessed by various partitions.
    Trimming
        Trims partitions that meet all the following requirements:
            Doesn't access any data sources
            Doesn't reference any partitions that access data sources
            Isn't cyclic
    Grouping (Dynamic)
        Now that unnecessary partitions have been trimmed, try to create Source
        partitions that are as large as possible.
        Merge all partitions with their input partitions if each of its inputs:
            Is part of the same query
            Doesn't reference any other partitions
            Is only referenced by the current partition
            Isn't the result (that is, final step) of a query
            Isn't cyclic
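The parameter-trimming case in the outline above can be sketched as a small section document. This is an illustration only: the query name Response and the URL value are hypothetical, while myUrl and the Web.Contents(myUrl) call come from the outline itself.

```powerquery
section Section1;

// Hypothetical parameter query: it references no other partitions,
// contains no function invocations, and isn't cyclic, so it qualifies for trimming.
shared myUrl = "https://example.com/data.csv"
    meta [IsParameterQuery = true, Type = "Text", IsParameterQueryRequired = true];

// Once the myUrl partition is trimmed, it's effectively included in this partition,
// so the step below only accesses a data source and no longer references another partition.
shared Response = let
    Source = Web.Contents(myUrl)
in
    Source;
```

If myUrl instead called a function or referenced another query, it would no longer be parameter-esque under the criteria listed above and would stay a separate partition.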
Here's a sample scenario. It's a fairly straightforward merge of a text file (Contacts) with
a SQL database (Employees), where the SQL server is a parameter (DbServer).
Power Query M
shared Contacts = let
    …
in
    #"Changed Type";
Power Query M
shared Employees = let
    Source = Sql.Databases(DbServer),
    AdventureWorks = Source{[Name="AdventureWorks"]}[Data],
    HumanResources_Employee =
        AdventureWorks{[Schema="HumanResources",Item="Employee"]}[Data],
    …
in
    #"Expanded Contacts";
Now we enter the dynamic phase. In this phase, the above static partitions are
evaluated. Partitions that don't access any data sources are trimmed. Partitions are then
grouped to create source partitions that are as large as possible. However, in this
sample scenario, all the remaining partitions access data sources, and there isn't any
further grouping that can be done. The partitions in our sample thus won't change
during this phase.
Let's pretend
For the sake of illustration, though, let's look at what would happen if the Contacts
query, instead of coming from a text file, were hard-coded in M (perhaps via the Enter
Data dialog).
In this case, the Contacts query wouldn't access any data sources. Thus, it would get
trimmed during the first part of the dynamic phase.
With the Contacts partition removed, the last two steps of Employees would no longer
reference any partitions except the one containing the first three steps of Employees.
Thus, the two partitions would be grouped.
Imagine you want to look up a company name from the Northwind OData service, and
then use the company name to perform a Bing search.
First, you create a Company query to retrieve the company name.
Power Query M
let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/", null,
        [Implementation="2.0"]),
    Customers_table = Source{[Name="Customers",Signature="table"]}[Data],
    CHOPS = Customers_table{[CustomerID="CHOPS"]}[CompanyName]
in
    CHOPS
Next, you create a Search query that references Company and passes it to Bing.
Power Query M
let
    Source = Text.FromBinary(Web.Contents("https://www.bing.com/search?q=" & Company))
in
    Source
At this point you run into trouble. Evaluating Search produces a Firewall error.
This is because the Source step of Search is referencing a data source (bing.com) and
also referencing another query/partition (Company). It's violating the rule mentioned
above ("a partition may either access compatible data sources, or reference other
partitions, but not both").
What to do? One option is to disable the Firewall altogether (via the Privacy option
labeled Ignore the Privacy Levels and potentially improve performance). But what if
you want to leave the Firewall enabled?
To resolve the error without disabling the Firewall, you can combine Company and
Search into a single query, like this:
Power Query M
let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/", null,
        [Implementation="2.0"]),
    Customers_table = Source{[Name="Customers",Signature="table"]}[Data],
    CHOPS = Customers_table{[CustomerID="CHOPS"]}[CompanyName],
    Search = Text.FromBinary(Web.Contents("https://www.bing.com/search?q=" & CHOPS))
in
    Search
Everything is now happening inside a single partition. Assuming that the privacy levels
for the two data sources are compatible, the Firewall should now be happy, and you'll
no longer get an error.
That's a wrap
While there's much more that could be said on this topic, this introductory article is
already long enough. Hopefully it's given you a better understanding of the Firewall, and
will help you to understand and fix Firewall errors when you encounter them in the
future.
Query diagnostics
Article • 02/17/2023
With Query Diagnostics, you can achieve a better understanding of what Power Query is
doing at authoring and at refresh time in Power BI Desktop. While we'll be expanding on
this feature in the future, including adding the ability to use it during full refreshes, at
this time you can use it to understand what sort of queries you're emitting, what
slowdowns you might run into during authoring refresh, and what kind of background
events are happening.
To use Query Diagnostics, go to the Tools tab in the Power Query editor ribbon.
When you press Diagnose Step, Power Query runs a special evaluation of just the step
you're looking at. It then shows you the diagnostics for that step, without showing you
the diagnostics for other steps in the query. This can make it much easier to get a
narrow view into a problem.
If you're recording all traces with Start Diagnostics, it's important that you select
Stop Diagnostics when you're done. Stopping the diagnostics allows the engine to collect
the recorded traces and parse them into the proper output. Without this step, you'll lose
your traces.
Types of diagnostics
We currently provide three types of diagnostics, one of which has two levels of detail.
The first of these diagnostics are the primary diagnostics, which have a detailed view
and a summarized view. The summarized view aims to give you immediate
insight into where time is being spent in your query. The detailed view goes much deeper,
line by line, and is, in general, only needed for serious diagnosing by power users.
For this view, some capabilities, like the Data Source Query column, are currently
available only on certain connectors. We'll be working to extend the breadth of this
coverage in the future.
Data privacy partitions provide you with a better understanding of the logical partitions
used for data privacy.
Note
Power Query might perform evaluations that you may not have directly triggered.
Some of these evaluations are performed in order to retrieve metadata so we can
best optimize our queries or to provide a better user experience (such as retrieving
the list of distinct values within a column that are displayed in the Filter Rows
experience). Others might be related to how a connector handles parallel
evaluations. At the same time, if you see in your query diagnostics repeated queries
that you don't believe make sense, feel free to reach out through normal support
channels—your feedback is how we improve our product.
The summarized view provides an overview of what occurred during an evaluation for
easy high-level review. If further breakdown is wanted for a specific operation, the user
can look at the group ID and view the corresponding operations that were grouped in
the detail view.
To provide higher performance, Power Query currently does some caching so that it doesn't
have to rerun every part of the final query plan as it goes back through the steps. While this
caching is useful for normal authoring, it means that you won't always get correct step
comparison information, because later evaluations pull on cached data.
Diagnostics schema
Id
When analyzing the results of a recording, it's important to filter the recording session
by Id, so that columns such as Exclusive Duration % make sense.
Id is a composite identifier. It's formed of two numbers—one before the dot, and one
after. The first number is the same for all evaluations that resulted from a single user
action. In other words, if you press refresh twice, there will be two different numbers
leading the dot, one for each user activity taken. This numbering is sequential for a
given diagnostics recording.
The second number represents an evaluation by the engine. This number is sequential
for the lifetime of the process where the evaluation is queued. If you run multiple
diagnostics recording sessions, you'll see this number continue to grow across the
different sessions.
To summarize, if you start recording, trigger a single evaluation, and stop recording, you'll
have some number of Ids in your diagnostics. But since you only took one action, they'll
all be 1.1, 1.2, 1.3, and so on.
The combination of the activityId and the evaluationId, separated by the dot, provides a
unique identifier for an evaluation of a single recording session.
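Because Id is a composite value, it can help to split it into its two parts before filtering. The following is a minimal sketch that assumes the Id column is stored as text; the sample table and the new column names ActivityId and EvaluationId are hypothetical.

```powerquery
let
    // Hypothetical stand-in for a recorded diagnostics table
    Diagnostics = #table(
        type table [Id = text, Operation = text],
        {{"1.1", "Evaluation"}, {"1.2", "Evaluation"}, {"2.3", "Evaluation"}}
    ),
    // Number before the dot: the user activity that triggered the evaluation
    WithActivity = Table.AddColumn(
        Diagnostics, "ActivityId",
        each Number.FromText(Text.BeforeDelimiter([Id], ".")), Int64.Type
    ),
    // Number after the dot: the evaluation queued by the engine
    WithEvaluation = Table.AddColumn(
        WithActivity, "EvaluationId",
        each Number.FromText(Text.AfterDelimiter([Id], ".")), Int64.Type
    ),
    // Keep a single evaluation so that per-evaluation columns are read in context
    OneEvaluation = Table.SelectRows(WithEvaluation, each [Id] = "1.2")
in
    OneEvaluation
```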
Query
The name of the Query in the left-hand pane of the Power Query editor.
Step
The name of the Step in the right-hand pane of the Power Query editor. Things like filter
dropdowns generally associate with the step you're filtering on, even if you're not
refreshing the step.
Category
The category of the operation.
Operation
The actual operation being performed. This operation can include evaluator work,
opening connections, sending queries to the data source, and many more.
Start time
The time that the operation started.
End time
The time that the operation ended.
Exclusive duration
The absolute time, rather than %, of exclusive duration. The total duration (that is,
exclusive duration + time when the event was inactive) of an evaluation can be
calculated in one of two ways:
Find the operation called "Evaluation". The difference between its End Time and Start
Time is the total duration of an event.
Subtract the minimum start time of all operations in an event from the maximum
end time. Note that in cases when the information collected for an event doesn't
account for the total duration, an operation called "Trace Gaps" is generated to
account for this time gap.
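Both approaches can be sketched in M. The sample rows, operation names, and timestamps below are hypothetical, and the column names Start Time and End Time are assumed to match the schema described in this article.

```powerquery
let
    // Hypothetical diagnostics rows for a single evaluation
    Diagnostics = #table(
        type table [Id = text, Operation = text, #"Start Time" = datetime, #"End Time" = datetime],
        {
            {"1.1", "Evaluation", #datetime(2023, 2, 17, 9, 0, 0), #datetime(2023, 2, 17, 9, 0, 5)},
            {"1.1", "Open connection", #datetime(2023, 2, 17, 9, 0, 1), #datetime(2023, 2, 17, 9, 0, 2)},
            {"1.1", "Send query", #datetime(2023, 2, 17, 9, 0, 2), #datetime(2023, 2, 17, 9, 0, 4)}
        }
    ),
    OneEvaluation = Table.SelectRows(Diagnostics, each [Id] = "1.1"),
    // Approach 1: End Time minus Start Time of the "Evaluation" operation
    EvaluationRow = Table.SelectRows(OneEvaluation, each [Operation] = "Evaluation"){0},
    FromEvaluationRow = EvaluationRow[#"End Time"] - EvaluationRow[#"Start Time"],
    // Approach 2: maximum end time minus minimum start time across all operations
    FromMinMax = List.Max(OneEvaluation[#"End Time"]) - List.Min(OneEvaluation[#"Start Time"]),
    Result = [
        EvaluationRowSeconds = Duration.TotalSeconds(FromEvaluationRow),
        MinMaxSeconds = Duration.TotalSeconds(FromMinMax)
    ]
in
    Result
```

With this sample data both fields evaluate to 5 seconds. In a real recording the two can differ when the collected operations don't cover the whole event.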
Resource
The resource you're accessing for data. The exact format of this resource depends on
the data source.
Data source query
The Data Source Query column allows you to see the query or HTTP request/response
sent against the back-end data source. As you author your Query in the editor, many
Data Source Queries will be emitted. Some of these are the actual final Data Source
Query to render the preview, but others may be for data profiling, filter dropdowns,
information on joins, retrieving metadata for schemas, and any number of other small
queries.
In general, you shouldn't be concerned by the number of Data Source Queries emitted
unless there are specific reasons to be concerned. Instead, you should focus on making
sure the proper content is being retrieved. This column might also help determine if the
Power Query evaluation was fully folded.
Additional info
There's a lot of information retrieved by our connectors. Much of it is ragged and
doesn't fit well into a standard column hierarchy. This information is put in a record in
the additional info column. Information logged from custom connectors also appears
here.
Row count
The number of rows returned by a Data Source Query. Not enabled on all connectors.
Content length
Content length returned by HTTP Requests, as commonly defined. This isn't enabled in
all connectors, and it won't be accurate for connectors that retrieve requests in chunks.
Is user query
A Boolean value that indicates if it's a query authored by the user and present in the
left-hand pane, or if it was generated by some other user action. Other user actions can
include things such as filter selection or using the navigator in the get data experience.
Path
Path represents the relative route of the operation when viewed as part of an interval
tree for all operations within a single evaluation. At the top (root) of the tree, there's a
single operation called Evaluation with path "0". The start time of this evaluation
corresponds to the start of this evaluation as a whole. The end time of this evaluation
shows when the whole evaluation finished. This top-level operation has an exclusive
duration of 0, as its only purpose is to serve as the root of the tree.
Further operations branch from the root. For example, an operation might have "0/1/5"
as a path. This path would be understood as:
0: tree root
1: current operation's parent
5: index of current operation
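As a small illustration of this structure, a path can be decomposed with standard text and list functions. The result field names are hypothetical; only the "0/1/5" path comes from the example above.

```powerquery
let
    Path = "0/1/5",
    // Split the path into its segments: {"0", "1", "5"}
    Parts = Text.Split(Path, "/"),
    // Number of levels below the root
    LevelsBelowRoot = List.Count(Parts) - 1,
    // Drop the last segment to get the parent's path: "0/1"
    Parent = Text.Combine(List.RemoveLastN(Parts, 1), "/"),
    // The last segment is the index of the current operation
    Index = Number.FromText(List.Last(Parts)),
    Result = [Depth = LevelsBelowRoot, ParentPath = Parent, OperationIndex = Index]
in
    Result
```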
Operation "0/1/5" might have a child node, in which case, the path has the form
"0/1/5/8", with 8 representing the index of the child.
Group ID
Combining two (or more) operations won't occur if it leads to detail loss. The grouping
is designed to approximate "commands" executed during the evaluation. In the detailed
view, multiple operations share a Group Id, corresponding to the groups that are
aggregated in the Summary view.
As with most columns, the group ID is only relevant within a specific evaluation, as
filtered by the Id column.
Id
Same as the ID for the other query diagnostics results. The integer part represents a
single activity ID, while the fractional part represents a single evaluation.
Partition key
Corresponds to the Query/Step that's used as a firewall partition.
Firewall group
Categorization that explains why this partition has to be evaluated separately, including
details on the privacy level of the partition.
Accessed resources
List of resource paths for all the resources accessed by this partition. In general, each
path uniquely identifies a data source.
Partition inputs
List of partition keys upon which the current partition depends (this list could be used to
build a graph).
Expression
The expression that gets evaluated on top of the partition's query/step. In several cases,
it coincides with the query/step.
Start time
Time when evaluation started for this partition.
End time
Time when evaluation ended for this partition.
Duration
A value derived from End Time minus Start Time.
Exclusive duration
If partitions are assumed to execute in a single thread, exclusive duration is the "real"
duration that can be attributed to this partition.
Exclusive duration %
Exclusive duration as a percentage.
Diagnostics
This column only appears when the query diagnostics "Aggregated" or "Detailed" is also
captured, allowing the user to correlate the two diagnostics outputs.
% processor time
Percent of time spent by processors on the query. This percentage may reach above
100% because of multiple processors.
Additional reading
How to record diagnostics in various use cases
How to understand what query operations are folding using Query Diagnostics
Recording query diagnostics in Power BI
Article • 02/17/2023
When authoring in Power Query, the basic workflow is that you connect to a data
source, apply some transformations, potentially refresh your data in the Power Query
editor, and then load it to the Power BI model. Once it's in the Power BI model, you may
refresh it from time to time in Power BI Desktop (if you're using Desktop to view
analytics), aside from any refreshes you do in the service.
While you may get a similar result at the end of an authoring workflow, refreshing in the
editor, or refreshing in Power BI proper, very different evaluations are run by the
software for the different user experiences provided. It's important to know what to
expect when doing query diagnostics in these different workflows so you aren't
surprised by the very different diagnostic data.
To start Query Diagnostics, go to the Tools tab in the Power Query editor ribbon. You're
presented here with a few different options.
There are two primary options here, 'Diagnose Step' and 'Start Diagnostics' (paired with
'Stop Diagnostics'). The former will give you information on a query up to a selected
step, and is most useful for understanding what operations are being performed locally
or remotely in a query. The latter gives you more insight into a variety of other cases,
discussed below.
Connector specifics
It's important to mention that there is no way to cover all the different permutations of
what you'll see in Query Diagnostics. There are lots of things that can change exactly
what you see in results:
Connector
Transforms applied
System that you're running on
Network configuration
Advanced configuration choices
ODBC configuration
For the broadest coverage, this documentation focuses on Query Diagnostics of the
Northwind Customers table, both on SQL and OData. The OData notes use the public
endpoint found at the OData.org website, while you'll need to provide a SQL server
for yourself. Many data sources will differ significantly from these, and will have
connector-specific documentation added over time.
To start recording, select Start Diagnostics, perform whatever evaluations you want
(authoring, preview refresh, full refresh), and then select Stop Diagnostics.
Authoring
The authoring workflow's primary difference is that it will generally generate more
individual evaluations than seen in other workflows. As discussed in the primary Query
Diagnostics article, these are a result of populating various user interfaces such as the
navigator or filter dropdowns.
We're going to walk through an example. We're using the OData connector in this
sample, but when reviewing the output we'll also look at the SQL version of the same
database. For both data sources, we're going to connect to the data source via 'New
Source', 'Recent Sources', or 'Get Data'. For the SQL connection you'll need to put in
credentials for your server, but for the public OData endpoint you can put in the
endpoint linked above.
Once you connect and choose authentication, select the Customers table from the
OData service.
This will present you with the Customers table in the Power Query interface. Let's say
that we want to know how many Sales Representatives there are in different countries.
First, right-click on Sales Representative under the Contact Title column, mouse over
Text Filters, and select Equals.
Now, select Group By from the ribbon and do a grouping by Country, with your
aggregate being a Count.
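For reference, the filter and grouping steps above correspond roughly to the following M. This is a sketch, not a capture of the editor's output: the step names and the column names ContactTitle and Country are assumed from the Northwind Customers table, and the endpoint is the public OData URL used elsewhere in this documentation.

```powerquery
let
    Source = OData.Feed("https://services.odata.org/V4/Northwind/Northwind.svc/", null,
        [Implementation = "2.0"]),
    Customers_table = Source{[Name = "Customers", Signature = "table"]}[Data],
    // Text filter: keep only rows whose contact title equals Sales Representative
    #"Filtered Rows" = Table.SelectRows(Customers_table, each [ContactTitle] = "Sales Representative"),
    // Group by Country and count the rows in each group
    #"Grouped Rows" = Table.Group(
        #"Filtered Rows",
        {"Country"},
        {{"Count", each Table.RowCount(_), Int64.Type}}
    )
in
    #"Grouped Rows"
```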
This should present you with the same data you see below.
Finally, navigate back to the Tools tab of the Ribbon and select Stop Diagnostics. This
will stop the tracing and build your diagnostics file for you, and the summary and
detailed tables will appear on the left-hand side.
If you trace an entire authoring session, you can generally expect to see something like a
source query evaluation, then evaluations related to the relevant navigator, then at least
one query emitted for each step you apply (with potentially more depending on the
exact UX actions taken). In some connectors, parallel evaluations will happen for
performance reasons that will yield very similar sets of data.
Refresh preview
When you have finished transforming your data, you have a sequence of steps in a
query. When you press 'Refresh Preview' or 'Refresh All' in the Power Query editor, you
won't see just one step in your query diagnostics. The reason for this is that refreshing in
the Power Query Editor explicitly refreshes the query ending with the last step applied,
and then steps back through the applied steps and refreshes for the query up to that
point, back to the source.
This means that if you have five steps in your query, including Source and Navigator,
you can expect to see five different evaluations in your diagnostics. The first one,
chronologically, will often (but not always) take the longest, for two
reasons:
It may potentially cache input data that the queries run after it (representing earlier
steps in the User Query) can access faster locally.
It may have transforms applied to it that significantly truncate how much data has
to be returned.
Note that 'Refresh All' refreshes all queries, so you'll need
to filter to the ones you care about.
Full refresh
Query Diagnostics can be used to diagnose the so-called 'final query' that is emitted
during the Refresh in Power BI, rather than just the Power Query editor experience. To
do this, you first need to load the data to the model once. If you're planning to do this,
keep in mind that selecting Close and Apply closes the editor window
(interrupting tracing), so you either need to record on the second refresh, or select
the dropdown icon under Close and Apply and select Apply instead.
Either way, make sure to select Start Diagnostics on the Diagnostics section of the
Tools tab in the editor. Once you've done this, refresh your model, or even just the table
you care about.
Once it's done loading the data to model, select Stop Diagnostics.
You can expect to see some combination of metadata and data queries. Metadata calls
grab whatever information they can about the data source. Data retrieval is about accessing the
data source, emitting the final built up Data Source Query with folded down operations,
and then performing whatever evaluations are missing on top, locally.
It's important to note that just because you see a resource (database, web endpoint,
etc.) or a data source query in your diagnostics, it doesn't mean that it's necessarily
performing network activity. Power Query may retrieve this information from its cache.
In future updates, we will indicate whether or not information is being retrieved from
the cache for easier diagnosis.
Diagnose step
'Diagnose Step' is more useful for getting an insight into what evaluations are
happening up to a single step, which can help you identify, up to that step, what
performance is like as well as what parts of your query are being performed locally or
remotely.
If you use 'Diagnose Step' on the query we built above, you'll find that it only returns
10 or so rows. The last row with a Data Source Query gives a
pretty good idea of what the final emitted query to the data source will be. In this case,
you can see that Sales Representative was filtered remotely, but the grouping (by process
of elimination) happened locally.
If you start and stop diagnostics and refresh the same query, you get 40 rows because,
as mentioned above, Power Query is getting information on every step, not
just the final step. This makes it harder when you're just trying to get insight into one
particular part of your query.
Additional reading
An introduction to the feature
How to understand what query operations are folding using Query Diagnostics
Visualizing and Interpreting Query Diagnostics in Power BI
Article • 02/17/2023
Introduction
Once you've recorded the diagnostics you want to use, the next step is being able to
understand what they say.
It's helpful to have a good understanding of what exactly each column in the query
diagnostics schema means, which we're not going to repeat in this short tutorial. A full
write-up is available in the Query diagnostics article.
In general, when building visualizations, it's better to use the full detailed table.
Regardless of how many rows it has, what you're probably looking at is some kind of
depiction of how the time spent in different resources adds up, or what the native query
emitted was.
As mentioned in our article on recording the diagnostics, I'm working with the OData
and SQL traces for the same table (or nearly so)—the Customers table from Northwind.
In particular, I'm going to focus on a common ask from our customers, and one of the
easier to interpret sets of traces: full refresh of the data model.
For the first visualization, the fields to include are:
Id
Start Time
Query
Step
Data Source Query
Exclusive Duration (%)
Row Count
Category
Is User Query
Path
For the second visualization, one choice is to use a Stacked Column Chart. In the 'Axis'
parameter, you might want to use 'Id' or 'Step'. If we're looking at the Refresh, because it
doesn't have anything to do with steps in the Editor itself, we probably just want to look
at 'Id'. For the 'Legend' parameter, you should set 'Category' or 'Operation' (depending
on the granularity you want). For the 'Value', set 'Exclusive Duration' (and make sure it's
not the %, so that you get the raw duration value). Finally, for the Tooltip, set 'Earliest
Start Time'.
Once your visualization is built, make sure you sort by 'Earliest Start Time' ascending so
you can see the order things happen in.
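The visual performs this aggregation for you, but it can be useful to see the equivalent shaping in M when checking the numbers. The following is a minimal sketch over a hypothetical sample table; the output column name Exclusive Duration (s) is an assumption, and the durations are converted to seconds for readability.

```powerquery
let
    // Hypothetical stand-in for the detailed diagnostics table
    Diagnostics = #table(
        type table [Id = text, Category = text, #"Start Time" = datetime, #"Exclusive Duration" = duration],
        {
            {"2.3", "Data Source", #datetime(2023, 2, 17, 9, 0, 0), #duration(0, 0, 0, 2)},
            {"2.3", "Evaluator", #datetime(2023, 2, 17, 9, 0, 2), #duration(0, 0, 0, 1)},
            {"2.4", "Data Source", #datetime(2023, 2, 17, 9, 0, 3), #duration(0, 0, 0, 4)}
        }
    ),
    // One row per Id and Category: summed exclusive duration plus the earliest start time
    Grouped = Table.Group(
        Diagnostics,
        {"Id", "Category"},
        {
            {"Exclusive Duration (s)",
                each List.Sum(List.Transform([Exclusive Duration], Duration.TotalSeconds)),
                type number},
            {"Earliest Start Time", each List.Min([Start Time]), type datetime}
        }
    ),
    // Sort ascending so rows appear in the order things happened
    Sorted = Table.Sort(Grouped, {{"Earliest Start Time", Order.Ascending}})
in
    Sorted
```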
While your exact needs might differ, this combination of charts is a good place to start
for looking at numerous diagnostics files and for a number of purposes.
Asking how the time is spent is easy, and will be similar for most connectors. A warning
with query diagnostics, as mentioned elsewhere, is that you'll see drastically different
capabilities depending on the connector. For example, many ODBC based connectors
won't have an accurate recording of what query is sent to the actual back-end system,
as Power Query only sees what it sends to the ODBC driver.
If we want to see how the time is spent, we can just look at the visualizations we built
above.
Now, because the time values for the sample queries we're using here are so small, if we
want to work with how Power BI reports time, it's better to convert the Exclusive
Duration column to 'Seconds' in the Power Query editor. Once we do this
conversion, we can look at our chart and get a decent idea of where time is spent.
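As a minimal sketch of that conversion in M, assuming the Exclusive Duration column holds duration values: the small inline table below is a hypothetical stand-in for a recorded diagnostics table, and its row values are made up for illustration.

```powerquery
let
    // Hypothetical stand-in for a recorded diagnostics table; real traces have many more columns.
    Diagnostics = #table(
        type table [Id = text, Category = text, #"Exclusive Duration" = duration],
        {
            {"2.3", "Data Source", #duration(0, 0, 0, 0.045)},
            {"2.3", "Evaluation", #duration(0, 0, 0, 0.012)}
        }
    ),
    // Replace each duration with its total length in seconds (a plain number).
    InSeconds = Table.TransformColumns(
        Diagnostics,
        {{"Exclusive Duration", Duration.TotalSeconds, type number}}
    )
in
    InSeconds
```

With the column stored as a number of seconds, it can be used directly as the 'Value' of the chart.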
For my OData results, I see in the image that the vast majority of the time was spent
retrieving the data from source—if I select the 'Data Source' item on the legend, it
shows me all of the different operations related to sending a query to the Data Source.
If we perform all the same operations and build similar visualizations, but with the SQL
traces instead of the OData ones, we can see how the two data sources compare!
If we select the Data Source table, like with the OData diagnostics we can see the first
evaluation (2.3 in this image) emits metadata queries, with the second evaluation
actually retrieving the data we care about. Because we're retrieving small amounts of
data in this case, the data pulled back takes a small amount of time (less than a tenth of
a second for the entire second evaluation to happen, with less than a twentieth of a
second for data retrieval itself), but that won't be true in all cases.
As above, we can select the 'Data Source' category on the legend to see the emitted
queries.
When you're looking at this, if it seems like time spent is strange—for example, on the
OData query you might see that there's a Data Source Query with the following value:
Request:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle%20eq%20%27Sales%20Representative%27&$select=CustomerID%2CCountry HTTP/1.1
Content-Type: application/json;odata.metadata=minimal;q=1.0,application/json;odata=minimalmetadata;q=0.9,application/atomsvc+xml;q=0.8,application/atom+xml;q=0.8,application/xml;q=0.7,text/plain;q=0.7
<Content placeholder>
Response:
Content-Type: application/json;odata.metadata=minimal;q=1.0,application/json;odata=minimalmetadata;q=0.9,application/atomsvc+xml;q=0.8,application/atom+xml;q=0.8,application/xml;q=0.7,text/plain;q=0.7
Content-Length: 435
<Content placeholder>
This Data Source Query is associated with an operation that only takes up, say, 1% of the
Exclusive Duration. Meanwhile, there's a similar one:
Request:
GET https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales Representative'&$select=CustomerID%2CCountry HTTP/1.1
Response:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales Representative'&$select=CustomerID%2CCountry HTTP/1.1 200 OK
This Data Source Query is associated with an operation that takes up nearly 75% of the
Exclusive Duration. If you turn on the Path, you discover the latter is actually a child of
the former. This means that the first query basically added a small amount of time on its
own, with the actual data retrieval being tracked by the 'inner' query.
These are extreme values, but they're within the bounds of what might be seen.
Understanding folding with Query Diagnostics
Article • 03/31/2023
One of the most common reasons to use Query Diagnostics is to have a better
understanding of what operations were 'pushed down' by Power Query to be performed
by the back-end data source, which is also known as 'folding'. If we want to see what
folded, we can look at what is the 'most specific' query, or queries, that get sent to the
back-end data source. We can look at this for both OData and SQL.
The operation that was described in the article on Recording Diagnostics does
essentially four things:
Connects to the data source
Gets the Customers table
Filters ContactTitle to 'Sales Representative'
Groups by Country and counts the rows
Since the OData connector doesn't currently support folding COUNT() to the endpoint,
and since this endpoint is somewhat limited in its operations as well, we don't expect
that final step to fold. On the other hand, filtering is relatively trivial. This is exactly what
we see if we look at the most specific query emitted above:
Request:
GET https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales Representative'&$select=CustomerID%2CCountry HTTP/1.1
Response:
https://services.odata.org/V4/Northwind/Northwind.svc/Customers?$filter=ContactTitle eq 'Sales Representative'&$select=CustomerID%2CCountry HTTP/1.1 200 OK
We can see we're filtering the table for ContactTitle equaling 'Sales Representative', and
we're only returning two columns: CustomerID and Country. Country, of course, is
needed for the grouping operation, which must be performed locally because it isn't
being performed by the OData endpoint. So the filter and the column selection fold
here, while the grouping and count don't.
Similarly, if we look at the specific and final query emitted in the SQL diagnostics, we see
something slightly different:
select
    count(1) as [Count]
from
(
    select [_].[Country]
    from [dbo].[Customers] as [_]
    where [_].[ContactTitle] = 'Sales Representative' and [_].[ContactTitle] is not null
) as [rows]
group by [Country]
Here, we can see that Power Query creates a subselection where ContactTitle is filtered
to 'Sales Representative', then groups by Country on this subselection. All of our
operations folded.
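For orientation, the steps whose folding is being examined might look like the following in M. This is only a sketch: the inline Customers table is a hypothetical in-memory stand-in (so nothing actually folds when it runs), and the step names are illustrative rather than taken from the recorded query.

```powerquery
let
    // Hypothetical in-memory sample standing in for the Northwind Customers table.
    Customers = #table(
        {"CustomerID", "ContactTitle", "Country"},
        {
            {"ALFKI", "Sales Representative", "Germany"},
            {"ANATR", "Owner", "Mexico"},
            {"AROUT", "Sales Representative", "UK"},
            {"BLAUS", "Sales Representative", "Germany"}
        }
    ),
    // Filter step: appears as $filter in the OData request and as the WHERE clause in SQL.
    Filtered = Table.SelectRows(Customers, each [ContactTitle] = "Sales Representative"),
    // Group-and-count step: appears as GROUP BY / count(1) in SQL,
    // but is evaluated locally for the OData endpoint.
    Grouped = Table.Group(Filtered, {"Country"}, {{"Count", each Table.RowCount(_), Int64.Type}})
in
    Grouped
```

For this sample, the result has one row per country (Germany with a count of 2, UK with a count of 1). Against a real source, comparing steps like these with the most specific Data Source Query in the diagnostics shows which of them were pushed down.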
Using Query Diagnostics, we can examine what kind of operations folded--in the future,
we hope to make this capability easier to use.
Why does my query run multiple times?
Article • 08/31/2022
When refreshing in Power Query, there's a lot done behind the scenes to attempt to
give you a smooth user experience, and to execute your queries efficiently and securely.
However, in some cases you might notice that multiple data source requests are being
triggered by Power Query when data is refreshed. Sometimes these requests are normal,
but other times they can be prevented.
Connector design
Connectors can make multiple calls to a data source for various reasons, including
metadata, caching of results, pagination, and so on. This behavior is normal and is
designed to work that way.
In a desktop environment, a single refresh of all the tables in the data model is run using
a single shared cache. Caching can reduce the likelihood of multiple requests to the
same data source, since one query can benefit from the same request having already
been run and cached for a different query. Even here, though, you can get multiple
requests either because the data source isn't cached (for example, local CSV files), the
request to the data source is different than a request that was already cached because
of downstream operations (which can alter folding), the cache is too small (which is
relatively unlikely), or because the queries are running at roughly the same time.
In a cloud environment, each query is refreshed using its own separate cache, so a query
can’t benefit from the same request having already been cached for a different query.
Folding
Sometimes Power Query’s folding layer may generate multiple requests to a data
source, based on the operations being performed downstream. In such cases, you might
avoid multiple requests by using Table.Buffer . More information: Buffer your table
In this example, you’ll have only a single M evaluation that happens when you refresh
the Power Query editor preview. If the duplicate requests occur at this point, then
they’re somehow inherent in the way the query is authored. If not, and if you enable the
settings above one-by-one, you can then observe at what point the duplicate requests
start occurring.
1. In the Power Query editor formula bar, select the fx button to add a new step.
2. In the formula bar, surround the name of the previous step with
Table.Buffer(<previous step name goes here>). For example, if the previous step was
named Source , the formula bar will display = Source . Edit the step in the formula
bar to say = Table.Buffer(Source) .
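The edit described in these steps can be sketched as a complete query. The inline Source table and the downstream steps are hypothetical; in practice Source would be a step that reads from a data source.

```powerquery
let
    // Hypothetical source; in practice this would be a step that reads from a data source.
    Source = #table({"OrderID", "Units"}, {{1, 10}, {2, 25}, {3, 5}}),
    // Same edit as in step 2: wrap the previous step name in Table.Buffer.
    Buffered = Table.Buffer(Source),
    // Both downstream steps read from the buffered in-memory copy.
    Total = List.Sum(Buffered[Units]),
    WithShare = Table.AddColumn(Buffered, "Share", each [Units] / Total, type number)
in
    WithShare
```

Keep in mind that buffering loads the whole table into memory and that steps applied after Table.Buffer are no longer folded to the data source, so it's usually worth placing the buffer after any filters that can fold.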
A parameter serves as a way to easily store and manage a value that can be reused.
Parameters give you the flexibility to dynamically change the output of your queries
depending on their value, and can be used for:
Changing the argument values for particular transforms and data source functions.
Inputs in custom functions.
You can easily manage your parameters inside the Manage Parameters window. To get
to the Manage Parameters window, select the Manage Parameters option from the
Manage Parameters dropdown menu in the Home tab.
Creating a parameter
Power Query provides two easy ways to create parameters:
From an existing query: Right-click a query whose value is a simple non-structured
constant, such as a date, text, or number, and then select Convert to Parameter.
You can also convert a parameter to a query by right-clicking the parameter and
then selecting Convert To Query.
Using the Manage Parameters window: Select the New Parameter option from
the dropdown menu of Manage Parameters in the Home tab. Or launch the
Manage Parameters window and select New on the top to create a parameter. Fill
in this form, and then select OK to create a new parameter.
After creating the parameter, you can always go back to the Manage Parameters
window to modify any of your parameters at any moment.
Parameter properties
A parameter stores a value that can be used for transformations in Power Query. Apart
from the name of the parameter and the value that it stores, it also has other properties
that provide metadata to it. The properties of a parameter are:
Name: Provide a name for this parameter that lets you easily recognize and
differentiate it from other parameters you might create.
Required: The checkbox indicates whether subsequent users can specify whether a
value for the parameter must be provided.
Type: Specifies the data type of the parameter. We recommend that you always
set up the data type of your parameter. To learn more about the importance of
data types, go to Data types.
Suggested Values: Provides the user with suggestions to select a value for the
Current Value from the available options:
Any value: The current value can be any manually entered value.
List of values: Provides you with a simple table-like experience so you can
define a list of suggested values that you can later select from for the Current
Value. When this option is selected, a new option called Default Value will be
made available. From here, you can select what should be the default value for
this parameter, which is the default value shown to the user when referencing
the parameter. This value isn't the same as the Current Value, which is the value
that's stored inside the parameter and can be passed as an argument in
transformations. Using the List of values provides a drop-down menu that's
displayed in the Default Value and Current Value fields, where you can pick one
of the values from the suggested list of values.
Note
You can still manually type any value that you want to pass to the
parameter. The list of suggested values only serves as simple suggestions.
Query: Uses a list query (a query whose output is a list) to provide the list of
suggested values that you can later select for the Current Value.
Step argument
To enable this feature, first go to the View tab in the Power Query editor and select the
Always allow option in the Parameters group.
For example, the following Orders table contains the OrderID, Units, and Margin fields.
In this example, create a new parameter with the name Minimum Margin with a
Decimal Number type and a Current Value of 0.2.
Go to the Orders query, and in the Margin field select the Greater Than filter option.
In the Filter Rows window, there's a button with a data type for the field selected. Select
the Parameter option from the dropdown menu for this button. From the field selection
right next to the data type button, select the parameter that you want to pass to this
argument. In this case, it's the Minimum Margin parameter.
After you select OK, your table is filtered using the Current Value for your parameter.
If you modify the Current Value of your Minimum Margin parameter to be 0.3, your
orders query gets updated immediately and shows you only the rows where the Margin
is above 30%.
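In M terms, a filter that takes its argument from a parameter is a row filter that references the parameter by name. The sketch below is self-contained for illustration: the Orders rows are made up, and Minimum Margin is defined inline as a plain value, whereas a real parameter is a separate query.

```powerquery
let
    // Hypothetical sample rows for the Orders table described above.
    Orders = #table(
        type table [OrderID = Int64.Type, Units = Int64.Type, Margin = number],
        {{125, 10, 0.15}, {777, 4, 0.25}, {999, 7, 0.35}}
    ),
    // Stand-in for the Minimum Margin parameter; a real parameter is a separate query.
    #"Minimum Margin" = 0.2,
    // The Greater Than filter with a parameter becomes a row filter that references it.
    #"Filtered Rows" = Table.SelectRows(Orders, each [Margin] > #"Minimum Margin")
in
    #"Filtered Rows"
```

Changing the value bound to Minimum Margin from 0.2 to 0.3 changes which rows pass the filter without touching the filter step itself.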
Tip
Many transformations in Power Query let you select your parameter from a
dropdown. We recommend that you always look for it and take advantage of what
parameters can offer you.
To test this new function, enter a value, such as 0.4, in the field underneath the
Minimum Margin label. Then select the Invoke button. This creates a new query with
the name Invoked Function, effectively passing the value 0.4 to be used as the
argument for the function and giving you only the rows where the margin is above 40%.
To learn more about how to create custom functions, go to Creating a custom function.
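A function of this kind could be sketched as follows, assuming it takes the minimum margin as its only argument. The function name FilterByMargin and the sample rows are hypothetical.

```powerquery
let
    // Hypothetical sample rows for the Orders table.
    Orders = #table(
        type table [OrderID = Int64.Type, Units = Int64.Type, Margin = number],
        {{125, 10, 0.15}, {777, 4, 0.25}, {999, 7, 0.45}}
    ),
    // Hypothetical function: receives the minimum margin as an argument
    // instead of reading it from the parameter.
    FilterByMargin = (MinimumMargin as number) as table =>
        Table.SelectRows(Orders, each [Margin] > MinimumMargin),
    // Equivalent of entering 0.4 and selecting Invoke.
    #"Invoked Function" = FilterByMargin(0.4)
in
    #"Invoked Function"
```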
Following the previous example, change the current value for Minimum Margin from
0.3 to 0.1. The new goal is to create a list parameter that can hold the order numbers of
the orders that you're interested in analyzing. To create the new parameter, go to
the Manage Parameters dialog and select New to create a new parameter. Fill in this new
parameter with the following information:
After defining these fields, a new grid pops up where you can enter the values that you
want to store for your parameter. In this case, those values are 125, 777, and 999.
Note
While this example uses numbers, you can also store other data types in your list,
such as text, dates, datetime, and more. More information: Data types in Power
Query
Tip
If you want to have more control over what values are used in your list parameter,
you can always create a list with constant values and convert your list query to a
parameter as showcased previously in this article.
With the new Interesting Orders list parameter in place, head back to the Orders
query. Select the auto-filter menu of the OrderID field. Select Number filters > In.
After selecting this option, a new Filter rows dialog box appears. From here, you can
select the list parameter from a drop-down menu.
Note
List parameters can work with either the In or Not in options. In lets you filter only
by the values from your list. Not in does exactly the opposite, and tries to filter your
column to get all values that are not equal to the values stored in your parameter.
After selecting OK, you'll be taken back to your query. There, your query has been
filtered using the list parameter that you've created, with the result that only the rows
where the OrderID was equal to either 125, 777, or 999 were kept.
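One way to express the In and Not in filters in M is with List.Contains, as in the sketch below. The Orders rows are hypothetical, and the list parameter is defined inline as a plain list for the sake of a self-contained example; the step the editor generates may be written differently.

```powerquery
let
    // Hypothetical sample rows for the Orders table.
    Orders = #table(
        type table [OrderID = Int64.Type, Units = Int64.Type, Margin = number],
        {{125, 10, 0.15}, {300, 2, 0.50}, {777, 4, 0.25}, {999, 7, 0.35}}
    ),
    // Stand-in for the Interesting Orders list parameter.
    #"Interesting Orders" = {125, 777, 999},
    // "In": keep rows whose OrderID appears in the list.
    KeptOrders = Table.SelectRows(Orders, each List.Contains(#"Interesting Orders", [OrderID])),
    // "Not in": keep rows whose OrderID doesn't appear in the list.
    OtherOrders = Table.SelectRows(Orders, each not List.Contains(#"Interesting Orders", [OrderID]))
in
    KeptOrders
```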
Error handling
Article • 12/17/2022
Similar to how Excel and the DAX language have an IFERROR function, Power Query has
its own syntax to test and catch errors.
As mentioned in the article on dealing with errors in Power Query, errors can appear
either at the step or cell level. This article will focus on how you can catch and manage
errors based on your own specific logic.
Note
To demonstrate this concept, this article will use an Excel Workbook as its data
source. The concepts showcased here apply to all values in Power Query and not
only the ones coming from an Excel Workbook.
The sample data source for this demonstration is an Excel Workbook with the following
table.
This table from an Excel Workbook has Excel errors such as #NULL!, #REF!, and #DIV/0!
in the Standard Rate column. When you import this table into the Power Query editor,
the following image shows how it will look.
Notice how the errors from the Excel workbook are shown with the [Error] value in
each of the cells.
In this article, you'll learn how to replace an error with another value. In addition, you'll
also learn how to catch an error and use it for your own specific logic.
In this case, the goal is to create a new Final Rate column that will use the values from
the Standard Rate column. If there are any errors, then it will use the value from the
corresponding Special Rate column.
To create a new custom column, go to the Add column menu and select Custom
column. In the Custom column window, enter the formula try [Standard Rate]
otherwise [Special Rate] . Name this new column Final Rate.
The formula above will try to evaluate the Standard Rate column and will output its
value if no errors are found. If errors are found in the Standard Rate column, then the
output will be the value defined after the otherwise statement, which in this case is the
Special Rate column.
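The same custom column can be written directly in M. In the sketch below, the account names and rates are made up, and the Raw Rate text column exists only so that error cells can be simulated without an Excel workbook; the error reason and message mimic the ones shown for Excel errors in this article.

```powerquery
let
    // Hypothetical rows; "Raw Rate" holds text so Excel-style error markers can be simulated.
    Raw = #table(
        {"Account", "Raw Rate", "Special Rate"},
        {{"US-2004", "49", 22}, {"CA-8843", "#REF!", 21}, {"PA-1274", "#DIV/0!", 24}}
    ),
    // Build a Standard Rate column whose cells contain real M errors for the marker values.
    WithErrors = Table.AddColumn(Raw, "Standard Rate", each
        if Text.StartsWith([Raw Rate], "#")
        then error Error.Record("DataFormat.Error", "Invalid cell value '" & [Raw Rate] & "'.")
        else Number.FromText([Raw Rate])),
    // The custom column formula from the text: fall back to Special Rate on error.
    FinalRate = Table.AddColumn(WithErrors, "Final Rate",
        each try [Standard Rate] otherwise [Special Rate])
in
    // Drop the helper columns so that the result contains no error cells.
    Table.SelectColumns(FinalRate, {"Account", "Special Rate", "Final Rate"})
```

For these sample rows, Final Rate is 49 for the first account and falls back to the Special Rate (21 and 24) for the two accounts with errors.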
After adding the correct data types to all columns in the table, the following image
shows how the final table looks.
Note
As an alternative approach, you can also enter the formula try [Standard Rate]
catch ()=> [Special Rate] , which is equivalent to the previous formula, but using
the catch keyword with a function that requires no parameters.
Note
The #REF! error is excluded solely for demonstration purposes. With
the concepts showcased in this article, you can target any fields of your choice from
the error record.
When you select any of the whitespace next to the error value, you get the details pane
at the bottom of the screen. The details pane contains both the error reason,
DataFormat.Error , and the error message, Invalid cell value '#REF!' :
You can only select one cell at a time, so you can effectively only see the error
components of one error value at a time. This is where you'll create a new custom
column and use the try expression.
You can expand this newly created column with record values and look at the available
fields to be expanded by selecting the icon next to the column header.
This operation will expose three new fields:
All Errors.HasError—displays whether the value from the Standard Rate column
had an error or not.
All Errors.Value—if the value from the Standard Rate column had no error, this
column will display the value from the Standard Rate column. For values with
errors this field won't be available, and during the expand operation this column
will have null values.
All Errors.Error—if the value from the Standard Rate column had an error, this
column will display the error record for the value from the Standard Rate column.
For values with no errors, this field won't be available, and during the expand
operation this column will have null values.
For further investigation, you can expand the All Errors.Error column to get the three
components of the error record:
Error reason
Error message
Error detail
After doing the expand operation, the All Errors.Error.Message field displays the specific
error message that tells you exactly what Excel error each cell has. The error message is
derived from the Error Message field of the error record.
Now with each error message in a new column, you can create a new conditional
column with the name Final Rate and the following clauses:
If the value in the All Errors.Error.Message column equals null , then the output
will be the value from the Standard Rate column.
Else, if the value in the All Errors.Error.Message column doesn't equal Invalid
cell value '#REF!'. then the output will be the value from the Special Rate
column.
Else, null.
After keeping only the Account, Standard Rate, Special Rate, and Final Rate columns,
and adding the correct data type for each column, the following image demonstrates
what the final table looks like.
try [Standard Rate] catch (r)=> if r[Message] <> "Invalid cell value '#REF!'." then
[Special Rate] else null
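To see both the record that try returns and the catch form in isolation, here is a small sketch. The Fail and Rate helpers are hypothetical and exist only to raise and handle errors shaped like the ones in this article.

```powerquery
let
    // Hypothetical helper that fails the way an imported Excel error cell would.
    Fail = (marker as text) =>
        error Error.Record("DataFormat.Error", "Invalid cell value '" & marker & "'."),
    // try without otherwise returns a record: HasError plus either Value or Error.
    Ok = try 49,
    Bad = try Fail("#REF!"),
    // catch with one parameter receives the error record, so its fields can drive the logic.
    Rate = (standard as function, special as number) =>
        try standard()
        catch (r) => if r[Message] <> "Invalid cell value '#REF!'." then special else null
in
    [
        OkHasError = Ok[HasError],                    // false
        BadReason = Bad[Error][Reason],               // "DataFormat.Error"
        BadMessage = Bad[Error][Message],             // "Invalid cell value '#REF!'."
        RefResult = Rate(() => Fail("#REF!"), 22),    // null, because #REF! is excluded
        DivResult = Rate(() => Fail("#DIV/0!"), 24),  // 24, the special rate
        GoodResult = Rate(() => 49, 22)               // 49, no error raised
    ]
```

The catch form reaches the same result as the conditional column approach without expanding the error record into separate columns.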
More resources
Understanding and working with errors in Power Query
Add a Custom column in Power Query
Add a Conditional column in Power Query
Import data from a database using native database query
Article • 09/01/2022
Power Query gives you the flexibility to import data from a wide variety of databases that
it supports. It can run native database queries, which can save you the time it takes to
build queries using the Power Query interface. This feature is especially useful for using
complex queries that already exist—and that you might not want to or know how to
rebuild using the Power Query interface.
Power Query enables you to specify your native database query in a text box under
Advanced options when connecting to a database. In the example below, you'll import
data from a SQL Server database using a native database query entered in the SQL
statement text box. The procedure is similar in all other databases with native database
query that Power Query supports.
1. Connect to a SQL Server database using Power Query. Select the SQL Server
database option in the connector selection.
2. In the SQL Server database dialog box:
a. Specify the Server and Database where you want to import data from using
native database query.
b. Under Advanced options, select the SQL statement field and paste or enter
your native database query, then select OK.
3. If this is the first time you're connecting to this server, you'll see a prompt to select
the authentication mode to connect to the database. Select an appropriate
authentication mode, and continue.
Note
If you don't have access to the data source (both Server and Database), you'll
see a prompt to request access to the server and database (if access-request
information is specified in Power BI for the data source).
4. If the connection is established, the result data is returned in the Power Query
Editor.
Shape the data as you prefer, then select Apply & Close to save the changes and
import the data.
DataWorld.Dataset dwSQL
Query folding
Query folding while using a native database query is limited to only a certain number of
Power Query connectors. For more information, go to Query folding on native queries.
If you see this message, select Edit Permission. This selection will open the Native
Database Query dialog box. You'll be given an opportunity to either run the native
database query, or cancel the query.
By default, if you run a native database query outside of the connector dialogs, you'll be
prompted each time you run a different query text to ensure that the query text that will
be executed is approved by you.
Note
Native database queries that you insert in your get data operation won't ask you
whether you want to run the query or not. They'll just run.
You can turn off the native database query security messages if the native database
query is run in either Power BI Desktop or Excel. To turn off the security messages:
1. If you're using Power BI Desktop, under the File tab, select Options and settings >
Options.
If you're using Excel, under the Data tab, select Get Data > Query Options.
4. Select OK.
You can also revoke the approval of any native database queries that you've previously
approved for a given data source in either Power BI Desktop or Excel. To revoke the
approval:
1. If you're using Power BI Desktop, under the File tab, select Options and settings >
Data source settings.
If you're using Excel, under the Data tab, select Get Data > Data Source Settings.
2. In the Data source settings dialog box, select Global permissions. Then select the
data source containing the native database queries whose approval you want to
revoke.
4. In the Edit permissions dialog box, under Native Database Queries, select Revoke
Approvals.
Create Microsoft Power Platform dataflows from queries in Microsoft Excel (Preview)
Article • 02/17/2023
You can create Microsoft Power Platform dataflows from queries in Microsoft Excel
workbooks to take advantage of cloud-powered dataflows refreshing and processing
the data at regular intervals instead of performing these operations manually in Excel.
This article walks you through how to export queries from Excel into a Power Query
template that can then be imported into a Power Platform dataflow to create a dataflow.
Note
The preview feature for creating Power Query templates from queries is
only available to Office Insiders. For more information on the Office Insider
program, go to Office Insider .
Overview
Working with large datasets or long-running queries can be cumbersome every time
you have to manually trigger a data refresh in Excel because it takes resources from your
computer to do this, and you have to wait until the computation is done to get the
latest data. Moving these data operations into a Power Platform dataflow is an effective
way to free up your computer's resources and to have the latest data easily available for
you to consume in Excel.
4. Select the Power Query template you created earlier. The dataflow name will
prepopulate with the template name provided. Once you're done with the
dataflow creation screen, select Next to view your queries from Excel in the query
editor.
5. From this point, go through the normal dataflow creation and configuration
process so you can further transform your data, set refresh schedules on the
dataflow, and any other dataflow operation possible. For more information on how
to configure and create Power Platform dataflows, go to Create and use dataflows.
See also
Create and use dataflows in Power Apps
Optimize Power Query when expanding table columns
Article • 02/17/2023
The simplicity and ease of use that allows Power BI users to quickly gather data and
generate interesting and powerful reports to make intelligent business decisions also
allows users to easily generate poorly performing queries. This often occurs when there
are two tables that are related in the way a foreign key relates SQL tables or SharePoint
lists. (For the record, this issue isn't specific to SQL or SharePoint, and occurs in many
backend data extraction scenarios, especially where schema is fluid and customizable.)
There's also nothing inherently wrong with storing data in separate tables that share a
common key—in fact this is a fundamental tenet of database design and normalization.
But it does imply a better way to expand the relationship.
This top-level data is gathered through a single HTTP call to the SharePoint API
(ignoring the metadata call), which you can see in any web debugger.
When you expand the record, you see the fields joined from the secondary table.
When expanding related rows from one table to another, the default behavior of Power
BI is to generate a call to Table.ExpandTableColumn . You can see this in the generated
formula field. Unfortunately, this method generates an individual call to the second table
for every row in the first table.
This increases the number of HTTP calls by one for each row in the primary list. This may
not seem like a lot in the above example of five or six rows, but in production systems
where SharePoint lists reach hundreds of thousands of rows, this can cause a significant
experience degradation.
When queries reach this bottleneck, the best mitigation is to avoid the call-per-row
behavior by using a classic table join. This ensures that there will be only one call to
retrieve the second table, and the rest of the expansion can occur in memory using the
common key between the two tables. The performance difference can be massive in
some cases.
First, start with the original table, noting the column you want to expand, and ensuring
you have the ID of the item so that you can match it. Typically the foreign key is named
similar to the display name of the column with Id appended. In this example, it's
LocationId.
Second, load the secondary table, making sure to include the Id, which is the foreign
key. Right-click on the Queries panel to create a new query.
Finally, join the two tables using the respective column names that match. You can
typically find this field by first expanding the column, then looking for the matching
columns in the preview.
In this example, you can see that LocationId in the primary list matches Id in the
secondary list. The UI renames this to Location.Id to make the column name unique.
Now let's use this information to merge the tables.
By right-clicking on the query panel and selecting New Query > Combine > Merge
Queries as New, you see a friendly UI to help you combine these two queries.
Select each table from the drop-down to see a preview of the query.
Once you've selected both tables, select the column that joins the tables logically (in this
example, it's LocationId from the primary table and Id from the secondary table). The
dialog will tell you how many of the rows match using that foreign key. You'll likely
want to use the default join kind (left outer) for this kind of data.
Select OK and you'll see a new query, which is the result of the join. Expanding the
record now doesn't imply additional calls to the backend.
Refreshing this data will result in only two calls to SharePoint—one for the primary list,
and one for the secondary list. The join will be performed in memory, significantly
reducing the number of calls to SharePoint.
This approach can be used for any two tables in Power Query that have a matching
foreign key.
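The merge described above can be sketched in M with Table.NestedJoin followed by an expand. The two inline tables and their values are hypothetical stand-ins for the primary and secondary SharePoint lists; with real lists, each would be its own query.

```powerquery
let
    // Hypothetical primary list; LocationId is the foreign key.
    Primary = #table(
        {"Title", "LocationId"},
        {{"Printer", 1}, {"Scanner", 2}, {"Laptop", 1}}
    ),
    // Hypothetical secondary list, loaded once and keyed by Id.
    Locations = #table({"Id", "City"}, {{1, "Seattle"}, {2, "Redmond"}}),
    // Merge the two tables on LocationId = Id using a left outer join.
    Merged = Table.NestedJoin(
        Primary, {"LocationId"}, Locations, {"Id"}, "Location", JoinKind.LeftOuter
    ),
    // Expanding the nested table column works on data that has already been retrieved.
    Expanded = Table.ExpandTableColumn(Merged, "Location", {"City"}, {"Location.City"})
in
    Expanded
```

The expand step is the same Table.ExpandTableColumn call as before, but here it operates on the result of a join between two tables that were each retrieved once, rather than on a relationship column resolved per row.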
Note
SharePoint user lists and taxonomy are also accessible as tables, and can be joined
in exactly the way described above, provided the user has adequate privileges to
access these lists.
Enabling Microsoft Edge (Chromium) for OAuth authentication in Power BI Desktop
Article • 08/31/2022
If you're using OAuth authentication to connect to your data, the OAuth dialog in Power
Query uses the Microsoft Internet Explorer 11 embedded control browser. However,
certain web services, such as QuickBooks Online, Salesforce Reports, and Salesforce
Objects no longer support Internet Explorer 11.
Note
If you are using an earlier release of Power BI, go to December 2020 Power BI
Release.
GitHub
QuickBooks Online
Salesforce Reports
Salesforce Objects
Smartsheet
Twilio
Zendesk
On your Power BI Desktop machine, you can get the WebView2 control either by installing
the new Edge (Chromium) browser (at least beta) from
https://www.microsoftedgeinsider.com/download , or by installing the WebView2
redist package.
All other connectors will use Internet Explorer 11 by default unless the settings are
overridden using environment variables.
The following table contains a list of all the connectors currently available for Power Query.
For those connectors that have a reference page in this document, a link is provided under
the connector icon and name.
The connectors are listed in alphabetical order in separate tables for each letter in the
alphabet. Use the In this article list on the right side of this article to go to any of the
alphabetized tables.
Note
The Excel column in the following table indicates all connectors that are available on at
least one version of Excel. However, not all Excel versions support all of these indicated
Power Query connectors. For a complete list of the Power Query connectors supported
by all versions of Excel, go to Power Query data sources in Excel versions .
Note
dataflows and datamarts in Premium workspaces. There's ongoing work towards a fix
and the documentation will be updated when a fix is released.
A
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Access
Database
By Microsoft
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Active
Directory
By Microsoft
Acterys
(Beta)
By Acterys
Actian (Beta)
By Actian
Adobe
Analytics
By Microsoft
Amazon
Athena
By Amazon
Amazon
OpenSearch
Service
(Beta)
By Amazon
Amazon
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Redshift
By Microsoft
Anaplan
By Anaplan
appFigures
(Beta)
By Microsoft
Asana
(Beta)
By Asana
Assemble
Views
By Autodesk
AtScale
cubes
By Microsoft
Autodesk
Construction
Cloud
By Autodesk
Automation
Anywhere
By
Automation
Anywhere
Automy Data
Analytics
(Beta)
By
ACEROYALTY
Azure
Analysis
Services
database
By Microsoft
Azure Blob
Storage
By Microsoft
Azure
Cosmos DB
By Microsoft
Azure
Cosmos
DB v2 (Beta)
By Microsoft
Azure Cost
Management
By Microsoft
Azure
Database for
PostgreSQL
By Microsoft
Azure
Databricks
By Databricks
Azure Data
Explorer
(Kusto)
By Microsoft
Azure Data
Lake
Storage Gen1
By Microsoft
Azure Data
Lake
Storage
Gen2
By Microsoft
Azure
DevOps
By Microsoft
Azure
DevOps
Server
By Microsoft
Azure
HDInsight
(HDFS)
By Microsoft
Azure
HDInsight
on AKS Trino
(Beta)
By Microsoft
Azure
HDInsight
Spark
By Microsoft
Azure
Synapse
Analytics
(SQL DW)
By Microsoft
Azure
Synapse
Analytics
workspace
(Beta)
By Microsoft
Azure SQL
database
By Microsoft
Azure Table
Storage
By Microsoft
Azure Time
Series
Insights
(Beta)
By Microsoft
B
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
BI
Connector
By Guidanz
BitSight
Security
Ratings
By BitSight
Bloomberg
Data
and
Analytics
By
Bloomberg
BQE Core
By BQE
C
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
CData
Connect
Cloud
By CData
Celonis EMS
(Beta)
By Celonis
Cherwell
(Beta)
By Cherwell
CloudBluePSA
(Beta)
By CloudBlue
PSA (Beta)
Cognite Data
Fusion
By Cognite
Common
Data
Service
(legacy)
By Microsoft
D
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Data.World -
Get Dataset
(Beta)
By Microsoft
Data
Virtuality
LDW
By Data
Virtuality
Databricks
By
Databricks
Dataflows
By Microsoft
Dataverse
By Microsoft
Delta
Sharing
By
Databricks
Denodo
By Denodo
Digital
Construction
Works
Insights
By Digital
Construction
Works
Dremio
Cloud
By Dremio
Dremio
Software
By Dremio
Dynamics
365
Business
Central
By Microsoft
Dynamics
365
Business
Central
(on-
premises)
By Microsoft
Dynamics
365
Customer
Insights
(Beta)
By Microsoft
Dynamics
365
(Dataverse)
By Microsoft
Dynamics
365
Online
(legacy)
By Microsoft
Dynamics
NAV
By Microsoft
E
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
eWay-
CRM
By eWay-
CRM
Eduframe
Reporting
(Beta)
By Drieam
Emigo
Data
Source
By Sagra
Entersoft
Business
Suite
(Beta)
By
Entersoft
EQuIS
By
EarthSoft
Essbase
By
Microsoft
Exasol
By Exasol
Excel
By
Microsoft
F
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
FactSet
Analytics
By FactSet
FactSet
RMS
(Beta)
By FactSet
FHIR
By
Microsoft
Folder
By
Microsoft
Funnel
By Funnel
G
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Github
(Beta)
By
Microsoft
Google
Analytics
By
Microsoft
Google
BigQuery
By
Microsoft
Google
BigQuery
(Azure
AD) (Beta)
By
Microsoft
Google
Sheets
By
Microsoft
H
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Hadoop
File
(HDFS)
By
Microsoft
HDInsight
Interactive
Query
By
Microsoft
Hexagon
PPM
Smart API
By
Hexagon
PPM
Hive LLAP
By
Microsoft
I
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
IBM DB2
database
By Microsoft
IBM
Informix
database
(Beta)
By Microsoft
IBM Netezza
By Microsoft
Impala
By Microsoft
Indexima
By Indexima
Industrial
App Store
By Intelligent
Plant
Information
Grid (Beta)
By Luminis
InterSystems
IRIS (Beta)
By
Intersystems
Intune Data
Warehouse
(Beta)
By Microsoft
J
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Jamf Pro
(Beta)
By Jamf
Jethro
(Beta)
By
JethroData
JSON
By
Microsoft
1
Available in dataflows for Microsoft Teams.
K
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Kognitwin
By
Kongsberg
KQL
Database
By
Microsoft
Kyligence
By
Kyligence
L
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Linkar PICK
Style/MultiValue
Databases
(Beta)
By Kosday
Solutions
LinkedIn Sales
Navigator
(Beta)
By Microsoft
M
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Marketo
(Beta)
By Microsoft
MarkLogic
By MarkLogic
MariaDB
By MariaDB
Microsoft
Azure
Consumption
Insights
(Beta)
(Deprecated)
By Microsoft
Microsoft
Exchange
By Microsoft
Microsoft
Exchange
Online
By Microsoft
Microsoft
Graph
Security
(Deprecated)
By Microsoft
Microsoft
Teams
Personal
Analytics
(Beta)
By Microsoft
MicroStrategy
for Power BI
By
MicroStrategy
Mixpanel
(Beta)
By Microsoft
MongoDB
Atlas
SQL interface
(Beta)
By MongoDB
MySQL
database
By Microsoft
O
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
OData Feed
By
Microsoft
ODBC
By
Microsoft
OLE DB
By
Microsoft
OpenSearch
Project
(Beta)
By
OpenSearch
Oracle
database
By
Microsoft
P
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Parquet
By Microsoft
Palantir
Foundry
By Palantir
Paxata
By Paxata
PDF
By Microsoft
Planview
Enterprise
Architecture
By Planview
Planview
IdeaPlace
By Planview
Planview
ORK
(Beta)
Planview
Portfolios
By Planview
Planview
Projectplace
By Planview
PostgreSQL
database
By Microsoft
Power BI
datasets
By Microsoft
Product
Insights
(Beta)
By Microsoft
Profisee
By Profisee
Python
Script
By Microsoft
Q
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
QubolePresto
(Beta)
By Qubole
Quickbase
By Quick Base
Quickbooks
Online
(Beta)
By Microsoft
R
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
R Script
By
Microsoft
Roamler
(Beta)
By
Roamler
S
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Salesforce
Objects
By Microsoft
Salesforce
Reports
By Microsoft
SAP Business
Warehouse
Application
Server
By Microsoft
SAP Business
Warehouse
Message
Server
By Microsoft
SAP HANA
database
By Microsoft
SIS-CC SDMX
(Beta)
By SIS-CC
SharePoint
folder
By Microsoft
SharePoint list
By Microsoft
SharePoint
Online
list
By Microsoft
Shortcuts
Business
Insights (Beta)
By Shortcuts
SingleStore
(Beta)
By SingleStore
SiteImprove
By SiteImprove
Smartsheet
By Microsoft
Snowflake
By Microsoft
Socialbakers
Metrics (Beta)
By Emplifi
SoftOneBI
(Beta)
By SoftOne
SolarWinds
Service Desk
(Beta)
By SolarWinds
Solver
By BI360
Spark
By Microsoft
SparkPost
(Beta)
By Microsoft
SQL Server
Analysis
Services
database
By Microsoft
SQL Server
database
By Microsoft
Starburst
Enterprise
By Starburst
Data
SumTotal
By SumTotal
SurveyMonkey
By
SurveyMonkey
SweetIQ (Beta)
By Microsoft
Sybase
Database
By Microsoft
1
Available in dataflows for Microsoft Teams.
T
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
TeamDesk
(Beta)
By ForeSoft
Tenforce
(Smart)List
By Tenforce
Teradata
database
By Microsoft
Text/CSV
By Microsoft
TIBCO(R)
Data
Virtualization
By TIBCO
Twilio
(Deprecated)
(Beta)
By Microsoft
Usercube
(Beta)
By
Usercube
V
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Vena
By Vena
Vertica
By
Microsoft
Vessel
Insight
By
Kongsberg
Viva
Insights
By
Microsoft
W
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Web
By Microsoft
Webtrends
Analytics
(Beta)
By Microsoft
Witivio
(Beta)
By Witivio
Workforce
Dimensions
(Beta)
(Deprecated)
By Kronos
X
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
XML
By
Microsoft
Z
Connector Excel Power BI Power BI Fabric Power Customer Analysis
(Datasets) (Dataflows) (Dataflow Apps Insights Services
Gen2) (Dataflows) (Dataflows)
Zendesk
(Beta)
By
Microsoft
Zoho
Creator
By Zoho
Zucchetti
HR
Infinity
(Beta)
By
Zucchetti
Next steps
Power BI data sources (datasets)
Connect to data sources for Power BI dataflows
Available data sources (Dynamics 365 Customer Insights)
Data sources supported in Azure Analysis Services
Access database
Article • 07/18/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
If you're connecting to an Access database from Power Query Online, the system that
contains the on-premises data gateway must have the 64-bit version of the Access
Database Engine 2016 OLEDB provider installed.
If you're loading an Access database to Power BI Desktop, the versions of the Access
Database Engine 2016 OLEDB provider and Power BI Desktop on that machine must
match (that is, either 32-bit or 64-bit). For more information, go to Import Access
database to Power BI Desktop.
Capabilities Supported
Import
2. Browse for and select the Access database you want to load. Then select Open.
If the Access database is online, use the Web connector to connect to the
database.
3. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
2. In the Access database dialog that appears, provide the path to the Access
database.
You must select an on-premises data gateway for this connector, whether the
Access database is on your local network or on a web site.
5. Select the type of credentials for the connection to the Access database in
Authentication kind.
8. In Navigator, select the data you require, and then select Transform data to
continue transforming the data in Power Query Editor.
Troubleshooting
Note
Microsoft Office has stopped supporting the Access Database Engine 2010 OLEDB
provider as part of end-of-life for Office 2010. However, some legacy use cases,
such as using 32-bit Office and 64-bit Power BI Desktop, might require the continued
use of the older 2010 version. In these cases, you can still download the 2010 version
from the following location:
Desktop. This error can be caused by using mismatched bit versions of Power BI
Desktop and the Access Database Engine 2016 OLEDB provider. For more information
about how you can fix this mismatch, go to Troubleshoot importing Access and Excel .xls
files in Power BI Desktop.
Active Directory
Article • 07/18/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
Prerequisites
To connect to Active Directory in Power BI (Dataflows), you'll need an on-premises
data gateway.
3. You can choose to use your current Windows credentials or enter alternate
credentials. Then select Connect.
Tip
You may need to add the domain suffix to your username. For example:
domain\username.
4. In Navigator, review and/or select data from your database. Then select OK.
Note
The navigator doesn't load data for the configuration partition. More
information: Limitations and issues
Summary
Item Description
Prerequisites
Before you can sign in to Adobe Analytics, you must have an Adobe Analytics account
(username/password).
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Adobe Analytics, and then select Connect.
2. If this is the first time you're getting data through the Adobe Analytics connector, a
third-party notice will be displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
Once the connection is established, you can preview and select multiple dimensions and
measures within the Navigator dialog box to create a single tabular output.
You can also provide any optional input parameters required for the selected items. For
more information about these parameters, see Optional input parameters.
You can Load the selected table, which brings the entire table into Power BI Desktop, or
you can select Transform Data to edit the query, which opens Power Query Editor. You
can then filter and refine the set of data you want to use, and then load that refined set
of data into Power BI Desktop.
Date Range—filter with a reporting range between a start date and an end date
that you set.
Segment—filter the data based on all segments contained in the data, or only
those segments you select. To change the list of segments, select the ellipsis to the
right of the Segment list box, then choose the segments you want. By default, all
segments are included in the data.
Top—filter the data based on the top items for the dimension. You can enter a
value in the Top text box, or select the ellipsis next to the text box to select some
default values. By default, all items are selected.
Adobe Analytics has a built-in limit of 50,000 rows returned per API call.
If the number of API calls exceeds four per second, a warning will be issued. If the
number exceeds five per second, an error message will be returned. For more
information about these limits and the associated messages, see Web Services
Error Codes .
The default rate limit for an Adobe Analytics Company is 120 requests per minute
per user (the limit is enforced as 12 requests every 6 seconds).
Import from Adobe Analytics will stop and display an error message whenever the
Adobe Analytics connector hits any of the API limits listed above.
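The "12 requests every 6 seconds" enforcement described above can be pictured as a sliding window. The following is a minimal Python sketch of that arithmetic only. It isn't part of the connector or of any Adobe API, and the class and method names are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `window_seconds` span.

    Defaults mirror the documented limit of 12 requests every 6 seconds.
    The class itself is hypothetical and only illustrates the arithmetic.
    """

    def __init__(self, max_calls=12, window_seconds=6.0):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._stamps = deque()  # timestamps of calls still inside the window

    def wait_time(self, now):
        """Return how many seconds to wait before a call at `now` is allowed."""
        # Discard timestamps that have already left the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        # The window is full: wait until the oldest call ages out.
        return self.window_seconds - (now - self._stamps[0])

    def record(self, now):
        """Register a call made at time `now` (seconds on any monotonic clock)."""
        self._stamps.append(now)
```

A caller would check `wait_time` before each request and call `record` after sending it. Spacing requests this way keeps a client under the per-user limit instead of relying on the import stopping with an error.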
When accessing your data using the Adobe Analytics connector, follow the guidelines
provided under the Best Practices heading.
For additional guidelines on accessing Adobe Analytics data, see Recommended usage
guidelines .
Next steps
You may also find the following Adobe Analytics information useful:
Note
The following connector article is provided by Amazon, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Amazon website and use the support
channels there.
Summary
Item Description
Prerequisites
An Amazon Web Services (AWS) account
Permissions to use Athena
Customers must install the Amazon Athena ODBC driver before using the
connector
Capabilities supported
Import
DirectQuery (Power BI Datasets)
For DSN, enter the name of the ODBC DSN that you want to use. For
instructions on configuring your DSN, go to the ODBC driver
documentation .
For Data Connectivity mode, choose a mode that's appropriate for your use
case, following these general guidelines:
For smaller datasets, choose Import. When using import mode, Power BI
works with Athena to import the contents of the entire dataset for use in
your visualizations.
For larger datasets, choose DirectQuery. In DirectQuery mode, no data is
downloaded to your workstation. While you create or interact with a
visualization, Microsoft Power BI works with Athena to dynamically query
the underlying data source so that you're always viewing current data.
More information: Use DirectQuery in Power BI Desktop
6. Select OK.
7. At the prompt to configure data source authentication, select either Use Data
Source Configuration or AAD Authentication. Enter any required sign-in
information. Then select Connect.
Your data catalog, databases, and tables appear in the Navigator dialog box.
8. In the Display Options pane, select the check box for the dataset that you want to
use.
9. If you want to transform the dataset before you import it, go to the bottom of the
dialog box and select Transform Data. This selection opens the Power Query Editor
so that you can filter and refine the set of data you want to use.
10. Otherwise, select Load. After the load is complete, you can create visualizations like
the one in the following image. If you selected DirectQuery, Power BI issues a
query to Athena for the visualization that you requested.
Amazon OpenSearch Service (Beta)
Article • 07/18/2023
Note
The following connector article is provided by Amazon, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the OpenSearch website and use the support
channels there.
Summary
Item Description
Prerequisites
Microsoft Power BI Desktop
OpenSearch
OpenSearch SQL ODBC driver
Capabilities supported
Import
DirectQuery (Power BI Datasets)
4. Enter host and port values and select your preferred SSL option. Then select OK.
7. Select Load.
Troubleshooting
If you get an error indicating the driver wasn't installed, install the OpenSearch SQL
ODBC Driver .
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Amazon Web Services (AWS) account
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Provider name
Batch size
SQL statement
Connect to Amazon Redshift data from Power Query Desktop
To connect to Amazon Redshift data:
2. In Server, enter the server name where your data is located. As part of the Server
field, you can also specify a port in the following format: ServerURL:Port. In
Database, enter the name of the Amazon Redshift database you want to access. In
this example, contoso.redshift.amazonaws.com:5439 is the server name and port
number, dev is the database name, and Data Connectivity mode is set to Import.
You can also choose some optional advanced options for your connection. More
information: Connect using advanced options
After you have finished filling in and selecting all the options you need, select OK.
3. If this is the first time you're connecting to this database, enter your credentials in
the User name and Password boxes of the Amazon Redshift authentication type.
Then select Connect.
More information: Authentication with a data source
4. Once you successfully connect, a Navigator window appears and displays the data
available on the server. Choose one or more of the elements you want to import.
5. Once you've selected the elements you want, either select Load to load the
data or Transform Data to continue transforming the data in Power Query Editor.
6. Select either the Import or DirectQuery data connectivity mode, and then select
OK.
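The ServerURL:Port format used in the Server field can be illustrated with a short Python sketch. The helper name is hypothetical and isn't part of Power Query. Falling back to 5439 when no port is given is an assumption made for this example, based on that being the usual Amazon Redshift port.

```python
def split_server_field(server, default_port=5439):
    """Split a 'ServerURL:Port' string into (host, port).

    Hypothetical helper for illustration. If no port is present, the
    assumed default port is returned instead.
    """
    host, sep, port_text = server.strip().rpartition(":")
    if not sep:
        # No colon found: rpartition puts the whole string in the last slot.
        return port_text, default_port
    if not host or not port_text.isdigit():
        raise ValueError(f"expected ServerURL:Port, got {server!r}")
    return host, int(port_text)
```

With the example value from the steps above, `split_server_field("contoso.redshift.amazonaws.com:5439")` yields the host name `contoso.redshift.amazonaws.com` and the port number `5439`.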
1. Select the Amazon Redshift option in the Power Query - Choose data source
page.
2. In Server, enter the server name where your data is located. As part of the Server
field, you can also specify a port in the following format: ServerURL:Port. In
Database, enter the name of the Amazon Redshift database you want to access. In
this example, contoso.redshift.amazonaws.com:5439 is the server name and port
number, and dev is the database name.
You can also choose some optional advanced options for your connection. More
information: Connect using advanced options
4. Select the type of authentication you want to use in Authentication kind, and then
enter your credentials.
5. Select or clear Use Encrypted Connection depending on whether you want to use
an encrypted connection or not.
7. In Navigator, select the data you require, and then select Transform data. This
selection opens the Power Query Editor so that you can filter and refine the set of
data you want to use.
The following table describes all of the advanced options you can set in Power Query.
Advanced option    Description
Provider Name      Provides an Amazon Resource Name (ARN), which uniquely identifies AWS resources.
Batch size         Specifies the maximum number of rows to retrieve at a time from the server when fetching data. A small number translates into more calls to the server when retrieving a large dataset. A large number of rows may improve performance, but could cause high memory usage. The default value is 100 rows.
SQL Statement      For information, go to Import data from a database using native database query. This option is only available in Power BI Desktop.
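To make the Batch size trade-off concrete, the following Python sketch estimates how many fetch calls a given row count needs. It assumes one server call per batch, which is a simplification, and the function name is hypothetical.

```python
def fetch_round_trips(total_rows, batch_size=100):
    """Estimate server calls needed to retrieve `total_rows` rows.

    Assumes exactly one call per batch (a simplification for illustration).
    The default of 100 matches the documented default batch size.
    """
    if total_rows < 0 or batch_size <= 0:
        raise ValueError("total_rows must be >= 0 and batch_size must be > 0")
    # Integer ceiling division: a partial final batch still costs one call.
    return (total_rows + batch_size - 1) // batch_size
```

Under this assumption, one million rows take 10,000 calls at the default batch size of 100 and 100 calls at a batch size of 10,000. The larger batch trades fewer calls for more rows held in memory at a time.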
4. Under the Data Source Settings tab, enter a value in Provider Name. The Provider
Name parameter is required when using Azure AD and needs to be specified in
Advanced settings.
Note
The following connector article is provided by Anaplan, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Anaplan website and use the support
channels there.
Summary
Item Description
Prerequisites
There are system requirements to verify before you install Microsoft Power BI
Desktop.
Capabilities supported
The Anaplan Connector enables access to your Anaplan model exports. It also:
Get data
To access your saved export from Power BI Desktop:
1. Select Get data from the ribbon menu, which displays the Common data sources
menu.
3. In Get Data, enter Anaplan into the search field, which displays the Anaplan
Connector.
5. Select Connect.
The connector uses either basic authentication (user ID, password) or the Anaplan-
configured IDP for logging into Anaplan. To use the second method, you must be
designated as an Anaplan Single-Sign On (SSO) user in your Anaplan model. You can set
your own configuration.
a. If you choose Basic auth, enter the following URLs, and then select OK.
b. If you prefer the Anaplan configured IDP for logging into Anaplan, enter the
following URLs, and then select OK.
c. Select OK.
2. From the next Anaplan dialog, choose either Basic or Organizational account
(which triggers Anaplan-configured IDP).
Authenticate
You've chosen either basic authentication or Anaplan-configured IDP.
b. Select Connect.
Note
Only exports that output .csv and .txt files are supported.
If you don't see the export action in the Power BI connector, check your model role
and the export actions in your model.
To run an export action, use the Navigator dialog to locate your export.
2. Check the box next to ƒx Run Export Action to select your export.
When you select ƒx Run Export Action, this selection doesn't trigger the
export run. Instead this selection downloads the last version of the exported
Anaplan data for preview.
A preview displays in the right panel. If the Anaplan export is set to Admins
only, model users might see a blank preview, but the export will run as
normal.
You'll see the preview the next time you set an integration with the same
export.
3. Select Load, which starts the export. The Load dialog displays.
More information: Create reports in Power BI. You need a report to begin.
To publish a report to Power BI service, select Publish from the Power BI Desktop report
dialog.
The report is now in Power BI service. Sign in to Power BI service to see the report.
First, create a report in the Power BI Desktop. More information: Create reports in Power
BI.
2. Select from the Data Source Type and Data Source Information dropdowns.
3. Select Apply.
If your scheduled refresh frequency is more than 15 days, you must reenter your
sign-in credentials before the end of the 15th day. If you don't, you need to
authenticate anew.
We recommend scheduling refreshes less than 15 days apart.
Apart from data refreshes, you need to reenter your sign-in credentials every 90
days.
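The two time limits above can be turned into calendar dates with a small Python sketch. It assumes both periods are counted from the date of the last sign-in, which the text doesn't state explicitly. The function and constant names are hypothetical.

```python
from datetime import date, timedelta

# Limits taken from the text above. Counting both from the last sign-in
# date is an assumption made for this illustration.
REFRESH_GAP_LIMIT_DAYS = 15
CREDENTIAL_LIFETIME_DAYS = 90


def reauthentication_deadlines(last_sign_in):
    """Return the dates by which credentials would need to be reentered."""
    return {
        # Applies when scheduled refreshes are more than 15 days apart.
        "refresh_gap": last_sign_in + timedelta(days=REFRESH_GAP_LIMIT_DAYS),
        # Applies regardless of how often data is refreshed.
        "credentials": last_sign_in + timedelta(days=CREDENTIAL_LIFETIME_DAYS),
    }
```

For a sign-in on July 1, 2023, this sketch gives July 16, 2023 for the refresh-gap limit and September 29, 2023 for the 90-day limit.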
If you do get an error message, select the refresh icon. This refresh resolves the error in
most cases.
2. Select File.
7. Select Delete.
These steps remove expired Anaplan API tokens. You must reauthenticate to proceed.
To resolve this error, select either Close & Apply or Refresh Preview.
2. Select More.
The first 1,000 rows of data then display.
Return to your Anaplan model, decrease the size of your file, and try again.
Assemble Views
Article • 07/13/2023
Summary
Item Description
Prerequisites
To use the Assemble Views connector, you must have an Autodesk account with a
username and password, and be a member of at least one project in Assemble.
You'll also need at least one view associated with the Assemble project.
Capabilities supported
Import
1. Select Assemble Views from the Get Data experience under the Online Services
category, and then select Connect.
2. In Assemble Views, enter your site's URL to sign in. For example, use
https://example.tryassemble.com.
a. (Optional) Select a date from which you want to load the data. Leaving this entry
blank results in the latest data being pulled each time you refresh.
Models (New!) - fetches select properties from any or all versions of a model.
Active version only – Loads only the active version of the model.
All versions – Loads all versions of the model.
All except active version – Loads all previous versions of the model without
loading the active version (intended for advanced workflows when previous
version data only needs to be loaded once and not included in a refresh).
Specific versions – Loads all specific versions of the model that are selected
(specific versions will be selected in the Version Name and Number drop
down).
3. Select the properties you want to fetch by using the search filter or scrolling. By
default, Model ID, Model Name, Version ID, Version Name, and Version Number
will always be included in the result.
Note
When using the search bar, be sure to clear the selection and select all
properties before selecting OK, or previous selections will be overwritten.
4. If using "Specific versions", select the versions you want to load in the Version
Name and Number (optional) dropdown, then select Apply. Selections in this
dropdown will be ignored if any of the other "Load model data" settings are
selected.
5. Once the data preview has been displayed, you can either select Transform Data to
go to the Power Query editor, or Load to go straight to the dashboard.
6. If you want to load multiple models at once, be sure to select Apply after setting
up each model per the aforementioned steps.
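The four version-loading settings described in the steps above behave like a filter over a model's versions. The following Python sketch models that behavior under assumed data shapes. The function, the mode strings, and the dictionary keys are all hypothetical and aren't part of the Assemble Views connector.

```python
def select_versions(versions, mode, chosen_numbers=()):
    """Return the version numbers a given load setting would include.

    `versions` is a list of dicts such as {"number": 3, "active": True}.
    The data shape and mode names are hypothetical, for illustration only.
    """
    if mode == "active":
        picked = [v for v in versions if v["active"]]
    elif mode == "all":
        picked = list(versions)
    elif mode == "all_except_active":
        picked = [v for v in versions if not v["active"]]
    elif mode == "specific":
        # Only this mode consults the explicit selection; the other modes
        # ignore it, matching the behavior described for the dropdown.
        wanted = set(chosen_numbers)
        picked = [v for v in versions if v["number"] in wanted]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [v["number"] for v in picked]
```

For a model with versions 1 and 2 inactive and version 3 active, `"all_except_active"` would return versions 1 and 2, which is the set that only needs to be loaded once in the advanced workflow mentioned above.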
Loading data from Views
1. Expand the Views folder. Select the view you want to include. Additionally select
[Your Project] View Thumbnails if you want to include images in your report.
Select Transform Data to continue to Power Query.
2. In Power Query, you'll see a single column named Rows. On the header of the
column, select the button with two arrows pointing in opposite directions to
expand your rows.
a. Uncheck Use original column name as prefix and select OK for each view data
query you've selected.
b. Select Close & Apply to load the datasets.
3. (Optional) If you have chosen to load images, you'll need to update the Data
category for the image field.
a. Expand the [Your Project] View Thumbnails table, and then select the Image
field. This selection opens the Column tools tab.
b. Open the Data category drop-down and select Image URL. You can now drag
and drop the Image field into your report visuals.
Known issues and limitations
Views with more than 100,000 rows may not load depending on the number of
fields included in the view. To avoid this limitation, we suggest breaking large
views into multiple smaller views and appending the queries in your report, or
creating relationships in your data model.
The view images feature currently only supports thumbnail sized images because
of a row size limitation in Power BI.
When creating a query using Models data, a maximum of 200 properties can be
selected.
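The append workaround for large views can be sketched in Power Query M as follows. This is a minimal illustration, assuming a large view has been split into two smaller views with identical columns. The names ViewPart1 and ViewPart2 and the sample columns are hypothetical stand-ins for your expanded view queries.

```powerquery
let
    // Hypothetical stand-ins for two smaller views that share the same columns.
    ViewPart1 = #table(
        type table [ElementId = text, Level = text],
        {{"A1", "Level 1"}, {"A2", "Level 1"}}
    ),
    ViewPart2 = #table(
        type table [ElementId = text, Level = text],
        {{"B1", "Level 2"}}
    ),
    // Append the two queries into one table, as Append Queries does in the editor.
    Combined = Table.Combine({ViewPart1, ViewPart2})
in
    Combined
```

In a real report, each part would be one of the view queries loaded from the Views folder rather than an inline table.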
Autodesk Construction Cloud
Article • 07/13/2023
Summary
Item Description
Prerequisites
To use the Autodesk Construction Cloud connector, you must have an Autodesk account
with a username and password and have access to the Executive Overview in a BIM360
or an ACC Account. You also need to run a Data Connector extraction manually or have
the extractions scheduled to run in order to use this connector. The connector pulls
from the most recently run extract.
Capabilities Supported
Import
Supports US and EU Autodesk Construction Cloud servers
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Autodesk Construction Cloud, and then
select Connect.
2. If this is the first time you're getting data through the Autodesk Construction
Cloud connector, a preview connector notice will be displayed. Select Don't warn
me again with this connector if you don't want this message to be displayed
again, and then select Continue.
3. Select the Autodesk server region (US or EU), and then select OK.
5. In the Autodesk window that appears, provide your credentials to sign in to your
Autodesk account.
6. Once you've successfully signed in, select Connect.
7. In Navigator, expand the Account you want to pull data from, which will display an
Account Extract folder and a Project Extracts folder. Account Extract will contain
the data extract of the most recent account level extract if you have proper access
and have run an account-level data extract. Project Extracts will contain a listing of
each project in the account you have access to, which you can then expand to see
the relevant tables in that specific project's extracts if you have run a data
extraction.
8. Once you navigate to the desired Account or Project extract, select the desired
tables, and then either select Load to load the data or Transform Data to continue
transforming the data in the Power Query editor.
Connect using Autodesk provided Power BI
Templates
Download the latest Power BI Templates here:
https://construction.autodesk.com/templates/power-bi/.
Only templates with "...(Connector).." in the file name are set up to work with this
connector.
2. Provide your ACC Account name and select the server region.
Note
The Account Name is the name of the ACC account you want to connect to,
not your user account name. You can find the Account name on the Account
Admin portal just to the right of the Account Admin drop down or under
Settings. The Account name also appears on the Insight (Account) page just
to the right of the Insight dropdown in the upper left hand corner.
Summary
Item Description
Prerequisites
Before you can sign in to Automy Data Analytics, you must have an Automy Report
Token.
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Automy Data Analytics, and then select
Connect.
2. If this is the first time you're connecting to the Automy Data Analytics connector, a
third-party notice will be displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
3. Sign in to the connector with API Key to verify your access to Automy.
Once you've succeeded, select Connect.
4. In the Automy Data Analytics window that appears, select the correct parameters
to prepare the connection. Select the type of report and data type, complete
the token information, and then select OK.
Note
You can generate an authentication token for reports using the configuration
option in Automy.
5. In the Navigator dialog box, select the Automy tables you want. You can then
either load or transform the data.
If you’re selecting functions, be sure to select Transform Data so that you can add
parameters to the functions you’ve selected. More information: Using parameters
Limitations and issues
Users should be aware of the following limitations and issues associated with accessing
Automy Data Analytics data.
Automy Data Analytics has a built-in limit of 100,000 rows returned per
connection.
The default rate limit for an Automy Data Analytics Company is 120 requests per
minute per user.
Import from Automy Data Analytics will stop and display an error message whenever the
Automy Data Analytics connector reaches any of the limits listed above.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial .
Capabilities Supported
Import
Connect live (Power BI Desktop)
Advanced options
MDX or DAX query
1. Select the Azure Analysis Services database option in the connector selection.
More information: Where to get data
2. In the SQL Server Analysis Services database dialog that appears, provide the
name of the server and database (optional).
Note
Only Power BI Desktop will display the Import and Connect live options. If
you're connecting using Power BI Desktop, selecting Connect live uses a live
connection to load the connected data directly to Power BI Desktop. In this
case, you can't use Power Query to transform your data before loading the
data to Power BI Desktop. For the purposes of this article, the Import option
is selected. For more information about using a live connection in Power BI
Desktop, go to Connect to Analysis Services tabular data in Power BI
Desktop.
3. Select OK.
4. If you're connecting to this database for the first time, select the authentication
type and input your credentials. Then select Connect.
5. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in the Power
Query editor.
1. Select the Azure Analysis Services database option in the connector selection.
More information: Where to get data
2. In the Connect to data source page, provide the name of the server and database
(optional).
4. If you're connecting to this database for the first time, select the authentication
kind and input your credentials.
6. In Navigator, select the data you require, and then select Transform data.
MDX or DAX statement | Optionally provides a specific MDX or DAX statement to the Azure Analysis Services database server to execute.
Once you've entered a value in the advanced option, select OK in Power Query Desktop
or Next in Power Query Online to connect to your Azure Analysis Services database.
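As a rough sketch of how this advanced option might look in M, the statement can be supplied through the Query option of AnalysisServices.Database. The server URI, database name, and DAX statement below are placeholders and assumptions, not values taken from this article.

```powerquery
let
    // Placeholder server and database; replace with your own values.
    Server = "asazure://<<region>>.asazure.windows.net/<<server name>>",
    Database = "<<database name>>",
    // Hypothetical DAX statement; 'Sales' is an assumed table name.
    Source = AnalysisServices.Database(
        Server,
        Database,
        [Query = "EVALUATE TOPN(10, 'Sales')"]
    )
in
    Source
```

When a statement is supplied this way, the query returns the result of that statement instead of the database's navigation table.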
Troubleshooting
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial .
Note
If you are connecting to an Azure Blob Storage account from Power BI, the Azure
Blob storage account must be in the same region as your Power BI account.
Capabilities supported
Import
1. From Get Data, select the Azure category, select Azure Blob Storage, and then
select Connect. More information: Where to get data
2. In Azure Blob Storage, enter the account name or URL of your Azure Blob Storage
account. Then select OK.
3. If this is the first time you're connecting to this account, select either the
Anonymous or Account key authentication method to sign into the Azure Blob
Storage account. More information: Copy your account key from Azure Blob
Storage
For more information about using and managing authentication, go to
Authentication with a data source.
Note
If you are signing in from Excel, you can also select the shared access
signature (SAS) authentication method.
4. Select Connect.
5. The Navigator dialog box shows the files that you uploaded to your Azure Blob
Storage account. Select the containers you want to use, and then select either
Transform Data to transform the data in Power Query or Load to load the data.
1. From Choose data source, select the Azure category, and then select Azure Blobs.
2. In Connection settings, enter the account name or URL of your Azure Blob Storage
account.
3. Optionally, enter the name of the on-premises data gateway you require.
4. Select the Authentication kind used to access your blob storage. If you've set blob
storage for anonymous access, choose Anonymous. If you set blob storage to
require an account key, choose Account key. More information: Copy your account
key from Azure Blob Storage
For more information about using and managing authentication, go to
Authentication with a data source.
5. Select Next.
6. The Navigator screen shows the files that you uploaded to your Azure Blob Storage
account. Select the containers you want to use, and then select Transform data.
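The query produced by these steps could look roughly like the following sketch. It assumes the container holds CSV files. The account name, container name, delimiter, and encoding are placeholders or assumptions rather than values from this article.

```powerquery
let
    // Placeholder account and container names; replace with your own.
    Source = AzureStorage.Blobs("<<account name>>"),
    Container = Source{[Name = "<<container name>>"]}[Data],
    // Keep only CSV blobs, matching on the blob name.
    CsvBlobs = Table.SelectRows(Container, each Text.EndsWith([Name], ".csv")),
    // Parse the first matching blob; the delimiter and UTF-8 encoding are assumptions.
    Parsed = Csv.Document(CsvBlobs{0}[Content], [Delimiter = ",", Encoding = 65001]),
    WithHeaders = Table.PromoteHeaders(Parsed, [PromoteAllScalars = true])
in
    WithHeaders
```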
3. In the storage account menu pane, under Security + networking, select Access
keys.
4. In the key1 section, locate the Key value. Select Show next to the key value.
Summary
Item Description
Prerequisites
An Azure Cosmos DB account.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Number of Retries
Enable "AVERAGE" function Passdown
Enable "SORT" Passdown for multiple columns
For smaller datasets, choose Import. When using import mode, Power BI
works with Cosmos DB to import the contents of the entire dataset for use in
your visualizations.
Note
Use Azure Synapse Link for Azure Cosmos DB if you would like to execute
cross-partitioned aggregate functions against the Cosmos DB container.
More information:
7. In the Display Options pane, select the check box for the dataset that you want to
use.
8. The best way to specify the Partition Key filter (so that the aggregate
functions can be pushed down to Cosmos DB) is to use dynamic M parameters. To
use dynamic M parameters, you create a dataset with unique Partition Key
values, create a parameter, add it as a filter on the main dataset, bind it to the unique
Partition Key dataset, and use it as a slicer for the main dataset. Use the following
steps to enable dynamic M parameters for Partition Key filtering.
a. Rename the new Partition Key dataset, then right-click on the Cosmos DB partition
key column. In this example, Product is the Cosmos DB partition key column.
Select Remove Other Columns, and then select Remove Duplicates.
b. In the Power Query editor, select Manage Parameters > New Parameter. Rename
the new parameter to reflect the filter parameter and input a valid value as Current
Value.
c. Select the dropdown icon of the Partition Key column, then select Text Filters >
Equals. Change the filter type from Text to Parameter. Then choose the parameter
that was created in step b. Select Close & Apply in the top-left corner of the Power
Query editor.
d. In Power BI, select the Model tab. Then select the Partition Key field. From the
Properties pane, select Advanced > Bind to parameter. Choose the parameter
that was created in step b.
Select the Report tab and add a slicer with the unique Partition Key.
e. Add visualizations and apply Partition Key filter from the slicer:
Since the chosen partition key value on the slicer is bound to the parameter (as
done in step d) and the parameterized filter is applied on the main dataset (as
done in step c), the chosen partition key value will be applied as a filter on the
main dataset and the query with the partition key filter will be passed down to
Cosmos DB in all visualizations.
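The two queries behind these steps can be sketched in M as follows. This is an illustration only: the inline Main table stands in for the Cosmos DB dataset, and ProductParam is a hypothetical name for the parameter created in the editor (here a plain value so that the sketch runs on its own).

```powerquery
let
    // Hypothetical stand-in for the main dataset; Product is the partition key column.
    Main = #table(
        type table [Product = text, Quantity = number],
        {{"Bike", 2}, {"Helmet", 5}, {"Bike", 1}}
    ),
    // Unique partition key values: Remove Other Columns, then Remove Duplicates.
    PartitionKeys = Table.Distinct(Table.SelectColumns(Main, {"Product"})),
    // Stand-in for the filter parameter and its Current Value.
    ProductParam = "Bike",
    // Parameterized Equals filter applied to the main dataset.
    Filtered = Table.SelectRows(Main, each [Product] = ProductParam)
in
    [PartitionKeys = PartitionKeys, Filtered = Filtered]
```

In the report, PartitionKeys feeds the slicer and is bound to the parameter, so the value chosen in the slicer replaces the hard-coded "Bike".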
Advanced options
Power Query provides a set of advanced options that you can add to your query if
needed.
The following table lists all of the advanced options you can set in Power Query.
Advanced option | Description
Number of Retries | How many times to retry if there are HTTP return codes of 408 - Request Timeout, 412 - Precondition Failed, or 429 - Too Many Requests. The default number of retries is 5.
Enable AVERAGE function Passdown | Specifies whether the connector allows pass-down of the AVG aggregate function to Cosmos DB. The default value is 1, and the connector attempts to pass the AVG aggregate function down to Cosmos DB by default. If the argument contains string, boolean, or null values for the AVG aggregate function, an undefined result set is returned by the Cosmos DB server. When set to a value of 0, the AVG aggregate function isn't passed down to the Cosmos DB server, and the connector performs the AVG aggregation itself.
Enable SORT Passdown for multiple columns | Specifies whether the connector allows multiple columns to be passed down to Cosmos DB when specified in the ORDER BY clause of the SQL query. The default value is 0, and if more than one column is specified in the ORDER BY clause, the connector doesn't pass down the columns by default and instead performs the ordering itself. When set to a value of 1, the connector attempts to pass down multiple columns to Cosmos DB when specified in the ORDER BY clause of the SQL query. To allow multiple columns to be passed down to Cosmos DB, make sure to have composite indexes set on the columns in the respective collections. For partitioned collections, a SQL query with ORDER BY will be passed down to Cosmos DB only if the query contains a filter on the partitioned key. Also, if there are more than eight columns specified in the ORDER BY clause, the connector doesn't pass down the ORDER BY clause and instead handles the ordering execution itself.
The connector doesn't pass down an aggregate function if it's called upon after
TOP or LIMIT is applied. Cosmos DB processes the TOP operation at the end when
processing a query. For example, in the following query, TOP is applied in the
subquery, while the aggregate function is applied on top of that result set:
For the SUM aggregate function, Cosmos DB returns undefined as the result set if
any of the arguments in SUM are string, boolean, or null. However, if there are null
values, the connector passes the query to Cosmos DB in such a way that it asks the
data source to replace a null value with zero as part of the SUM calculation.
For the AVG aggregate function, Cosmos DB returns undefined as the result set if any
of the arguments in AVG are string, boolean, or null. The connector exposes a
connection property to disable passing down the AVG aggregate function to
Cosmos DB in case this default Cosmos DB behavior needs to be overridden. When
AVG passdown is disabled, it isn't passed down to Cosmos DB, and the connector
handles performing the AVG aggregation operation itself. For more information,
go to "Enable AVERAGE function Passdown" in Advanced options.
Azure Cosmos DB containers with large partition keys aren't currently supported
in the connector.
Aggregation passdown is disabled for the following syntax due to server
limitations:
When the query isn't filtering on a partition key or when the partition key filter
uses the OR operator with another predicate at the top level in the WHERE
clause.
When the query has one or more partition keys appear in an IS NOT NULL
clause in the WHERE clause.
Filter passdown is disabled for the following syntax due to server limitations:
When the query references one or more aggregate columns in the
WHERE clause.
Azure Cost Management
Article • 07/13/2023
Summary
Item Description
Note
This connector replaces the previously available Azure Consumption Insights and
Azure Cost Management (Beta) connectors. Any reports created with the previous
connector must be recreated using this connector.
Prerequisites
An Azure subscription. Go to Get Azure free trial .
Capabilities supported
Import
Advanced options
Start Date
End Date
4. In the dialog that appears, for the Choose Scope drop down, use Manually Input
Scope for Microsoft Customer Agreements, or use Enrollment Number for
Enterprise Agreements (EA).
/providers/Microsoft.Billing/billingAccounts/{billingAccountId}
Alternatively, for Choose Scope, select Enrollment Number and input the Billing
Account ID string as copied from the previous steps.
Alternatively, if you want to download less than a month's worth of data you can
set Number of months to zero, then specify a date range using Start Date and End
Date values that equate to less than 31 days.
7. When prompted, sign in with your Azure user account and password. You must
have access to the Billing account scope to successfully access the billing data.
6. For Choose Scope, select Manually Input Scope and input the connection string as
shown in the example below, replacing {billingAccountId} and {billingProfileId} with
the data copied from the previous steps.
/providers/Microsoft.Billing/billingAccounts/{billingAccountId}/billingProfiles/{billingProfileId}
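If you prefer to assemble the scope string rather than type it by hand, a minimal M sketch follows. The placeholder IDs are assumptions that you replace with the values copied from the Azure portal. The result is plain text to paste into Manually Input Scope.

```powerquery
let
    // Placeholders; replace with the IDs copied from the previous steps.
    billingAccountId = "<<Billing Account Id>>",
    billingProfileId = "<<Billing Profile Id>>",
    // Billing account scope.
    accountScope = "/providers/Microsoft.Billing/billingAccounts/" & billingAccountId,
    // Billing profile scope, which extends the billing account scope.
    profileScope = accountScope & "/billingProfiles/" & billingProfileId
in
    profileScope
```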
8. When prompted, sign in with your Azure user account and password. You must
have access to the Billing profile to successfully access the billing profile data.
4. For Choose Scope, select Enrollment Number and paste the billing account ID
from the previous step.
6. When prompted, sign in with your Azure user account and password. You must use
an Enterprise Administrator account for Enterprise Agreements.
Table | Description
Balance summary | Summary of the balance for the current billing month for Enterprise Agreements (EA).
Billing events | Event log of new invoices, credit purchases, and so on. Microsoft Customer Agreement only.
Budgets | Budget details to view actual costs or usage against existing budget targets.
Credit lots | Azure credit lot purchase details for the provided billing profile. Microsoft Customer Agreement only.
Pricesheets | Applicable meter rates for the provided billing profile or EA enrollment.
RI charges | Charges associated with your Reserved Instances over the last 24 months. This table is being deprecated; use RI transactions instead.
RI usage details | Consumption details for your existing Reserved Instances over the last month.
Usage details | A breakdown of consumed quantities and estimated charges for the given billing profile or EA enrollment.
You can select a table to see a preview dialog. You can select one or more tables by
selecting the boxes beside their name. Then either select Load to load the data or
Transform Data to continue transforming the data in Power Query Editor.
When you select Load, the data is loaded into Power BI Desktop.
When the data you selected is loaded, the data tables and fields are shown in the Fields
pane.
The Azure Cost Management connector uses OAuth 2.0 to authenticate with
Azure and to identify the users of the connector. Tokens generated
in this process are valid for a specific period. Power BI preserves the token for the
next login. OAuth 2.0 is a standard for the process that goes on behind the scenes
to ensure the secure handling of these permissions. To connect, you must use an
Enterprise Administrator account for Enterprise Agreements, or have appropriate
permissions at the billing account or billing profile levels for Microsoft Customer
Agreements.
Data row requests exceeding one million rows aren't supported by Power BI.
Instead, you can try using the export feature described in create and manage
exported data in Azure Cost Management.
The Azure Cost Management data connector doesn't work with Office 365 GCC
customer accounts.
Data refresh: The cost and usage data is typically updated and available in the
Azure portal and supporting APIs within 8 to 24 hours, so we suggest you
constrain Power BI scheduled refreshes to once or twice a day.
Data source reuse: If you have multiple reports that are pulling the same data, and
don't need additional report-specific data transformations, you should reuse the
same data source, which would reduce the amount of time required to pull the
Usage Details data.
You might receive a 400 bad request from the RI usage details when you try to refresh
the data if you've chosen a date parameter greater than three months. To mitigate the
error, take the following steps:
2. In Power Query Editor, select the RI usage details dataset and select Advanced
Editor.
3. Update the Power Query code as shown in the following examples, which
split the calls into three-month chunks. Make sure you note and retain your
enrollment number, or billing account/billing profile ID.
Power Query M
let
    // Replace the placeholder with your enrollment number.
    enrollmentNumber = "<<Enrollment Number>>",
    // First three-month chunk (billing data window -9 to -6).
    optionalParameters1 = [startBillingDataWindow = "-9", endBillingDataWindow = "-6"],
    source1 = AzureCostManagement.Tables("Enrollment Number", enrollmentNumber, 5, optionalParameters1),
    riusagedetails1 = source1{[Key="riusagedetails"]}[Data],
    // Second three-month chunk (billing data window -6 to -3).
    optionalParameters2 = [startBillingDataWindow = "-6", endBillingDataWindow = "-3"],
    source2 = AzureCostManagement.Tables("Enrollment Number", enrollmentNumber, 5, optionalParameters2),
    riusagedetails2 = source2{[Key="riusagedetails"]}[Data],
    // Append both chunks into a single table.
    riusagedetails = Table.Combine({riusagedetails1, riusagedetails2})
in
    riusagedetails
Power Query M
let
    // Replace the placeholder with your billing profile ID.
    billingProfileId = "<<Billing Profile Id>>",
    // First three-month chunk (billing data window -9 to -6).
    optionalParameters1 = [startBillingDataWindow = "-9", endBillingDataWindow = "-6"],
    source1 = AzureCostManagement.Tables("Billing Profile Id", billingProfileId, 5, optionalParameters1),
    riusagedetails1 = source1{[Key="riusagedetails"]}[Data],
    // Second three-month chunk (billing data window -6 to -3).
    optionalParameters2 = [startBillingDataWindow = "-6", endBillingDataWindow = "-3"],
    source2 = AzureCostManagement.Tables("Billing Profile Id", billingProfileId, 5, optionalParameters2),
    riusagedetails2 = source2{[Key="riusagedetails"]}[Data],
    // Append both chunks into a single table.
    riusagedetails = Table.Combine({riusagedetails1, riusagedetails2})
in
    riusagedetails
4. Once you've updated the code with the appropriate update from the previous
step, select Done and then select Close & Apply.
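If you need more than two chunks, the repeated steps can be generated from a list of window boundaries. The following is a sketch under stated assumptions rather than the connector's documented pattern. It reuses the AzureCostManagement.Tables call and the riusagedetails key from the examples above. The Boundaries list and the GetChunk helper are hypothetical names introduced for illustration.

```powerquery
let
    enrollmentNumber = "<<Enrollment Number>>",
    // Window boundaries taken from the examples above, oldest first.
    Boundaries = {-9, -6, -3},
    // Pair each boundary with the next one: {-9, -6}, then {-6, -3}.
    Pairs = List.Zip({List.RemoveLastN(Boundaries, 1), List.Skip(Boundaries, 1)}),
    // Build one options record per window, as text to match the examples above.
    Windows = List.Transform(
        Pairs,
        (pair) => [
            startBillingDataWindow = Text.From(pair{0}, "en-US"),
            endBillingDataWindow = Text.From(pair{1}, "en-US")
        ]
    ),
    // Hypothetical helper: fetch the RI usage details table for a single window.
    GetChunk = (options as record) as table =>
        let
            source = AzureCostManagement.Tables("Enrollment Number", enrollmentNumber, 5, options)
        in
            source{[Key = "riusagedetails"]}[Data],
    // Append all chunks into one table.
    riusagedetails = Table.Combine(List.Transform(Windows, GetChunk))
in
    riusagedetails
```

Adding another boundary to the list (for example, -12 at the front) adds one more three-month call without duplicating steps.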
Azure Databricks
Article • 07/13/2023
Summary
Item Description
Capabilities supported
Import
DirectQuery (Power BI Datasets)
1. In the Get Data experience, search for Databricks to shortlist the Databricks
connector, Azure Databricks. Use the Azure Databricks connector for all
Databricks SQL Warehouse data unless you've been instructed otherwise by your
Databricks rep.
2. Provide the Server hostname and HTTP Path for your Databricks SQL Warehouse.
Refer to Configure the Databricks ODBC and JDBC drivers for instructions to look
up your "Server hostname" and "HTTP Path". Enter this information accordingly.
You can optionally supply a default catalog and/or database under Advanced
options. Select OK to continue.
Username / Password (useable for AWS or GCP). This option isn't available if
your organization/account uses 2FA/MFA.
Personal Access Token (useable for AWS, Azure or GCP). Refer to Personal
access tokens for instructions on generating a Personal Access Token (PAT).
Azure Active Directory (useable only for Azure). Sign in to your organizational
account using the browser popup.
Note
Once you enter your credentials for a particular Databricks SQL Warehouse,
Power BI Desktop caches and reuses those same credentials in subsequent
connection attempts. You can modify those credentials by going to File >
Options and settings > Data source settings. More information: Change the
authentication method
4. Once you successfully connect, the Navigator shows the data available to you on
the cluster. You can choose either Transform Data to transform the data using
Power Query or Load to load the data in Power Query Desktop.
Connect to Databricks data from Power Query
Online
To connect to Databricks from Power Query Online, take the following steps:
1. In the Get Data experience, select the Database category. (Refer to Creating a
dataflow for instructions.) Shortlist the available Databricks connectors with the
search box. Use the Azure Databricks connector for all Databricks SQL Warehouse
data unless you've been instructed otherwise by your Databricks rep.
2. Enter the Server hostname and HTTP Path for your Databricks SQL Warehouse.
Refer to Configure the Databricks ODBC and JDBC drivers for instructions to look
up your "Server hostname" and "HTTP Path". You can optionally supply a default
catalog and/or database under Advanced options.
3. Provide your credentials to authenticate with your Databricks SQL Warehouse.
There are three options for credentials:
Username / Password (useable for AWS or GCP). This option isn't available if
your organization/account uses 2FA/MFA.
Account Key (useable for AWS, Azure or GCP). Refer to Personal access tokens
for instructions on generating a Personal Access Token (PAT).
Azure Active Directory (useable only for Azure). Sign in to your organizational
account using the browser popup.
4. Once you successfully connect, the Navigator appears and displays the data
available on the server. Select your data in the navigator. Then select Next to
transform the data in Power Query.
Limitations
The Azure Databricks connector supports web proxy. However, automatic proxy
settings defined in .pac files aren't supported.
In the Azure Databricks connector, the Databricks.Query data source isn't
supported in combination with Power BI dataset's DirectQuery mode.
Azure Data Explorer (Kusto)
Article • 07/13/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial .
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Limit query result record number
Limit query result data size in Bytes
Disable result-set truncation
Additional set statements
1. In Get Data, select Azure > Azure Data Explorer (Kusto), and then select Connect.
More information: Where to get data
2. In Azure Data Explorer (Kusto), provide the name of your Azure Data Explorer
cluster. For this example, use https://help.kusto.windows.net to access the
sample help cluster. For other clusters, the URL is in the form
https://<ClusterName>.<Region>.kusto.windows.net.
You can also select a database that's hosted on the cluster you're connecting to,
and one of the tables in the database, or a query like StormEvents | take 1000 .
3. If you want to use any advanced options, select the option and enter the data to use
with that option. More information: Connect using advanced options
Note
You might need to scroll down to display all of the advanced options and the
data connectivity selection.
4. Select either the Import or DirectQuery data connectivity mode (Power BI Desktop
only). More information: When to use Import or Direct Query mode
5. Select OK to continue.
6. If you don't already have a connection to the cluster, select Sign in. Sign in with an
organizational account, then select Connect.
7. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in the Power
Query editor. In this example, StormEvents was selected in the Samples database.
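For reference, the connection made in these steps corresponds roughly to the following M sketch. It reuses the AzureDataExplorer.Contents call shown elsewhere in this article with the short cluster name help, and passes the sample query from step 2 in place of a table name. The empty options record is an assumption that no advanced options were set.

```powerquery
let
    // Cluster, database, and table or query, in that order.
    Source = AzureDataExplorer.Contents(
        "help",
        "Samples",
        "StormEvents | take 1000",
        []
    )
in
    Source
```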
1. Select the Azure Data Explorer (Kusto) option from Choose data source. More
information: Where to get data
2. In Connect to data source, provide the name of your Azure Data Explorer cluster.
For this example, use https://help.kusto.windows.net to access the sample help
cluster. For other clusters, the URL is in the form
https://<ClusterName>.<Region>.kusto.windows.net.
You can also select a database that's hosted on the cluster you're connecting to,
and one of the tables in the database, or a query like StormEvents | take 1000 .
3. If you want to use any advanced options, select the option and enter the data to use
with that option. More information: Connect using advanced options
5. If you don't already have a connection to the cluster, select Sign in. Sign in with an
organizational account.
7. In the Choose data page, select the database information you want, then either
select Transform Data or Next to continue transforming the data in the Power
Query editor. In this example, StormEvents was selected in the Samples database.
The following table lists all of the advanced options you can set in Power Query Desktop
and Power Query Online.
Advanced option | Description
Limit query result record number | The maximum number of records to return in the result.
Limit query result data size in Bytes | The maximum data size in bytes to return in the result.
Disable result-set truncation | Enable or disable result truncation by using the notruncation request option.
Additional Set Statements | Sets query options for the duration of the query. Query options control how a query executes and returns results. Multiple Set statements can be separated by semicolons.
For information about additional advanced options not available in the Power Query UI,
go to Configuring Azure Data Explorer connector options in an M Query.
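As a hedged sketch, the same settings can be passed in the options record of AzureDataExplorer.Contents. The option names MaxRows, MaxSize, and AdditionalSetStatements and the values shown are assumptions that don't appear in this section, so verify them against the connector options article before relying on them.

```powerquery
let
    Source = AzureDataExplorer.Contents(
        "help",
        "Samples",
        "StormEvents",
        [
            // Limit query result record number (assumed option name: MaxRows).
            MaxRows = 300000,
            // Limit query result data size in bytes (assumed option name: MaxSize).
            MaxSize = 4194304,
            // Additional set statements (assumed option name and example statement).
            AdditionalSetStatements = "set query_datascope=hotcache"
        ]
    )
in
    Source
```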
To learn more about using DirectQuery, go to About using DirectQuery in Power BI.
functions.
Kusto
StormEvents | where StartTime > (now()-5d)
StormEvents | where StartTime > ago(5d)
Power Query M
let
    Source = AzureDataExplorer.Contents("help", "Samples", "StormEvents", []),
    #"Filtered Rows" = Table.SelectRows(Source, each [StartTime] > (DateTime.FixedLocalNow() - #duration(5, 0, 0, 0)))
in
    #"Filtered Rows"
Power Query M
let
Source = AzureDataExplorer.Contents("help", "Samples", "StormEvents",
[<options>])
in
Source
You can combine multiple options together to reach the required behavior:
[NoTruncate=true, CaseInsensitive=true]
These options issue set statements with your query to change the default query limits:
Case sensitivity
By default, the connector generates queries that use the case sensitive == operator
when comparing string values. If the data is case insensitive, this isn't the desired
behavior. To change the generated query, use the CaseInsensitive connector option:
Power Query M
let
    Source = AzureDataExplorer.Contents("help", "Samples", "StormEvents", [CaseInsensitive=true]),
    #"Filtered Rows" = Table.SelectRows(Source, each [State] = "aLaBama")
in
    #"Filtered Rows"
You can use a query parameter in any query step that supports it. For example, filter the
results based on the value of a parameter. In this example, select the drop-down menu
on the right side of the State column in the Power Query editor, select Text Filters >
Equals, then select ALABAMA under Keep rows where 'State'.
Functions can also receive parameters, which gives the Power BI user a lot of flexibility.
Power BI offers many ways to slice the data, but all filters and slicers are added after the
original KQL, and in many cases you'll want to filter at an early stage of the query.
Using functions and dynamic parameters is an effective way to customize the final
query.
Creating a function
You can create the following function in any Azure Data Explorer cluster that you have
access to, including a free cluster. The function returns the table SalesTable from the
help cluster, filtered for sales transactions greater than or smaller than a number
provided by the report user.
Kusto
Kusto
LargeOrSmallSales(2000,">")
| summarize Sales=tolong(sum(SalesAmount)) by Country
Kusto
LargeOrSmallSales(20,"<")
| summarize Sales=tolong(sum(SalesAmount)) by Country
2. In the Power Query navigator, select the function from the list of objects. The
connector analyzes the parameters and presents them above the data on the right
side of the navigator.
3. Add values to the parameters and then select Apply.
5. Once in the Power Query editor, create two parameters, one for the cutoff value
and one for the operator.
6. Go back to the LargeOrSmallSales query and replace the values with the query
parameters in the formula bar.
7. From the editor, create two static tables (Enter Data) to provide options for the two
parameters. For the cutoff, you can create a table with values like 10, 50, 100, 200,
500, 1000, 2000. For the Op , a table with two Text values < and > .
8. The two columns in the tables need to be bound to the query parameters using
the Bind to parameter selection.
The final report will include slicers for the two static tables and any visuals from the
summary sales.
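As a rough illustration of step 6, the edited formula might end up looking something like the following sketch. The parameter names Cutoff and Op and the cluster and database placeholders are assumptions made for illustration, not names taken from the connector. The quoting around Op follows the rule that, inside an M text literal, two quotation marks stand for one literal quotation mark.

Power Query M
let
    // Hypothetical query parameters; in the report these are bound to the two static tables.
    Cutoff = 2000,
    Op = ">",
    // Build the KQL text. Inside an M text literal, "" stands for one literal quotation mark.
    Kql = "LargeOrSmallSales(" & Number.ToText(Cutoff) & ",""" & Op & """)"
        & " | summarize Sales=tolong(sum(SalesAmount)) by Country",
    // Placeholder cluster and database; use the cluster where you created the function.
    Source = AzureDataExplorer.Contents("mycluster", "MyDatabase", Kql)
in
    Source

With these values, Kql evaluates to LargeOrSmallSales(2000,">") | summarize Sales=tolong(sum(SalesAmount)) by Country, which matches the first test query shown earlier.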
In Advanced Editor:
For example:
Power Query M
2. Insert a query parameter into the Kusto Query Language (KQL) query.
If you paste a KQL query directly in the connection dialog, the query will be part of
the source step in Power Query. You can embed parameters as part of the query
using the advanced editor or when editing the source statement in the formula
bar. An example query could be StormEvents | where State == '" & State & "'
| take 100. State is a parameter, and at run time the query will be:
3. If your query contains quotation marks, encode them correctly. For example, the
following query in KQL:
Kusto
will appear in the Advanced Editor as follows with two quotation marks:
Power Query M
If you are using a parameter, such as State , it should be replaced with the
following query, which contains three quotation marks:
Kusto
"StormEvents | where State == """ & State & """ | take 100"
The following example shows how to use the percentiles function in Azure Data
Explorer:
Power Query M
let
    StormEvents = AzureDataExplorer.Contents(DefaultCluster, DefaultDatabase){[Name = DefaultTable]}[Data],
    Percentiles = Value.NativeQuery(StormEvents, "| summarize percentiles(DamageProperty, 50, 90, 95) by State")
in
    Percentiles
Note
On Feb 29, 2024, Azure Data Lake Storage Gen1 will be retired. For more
information, go to the official announcement. If you use Azure Data Lake
Storage Gen1, make sure to migrate to Azure Data Lake Storage Gen2 prior to that
date. To learn how, go to Migrate Azure Data Lake Storage from Gen1 to Gen2.
Unless you already have an Azure Data Lake Storage Gen1 account, you can't create
new ones.
Summary
Item Description
Products Excel
Power BI (Datasets)
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial.
An Azure Data Lake Storage Gen1 account. Follow the instructions at Get started
with Azure Data Lake Storage Gen1 using the Azure portal. This article assumes
that you've already created a Data Lake Storage Gen1 account, called myadlsg1,
and uploaded a sample data file (Drivers.txt) to it. This sample file is available for
download from Azure Data Lake Git Repository.
Capabilities supported
Import
Advanced options
Page size in bytes
3. If this is the first time you're connecting to this account, select Sign in to sign in to
the Azure Data Lake Storage Gen1 account. You'll be redirected to your
organization's sign-in page. Follow the prompts to sign in to the account.
5. The Navigator dialog box shows the file that you uploaded to your Azure Data
Lake Storage Gen1 account. Verify the information and then select either
Transform Data to transform the data in Power Query or Load to load the data in
Power BI Desktop.
Connect using advanced options
Power Query provides an advanced option that you can add to your query if needed.
Advanced option       Description
Page Size in Bytes    Used to break up large files into smaller pieces. The default page size is 4 MB.
See also
Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen1 documentation
Azure Data Lake Storage Gen2
Article • 07/07/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
An Azure subscription. Go to Get Azure free trial.
Ensure you're granted one of the following roles for the storage account: Blob
Data Reader, Blob Data Contributor, or Blob Data Owner.
A sample data file named Drivers.txt located in your storage account. You can
download this sample from Azure Data Lake Git Repository, and then upload
that file to your storage account.
Capabilities supported
Import
File System View
CDM Folder View
2. In the Azure Data Lake Storage Gen2 dialog box, provide the URL to your Azure
Data Lake Storage Gen2 account, container, or subfolder using the container
endpoint format. URLs for Data Lake Storage Gen2 have the following pattern:
https://<accountname>.dfs.core.windows.net/<container>/<subfolder>
You can also select whether you want to use the file system view or the Common
Data Model folder view.
Select OK to continue.
3. If this is the first time you're using this URL address, you'll be asked to select the
authentication method.
If you select the Organizational account method, select Sign in to sign in to your
storage account. You'll be redirected to your organization's sign-in page. Follow
the prompts to sign in to the account. After you've successfully signed in, select
Connect.
If you select the Account key method, enter your account key and then select
Connect.
4. The Navigator dialog box shows all files under the URL you provided. Verify the
information and then select either Transform Data to transform the data in Power
Query or Load to load the data.
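The connection steps above correspond roughly to the following M sketch. The account, container, and subfolder names in the URL are placeholders, and the exact query the connector generates for you may differ. The sketch uses the file system view and then opens the Drivers.txt sample file named in the prerequisites.

Power Query M
let
    // Placeholder URL in the container endpoint format; replace with your own names.
    Url = "https://myaccount.dfs.core.windows.net/mycontainer/mysubfolder",
    // File system view: one row per file, with the file bytes in the Content column.
    Source = AzureStorage.DataLake(Url),
    // Keep only the sample file from the prerequisites.
    DriversFile = Table.SelectRows(Source, each [Name] = "Drivers.txt"),
    // Parse the binary content of the first matching file as delimited text.
    Drivers = Csv.Document(DriversFile{0}[Content])
in
    Drivers

If no file named Drivers.txt exists under the URL, the DriversFile{0} lookup raises an error, so in a real query you might want to check the row count first.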
2. In Connect to data source, enter the URL to your Azure Data Lake Storage Gen2
account. Refer to Limitations to determine the URL to use.
3. Select whether you want to use the file system view or the Common Data Model
folder view.
4. If needed, select the on-premises data gateway in Data gateway.
5. Select Sign in to sign in to the Azure Data Lake Storage Gen2 account. You'll be
redirected to your organization's sign-in page. Follow the prompts to sign in to the
account.
7. The Choose data page shows all files under the URL you provided. Verify the
information and then select Transform Data to transform the data in Power Query.
Limitations
Refresh authentication
Microsoft doesn't support dataflow or dataset refresh using OAuth2 authentication
when the Azure Data Lake Storage Gen2 (ADLS) account is in a different tenant. This
limitation only applies to ADLS when the authentication method is OAuth2, that is, when
you attempt to connect to a cross-tenant ADLS using an Azure AD account. In this case,
we recommend that you use a different authentication method that isn't OAuth2/AAD,
such as the Key authentication method.
Proxy and firewall requirements
When you create a dataflow using a gateway, you might need to change some of your
proxy settings or firewall ports to successfully connect to your Azure data lake. If a
dataflow fails with a gateway-bound refresh, it might be due to a firewall or proxy issue
on the gateway to the Azure storage endpoints.
If you're using a proxy with your gateway, you might need to configure the
Microsoft.Mashup.Container.NetFX45.exe.config file in the on-premises data gateway.
More information: Configure proxy settings for the on-premises data gateway.
To enable connectivity from your network to the Azure data lake, you might need to
add specific IP addresses to the allowlist on the gateway machine. For example, if your
network has any firewall rules in place that might block these attempts, you'll need to
unblock the outbound network connections for your Azure data lake. To allow the required
outbound addresses, use the AzureDataLake service tag. More information: Virtual
network service tags.
Dataflows also support the "Bring Your Own" data lake option, which means you create
your own data lake, manage your permissions, and explicitly connect it to your
dataflow. In this case, when you're connecting to your development or production
environment using an Organizational account, you must be assigned one of the following
roles for the storage account: Blob Data Reader, Blob Data Contributor, or Blob Data
Owner.
See also
Analyze data in Azure Data Lake Storage Gen2 by using Power BI
Introduction to Azure Data Lake Storage Gen2
Analyze data in Azure Data Lake Storage Gen2 by using Power BI
Article • 07/18/2023
In this article, you'll learn how to use Power BI Desktop to analyze and visualize data
that's stored in a storage account that has a hierarchical namespace (Azure Data Lake
Storage Gen2).
Prerequisites
Before you begin this tutorial, you must have the following prerequisites:
2. Follow the instructions in the Azure Data Lake Storage Gen2 connector article to
connect to the sample data.
4. After the data has been successfully loaded into Power BI, the following fields are
displayed in the Fields panel.
However, to visualize and analyze the data, you might prefer the data to be
available using the following fields.
In the next steps, you'll update the query to convert the imported data to the
desired format.
5. From the Home tab on the ribbon, select Transform Data. The Power Query editor
then opens, displaying the contents of the file.
6. In the Power Query editor, under the Content column, select Binary. The file will
automatically be detected as CSV and will contain the output as shown below. Your
data is now available in a format that you can use to create visualizations.
7. From the Home tab on the ribbon, select Close & Apply.
8. Once the query is updated, the Fields tab displays the new fields available for
visualization.
9. Now you can create a pie chart to represent the drivers in each city for a given
country/region. To do so, make the following selections.
From the Visualizations tab, select the symbol for a pie chart.
In this example, the columns you're going to use are Column 4 (name of the city)
and Column 7 (name of the country/region). Drag these columns from the Fields
tab to the Visualizations tab as shown below.
The pie chart should now resemble the one shown below.
10. If you select a specific country/region from the page level filters, the number of
drivers in each city of the selected country/region will be displayed. For example,
under the Visualizations tab, under Page level filters, select Brazil.
11. The pie chart is automatically updated to display the drivers in the cities of Brazil.
12. From the File menu, select Save to save the visualization as a Power BI Desktop file.
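The pie chart in steps 9 through 11 is, in effect, a count of drivers per city for one country/region. If you'd rather compute that summary in the query itself, a sketch along the following lines might work. The rows below are hypothetical stand-ins for the parsed Drivers.txt data, and only the two columns used by the chart are included.

Power Query M
let
    // Hypothetical sample rows; Column4 is the city and Column7 is the country/region.
    Drivers = #table(
        {"Column4", "Column7"},
        {
            {"Sao Paulo", "Brazil"},
            {"Rio de Janeiro", "Brazil"},
            {"Sao Paulo", "Brazil"},
            {"Seattle", "United States"}
        }),
    // Equivalent of selecting Brazil in the page-level filter.
    Brazil = Table.SelectRows(Drivers, each [Column7] = "Brazil"),
    // Count the rows for each city, as the pie chart does visually.
    DriversPerCity = Table.Group(Brazil, {"Column4"}, {{"Drivers", each Table.RowCount(_), Int64.Type}})
in
    DriversPerCity

For these sample rows, the result has two rows: Sao Paulo with 2 drivers and Rio de Janeiro with 1.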
See also
Azure Data Lake Storage Gen2
Azure HDInsight (HDFS)
Article • 01/24/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Customer Insights
Analysis Services
Capabilities Supported
Import
1. From Get Data, select the Azure category, select Azure HDInsight, and then select
Connect. More information: Where to get data
2. In the window that appears, enter the name of the storage account associated
with your HDInsight account. If you don't know the name of your storage account,
you can find it using the steps in the section below.
3. Select OK.
4. You can either select anonymous access, if your storage account is configured for
anonymous access, or you can select account key.
5. If you select anonymous access, there's nothing to enter, so select Connect.
6. If you select account key, add the storage account key for the Azure Storage
account associated with your HDInsight account and select Connect.
7. In Navigator, select one or more files or folders to import and use in your
application. Then select either Load to load the table, or Transform Data to open
the Power Query Editor where you can filter and refine the set of data you want to
use, and then load that refined set of data.
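For readers who prefer to work in the Advanced Editor, the steps above might be approximated by the following sketch. The storage account and container names are placeholders, and the query that the connector actually generates may be shaped differently.

Power Query M
let
    // Placeholder names for the storage account associated with your HDInsight
    // account and for one of its containers.
    Files = HdInsight.Files("mystorageaccount", "mycontainer"),
    // Keep only the CSV files before loading or transforming further.
    CsvFiles = Table.SelectRows(Files, each Text.EndsWith([Name], ".csv"))
in
    CsvFiles

Filtering the file list early like this keeps the later steps from touching files you don't need.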
1. Select the Azure HDInsight option in the connector selection. More information:
Where to get data
2. In the Azure HDInsight dialog that appears, enter the name of the storage
account associated with your HDInsight account. If you don't know the name of
your storage account, you can find it using the steps in the section below.
3. You can select an existing connection or a gateway. You can also either select
anonymous access, if your storage account is configured for anonymous access, or
you can select account key.
5. If you select account key, add the storage account key for the Azure Storage
account associated with your HDInsight account and select Next.
6. Select one or multiple tables to import and use, then select Transform Data to
transform data in the Power Query editor.
2. Locate your Azure HDInsight account and select Storage accounts in the left
menu. Then select your storage account.
3. In the storage account menu pane, under Security + networking, select Access
keys.
4. In the key1 section, locate the Key value. Select Show next to the key value.
5. Select the Copy to clipboard icon to copy the Key value.
Azure HDInsight on AKS Trino (Beta)
Article • 10/10/2023
Summary
Item Description
Prerequisites
An Azure HDInsight on AKS Trino cluster.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Native SQL statement
Session properties
Set default catalog/schema
1. In the Get Data experience, select Azure from the categories on the left, then select
Azure HDInsight on AKS Trino. More information: Where to get data
2. In the Azure HDInsight on AKS Trino dialog that appears, provide the Azure Trino
URL (Trino cluster URL).
3. Select the connectivity mode: Direct Query (recommended for Trino big data) or
Import.
4. Select OK.
5. If you're connecting to this Trino cluster for the first time, select Sign in to
authenticate. Then select Connect.
6. In Navigator, expand the catalog and schema to reveal the table you want, then
either select Load to load the data or Transform Data to continue transforming the
data in Power Query Editor.
1. In the Get Data experience, search for and select Azure HDInsight on AKS Trino.
More information: Where to get data
2. In the options that appear, provide the Azure Trino URL.
5. If you're connecting to this Trino cluster for the first time, select Sign in.
7. In Navigator, select the table you require, and then select Transform data.
Advanced option           Description
SQL statement             For information, go to Import data from a database using native database query.
Session properties        Allows Trino session properties to be passed with the connection to the Trino cluster. Specify session properties as key-value pairs, with each key and value separated by a colon and each pair separated by a comma. Example: distributed_sort:true,colocated_join:false
Default catalog/schema    Providing a catalog and schema sets the context of the connection to a specific Trino catalog and schema.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to the Trino cluster.
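If you keep session properties in a query rather than typing them by hand, the text for the Session properties box can be assembled from a record. The following is only a sketch of the key:value,key:value format described above; it doesn't call the Trino connector, and the record name Props is an assumption for illustration.

Power Query M
let
    // Session properties as an M record; values are kept as text.
    Props = [distributed_sort = "true", colocated_join = "false"],
    // Turn each field into a key:value pair.
    Pairs = List.Transform(Record.FieldNames(Props), each _ & ":" & Record.Field(Props, _)),
    // Join the pairs with commas, as the Session properties option expects.
    SessionProperties = Text.Combine(Pairs, ",")
in
    SessionProperties

This evaluates to distributed_sort:true,colocated_join:false, the same text as the example in the table.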
bigint
integer
smallint
tinyint
real
double
decimal
boolean
char
varchar
date
timestamp
array
map
varbinary
Azure SQL database
Article • 07/13/2023
Summary
Item Description
Authentication types supported    Windows (Power BI Desktop, Excel, Power Query Online with gateway)
                                  Database (Power BI Desktop, Excel)
                                  Microsoft Account (all)
                                  Basic (Power Query Online)
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for Azure SQL database. However, for
optimal performance, we recommend that you install the SQL Server Native
Client before using the Azure SQL database connector. SQL Server Native Client 11.0
and SQL Server Native Client 10.0 are both supported in the latest version.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
2. In SQL Server database, provide the name of the server and database (optional).
4. Optionally, you can select and enter advanced options that will modify the
connection query, such as a command timeout or a native query (SQL statement).
More information: Connect using advanced options
5. Select OK.
6. If this is the first time you're connecting to this database, select the authentication
type, input your credentials, and select the level to apply the authentication
settings to. Then select Connect.
For more information about authentication methods, go to Authentication with a
data source.
7. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
Connect to Azure SQL database from Power Query Online
To connect to an Azure SQL database from Power Query Online, take the following
steps:
2. In Azure SQL database, provide the name of the server and database.
You can also select and enter advanced options that will modify the connection
query, such as a command timeout or a native query (SQL statement). More
information: Connect using advanced options
3. If this is the first time you're connecting to this database, select the authentication
kind and input your credentials.
5. If the connection is not encrypted, clear the Use Encrypted Connection check box.
7. In Navigator, select the data you require, and then select Transform data.
Advanced option                       Description
Command timeout in minutes            If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
SQL statement                         For information, go to Import data from a database using native database query.
Include relationship columns          If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy         If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support    If checked, when a node in the Azure SQL failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover occurs.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Azure SQL database.
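In M, these advanced options typically surface as fields of the options record passed to Sql.Database. The following sketch shows one plausible mapping. The server and database names are placeholders, and the pairing of each dialog option with a record field is an assumption based on the field names, so compare it with the query that Power Query generates for you.

Power Query M
let
    Source = Sql.Database(
        "myserver.database.windows.net",   // placeholder server name
        "AdventureWorks",                  // placeholder database name
        [
            // Command timeout: 30 minutes instead of the 10-minute default.
            CommandTimeout = #duration(0, 0, 30, 0),
            // Include relationship columns: false leaves those columns out.
            CreateNavigationProperties = false,
            // Navigate using full hierarchy.
            HierarchicalNavigation = true,
            // Enable SQL Server Failover support.
            MultiSubnetFailover = true
        ])
in
    Source

Any field you leave out of the record keeps its default, so you only need to list the options you want to change.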
Troubleshooting
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for Azure Synapse Analytics (SQL DW).
However, for optimal performance, we recommend that you install the SQL
Server Native Client before using the Azure Synapse Analytics (SQL DW) connector. SQL
Server Native Client 11.0 and SQL Server Native Client 10.0 are both supported in the
latest version.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
Enable cross database folding
1. Select the Azure Synapse Analytics SQL option in the connector selection.
2. In the SQL Server database dialog that appears, provide the name of the server
and database (optional). In this example, TestAzureSQLServer is the server name
and AdventureWorks2012 is the database.
You can also select and enter advanced options that will modify the connection
query, such as a command timeout or a native query (SQL statement). More
information: Connect using advanced options
4. Select OK.
5. If this is the first time you're connecting to this database, select the authentication
type, input your credentials, and select the level to apply the authentication
settings to. Then select Connect.
Note
If the connection is not encrypted, you'll be prompted with the following dialog.
6. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
1. Select the Azure Synapse Analytics (SQL DW) option in the connector selection.
2. In the Azure Synapse Analytics (SQL DW) dialog that appears, provide the name
of the server and database (optional). In this example, TestAzureSQLServer is the
server name and AdventureWorks2012 is the database.
You can also select and enter advanced options that will modify the connection
query, such as a command timeout or a native query (SQL statement). More
information: Connect using advanced options
3. If this is the first time you're connecting to this database, select the authentication
kind and input your credentials.
4. If required, select the name of your on-premises data gateway.
5. If the connection is not encrypted, clear the Use Encrypted Connection check box.
7. In Navigator, select the data you require, and then select Transform data.
The following table lists all of the advanced options you can set in Power Query Desktop
and Power Query Online.
Advanced option                       Description
Command timeout in minutes            If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer.
SQL statement                         For information, go to Import data from a database using native database query.
Include relationship columns          If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy         If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support    If checked, when a node in the Azure SQL failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover occurs.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to Azure Synapse Analytics.
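As a sketch of how the SQL statement and command timeout options might look in M, the following query passes both in the options record of Sql.Database. The server and database names are the ones used in the example steps above. The table name dbo.DimCustomer is hypothetical, and the assumption that the connector builds on Sql.Database should be checked against the query shown in your own Advanced Editor.

Power Query M
let
    Source = Sql.Database(
        "TestAzureSQLServer",
        "AdventureWorks2012",
        [
            // Native SQL statement; the table name is a hypothetical example.
            Query = "SELECT TOP 100 * FROM dbo.DimCustomer",
            // Command timeout: 20 minutes instead of the 10-minute default.
            CommandTimeout = #duration(0, 0, 20, 0)
        ])
in
    Source

A native query needs the database name to be supplied, which is why the second argument is filled in here even though the dialog marks the database as optional.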
Troubleshooting
Summary
Item Description
Note
This Azure Synapse Analytics workspace connector doesn't replace the Azure
Synapse Analytics (SQL DW) connector. This connector makes exploring data in
Synapse workspaces more accessible. Some capabilities aren't present in this
connector, including native query and DirectQuery support.
Note
This connector supports access to all data in your Synapse workspace, including
Synapse Serverless, Synapse on-demand, and Spark tables.
Prerequisites
Before you can sign in to Synapse workspaces, you must have access to Azure Synapse
Analytics Workspace.
Capabilities Supported
Import
Connect to Synapse workspace data from Power Query Desktop
To connect to Synapse workspace data:
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Azure Synapse
Analytics workspace (Beta). Then select Connect.
2. If this is the first time you are connecting to this workspace, you'll be asked to sign
in to your Synapse account. To sign in, select Sign in.
3. In the Sign in with Microsoft window that appears, provide your credentials to
sign in to your Synapse account. Then select Next.
Once the connection is established, you’ll see a list of the workspaces you have access
to. Drill through the workspaces, databases, and tables.
You can Load the selected table, which brings the entire table into Power BI Desktop, or
you can select Transform Data to edit the query, which opens the Power Query editor.
You can then filter and refine the set of data you want to use, and then load that refined
set of data into Power BI Desktop.
Troubleshooting
If your access is only defined in Synapse RBAC, you might not see the workspace.
Make sure your access is defined by Azure RBAC to ensure all Synapse workspaces are
displayed.
Azure Table Storage
Article • 07/13/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Customer Insights (Dataflows)
Analysis Services
Capabilities Supported
Import
1. From Get Data, select the Azure category, select Azure Table Storage, and then
select Connect. More information: Where to get data
2. In the Azure Table Storage window that appears, enter the name or the URL of the
storage account where your table is located. Don't add the name of the table to
the URL.
3. Select OK.
4. Add the Azure table storage account key and select Connect.
5. In Navigator, select one or multiple tables to import and use in your application.
Then select either Load to load the table, or Transform Data to open the Power
Query Editor where you can filter and refine the set of data you want to use, and
then load that refined set of data.
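The same connection can be sketched in M as follows. The storage account name and table name are placeholders. The navigation step assumes that the table list exposes Name and Data columns, which is the usual shape for Power Query navigation tables.

Power Query M
let
    // Placeholder storage account name; a full account URL can be used instead.
    Source = AzureStorage.Tables("mystorageaccount"),
    // Pick one table by name from the navigation table.
    MyTable = Source{[Name = "mytable"]}[Data]
in
    MyTable

Note that the table name appears only in the navigation step, never in the account name or URL, which matches the guidance in step 2.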
1. Select the Azure Table Storage option in the connector selection. More
information: Where to get data
2. In the Azure Table Storage dialog that appears, enter the name or URL of the
Azure Storage account where the table is housed. Don't add the name of the table
to the URL.
3. Add your Azure table storage account key, and then select Next.
4. Select one or multiple tables to import and use, then select Transform Data to
transform data in the Power Query editor.
Copy your account key for Azure Table Storage
Your Azure Table Storage account key is the same as your Azure Blob storage account
key. To retrieve your Azure Table Storage account key to use while authenticating your
account in Power Query, follow these steps:
2. Locate your Azure Blob Storage account where your table is housed.
3. In the storage account menu pane, under Security + networking, select Access
keys.
4. In the key1 section, locate the Key value. Select Show next to the key value.
5. Select the Copy to clipboard icon to copy the Key value.
Bitsight Security Ratings
Article • 07/18/2023
Note
The following connector article is provided by Bitsight, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Bitsight website and use the support
channels there.
Summary
Item Description
Prerequisites
A user must have a Bitsight Security Ratings product in order to access the Bitsight data
in Power BI. For more information on Bitsight Security Ratings, go to
https://www.bitsight.com/security-ratings.
Users must also have the March 2021 release of Power BI Desktop or later.
Capabilities supported
Import
2. In Power BI Desktop, select Get Data from the Home ribbon, select More from the
drop down, and search for Bitsight.
3. If this is the first time you're getting the data through the Bitsight connector, a
prompt appears to inform you of the connection to a third-party service.
4. Place your Bitsight API token in Power BI. In the window that appears, provide
your credentials.
5. Once the connection is established, you can preview and select multiple data
points in the Navigator dialog box to create an output.
You can Load the selected table, which brings the entire table into Power BI Desktop, or
you can select Transform Data to edit the query, which opens the Power Query editor.
You can then filter and refine the set of data you want to use, and then load that refined
set of data into Power BI Desktop.
Summary
Item Description
Prerequisites
Your organization must subscribe to Bloomberg PORT Enterprise and you must be a
Bloomberg Anywhere user and have a Bloomberg biometric authentication device
(B-Unit).
Capabilities Supported
Import
3. If this is the first time you're connecting to the Bloomberg Data and Analytics
connector, a third-party notice will be displayed. Select Don't warn me again with
this connector if you don't want this message to be displayed again, and then
select Continue.
4. Enter a Bloomberg Query Language (BQL) query to specify what data you want to
get. To learn more about BQL, contact your Bloomberg Sales representative. Select
OK.
5. To sign in to your Bloomberg account, select Sign in.
6. In the window that appears, provide your credentials to sign in to your Bloomberg
account. If you entered an email address and a password, select Next.
7. Enter your B-Unit code and select Log In.
8. Once you've successfully signed in, select Connect.
Once the connection is established, you will see data available for preview in Navigator.
You can Load the selected table, or you can select Transform Data to edit the query,
which opens Power Query Editor. You can then filter and refine the set of data you want
to use, and then load that refined set of data into Power BI Desktop.
BQE Core
Article • 07/18/2023
Note
The following connector article is provided by BQE, the owner of this connector and
a member of the Microsoft Power Query Connector Certification Program. If you
have questions regarding the content of this article or have changes you would like
to see made to this article, visit the BQE website and use the support channels
there.
Summary
Item Description
Prerequisites
To use the BQE Core Power BI connector, you must have a BQE Core account with
username and password.
Capabilities supported
Import
2. From the Other category, select BQEDataConnector, and then select Connect.
7. From the Navigator, select the tables to load, and then select Transform Data to
transform the data in Power Query.
CData Connect Cloud
Article • 07/18/2023
Note
The following connector article is provided by CData, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the CData website and use the support
channels there.
Summary
Item Description
Prerequisites
A CData Connect Cloud account
At least one connection configured in your CData Connect Cloud account
Capabilities supported
Import
3. Select CData Connect Cloud in the list and then select Connect.
Import data
With the Navigator window open, follow these steps to access your CData Connect
Cloud data:
1. Expand the CData Connect Cloud tree. Your connections appear as subtrees.
2. Select the data from each connection that you want to import.
Note
The following connector article is provided by Celonis, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Celonis website and use the support
channels there.
Summary
Item Description
Prerequisites
Before you can sign in to Celonis EMS, you must have a Celonis EMS account
(username/password).
Capabilities Supported
Import
Navigate using full hierarchy
Connect to Celonis EMS from Power Query Desktop
To make the connection, take the following steps:
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Celonis EMS in
the connector selection and then select Connect.
2. The Celonis EMS dialog now appears, with an example URL. Enter your Celonis
EMS Team URL and then select OK.
3. Enter your Application Key or your Personal API Key generated in the Celonis
EMS.
5. Upon a successful connection, the Navigator opens with the list of packages
available in the given EMS team. Select the Knowledge Model Record you want to
import and then select Load.
Limitations and issues
You should be aware of the following limitations and issues associated with accessing
Celonis EMS data:
Celonis EMS has a built-in limit of 200,000 rows and 20 columns returned per record.
Only defined records can be imported. Autogenerated records are excluded here.
CloudBluePSA (Beta)
Article • 07/13/2023
Summary
Item Description
Prerequisites
Before you can use the CloudBluePSA connector, you must have a CloudBluePSA
instance (username/password) and an API key. Sign in to your PSA instance, which is
usually at a URL similar to YOUR_COMPANY_NAME.cloudbluepsa.io , and then navigate to
Setup > Employees > Find employees and add a new API user, which gives you an API
key.
Capabilities
Import
Connect to CloudBluePSA from Power Query Desktop
To connect to CloudBluePSA data:
2. Select the Search box and start typing CloudBluePSA. Then select CloudBluePSA
from the list on the right, and select Connect.
3. In the Retrieve all pages of data window that appears, copy and paste the URL
generated on the GET side of the API endpoint of your choice. Then in Filter, copy
and paste the constructed filter on the same API endpoint. For example:
URL:
https://INSTANCE_URL/webapi/v1.3/tickets/getticketsreport
"RefNumber,Type,Name"}
4. Select OK.
6. You need to select Transform Data and this selection opens the Power Query
editor.
8. Expand the Column1.1 column. This time, on the list of columns, all the grid
columns included in the filter are displayed. Select as many columns as required,
and then select OK. All the selected data is now displayed, and can be reshaped
and used to create reports as required.
9. Select Close & Apply. You can now start using your data.
1. From Choose data source, start typing in the search box: CloudBluePSA.
2. In Connect to data source, provide the URL and Filter as defined and generated in
your instance, as shown in the example inside each text box. Finally paste your API
key in the Account Key field.
3. Select Next.
4. In the Navigator screen, select the Expand button next to the Data column, and
then select OK.
5. Two new columns now appear. Select the Expand button next to the Column1.1
column and then select OK. You can now start using your data.
Additional Resources
You might also find the following CloudBluePSA information useful:
Note
The Common Data Service (Legacy) connector has been superseded by the Power
Query Dataverse connector. In most cases, we recommend that you use the
Dataverse connector instead of the Common Data Service (Legacy) connector.
However, there may be limited cases where it's necessary to choose the Common
Data Service (Legacy) connector. These cases are described in When to use the
Common Data Service (Legacy) connector.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You must have a Common Data Service (Legacy) environment with maker permissions to
access the portal, and read permissions to access data within tables.
Capabilities supported
Server URL
Advanced
Reorder columns
Add display column
2. In the Get Data dialog box, select Power Platform > Common Data Service
(Legacy), and then select Connect.
3. Enter the Common Data Service (Legacy) environment URL of the data you want to
load. Use the format https://<yourenvironmentid>.crm.dynamics.com/. More
information: Finding your Dataverse environment URL
When the table is loaded in the Navigator dialog box, by default the columns in
the table are reordered in alphabetical order by the column names. If you don't
want the columns reordered, in the advanced settings enter false in Reorder
columns.
Also when the table is loaded, by default if the table contains any picklist fields, a
new column with the name of the picklist field with _display appended at the end
of the name is added to the table. If you don't want the picklist field display
column added, in the advanced settings enter false in Add display column.
4. If this attempt is the first time you're connecting to this site, select Sign in and
input your credentials. Then select Connect.
5. In Navigator, select the data you require, then either load or transform the data.
Connect to Common Data Service (Legacy)
from Power Query Online
To connect to Common Data Service (Legacy) from Power Query Online:
1. From the Data sources page, select Common Data Service (Legacy).
2. Enter the server URL address of the data you want to load.
3. If necessary, enter an on-premises data gateway if you're going to use on-premises
data. For example, you need a gateway to combine data from Dataverse with an
on-premises SQL Server database.
6. In the navigation page, select the data you require, and then select Transform
Data.
In the new browser tab that opens, copy the root of the URL. This root URL is the unique
URL for your environment. The URL will be in the format of
https://<yourenvironmentid>.crm.dynamics.com/. Keep this URL somewhere handy so
you can use it later, for example, when you create Power BI reports.
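If you prefer to derive the root URL from a full address rather than trimming it by hand, the following is a minimal sketch in Power Query M. The sample address is hypothetical, and the sketch assumes the root is simply the scheme plus the host name.

```powerquery
let
    // Hypothetical URL copied from the browser tab; replace with your own.
    FullUrl = "https://yourenvironmentid.crm.dynamics.com/main.aspx?pagetype=dashboard",
    // Uri.Parts splits the URL into a record with fields such as Scheme and Host.
    Parts = Uri.Parts(FullUrl),
    // Keep only the scheme and host, which together form the environment's root URL.
    RootUrl = Parts[Scheme] & "://" & Parts[Host] & "/"
in
    RootUrl
```

With the sample value above, the query should return https://yourenvironmentid.crm.dynamics.com/.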
When to use the Common Data Service (Legacy)
connector
Dataverse is the direct replacement for the Common Data Service connector. However,
there may be times when it's necessary to choose the Common Data Service (Legacy)
connector instead of the Dataverse connector:
There are certain Tabular Data Stream (TDS) data types that are supported in OData
when using Common Data Service (Legacy) that aren't supported in Dataverse. The
supported and unsupported data types are listed in How Dataverse SQL differs from
Transact-SQL.
All of these features will be added to the Dataverse connector in the future, at which
time the Common Data Service (Legacy) connector will be deprecated.
Use the Azure Synapse Link feature in Power Apps to extract data from Dataverse
into Azure Data Lake Storage Gen2, which can then be used to run analytics. For
more information about the Azure Synapse Link feature, go to What is Azure
Synapse Link for Dataverse?.
Use the OData connector to move data in and out of Dataverse. For more
information on how to migrate data between Dataverse environments using the
dataflows OData connector, go to Migrate data between Dataverse environments
using the dataflows OData connector.
Use the Dataverse connector to access read-only data in Dataverse. For more
information about this feature, go to View table data in Power BI Desktop.
Note
Both the Dataverse connector and the OData APIs are meant to serve analytical
scenarios where data volumes are relatively small. The recommended approach for
bulk data extraction is “Azure Synapse Link”.
If you're using the Common Data Service (Legacy) connector, you can use a single query
to access all of the data in the dataset. This connector works differently and returns the
result in “pages” of 5 K records. Although the Common Data Service (Legacy) connector
is more efficient in returning large amounts of data, it can still take a significant amount
of time to return the result.
For large datasets, we recommend that you use Azure Synapse Link instead of these
connectors. Azure Synapse Link is even more efficient than either the Power Query
Dataverse or Common Data Service (Legacy) connectors, and it's specifically designed
around data integration scenarios.
Databricks
Article • 07/13/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
This connector is only for use with a Databricks SQL Warehouse running on AWS and
using OAuth. If you're using Azure Databricks, use the Azure Databricks connector. If
you aren't using OAuth with your Databricks SQL Warehouse (on AWS or GCP), use the
Azure Databricks connector too. Databricks Community Edition isn't supported.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
1. In the Get Data experience, search for Databricks to shortlist the Databricks
connector. You should only use the Databricks connector here for your Databricks
SQL Warehouse data (running on AWS) if you're using OAuth for authentication.
2. Provide the Server hostname and HTTP Path for your Databricks SQL Warehouse.
Refer to Configure the Databricks ODBC and JDBC drivers for instructions to look
up your "Server hostname" and "HTTP Path". Enter this information accordingly.
You can optionally supply a default catalog and/or database under Advanced
options. Select OK to continue.
3. Provide your credentials to authenticate with your Databricks SQL Warehouse. You
have three options for credentials:
Note
Once you enter your credentials for a particular Databricks SQL Warehouse,
Power BI Desktop caches and reuses those same credentials in subsequent
connection attempts. You can modify those credentials by going to File >
Options and settings > Data source settings. More information: Change the
authentication method
4. Once you successfully connect, the Navigator shows the data available to you on
the cluster. You can choose either Transform Data to transform the data using
Power Query or Load to load the data in Power Query Desktop.
1. In the Get Data experience, select the Dataflow category. (Refer to Creating a
dataflow for instructions.) Shortlist the available Databricks connector with the
search box. Select the Databricks connector for your Databricks SQL Warehouse.
2. Enter the Server hostname and HTTP Path for your Databricks SQL Warehouse.
Refer to Configure the Databricks ODBC and JDBC drivers for instructions to look
up your "Server hostname" and "HTTP Path". You can optionally supply a default
catalog and/or database under Advanced options.
Basic. Use this option when authenticating with a user name and password.
This option isn't available if your organization/account uses 2FA/MFA.
Account Key. Use this option when authenticating using a Personal Access
Token. Refer to Personal access tokens for instructions on generating a
Personal Access Token (PAT).
Organizational account. Use this option when authenticating with OAuth.
Sign in to your organizational account using the browser popup.
4. Once you successfully connect, the Navigator appears and displays the data
available on the server. Select your data in the navigator. Then select Next to
transform the data in Power Query.
Dataverse
Article • 07/13/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You must have a Dataverse environment.
To use the Dataverse connector, the TDS endpoint setting must be enabled in your
environment. More information: Manage feature settings
To use the Dataverse connector, one of TCP ports 1433 or 5558 needs to be open to
connect. Port 1433 is used automatically. However, if port 1433 is blocked, you can use
port 5558 instead. To enable port 5558, you must append that port number to the
Dataverse environment URL, such as yourenvironmentid.crm.dynamics.com,5558. More
information: SQL Server connection issue due to closed ports
Note
If you are using Power BI Desktop and need to use port 5558, you must create a
source with the Dataverse environment URL, such as
yourenvironmentid.crm.dynamics.com,5558, in Power Query M.
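As an illustration of the note above, a source that targets port 5558 might look like the following sketch. The environment name is a placeholder, and the sketch assumes the CommonDataService.Database function that this article uses for native queries.

```powerquery
let
    // Placeholder environment name; replace with your own environment URL.
    EnvironmentUrl = "yourenvironmentid.crm.dynamics.com",
    // Appending ",5558" requests port 5558 instead of the default port 1433.
    Source = CommonDataService.Database(EnvironmentUrl & ",5558")
in
    Source
```

You can paste a query like this into a blank query in the Power Query editor, and then continue with the usual navigation steps.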
Capabilities supported
Server URL
Import
DirectQuery (Power BI Datasets)
Advanced
Include relationship columns
Note
The Power Query Dataverse connector is mostly suited towards analytics workloads,
not bulk data extraction. More information: Alternative Dataverse connections
1. Select the Dataverse option from Get Data. More information: Where to get data
2. If you're connecting to this site for the first time, select Sign in and input your
credentials. Then select Connect.
3. In Navigator, select the data you require, then either load or transform the data.
4. If you're using Power Query from Power BI Desktop, you'll be asked to select either
the Import or DirectQuery data connectivity mode. Then select OK.
2. In the Connect to data source page, leave the server URL address blank. Leaving
the address blank lists all of the available environments you have permission to use
in the Power Query Navigator window.
Note
If you need to use port 5558 to access your data, you'll need to load a specific
environment with port 5558 appended at the end in the server URL address.
In this case, go to Finding your Dataverse environment URL for instructions
on obtaining the correct server URL address.
6. In the navigation page, select the data you require, and then select Transform
Data.
The following table lists the advanced options you can set in Power Query Online.
Advanced option: Description
Include relationship columns: If checked, includes columns that might have relationships
to other tables. If this box is cleared, you won't see those columns. More information:
Performance issues related to relationship columns
Once you've selected the advanced options you require, select Next to connect to
Dataverse.
There are certain Tabular Data Stream (TDS) data types that are supported in OData
when using Common Data Service (Legacy) that aren't supported in Dataverse. The
supported and unsupported data types are listed in How Dataverse SQL differs from
Transact-SQL.
All of these features will be added to the Dataverse connector in the future, at which
time the Common Data Service (Legacy) connector will be deprecated.
Use the Azure Synapse Link feature in Power Apps to extract data from Dataverse
into Azure Data Lake Storage Gen2, which can then be used to run analytics. For
more information about the Azure Synapse Link feature, go to What is Azure
Synapse Link for Dataverse?.
Use the OData connector to move data in and out of Dataverse. For more
information on how to migrate data between Dataverse environments using the
dataflows OData connector, go to Migrate data between Dataverse environments
using the dataflows OData connector.
Note
Both the Dataverse connector and the OData APIs are meant to serve analytical
scenarios where data volumes are relatively small. The recommended approach for
bulk data extraction is “Azure Synapse Link”.
Once a database source has been defined, you can specify a native query using the
Value.NativeQuery function.
Power Query M
let
    Source = CommonDataService.Database("[DATABASE]"),
    myQuery = Value.NativeQuery(Source, "[QUERY]", null, [EnableFolding=true])
in
    myQuery
Misspelling a column name might result in an error message about query folding
instead of a message about a missing column.
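To make the placeholders concrete, here is a hedged sketch of a native query against a Dataverse environment. The environment URL is a placeholder, and the table and column names (account, name, accountnumber) are assumptions about your environment rather than values taken from this article.

```powerquery
let
    // Placeholder environment URL; replace with your own.
    Source = CommonDataService.Database("yourenvironmentid.crm.dynamics.com"),
    // Assumed table and column names; adjust them to match your environment.
    Query = "SELECT TOP 10 name, accountnumber FROM account",
    // EnableFolding=true lets later transformation steps fold on top of the native query.
    Accounts = Value.NativeQuery(Source, Query, null, [EnableFolding = true])
in
    Accounts
```

If the query fails with a folding-related message, check the column names in the SELECT list first.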
If you're using the Common Data Service (Legacy) connector, you can use a single query
to access all of the data in the dataset. This connector works differently and returns the
result in "pages" of 5 K records. Although the Common Data Service (Legacy) connector
is more efficient in returning large amounts of data, it can still take a significant amount
of time to return the result.
For large datasets, we recommend that you use Azure Synapse Link instead of these
connectors. Azure Synapse Link is even more efficient than either the Power Query
Dataverse or Common Data Service (Legacy) connectors, and it's specifically designed
around data integration scenarios.
Power Query M
let
    Source = CommonDataService.Database("{crminstance}.crm.dynamics.com", [CreateNavigationProperties=false])
in
    Source
Dataflows
Article • 07/13/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights (Dataflows)
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You must have an existing Dataflow with maker permissions to access the portal, and
read permissions to access data from the dataflow.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
2. In the Get Data dialog box, select Power Platform > Dataflows, and then select
Connect.
3. If this attempt is the first time you're connecting to this site, select Sign in and
input your credentials. Then select Connect.
4. In Navigator, select the Dataflow you require, then either load or transform the
data.
Get data from Dataflows in Power Query
Online
To get data from Dataflows in Power Query Online:
6. In the navigation page, select the data you require, and then select Transform
Data.
To get DirectQuery to run, you need to have Power BI Premium and adjust a few
configuration items in your Power BI workspace. These actions are explained in the
dataflows premium features article.
You're probably using a Dataverse table as the destination for your standard dataflow.
Use the Dataverse/CDS connector instead or consider switching to an analytical
dataflow.
I'm getting data via the dataflow connector, but I'm receiving a 429 error code. How
can I resolve this?
If you receive error code 429, you've probably exceeded the limit of 1000 requests per
minute. The error typically resolves by itself if you wait a minute or two after the
cooldown period ends. This limit is in place to prevent dataflows and other Power BI
functionality from having degraded performance. Continued high load on the service
might degrade performance further, so reduce the number of requests to fewer than
1000 per minute, or adjust your script or model to stay within this limit. You should
also avoid nested joins that re-request dataflow data; instead, stage data and perform
merges within your dataflow rather than in your dataset.
See also
Using the output of Dataflows from other Power Query experiences
Best practices for designing and developing complex dataflows
Best practices for reusing dataflows across environments and workspaces
Delta Sharing
Article • 07/13/2023
Summary
Item Description
Prerequisites
If you use Power BI Desktop, you need to install the November release of Power BI
Desktop or later. Download the latest version.
The data provider sends an activation URL from which you can download a credentials
file that grants you access to the shared data.
After downloading the credentials file, open it with a text editor to retrieve the endpoint
URL and the token.
For detailed information about Delta Sharing, visit Access data shared with you using
Delta Sharing.
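If you'd rather inspect the credentials file from Power Query than from a text editor, the following sketch reads it as JSON. The file path is hypothetical, and the field names endpoint and bearerToken are assumptions based on the Delta Sharing profile file format, so confirm them against your own file.

```powerquery
let
    // Hypothetical local path to the downloaded credentials file.
    CredentialsPath = "C:\Temp\config.share",
    // Assumes the credentials file is a JSON document.
    Profile = Json.Document(File.Contents(CredentialsPath)),
    // Assumed field names; check them against the contents of your file.
    Result = [EndpointUrl = Profile[endpoint], Token = Profile[bearerToken]]
in
    Result
```

Treat the token as a secret: use a query like this only to look up the values, and don't keep it in a shared workbook or report.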
Capabilities supported
Import
Connect to Databricks Delta Sharing in Power
BI Desktop
To connect to Databricks using the Delta Sharing connector, use the following steps:
2. Navigate to the Get Data menu and search for Delta Sharing.
4. Enter the endpoint URL retrieved from the credentials file in the Delta Sharing
Server URL field.
5. Optionally, in the Advanced Options tab you can set a Row Limit for the maximum
number of rows you can download. This is set to 1 million rows by default.
6. Select OK.
7. In the Authentication dialog box, enter the token retrieved from the credentials
file in the Bearer Token field.
8. Select Connect.
You need to make sure that the data loaded with the Delta Sharing connector fits in the
memory of your machine. To ensure this, the connector limits the number of imported
rows to the Row Limit set by the user.
Denodo
Article • 07/13/2023
Note
The following connector article is provided by Denodo, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Denodo website and use the support
channels there.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
To use this connector, you must have installed the Denodo platform, and configured and
started its service. If you connect using an ODBC DSN, you must also have correctly
configured the connection in the ODBC Data Source Administrator.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Connect to an ODBC data source from Power
Query Desktop
To make the connection, take the following steps:
1. In order to connect to data, select Get Data from the Home ribbon and select
Denodo in the Database section.
2. There are two ways to connect to the data source of your choice:
In the DSN or Connection String section of the Denodo Connector dialog box,
provide the Data source name (DSN) or the Connection String depending on the
type of connection you prefer.
The connection string must contain three mandatory parameters: SERVER, PORT,
and DATABASE.
Note
When writing the connection string, keep the following in mind:
1. The connection string must keep the correct order of its parameters: SERVER,
PORT, DATABASE and SSLMode.
2. The name of these parameters must always be written in the same way. For
example, if you choose to write them in upper case, they must always be
written in upper case; if you decide to write them capitalized (writing the first
letter of a word in uppercase and the rest of the letters in lowercase) they
must always be written that way.
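The ordering and casing rules above can be illustrated with a small sketch that assembles a connection string from its three mandatory parts. The server, port, and database values are hypothetical, and the semicolon separator is an assumption based on the usual ODBC connection string style.

```powerquery
let
    // Hypothetical values; replace with your Denodo server details.
    Server = "denodo.example.com",
    Port = 9996,
    Database = "sales",
    // Parameters appear in the required order (SERVER, PORT, DATABASE), all in
    // upper case; an optional SSLMode parameter would follow DATABASE.
    ConnectionString = Text.Combine(
        {"SERVER=" & Server, "PORT=" & Text.From(Port), "DATABASE=" & Database},
        ";"
    )
in
    ConnectionString
```

With these sample values, the result is SERVER=denodo.example.com;PORT=9996;DATABASE=sales, which you could paste into the DSN or Connection String field.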
3. The second section, Enable debug mode, is an optional field that allows you to
add trace information to log files. These files are created by Power BI Desktop
when you enable tracing in the application using the Diagnostics tab in the
Options menu. Note that the default value for Enable debug mode is false, and in
this scenario, there will be no trace data in the log files from Denodo Power BI
custom connector.
4. The third section, Native Query, is an optional field where you can enter a query. If
this query field is used, the resulting dataset will be the result of the query instead
of a table or a set of tables.
You can write a query that queries only one of the databases that the datasource is
associated with.
If you want to write a query that queries more than one database, you have to
specify in the query the database that owns each table.
5. The last section in Denodo Connector is Data connectivity mode, where you can
choose between Import mode or DirectQuery mode.
6. Once you're done, select OK.
7. Before showing the navigator window that displays a preview of the available data
in Denodo Virtual DataPort, you'll be asked for authentication. The Denodo Power
BI custom connector supports two authentication types: Windows and Basic.
In this case:
The Denodo Virtual DataPort database that the data source connects to
must be configured with the option ODBC/ADO.net authentication type
set to Kerberos.
Make sure the Advanced Options page of the DSN configuration contains
all the needed configuration for using Kerberos as an authentication
method.
Basic: This authentication type allows you to connect Power BI Desktop to
your Virtual DataPort data using your Virtual DataPort server credentials.
9. In Navigator, select the data you need from the database you want and choose
Load, or choose Transform Data if you're going to modify the incoming data.
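For the optional Native Query field described in the steps above, the following sketch contrasts a query over a single database with one that spans two databases. The view and database names (film, language, sakila, ewd) are hypothetical, and the database.table prefix is an assumption about how the owning database is specified. The queries are held as M text values only so they can be shown side by side.

```powerquery
let
    // Hypothetical names throughout; replace with views from your Denodo server.
    // Single database: view names need no database prefix.
    SingleDatabaseQuery =
        "SELECT title, name FROM film JOIN language ON film.language_id = language.language_id",
    // Two databases: each view is prefixed with the database that owns it.
    MultiDatabaseQuery = "SELECT i_item_sk, country FROM sakila.country, ewd.item",
    Queries = [Single = SingleDatabaseQuery, Multiple = MultiDatabaseQuery]
in
    Queries
```

Either text value could be pasted into the Native Query field of the Denodo Connector dialog box.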
Connect to an ODBC data source from Power BI
service using the on-premises data gateway
To make the connection, take the following steps:
1. Configure the on-premises data gateway (enterprise gateway) that acts as a bridge,
providing quick and secure data transfer between on-premises data (data in your
Power BI Desktop application, not in the cloud) and the Power BI service.
2. Sign in and register your gateway. In the on-premises data gateway app, select the
Status tab to verify that your gateway is online and ready to be used.
3. Using the gateway settings page in Power BI service, create a data source for the
Denodo Power BI custom connector.
In order to create the data source, you have to specify the way to connect to the
data source of your choice:
Through DSN
Using a connection string
You also have to specify the authentication mode. The available authentication
methods are:
In Data Source Settings, enter the username and password to create the
Kerberos ticket.
The Denodo Virtual DataPort database that the data source connects to
must be configured with the option ODBC/ADO.net authentication type
set to Kerberos.
Make sure the Advanced Options page of the DSN configuration contains
all the needed configuration for using Kerberos as an authentication
method.
Basic: This authentication type allows you to create a data source in Power BI
service to connect to your Virtual DataPort data using your Virtual DataPort
server credentials.
4. If you use Windows authentication, under Advanced settings for the data source
you can enable the single sign-on (SSO) authentication schema in order to use the
same credentials of the user accessing your reports in Power BI for accessing the
required data in Denodo.
There are two options for enabling SSO: Use SSO via Kerberos for DirectQuery
queries and Use SSO via Kerberos for DirectQuery And Import queries. If you're
working with DirectQuery based reports, both options use the SSO credentials of
the user that signs in to the Power BI service. The difference comes when you work
with Import based reports. In this scenario, the former option uses the credentials
entered in the data source page (Username and Password fields), while the latter
uses the credentials of the dataset owner.
It's important to note that there are particular prerequisites and considerations
that you must take into account in order to use the Kerberos-based SSO. Some of
these essential requirements are:
By default, the Microsoft Power BI Gateway sends the user principal name
(UPN) when it performs an SSO authentication operation. Therefore, you'll
need to review the attribute that you'll use as a login identifier in Denodo
Kerberos Authentication and, if it's different from userPrincipalName , adjust
the gateway settings according to this value.
The ADUserNameLookupProperty setting specifies the attribute of the local AD against
which to map the user principal name that comes from Azure AD. So, in this
scenario, ADUserNameLookupProperty should be userPrincipalName . Then, once
the user is found, the ADUserNameReplacementProperty value indicates the
attribute that should be used to authenticate the impersonated user (the
attribute that you'll use as the login identifier in Denodo).
You should also take into account that changes in this configuration file are at
the gateway level, and therefore will affect any source with which SSO
authentication is done through the Microsoft Power BI Gateway.
5. Once a data source is created for the Denodo connector, you can refresh Power BI
reports. To publish a report on powerbi.com, you need to:
Troubleshooting
Preview.Error: The type of the current preview value is too complex to display.
This error is due to a limitation in the Microsoft Power Query platform. In order to work
around it, select the failing data source (query) in the data transformation window and
access the advanced editor with View > Advanced Editor. Then edit the data source
expression in M language adding the following property to the options argument of
the Denodo.Contents function call:
CreateNavigationProperties=false
This property will instruct Power BI not to try and generate navigation properties from
the relationships registered for the Denodo view accessed in this data source. So if you
need some of these relationships to be actually present in your Power BI data model,
you will need to manually register them afterwards.
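As a rough sketch of the edited expression, the source step might end up looking like the following. The DSN name is hypothetical, and the argument positions (data source, debug flag, options) are an assumption, so keep whatever arguments your generated query already contains and only add the field to the options record.

```powerquery
let
    // Hypothetical DSN name; the argument order shown here is an assumption.
    Source = Denodo.Contents("MyDenodoDsn", null, [CreateNavigationProperties = false])
in
    Source
```

If your existing call already passes an options record, add CreateNavigationProperties = false as another field in that record instead of replacing it.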
Digital Construction Works Insights
Article • 07/18/2023
Summary
Item Description
Authentication types supported Digital Construction Works JSON Web Token (JWT)
Prerequisites
Use of this connector requires a Digital Construction Works Integrations Platform
subscription. To learn more, go to
https://www.digitalconstructionworks.com/solutions/the-dcw-integrations-platform.
Visit https://www.digitalconstructionworks.com for company information.
Users of the Digital Construction Works (DCW) Integrations Platform can request a JSON
Web Token (JWT) from their project administrator in order to access data using the DCW
Insights connector. Users can then follow the documentation for the OData API to
connect to the datasets they want to use in Power BI.
Capabilities supported
Import
Connect to DCW Insights OData API from
Power Query Desktop
To connect to a DCW Insights project, take the following steps:
1. Under Get Data in Power BI Desktop, choose the Digital Construction Works
Insights connector from the Online Services.
2. In Insights Api Url, provide the URL to the OData API you want to connect to. You
need to use https, and you need your full project URL and product name included
in the URL. You can also enter query string parameters if the URL calls for them.
3. Select OK.
4. If this is the first time you're connecting to this endpoint, you'll be asked to enter
the JWT used to authorize you for this project. Then select Connect.
For more information about authentication methods, go to Authentication with a
data source.
5. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
editor.
Troubleshooting
OData.Feed
We use the following default settings when using OData.Feed:
This article provides basic information, prerequisites, and instructions on how to connect
to Dynamics 365 Online (legacy) from Power Query.
Summary
Item Description
Products Excel
Power BI (Datasets)
Analysis Services
Capabilities supported
Import
To make the connection to Dynamics 365 Online (legacy), follow these steps:
1. From Get Data, select the Online Services category, select Dynamics 365 Online
(legacy), and then select Connect. More information: Where to get data
2. In the window that appears, enter the server name of your Dynamics 365 Online
(legacy) instance. You can select Advanced to enter other URL parts.
3. Select OK.
4. If you're connecting to this data source for the first time, you can select one of
these authentication types: Anonymous, Windows, Basic, Web API, or
Organizational account. Enter your credentials and select Connect. Your credentials
are remembered the next time you connect.
5. In Navigator, select one or more files or folders to import and use in your
application. Then select either Load to load the table, or Transform Data to open
the Power Query editor where you can filter and refine the set of data you want to
use, and then load that refined set of data.
Eduframe Reporting (Beta)
Article • 07/18/2023
Summary
Item Description
Note
The following connector article is provided by Drieam, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Drieam website and use the support
channels there.
Prerequisites
Before you can sign in to Eduframe Reporting, you must have an Eduframe Admin
account (username/password) and have enabled the Eduframe Reporting integration. To
enable this integration, you can send an email to: support@eduframe.nl.
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Eduframe (Beta), and then select Connect.
2. If this is the first time you're getting data through the Eduframe connector, a
preview connector notice will be displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
3. Enter the Eduframe domain that you want to access (this domain is the URL
without the .eduframe.nl), and choose whether you want to exclude personal
information. Then select OK.
4. To sign in to your Eduframe account, select Sign in.
5. In the window that appears, provide your credentials to sign in to your Eduframe
account and select Sign in.
6. A window then appears where you have to approve this integration with Power
BI. Select approve.
7. Once you've successfully signed in, select Connect.
8. In Navigator, select the information you want, then either select Load to load the
data or Transform Data to continue transforming the data in the Power Query
editor.
Next steps
You can find additional information and templates for this connector on our
documentation page.
EQuIS
Article • 07/26/2023
Summary
Item Description
Prerequisites
To use the EQuIS connector, you must have a valid user account in an EQuIS Enterprise
site (version 7.0.0.19300 or later) that includes a REST API license. Your user account
must be a member of the REST API role. To verify user account configuration, go to the
Roles tab in your user profile and verify that you're a member of the REST API role.
Capabilities supported
Import
1. Select the EQuIS connector in the connector list, then select Connect.
2. Enter the URL of the EQuIS Enterprise site you're connecting to, then select OK.
Basic: Enter your EQuIS username and password for the given EQuIS
Enterprise site.
API Token: Enter an API Token that you generated in EQuIS Enterprise (visit
User Profile ).
Organizational Account: If your EQuIS Enterprise site is appropriately
configured, you may authenticate with Azure Active Directory
4. In Navigator, browse to the dataset or report you want to load, then select Load or
Transform Data. Visit Using EQuIS Data for more information about available
datasets.
Additional Information
For best functionality and performance, EarthSoft recommends that you use the
EQuIS connector with the latest build of EQuIS Enterprise.
When using reports in a facility group, non-administrator users must have
permission to all facilities contained in the facility group.
Only "grid" reports are available in the Navigator.
All datasets consumed by the EQuIS connector use camelCase for column names.
The current version of the EQuIS connector retrieves a dataset in a single API
request (this logic might be optimized in a future version of the connector).
Essbase
Article • 06/09/2023
Summary
Item Description
Prerequisites
Essbase 11.1.2.x version is supported.
Capabilities Supported
Import
Direct Query (Power BI Datasets)
Advanced options
Command timeout in minutes
Server
Application
MDX statement
2. Enter the URL to the Oracle Essbase Hyperion server. Typically, the URL looks like
http://[hostname]:[port number]/aps/XMLA . The components of the URL are:
The port number (for example, 19000) is the port number the APS server is
listening to for XMLA requests.
The last portion of the URL, the path (that is, /aps/XMLA), is case-sensitive
and must be specified exactly as shown.
Optionally, enter values in any advanced options that you want to use to modify
the connection query. More information: Connect using advanced options
4. The first time you connect to a data source (identified by each unique URL), you'll
be prompted to enter account credentials. Enter the User name and Password for
the connection. More information: Authentication with a data source
5. In Navigator, select the data you require. Then, either select Transform data to
transform the data in Power Query Editor, or Load to load the data in Power BI.
Advanced option: Description
Command timeout in minutes: Lets you set the maximum time a command is allowed to run
before Power BI abandons the call. If the command timeout is reached, Power BI may retry
two more times before completely abandoning the call. This setting is helpful for querying
large amounts of data. The default value of the command timeout is 140 seconds.
Server: The name of the server where the optional MDX statement is to run. This value is
case sensitive.
Application: The name of the application where the optional MDX statement is to run. This
value is case sensitive.
MDX statement: Optionally provides a specific MDX statement to the Oracle Essbase server
to execute. Normally, Power BI interactively determines the measures and dimensions of the
cube to return. However, by specifying the MDX statement, the results of that particular
MDX statement will be loaded. When you specify the MDX statement, you must also provide
the Server (for example, essbaseserver-1 ) and Application (for example, Sample ) advanced
options to indicate where the MDX statement is to run. Also, you can only use the MDX
statement in conjunction with Data Connectivity mode set to Import.
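For readers who prefer to see the connection in M, the following is a minimal sketch rather than the code the connector dialog generates. It assumes the `Essbase.Cubes` function with its `CommandTimeout` option (a duration, not a number of minutes), and the host name and port are hypothetical placeholders.

```powerquery
let
    // Hypothetical APS host and port; the /aps/XMLA path is case-sensitive
    Url = "http://essbase-host.example.com:19000/aps/XMLA",
    // CommandTimeout is a duration: #duration(days, hours, minutes, seconds)
    Source = Essbase.Cubes(Url, [CommandTimeout = #duration(0, 0, 30, 0)])
in
    Source
```

Here the timeout is set to 30 minutes. Only the timeout option is shown because the M names of the other advanced options aren't given in this article.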
Choose a measure and all (or specific) dimension levels by selecting the checkbox next
to the name. A preview of the data is provided in the pane on the right. You can select
the Load button to retrieve the data associated with the selection or select the
Transform Data button to set further filters on the data before loading it in Power BI.
While in the Power Query navigator, the same Entity being expanded appears like this:
Be aware that this look is a stylistic decision and that there are no differences in data.
The levels in the Power Query navigator correspond to the hierarchical level.
The reason is that the navigator in Power Query can display at most 10,000 members,
and there can be millions or billions of members underneath a hierarchy. Even
for the case of no member display limit (such as with Power Query Online), navigating
and selecting every individual member in a tree format with so many possible values
quickly becomes tedious and difficult to use.
So, the grouping of the hierarchical levels makes it easier to select what to import, and
the subsequent report generation can use filters to target only the members the end
user wants.
Performance considerations
Interacting with Power BI in DirectQuery mode is very dynamic. When selecting a
checkbox to include a measure or dimension level in the visualization, Power BI Desktop
generates a query and sends it to the Oracle Essbase server to get the results. Power BI
is optimized to cache any repeated queries to improve performance. But if any new
query is generated, it's sent to the Oracle Essbase server to produce a new result.
Depending on the number of selected measures, dimension levels, and the filters
applied, the query might get sent more quickly than the Oracle Essbase server can
respond. To improve performance and increase responsiveness, consider the following
three methods to optimize your interaction with the Oracle Essbase server.
Query reductions options
There are three options to reduce the number of queries sent. In Power BI Desktop,
select the File tab, then select Options and settings > Options, and then select Query
reductions under the Current File section.
Note
These options apply only to the current file you are working on. Current File option
settings are saved with the file and restored when opening the same file.
The following procedure demonstrates how to reduce the chances of retrieving more
data than is necessary when importing data into Power BI by iteratively applying filters
on dimension members at each level.
2. Expand the tree to drill down to your desired server, application, and database
until it exposes the measures and dimensions for your database. For now, select
your measures and only one dimension level. Pick the most important dimension
level. In later steps, you'll build the result by incrementally adding more
dimension levels.
1. Select Edit Queries on the Power BI Desktop ribbon to start the process.
2. If you have members you want to filter on in the initial dimension, select the
column properties button to display the list of available dimension members at
this level. Select only the dimension members you need at this level and then
select OK to apply the filter.
3. The resulting data is now updated with the applied filter. Applied Steps now
contains a new step (Filtered Rows) for the filter you set. You can select the
settings button for the step to modify the filter at a later time.
4. Now you'll add a new dimension level. In this case, you're going to add the next
level down for the same dimension you initially chose. Select Add Items on the
ribbon to bring up the Navigator dialog box.
5. Navigate to the same dimension, but this time select the next level below the first
level. Then select OK to add the dimension level to the result.
6. The result grid now has the data from the new dimension level. Notice that
because you've applied a filter at the top level, only the related members in the
second level are returned.
7. You can now apply a filter to the second-level dimension as you did for the first
level.
8. In this way, each subsequent step ensures only the members and data you need
are retrieved from the server.
9. Now let's add a new dimension level by repeating the previous steps. Select Add
Items on the ribbon bar again.
10. Navigate to the dimension level you want, select it, and then select OK to add the
dimension level to the result.
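The Filtered Rows step described in the procedure above can be illustrated with a small, self-contained sketch. The in-memory table and the column names (Region, Sales) are hypothetical stand-ins for a dimension level and a measure. The step that the Essbase connector actually generates may look different.

```powerquery
let
    // Hypothetical stand-in for one dimension level plus a measure
    Source = #table(
        type table [Region = text, Sales = number],
        {{"East", 100}, {"West", 250}, {"South", 75}}
    ),
    // Keep only the dimension members needed at this level
    #"Filtered Rows" = Table.SelectRows(
        Source,
        each List.Contains({"East", "West"}, [Region])
    )
in
    #"Filtered Rows"
```

Applying the filter before adding the next dimension level keeps each later step working on fewer members.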
1. Drag-and-drop a dimension level from the Fields pane over to the Filters pane. You
can drag the dimension level to the Add data fields here area under Filters on this
visual, Filters on this page, or Filters on all pages, depending on your needs.
2. Once a dimension's level is in the Filter pane and the filter type is set to Basic
filtering, you'll notice that the members of that dimension's level are displayed as
a list of available filters.
3. You can check the members you want to include in your result.
Or you can select the Select all option, then uncheck the members you don't want
to include in your result.
Type some characters in the search field for that filter to find members in the list.
4. When you have filters for two or more levels of the same dimension, you'll notice
that selecting members from a higher level in the dimension changes the members
available in the lower levels of that dimension.
5. When you've finished choosing the members you want in the dimension level filter,
it's a good time to add that dimension level to your visualization. Check the
matching dimension level in the Fields pane and it's then added to your current
visualization.
For more information about adding filters, go to Add a filter to a report in Power BI.
Troubleshooting
This section outlines common issues that you might come across, and includes
troubleshooting steps to address the issues.
Connection issues
Symptom 1
Power BI Desktop returns the error message "Unable to connect to the remote server".
Resolution
1. Ensure the Essbase Analytic Provider Services (APS) server is configured correctly
for the Provider Servers and Standalone Servers in the Essbase Administration
Service (EAS) console. More information: Configuring Essbase Clusters
3. If there's a firewall between Power BI Desktop and the provided hostname, check
to ensure the provided hostname and port can pass outbound through your
firewall.
Validation
When you try to connect again, the error doesn't appear and the Cube and member list is
displayed in the navigation pane. You can also select and display in the preview in Import mode.
Symptom 2
Power BI Desktop returns the error message "We couldn't authenticate with the
credentials provided. Please try again."
Resolution
Ensure the provided username and password are correct. Reenter their values carefully.
The password is case-sensitive.
Validation
After correcting the username and password, you should be able to display the
members and the value in the preview or be able to load the data.
Symptom 3
Power BI Desktop returns the error message "Data at the root level is invalid. Line 1,
position 1."
Resolution
Ensure the Essbase Analytic Provider Services (APS) server is configured correctly for the
Provider Servers and Standalone Servers in the Essbase Administration Service (EAS)
console. More information: Configuring Essbase Clusters .
Validation
Trying to connect again won't show the error and the Cube and member list is displayed
in the navigation pane. You can also select and display in the preview in Import mode.
Symptom 4
Once successfully connected to the Oracle Essbase Analytic Provider Services (APS)
server, there are servers listed below the URL node in the data source navigator.
However, when you expand a server node, no applications are listed below that server
node.
Resolution
We recommend configuring the Oracle Hyperion server to define the provider and
standalone servers through the Essbase Administration Service (EAS) console. Refer to
section Addendum: Registering Provider and Standalone Servers in Essbase
Administration Service (EAS) Console .
Validation
Trying to connect again won't show the error and you can see the Cube and member list
in the navigation pane. You can also select and display in the preview in Import mode.
Power Query returns the error message "The operation has timed out"
Resolution
1. Ensure the network is stable and there's a reliable network path to the Essbase
Analytic Provider Services (APS) server provided in the data source URL.
2. If there's a possibility that the query to the service could return a large amount of
data, specify a long (or longer) command timeout interval. If possible, add filters to
your query to reduce the amount of data returned. For example, select only
specific members of each dimension you want returned.
Validation
Try to load the data again. If the problem persists, increase the timeout interval or
filter the data further. If the problem still persists, try the resolution for
Symptom 3.
Symptom 2
The query returns the error message "Internal error: Query is allocating too large
memory ( > 4GB) and cannot be executed. Query allocation exceeds allocation limits."
Resolution
The query you're trying to execute is producing results greater than the Oracle Essbase
server can handle. Supply or increase the filters on the query to reduce the amount of
data the server will return. For example, select specific members for each level of each
dimension or set numeric limits on the value of measures.
Validation
Try to load the data again. If the problem persists, increase the timeout interval or
filter the data further. If the problem still persists, try the resolution for
Symptom 3.
Symptom 3
Essbase Analytic Provider Services (APS) or Essbase server indicates a large number of
connections with long running sessions.
Resolution
When the connectivity mode is DirectQuery, it's easy to select measures or dimension
levels to add to the selected visualization. However, each new selection creates a new
query and a new session to the Essbase Analytic Provider Services (APS)/Essbase server.
There are a few ways to ensure a reduced number of queries or to reduce the size of
each query result. Review Performance Considerations to reduce the number of times
the server is queried and to also reduce the size of query results.
Validation
An MDX statement returns the error message "The key didn't match any rows in the
table".
Resolution
It's likely that the value or the case of the Server and Application fields don't match.
Select the Edit button and correct the value and case of the Server and Application
fields.
Validation
An MDX statement returns the error message "Unable to get the cube name from the
statement. Check the format used for specifying the cube name".
Resolution
Ensure the database name in the MDX statement's FROM clause is fully qualified with
the application and database name, for example, [Sample.Basic]. Select the Edit button
and correct the fully qualified database name in the MDX statement's FROM clause.
Validation
An MDX statement returns the error message "Essbase Error (1260060): The cube name
XXXX does not match with current application/database"
Resolution
Ensure the application name and the fully qualified database name in the FROM clause
match. Select the Edit button and correct either the application name or the fully
qualified database name in the MDX statement's FROM clause.
Validation
Loading a dimension returns the error message "Essbase Error (1200549): Repeated
dimension [Measures] in MDX query".
Resolution
1. Sign in to the Essbase server, open the Essbase Administration Services Console
and sign in with an admin user (or whoever has permissions over the problematic
database).
2. Navigate to the Essbase server > application > database with the problematic
"Measures" dimension.
5. Select the Dimension Type field and set it to Accounts. Select OK.
Validation
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
To connect to a legacy workbook (such as .xls or .xlsb), the Access Database Engine
OLEDB (or ACE) provider is required. To install this provider, go to the download page
and install the relevant (32 bit or 64 bit) version. If you don't have it installed, you'll see
the following error when connecting to legacy workbooks:
required to read this type of file. To download the client software, visit the
following site: https://go.microsoft.com/fwlink/?LinkID=285987.
ACE can't be installed in cloud service environments. So if you're seeing this error in a
cloud host (such as Power Query Online), you'll need to use a gateway that has ACE
installed to connect to the legacy Excel files.
Capabilities Supported
Import
2. Browse for and select the Excel workbook you want to load. Then select Open.
If the Excel workbook is online, use the Web connector to connect to the
workbook.
3. In Navigator, select the workbook information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
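As a rough illustration, the query behind these steps typically resembles the following sketch. The file path and sheet name are hypothetical, and the steps in your own query depend on what you select in Navigator.

```powerquery
let
    // Hypothetical local path; replace with the workbook you selected
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
    // Navigate to one sheet by its name and kind
    Sheet1 = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data],
    // Use the first row of the sheet as column headers
    Promoted = Table.PromoteHeaders(Sheet1, [PromoteAllScalars = true])
in
    Promoted
```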
Connect to an Excel workbook from Power
Query Online
To make the connection from Power Query Online:
2. In the Excel dialog box that appears, provide the path to the Excel workbook.
4. If this is the first time you've accessed this Excel workbook, select the
authentication kind and sign in to your account (if needed).
5. In Navigator, select the workbook information you want, and then Transform Data
to continue transforming the data in Power Query Editor.
Suggested tables
If you connect to an Excel Workbook that doesn't specifically contain a single table, the
Power Query navigator will attempt to create a suggested list of tables that you can
choose from. For example, consider the following workbook example that contains data
from A1 to C5, more data from D8 to E10, and more from C13 to F16.
When you connect to the data in Power Query, the Power Query navigator creates two
lists. The first list contains the entire workbook sheet, and the second list contains three
suggested tables.
If you select the entire sheet in the navigator, the workbook is displayed as it appeared
in Excel, with all of the blank cells filled with null.
If you select one of the suggested tables, each individual table that Power Query was
able to determine from the layout of the workbook is displayed in the navigator. For
example, if you select Table 3, the data that originally appeared in cells C13 to F16 is
displayed.
Note
If the sheet changes enough, the table might not refresh properly. You might be
able to fix the refresh by importing the data again and selecting a new suggested
table.
Troubleshooting
The answer is a bit complicated, and has to do with how Excel stores numbers using
something called binary floating-point notation. The bottom line is that there are certain
numbers that Excel can't represent with 100% precision. If you crack open the .xlsx file
and look at the actual value being stored, you'll see that in the .xlsx file, 0.049 is actually
stored as 0.049000000000000002. This is the value Power Query reads from the .xlsx,
and thus the value that appears when you select the cell in Power Query. (For more
information on numeric precision in Power Query, go to the "Decimal number" and
"Fixed decimal number" sections of Data types in Power Query.)
If your file has a dimension attribute that points to a single cell (such as <dimension
ref="A1" /> ), Power Query uses this attribute to find the starting row and column of the
data on the sheet.
However, if your file has a dimension attribute that points to multiple cells (such as
<dimension ref="A1:AJ45000"/> ), Power Query uses this range to find the starting row
and column as well as the ending row and column. If this range doesn't contain all the
data on the sheet, some of the data won't be loaded.
You can fix issues caused by incorrect dimensions by doing one of the following actions:
Open and resave the document in Excel. This action will overwrite the incorrect
dimensions stored in the file with the correct value.
Ensure the tool that generated the Excel file is fixed to output the dimensions
correctly.
Update your M query to ignore the incorrect dimensions. As of the December 2020
release of Power Query, Excel.Workbook now supports an InferSheetDimensions
option. When true, this option will cause the function to ignore the dimensions
stored in the Workbook and instead determine them by inspecting the data.
InferSheetDimensions = true])
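A complete query using this option might look like the following sketch. The file path is a hypothetical placeholder, and the options record is passed as the second argument of Excel.Workbook.

```powerquery
let
    // Hypothetical path, for illustration only
    Source = Excel.Workbook(
        File.Contents("C:\Data\Report.xlsx"),
        // Ignore the dimensions stored in the file and infer them from the data
        [DelayTypes = true, InferSheetDimensions = true]
    )
in
    Source
```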
To fix this issue, you can refer to Locate and reset the last cell on a worksheet for
detailed instructions.
You'll notice performance degradation when retrieving very large files from SharePoint.
However, this is only one part of the problem. If you have significant business logic in an
Excel file being retrieved from SharePoint, this business logic may have to execute when
you refresh your data, which could cause complicated calculations. Consider
aggregating and pre-calculating data, or moving more of the business logic out of the
Excel layer and into the Power Query layer.
This error happens when the ACE driver isn't installed on the host computer. Workbooks
saved in the "Strict Open XML Spreadsheet" format can only be read by ACE. However,
because such workbooks use the same file extension as regular Open XML workbooks
(.xlsx), we can't use the extension to display the usual "the Access Database Engine OLEDB
provider may be required to read this type of file" error message.
To resolve the error, install the ACE driver. If the error is occurring in a cloud service,
you'll need to use a gateway running on a computer that has the ACE driver installed.
Usually this error indicates there is a problem with the format of the file.
However, sometimes this error can happen when a file appears to be an Open XML file
(such as .xlsx), but the ACE driver is actually needed in order to process the file. Go to
the Legacy ACE connector section for more information about how to process files that
require the ACE driver.
FactSet RMS (Beta)
Article • 07/13/2023
Summary
Item Description
Note
The following connector article is provided by FactSet, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the FactSet website and use the support
channels there.
Prerequisites
To start using the FactSet RMS connector, the following prerequisite steps need to be
completed.
Download Power BI
Ensure that you're using the latest version of Power BI, as the latest major update to
the FactSet Power BI data connector will only be available there. Any
subsequent major or minor version updates will only be available by upgrading
Power BI.
Capabilities supported
Import
2. On the Power BI Desktop home page, select Get Data > More.
3. To connect to FactSet RMS, search for FactSet in Get Data and select the FactSet
RMS connector from the right-hand list.
4. In the authentication page, you'll be prompted to enter the Username - Serial and
the API key. Go to the FactSet Developer Portal for more instructions on setting up
an API Key.
5. The connector opens the Power Query navigator with a list of all provided
functions. Note that not all functions might be available, depending on your
available subscriptions. Your account team can assist with requirements for access
to additional products.
6. Use the Get* queries to look up parameters for your Notes and create new queries.
A form will populate in the query window with parameter fields to narrow your
universe and return the relevant data set of interest based on IRN Subject, Author,
Date Range, Recommendations and/or Sentiments. Note that the functions contain
Get* queries that are common for IRN Notes, Custom Symbols, and Meetings APIs.
The following table describes the Get functions in the connector.
GetNotes Gets all the notes, including non-extended text custom fields in the
specified date (startDate and endDate) range, and can be filtered on
subjectId, authorId, recommendationId, sentimentId, and
modifiedSince.
GetNote Gets details of a note, including note body and extended text custom
fields.
GetMeetings Gets all the meetings, including non-extended text custom fields in
the specified date (startDate and endDate) range, and can be filtered
on modifiedSince.
GetMeeting Gets details of a meeting, including meeting body and extended text
custom fields.
GetCustomSymbols Gets a list of all custom symbols in your IRN database, along with
standard field data and non-extended text custom fields data, and
can be filtered on CustomSymbolTypeName.
FHIR
Article • 07/25/2023
served by a FHIR server. The Power Query connector for FHIR can be used to import and
shape data from a FHIR server.
If you don't have a FHIR server, you can provision the Azure API for FHIR.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
Prerequisites
You must have a FHIR Data Reader role on the FHIR server to read data from the server.
More information: Assign roles for the FHIR service
Connect to a FHIR server from Power Query
Desktop
To make a connection to a FHIR server, take the following steps:
You can optionally enter an initial query for the FHIR server, if you know exactly
what data you're looking for.
Select OK to proceed.
4. Decide on your authentication scheme.
The connector supports "Anonymous" for FHIR servers with no access controls (for
example, public test servers like http://test.fhir.org/r4 ) or Azure Active
Directory authentication. You must have a FHIR Data Reader role on the FHIR
server to read data from the server. Go to FHIR connector authentication for
details.
6. Shape the data as needed, for example, expand the postal code.
7. Save the query when shaping is complete.
8. Create dashboards with data, for example, make a plot of the patient locations
based on postal code.
1. In Choose data source, search for FHIR, and then select the FHIR connector. More
information: Where to get data
2. In the FHIR dialog, enter the URL for your FHIR server.
You can optionally enter an initial query for the FHIR server, if you know exactly
what data you're looking for.
3. If necessary, include the name of your on-premises data gateway.
4. Select the Organizational account authentication kind, and select Sign in. Enter
your credentials when asked. You must have a FHIR Data Reader role on the FHIR
server to read data from the server.
7. Shape the data as needed, for example, expand the postal code.
Note
In some cases, query folding can't be obtained purely through data shaping
with the graphical user interface (GUI), as shown in the previous image. To
learn more about query folding when using the FHIR connector, see FHIR
query folding.
Next Steps
In this article, you've learned how to use the Power Query connector for FHIR to access
FHIR data. Next explore the authentication features of the Power Query connector for
FHIR.
FHIR® and the FHIR Flame icon are the registered trademarks of HL7 and are used
with the permission of HL7. Use of the FHIR trademark does not constitute
endorsement of this product by HL7.
FHIR connector authentication
Article • 07/25/2023
This article explains authenticated access to FHIR servers using the Power Query
connector for FHIR. The connector supports anonymous access to publicly accessible
FHIR servers and authenticated access to FHIR servers using Azure Active Directory
authentication. The Azure API for FHIR is secured with Azure Active Directory.
Note
If you are connecting to a FHIR server from an online service, such as Power BI
service, you can only use an organizational account.
Anonymous access
There are many publicly accessible FHIR servers . To enable testing with these public
servers, the Power Query connector for FHIR supports the "Anonymous" authentication
scheme. For example, to access the public https://server.fire.ly server:
The expected Audience for the FHIR server must be equal to the base URL of the
FHIR server. For the Azure API for FHIR, you can set this when you provision the
FHIR service or later in the portal.
account to sign in. You can't use a guest account in your active directory tenant.
For the Azure API for FHIR, you must use an Azure Active Directory organizational
account.
If your FHIR service isn't the Azure API for FHIR (for example, if you're running the
open source Microsoft FHIR server for Azure ), you'll have registered an Azure
Active Directory resource application for the FHIR server. You must pre-authorize
the Power BI client application to be able to access this resource application.
The Power Query (for example, Power BI) client will only request a single scope:
user_impersonation . This scope must be available and the FHIR server can't rely on
other scopes.
Next steps
In this article, you've learned how to use the Power Query connector for FHIR
authentication features. Next, explore query folding.
Power Query folding is the mechanism used by a Power Query connector to turn data
transformations into queries that are sent to the data source. This allows Power Query to
off-load as much of the data selection as possible to the data source rather than
retrieving large amounts of unneeded data only to discard it in the client. The Power
Query connector for FHIR includes query folding capabilities, but due to the nature of
FHIR search , special attention must be given to the Power Query expressions to
ensure that query folding is performed when possible. This article explains the basics of
FHIR Power Query folding and provides guidelines and examples.
Power Query M
let
    Source = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null),
    Patient1 = Source{[Name="Patient"]}[Data],
    #"Filtered Rows" = Table.SelectRows(Patient1, each [birthDate] < #date(1980, 1, 1))
in
    #"Filtered Rows"
Instead of retrieving all Patient resources from the FHIR server and filtering them in the
client (Power BI), it's more efficient to send a query with a search parameter to the FHIR
server:
GET https://myfhirserver.azurehealthcareapis.com/Patient?birthdate=lt1980-01-01
With such a query, the client would only receive the patients of interest and would not
need to discard data in the client.
In the example of a birth date, the query folding is straightforward, but in general it is
challenging in FHIR because the search parameter names don't always correspond to
the data field names and frequently multiple data fields will contribute to a single search
parameter.
For example, consider the Observation resource and the category field. The
Observation.category field is a CodeableConcept in FHIR, which has a coding field, which
in turn has system and code fields (among other fields). If you're interested in vital
signs only, you'd want the Observations where
Observation.category.coding.code = "vital-signs" , but the FHIR search would look
To be able to achieve query folding in the more complicated cases, the Power Query
connector for FHIR matches Power Query expressions with a list of expression patterns
and translates them into appropriate search parameters. The expression patterns are
generated from the FHIR specification.
This matching with expression patterns works best when any selection expressions
(filtering) are done as early as possible in the data transformation steps, before any
other shaping of the data.
Note
To give the Power Query engine the best chance of performing query folding, you
should do all data selection expressions before any shaping of the data.
Power Query M
Unfortunately, the Power Query engine no longer recognizes that as a selection pattern
that maps to the category search parameter, but if you restructure the query to:
Power Query M
While the first and the second Power Query expressions will result in the same data set,
the latter will, in general, result in better query performance. It's important to note that
the second, more efficient, version of the query can't be obtained purely through data
shaping with the graphical user interface (GUI). It's necessary to write the query in the
"Advanced Editor".
The initial data exploration can be done with the GUI query editor, but it's
recommended that the query be refactored with query folding in mind. Specifically,
selective queries (filtering) should be performed as early as possible.
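As a rough illustration of filtering before shaping, the following sketch applies the nested vital-signs filter first and only then expands the category columns. It uses a small in-memory table as a hypothetical stand-in for the Observation table, so no folding actually happens here; the step names and sample rows are assumptions made for the example. Against the FHIR connector, the same step order gives the selection pattern the best chance of being recognized.

```powerquery
let
    // Hypothetical in-memory stand-in for the Observation table returned by the connector
    Observations = Table.FromRecords({
        [id = "1", category = Table.FromRecords({[coding = Table.FromRecords({[system = "s", code = "vital-signs"]})]})],
        [id = "2", category = Table.FromRecords({[coding = Table.FromRecords({[system = "s", code = "laboratory"]})]})]
    }),
    // Step 1: select rows first, while the nested category/coding structure is still intact
    FilteredObservations = Table.SelectRows(
        Observations,
        each Table.MatchesAnyRows([category], each Table.MatchesAnyRows([coding], each [code] = "vital-signs"))
    ),
    // Step 2: shape the data only after the selection has been made
    ExpandCategory = Table.ExpandTableColumn(FilteredObservations, "category", {"coding"}, {"category.coding"}),
    ExpandCoding = Table.ExpandTableColumn(
        ExpandCategory,
        "category.coding",
        {"system", "code"},
        {"category.coding.system", "category.coding.code"}
    )
in
    ExpandCoding
```

Reversing the two steps (expanding first, then filtering on the expanded code column) would return the same rows for this sample, but it is the order that the article warns may not fold.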
Summary
Query folding provides more efficient Power Query expressions. A properly crafted
Power Query will enable query folding and thus off-load much of the data filtering
burden to the data source.
Next steps
In this article, you've learned how to use query folding in the Power Query connector for
FHIR. Next, explore the list of FHIR Power Query folding patterns.
This article describes Power Query patterns that will allow effective query folding in
FHIR. It assumes that you are familiar with using the Power Query connector for
FHIR and understand the basic motivation and principles for Power Query folding in
FHIR.
And then consult the list of examples below. There are also examples of combining
these types of filtering patterns in multi-level, nested filtering statements. Finally, this
article provides more complicated filtering expressions that fold to composite search
parameters.
In each example you'll find a filtering expression ( Table.SelectRows ) and right above
each filtering statement a comment // Fold: ... explaining what search parameters
and values the expression will fold to.
M
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "birthdate=lt1980-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] <
#date(1980, 1, 1))
in
FilteredPatients
Filtering Patients by birth date range using and , only the 1970s:
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "birthdate=ge1970-01-01&birthdate=lt1980-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] <
#date(1980, 1, 1) and [birthDate] >= #date(1970, 1, 1))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "birthdate=ge1980-01-01,lt1970-01-01"
FilteredPatients = Table.SelectRows(Patients, each [birthDate] >=
#date(1980, 1, 1) or [birthDate] < #date(1970, 1, 1))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "active=true"
FilteredPatients = Table.SelectRows(Patients, each [active])
in
FilteredPatients
Alternative search for patients where active not true (could include missing):
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "active:not=true"
FilteredPatients = Table.SelectRows(Patients, each [active] <> true)
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "gender=male"
FilteredPatients = Table.SelectRows(Patients, each [gender] = "male")
in
FilteredPatients
Filtering to keep only patients that are not male (includes other):
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "gender:not=male"
FilteredPatients = Table.SelectRows(Patients, each [gender] <> "male")
in
FilteredPatients
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "status=final"
FilteredObservations = Table.SelectRows(Observations, each [status] =
"final")
in
FilteredObservations
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "_lastUpdated=2010-12-31T11:56:02.000+00:00"
FilteredPatients = Table.SelectRows(Patients, each [meta][lastUpdated] =
#datetimezone(2010, 12, 31, 11, 56, 2, 0, 0))
in
FilteredPatients
let
Encounters =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Encounter" ]}[Data],
// Fold: "class=s|c"
FilteredEncounters = Table.SelectRows(Encounters, each [class][system] =
"s" and [class][code] = "c")
in
FilteredEncounters
let
Encounters =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Encounter" ]}[Data],
// Fold: "class=c"
FilteredEncounters = Table.SelectRows(Encounters, each [class][code] =
"c")
in
FilteredEncounters
let
Encounters =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Encounter" ]}[Data],
// Fold: "class=s|"
FilteredEncounters = Table.SelectRows(Encounters, each [class][system] =
"s")
in
FilteredEncounters
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "subject=Patient/1234"
FilteredObservations = Table.SelectRows(Observations, each [subject]
[reference] = "Patient/1234")
in
FilteredObservations
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "subject=1234,Patient/1234,https://myfhirservice/Patient/1234"
FilteredObservations = Table.SelectRows(Observations, each [subject]
[reference] = "1234" or [subject][reference] = "Patient/1234" or [subject]
[reference] = "https://myfhirservice/Patient/1234")
in
FilteredObservations
let
ChargeItems =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"ChargeItem" ]}[Data],
// Fold: "quantity=1"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity]
[value] = 1)
in
FilteredChargeItems
let
ChargeItems =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"ChargeItem" ]}[Data],
// Fold: "quantity=gt1.001"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity]
[value] > 1.001)
in
FilteredChargeItems
let
ChargeItems =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"ChargeItem" ]}[Data],
// Fold: "quantity=lt1.001|s|c"
FilteredChargeItems = Table.SelectRows(ChargeItems, each [quantity]
[value] < 1.001 and [quantity][system] = "s" and [quantity][code] = "c")
in
FilteredChargeItems
let
Consents = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Consent" ]}[Data],
// Fold: "period=sa2010-01-01T00:00:00.000+00:00"
FilteredConsents = Table.SelectRows(Consents, each [provision][period]
[start] > #datetimezone(2010, 1, 1, 0, 0, 0, 0, 0))
in
FilteredConsents
let
Consents = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Consent" ]}[Data],
// Fold: "period=eb2010-01-01T00:00:00.000+00:00"
FilteredConsents = Table.SelectRows(Consents, each [provision][period]
[end] < #datetimezone(2010, 1, 1, 0, 0, 0, 0, 0))
in
FilteredConsents
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "code:text=t"
FilteredObservations = Table.SelectRows(Observations, each [code][text]
= "t")
in
FilteredObservations
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "code:text=t"
FilteredObservations = Table.SelectRows(Observations, each
Text.StartsWith([code][text], "t"))
in
FilteredObservations
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "_profile=http://myprofile"
FilteredPatients = Table.SelectRows(Patients, each
List.MatchesAny([meta][profile], each _ = "http://myprofile"))
in
FilteredPatients
let
AllergyIntolerances =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category=food"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each
List.MatchesAny([category], each _ = "food"))
in
FilteredAllergyIntolerances
let
AllergyIntolerances =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category:missing=true"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each
List.MatchesAll([category], each _ = null))
in
FilteredAllergyIntolerances
let
AllergyIntolerances =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AllergyIntolerance" ]}[Data],
// Fold: "category:missing=true"
FilteredAllergyIntolerances = Table.SelectRows(AllergyIntolerances, each
[category] = null)
in
FilteredAllergyIntolerances
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "family:exact=Johnson"
FilteredPatients = Table.SelectRows(Patients, each
Table.MatchesAnyRows([name], each [family] = "Johnson"))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "family=John"
FilteredPatients = Table.SelectRows(Patients, each
Table.MatchesAnyRows([name], each Text.StartsWith([family], "John")))
in
FilteredPatients
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "family=John,Paul"
FilteredPatients = Table.SelectRows(Patients, each
Table.MatchesAnyRows([name], each Text.StartsWith([family], "John") or
Text.StartsWith([family], "Paul")))
in
FilteredPatients
Filtering Patients on family name starts with John and given starts with Paul :
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "family=John&given=Paul"
FilteredPatients = Table.SelectRows(
Patients,
each
Table.MatchesAnyRows([name], each Text.StartsWith([family],
"John")) and
Table.MatchesAnyRows([name], each List.MatchesAny([given], each
Text.StartsWith(_, "Paul"))))
in
FilteredPatients
let
Goals = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Goal" ]}[Data],
// Fold: "target-date=gt2020-03-01"
FilteredGoals = Table.SelectRows(Goals, each
Table.MatchesAnyRows([target], each [due][date] > #date(2020,3,1)))
in
FilteredGoals
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "identifier=s|v"
FilteredPatients = Table.SelectRows(Patients, each
Table.MatchesAnyRows([identifier], each [system] = "s" and _[value] = "v"))
in
FilteredPatients
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "code=s|c"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([code][coding], each [system] = "s" and [code] = "c"))
in
FilteredObservations
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "code:text=t&code=s|c"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([code][coding], each [system] = "s" and [code] = "c")
and [code][text] = "t")
in
FilteredObservations
Filtering multi-level nested properties
Filtering Patients on family name starts with John and given starts with Paul :
let
Patients = Fhir.Contents("https://myfhirserver.azurehealthcareapis.com",
null){[Name = "Patient" ]}[Data],
// Fold: "family=John&given=Paul"
FilteredPatients =
Table.SelectRows(
Patients,
each
Table.MatchesAnyRows([name], each Text.StartsWith([family],
"John")) and
Table.MatchesAnyRows([name], each List.MatchesAny([given],
each Text.StartsWith(_, "Paul"))))
in
FilteredPatients
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "category=vital-signs"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([category], each Table.MatchesAnyRows([coding], each
[code] = "vital-signs")))
in
FilteredObservations
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "category=s|c"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([category], each Table.MatchesAnyRows([coding], each
[system] = "s" and [code] = "c")))
in
FilteredObservations
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "category=s1|c1,s2|c2"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[category],
each
Table.MatchesAnyRows(
[coding],
each
([system] = "s1" and [code] = "c1") or
([system] = "s2" and [code] = "c2"))))
in
FilteredObservations
let
AuditEvents =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"AuditEvent" ]}[Data],
// Fold: "policy=http://mypolicy"
FilteredAuditEvents = Table.SelectRows(AuditEvents, each
Table.MatchesAnyRows([agent], each List.MatchesAny([policy], each _ =
"http://mypolicy")))
in
FilteredAuditEvents
Filtering Observations on code and value quantity, body height greater than 150:
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "code-value-quantity=http://loinc.org|8302-2$gt150"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and
[code] = "8302-2") and [value][Quantity][value] > 150)
in
FilteredObservations
Filtering on Observation component code and value quantity, systolic blood pressure
greater than 140:
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8480-6$gt140"
FilteredObservations = Table.SelectRows(Observations, each
Table.MatchesAnyRows([component], each Table.MatchesAnyRows([code][coding],
each [system] = "http://loinc.org" and [code] = "8480-6") and [value]
[Quantity][value] > 140))
in
FilteredObservations
Filtering on multiple component code value quantities (AND), diastolic blood pressure
greater than 90 and systolic blood pressure greater than 140:
M
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8462-4$gt90&component-code-value-quantity=http://loinc.org|8480-6$gt140"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[component],
each
Table.MatchesAnyRows([code][coding], each [system] =
"http://loinc.org" and [code] = "8462-4") and [value][Quantity][value] > 90)
and
Table.MatchesAnyRows([component], each
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and
[code] = "8480-6") and [value][Quantity][value] > 140))
in
FilteredObservations
Filtering on multiple component code value quantities (OR), diastolic blood pressure
greater than 90 or systolic blood pressure greater than 140:
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "component-code-value-quantity=http://loinc.org|8462-4$gt90,http://loinc.org|8480-6$gt140"
FilteredObservations =
Table.SelectRows(
Observations,
each
Table.MatchesAnyRows(
[component],
each
(Table.MatchesAnyRows([code][coding], each [system]
= "http://loinc.org" and [code] = "8462-4") and [value][Quantity][value] >
90) or
Table.MatchesAnyRows([code][coding], each [system]
= "http://loinc.org" and [code] = "8480-6") and [value][Quantity][value] >
140 ))
in
FilteredObservations
Filtering Observations on code value quantities on root of resource or in component
array:
let
Observations =
Fhir.Contents("https://myfhirserver.azurehealthcareapis.com", null){[Name =
"Observation" ]}[Data],
// Fold: "combo-code-value-quantity=http://loinc.org|8302-2$gt150"
FilteredObservations =
Table.SelectRows(
Observations,
each
(Table.MatchesAnyRows([code][coding], each [system] =
"http://loinc.org" and [code] = "8302-2") and [value][Quantity][value] >
150) or
(Table.MatchesAnyRows([component], each
Table.MatchesAnyRows([code][coding], each [system] = "http://loinc.org" and
[code] = "8302-2") and [value][Quantity][value] > 150)))
in
FilteredObservations
Summary
Query folding turns Power Query filtering expressions into FHIR search parameters. The
Power Query connector for FHIR recognizes certain patterns and attempts to identify
matching search parameters. Recognizing those patterns will help you write more
efficient Power Query expressions.
Next steps
In this article, we reviewed some classes of filtering expressions that will fold to FHIR
search parameters. Next, read about establishing relationships between FHIR resources.
This article describes how to establish relationships between tables that have been
imported using the Power Query connector for FHIR.
Introduction
FHIR resources are related to each other, for example, an Observation that references a
subject ( Patient ):
JSON
{
  "resourceType": "Observation",
  "id": "1234",
  "subject": {
    "reference": "Patient/456"
  }
}
Some of the resource reference fields in FHIR can refer to multiple different types of
resources (for example, Practitioner or Organization ). To facilitate an easier way to
resolve references, the Power Query connector for FHIR adds a synthetic field to all
imported resources called <referenceId> , which contains a concatenation of the
resource type and the resource ID.
To establish a relationship between two tables, you can connect a specific reference field
on a resource to the corresponding <referenceId> field on the resource you would like
it linked to. In simple cases, Power BI will even detect this for you automatically.
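The idea of matching a reference field to the <referenceId> field can also be sketched directly in M. The example below is only an illustration under stated assumptions: the two in-memory tables are hypothetical stand-ins for imported Patient and Observation tables, and the join is done in the query rather than through a Power BI model relationship.

```powerquery
let
    // Hypothetical stand-ins for tables imported with the connector
    Patients = Table.FromRecords({
        [id = "456", #"<referenceId>" = "Patient/456", gender = "female"]
    }),
    Observations = Table.FromRecords({
        [id = "1234", subject = [reference = "Patient/456"]]
    }),
    // Expand the subject record so the reference becomes a plain text column
    ExpandedSubject = Table.ExpandRecordColumn(Observations, "subject", {"reference"}, {"subject.reference"}),
    // Match subject.reference against the synthetic <referenceId> column on Patient
    Joined = Table.NestedJoin(
        ExpandedSubject, {"subject.reference"},
        Patients, {"<referenceId>"},
        "Patient", JoinKind.LeftOuter
    )
in
    Joined
```

Because <referenceId> already concatenates the resource type and ID, the two key columns can be compared as plain text without any further parsing.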
3. Make any other modifications you need to the query and save the modified query.
5. Establish the relationship. In this simple example, Power BI will likely have detected
the relationship automatically:
Next steps
In this article, you've learned how to establish relationships between tables imported
with the Power Query connector for FHIR. Next, explore query folding with the Power
Query connector for FHIR.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Folder path
Combine
Combine and load
Combine and transform
2. Enter the path to the folder you want to load, or select Browse to browse to the
folder you want to load. Then select OK.
When you select the folder you want to use, file information about all of the
files in that folder is displayed. File information about any files in any
subfolders is also displayed.
3. Select Combine & Transform Data to combine the data in the files of the selected
folder and load the data in the Power Query Editor for editing. Select Combine &
Load to load the data from all of the files in the folder directly into your app. Or
select Transform Data to load the folder data as-is in the Power Query Editor.
Note
The Combine & Transform Data and Combine & Load buttons are the easiest ways
to combine data found in the files of the folder you specify. You could also use the
Load button (in Power BI Desktop only) or the Transform Data button to combine
the files, but that requires more manual steps.
3. Enter the name of an on-premises data gateway that you'll use to access the
folder.
4. Select the authentication kind to connect to the folder. If you select the Windows
authentication kind, enter your credentials.
5. Select Next.
6. In the Navigator dialog box, select Combine to combine the data in the files of the
selected folder and load the data into the Power Query Editor for editing. Or select
Transform data to load the folder data as-is in the Power Query Editor.
Troubleshooting
Combining files
When you combine files using the folder connector, all the files in the folder and its
subfolders are processed the same way, and the results are then combined. The way the
files are processed is determined by the example file you select. For example, if you
select an Excel file and choose a table called "Table1", then all the files will be treated as
Excel files that contain a table called "Table1".
To ensure that combining the files works properly, make sure that all the files in the
folder and its subfolders have the same file format and structure. If you need to exclude
some of the files, first select Transform data instead of Combine and filter the table of
files in the Power Query Editor before combining.
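A minimal sketch of that filtering step is shown below. The folder path is a placeholder, and keeping only .xlsx files is just one assumed exclusion rule; adapt the condition to whatever distinguishes the files you want to combine.

```powerquery
let
    // "C:\Reports" is a hypothetical folder path; replace it with your own
    Source = Folder.Files("C:\Reports"),
    // Keep only Excel workbooks so that every remaining file shares the format of the example file
    ExcelFilesOnly = Table.SelectRows(Source, each Text.Lower([Extension]) = ".xlsx")
in
    ExcelFilesOnly
```

After this step, the remaining files can be combined from the Content column in the Power Query Editor.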
For more information about combining files, go to Combine files in Power Query.
Funnel
Article • 07/18/2023
Note
The following connector article is provided by Funnel, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Funnel website and use the support
channels there.
Summary
Item Description
Prerequisites
To use the Funnel connector, you need a Funnel subscription. Funnel helps you collect
data from all your marketing platforms, transform it, and send it to the destinations you
want, like Power BI (https://funnel.io/).
In the Funnel App, go to your Workspace, navigate to the Microsoft Power BI page in
the left navigation (if you can't see it, please contact us). Follow the instructions on the
page. You need to create a "Data Share" that contains the fields you want to expose in
Power BI.
Capabilities Supported
Import
1. Select Online Services, find Funnel from the product-specific data connector list,
and then select Connect.
5. In the Navigator dialog box, choose one or more Data Shares from your
Workspaces to import your data.
For each Data Share, you can enter the number of rolling months of data you want.
Note
The default number of months is 12. If today is 22.03.2022, then you'll get
data for the period 01.04.2021 - 22.03.2022.
You can then either select Load to load the data or select Transform Data to
transform the data.
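The rolling period in the note can be reproduced with a short date calculation. This is a sketch that assumes the window starts on the first day of the month that lies (months - 1) months before today, which is inferred from the single example in the note rather than from a documented Funnel rule; the variable names are illustrative.

```powerquery
let
    // Hypothetical inputs taken from the example in the note
    Today = #date(2022, 3, 22),
    RollingMonths = 12,
    // Go back (RollingMonths - 1) months, then snap to the first day of that month
    StartDate = Date.StartOfMonth(Date.AddMonths(Today, -(RollingMonths - 1))),
    Period = [Start = StartDate, End = Today]
in
    Period
```

With these inputs, Start evaluates to 1 April 2021 and End to 22 March 2022, which matches the period given in the note.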
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Note
Effective July 2021, Google will discontinue support for sign-ins to Google accounts
from embedded browser frameworks. Due to this change, you will need to
update your Power BI Desktop version to June 2021 to support signing in to
Google.
Prerequisites
Before you can sign in to Google Analytics, you must have a Google Analytics account
(username/password).
Capabilities Supported
Import
Google Analytics 4 (Data API)
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, and then select Google Analytics. Then select
Connect.
2. If this is the first time you're getting data through the Google Analytics connector,
a third-party notice is displayed. Select Don't warn me again with this connector if
you don't want this message to be displayed again. Then select Continue.
3. To connect to Google Analytics data through the legacy "Universal Analytics" API,
select Implementation 1.0. To connect to Google Analytics data through the new
Google Analytics Data API with support for Google Analytics 4, select
Implementation 2.0 (Beta).
5. In the Sign in with Google window that appears, provide your credentials to sign
in to your Google Analytics account. You can either supply an email address or
phone number. Then select Next.
6. Enter your Google Analytics password and select Next.
7. When asked if you want Power BI Desktop to access your Google account, select
Allow.
8. Once you've successfully signed in, select Connect.
Once the connection is established, you’ll see a list of the accounts you have access to.
Drill through the account, properties, and views to see a selection of values, categorized
in display folders.
You can Load the selected table, which brings the entire table into Power BI Desktop, or
you can select Transform Data to edit the query, which opens Power Query Editor. You
can then filter and refine the set of data you want to use, and then load that refined set
of data into Power BI Desktop.
1. Select Google Analytics from the Power Query - Choose data source page.
2. From the connection page, enter a connection name and choose an on-premises
data gateway if necessary.
Note
Currently, the Google Analytics sign-in dialog boxes indicate that you are
signing in to Power Query Desktop. This wording will be changed in the
future.
Once the connection is established, you’ll see a list of the accounts you have
access to. Drill through the account, properties, and views to see a selection of
values, categorized in display folders.
8. Select Transform data to edit the query in Power Query Editor. You can then filter
and refine the set of data you want to use, and then load that refined set of data
into Power Apps.
Troubleshooting
To make sure that the data you're seeing is the same as you would get from Google
Analytics, you can execute the query yourself in Google's interactive tool. To understand
what data Power Query is retrieving, you can use Query Diagnostics to see which
query parameters are being sent to Google Analytics.
If you follow the instructions for Query Diagnostics and run Diagnose Step on any
Added Items, you can see the generated results in the Diagnostics Data Source Query
column. We recommend running this with as few additional operations as possible on
top of your initial connection to Google Analytics, so that you can tell whether data is
missing from what Google Analytics returned or is being lost in a later Power Query
transform.
Depending on your query, the row containing the emitted API call to Google Analytics
may not be in the same place. But for a simple Google Analytics only query, you'll
generally see it as the last row that has content in that column.
In the Data Source Query column, you'll find a record with the following pattern:
Request:
GET https://www.googleapis.com/analytics/v3/data/ga?ids=ga:<GA Id>&metrics=ga:users&dimensions=ga:source&start-date=2009-03-12&end-date=2020-08-11&start-index=1&max-results=1000&quotaUser=<User>%40gmail.com HTTP/1.1
<Content placeholder>
Response:
HTTP/1.1 200 OK
Content-Length: -1
<Content placeholder>
From this record, you can see you have your Analytics view (profile) ID, your list of
metrics (in this case, just ga:users), your list of dimensions (in this case, just referral
source), the start-date and end-date, the start-index, max-results (set to 1000
for the editor by default), and the quotaUser.
You can copy these values into the Google Analytics Query Explorer to validate that
the same data you're seeing returned by your query is also being returned by the API.
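If it helps to read the individual values before copying them, the request URL can be split into its parameters with Uri.Parts. This is only a convenience sketch: the URL below is a made-up example with placeholder ids and quotaUser values, not output captured from a real diagnostics run.

```powerquery
let
    // Hypothetical request URL in the shape shown above; ids and quotaUser are placeholders
    RequestUrl = "https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&metrics=ga:users&dimensions=ga:source&start-date=2009-03-12&end-date=2020-08-11&start-index=1&max-results=1000&quotaUser=user%40gmail.com",
    // Uri.Parts returns a record; its Query field holds the query-string parameters
    Parameters = Uri.Parts(RequestUrl)[Query],
    // One row per parameter (Name, Value) for easier reading
    ParameterTable = Record.ToTable(Parameters)
in
    ParameterTable
```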
If your error is around a date range, you can easily fix it. Go into the Advanced Editor.
You'll have an M query that looks something like this (at a minimum; there may be
other transforms on top of it).
Power Query M
let
Source = GoogleAnalytics.Accounts(),
#"<ID>" = Source{[Id="<ID>"]}[Data],
#"UA-<ID>-1" = #"<ID>"{[Id="UA-<ID>-1"]}[Data],
#"<View ID>" = #"UA-<ID>-1"{[Id="<View ID>"]}[Data],
#"Added Items" = Cube.Transform(#"<View ID>",
{
{Cube.AddAndExpandDimensionColumn, "ga:source", {"ga:source"},
{"Source"}},
{Cube.AddMeasureColumn, "Users", "ga:users"}
})
in
#"Added Items"
You can do one of two things. If you have a Date column, you can filter on the Date. This
is the easier option. If you don't care about breaking it up by date, you can Group
afterwards.
If you don't have a Date column, you can manually manipulate the query in the
Advanced Editor to add one and filter on it. For example:
Power Query M
let
Source = GoogleAnalytics.Accounts(),
#"<ID>" = Source{[Id="<ID>"]}[Data],
#"UA-<ID>-1" = #"<ID>"{[Id="UA-<ID>-1"]}[Data],
#"<View ID>" = #"UA-<ID>-1"{[Id="<View ID>"]}[Data],
#"Added Items" = Cube.Transform(#"<View ID>",
{
{Cube.AddAndExpandDimensionColumn, "ga:date", {"ga:date"},
{"Date"}},
{Cube.AddAndExpandDimensionColumn, "ga:source", {"ga:source"},
{"Source"}},
{Cube.AddMeasureColumn, "Organic Searches",
"ga:organicSearches"}
}),
#"Filtered Rows" = Table.SelectRows(#"Added Items", each [Date] >=
#date(2019, 9, 1) and [Date] <= #date(2019, 9, 30))
in
#"Filtered Rows"
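If the per-date breakdown isn't needed after filtering, the rows can be grouped again. The sketch below uses a small in-memory table as a hypothetical stand-in for the output of the Added Items step, so the sample values are assumptions; only the filter and group steps are the point.

```powerquery
let
    // Hypothetical stand-in for the output of the "Added Items" step
    AddedItems = Table.FromRecords({
        [Date = #date(2019, 8, 31), Source = "google", #"Organic Searches" = 5],
        [Date = #date(2019, 9, 1), Source = "google", #"Organic Searches" = 7],
        [Date = #date(2019, 9, 15), Source = "google", #"Organic Searches" = 3],
        [Date = #date(2019, 9, 20), Source = "bing", #"Organic Searches" = 2]
    }),
    // Same date filter as in the query above
    FilteredRows = Table.SelectRows(
        AddedItems,
        each [Date] >= #date(2019, 9, 1) and [Date] <= #date(2019, 9, 30)
    ),
    // Collapse the Date column by summing the measure per source
    Grouped = Table.Group(
        FilteredRows,
        {"Source"},
        {{"Organic Searches", each List.Sum([#"Organic Searches"]), type number}}
    )
in
    Grouped
```

For this sample, the August row is filtered out and the result has one row per source: 10 organic searches for google and 2 for bing.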
Next steps
Google Analytics Dimensions & Metrics Explorer
Google Analytics Core Reporting API
Google BigQuery
Article • 07/13/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Note
Effective July 2021, Google will discontinue support for sign-ins to Google accounts
from embedded browser frameworks. Due to this change, you will need to
update your Power BI Desktop version to June 2021 to support signing in to
Google.
Prerequisites
You'll need a Google account or a Google service account to sign in to Google BigQuery.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Connect to Google BigQuery data from Power Query Desktop
To connect to Google BigQuery from Power Query Desktop, take the following steps:
1. In the Get Data experience, search for and select Google BigQuery.
2. If you want to use any advanced options, select Advanced options. Otherwise,
select OK to continue. More information: Connect using advanced options
4. A Sign in with Google dialog appears. Select your Google account and approve
connecting to Power BI Desktop.
1. In the Get Data experience, select the Database category, and then select Google
BigQuery.
2. In the Google BigQuery Database dialog, you may need to either create a new
connection or select an existing connection. If you're using on-premises data,
select an on-premises data gateway. Then select Sign in.
3. A Sign in with Google dialog appears. Select your Google account and approve
connecting.
Note
Although the sign-in dialog box says you'll continue to Power BI Desktop once
you've signed in, you'll be sent to your online app instead.
4. If you want to use any advanced options, select Advanced options. More
information: Connect using advanced options
Advanced option    Description
Billing Project ID    A project against which Power Query will run queries. Permissions and billing are tied to this project. If no Billing Project ID is provided, by default the first available project returned by Google APIs will be used.
Use Storage Api    A flag that enables using the Storage API of Google BigQuery. This option is true by default. This option can be set to false to not use the Storage API and use REST APIs instead.
Connection timeout duration    The standard connection setting (in seconds) that controls how long Power Query waits for a connection to complete. You can change this value if your connection doesn't complete before 15 seconds (the default value).
Command timeout duration    How long Power Query waits for a query to complete and return results. The default depends on the driver default. You can enter another value in minutes to keep the connection open longer.
Project ID    The project that you want to run native queries on. This option is only available in Power Query Desktop.
SQL statement    For information, go to Import data from a database using native database query. In this version of native database query functionality, you need to use fully qualified table names in the format Database.Schema.Table , for example SELECT * FROM DEMO_DB.PUBLIC.DEMO_TABLE . This option is only available in Power Query Desktop.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Google BigQuery data.
Connector availability
The Google BigQuery connector is available in Power BI Desktop and in the Power BI
service. In the Power BI service, the connector can be accessed using the Cloud-to-Cloud
connection from Power BI to Google BigQuery.
with REST API: Access Denied: Project <project name>: The user <user name>
bigquery.jobs.create permissions in project <project name>.
In this case, you might need to enter a Billing Project ID in the Billing Project advanced
option in the Power Query Connection settings.
In addition, if you also create a report in Power BI service using a gateway, you might
still get this error. In this case, you must manually include the Billing Project ID in the M
code for the connection using the Power Query editor or the Power Query formula bar.
For example:
Source = GoogleBigQuery.Database([BillingProject="Include-Billing-Project-Id-Here"])
Nested fields
Google BigQuery performs well with large datasets when the data is denormalized,
flattened, and nested.
The Google BigQuery connector supports nested fields, which are loaded as text
columns in JSON format.
Users should select Transform Data and then use the JSON parsing capabilities in the
Power Query Editor to extract the data.
1. Under the Transform ribbon tab, in the Text Column category, select Parse and then
JSON.
2. Extract the JSON record fields using the Expand Column option.
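The two steps above correspond roughly to the following M sketch. The inline table is a hypothetical stand-in for a BigQuery table whose nested field arrives as JSON text; the column names `id`, `address`, `city`, and `zip` are assumptions made for illustration only.

```powerquery
let
    // Hypothetical sample data: "address" holds a nested field loaded as JSON text
    Source = #table(
        type table [id = Int64.Type, address = text],
        {
            {1, "{""city"": ""Seattle"", ""zip"": ""98101""}"},
            {2, "{""city"": ""Austin"", ""zip"": ""73301""}"}
        }
    ),
    // Equivalent of Parse > JSON: turn each JSON string into a record
    Parsed = Table.TransformColumns(Source, {{"address", Json.Document}}),
    // Equivalent of Expand Column: pull the record fields out into columns
    Expanded = Table.ExpandRecordColumn(
        Parsed, "address", {"city", "zip"}, {"address.city", "address.zip"}
    )
in
    Expanded
```

The steps that the Power Query Editor generates for your own data will use your column and field names, and may include additional type-conversion steps.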
When you authenticate through a Google service account in Power BI service or Power
Query Online, you need to use "Basic" authentication. The Username field maps to the
Service Account Email field, and the Password field maps to the Service Account
JSON key file contents field. The format requirements for each credential remain
the same across Power BI Desktop, Power BI service, and Power Query Online.
You can resolve this issue by granting the user the correct permissions for the BigQuery
Storage API. The following permissions are required to access data with the BigQuery
Storage API:
bigquery.readsessions.create : Creates a new read session via the BigQuery
Storage API.
bigquery.readsessions.getData : Reads data from a read session via the BigQuery
Storage API.
bigquery.readsessions.update : Updates a read session via the BigQuery Storage
API.
These permissions typically are provided in the BigQuery.User role. More information:
Google BigQuery Predefined roles and permissions
If the above steps don't resolve the problem, you can disable the BigQuery Storage API.
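As a minimal sketch, disabling the Storage API in M might look like the following. It assumes that the Use Storage Api advanced option is exposed in the options record as `UseStorageApi`; the billing project value is a placeholder, not a real project.

```powerquery
let
    // Assumption: "my-billing-project" is a placeholder Billing Project ID
    Source = GoogleBigQuery.Database([
        BillingProject = "my-billing-project",
        // Fall back to the REST APIs instead of the BigQuery Storage API
        UseStorageApi = false
    ])
in
    Source
```

Setting the same option through Advanced options in the connection dialog should have the same effect; editing the M directly is mainly useful for existing queries.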
Unable to use DateTime type data in DirectQuery mode
There's a known issue where the DateTime type isn't supported through DirectQuery.
Selecting a column with the DateTime type will cause an "Invalid query" error or a visual
error.
Google BigQuery (Azure AD) (Beta)
Article • 08/03/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You need an Azure Active Directory account to sign in to Google BigQuery (Azure AD).
Capabilities supported
Import
DirectQuery (Power BI Datasets)
1. In the Get Data experience, search for and select Google BigQuery (Azure AD).
More information: Where to get data
2. Specify a Billing Project ID, which is required for the use of this connector. If you
want to use any advanced options, select Advanced options. Otherwise, select
either Import or DirectQuery, and then select OK to continue. More information:
Connect using advanced options
3. The Google BigQuery (Azure AD) connector supports connecting through an Azure
Active Directory account. Select Sign In to continue.
4. Once signed in, select Connect to continue.
5. Once you successfully connect, a Navigator window appears and displays the data
available on the server. Select your data in the navigator. Then select either
Transform Data to transform the data in Power Query or Load to load the data in
Power BI Desktop.
1. In the Get Data experience, search for Google, and then select Google
BigQuery (Azure AD). More information: Where to get data
2. In the Google BigQuery (Azure AD) dialog, you may need to either create a new
connection or select an existing connection. If you're creating a new connection,
enter the Billing Project ID. If you're using on-premises data, select an on-
premises data gateway.
3. If you want to use any advanced options, select Advanced options. More
information: Connect using advanced options
5. Once you successfully connect, a Navigator window appears and displays the data
available on the server. Select your data in the navigator. Then select Next to
transform the data in Power Query.
The following table lists all of the advanced options you can set in Power Query Desktop
and Power Query Online.
Advanced option: Description
Use Storage Api: A flag that enables using the Storage API of Google BigQuery. This option is true by default. Set it to false to use REST APIs instead of the Storage API.
Connection timeout duration: The standard connection setting (in seconds) that controls how long Power Query waits for a connection to complete. You can change this value if your connection doesn't complete before 15 seconds (the default value).
Command timeout duration: How long Power Query waits for a query to complete and return results. The default depends on the driver default. You can enter another value in minutes to keep the connection open longer.
Audience Uri:
ProjectID: The project that you want to run native queries on. This option is only available in Power Query Desktop.
Native query: For information, go to Import data from a database using native database query. In this version of native database query functionality, you need to use fully qualified table names in the format Database.Schema.Table, for example SELECT * FROM DEMO_DB.PUBLIC.DEMO_TABLE. This option is only available in Power Query Desktop.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Google BigQuery data.
OIDC configurations
The Google BigQuery (Azure AD) connector uses Azure AD JWT tokens to connect
Azure AD-based authentication with Google's Workforce Federation feature. The
authentication side must therefore use an OIDC-based setup that aligns with the Azure
AD JWT tokens. Reach out to your Google BigQuery point-of-contact for further
information on authentication setup and support on the Google side.
Nested fields
Google BigQuery performs well with large datasets when the data is denormalized,
flattened, and nested.
The Google BigQuery (Azure AD) connector supports nested fields, which are loaded as
text columns in JSON format.
Users should select Transform Data and then use the JSON parsing capabilities in the
Power Query editor to extract the data.
1. Under the Transform ribbon tab, in the Text Column category, select Parse and then
JSON.
2. Extract the JSON record fields using the Expand Column option.
Unable to authenticate with Google BigQuery Storage API
The Google BigQuery (Azure AD) connector uses Google BigQuery Storage API by
default. This feature is controlled by the advanced option called UseStorageApi. You
might encounter issues with this feature if you use granular permissions. In this scenario,
you might see the following error message or fail to get any data from your query:
You can resolve this issue by granting the user the correct permissions for the BigQuery
Storage API. The following permissions are required to access data with the BigQuery
Storage API:
bigquery.readsessions.create : Creates a new read session via the BigQuery
Storage API.
bigquery.readsessions.getData : Reads data from a read session via the BigQuery
Storage API.
bigquery.readsessions.update : Updates a read session via the BigQuery Storage
API.
These permissions typically are provided in the BigQuery.User role. More information:
Google BigQuery Predefined roles and permissions
If the above steps don't resolve the problem, you can disable the BigQuery Storage API.
Once you've enabled Azure AD SSO for all data sources, then enable Azure AD SSO for
Google BigQuery:
4. Under the Data Source Settings tab, enter a value in Billing Project ID. The Billing
Project ID parameter is required when using Azure AD and needs to be specified
in Advanced settings. Also, select Use SSO via Azure AD for DirectQuery queries.
Google Sheets
Article • 07/13/2023
Summary
Item Description
Prerequisites
Before you can use the Google Sheets connector, you must have a Google account and
have access to the Google Sheet you're trying to connect to.
Capabilities Supported
Import
1. In the Get Data experience, search for and select Google Sheets.
2. You'll be prompted for a Google Sheets URL. Copy and paste the URL from your
browser address bar into the input prompt.
Multiple connections
This connector uses a different ResourcePath for every Google Sheet URL. You'll need to
authenticate to every new resource path and URL, but you might not need to sign into
Google multiple times if the previous sessions remain active.
Spreadsheet ID from the URL to include in the Google Sheets API call. The rest of the
URL isn't used. Each Google Sheet connection is tied to the submitted URL, which will
act as the ResourcePath.
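To illustrate which part of the URL matters, the following sketch pulls the spreadsheet ID out of a typical Google Sheets URL. It is only an illustration and not the connector's actual implementation; the URL and the ID in it are made-up placeholders, and the sketch assumes the common `/spreadsheets/d/<id>/` URL layout.

```powerquery
let
    // Hypothetical URL copied from a browser address bar
    Url = "https://docs.google.com/spreadsheets/d/1AbC_exampleSpreadsheetId/edit#gid=0",
    // The spreadsheet ID is the path segment that follows "/d/"
    SpreadsheetId = Text.BetweenDelimiters(Url, "/d/", "/")
in
    SpreadsheetId
```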
Hadoop File (HDFS)
Article • 01/24/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Capabilities Supported
Import
1. From Get Data, select the Other category, select Hadoop File (HDFS), and then
select Connect. More information: Where to get data
2. In the window that appears, enter the server name of your Hadoop File (HDFS)
instance.
3. Select OK.
4. Select anonymous access, Windows access, or Microsoft account, select the level
to apply the settings to, and then connect. For more information, see:
Authentication with a data source.
5. Select either Load to load the table, or Transform Data to open the Power Query
Editor where you can filter and refine the set of data you want to use, and then
load that refined set of data.
Hive LLAP
Article • 07/13/2023
Summary
Item Description
Prerequisites
An Apache Hive LLAP username and password.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Thrift Transport Protocol
HTTP
Standard
2. Enter the URL to the Apache Hive LLAP server. You can also enter an optional port
number. Typically, the URL looks like http://[hostname]:[port number] . The
components of the URL are:
The hostname (for example, hivellaphttp.southcentralus.contoso.com ) is the
hostname or IP address of the Apache Hive server.
The port number (for example, 10500) is the port number for the Apache Hive
server. If the port number isn't specified, the default value is 10501 for the
HTTP transport protocol and 10500 for the standard transport protocol.
3. In Thrift Transport Protocol, select either Standard for TCP mode, or HTTP for
HTTP mode.
5. Select OK to continue.
6. The first time you connect to a data source (identified by each unique URL), you'll
be prompted to enter account credentials. Select the appropriate type of
authentication and enter your credentials for the connection.
8. In Navigator, select the data you require. Then select either Transform data to
transform the data in Power Query Editor or Load to load the data in Power BI
Desktop.
1. Select the Hive LLAP option in the Power Query - Choose data source page.
2. Enter the URL to the Apache Hive LLAP server. You can also enter an optional port
number. Typically, the URL looks like http://[hostname]:[port number] . The
components of the URL are:
3. In Thrift Transport Protocol, select either Standard for TCP mode, or HTTP for
HTTP mode.
5. If you're connecting to this Hive LLAP data for the first time, select the type of
credentials for the connection in Authentication kind.
9. In Navigator, select the data you require, then select Transform data to transform
the data in the Power Query editor.
1. Sign in to your Power BI account, and navigate to the Gateway management page.
2. Add a new data source under the gateway cluster you want to use.
5. Select the option to Use SSO via Kerberos for DirectQuery queries or Use SSO via
Kerberos for DirectQuery and Import queries.
More information: Configure Kerberos-based SSO from Power BI service to on-premises
data sources
Troubleshooting
1. In Power BI Desktop, select File > Options and settings > Data source settings.
2. In Data source settings, select the Hive LLAP source you created, and then select
Edit Permissions.
3. In Edit Permissions, under Encryption, clear the Encrypt connections check box.
4. Select OK, and then in Data source settings, select Close.
5. Redo the steps in Connect to Hive LLAP data from Power Query Desktop.
If you get this error and you see the following message in a Fiddler trace, the cause is
an SSL issue.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
By default, the IBM Db2 database connector uses the Microsoft driver to connect to
your data. If you choose to use the IBM driver in the advanced options in Power Query
Desktop, you must first install the IBM Db2 driver for .NET on the machine used to
connect to the data. The name of this driver changes from time to time, so be sure to
install the IBM Db2 driver that works with .NET. For instructions on how to download,
install, and configure the IBM Db2 driver for .NET, go to Download initial Version 11.5
clients and drivers . More information: Driver limitations, Ensure the IBM Db2 driver is
installed
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Driver (IBM or Microsoft)
Command timeout in minutes
Package collection
SQL statement
Include relationship columns
Navigate using full hierarchy
2. Specify the IBM Db2 server to connect to in Server. If a port is required, specify it
by using the format ServerName:Port, where Port is the port number. Also, enter
the IBM Db2 database you want to access in Database. In this example, the server
name and port are TestIBMDb2server.contoso.com:4000 and the IBM Db2 database
being accessed is NORTHWD2 .
Note
By default, the IBM Db2 database dialog box uses the Microsoft driver during
sign in. If you want to use the IBM driver, open Advanced options and select
IBM. More information: Connect using advanced options
If you select DirectQuery as your data connectivity mode, the SQL statement
in the advanced options will be disabled. DirectQuery currently does not
support query push down on top of a native database query for the IBM Db2
connector.
4. Select OK.
5. If this is the first time you're connecting to this IBM Db2 database, select the
authentication type you want to use, enter your credentials, and then select
Connect. For more information about authentication, go to Authentication with a
data source.
By default, Power Query attempts to connect to the IBM Db2 database using an
encrypted connection. If Power Query can't connect using an encrypted
connection, an "unable to connect" dialog box will appear. To connect using an
unencrypted connection, select OK.
6. In Navigator, select the data you require, then either select Load to load the data
or Transform Data to transform the data.
1. Select the IBM Db2 database option in the Power Query - Connect to data source
page.
2. Specify the IBM Db2 server to connect to in Server. If a port is required, specify it
by using the format ServerName:Port, where Port is the port number. Also, enter
the IBM Db2 database you want to access in Database. In this example, the server
name and port are TestIBMDb2server.contoso.com:4000 and the IBM Db2 database
being accessed is NORTHWD2 .
You must select an on-premises data gateway for this connector, whether the
IBM Db2 database is on your local network or online.
4. If this is the first time you're connecting to this IBM Db2 database, select the type
of credentials for the connection in Authentication kind. Choose Basic if you plan
to use an account that's created in the IBM Db2 database instead of Windows
authentication.
8. In Navigator, select the data you require, then select Transform data to transform
the data in Power Query Editor.
Connect using advanced options
Power Query provides a set of advanced options that you can add to your query if
needed.
The following table lists all of the advanced options you can set in Power Query.
Advanced option: Description
Driver: Determines which driver is used to connect to your IBM Db2 database. The choices are IBM and Microsoft (default). If you select the IBM driver, you must first ensure that the IBM Db2 driver for .NET is installed on your machine. This option is only available in Power Query Desktop. More information: Ensure the IBM Db2 driver is installed
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer.
Package collection: Specifies where to look for packages. Packages are control structures used by Db2 when processing an SQL statement, and will be automatically created if necessary. By default, this option uses the value NULLID. Only available when using the Microsoft driver. More information: DB2 packages: Concepts, examples, and common problems
SQL statement: For information, go to Import data from a database using native database query.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won't see those columns.
Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your IBM Db2 database.
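For reference, the advanced options above map onto the options record of `DB2.Database` in M. The following is a sketch rather than the exact code the dialog generates: the server and database names are the example values used earlier in this article, and the 20-minute timeout is an arbitrary illustrative value.

```powerquery
let
    Source = DB2.Database(
        "TestIBMDb2server.contoso.com:4000",
        "NORTHWD2",
        [
            // Driver: "Microsoft" (default) or "IBM"
            Implementation = "Microsoft",
            // Command timeout, expressed as a duration (here 20 minutes)
            CommandTimeout = #duration(0, 0, 20, 0),
            // Package collection (Microsoft driver only)
            PackageCollection = "NULLID",
            // Include relationship columns
            CreateNavigationProperties = true,
            // Navigate using full hierarchy
            HierarchicalNavigation = false
        ]
    )
in
    Source
```

Any option you leave out of the record keeps its default value.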
Driver limitations
The Microsoft driver is the same one used in Microsoft Host Integration Server, called
the "ADO.NET Provider for DB2". The IBM driver is the IBM Db2 driver that works with
.NET. The name of this driver changes from time to time, so be sure it's the one that
works with .NET, which is different from the IBM Db2 drivers that work with OLE/DB,
ODBC, or JDBC.
You can choose to use either the Microsoft driver (default) or the IBM driver if you're
using Power Query Desktop. Currently, Power Query Online only uses the Microsoft
driver. Each driver has its limitations.
Microsoft driver
Doesn't support Transport Layer Security (TLS)
IBM driver
The IBM Db2 database connector, when using the IBM Db2 driver for .NET,
doesn't work with Mainframe or IBM i systems
Doesn't support DirectQuery
Microsoft provides support for the Microsoft driver, but not for the IBM driver. However,
if your IT department already has it set up and configured on your machines, your IT
department should know how to troubleshoot the IBM driver.
Troubleshooting
[System.Data.Common.DbProviderFactories]::GetFactoryClasses() | ogv
3. In the dialog box that opens, you should see the following name in the
InvariantName column:
IBM.Data.DB2
If this name is in the InvariantName column, the IBM Db2 driver has been installed and
configured correctly.
Typically, most IBM Db2 administrators don't provide bind package authority to end
users—especially in an IBM z/OS (mainframe) or IBM i (AS/400) environment. Db2 on
Linux, Unix, or Windows is different in that user accounts have bind privileges by default,
which create the MSCS001 (Cursor Stability) package in the user’s own collection (name
= user login name).
If you don't have bind package privileges, you'll need to ask your Db2 administrator for
package binding authority. With this package binding authority, connect to the
database and fetch data, which will auto-create the package. Afterwards, the
administrator can revoke the package binding authority. The administrator can also
"bind copy" the package to other collections to increase concurrency, to better match
your internal standards for where packages are bound, and so on.
When connecting to IBM Db2 for z/OS, the Db2 administrator can do the following
steps.
1. Grant authority to bind a new package to the user with one of the following
commands:
2. Using Power Query, connect to the IBM Db2 database and retrieve a list of
schemas, tables, and views. The Power Query IBM Db2 database connector will
auto-create the package NULLID.MSCS001, and then grant execute on the package
to public.
3. Revoke authority to bind a new package to the user with one of the following
commands:
When connecting to IBM Db2 for Linux, Unix, or Windows, the Db2 administrator can do
the following steps.
2. Using Power Query, connect to the IBM Db2 database and retrieve a list of
schemas, tables, and views. The Power Query IBM Db2 connector will auto-create
the package NULLID.MSCS001, and then grant execute on the package to public.
When connecting to IBM Db2 for i, the Db2 administrator can do the following steps.
Microsoft Db2 Client: The host resource could not be found. Check that the Initial
This error message indicates that you entered an incorrect value for the name of the
database.
Double check the name, and confirm that the host is reachable. For example, use ping in
a command prompt to attempt to reach the server and ensure the IP address is correct,
or use telnet to communicate with the server.
The port is specified at the end of the server name, separated by a colon. If omitted, the
default value of 50000 is used.
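The ServerName:Port convention can be sketched in M as follows. `SplitServer` is a hypothetical helper written for illustration and isn't part of the connector; it only shows how an omitted port falls back to 50000.

```powerquery
let
    // Hypothetical helper: split "ServerName:Port" and apply the default port
    SplitServer = (server as text) as record =>
        let
            Parts = Text.Split(server, ":"),
            Port = if List.Count(Parts) > 1 then Number.FromText(Parts{1}) else 50000
        in
            [Host = Parts{0}, Port = Port],
    // Port given explicitly
    WithPort = SplitServer("TestIBMDb2server.contoso.com:4000"),
    // Port omitted, so the default applies
    WithoutPort = SplitServer("TestIBMDb2server.contoso.com")
in
    {WithPort, WithoutPort}
```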
To find the port Db2 is using for Linux, Unix, and Windows, run this command:
To find for certain what port the DRDA service is running on:
5. Press F14 to see the port numbers instead of names, and scroll until you see the
port in question. It should have an entry with a state of “Listen”.
More information
HIS - Microsoft OLE DB Provider for DB2
Impala database
Article • 07/13/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Connection timeout duration
Command timeout duration
3. In the Impala window that appears, type or paste the name of your Impala server
into the box. You can Import data directly into Power BI or you can use
DirectQuery. Learn more about using DirectQuery. Then select OK.
4. When prompted, enter your credentials or connect anonymously. The Impala
connector supports Anonymous, Database (user name and password), and
Windows authentication.
4. If this is the first time you're connecting to this Impala database, select the type of
credentials for the connection in Authentication kind.
8. In Navigator, select the data you require, then select Transform data to transform
the data in the Power Query editor.
Connection timeout duration: Specifies the maximum time Power Query will wait for a connection to complete. You can enter another value to keep the connection open longer.
Command timeout duration: Specifies the maximum time a command is allowed to run before Power Query abandons the call.
The Impala connector is supported on the on-premises data gateway, using any of
the three supported authentication mechanisms.
The Impala connector uses the Impala driver, which limits the size of string types to
32K by default.
JSON
Article • 07/18/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Import
1. Select the JSON option in the Get Data selection. This selection launches a local
file browser where you can select your JSON file.
2. Select Open to open the file.
Loading the JSON file will automatically launch the Power Query Editor. Power Query
uses automatic table detection to seamlessly flatten the JSON data into a table. From
the editor, you can then continue to transform the data if you want, or you can just close
and apply. More information: Automatic table detection from JSON files
5. Select Next.
Loading the JSON file will automatically launch the Power Query Editor. Power Query
uses automatic table detection to seamlessly flatten the JSON data into a table. From
the editor, you can then continue to transform the data if you want, or you can just save
and close to load the data. More information: Automatic table detection from JSON files
With the addition of automatic table detection capabilities, using the JSON connector in
Power Query will automatically apply transformation steps to flatten the JSON data into
a table. Previously, users had to flatten records and lists manually.
Troubleshooting
If you see the following message, the file might be invalid: for example, it isn't
really a JSON file, or it's malformed. Or you might be trying to load a JSON Lines file.
If you're trying to load a JSON Lines file, the following sample M code converts all JSON
Lines input to a single flattened table automatically:
Power Query M
let
    // Read the file into a list of lines
    Source = Table.FromColumns({Lines.FromBinary(File.Contents("C:\json-lines-example.json"), null, null)}),
    // Transform each line using Json.Document
    #"Transformed Column" = Table.TransformColumns(Source, {"Column1", Json.Document})
in
    #"Transformed Column"
You'll then need to use an Expand operation to combine the lines together.
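One way to write that Expand step in M is sketched below. To keep the example self-contained, it uses two hypothetical in-memory lines instead of a file, and the field names `id` and `name` are assumptions; the list of fields to expand is computed from the parsed records rather than hard-coded.

```powerquery
let
    // Hypothetical sample: two JSON Lines records held in memory instead of a file
    SampleLines = {
        "{""id"": 1, ""name"": ""a""}",
        "{""id"": 2, ""name"": ""b""}"
    },
    Source = Table.FromColumns({SampleLines}),
    // Parse each line into a record
    Parsed = Table.TransformColumns(Source, {"Column1", Json.Document}),
    // Collect every field name that appears in any record
    FieldNames = List.Union(List.Transform(Parsed[Column1], Record.FieldNames)),
    // Expand the records into one column per field
    Expanded = Table.ExpandRecordColumn(Parsed, "Column1", FieldNames)
in
    Expanded
```

Computing the field list this way handles lines whose records don't all share the same fields; missing fields come through as null.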
KQL Database (Preview)
Article • 07/27/2023
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You must have read permissions on the KQL database.
Capabilities supported
Import
DirectQuery (Power BI Datasets)
1. In Get Data, select Microsoft Fabric (preview) or search for KQL, select KQL
Database, and then select Connect. More information: Where to get data
2. In KQL Database, fill in the cluster and any optional fields, such as Database.
To get the cluster URI, navigate to your KQL database in the Fabric service and
copy the Query URI.
3. If this is the first time you're connecting to this site, select Sign in and
input your credentials. Then select Connect.
4. In Navigator, select the tables you require, then either load or transform the data.
1. In Choose data source, search for KQL, and then select KQL Database. More
information: Where to get data
2. In Connect to data source, fill in the cluster and any optional fields, such as
Database.
To get the cluster URI, navigate to your KQL database in the Fabric service and
copy the Query URI.
6. In Choose data, select the data you require, and then select Transform Data.
The following table lists all of the advanced options you can set in Power Query Desktop
and Power Query Online.
Limit query result record number: The maximum number of records to return in the result.
Limit query result data size in Bytes: The maximum data size in bytes to return in the result.
Disable result-set truncation: Enable or disable result truncation by using the notruncation request option.
Additional Set Statements: Sets query options for the duration of the query. Query options control how a query executes and returns results. Multiple Set statements can be separated by semicolons.
LinkedIn Sales Navigator (Beta)
Article • 01/24/2023
Summary
Item Description
Prerequisites
A LinkedIn Sales Navigator account. If you don't already have an account, sign up for a
free trial .
Capabilities supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, then scroll until you see LinkedIn Sales Navigator
(Beta).
Select Connect to continue.
2. You'll be advised that you're connecting to a third-party connector that's still under
development.
3. When you select Continue, you're prompted to specify which data you want.
4. In the LinkedIn Sales Navigator window that appears, select which data you want
to return, either All contacts or Selected contacts from the first drop-down
selector. You can then specify the start and end dates to constrain the data it
receives to a particular time window.
5. Once you've provided the information, Power BI Desktop connects to the data
associated with your LinkedIn Sales Navigator contract. Use the same email
address you use to sign in to LinkedIn Sales Navigator through the website.
6. When you connect successfully, you're prompted to select the required data from
your LinkedIn Sales Navigator contract from the Navigator.
Once you've selected the data you require, either select Transform Data to
continue transforming the data in the Power Query editor, or select Load to load
the data into Power BI Desktop. Once in Power BI Desktop, you can create
whatever reports you like with your LinkedIn Sales Navigator data.
Getting help
If you run into problems when connecting to your data, contact LinkedIn Sales
Navigator support .
Mailchimp (Deprecated)
Article • 01/24/2023
Summary
Item Description
Products -
Deprecation
This connector is deprecated, and support for it will end soon. We recommend that you
transition off existing connections that use this connector, and that you don't use this
connector for new connections.
Microsoft Azure Consumption Insights (Beta) (Deprecated)
Article • 02/17/2023
Summary
Item Description
Products —
Deprecation
Note
This connector is deprecated because of end of support for the Microsoft Azure
Consumption Insights service. We recommend that users transition off existing
connections using this connector, and don't use this connector for new
connections.
Transition instructions
Users are instructed to use the certified Microsoft Azure Cost Management connector as
a replacement. The table and field names are similar and should offer the same
functionality.
Timeline
The Microsoft Azure Consumption Insights service will stop working in December 2021.
Users should transition off the Microsoft Azure Consumption Insights connector to the
Microsoft Azure Cost Management connector by December 2021.
Microsoft Exchange
Article • 01/24/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Analysis Services
Capabilities Supported
Import
1. From Get Data, select the Other category, select Microsoft Exchange, and then
select Connect. More information: Where to get data
2. In the Microsoft Exchange window that appears, enter the mailbox address for the
account you would like to access.
3. Select OK.
4. Choose either Exchange account sign in and provide your credentials, or Microsoft
account and sign in.
You can also use a User Principal Name (UPN). It looks similar to an email address.
Typical format is user@domain_name.
5. In Navigator, select the data to import and use in your application. Then select
either Load to load the table, or Transform Data to open the Power Query Editor
where you can filter and refine the set of data you want to use, and then load that
refined set of data.
Microsoft Exchange Online
Article • 07/18/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Capabilities Supported
Import
1. From Get Data, select the Online Services category, select Microsoft Exchange
Online, and then select Connect. More information: Where to get data
2. In the Microsoft Exchange Online window that appears, enter the mailbox address
for the account you would like to access.
3. Select OK.
4. Choose either Exchange account sign in and provide your credentials, or Microsoft
account and sign in.
You can also use a User Principal Name (UPN). It looks similar to an email address.
Typical format is user@domain_name.
5. In Navigator, select the data to import and use in your application. Then select
either Load to load the table, or Transform Data to open the Power Query Editor
where you can filter and refine the set of data you want to use, and then load that
refined set of data.
Connect to Microsoft Exchange Online from
Power Query Online
Power Query Online includes Power BI (Dataflows) and Customer Insights (Dataflows) as
experiences.
1. Select the Microsoft Exchange Online option in the connector selection. More
information: Where to get data
2. Enter the mailbox address for the account you would like to access, and any other
connection details if necessary. Select Next.
Summary
Item Description
Products —
Deprecation
Summary
Item Description
Prerequisites
Your organization must have a configured MicroStrategy environment. The user account
on the MicroStrategy environment must have access to the Power BI connector.
Capabilities Supported
Import
Data refresh
3. If this is the first time you're connecting to the MicroStrategy for Power BI
connector, a third-party notice appears. Select Don't warn me again with this
connector, and then select Continue.
Note
If you want to utilize OIDC authentication, you must add a #OIDCMode string
to the end of the URL.
7. Select OK.
a. Standard/LDAP
b. Library/OIDC
Important
i. Select Sign in. A popup appears with the external sign-in site (either
MicroStrategyLibrary or OIDC provider).
ii. Follow the required steps to authenticate with the chosen method.
9. Choose the report or cube you want to import to Power BI by navigating through
the Navigation Table.
Refresh MicroStrategy data using Power BI
Online
Note
1. Publish the dataset imported with Power BI Desktop using the MicroStrategy for
Power BI connector.
3. If this is the first time you're connecting to this database, select the authentication
type and enter your credentials.
Now follow the steps required to set up the scheduled refresh/refresh in Power BI
Online.
MongoDB Atlas SQL interface
Article • 07/25/2023
Summary
Item Description
Prerequisites
To use the MongoDB Atlas SQL connector, you must have an Atlas federated database
set up.
If some or all of your data comes from an Atlas cluster, you must use MongoDB
version 5.0 or greater for that cluster to take advantage of Atlas SQL.
We also recommend that you install the MongoDB Atlas SQL ODBC Driver before using
the MongoDB Atlas SQL connector.
Capabilities Supported
Import
2. Select Database from the categories on the left, select MongoDB Atlas SQL, and
then select Connect.
3. If you're connecting to the MongoDB Atlas SQL connector for the first time, a
third-party notice is displayed. Select "Don't warn me again with this connector"
if you don't want this message to be displayed again.
Select Continue.
4. In the MongoDB Atlas SQL window that appears, fill in the following values:
The MongoDB URI. Required. Use the MongoDB URI obtained in the
prerequisites. Make sure that it doesn't contain your username and password.
URIs containing username and/or passwords are rejected.
Your federated Database name. Required
Use the name of the federated database obtained in the prerequisites.
Select OK.
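Because URIs that contain a username or password are rejected, it can help to check a
URI before you paste it into the dialog. The following Python sketch is illustrative
only and isn't part of the connector. The function name and the sample host name are
hypothetical.

```python
from urllib.parse import urlsplit


def uri_has_credentials(uri: str) -> bool:
    """Return True if the URI embeds a username or password (user:pass@host)."""
    parts = urlsplit(uri)
    return parts.username is not None or parts.password is not None


# Hypothetical URIs, used only to show the check.
with_credentials = (
    "mongodb://myuser:secret@example-federated.a.query.mongodb.net/"
    "?ssl=true&authSource=admin"
)
without_credentials = (
    "mongodb://example-federated.a.query.mongodb.net/?ssl=true&authSource=admin"
)

print(uri_has_credentials(with_credentials))
print(uri_has_credentials(without_credentials))
```

If the check reports embedded credentials, remove them from the URI and enter the
username and password in the credentials step instead.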
5. Enter your Atlas MongoDB Database access username and password and select
Connect.
Note
Once you enter your username and password for a particular Atlas federated
database, Power BI Desktop uses those same credentials in subsequent
connection attempts. You can modify those credentials by going to File >
Options and settings > Data source settings.
The MongoDB URI. Required. Use the MongoDB URI obtained in the
prerequisites. Make sure that it doesn't contain your username and password.
URIs containing username and/or passwords are rejected.
Your federated Database name. Required
Use the name of the federated database obtained in the prerequisites.
Enter a Connection name.
Choose a Data gateway.
Enter your Atlas MongoDB Database access username and password and
select Next.
3. In the Navigator screen, select the data you require, and then select Transform
data. This selection opens the Power Query editor so that you can filter and refine
the set of data you want to use.
Troubleshooting
When the connection can't be established successfully, the generic error message The
driver returned invalid (or failed to return) SQL_DRIVER_ODBC_VER: 03.80 is
displayed. Start by checking your credentials and that you have no network issues
accessing your federated database.
Next steps
You might also find the following information useful:
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Authentication Types Supported Windows (Power BI Desktop, Excel, online service with
gateway)
Database (Power BI Desktop, Excel)
Basic (online service with gateway)
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You need to install the Oracle MySQL Connector/NET package prior to using this
connector in Power BI Desktop. This component must also be installed on the machine
running the on-premises data gateway in order to use this connector in Power Query
Online (dataflows) or Power BI service. The MySQL connector requires the MySQL
Connector/NET package to be correctly installed. To determine if the package has
installed correctly, open a PowerShell window and run the following command:
[System.Data.Common.DbProviderFactories]::GetFactoryClasses() | Out-GridView
If the package is installed correctly, the MySQL Data Provider is displayed in the
resulting dialog. For example:
If the package doesn't install correctly, work with your MySQL support team or reach out
to MySQL.
Capabilities Supported
Import
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
1. Select the MySQL database option in the connector selection. More information:
Where to get data
2. In the MySQL database dialog, provide the name of the server and database.
3. Select the Database authentication type and input your MySQL credentials in the
User name and Password boxes.
Note
6. In Navigator, select the data you require, then either load or transform the data.
1. Select the MySQL database option in the connector selection. More information:
Where to get data
2. In the MySQL database dialog, provide the name of the server and database.
3. If necessary, include the name of your on-premises data gateway.
4. Select the Basic authentication kind and input your MySQL credentials in the
Username and Password boxes.
7. In Navigator, select the data you require, then select Transform data to transform
the data in Power Query Editor.
The following table lists all of the advanced options you can set in Power Query
Desktop.
Advanced option Description
Command timeout in minutes: If your connection lasts longer than 10 minutes (the
default timeout), you can enter another value in minutes to keep the connection open
longer.
SQL statement: For information, go to Import data from a database using native
database query.
Include relationship columns: If checked, includes columns that might have
relationships to other tables. If this box is cleared, those columns aren't included.
Navigate using full hierarchy: If checked, the navigator displays the complete
hierarchy of tables in the database you're connecting to. If cleared, the navigator
displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your MySQL database.
OData Feed
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Basic
Advanced
URL parts
Open type columns
Select related tables
Note
Microsoft Graph is not supported. More information: Lack of Support for Microsoft
Graph in Power Query
2. Choose the Basic button and enter a URL address in the text box. This URL should
be the root of the OData service you want to connect to. For example, enter
http://services.odata.org/V4/northwind/northwind.svc/ . Then select OK.
If the URL address you enter is invalid, a warning icon will appear next to the
URL textbox.
3. If this is the first time you're connecting using the OData Feed, select the
authentication type, input your credentials (if necessary), and select the level to
apply the authentication settings to. Then select Connect.
4. From the Navigator dialog, you can select a table, then either transform the data
in the Power Query Editor by selecting Transform Data, or load the data by
selecting Load.
If you have multiple tables that have a direct relationship to one or more of the
already selected tables, you can select the Select Related Tables button. When you
do, all tables that have a direct relationship to one or more of the already selected
tables will be imported as well.
2. In the OData dialog that appears, enter a URL in the text box.
3. If this is the first time you're connecting using the OData Feed, select the
authentication kind and enter your credentials (if necessary). Then select Next.
4. From the Navigator dialog, you can select a table, then transform the data in the
Power Query Editor by selecting Transform Data.
If you have multiple tables that have a direct relationship to one or more of the
already selected tables, you can select the Select Related Tables button. When you
do, all tables that have a direct relationship to one or more of the already selected
tables will be imported as well.
Joins
Due to the architecture of OData and other web connectors, joins can be
non-performant. While you have the option to use navigation columns when merging
between tables from an OData source, you don't have this option when merging with
non-OData sources.
If you're seeing performance issues when merging an OData source, you should apply
Table.Buffer to your OData query in the advanced editor, before you merge the data.
When you enter credentials for an OData service into Power BI service (for example,
after publishing a PBIX that uses OData.Feed), Power BI service will test the credentials
but will ignore any query options that were specified in the M query. These query
options might have been specified directly in the formula (for example, using the
formula bar or advanced editor), or might have been added by the Power Query editor
by default. You can find the full list of these query options in OData.Feed.
We were unable to connect because this credential type isn’t supported for this
resource. Please choose another credential type.
Contact the service owner. They'll either need to change the authentication
configuration or build a custom connector.
To get around this limitation, start with the root OData endpoint and then navigate and
filter inside Power Query. Power Query filters this URL locally when the URL is too long
for SharePoint to handle. For example, start with:
OData.Feed("https://contoso.sharepoint.com/teams/sales/_api/ProjectData")
instead of
OData.Feed("https://contoso.sharepoint.com/teams/sales/_api/ProjectData/Projects?select=_x0031_MetricName...etc...")
Connect with data by using Power BI
and OData queries
Article • 02/24/2023
Azure DevOps Services | Azure DevOps Server 2022 - Azure DevOps Server 2019
Using OData queries is the recommended approach for pulling data into Power BI.
OData (Open Data Protocol) is an ISO/IEC approved, OASIS standard that defines best
practices for building and consuming REST APIs. To learn more, see OData
documentation.
To get started quickly, check out the Overview of sample reports that use OData queries.
For information about other approaches, see Power BI integration overview.
Power BI can run OData queries, which can return a filtered or aggregated set of data to
Power BI. OData queries have two advantages:
All filtering is done server-side. Only the data you need is returned, which leads to
shorter refresh times.
You can pre-aggregate data server-side. An OData query can carry out
aggregations such as work item rollup and build failure rates. The aggregations are
accomplished server-side, and only the aggregate values are returned to Power BI.
With pre-aggregation, you can carry out aggregations across large data sets,
without needing to pull all the detail data into Power BI.
Prerequisites
To view Analytics data and query the service, you need to be a member of a project
with Basic access or greater. By default, all project members are granted
permissions to query Analytics and define Analytics views.
To learn about other prerequisites regarding service and feature enablement and
general data tracking activities, see Permissions and prerequisites to access
Analytics.
Use Visual Studio Code to write and test OData
queries
The easiest way to write and test OData is to use Visual Studio Code with the OData
extension. Visual Studio Code is a free code editor available on Windows, Mac, and
Linux. The OData extension provides syntax highlighting and other functions that are
useful for writing and testing queries.
The following query returns the top 10 work items under a specific area path. Replace
{organization}, {project}, and {area path} with your values.
https://analytics.dev.azure.com/{organization}/{project}/_odata/v3.0-preview/WorkItems?
$select=WorkItemId,Title,WorkItemType,State,CreatedDate
&$filter=startswith(Area/AreaPath,'{area path}')
&$orderby=CreatedDate desc
&$top=10
To query across projects, omit /{project} entirely.
For more information about how to write OData queries against Analytics, see OData
query quick reference.
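As a rough illustration of how the placeholders in the query above fit together, the
following Python sketch assembles the same URL and drops the project segment for a
cross-project query. The function name, parameter names, and sample values are
hypothetical, and the sketch leaves the query options unencoded so that they stay
readable.

```python
from typing import Optional
from urllib.parse import quote

ANALYTICS_ROOT = "https://analytics.dev.azure.com"


def work_items_query(organization: str, area_path: str,
                     project: Optional[str] = None, top: int = 10) -> str:
    """Build the WorkItems query; omit the project to query across projects."""
    scope = f"{organization}/{project}" if project else organization
    # OData string literals escape a single quote by doubling it.
    literal = area_path.replace("'", "''")
    options = [
        "$select=WorkItemId,Title,WorkItemType,State,CreatedDate",
        f"$filter=startswith(Area/AreaPath,'{literal}')",
        "$orderby=CreatedDate desc",
        f"$top={top}",
    ]
    # quote() percent-encodes spaces in the path but keeps the "/" separators.
    return (f"{ANALYTICS_ROOT}/{quote(scope)}/_odata/v3.0-preview/WorkItems?"
            + "&".join(options))


print(work_items_query("fabrikam", "Fabrikam\\Web", project="Fiber"))
print(work_items_query("fabrikam", "Fabrikam\\Web"))
```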
After you've written the query in Visual Studio Code, you should see syntax highlighting:
Select OData: Open. This action combines the multiline query into a one-line URL and
opens it in your default browser.
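The combining step can be pictured as joining the trimmed lines of the query. The
following Python sketch shows that idea only. It isn't the extension's
implementation, the function name is hypothetical, and encoding spaces as %20 is an
assumption about what a browser-ready URL needs.

```python
def combine_odata_query(multiline_query: str) -> str:
    """Join a multiline OData query into a single-line URL."""
    fragments = [line.strip() for line in multiline_query.splitlines()]
    one_line = "".join(fragment for fragment in fragments if fragment)
    # Assumption: spaces are percent-encoded so the URL can be pasted anywhere.
    return one_line.replace(" ", "%20")


QUERY = """
https://analytics.dev.azure.com/{organization}/{project}/_odata/v3.0-preview/WorkItems?
    $select=WorkItemId,Title,WorkItemType,State,CreatedDate
    &$filter=startswith(Area/AreaPath,'{area path}')
    &$orderby=CreatedDate desc
    &$top=10
"""

print(combine_odata_query(QUERY))
```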
The OData query result set is in JSON format. To view the results, install the JSON
Formatter extension for your browser. Several options are available for both Chrome
and Microsoft Edge.
If the query has an error, the Analytics service returns an error in JSON format. For
example, this error states that the query has selected a field that doesn't exist:
After you've verified that the query works correctly, you can run it from Power BI.
Note
In your filename.odata file, you might want to first create a copy of the multiline
query text and then run OData: Combine on the copy. You do this because there's
no way to convert the single-line query back to a readable multiline query.
In Visual Studio Code, place your cursor anywhere in the query text, and then select View
> Command Palette. In the search box, type odata and then, in the results list, select
OData: Combine.
In the OData feed window, in the URL box, paste the OData query that you copied in
the preceding section, and then select OK.
[Implementation="2.0",OmitValues = ODataOmitValues.Nulls,ODataVersion = 4]
Note
The following action is required for Power BI to successfully run an OData query
against the Azure DevOps Analytics Service.
Select OK to close the Advanced Editor and return to the Power BI Power Query
Editor. You can use Power Query Editor to perform these optional actions:
Related articles
Sample Power BI Reports by using OData queries
Data available from Analytics
Grant permissions to access Analytics
Power BI integration overview
ODBC
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Before you get started, make sure you've properly configured the connection in the
Windows ODBC Data Source Administrator. The exact process here depends on the
driver.
Capabilities Supported
Import
Advanced options
Connection string (non-credential properties)
SQL statement
Supported row reduction clauses
2. In From ODBC, select the data source name (DSN) from the Data source name
(DSN) drop-down box. In this example, a DSN name of SQL Server Database is
used.
You can also choose Advanced options to enter more optional connection
information. More information: Connect using advanced options
4. If this is the first time you're connecting to this database, select the authentication
type and input your credentials when prompted.
The authentication types available are:
Default or Custom: Select this authentication type when you don't specify
any credentials because you're using a DSN configured with a username and
password, or when you need to include credentials as connection string
properties.
Windows: Select this authentication type if you want to connect using
Windows authentication. Optionally, include any connection string properties
you need.
Database: Select this authentication type to use a username and password to
access a data source with an ODBC driver. Optionally, include any connection
string properties you need. This is the default selection.
6. In the Navigator, select the database information you want, then either select Load
to load the data or Transform Data to continue transforming the data in Power
Query Editor.
4. Choose the authentication kind to sign in, and then enter your credentials.
5. Select Next.
6. In the Navigator, select the database information you want, and then select
Transform data to continue transforming the data in Power Query Editor.
Connection string (non-credential properties): Provides an optional connection string
that can be used instead of the Data source name (DSN) selection in Power BI Desktop.
If Data source name (DSN) is set to (None), you can enter a connection string here
instead. For example, the following connection strings are valid: dsn=<myDSN> or
driver=<myDriver>;port=<myPortNumber>;server=<myServer>;database=<myDatabase>;. To
escape special characters, use { } characters. Keys for connection strings vary
between different ODBC drivers. Consult your ODBC driver provider for more information
about valid connection strings.
SQL statement: Provides a SQL statement, depending on the capabilities of the driver.
Ask your vendor for more information, or go to Import data from a database using
native database query.
Supported row reduction clauses: Enables folding support for Table.FirstN. Select
Detect to find supported row reduction clauses, or select from one of the drop-down
options (TOP, LIMIT and OFFSET, LIMIT, or ANSI SQL-compatible). This option isn't
applicable when using a native SQL statement. Only available in Power Query Desktop.
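To make the connection string format more concrete, the following Python sketch builds
a non-credential connection string and wraps values that contain special characters in
{ } characters. It's a minimal illustration under stated assumptions: the function
names and the list of credential keys are hypothetical, and doubling a closing brace
inside a braced value follows a common ODBC convention that your driver might not
share.

```python
from typing import Mapping

# Assumption: keys that usually carry credentials and so don't belong in this field.
CREDENTIAL_KEYS = {"uid", "pwd", "user", "username", "password"}


def escape_value(value: str) -> str:
    """Wrap a value in braces when it contains characters that need escaping."""
    if any(ch in value for ch in ";{}") or value != value.strip():
        return "{" + value.replace("}", "}}") + "}"
    return value


def build_connection_string(properties: Mapping[str, str]) -> str:
    """Join key=value pairs with semicolons, rejecting credential keys."""
    parts = []
    for key, value in properties.items():
        if key.lower() in CREDENTIAL_KEYS:
            raise ValueError(f"'{key}' is a credential; enter it when prompted instead")
        parts.append(f"{key}={escape_value(str(value))}")
    return ";".join(parts) + ";"


print(build_connection_string({
    "driver": "myDriver",
    "port": "1433",
    "server": "myServer",
    "database": "sales;archive",
}))
```

Keeping credentials out of the string matches the intent of the option's name: the
username and password go in the authentication prompt, not in this field.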
Summary
Item Description
Products Excel
Power BI (Datasets)
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
1. From Get Data, select the Other category, select OLE DB, and then select Connect.
More information: Where to get data
2. In the OLE DB window that appears, enter your connection string. Optionally, you
can provide other SQL query information in the Advanced options tab.
Tip
3. Select OK.
4. Choose the kind of authentication you'd like to use: Default or Custom, Windows,
or Database.
5. In Navigator, review and select data from your database. Then select either Load
to load the table, or Transform Data to open the Power Query Editor where you
can filter and refine the set of data you want to use, and then load that refined set
of data.
OpenSearch Project (Beta)
Article • 07/18/2023
Summary
Item Description
Prerequisites
Microsoft Power BI Desktop
OpenSearch
OpenSearch SQL ODBC driver
Capabilities supported
Import
DirectQuery (Power BI Datasets)
4. Enter host and port values and select your preferred SSL option. Then select OK.
7. Select Load.
Troubleshooting
If you get an error indicating the driver wasn't installed, install the OpenSearch SQL
ODBC Driver.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Supported Oracle versions:
Before you can connect to an Oracle database using Power Query, you need to install
the Oracle Client for Microsoft Tools (OCMT).
To connect to an Oracle database with the on-premises data gateway, 64-bit OCMT
must be installed on the computer running the gateway. For more information, go to
Manage your data source - Oracle.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
SQL statement
Include relationship columns
Navigate using full hierarchy
OCMT is free software. It can be downloaded from the Oracle Client for Microsoft Tools
page. For 64-bit Power BI Desktop and Power BI service, use 64-bit OCMT. For 32-bit
Power BI Desktop, use 32-bit OCMT.
Even if you already have an Oracle Client or ODP.NET installed on your Power BI client,
it's highly recommended you use the OCMT installer to properly complete all the
configuration steps Power BI requires to work with Oracle database.
2. Specify the Oracle net service name/TNS alias or Easy Connect (Plus) connection
string to connect to in Server. Easy Connect is the simplest to use by setting the
Server value to your Oracle Database server Hostname/ServiceName, where
ServiceName is the global database name. The following screenshot uses a net
service name.
4. If you're connecting to this Oracle database for the first time, select the
authentication type you want to use, and then enter your credentials. The
authentication types available are:
2. From Power Query Online, select the Oracle database option in the data sources
selection.
3. In the Oracle database dialog that appears, specify the Oracle net service
name/TNS alias, Easy Connect Plus connection string, or connect descriptor to
connect to in Server.
Note
You must select an on-premises data gateway for this connector, whether the
Oracle database is on your local network or on a web site.
6. If you're connecting to this Oracle database for the first time, select the type of
credentials for the connection in Authentication kind. Choose Basic if you plan to
sign in with an Oracle username and password. Choose Windows when using
Windows operating system authentication and with both the Oracle client and
server running on Windows.
9. In Navigator, select the data you require, then select Transform data to transform
the data in Power Query Editor.
Note
Currently, you can connect to an Oracle Autonomous Database from Excel, Power
BI Desktop, Power BI service, Fabric (Dataflow Gen2), Power Apps, SQL Server
Analysis Services, and BizTalk Server using the procedures in this section. These
tools use unmanaged ODP.NET to connect. Other Microsoft tools, including SQL
Server Data Tools, SQL Server Integration Services, and SQL Server Reporting
Services, use managed ODP.NET to connect to Oracle Autonomous Database using
largely similar procedures.
3. Enter a password you would like to use with this wallet, confirm the password, then
select Download.
Configure Oracle ADB credentials
1. On your Windows machine, go to the folder where you downloaded your Oracle
ADB credentials from Download your client credentials.
2. Unzip the credentials into the directory you specified in OCMT as the Oracle
Configuration File Directory. In this example, the credentials are extracted to
c:\data\wallet\wallet_contosomart.
Note
The tnsnames.ora file defines your Oracle Autonomous Database address and
connection information.
4. Under WALLET_LOCATION, change the path to your wallet folder under the
Directory option. In this example:
Open the tnsnames.ora file in the wallets folder. The file contains a list of ADB net
service names that you can connect to. In this example, the names are
contosomart_high, contosomart_low, and contosomart_medium. Your ADB net service
names are different.
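If you'd rather list the net service names without reading the file by eye, a short
script can pull them out. The following Python sketch assumes each entry starts at the
beginning of a line in the form name = (description= ...), which is typical of
downloaded wallet files but not guaranteed. The function name and the sample file
content, including the host name, are hypothetical.

```python
import re

# A net service name at the start of a line, followed by "=".
ENTRY_NAME = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*=", re.MULTILINE)


def net_service_names(tnsnames_text: str) -> list:
    """Return the net service names defined in tnsnames.ora text."""
    return ENTRY_NAME.findall(tnsnames_text)


# Hypothetical, shortened tnsnames.ora content.
SAMPLE = """
contosomart_high = (description=(address=(protocol=tcps)(port=1522)(host=adb.example.com))(connect_data=(service_name=contosomart_high.example.com)))
contosomart_low = (description=(address=(protocol=tcps)(port=1522)(host=adb.example.com))(connect_data=(service_name=contosomart_low.example.com)))
contosomart_medium = (description=(address=(protocol=tcps)(port=1522)(host=adb.example.com))(connect_data=(service_name=contosomart_medium.example.com)))
"""

print(net_service_names(SAMPLE))
```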
4. Enter the net service name of the Oracle Autonomous Database server you want to
connect to. In this example, the Server is contosomart_high. Then select OK.
5. If you're signing in to this server from Power BI Desktop for the first time, you're
asked to enter your credentials. Select Database, then enter the user name and
password for the Oracle database. The credentials you enter here are the user
name and password for the specific Oracle Autonomous Database you want to
connect to. In this example, the database's initial administrator user name and
password are used. Then select Connect.
Note
At this point, the Navigator appears and displays the connection data.
You might also come across one of several errors because the configuration hasn't been
properly set up. These errors are discussed in Troubleshooting.
One error that might occur in this initial test takes place in Navigator, where the
database appears to be connected, but contains no data. Instead, an Oracle: ORA-
28759: failure to open file error appears in place of the data.
If this error occurs, be sure that the wallet folder path you supplied in sqlnet.ora is the
full and correct path to the wallet folder.
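One way to double-check the path is to read the DIRECTORY value out of sqlnet.ora and
confirm that it's a full Windows path. The following Python sketch assumes the
WALLET_LOCATION entry uses the DIRECTORY="..." form shown in the sample text. The
function names are hypothetical, and the sketch only inspects the text, so you still
need to confirm that the folder exists on your machine.

```python
import re
from pathlib import PureWindowsPath

DIRECTORY = re.compile(r'DIRECTORY\s*=\s*"([^"]*)"', re.IGNORECASE)


def wallet_directory(sqlnet_text: str):
    """Return the DIRECTORY value from sqlnet.ora text, or None if it's missing."""
    match = DIRECTORY.search(sqlnet_text)
    return match.group(1) if match else None


def is_full_windows_path(path: str) -> bool:
    """True when the path has both a drive (or share) and a root, like c:\\data."""
    return PureWindowsPath(path).is_absolute()


# Assumed shape of the entry, using the folder from this article's example.
SAMPLE = r'''WALLET_LOCATION = (SOURCE = (METHOD = file) (METHOD_DATA = (DIRECTORY="c:\data\wallet\wallet_contosomart")))'''

folder = wallet_directory(SAMPLE)
print(folder, is_full_windows_path(folder))
```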
2. In Power BI service, select the gear icon in the upper right-hand side, then select
Manage gateways.
3. In Add Data Source, select Add data sources to use the gateway.
4. In Data Source Name, enter the name you want to use as the data source setting.
8. Enter the user name and password for the Oracle Autonomous Database. In this
example, the default database administrator user name (ADMIN) and password are
used.
9. Select Add.
The following table lists all of the advanced options you can set in Power Query Desktop
and Power Query Online.
Advanced option Description
Command timeout in minutes: If your connection lasts longer than 10 minutes (the
default timeout), you can enter another value in minutes to keep the connection open
longer. This option is only available in Power Query Desktop.
SQL statement: For information, go to Import data from a database using native
database query.
Include relationship columns: If checked, includes columns that might have
relationships to other tables. If this box is cleared, these columns don't appear.
Navigate using full hierarchy: If checked, the navigator displays the complete
hierarchy of tables in the database you're connecting to. If cleared, the navigator
displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Oracle database.
Troubleshooting
You might come across any of several errors from Oracle when the naming syntax is
either incorrect or not configured properly:
These errors might occur if the Oracle tnsnames.ora database connect descriptor is
misconfigured, the net service name provided is misspelled, or the Oracle database
listener isn't running or isn't reachable, such as when a firewall blocks the listener
or database port. Be sure you're meeting the minimum installation prerequisites. More
information: Prerequisites
Visit the Oracle Database Error Help Portal to review common causes and resolutions
for the specific Oracle error you encounter. Enter your Oracle error in the portal search
bar.
If you downloaded Power BI Desktop from the Microsoft Store, you might be unable to
connect to Oracle databases because of an Oracle driver issue. If you come across this
issue, the error message returned is: Object reference not set. To address the issue, do
the following:
Download Power BI Desktop from the Download Center instead of Microsoft Store.
If the Object reference not set error message occurs in Power BI when you connect to an
Oracle database using the on-premises data gateway, follow the instructions in Manage
your data source - Oracle.
If you're using Power BI Report Server, consult the guidance in the Oracle Connection
Type article.
Next steps
Optimize Power Query when expanding table columns
Palantir Foundry
Article • 07/14/2023
Note
The following connector article is provided by Palantir, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Palantir website and use the support
channels there.
Summary
Item Description
Prerequisites
This connector works with any active Palantir Foundry environment. Ensure you've
completed the following setup steps before using the connector:
Capabilities supported
Import
DirectQuery (Power BI Datasets)
Connect to Palantir Foundry from Power Query
Desktop
To connect to Palantir Foundry from Power Query Desktop, take the following steps:
1. In the Get Data experience, select the Palantir Foundry option in the connector
selection.
2. In Connection Settings, provide the Base URL of your Foundry environment. For
example, https://<subdomain>.palantirfoundry.com/ . Optionally, provide a Dataset
RID and Branch.
4. Select OK.
5. If you're connecting to Foundry for the first time, select either the Foundry OAuth
(recommended) or Foundry Token authentication type. After signing in (Foundry
OAuth) or entering a token (Foundry Token), select Connect.
For more details on these authentication options, go to Foundry's Power BI:
Authenticate with Foundry documentation.
6. In Navigator, select the dataset(s) you want, then select either Load to load the
data or Transform Data to continue transforming the data in the Power Query
editor.
Note
Before you begin, ensure you have access to an on-premises gateway with an
existing connection to Foundry.
To connect to Palantir Foundry from Power Query Online, take the following steps:
2. In Connection Settings, provide the Base URL that matches a connection already
configured on your on-premises data gateway. For example,
https://<subdomain>.palantirfoundry.com/ . Optionally, provide a Dataset RID and
Branch.
Ensure that the Connection dropdown shows the name of your on-premises
gateway.
4. In Navigator, select the data you require, and then select Transform data.
Troubleshooting
If you encounter issues connecting to Foundry, refer to the following resources in
Palantir Foundry's documentation for troubleshooting steps:
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Import
Basic
Advanced
2. In Parquet, provide the URL for the location of the Parquet file. Enter a path and
filename if you're connecting to a local file. You can also select Advanced and build
the URL from parts. In the example used in this article, the Parquet file is located in
Azure Blob Storage.
3. Select OK.
4. If you're connecting to this data source for the first time, select the authentication
type, input your credentials, and select the level to apply the authentication
settings to. Then select Connect.
From Power Query Desktop, select one of the following authentication methods:
Anonymous
Account key
Shared access signature (SAS)
5. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
2. In Parquet, provide the name of the server and database. Or enter a path and
filename if you're connecting to a local file.
3. If you're connecting to a local file, select the name of your on-premises data
gateway. If the data is online, you don't need to provide an on-premises data
gateway.
4. If you're connecting to this data source for the first time, select the authentication
kind and input your credentials. From Power Query Online, select one of the
following authentication kinds:
Anonymous (online)
Account key (online)
Windows (local file)
5. Select Next to continue to the Power Query editor where you can then begin to
transform your data.
It might be possible to read small files from other sources using the Binary.Buffer
function to buffer the file in memory. However, if the file is too large you're likely to get
the following error:
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
None.
Capabilities Supported
Import
Connect to a PDF file from Power Query Desktop
To make the connection from Power Query Desktop:
2. Browse for and select the PDF file you want to load. Then select Open.
If the PDF file is online, use the Web connector to connect to the file.
3. In Navigator, select the file information you want, then either select Load to load
the data or Transform Data to continue transforming the data in Power Query
Editor.
Connect to a PDF file from Power Query Online
To make the connection from Power Query Online:
2. In the PDF dialog box that appears, either provide the file path or the URL to the
location of the PDF file. If you're loading a local file, you can also select Upload file
(Preview) to browse to the local file or drag and drop the file.
4. If this is the first time you've accessed this PDF file, select the authentication kind
and sign in to your account (if needed).
5. In Navigator, select the file information you want, and then select Transform Data
to continue transforming the data in Power Query Editor.
Limitations and considerations
Try selecting pages one at a time or one small range at a time using the StartPage
or EndPage options, iterating over the entire document as needed.
If the PDF document is one single, huge table, the MultiPageTables option might end up
collecting very large intermediate values, so disabling it might help.
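The page-range advice above can be sketched in M as follows. This is only an illustration under stated assumptions: the file path and the two 10-page ranges are hypothetical, and the options record uses the `StartPage`, `EndPage`, and `MultiPageTables` fields of `Pdf.Tables`.

```powerquery
let
    // Hypothetical local path; replace it with your own PDF file.
    FilePath = "C:\Reports\large-report.pdf",
    // Illustrative page ranges; pick a range size your machine handles comfortably.
    PageRanges = {[Start = 1, End = 10], [Start = 11, End = 20]},
    // Read one range at a time, with multi-page table stitching turned off.
    ReadRange = (range as record) as table =>
        Pdf.Tables(
            File.Contents(FilePath),
            [StartPage = range[Start], EndPage = range[End], MultiPageTables = false]
        ),
    // Append the navigation tables returned for each range.
    Combined = Table.Combine(List.Transform(PageRanges, ReadRange))
in
    Combined
```

Reading in small ranges trades a few extra passes over the file for smaller intermediate results, which is the point of the two suggestions above.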
Summary
Item Description
Prerequisites
Before you can sign in to Planview OKR, you must have a Planview Admin account.
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Planview OKR (Beta), and then select
Connect.
2. If you're getting data through the Planview OKR connector for the first time, a
preview connector notice is displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
3. Enter the Planview OKR OData URL location that you want to access, and then
select OK.
4. To sign in to your Planview Admin account, select Sign in.
5. In the Planview OKR window that appears, provide your credentials to sign in to
your Planview OKR account.
6. Select Sign in.
8. In Navigator, select the information you want, then either select Load to load the
data or Transform Data to continue transforming the data in the Power Query
editor.
2. In the dialog that appears, enter the Planview OKR URL location in the text box. Fill
in the rest of the details as shown in the following screenshot.
3. If you're connecting using Planview OKR for the first time, you need to sign in to
your Planview Admin account.
4. After you sign in, select Next.
5. In Navigator, select the data you require, then select Transform data to transform
the data in the Power Query editor.
PostgreSQL
Article • 08/09/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
As of the December 2019 release, NpgSQL 4.0.10 ships with Power BI Desktop and no
additional installation is required. GAC Installation overrides the version provided with
Power BI Desktop, which will be the default. Refreshing is supported both through the
cloud in the Power BI service and on premises through the on-premises data
gateway. To refresh data from the Power BI service without an on-premises data gateway,
PostgreSQL must be hosted in a manner that allows direct connection from the Power BI
services on Azure. This is natively supported for PostgreSQL hosted in Microsoft Azure.
For other hosting environments, consult your hosting provider about configuring your
PostgreSQL for direct access from the internet. If PostgreSQL is configured so that it
can't be directly accessed from the internet (recommended for security), you'll need to
use an on-premises data gateway for refreshes. In the Power BI service, NpgSQL 4.0.10
will be used, while on-premises refresh will use the local installation of NpgSQL, if
available, and otherwise use NpgSQL 4.0.10.
For Power BI Desktop versions released before December 2019, you must install the
NpgSQL provider on your local machine. To install the NpgSQL provider, go to the
releases page , search for v4.0.10, and download and run the .msi file. The provider
architecture (32-bit or 64-bit) needs to match the architecture of the product where you
intend to use the connector. When installing, make sure that you select NpgSQL GAC
Installation to ensure NpgSQL itself is added to your machine.
We recommend NpgSQL 4.0.10. NpgSQL 4.1 and up won't work due to .NET version
incompatibilities.
For Power Apps, you must install the NpgSQL provider on your local machine. To install
the NpgSQL provider, go to the releases page and download the relevant version.
Download and run the installer (the NpgSQL-[version number].msi file). Ensure you
select the NpgSQL GAC Installation option and, on completion, restart your machine for
this installation to take effect.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
2. In the PostgreSQL database dialog that appears, provide the name of the server
and database.
4. If this is the first time you're connecting to this database, input your PostgreSQL
credentials in the User name and Password boxes of the Database authentication
type. Select the level to apply the authentication settings to. Then select Connect.
For more information about using authentication methods, go to Authentication
with a data source.
5. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
Connect to a PostgreSQL database from Power Query Online
To make the connection, take the following steps:
2. In the PostgreSQL database dialog that appears, provide the name of the server
and database.
3. Select the name of the on-premises data gateway you want to use.
4. Select the Basic authentication kind and input your PostgreSQL credentials in the
Username and Password boxes.
7. In Navigator, select the data you require, then select Transform data to transform
the data in Power Query Editor.
Advanced option | Description
Command timeout in minutes | If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
SQL statement | For information, go to Import data from a database using native database query.
Include relationship columns | If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy | If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop to connect to your PostgreSQL database.
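As a rough sketch, the advanced options can also be expressed in the options record of `PostgreSQL.Database` when you edit the query in the advanced editor. The server and database names below are hypothetical, and the mapping of each dialog option to a record field is noted in the comments.

```powerquery
let
    // Hypothetical server and database names.
    Source = PostgreSQL.Database(
        "myserver.contoso.com",
        "salesdb",
        [
            // Command timeout of 30 minutes instead of the default.
            CommandTimeout = #duration(0, 0, 30, 0),
            // Corresponds to "Include relationship columns".
            CreateNavigationProperties = true,
            // Corresponds to "Navigate using full hierarchy".
            HierarchicalNavigation = true
        ]
    )
in
    Source
```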
Troubleshooting
Your native query may throw the following error:
We cannot fold on top of this native query. Please modify the native query or
A basic troubleshooting step is to check whether the query in Value.NativeQuery() throws
the same error with a limit 1 clause around it:
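A minimal sketch of that check, assuming a hypothetical server, database, and query text, and assuming the original step called `Value.NativeQuery` with the `EnableFolding` option set:

```powerquery
let
    // Hypothetical server and database names.
    Source = PostgreSQL.Database("myserver.contoso.com", "salesdb"),
    // Hypothetical native query; it must not end with a semicolon to be usable as a subquery.
    OriginalQuery = "select id, name from public.customers",
    // Wrap the original query in a subquery with a limit 1 clause around it.
    WrappedQuery = "select * from (" & OriginalQuery & ") _ limit 1",
    Result = Value.NativeQuery(Source, WrappedQuery, null, [EnableFolding = true])
in
    Result
```

If the wrapped query fails in the same way, the native query itself is the likely cause rather than the steps applied on top of it.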
Note
The following connector article is provided by Profisee, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the Profisee support site and use the support
channels there.
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Before using the Profisee connector, you must have:
Capabilities supported
Import
2. Select the Get Data option in the Home ribbon to open the Get Data dialog.
3. Enter Profisee into the search box or select Online Services and select Profisee
from the list, then select Connect.
4. Enter the URL for your Profisee instance, then select OK.
6. Select Connect.
7. Once connected, the Navigator is displayed. This display lists all entities in your
Profisee instance. You can scroll through the navigator to locate specific entities, or
search for entities by name using the search bar.
8. Select the entities that you want to import into Power BI. You can preview the data
and choose to either first Transform Data if you want to edit the attribute columns,
apply filters, and so on, or Load the data directly into Power BI Desktop.
9. Once loaded, the entities appear in the model view, and you can view the
attributes ready for use in Power BI in the Fields dialog.
Note
Relationships in Profisee aren't created in the model in Power BI. After the entities
are loaded, you can view the model and create or modify relationships as desired.
QuickBooks Online (Beta)
Article • 01/24/2023
Summary
Item Description
Warning
QuickBooks Online has deprecated support for Internet Explorer 11, which Power
Query Desktop uses for authentication to online services. To be able to log in to
QuickBooks Online from Power BI Desktop, go to Enabling Microsoft Edge
(Chromium) for OAuth Authentication in Power BI Desktop.
Prerequisites
To use the QuickBooks Online connector, you must have a QuickBooks Online account
username and password.
The QuickBooks Online connector uses the QuickBooks ODBC driver. The QuickBooks
ODBC driver is shipped with Power BI Desktop and no additional installation is required.
Capabilities Supported
Import
1. In the Get Data dialog box, enter QuickBooks in the Search box, select
QuickBooks Online (Beta) from the product-specific data connector list, and then
select Connect.
7. In the Navigator dialog box, select the QuickBooks tables you want to load. You
can then either load or transform the data.
Known issues
Beginning on August 1, 2020, Intuit will no longer support Microsoft Internet Explorer 11
(IE 11) for QuickBooks Online. When you use OAuth2 for authorizing QuickBooks Online,
after August 1, 2020, only the following browsers will be supported:
Microsoft Edge
Mozilla Firefox
Google Chrome
Safari 11 or newer (Mac only)
For more information, see Alert: Support for IE11 deprecating on July 31, 2020 for
Authorization screens.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Warning
By default, Salesforce does not support Internet Explorer 11, which is used as part
of the authentication experience to online services in Power Query Desktop. Please
opt-in for extended support for accessing Lightning Experience Using Microsoft
Internet Explorer 11. You may also want to review Salesforce documentation on
configuring Internet Explorer. At this time, users will be impaired from
authenticating, but stored credentials should continue to work until their existing
authentication tokens expire. To resolve this, go to Enabling Microsoft Edge
(Chromium) for OAuth Authentication in Power BI Desktop.
Prerequisites
To use the Salesforce Objects connector, you must have a Salesforce account username
and password.
Also, Salesforce API access should be enabled. To verify access settings, go to your
personal Salesforce page, open your profile settings, and search for and make sure the
API Enabled checkbox is selected. Note that Salesforce trial accounts don't have API
access.
Capabilities Supported
Production
Custom
Custom domains
CNAME record redirects
Relationship columns
1. Select Salesforce Objects from the product-specific data connector list, and then
select Connect.
2. In Salesforce Objects, choose the Production URL if you use the Salesforce
production URL (https://www.salesforce.com) to sign in.
You can also select Custom and enter a custom URL to sign in. This custom URL
might be a custom domain you've created within Salesforce, such as
https://contoso.salesforce.com. You can also use the custom URL selection if
you're using your own CNAME record that redirects to Salesforce. Note that
lightning URLs aren't supported.
Also, you can select Include relationship columns. This selection alters the query
by including columns that might have foreign-key relationships to other tables. If
this box is unchecked, you won’t see those columns.
4. If this is the first time you've signed in using a specific app, you'll be asked to verify
your authenticity by entering a code sent to your email address. You'll then be
asked whether you want the app you're using to access the data. For example,
you'll be asked if you want to allow Power BI Desktop to access your Salesforce
data. Select Allow.
5. In the Navigator dialog box, select the Salesforce Objects you want to load. You
can then either select Load to load the data or select Transform Data to transform
the data.
1. Select Salesforce objects from the product-specific data connector list, and then
select Connect.
2. In Salesforce objects, choose the URL you want to use to connect. Select the
Production URL if you use the Salesforce production URL
(https://www.salesforce.com) to sign in.
Note
You can also select Custom and enter a custom URL to sign in. This custom URL
might be a custom domain you've created within Salesforce, such as
https://contoso.salesforce.com. You can also use the custom URL selection if
you're using your own CNAME record that redirects to Salesforce.
Also, you can select Include relationship columns. This selection alters the query by
including columns that might have foreign-key relationships to other tables. If this
box is unchecked, you won’t see those columns.
3. If this is the first time you've made this connection, select an on-premises data
gateway, if needed.
5. In the Navigator dialog box, select the Salesforce Objects you want to load. Then
select Transform Data to transform the data.
Specifying a Salesforce API version
We require you to specify a supported Salesforce API version to use the Salesforce
connector. You can do so by modifying the query using the Power Query advanced
editor. For example, Salesforce.Data("https://login.salesforce.com/",
[ApiVersion=48]) .
If you specify a version that isn't supported by Salesforce, you'll encounter an error
message indicating that you have specified an unsupported ApiVersion.
For more information on Salesforce API versions and support, visit the Salesforce
website.
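As a slightly fuller sketch of a query that pins the API version, the following also turns on relationship columns and navigates to a single object. The version number 48 is only illustrative, `CreateNavigationProperties` is the options field that corresponds to Include relationship columns, and `Account` is a standard Salesforce object chosen as an example.

```powerquery
let
    Source = Salesforce.Data(
        "https://login.salesforce.com/",
        [
            // Illustrative version; use one that Salesforce currently supports.
            ApiVersion = 48,
            // Corresponds to "Include relationship columns".
            CreateNavigationProperties = true
        ]
    ),
    // Navigate to one object in the returned navigation table.
    Account = Source{[Name = "Account"]}[Data]
in
    Account
```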
There's a limit on the number of fields a query to Salesforce can contain. The limit
varies depending on the type of the columns, the number of computed columns,
and so on. When you receive the Query is either selecting too many fields or
the filter conditions are too complicated error, it means that your query
exceeds the limit. To avoid this error, use the Select Query advanced option and
specify fields that you really need.
Salesforce session settings can block this integration. Ensure that the setting Lock
sessions to the IP address from which they originated is disabled.
Custom fields of type "Picklist (Multi-Select)" are not supported by "Create record"
and "Update record" operations.
Lightning URLs aren't supported.
For more information about Salesforce internal API limits, go to Salesforce Developer
Limits and Allocations Quick Reference.
Salesforce Reports
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Warning
By default, Salesforce does not support Internet Explorer 11, which is used as part
of the authentication experience to online services in Power Query Desktop. Please
opt-in for extended support for accessing Lightning Experience Using Microsoft
Internet Explorer 11. You may also want to review Salesforce documentation on
configuring Internet Explorer. At this time, users will be impaired from
authenticating, but stored credentials should continue to work until their existing
authentication tokens expire. To resolve this, go to Enabling Microsoft Edge
(Chromium) for OAuth Authentication in Power BI Desktop.
Prerequisites
To use the Salesforce Reports connector, you must have a Salesforce account username
and password.
Also, Salesforce API access should be enabled. To verify access settings, go to your
personal Salesforce page, open your profile settings, and search for and make sure the
API Enabled checkbox is selected. Note that Salesforce trial accounts don't have API
access.
Capabilities Supported
Production
Custom
Custom domains
CNAME record redirects
1. Select Salesforce Reports from the product-specific data connector list, and then
select Connect.
2. In Salesforce Reports, choose the Production URL if you use the Salesforce
production URL (https://www.salesforce.com) to sign in.
You can also select Custom and enter a custom URL to sign in. This custom URL
might be a custom domain you've created within Salesforce, such as
https://contoso.salesforce.com. You can also use the custom URL selection if
you're using your own CNAME record that redirects to Salesforce.
4. If this is the first time you've signed in using a specific app, you'll be asked to verify
your authenticity by entering a code sent to your email address. You'll then be
asked whether you want the app you're using to access the data. For example,
you'll be asked if you want to allow Power BI Desktop to access your Salesforce
data. Select Allow.
5. In the Navigator dialog box, select the Salesforce Reports you want to load. You
can then either select Load to load the data or select Transform Data to transform
the data.
1. Select Salesforce reports from the product-specific data connector list, and then
select Connect.
2. In Salesforce reports, choose the URL you want to use to connect. Select the
Production URL if you use the Salesforce production URL
(https://www.salesforce.com) to sign in.
Note
You can also select Custom and enter a custom URL to sign in. This custom URL
might be a custom domain you've created within Salesforce, such as
https://contoso.salesforce.com. You can also use the custom URL selection if
you're using your own CNAME record that redirects to Salesforce.
Also, you can select Include relationship columns. This selection alters the query by
including columns that might have foreign-key relationships to other tables. If this
box is unchecked, you won’t see those columns.
3. If this is the first time you've made this connection, select an on-premises data
gateway, if needed.
5. In the Navigator dialog box, select the Salesforce Reports you want to load. Then
select Transform Data to transform the data.
If you specify a version that isn't supported by Salesforce, you'll encounter an error
message indicating that you have specified an unsupported ApiVersion.
For more information on Salesforce API versions and support, visit the Salesforce
website.
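A minimal sketch of pinning the API version for this connector in the advanced editor, assuming `Salesforce.Reports` takes an `ApiVersion` field in its options record; the version number 48 is only illustrative.

```powerquery
let
    Source = Salesforce.Reports(
        "https://login.salesforce.com/",
        // Illustrative version; use one that Salesforce currently supports.
        [ApiVersion = 48]
    )
in
    Source
```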
exceeds the limit. To avoid this error, use the Select Query advanced option and
specify fields that you really need.
Salesforce session settings can block this integration. Ensure that the setting Lock
sessions to the IP address from which they originated is disabled.
The number of rows you can access in Salesforce Reports is limited by Salesforce
to 2000 rows. As a workaround for this issue, you can use the Salesforce Objects
connector in Power BI Desktop to retrieve all the rows from individual tables and
recreate the reports you’d like. The Objects connector doesn’t have the 2000-row limit.
For more information about Salesforce internal API limits, go to Salesforce Developer
Limits and Allocations Quick Reference.
SAP Business Warehouse Application Server
Article • 07/14/2023
Note
The SAP Business Warehouse (BW) Application Server connector is now certified for
SAP BW/4HANA as of June 2020.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Important
Version 1.0 of the SAP BW Application Server connector has been deprecated. New
connections will use Implementation 2.0 of the SAP BW Application Server
connector. All support for version 1.0 will be removed from the connector in the
near future.
You'll need an SAP account to sign in to the website and download the drivers. If you're
unsure, contact the SAP administrator in your organization. The drivers need to be
installed on the gateway machine.
To use the SAP BW Application Server connector in Power BI Desktop or Power Query
Online, you must install the SAP .NET Connector 3.0. Access to the download requires a
valid S-user. Contact your SAP Basis team to get the SAP .NET Connector 3.0. You can
download the SAP .NET Connector 3.0 from SAP. The connector comes in 32-bit and
64-bit versions. Choose the version that matches your Power BI Desktop installation. For
Power Query Online, choose the 64-bit version. Currently, the website lists two versions
for .NET 4.0 framework:
SAP Connector for Microsoft .NET 3.0.23.0 for Windows 32 bit (x86) as a zip file
(6,928 KB), May 28, 2020
SAP Connector for Microsoft .NET 3.0.23.0 for Windows 64 bit (x64) as a zip file
(7,225 KB), May 28, 2020
When you install, in Optional setup steps, make sure you select Install assemblies to
GAC.
Note
Be sure to use SAP .NET Connector 3.0. The SAP BW Application Server connector
doesn't currently support SAP .NET Connector 3.1.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced
Language code
Execution mode
Batch size
MDX statement
Enable characteristic structures
2. Enter the server name, system number, and client ID of the SAP BW Application
Server you want to connect to. This example uses SAPBWTestServer as the server
name, a system number of 00 , and a client ID of 837 .
The rest of this example describes how to import your data into Power Query
Desktop, which is the default setting for Data Connectivity mode. If you want to
use DirectQuery to load your data, go to Connect to SAP Business Warehouse by
using DirectQuery in Power BI.
If you want to use any of the advanced options for this connector to fine-tune your
query, go to Use advanced options.
3. When accessing the database for the first time, the SAP BW Application Server
requires database user credentials. Power Query Desktop offers two authentication
modes for SAP BW connections—user name/password authentication (Database),
and Windows authentication (single sign-on). SAML authentication isn't currently
supported. Select either Windows or Database. If you select Database
authentication, enter your user name and password. If you select Windows
authentication, go to Windows Authentication and single sign-on to learn more
about the requirements for Windows authentication.
Then select Connect.
4. From the Navigator dialog box, select the items you want to use. When you select
one or more items from the server, the Navigator dialog box creates a preview of
the output table. For more information about navigating the SAP BW Application
Server query objects in Power Query, go to Navigate the query objects.
5. From the Navigator dialog box, you can either transform the data in the Power
Query Editor by selecting Transform Data, or load the data by selecting Load.
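The connection described in the steps above can be sketched in M roughly as follows. It reuses the example server name, system number, and client ID from this article. The `LanguageCode` and `BatchSize` values are illustrative assumptions, not recommendations.

```powerquery
let
    Source = SapBusinessWarehouse.Cubes(
        // Server name, system number, and client ID from the example in this article.
        "SAPBWTestServer",
        "00",
        "837",
        [
            // New connections use Implementation 2.0 of the connector.
            Implementation = "2.0",
            // Illustrative advanced options; omit them to keep the defaults.
            LanguageCode = "EN",
            BatchSize = 50000
        ]
    )
in
    Source
```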
Connect to an SAP BW Application Server from Power Query Online
To connect to an SAP BW Application Server from Power Query Online:
2. Enter the server name, system number, and client ID of the SAP BW Application
Server you want to connect to. This example uses SAPBWTestServer as the server
name, a system number of 00 , and a client ID of 837 .
3. Select the on-premises data gateway you want to use to connect to the data.
4. Set Authentication Kind to Basic. Enter your user name and password.
5. You can also select from a set of advanced options to fine-tune your query.
7. From the Navigator dialog box, select the items you want to use. When you select
one or more items from the server, the Navigator dialog box creates a preview of
the output table. For more information about navigating the SAP BW Application
Server query objects in Power Query, go to Navigate the query objects.
8. From the Navigator dialog box, you can transform the data in the Power Query
Editor by selecting Transform Data.
Where:
or service name>
Note
/S/<port> can be omitted if the port is the default port (3299).
Considerations
Router strings can include passwords, prefixed by either /P/ or /W/ . Passwords
aren't supported in Power Query router strings as this could be unsafe. Using a
password will result in an error.
Router strings also allow the use of symbolic SAP system names, prefixed with /R/ .
This type of string isn't supported in Power Query.
In Power Query, you can use the "router string" syntax to specify a custom port, so
router strings with a single station are allowed. Router strings can then be
identified as starting with either /H/ or /M/ . Any other input is assumed to be a
server name/IP address.
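The rules in this list can be summarized with a small helper. This is a hypothetical sketch for illustration only and is not the connector's actual implementation; the function name, the returned labels, and the sample router host are all assumptions.

```powerquery
let
    // Hypothetical helper that mirrors the considerations listed above.
    ClassifyServerInput = (input as text) as text =>
        if Text.Contains(input, "/P/") or Text.Contains(input, "/W/") then
            "error: passwords aren't supported in router strings"
        else if Text.Contains(input, "/R/") then
            "error: symbolic SAP system names aren't supported"
        else if Text.StartsWith(input, "/H/") or Text.StartsWith(input, "/M/") then
            "router string"
        else
            "server name or IP address",
    // Sample inputs; the router host name is made up for this example.
    Samples = {"SAPBWTestServer", "/H/saprouter.contoso.com/S/3299"},
    Classified = Table.FromRecords(
        List.Transform(Samples, each [Input = _, Kind = ClassifyServerInput(_)])
    )
in
    Classified
```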
Next steps
Navigate the query objects
SAP Business Warehouse fundamentals
Use advanced options
SAP Business Warehouse connector troubleshooting
SAP Business Warehouse Message Server
Article • 07/14/2023
Note
The SAP Business Warehouse (BW) Message Server connector is now certified for
SAP BW/4HANA as of June 2020.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Important
Version 1.0 of the SAP BW Message Server connector has been deprecated. New
connections will use Implementation 2.0 of the SAP BW Message Server connector.
All support for version 1.0 will be removed from the connector in the near future.
You'll need an SAP account to sign in to the website and download the drivers. If you're
unsure, contact the SAP administrator in your organization.
To use the SAP BW Message Server connector in Power BI Desktop or Power Query
Online, you must install the SAP .NET Connector 3.0. Access to the download requires a
valid S-user. Contact your SAP Basis team to get the SAP .NET Connector 3.0. You can
download the SAP .NET Connector 3.0 from SAP. The connector comes in 32-bit and
64-bit versions. Choose the version that matches your Power BI Desktop installation. For
Power Query Online, choose the 64-bit version. Currently, the website lists two versions
for .NET 4.0 framework:
SAP Connector for Microsoft .NET 3.0.23.0 for Windows 32 bit (x86) as a zip file
(6,928 KB), May 28, 2020
SAP Connector for Microsoft .NET 3.0.23.0 for Windows 64 bit (x64) as a zip file
(7,225 KB), May 28, 2020
When you install, in Optional setup steps, make sure you select Install assemblies to
GAC.
Note
Be sure to use SAP .NET Connector 3.0. The SAP BW Message Server connector
doesn't currently support SAP .NET Connector 3.1.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced
Language code
Execution mode
Batch size
MDX statement
Enable characteristic structures
1. From the Home tab of Power BI Desktop, select Get Data > SAP Business
Warehouse Message Server.
2. Enter the server, system number, client ID, and logon group of the SAP BW
Message Server you want to connect to. This example uses SAPBWTestServer as the
server name, a system number of 100 , a client ID of 837 , and a logon group of
PURCHASING .
Note
You can also use router strings to connect to your data. More information:
Connect using router strings
The rest of this example describes how to import your data into Power Query
Desktop, which is the default setting for Data Connectivity mode. If you want to
use DirectQuery to load your data, see Connect to SAP Business Warehouse by
using DirectQuery in Power BI.
If you want to use any of the advanced options for this connector to fine-tune your
query, go to Use advanced options.
When you've finished filling in the relevant information, select OK.
3. When accessing the database for the first time, the SAP BW Message Server
requires database user credentials. Power Query Desktop offers two authentication
modes for SAP BW connections—user name/password authentication (Database),
and Windows authentication (single sign-on). SAML authentication isn't currently
supported. Select either Windows or Database. If you select Database
authentication, enter your user name and password. If you select Windows
authentication, go to Windows Authentication and single sign-on to learn more
about the requirements for Windows authentication.
4. From the Navigator dialog box, select the items you want to use. When you select
one or more items from the server, the Navigator dialog box creates a preview of
the output table. For more information about navigating the SAP BW Message
Server query objects in Power Query, go to Navigate the query objects.
5. From the Navigator dialog box, you can either transform the data in the Power
Query Editor by selecting Transform Data, or load the data by selecting Load.
2. Enter the server, system number, client ID, and logon group of the SAP BW Message
Server you want to connect to. This example uses SAPBWTestServer as the server
name, a system number of 100 , a client ID of 837 , and a logon group of
PURCHASING .
3. Select the on-premises data gateway you want to use to connect to the data.
4. Set Authentication Kind to Basic. Enter your user name and password.
5. You can also select from a set of advanced options to fine-tune your query.
7. From the Navigator dialog box, select the items you want to use. When you select
one or more items from the server, the Navigator dialog box creates a preview of
the output table. For more information about navigating the SAP BW Message
Server query objects in Power Query, go to Navigate the query objects.
8. From the Navigator dialog box, you can transform the data in the Power Query
Editor by selecting Transform Data.
Connect using router strings
SAP router is an SAP program that acts as an intermediate station (proxy) in a network
connection between SAP systems, or between SAP systems and external networks. SAP
router controls the access to your network, and, as such, is a useful enhancement to an
existing firewall system (port filter). Figuratively, the firewall forms an impenetrable
"wall" around your network. However, since some connections need to penetrate this
wall, a "gate" has to be made in the firewall. SAP router assumes control of this gate. In
short, SAP router provides you with the means of controlling access to your SAP system.
Where:
Note
Considerations
Router strings can include passwords, prefixed by either /P/ or /W/ . Passwords
aren't supported in Power Query router strings as this could be unsafe. Using a
password will result in an error.
Router strings also allow the use of symbolic SAP system names, prefixed with /R/ .
This type of string isn't supported in Power Query.
In Power Query, you can use the "router string" syntax to specify a custom port, so
router strings with a single station are allowed. Router strings can then be
identified as starting with either /H/ or /M/ . Any other input is assumed to be a
server name/IP address.
To allow you to use the same router strings you use in other tools, the /G/ option
in the router string is supported. When provided, it should match the value
specified in the "Logon group" parameter.
If a message server port is specified, it will be sent. Under these circumstances, the
SystemId is omitted from the connection string as it’s no longer required.
However, you must still provide a value for SystemId even though it won't be used
to establish the connection.
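The considerations above can be sketched as a small validation routine. This is only an illustration of the stated rules, not the connector's actual implementation: the function name `check_server_input`, the host names, and the `/S/` (service or port) segment are assumptions that don't appear in this article.

```python
def check_server_input(value, logon_group=None):
    """Classify the server field according to the considerations above.

    Hypothetical helper: returns "router string" or "server", and raises
    ValueError for router-string options the article lists as unsupported.
    """
    # Only input starting with /H/ or /M/ is treated as a router string;
    # anything else is assumed to be a server name or IP address.
    if not value.startswith(("/H/", "/M/")):
        return "server"

    # "/H/host/S/3299" -> ["H", "host", "S", "3299"] -> [("H", "host"), ("S", "3299")]
    tokens = value.split("/")[1:]
    pairs = list(zip(tokens[0::2], tokens[1::2]))

    for key, val in pairs:
        if key in ("P", "W"):
            raise ValueError("passwords aren't supported in router strings")
        if key == "R":
            raise ValueError("symbolic SAP system names aren't supported")
        if key == "G" and logon_group is not None and val != logon_group:
            raise ValueError("/G/ should match the Logon group parameter")
    return "router string"


print(check_server_input("SAPBWTestServer"))
print(check_server_input("/M/msgserver.example.com/S/3601/G/PURCHASING", "PURCHASING"))
```

A string such as `/H/host/W/secret` raises an error in this sketch, mirroring the rule that a password in the router string results in an error.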
See also
Navigate the query objects
SAP Business Warehouse fundamentals
Use advanced options
SAP Business Warehouse connector troubleshooting
SAP BW fundamentals
Article • 01/24/2023
This article describes basic terminology used when describing interactions between the
SAP BW server and Power Query. It also includes information about tools that you may
find useful when using the Power Query SAP BW connector.
Integration Architecture
From a technical point of view, the integration between applications and SAP BW is
based on the so-called Online Analytical Processing (OLAP) Business Application
Programming Interfaces (BAPI).
The OLAP BAPIs are delivered with SAP BW and provide third parties and developers with
standardized interfaces that enable them to access the data and metadata of SAP BW
with their own front-end tools.
Applications of all types can be connected with an SAP BW server using these methods.
The OLAP BAPIs are implemented in SAP BW as RFC-enabled function modules and are
invoked by applications over SAP’s RFC protocol. This requires the NetWeaver RFC
Library or SAP .NET Connector to be installed on the application's machine.
The OLAP BAPIs provide methods for browsing metadata and master data, and also for
passing MDX statements for execution to the MDX Processor.
The OLAP Processor is responsible for retrieving, processing, and formatting the data
from the SAP BW source objects, which are further described in SAP BW data source and
Data objects in SAP BW.
SAP Business Explorer and other SAP tools use a more direct interface to the SAP BW
OLAP Processor called Business Intelligence Consumer Services, commonly known as
BICS. BICS isn't available for third-party tools.
Typically, when a third-party tool like Power Query connects using the OLAP BAPIs, SAP
BW first responds with a list of catalogs available in the SAP BW system.
There's one catalog with the technical name $INFOCUBE that contains all InfoProviders in
the SAP BW system. This catalog is shown as a node in the navigator of Power Query. By
expanding this node in the navigator, you can select from the available InfoProviders in
the SAP BW system.
The other catalogs represent InfoProviders for which at least one Query exists. By
expanding one of these nodes in the navigator, you can select from the available queries
associated with the InfoProvider.
BEx Queries offer some advantages and additional functionality to create customized
data sources to meet end-user requirements. For example, you can parameterize queries
with variables that can limit the data set to what's important to the end user. Or, you can
recalculate key figures using formulas.
Although BEx Queries have advantages as data sources (go to Performance
considerations), you don't need a Query for every report. You'll need to weigh the cost
of developing and maintaining additional Queries against their reporting requirements.
InfoProvider is the generic term for a Business Intelligence (BI) object into which
data is loaded or which provides views of data. InfoProviders can be queried with
client tools, such as Business Explorer (or BEx) and also with Power Query.
InfoProviders can be seen as uniform data providers from the viewpoint of a query
definition. Their data can therefore be analyzed in a uniform way.
A Sales dimension could contain the characteristics Sales Person, Sales Group, and
Sales Office.
A Time dimension could have the characteristics Day (in the form YYYYMMDD),
Week (in the form YYYY.WW), Month (in the form YYYY.MM), Year (in the form
YYYY) and Fiscal Period (in the form YYYY.PPP).
Characteristics refer to master data with their attributes and text descriptions, and
in some cases hierarchies. The characteristics of an InfoCube are stored in
dimensions.
For example, the Customer dimension could have the characteristics Sold-to-party,
Ship-to-party, and Payer.
The characteristic Sold-to-party could have the attributes Country, Region, City,
Street, and Industry. The text description of the characteristic would be the Name
of the Sold-to-party.
InfoObjects is the generic term for all characteristics and key figures. All
InfoObjects are maintained independently of the InfoCube in SAP BW. InfoObjects
are the smallest units of Business Intelligence (BI). Using InfoObjects, information
can be stored and mapped in a structured form. This is required for constructing
InfoProviders. InfoObjects with attributes or texts can themselves be InfoProviders.
DataStore Object (DSO) serves as a storage location for consolidated and cleansed
transaction data or master data on a document (atomic) level. Unlike the
multidimensional data in InfoCubes, the data in DataStore objects is stored in
transparent, flat database tables. The system doesn't create separate fact tables or
dimension tables for DSOs. Data in DSOs can be evaluated using a BEx query.
MultiProviders are a special type of InfoProvider that combine data from several
InfoProviders. They're then available for reporting. MultiProviders don't contain
any data; their data comes exclusively from the InfoProviders upon which they're
based. MultiProviders can be based upon any combination of InfoProviders,
including InfoCubes, DataStore Objects, InfoObjects, or InfoSets.
InfoSets are a special type of InfoProvider that doesn't store data physically.
InfoSets describe data that's based on joining the tables of other InfoProviders like
DataStore Objects, standard InfoCubes, or InfoObjects with master data
characteristics. InfoSets can be useful when you have to build a report spanning
two or more different data targets in SAP BW.
Composite Providers are a new data object in SAP BW systems that run on HANA, that
is, SAP BW 7.5 or BW/4HANA. A composite provider is based on a JOIN or UNION of
other InfoProviders or Analytic Indexes. Data in Composite Providers can be evaluated
using a BEx query.
See also
Navigate the query objects
Navigate the query objects
Article • 01/24/2023
After you connect to your SAP BW instance, the Navigator dialog box will show a list of
available catalogs in the selected server.
You'll see one catalog folder with the name $INFOCUBE. This folder contains all
InfoProviders in the SAP BW system.
The other catalog folders represent InfoProviders in SAP BW for which at least one
query exists.
The Navigator dialog box displays a hierarchical tree of data objects from the connected
SAP BW system. The following table describes the types of objects.
Symbol Description
Key figure
Characteristic
Characteristic level
Property (Attribute)
Hierarchy
Note
The navigator shows InfoCubes and BEx queries. For BEx queries, you may need to
go into Business Explorer, open the desired query and check Allow External Access
to this Query: By OLE DB for OLAP for the query to be available in the navigator.
Note
In Power BI Desktop, objects below an InfoCube or BEx Query node, such as the key
figures, characteristics, and properties are only shown in Import connectivity mode,
not in DirectQuery mode. In DirectQuery mode, all the available objects are
mapped to a Power BI model and will be available for use in any visual.
In the navigator, you can select from different display options to view the available
query objects in SAP BW:
Only selected items: This option limits the objects shown in the list to just the
selected items. By default, all query objects are displayed. This option is useful for a
review of the objects that you included in your query. Another approach to viewing
selected items is to select the column names in the preview area.
Enable data previews: This value is the default. This option allows you to control
whether a preview of the data should be displayed on the right-hand side in the
Navigator dialog box. Disabling data previews reduces the amount of server
interaction and response time. In Power BI Desktop, data preview is only available
in Import connectivity mode.
Technical names: SAP BW supports the notion of technical names for query
objects, as opposed to the descriptive names that are shown by default. Technical
names uniquely identify an object within SAP BW. With the option selected, the
technical names will appear next to the descriptive name of the object.
Characteristic hierarchies
A characteristic will always have at least one characteristic level (Level 01), even when no
hierarchy is defined on the characteristic. The Characteristic Level 01 object contains all
members for the characteristic as a flat list of values.
Characteristics in SAP BW can have more than one hierarchy defined. For those
characteristics, you can only select one hierarchy or the Level 01 object.
For characteristics with hierarchies, the properties selected for that characteristic will be
included for each selected level of the hierarchy.
Measure properties
When you pick a measure, you have an option to select the units/currency, formatted
value, and format string. In the screenshot below, it's useful to get the formatted value
for COGS. This helps us follow the same formatting standard across all the reports.
Note
Power Query uses a newer interface that is available in SAP BW version 7.01 or higher.
The interface reduces memory consumption and the result set is not restricted by the
number of cells.
The flattened data set is aggregated in SAP BW at the level of the selected
characteristics and properties.
Even with these improvements, the resulting dataset can become very large and time-
consuming to process.
Performance recommendation
Only include the characteristics and properties that you ultimately need. Aim for higher
levels of aggregation. For example, do you need Material-level details in your report, or
is MaterialGroup-level enough? What hierarchy levels are required in Power BI? Try to
create smaller datasets with higher levels of aggregation, or multiple smaller datasets
that can be joined together later.
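As a rough illustration of why a higher aggregation level produces a smaller flattened result set, the following sketch aggregates made-up Material-level rows to the MaterialGroup level. The data and the `aggregate_by_group` helper are hypothetical. In a real connection, SAP BW performs this aggregation based on the characteristics you select, and the simple summation shown here only applies to additive key figures.

```python
from collections import defaultdict

# Hypothetical Material-level rows: (material group, material, amount)
material_rows = [
    ("Casings", "Case A", 120.0),
    ("Casings", "Case B", 80.0),
    ("Motherboards", "Board X", 300.0),
    ("Motherboards", "Board Y", 150.0),
    ("Processors", "CPU 1", 500.0),
]


def aggregate_by_group(rows):
    """Sum the amount per material group, dropping the Material-level detail."""
    totals = defaultdict(float)
    for group, _material, amount in rows:
        totals[group] += amount
    return dict(totals)


by_group = aggregate_by_group(material_rows)
print(len(material_rows), "rows at Material level")
print(len(by_group), "rows at MaterialGroup level")
```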
Query parameters
Queries in SAP BW can have dynamic filters defined that allow you to restrict the data
set that's returned by the query. In the BEx Query Designer, this type of dynamic filter
can be defined by creating what's called a Characteristic Restriction and assigning a
Variable to that restriction. Variables on a query can be required or optional, and
they're available to the user in the navigator.
When you select an SAP BW query with characteristic restrictions in the Power Query
navigator, you'll see the variables displayed as parameters above the data preview area.
Using the Show selector, you can display all parameters that are defined on the query,
or just the required ones.
The query shown in the previous image has several optional parameters, including one
for Material Group. You can select one or more material groups to only return
purchasing information for the selected values, in this example, casings, motherboards,
and processors. You can also type the values directly into the values field. For variables
with multiple entries, comma-separated values are expected. In this example, the entry
would look like [0D_MTLGROUP].[201], [0D_MTLGROUP].[202], [0D_MTLGROUP].[208] .
The value # means unassigned; in the example, any data record without an assigned
material group value.
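If you generate the values for a multiple-entry variable programmatically, the comma-separated format described above can be assembled as in the following minimal sketch. The helper name `member_list` is hypothetical; the characteristic name and keys are the ones from the example above.

```python
def member_list(characteristic, keys):
    """Format member keys as the comma-separated list expected in the values field."""
    return ", ".join(f"[{characteristic}].[{key}]" for key in keys)


values = member_list("0D_MTLGROUP", ["201", "202", "208"])
print(values)
# [0D_MTLGROUP].[201], [0D_MTLGROUP].[202], [0D_MTLGROUP].[208]
```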
Performance recommendation
Filters based on parameter values get processed in the SAP BW data source, not in
Power BI. This type of processing can have performance advantages for larger datasets
when loading or refreshing SAP BW data into Power BI. The time it takes to load data
from SAP BW into Power BI increases with the size of the dataset, for example, the
number of columns and rows in the flattened result set. To reduce the number of
columns, only select the key figures, characteristics, and properties in the navigator that
you eventually want to see.
Similarly, to reduce the number of rows, use the available parameters on the query to
narrow the dataset, or to split up a larger dataset into multiple, smaller datasets that can
be joined together in the Power BI Desktop data model.
In many cases, it may also be possible to work with the author of the BEx Query in SAP
BW to clone and modify an existing query and optimize it for performance by adding
additional characteristic restrictions or removing unnecessary characteristics.
In the example above, a parameter was used to only bring back records with a Material
Group of casings, motherboards, and processors.
In Power Query Desktop, you can also select Load to bring the entire data set from SAP
BW into Power BI Desktop. Power BI Desktop will take you to the Report view where you
can begin visualizing the data or make further modifications using the Data or
Relationships views.
See also
Transform and filter an SAP BW dataset
Transform and filter an SAP BW dataset
Article • 01/24/2023
With Power Query Editor, you can apply additional data transformations and filtering
steps before you bring the dataset from SAP BW into the Power BI Desktop or Microsoft
Power Platform data model.
In Power Query Editor, the Applied Steps for the query are shown in the Query Settings
pane on the right. To modify or review a step, select the gear icon next to a step.
For example, if you select the gear icon next to Added Items, you can review the
selected data objects in SAP BW, or modify the specified query parameters. This way it's
possible to filter a dataset using a characteristic that isn't included in the result set.
You can apply additional filters on the dataset by selecting the drop-down menu for one
of the columns.
Another easy way to set a filter is to right-click on one of the values in the table, then
select Member Filters or Text Filters.
For example, you could filter the dataset to only include records for Calendar
Year/Month FEB 2003, or apply a text filter to only include records where Calendar
Year/Month contains 2003.
Not every filter will get folded into the query against SAP BW. You can determine if a
filter is folded into the query by examining the icon in the top-left corner of the data
table, directly above the number 1 of the first data record.
If the icon is a cube, then the filter is applied in the query against the SAP BW system.
If the icon is a table, then the filter isn't part of the query and only applied to the table.
Behind the UI of Power Query Editor, code is generated based on the M formula
language for data mashup queries.
You can view the generated M code with the Advanced Editor option in the View tab.
To see a description for each function or to test it, right-click on the existing SAP BW
query in the Queries pane and select Create Function. In the formula bar at the top,
enter:
= <function name>
where <function name> is the name of the function you want to see described. The
following example shows the description of the Cube.Transform function.
the cube.
Cube.Dimensions : Returns a table with the set of dimensions for the cube.
Cube.Measures : Returns a table with the set of measures for the cube.
See also
Power Query M formula language reference
Implementation details
Implementation details
Article • 03/08/2023
Important
Version 1.0 of the SAP Business Warehouse connector has been deprecated. New
connections will use Implementation 2.0 of the SAP Business Warehouse connector.
All support for version 1.0 will be removed from the connector in the near future.
Use the information in this article to update existing version 1.0 reports so they can
use Implementation 2.0 of this connector.
ExecutionMode specifies the MDX interface used to execute queries on the server.
The following options are valid:
SapBusinessWarehouseExecutionMode.BasXml
SapBusinessWarehouseExecutionMode.BasXmlGzip
SapBusinessWarehouseExecutionMode.DataStream
Improved performance.
Ability to retrieve several million rows of data, and fine-tuning through the batch
size parameter.
Ability to switch execution modes.
Support for compressed mode. Especially beneficial for high latency connections
or large datasets.
Improved detection of Date variables.
Expose Date (ABAP type DATS) and Time (ABAP type TIMS) dimensions as dates
and times respectively, instead of text values. More information: Support for typed
dates in SAP BW
Better exception handling. Errors that occur in BAPI calls are now surfaced.
Column folding in BasXml and BasXmlGzip modes. For example, if the generated
MDX query retrieves 40 columns but the current selection only needs 10, this
request will be passed onto the server to retrieve a smaller dataset.
1. Open an existing report, select Edit Queries in the ribbon, and then select the SAP
Business Warehouse query to update.
Determine whether the query already contains an option record, such as the
following example.
If so, add the Implementation 2.0 option, and remove the ScaleMeasures option, if
present, as shown.
If the query doesn't already include an options record, just add it. For the following
option:
Every effort has been made to make Implementation 2.0 of the SAP BW connector
compatible with version 1. However, there may be some differences because of the
different SAP BW MDX execution modes being used. To resolve any discrepancies, try
switching between execution modes.
To access the typed date, you need to add the key. For example, if there's a
dimension attribute called [0CALDAY], you'll need to add the key [20CALDAY] to get the
typed value.
To manually add the key in Import mode, just expand Properties and select the key.
The key column will be of type date, and can be used for filtering. Filtering on this
column will fold to the server.
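To see why typed values are more convenient than text, consider how ABAP date and time values are laid out: DATS uses the form YYYYMMDD and TIMS uses the form HHMMSS. The following sketch parses such text into typed values. It only illustrates the idea and isn't how the connector performs the conversion; the function names are hypothetical.

```python
from datetime import date, datetime, time


def parse_dats(value: str) -> date:
    """Parse an ABAP DATS value (YYYYMMDD) into a date."""
    return datetime.strptime(value, "%Y%m%d").date()


def parse_tims(value: str) -> time:
    """Parse an ABAP TIMS value (HHMMSS) into a time."""
    return datetime.strptime(value, "%H%M%S").time()


start = parse_dats("20030215")
end = parse_dats("20030301")
print(start, end, (end - start).days)  # typed dates support date arithmetic
print(parse_tims("134500"))
```

With text values, an operation such as the day difference above would first require a conversion step like this one.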
Feature Description
Local calculations: Local calculations defined in a BEx Query will change the numbers as
displayed through tools like BEx Analyzer. However, they aren't reflected in the numbers
returned from SAP through the public MDX interface.
As such, the numbers seen in Power Query won't necessarily match those for a
corresponding visual in an SAP tool.
For instance, when connecting to a query cube from a BEx query that sets the aggregation
to be Cumulated (for example, running sum), Power Query would get back the base numbers,
ignoring that setting. An analyst could then apply a running sum calculation locally in,
for example, Power BI, but would need to exercise caution in how the numbers are
interpreted if this isn't done.
Aggregations: In some cases (particularly when dealing with multiple currencies), the
aggregate numbers returned by the SAP public interface don't match those shown by SAP
tools.
As such, the numbers seen in Power Query won't necessarily match those for a
corresponding visual in an SAP tool.
For instance, totals over different currencies would show as "*" in BEx Analyzer, but the
total would get returned by the SAP public interface, without any information that such
an aggregate number is meaningless. Thus the number (aggregating, say, $, EUR, and AUD)
would get displayed by Power Query.
Currency formatting: Any currency formatting (for example, $2,300 or 4000 AUD) isn't
reflected in Power Query.
Units of measure: Units of measure (for example, 230 KG) aren't reflected in Power Query.
Key versus text (short, medium, long): For an SAP BW characteristic like CostCenter, the
navigator will show a single item Cost Center Level 01. Selecting this item will include
the default text for Cost Center in the field list. Also, the Key value, Short Name,
Medium Name, and Long Name values are available for selection in the Properties node for
the characteristic (if maintained in SAP BW).
Note that this only applies to Import connectivity mode. For DirectQuery mode, only the
default text will be included in the data set.
Multiple hierarchies of a characteristic: In SAP, a characteristic can have multiple
hierarchies. Then in tools like BEx Analyzer, when a characteristic is included in a
query, the user can select the hierarchy to use.
In Power BI, the various hierarchies can be seen in the field list as different
hierarchies on the same dimension. However, selecting multiple levels from two different
hierarchies on the same dimension will result in empty data being returned by SAP.
Treatment of ragged hierarchies: SAP BW supports ragged hierarchies, where levels can be
missed, for example:
Continent
Americas
Canada
USA
Not Assigned
Australia
Continent
Americas
Canada
USA
Not Assigned
(Blank)
Australia
Scaling factor/reverse sign: In SAP, a key figure can have a scaling factor (for example,
1000) defined as a formatting option, meaning that all displays will be scaled by that
factor.
It can similarly have a property set that reverses the sign. Use of such a key figure in
Power BI (in a visual, or as part of a calculation) will result in the unscaled number
being used (and the sign isn't reversed). The underlying scaling factor isn't available.
In Power BI visuals, the scale units shown on the axis (K,M,B) can be controlled as part
of the visual formatting.
Hierarchies where levels appear/disappear dynamically: Initially when connecting to SAP
BW, the information on the levels of a hierarchy will be retrieved, resulting in a set of
fields in the field list. This is cached, and if the set of levels changes, then the set
of fields doesn't change until Refresh is invoked.
Default filter: A BEx query can include Default Filters, which will be applied
automatically by SAP BEx Analyzer. These aren't exposed, and so the equivalent usage in
Power Query won't apply the same filters by default.
Hidden Key figures: A BEx query can control visibility of Key Figures, and those that are
hidden won't appear in SAP BEx Analyzer. This isn't reflected through the public API, and
so such hidden key figures will still appear in the field list. However, they can then be
hidden within Power Query.
Numeric formatting: Any numeric formatting (number of decimal positions, decimal point,
and so on) won't automatically be reflected in Power Query. However, it's possible to
then control such formatting within Power Query.
Time-dependent hierarchies: When using Power Query, time-dependent hierarchies are
evaluated at the current date.
Currency conversion: SAP BW supports currency conversion, based on rates held in the
cube. Such capabilities aren't exposed by the public API, and are therefore not available
in Power Query.
Sort Order: The sort order (by Text, or by Key) for a characteristic can be defined in
SAP. This sort order isn't reflected in Power Query. For example, months might appear as
"April", "Aug", and so on.
End user language setting: The locale used to connect to SAP BW is set as part of the
connection details, and doesn't reflect the locale of the final report consumer.
Text Variables: SAP BW allows field names to contain placeholders for variables (for
example, "$YEAR$ Actuals") that would then get replaced by the selected value. For
example, the field appears as "2016 Actuals" in BEx tools, if the year 2016 were selected
for the variable.
Customer Exit Variables: Customer Exit variables aren't exposed by the public API, and
are therefore not supported by Power Query.
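The Local calculations entry in the preceding table notes that a Cumulated (running sum) setting in a BEx query isn't applied to the numbers returned through the public MDX interface. The following sketch, using made-up base numbers, shows the kind of running sum an analyst would have to apply locally to reproduce the cumulated view. It illustrates the calculation only and isn't tied to any Power Query or Power BI API.

```python
from itertools import accumulate

# Hypothetical base numbers as returned by the public interface, one per period
base_numbers = [100, 40, 60, 25]

# Running sum applied locally to match a "Cumulated" display in an SAP tool
cumulated = list(accumulate(base_numbers))

print(base_numbers)
print(cumulated)  # [100, 140, 200, 225]
```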
Performance Considerations
The following table provides a summary list of suggestions to improve performance for
data load and refresh from SAP BW.
Suggestion Description
Limit characteristics and properties (attribute) selection: The time it takes to load
data from SAP BW into Power Query increases with the size of the dataset, that is, the
number of columns and rows in the flattened result set. To reduce the number of columns,
only select the characteristics and properties in the navigator that you eventually want
to see in your report or dashboard.
Make use of parameters: Using filters/parameters contributes to reducing the size of the
result set, which significantly improves query runtimes.
Parameters are especially valuable when used with large dimensions, where there are many
members, such as customers, materials, or document numbers.
Limit number of key figures: Selecting many key figures from a BEx query/BW model can
have a significant performance impact during query execution because of the time being
spent on loading metadata for units. Only include the key figures that you need in Power
Query.
Split up very large queries into multiple, smaller queries: For very large queries
against InfoCubes or BEx queries, it may be beneficial to split up the query. For
example, one query might be getting the key figures, while another query (or several
other queries) is getting the characteristics data. You can join the individual query
results in Power Query.
Avoid Virtual Providers (MultiProviders or InfoSets): VirtualProviders are similar to
structures without persistent storage. They are useful in many scenarios, but can show
slower query performance because they represent an additional layer on top of actual
data.
Avoid use of navigation attributes in BEx query: A query with a navigation attribute has
to run an additional join, compared with a query with the same object as a characteristic
in order to arrive at the values.
Use RSRT to monitor and troubleshoot slow running queries: Your SAP Admin can use the
Query Monitor in SAP BW (transaction RSRT) to analyze performance issues with SAP BW
queries. Review SAP note 1591837 for more information.
Avoid Restricted Key Figures and Calculated Key Figures: Both are computed during query
execution and can slow down query performance.
Consider using incremental refresh to improve performance: Power BI refreshes the
complete dataset with each refresh. If you're working with large volumes of data,
refreshing the full dataset on each refresh may not be optimal. In this scenario, you can
use incremental refresh, so you're refreshing only a subset of data. For more details, go
to Incremental refresh in Power BI.
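The suggestion to split up very large queries relies on joining the smaller result sets afterward. In practice, you'd do that join in Power Query. The following sketch only illustrates the idea with made-up data keyed by a customer number; the `join_on_key` helper and all values are hypothetical.

```python
# Hypothetical result of a query that returns only key figures, keyed by customer number
key_figures = {
    "1000": {"Net Sales": 1200.0},
    "2000": {"Net Sales": 800.0},
}

# Hypothetical result of a second query that returns only characteristics data
characteristics = {
    "1000": {"Customer Name": "Contoso", "Country": "US"},
    "2000": {"Customer Name": "Fabrikam", "Country": "DE"},
}


def join_on_key(left, right):
    """Inner join two keyed result sets on their shared keys."""
    return {key: {**left[key], **right[key]} for key in left.keys() & right.keys()}


joined = join_on_key(key_figures, characteristics)
for customer in sorted(joined):
    print(customer, joined[customer])
```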
See also
SAP Business Warehouse Application Server
SAP Business Warehouse Message Server
Import vs. DirectQuery for SAP BW
Import vs. DirectQuery for SAP BW
Article • 01/24/2023
Note
This article discusses the differences between Import and DirectQuery modes in
Power BI Desktop. For a description of using Import mode in Power Query Desktop
or Power Query Online, go to the following sections:
With Power Query, you can connect to a wide variety of data sources, including online
services, databases, different file formats, and others. If you are using Power BI Desktop,
you can connect to these data sources in two different ways: either import the data into
Power BI, or connect directly to data in the source repository, which is known as
DirectQuery. When you connect to an SAP BW system, you can also choose between
these two connectivity modes. For a complete list of data sources that support
DirectQuery, refer to Power BI data sources.
The main differences between the two connectivity modes are outlined here, as well as
guidelines and limitations, as they relate to SAP BW connections. For additional
information about DirectQuery mode, go to Using DirectQuery in Power BI.
Import Connections
When you connect to a data source with Power BI Desktop, the navigator will allow you
to select a set of tables (for relational sources) or a set of source objects (for
multidimensional sources).
For SAP BW connections, you can select the objects you want to include in your query
from the tree displayed. You can select an InfoProvider or BEx query for an InfoProvider,
expand its key figures and dimensions, and select specific key figures, characteristics,
attributes (properties), or hierarchies to be included in your query.
The selection defines a query that will return a flattened data set consisting of columns
and rows. The selected characteristics levels, properties and key figures will be
represented in the data set as columns. The key figures are aggregated according to the
selected characteristics and their levels. A preview of the data is displayed in the
navigator. You can edit these queries in Power Query prior to loading the data, for
example to apply filters, or aggregate the data, or join different tables.
When the data defined by the queries is loaded, it will be imported into the Power BI in-
memory cache.
As you start creating your visuals in Power BI Desktop, the imported data in the cache
will be queried. The querying of cached data is very fast and changes to the visuals will
be reflected immediately.
However, take care when building visuals that further aggregate the data, especially
when dealing with non-additive measures. For example, suppose the query imported each
Sales Office and the Growth % for each one. If you then build a visual that sums the
Growth % values across all Sales Offices, that aggregation is performed locally, over the
cached data. The result wouldn't be the same as requesting the overall Growth % from SAP
BW, and is probably not what's intended. To avoid such accidental aggregations, it's
useful to set the Default Summarization for such columns to Do not summarize.
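A small worked example, using made-up figures for two hypothetical Sales Offices, shows how summing a non-additive measure such as Growth % differs from the overall value computed from the underlying amounts:

```python
# Hypothetical prior-period and current-period sales per Sales Office
offices = {
    "North": {"prior": 100.0, "current": 110.0},
    "South": {"prior": 300.0, "current": 360.0},
}


def growth_pct(prior, current):
    """Growth as a percentage of the prior-period value."""
    return (current - prior) * 100.0 / prior


# Growth % per office, as it might be imported into the cache
per_office = {name: growth_pct(v["prior"], v["current"]) for name, v in offices.items()}

# What a visual would show if it summed the imported Growth % values locally
summed_locally = sum(per_office.values())

# The overall Growth %, computed from the total amounts
overall = growth_pct(
    sum(v["prior"] for v in offices.values()),
    sum(v["current"] for v in offices.values()),
)

print(per_office)       # North grows 10%, South grows 20%
print(summed_locally)   # 30.0, not a meaningful growth figure
print(overall)          # 17.5
```

The locally summed value (30%) overstates the overall growth (17.5%), which is why setting such columns to Do not summarize helps prevent misleading visuals.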
If the data in the underlying source changes, it won't be reflected in your visuals. It will
be necessary to do a Refresh, which will reimport the data from the underlying source
into the Power BI cache.
When you publish a report (.pbix file) to the Power BI service, a dataset is created and
uploaded to the Power BI server. The imported data in the cache is included with that
dataset. While you work with a report in the Power BI service, the uploaded data is
queried, providing a fast response time and interactivity. You can set up a scheduled
refresh of the dataset, or re-import the data manually. For on-premises SAP BW data
sources, it's necessary to configure an on-premises data gateway. For information about
installing and configuring the gateway, see the on-premises data gateway documentation.
For SAP BW queries with variables, you can enter or select values as parameters of the
query. Select the Apply button to include the specified parameters in the query.
With a DirectQuery connection, the navigator displays the metadata of the selected
InfoCube or BEx Query instead of a data preview. When you select the Load button in
Navigator, no data is imported.
You can make changes to the values for the SAP BW query variables with the Edit
Queries option on the Power BI Desktop ribbon.
As you start creating your visuals in Power BI Desktop, the underlying data source in SAP
BW is queried to retrieve the required data. The time it takes to update a visual depends
on the performance of the underlying SAP BW system.
Any changes in the underlying data won't be immediately reflected in your visuals. It will
still be necessary to do a Refresh, which will rerun the queries for each visual against the
underlying data source.
When you publish a report to the Power BI service, it will again result in the creation of a
dataset in the Power BI service, just as for an import connection. However, no data is
included with that dataset.
While you work with a report in the Power BI service, the underlying data source is
queried again to retrieve the necessary data. For DirectQuery connections to your SAP
BW and SAP HANA systems, you must have an on-premises data gateway installed and
the data source registered with the gateway.
For SAP BW queries with variables, end users can edit parameters of the query.
Note
For the end user to edit parameters, the dataset needs to be published to a
premium workspace, in DirectQuery mode, and single sign-on (SSO) needs to be
enabled.
General Recommendations
You should import data to Power BI whenever possible. Importing data takes advantage
of the high-performance query engine of Power BI and provides a highly interactive and
fully featured experience over your data.
However, DirectQuery provides the following advantages when connecting to SAP BW:
Provides the ability to access SAP BW data using SSO, to ensure that security
defined in the underlying SAP BW source is always applied. When accessing SAP
BW using SSO, the user’s data access permissions in SAP will apply, which may
produce different results for different users. Data that a user isn't authorized to
view will be trimmed by SAP BW.
Ensures that the latest data can easily be seen, even if it's changing frequently in
the underlying SAP BW source.
Ensures that complex measures can easily be handled, where the source SAP BW is
always queried for the aggregate data, with no risk of unintended and misleading
aggregates over imported caches of the data.
Avoids caches of data being extracted and published, which might violate data
sovereignty or security policies that apply.
Using DirectQuery is generally only feasible when the underlying data source can
provide interactive queries for the typical aggregate query within seconds and is able to
handle the query load that will be generated. Additionally, the list of limitations that
accompany use of DirectQuery should be considered, to ensure your goals can still be
met.
If you're either working with very large datasets or encountering slow SAP BW query
response times in DirectQuery mode, Power BI provides options in the report to send
fewer queries, which makes it easier to interact with the report. To access these options
in Power BI Desktop, go to File > Options and settings > Options, and select Query
reduction.
You can disable cross-highlighting throughout your entire report, which reduces the
number of queries sent to SAP BW. You can also add an Apply button to slicers and
filter selections. You can make as many slicer and filter selections as you want, but no
queries will be sent to SAP BW until you select the Apply button. Your selections will
then be used to filter all your data.
These changes will apply to your report while you interact with it in Power BI Desktop, as
well as when your users consume the report in the Power BI service.
In the Power BI service, the query cache for DirectQuery connections is updated on a
periodic basis by querying the data source. By default, this update happens every hour,
but it can be configured to a different interval in dataset settings. For more information,
go to Data refresh in Power BI.
Also, many of the general best practices described in Using DirectQuery in Power BI
apply equally when using DirectQuery over SAP BW. Additional details specific to SAP
BW are described in Connect to SAP Business Warehouse by using DirectQuery in Power
BI.
See also
Windows authentication and single sign-on
Windows authentication and single sign-on
Article • 01/24/2023
Note
For Windows-based authentication and single sign-on functionality, your SAP BW server
must be configured for sign in using Secure Network Communications (SNC). SNC is a
mechanism provided by the SAP system that enables application-level security on data
exchanged between a client, such as Power BI Desktop, and the SAP BW server. SNC
works with different external security products and offers features that the SAP system
doesn't directly provide, including single sign-on.
In addition to your SAP BW server being configured for SNC sign in, your SAP user
account needs to be configured with an SNC name (transaction SU01 in your SAP
system).
Secure Login is a software solution by SAP that allows customers to benefit from the
advantages of SNC without having to set up a public-key infrastructure (PKI). Secure
Login allows users to authenticate with Windows Active Directory credentials.
Secure Login requires the installation of the Secure Login Client on your Power BI
Desktop machine. The installation package is named SAPSetupSCL.EXE and can be
obtained from the SAP Service Marketplace (requires SAP customer credentials).
1. In the SAP Business Warehouse server dialog box, select the Windows tab.
2. Select to either use your current Windows credentials or specify alternate Windows
credentials.
3. Enter the SNC Partner Name. This name is the configured SNC name in the SAP
BW server’s security token. You can retrieve the SNC name with transaction RZ11
(Profile Parameter Maintenance) in SAPGUI and parameter name snc/identity/as.
p:CN=<service_User_Principal_Name>
4. Select the SNC Library that your SAP BW environment has been configured for.
The NTLM and KERBEROS options will expect the corresponding DLL to be in
a folder that's been specified in the PATH variable on your local machine. The
libraries for 32-bit systems are GSSNTLM.DLL (for NTLM) and GSSKRB5.DLL
(for Kerberos). The libraries for 64-bit systems are GX64NTLM.DLL (for NTLM)
and GX64KRB5.DLL (for Kerberos).
The Custom option allows for the use of a custom-developed library.
See also
Use advanced options
Use advanced options
Article • 01/24/2023
When you create a connection to an SAP Business Warehouse server, you can optionally
specify a language code, execution mode, batch size, and an MDX Statement. Also, you
can select whether you want to enable characteristic structures.
Note
Although the images in this article illustrate the advanced options in the SAP
Business Warehouse Application Server connector, they work the same way in the
SAP Business Warehouse Message Server connector.
Language code
You can optionally specify a language code when establishing a connection to the SAP
BW server.
The expected value is a two-letter language code as defined in the SAP system. In Power
Query Desktop, select the Help icon (question mark) next to the Language Code field for
a list of valid values.
After you set the language code, Power Query displays the descriptive names of the
data objects in SAP BW in the specified language, including the field names for the
selected objects.
Note
Not all listed languages might be configured in your SAP BW system, and object
descriptions might not be translated in all languages.
If no language code is specified, the default locale from the Options dialog will be used
and mapped to a valid SAP language code. To view or override the current locale in
Power BI Desktop, open the File > Options and settings > Options dialog box and
select Current File > Regional settings. To view or override the current locale in Power
Query Online, open the Home > Options > Project options dialog box. If you
override the locale, your setting is persisted in your M query and is honored if
you copy and paste your query from Power Query Desktop to Power Query Online.
Execution mode
The Execution mode option specifies which MDX interface is used to execute queries on
the server. The following options are valid:
BasXml: Specifies the bXML flattening mode option for MDX execution in SAP
Business Warehouse.
BasXmlGzip: Specifies the Gzip compressed bXML flattening mode option for MDX
execution in SAP Business Warehouse. This option is recommended for low latency
or high volume queries. This is the default value for the execution mode option.
DataStream: Specifies the DataStream flattening mode option for MDX execution
in SAP Business Warehouse.
Batch size
Specifies the maximum number of rows to retrieve at a time when executing an MDX
statement. A small number translates into more calls to the server when retrieving a
large dataset. A large number of rows may improve performance, but could cause
memory issues on the SAP BW server. The default value is 50000 rows.
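In M, the language code, execution mode, and batch size are passed in the options record of SapBusinessWarehouse.Cubes. The following is a minimal sketch rather than output generated by the connector: the server name, system number, and client ID are placeholders, and the option values simply restate the defaults and examples described above.

```powerquery
let
    // Placeholder connection details; replace with your own server, system number, and client ID.
    Source = SapBusinessWarehouse.Cubes(
        "sapbwtestserver",
        "00",
        "837",
        [
            Implementation = "2.0",
            // Two-letter language code as defined in the SAP system.
            LanguageCode = "EN",
            // Gzip compressed bXML flattening mode (the default).
            ExecutionMode = SapBusinessWarehouseExecutionMode.BasXmlGzip,
            // Maximum number of rows retrieved at a time (the default).
            BatchSize = 50000
        ]
    )
in
    Source
```

As a rough guide to choosing a batch size, a result of 1,000,000 rows takes about 20 retrievals at the default of 50000 rows, and about 100 retrievals at 10000 rows.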
MDX Statement
Note
Instead of using the navigator to browse through and select from available data objects
in SAP BW, a user who's familiar with the MDX query language can specify an MDX
statement for direct execution in SAP BW. However, be aware that no further query
folding will be applied when using a custom MDX statement.
The statement for the example used here would look as shown in the following sample,
based on the technical names of the objects and properties in SAP BW.
The SAP BW transaction MDXTEST can also be used to construct an MDX statement. The transaction screen
includes panels on the left that assist the user in browsing to a query object in SAP BW
and generating an MDX statement.
The transaction offers different execution modes/interfaces for the MDX statement.
Select Flattening (basXML) to mimic how Power Query would execute the query in SAP
BW. This interface in SAP BW creates the row set dynamically using the selections of the
MDX statement. The resulting dynamic table that's returned to Power Query Desktop
has a very compact form that reduces memory consumption.
The transaction will display the result set of the MDX statement and useful runtime
metrics.
Enable characteristic structures
The Enable characteristic structures selection changes the way characteristic structures
are displayed in the navigator. A structure is an SAP BW object that can be used when
building BEx queries. In the BEx UX they look like the following image.
If the Enable characteristic structures selection is cleared (default), the connector
produces a cartesian product of each dimension on the structure with each available
measure. For example:
If selected, the connector produces only the available measures. For example:
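The difference between the two settings can be sketched with a small, self-contained M query. Everything in it is hypothetical: the structure members, the measure names, and the way the combined names are written are made up for illustration and aren't taken from the connector's actual navigator output.

```powerquery
let
    // Hypothetical members of a characteristic structure and hypothetical measures.
    StructureMembers = {"Actual", "Plan"},
    Measures = {"Net Sales", "Quantity"},
    // Option cleared (default): one entry per (structure member, measure) pair.
    CartesianProduct = List.TransformMany(
        StructureMembers,
        (member) => Measures,
        (member, measure) => member & ", " & measure
    ),
    // Option selected: only the measures themselves are listed.
    MeasuresOnly = Measures
in
    [OptionCleared = CartesianProduct, OptionSelected = MeasuresOnly]
```

In this sketch, the cleared option yields four combined entries (two members times two measures), while the selected option yields only the two measures.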
See also
Navigate the query objects
Transform and filter SAP BW dataset
SAP Business Warehouse connector troubleshooting
SAP Business Warehouse connector troubleshooting
Article • 01/24/2023
This article provides troubleshooting situations (and possible solutions) for working with
the SAP Business Warehouse (BW) connector.
Note
Collecting a trace of a query sent to the SAP BW server requires some options and
settings that can only be provided by using Power BI Desktop. If you don't already
have a copy of Power BI Desktop, you can obtain a copy at the Microsoft
Download Center. You can set all of the required options and settings for
advanced traces using this free version.
When an error occurs, it's often useful to collect a trace of the query
that was sent to the SAP BW server and its response. The following procedure shows
how to set up advanced traces for issues that occur using the SAP BW connector.
a. From the Windows Control Panel, select System > Advanced System Settings.
b. In System Properties, select the Advanced tab, and then select Environment
Variables.
e. Select OK.
When this advanced tracing is activated, an additional folder called SapBw will be
created in the Traces folder. See the rest of this procedure for the location of the
Traces folder.
6. While you're still in Options and settings > Global > Diagnostics, select Open
crash dump/traces folder. Ensure the folder is clear before capturing new traces.
8. Once done, close Power BI Desktop so the logs are flushed to disk.
9. You can view the newly captured traces under the SapBw folder (the Traces folder
that contains the SapBw folder is shown by selecting Open crash dump/traces
folder on the Diagnostics page in Power BI Desktop).
10. Make sure you deactivate this advanced tracing once you’re done, by either
removing the environment variable or setting PBI_EnableSapBwTracing to false.
CPIC_TRACE—3
CPIC_TRACE_DIR—a valid folder, for example: E:\traces\CPIC
The rest of the procedure remains the same. You can view the CPIC traces in the folder
you specified in the CPIC_TRACE_DIR environment variable. You can also view the
regular traces under the SapBw folder.
Also make sure you deactivate this advanced tracing once you’re done, by either
removing the environment variables or setting PBI_EnableSapBwTracing to false and
CPIC_TRACE to 0.
2. After removing, verify that the SAP .NET Connector isn't installed in the Global
Assembly Cache (GAC), by making sure the following paths do NOT exist or do
NOT contain DLLs:
32 bit GAC:
C:\Windows\Microsoft.NET\assembly\GAC_32\sapnco\v4.0_3.0.0.42__50436dca5c7f7d23
C:\Windows\Microsoft.NET\assembly\GAC_32\sapnco_utils\v4.0_3.0.0.42__50436dca5c7f7d23
64 bit GAC:
C:\Windows\Microsoft.NET\assembly\GAC_64\sapnco\v4.0_3.0.0.42__50436dca5c7f7d23
C:\Windows\Microsoft.NET\assembly\GAC_64\sapnco_utils\v4.0_3.0.0.42__50436dca5c7f7d23
3. Verify that the binaries aren't in Program Files. Make sure the following locations
do NOT exist or are empty:
C:\Program Files\SAP\SAP_DotNetConnector3_Net40_x64
4. Reinstall the connector, and remember to select the Install assemblies to GAC
option. We recommend that you use the latest version, 3.0.23.
SAP.Middleware.Connector.RfcBaseException.get_ErrorCode()'
This error is thrown when an error occurs on the SAP BW server and the SAP .NET
connector tries to retrieve information about that error. However, this error may be
hiding the real error. This error can occur when:
The SAP .NET connector was installed twice, once in the Global Assembly Cache
(GAC) and once not in the GAC.
Follow the instructions under Perform clean installation of SAP .NET connector to
reinstall the connector.
This won't solve the problem, but will provide the actual error message.
1. Verify that the version of the SAP .NET connector is installed in the correct bit
length. If you have Power BI Desktop 64-bit installed, make sure you installed the
64-bit SAP .NET connector.
2. Verify that, while installing the SAP .NET Connector, the Install assemblies to GAC
option was selected. To verify that the assemblies are installed in the GAC, open
Windows Explorer and go to:
C:\Windows\Microsoft.NET\assembly\GAC_64\sapnco
If you installed the 32-bit version of the SAP .NET connector, the path would be
C:\Windows\Microsoft.NET\assembly\GAC_32\sapnco\v4.0_3.0.0.42__50436dca5c7f7d23\sapnco.dll
(and you’d need a 32-bit version of Power BI Desktop).
Another way to check the GAC is to use gacutil (one of the options for disabling strong
name signing). You’d need to run it from a 64-bit command prompt. You can check the
contents of the GAC by opening a command prompt, navigating to the gacutil.exe path
and executing:
gacutil -l
Connectivity:
RFC_PING
RFC_METADATA_GET
MDX execution:
RSR_MDX_CREATE_OBJECT
BAPI_MDDATASET_CREATE_OBJECT
BAPI_MDDATASET_SELECT_DATA
BAPI_MDDATASET_DELETE_OBJECT
RSR_MDX_GET_AXIS_INFO
RSR_MDX_GET_AXIS_DATA
RSR_MDX_GET_CELL_DATA
BAPI_MDDATASET_GET_AXIS_INFO
BAPI_MDDATASET_GET_AXIS_DATA
BAPI_MDDATASET_GET_CELL_DATA
ExecutionMode flattening:
RSR_MDX_GET_FLAT_DATA
RSR_MDX_GET_FS_DATA
BAPI_MDDATASET_GET_FLAT_DATA
BAPI_MDDATASET_GET_FS_DATA
ExecutionMode streaming:
BAPI_MDDATASET_GET_STREAMDATA
BAPI_MDDATASET_GET_STREAMINFO
ExecutionMode BasXml:
RSR_MDX_BXML_GET_DATA
RSR_MDX_BXML_GET_GZIP_DATA
RSR_MDX_BXML_GET_INFO
RSR_MDX_BXML_SET_BINDING
Metadata:
BAPI_MDPROVIDER_GET_DIMENSIONS
BAPI_MDPROVIDER_GET_CATALOGS
BAPI_MDPROVIDER_GET_CUBES
BAPI_MDPROVIDER_GET_MEASURES
BAPI_MDPROVIDER_GET_HIERARCHYS
BAPI_MDPROVIDER_GET_LEVELS
BAPI_MDPROVIDER_GET_PROPERTIES
BAPI_MDPROVIDER_GET_MEMBERS
BAPI_MDPROVIDER_GET_VARIABLES
Information:
BAPI_IOBJ_GETDETAIL (required for typed dimensions (DATS, TIMS))
BAPI_USER_GET_DETAIL (only used for flattening interface)
RFC_READ_TABLE (required for catalog names and certain variable values calls)
This error appears when the installed version in the GAC is lower than the expected
3.0.18.0 version. SAP Note 2417315 discusses this scenario.
SNC_MODE—SncModeApply
SNC_LIB—with the library path specified; if it's an environment variable, it's
expanded at this point
SNC_PARTNERNAME—with the value provided
SNC_QOP = RfcConfigParameters.RfcSncQOP.Default
These are used for both SAP BW Application Server and SAP BW Message Server
connections.
LANG (Language)
CLIENT
ASHOST (AppServerHost)
SYSNR (SystemNumber)
MSHOST (MessageServerHost)
SYSID (SystemID)
GROUP (LogonGroup)
This issue is discussed in the following SAP Notes. Access to these notes requires an S-
user. Contact your SAP Basis team to apply the relevant fixes for this issue.
Additionally, for other similar errors, you can review the contents of the following SAP
notes, and apply them as appropriate for your environment:
In the logs— Message: [Expression.Error] The key didn't match any rows in the
table.
StackTrace:
at Microsoft.Mashup.Engine1.Runtime.TableValue.get_Item(Value key)
at Microsoft.Mashup.Engine1.Library.Cube.CubeParametersModule.Cube.ApplyParameterFunctionValue.GetParameterValue(CubeValue cubeValue, Value parameter)
at Microsoft.Mashup.Engine1.Library.Cube.CubeParametersModule.Cube.ApplyParameterFunctionValue.TypedInvoke(TableValue cube, Value parameter, Value arguments)
Detail: [Key = [Id = \"[!V000004]\"], Table = #table({...}, {...})]
Note
4. The query there should have a line that starts with {Cube.ApplyParameter, "[!V000004]"
(the missing parameter). Remove that line.
5. Select Done.
If the above workaround doesn't work, the only alternative fix is for you to recreate the
report.
Note
The following information only applies when using Implementation 1.0 of the SAP
BW connector or Implementation 2.0 of the SAP BW connector with Flattening
mode (when ExecutionMode=67).
User accounts in SAP BW have default settings for how decimal or date/time values are
formatted when displayed to the user in the SAP GUI.
The default settings are maintained in the SAP system in the User Profile for an account,
and the user can view or change these settings in the SAP GUI with the menu path
System > User Profile > Own Data.
Power BI Desktop queries the SAP system for the decimal notation of the connected
user and uses that notation to format decimal values in the data from SAP BW.
SAP BW returns decimal data with either a , (comma) or a . (dot) as the decimal
separator. To specify which of those SAP BW should use for the decimal separator, the
driver used by Power BI Desktop makes a call to BAPI_USER_GET_DETAIL . This call returns
a structure called DEFAULTS , which has a field called DCPFM that stores Decimal Format
Notation. The field takes one of the following values:
Customers who have reported this issue found that the call to BAPI_USER_GET_DETAIL is
failing for a particular user, which is showing the incorrect data, with an error message
similar to the following message:
XML
To solve this error, users must ask their SAP admin to grant the SAP BW user being used
in Power BI the right to execute BAPI_USER_GET_DETAIL . It’s also worth verifying that the
user has the required DCPFM value, as described earlier in this troubleshooting solution.
SAP users need access to specific BAPI function modules to get metadata and retrieve
data from SAP BW's InfoProviders. These modules include:
BAPI_MDPROVIDER_GET_CATALOGS
BAPI_MDPROVIDER_GET_CUBES
BAPI_MDPROVIDER_GET_DIMENSIONS
BAPI_MDPROVIDER_GET_HIERARCHYS
BAPI_MDPROVIDER_GET_LEVELS
BAPI_MDPROVIDER_GET_MEASURES
BAPI_MDPROVIDER_GET_MEMBERS
BAPI_MDPROVIDER_GET_VARIABLES
BAPI_IOBJ_GETDETAIL
To solve this issue, verify that the user has access to the various MDPROVIDER modules
and BAPI_IOBJ_GETDETAIL . To further troubleshoot this or similar issues, you can enable
tracing. Select File > Options and settings > Options. In Options, select Diagnostics,
then select Enable tracing. Attempt to retrieve data from SAP BW while tracing is active,
and examine the trace file for more detail.
Memory Exceptions
In some cases, you might encounter one of the following memory errors:
Message: The memory request for [number] bytes could not be complied with.
These memory exceptions are from the SAP BW server and are due to the server
running out of available memory to process the query. This might happen when the
query returns a large set of results or when the query is too complex for the server to
handle, for example, when a query has many crossjoins.
To resolve this error, the recommendation is to simplify the query or divide it into
smaller queries. If possible, push more aggregation to the server. Alternatively, contact
your SAP Basis team to increase the resources available in the server.
First, follow the instructions in 2777473 - MDX: FAQ for Power BI accessing BW or
BW/4HANA and see if that resolves your issue.
Because the Power Query SAP Business Warehouse connector uses the MDX interface
provided by SAP for third-party access, you'll need to contact SAP for possible solutions.
SAP owns the layer between the MDX interface and the SAP BW server. Ask how "long
text is XL" can be specified for your specific scenario.
1. Create a new environment variable either by navigating to File Explorer > This PC
> Properties > Advanced system settings > Environment Variables > System
Variables > New, or by opening a command prompt and entering sysdm.cpl and
then selecting New under System Variables.
2. Name the environment variable PBI_AlwaysEnableQueryEditor and set the value
to true. This variable setting allows access to the query editor even in DirectQuery
mode.
3. In Power BI Desktop, in the Home tab, select Transform Data to open the Power
Query editor.
4. Update the query to use implementation 2.0 by following these instructions,
starting with Step 2 in that article.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
You'll need an SAP account to sign in to the website and download the drivers. If you're
unsure, contact the SAP administrator in your organization.
To use SAP HANA in Power BI Desktop or Excel, you must have the SAP HANA ODBC
driver installed on the local client computer for the SAP HANA data connection to work
properly. You can download the SAP HANA Client tools from SAP Development Tools,
which contains the necessary ODBC driver. Or you can get it from the SAP Software
Download Center. In the Software portal, search for the SAP HANA CLIENT for
Windows computers. Since the SAP Software Download Center changes its structure
frequently, more specific guidance for navigating that site isn't available. For instructions
about installing the SAP HANA ODBC driver, go to Installing SAP HANA ODBC Driver on
Windows 64 Bits.
To use SAP HANA in Excel, you must have either the 32-bit or 64-bit SAP HANA ODBC
driver (depending on whether you're using the 32-bit or 64-bit version of Excel) installed
on the local client computer.
This feature is only available in Excel for Windows if you have Office 2019 or a Microsoft
365 subscription. If you're a Microsoft 365 subscriber, make sure you have the latest
version of Office.
HANA 1.0 SPS 12rev122.09, 2.0 SPS 3rev30, and BW/4HANA 2.0 are supported.
Capabilities Supported
Import
Direct Query (Power BI Datasets)
Advanced
SQL Statement
1. Select Get Data > SAP HANA database in Power BI Desktop or From Database >
From SAP HANA Database in the Data ribbon in Excel.
2. Enter the name and port of the SAP HANA server you want to connect to. The
example in the following figure uses SAPHANATestServer on port 30015 .
By default, the port number is set to support a single container database. If your
SAP HANA database can contain more than one multitenant database container,
select Multi-container system database (30013). If you want to connect to a
tenant database or a database with a non-default instance number, select Custom
from the Port drop-down menu.
If you're connecting to an SAP HANA database from Power BI Desktop, you're also
given the option of selecting either Import or DirectQuery. The example in this
article uses Import, which is the default (and the only mode for Excel). For more
information about connecting to the database using DirectQuery in Power BI
Desktop, go to Connect to SAP HANA data sources by using DirectQuery in Power
BI.
You can also enter an SQL statement or enable column binding from Advanced
options. More information: Connect using advanced options
3. If you're accessing a database for the first time, you'll be asked to enter your
credentials for authentication. In this example, the SAP HANA server requires
database user credentials, so select Database and enter your user name and
password. If necessary, enter your server certificate information.
Also, you may need to validate the server certificate. For more information about
using validate server certificate selections, see Using SAP HANA encryption. In
Power BI Desktop and Excel, the validate server certificate selection is enabled by
default. If you've already set up these selections in ODBC Data Source
Administrator, clear the Validate server certificate check box. To learn more about
using ODBC Data Source Administrator to set up these selections, go to Configure
SSL for ODBC client access to SAP HANA.
4. From the Navigator dialog box, you can either transform the data in the Power
Query editor by selecting Transform Data, or load the data by selecting Load.
2. Enter the name and port of the SAP HANA server you want to connect to. The
example in the following figure uses SAPHANATestServer on port 30015 .
4. Select the name of the on-premises data gateway to use for accessing the
database.
Note
You must use an on-premises data gateway with this connector, whether your
data is local or online.
5. Choose the authentication kind you want to use to access your data. You'll also
need to enter a username and password.
Note
6. Select Use Encrypted Connection if you're using any encrypted connection, then
choose the SSL crypto provider. If you're not using an encrypted connection, clear
Use Encrypted Connection. More information: Enable encryption for SAP HANA
7. Select Next to continue.
8. From the Navigator dialog box, you can either transform the data in the Power
Query editor by selecting Transform Data, or load the data by selecting Load.
The following table describes all of the advanced options you can set in Power Query.
Advanced option        Description
SQL Statement          More information: Import data from a database using native database query
Enable column binding  Binds variables to the columns of an SAP HANA result set when fetching data. Might improve performance at the cost of slightly higher memory utilization. This option is only available in Power Query Desktop. More information: Enable column binding
ConnectionTimeout      A duration that controls how long to wait before abandoning an attempt to make a connection to the server. The default value is 15 seconds.
CommandTimeout         A duration that controls how long the server-side query is allowed to run before it is canceled. The default value is ten minutes.
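Because ConnectionTimeout and CommandTimeout are durations, they can be written in M with #duration(days, hours, minutes, seconds). The following is a minimal sketch, assuming a placeholder server name and arbitrarily chosen timeout values; it isn't a recommendation for specific settings.

```powerquery
let
    // Placeholder server name and port.
    Source = SapHana.Database(
        "myhanaserver:30015",
        [
            Implementation = "2.0",
            // Wait up to 30 seconds for the connection (the default is 15 seconds).
            ConnectionTimeout = #duration(0, 0, 0, 30),
            // Allow the server-side query to run for up to 20 minutes (the default is ten minutes).
            CommandTimeout = #duration(0, 0, 20, 0)
        ]
    )
in
    Source
```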
Both the Power BI Desktop and Excel connector for an SAP HANA database use the
SAP ODBC driver to provide the best user experience.
In Power BI Desktop, SAP HANA supports both DirectQuery and Import options.
With SAP HANA, you can also use SQL commands in the native database query
SQL statement to connect to Row and Column Tables in HANA Catalog tables,
which aren't included in the Analytic/Calculation Views provided by the Navigator
experience. You can also use the ODBC connector to query these tables.
There are currently some limitations for HANA variables attached to HDI-based
Calculation Views. These limitations are because of errors on the HANA side.
First, it isn't possible to apply a HANA variable to a shared column of an HDI-
container-based Calculation View. To fix this limitation, upgrade to HANA 2
version 37.02 and onwards or to HANA 2 version 42 and onwards. Second,
multi-entry default values for variables and parameters currently don't show up
in the Power BI UI. An error in SAP HANA causes this limitation, but SAP hasn't
announced a fix yet.
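The native SQL approach mentioned in the list above can be sketched in M by passing the statement in the Query option of SapHana.Database. The server name, schema, and table in this sketch are hypothetical placeholders, so treat it as an illustration of the shape of the call rather than a query to run as is.

```powerquery
let
    // Placeholder server; MYSCHEMA and MY_COLUMN_TABLE are hypothetical catalog objects.
    Source = SapHana.Database(
        "myhanaserver:30015",
        [
            Implementation = "2.0",
            // Double quotation marks inside an M text value are escaped by doubling them.
            Query = "select * from ""MYSCHEMA"".""MY_COLUMN_TABLE"""
        ]
    )
in
    Source
```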
Currently, when you use Power Query Desktop to connect to an SAP HANA database,
you can select the Enable column binding advanced option to enable column binding.
You can also enable column binding in existing queries or in queries used in Power
Query Online by manually adding the EnableColumnBinding option to the connection in
the Power Query formula bar or advanced editor. For example:
Power Query M
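A minimal sketch of a connection with the option added by hand might look like the following. The server name and port are placeholders, and the Implementation setting is included only because the other samples in this article use it.

```powerquery
let
    // Placeholder server name and port.
    Source = SapHana.Database(
        "myhanaserver:30015",
        [Implementation = "2.0", EnableColumnBinding = true]
    )
in
    Source
```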
There are limitations associated with manually adding the EnableColumnBinding option:
Enable column binding works in both Import and DirectQuery mode. However,
retrofitting an existing DirectQuery query to use this advanced option isn't
possible. Instead, a new query must be created for this feature to work correctly.
In SAP HANA Server version 2.0 or later, column binding is all or nothing. If some
columns can’t be bound, none will be bound, and the user will receive an
exception, for example, DataSource.Error: Column MEASURE_UNIQUE_NAME of type
VARCHAR cannot be bound (20002 > 16384) .
SAP HANA version 1.0 servers don't always report correct column lengths. In this
context, EnableColumnBinding allows for partial column binding. For some queries,
this could mean that no columns are bound. When no columns are bound, no
performance benefits are gained.
Note
In the Power Query SAP HANA database connector, native queries don't support
duplicate column names when EnableFolding is set to true.
Unlike other connectors, the SAP HANA database connector supports EnableFolding =
true and specifying parameters at the same time.
To use parameters in a query, you place question marks (?) in your code as placeholders.
To specify the parameter, you use the SqlType text value and a value for that SqlType in
Value. Value can be any M value, but it must be compatible with the specified
SqlType.
There are multiple ways of specifying parameters:
Power Query M
Power Query M
Power Query M
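Judging from the full examples later in this section, a parameter list can hold plain values, records that pair a SqlType with a Value, or a mix of the two. The following self-contained sketch only builds the three list shapes and returns them in a record; the values are illustrative, and in a real query one such list would be passed as the parameters argument of Value.NativeQuery.

```powerquery
let
    // Values only: no SqlType is given.
    ValuesOnly = {"Seattle", 1},
    // Records only: each one names the SqlType and supplies the Value.
    TypedRecords = {
        [SqlType = "CHAR", Value = "M"],
        [SqlType = "DATE", Value = #date(2022, 5, 27)]
    },
    // A mix of plain values and typed records.
    Mixed = {
        "Seattle",
        1,
        [SqlType = "SECONDDATE", Value = #datetime(2022, 5, 27, 17, 43, 7)]
    }
in
    [ValuesOnly = ValuesOnly, TypedRecords = TypedRecords, Mixed = Mixed]
```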
SqlType follows the standard type names defined by SAP HANA. For example, the
following list shows common type names:
BIGINT
BINARY
BOOLEAN
CHAR
DATE
DECIMAL
DOUBLE
INTEGER
NVARCHAR
SECONDDATE
SHORTTEXT
SMALLDECIMAL
SMALLINT
TIME
TIMESTAMP
VARBINARY
VARCHAR
Power Query M
let
    Source = Value.NativeQuery(
        SapHana.Database(
            "myhanaserver:30015",
            [Implementation = "2.0"]
        ),
        "select ""VARCHAR_VAL"" as ""VARCHAR_VAL""
        from ""_SYS_BIC"".""DEMO/CV_ALL_TYPES""
        where ""VARCHAR_VAL"" = ? and ""DATE_VAL"" = ?
        group by ""VARCHAR_VAL""
        ",
        {"Seattle", #date(1957, 6, 13)},
        [EnableFolding = true]
    )
in
    Source
The following example demonstrates how to provide a list of records (or mix values and
records):
Power Query M
let
    // Server is assumed to be a text parameter or query that holds "host:port".
    Source = Value.NativeQuery(
        SapHana.Database(Server, [Implementation="2.0"]),
        "select
            ""COL_VARCHAR"" as ""COL_VARCHAR"",
            ""ID"" as ""ID"",
            sum(""DECIMAL_MEASURE"") as ""DECIMAL_MEASURE""
        from ""_SYS_BIC"".""DEMO/CV_ALLTYPES""
        where
            ""COL_ALPHANUM"" = ? or
            ""COL_BIGINT"" = ? or
            ""COL_BINARY"" = ? or
            ""COL_BOOLEAN"" = ? or
            ""COL_DATE"" = ?
        group by
            ""COL_ALPHANUM"",
            ""COL_BIGINT"",
            ""COL_BINARY"",
            ""COL_BOOLEAN"",
            ""COL_DATE""",
        // One record per question mark placeholder, in order.
        {
            [ SqlType = "CHAR", Value = "M" ],                  // COL_ALPHANUM - CHAR
            [ SqlType = "BIGINT", Value = 4 ],                  // COL_BIGINT - BIGINT
            [ SqlType = "BINARY", Value = Binary.FromText("AKvN",
                BinaryEncoding.Base64) ],                       // COL_BINARY - BINARY
            [ SqlType = "BOOLEAN", Value = true ],              // COL_BOOLEAN - BOOLEAN
            [ SqlType = "DATE", Value = #date(2022, 5, 27) ]    // COL_DATE - TYPE_DATE
        },
        [EnableFolding=false]
    )
in
    Source
Before, when you added a table column (or another transformation that internally adds
a column), the query would "drop out of cube space", and all operations would be done
at a table level. At some point, this drop out could cause the query to stop folding.
Performing cube operations after adding a column was no longer possible.
With this change, the added columns are treated as dynamic attributes within the cube.
Having the query remain in cube space for this operation has the advantage of letting
you continue using cube operations even after adding columns.
7 Note
This new functionality is only available when you connect to Calculation Views in
SAP HANA Server version 2.0 or higher.
The following sample query takes advantage of this new capability. In the past, you
would get a "the value is not a cube" exception when applying
Cube.CollapseAndRemoveColumns.
Power Query M
let
    Source = SapHana.Database("someserver:someport", [Implementation="2.0"]),
    Contents = Source{[Name="Contents"]}[Data],
    SHINE_CORE_SCHEMA.sap.hana.democontent.epm.models =
        Contents{[Name="SHINE_CORE_SCHEMA.sap.hana.democontent.epm.models"]}[Data],
    PURCHASE_ORDERS1 =
        SHINE_CORE_SCHEMA.sap.hana.democontent.epm.models{[Name="PURCHASE_ORDERS"]}[Data],
    #"Added Items" = Cube.Transform(PURCHASE_ORDERS1,
        {
            {Cube.AddAndExpandDimensionColumn, "[PURCHASE_ORDERS]",
                {"[HISTORY_CREATEDAT].[HISTORY_CREATEDAT].Attribute",
                 "[Product_TypeCode].[Product_TypeCode].Attribute",
                 "[Supplier_Country].[Supplier_Country].Attribute"},
                {"HISTORY_CREATEDAT", "Product_TypeCode", "Supplier_Country"}},
            {Cube.AddMeasureColumn, "Product_Price", "[Measures].[Product_Price]"}
        }),
    // Adding a column keeps the query in cube space on SAP HANA Server 2.0 or higher.
    #"Inserted Year" = Table.AddColumn(#"Added Items", "Year",
        each Date.Year([HISTORY_CREATEDAT]), Int64.Type),
    #"Filtered Rows" = Table.SelectRows(#"Inserted Year",
        each ([Product_TypeCode] = "PR")),
    #"Added Conditional Column" = Table.AddColumn(#"Filtered Rows", "Region",
        each if [Supplier_Country] = "US" then "North America"
        else if [Supplier_Country] = "CA" then "North America"
        else if [Supplier_Country] = "MX" then "North America"
        else "Rest of world"),
    #"Filtered Rows1" = Table.SelectRows(#"Added Conditional Column",
        each ([Region] = "North America")),
    // Cube operations still work after the added columns.
    #"Collapsed and Removed Columns" =
        Cube.CollapseAndRemoveColumns(#"Filtered Rows1",
            {"HISTORY_CREATEDAT", "Product_TypeCode"})
in
    #"Collapsed and Removed Columns"
Next steps
The following articles contain more information that you may find useful when
connecting to an SAP HANA database.
Enable encryption for SAP HANA
We recommend that you encrypt connections to an SAP HANA server from Power Query
Desktop and Power Query Online. You can enable HANA encryption using SAP's
proprietary CommonCryptoLib (formerly known as sapcrypto) library. SAP recommends
using CommonCryptoLib.
7 Note
SAP no longer supports OpenSSL, and as a result, Microsoft has also discontinued
its support. Use CommonCryptoLib instead.
7 Note
The setup steps for encryption detailed in this article overlap with the setup and
configuration steps for SAML SSO. Use CommonCryptoLib as your HANA server's
encryption provider, and make sure that your choice of CommonCryptoLib is
consistent across SAML and encryption configurations.
There are four phases to enabling encryption for SAP HANA. We cover these phases
next. More information: Securing the Communication between SAP HANA Studio and
SAP HANA Server through SSL
Use CommonCryptoLib
Ensure your HANA server is configured to use CommonCryptoLib as its cryptographic
provider.
Create a certificate signing request
Create an X509 certificate signing request for the HANA server.
1. Using SSH, connect to the Linux machine that the HANA server runs on as
<sid>adm.
If you don't already have a CA you can use, you can create a root CA yourself by
following the steps outlined in Securing the Communication between SAP HANA
Studio and SAP HANA Server through SSL .
This command creates a certificate signing request and private key. Fill in <HOSTNAME
with FQDN> with the host name and fully qualified domain name (FQDN).
5. Verify the trust relationship between a client and the CA you used to sign the SAP
HANA server's certificate.
The client must trust the CA used to sign the HANA server's X509 certificate before
an encrypted connection can be made to the HANA server from the client's
machine.
There are various ways to ensure this trust relationship exists using Microsoft
Management Console (mmc) or the command line. You can import the CA's X509
certificate (cert.pem) into the Trusted Root Certification Authorities folder for the
user that will establish the connection, or into the same folder for the client
machine itself, if that is desirable.
You must first convert cert.pem into a .crt file before you can import the certificate
into the Trusted Root Certification Authorities folder.
7 Note
Before using the procedures in this section, you must be signed in to Power BI
using your admin account credentials.
Before you can validate a server certificate in the Power BI service online, you must have
a data source already set up for the on-premises data gateway. If you don't already have
a data source set up to test the connection, you'll have to create one. To set up the data
source on the gateway:
3. Select the ellipsis (...) next to the name of the gateway you want to use with this
connector.
5. In Data Source Settings, enter the data source name you want to call this new
source in the Data Source Name text box.
7. Enter the server name in Server, and select the authentication method.
1. In Power BI Desktop or in the Data Source Settings page of the Power BI service,
ensure that Validate server certificate is enabled before attempting to establish a
connection to your SAP HANA server. For SSL crypto provider, select
commoncrypto. Leave the SSL key store and SSL trust store fields blank.
Power BI Desktop
Power BI service
2. Verify that you can successfully establish an encrypted connection to the server
with the Validate server certificate option enabled, by loading data in Power BI
Desktop or refreshing a published report in Power BI service.
You'll note that only the SSL crypto provider information is required. However, your
implementation might require that you also use the key store and trust store. For more
information about these stores and how to create them, go to Client-Side TLS/SSL
Connection Properties (ODBC) .
Additional information
Server-Side TLS/SSL Configuration Properties for External Communication
(JDBC/ODBC)
Next steps
Configure SSL for ODBC client access to SAP HANA
Configure SSL for ODBC client access to
SAP HANA
Article • 01/24/2023
If you're connecting to an SAP HANA database from Power Query Online, you may need
to set up various property values to connect. These properties could be the SSL crypto
provider, an SSL key store, and an SSL trust store. You may also require that the
connection be encrypted. In this case, you can use the ODBC Data Source Administrator
application supplied with Windows to set up these properties.
In Power BI Desktop and Excel, you can set up these properties when you first sign in
using the Power Query SAP HANA database connector. The Validate server certificate
selection in the authentication dialog box is enabled by default. You can then enter
values in the SSL crypto provider, SSL key store, and SSL trust store properties in this
dialog box. However, all of the validate server certificate selections in the authentication
dialog box in Power BI Desktop and Excel are optional. They're optional in case you want
to use ODBC Data Source Administrator to set them up at the driver level.
7 Note
You must have the proper SAP HANA ODBC driver (32-bit or 64-bit) installed
before you can set these properties in ODBC Data Source Administrator.
If you're going to use ODBC Data Source Administrator to set up the SSL crypto
provider, SSL key store, and SSL trust store in Power BI or Excel, clear the Validate server
certificate check box when presented with the authentication dialog box.
To use ODBC Data Source Administrator to set up the validate server certificate
selections:
1. From the Windows Start menu, select Windows Administrative Tools > ODBC
Data Sources. If you're using a 32-bit version of Power BI Desktop or Excel, open
ODBC Data Sources (32-bit), otherwise open ODBC Data Sources (64-bit).
3. In the Create New Data Source dialog box, select the HDBODBC driver, and then
select Finish.
4. In the ODBC Configuration for SAP HANA dialog box, enter a Data source name.
Then enter your server and database information, and select Validate the TLS/SSL
certificate.
6. In the Advanced ODBC Connection Property Setup dialog box, select the Add
button.
7. In the Add/Modify Connection Property dialog box, enter sslCryptoProvider in
the Property text box.
8. In the Value text box, enter the name of the crypto provider you'll be using: either
sapcrypto, commoncrypto, openssl, or mscrypto.
9. Select OK.
10. You can also add the optional sslKeyStore and sslTrustStore properties and values if
necessary. If the connection must be encrypted, add ENCRYPT as the property and
TRUE as the value.
11. In the Advanced ODBC Connection Property Setup dialog box, select OK.
12. To test the connection you’ve set up, select Test connection in the ODBC
Configuration for SAP HANA dialog box.
13. When the test connection has completed successfully, select OK.
For more information about the SAP HANA connection properties, see Server-Side
TLS/SSL Configuration Properties for External Communication (JDBC/ODBC) .
7 Note
If you select Validate server certificate in the SAP HANA authentication dialog box
in Power BI Desktop or Excel, any values you enter in SSL crypto provider, SSL key
store, and SSL trust store in the authentication dialog box will override any
selections you've set up using ODBC Data Source Administrator.
Next steps
SAP HANA database connector troubleshooting
Troubleshooting
Article • 01/24/2023
The following section describes some issues that may occur while using the Power
Query SAP HANA connector, along with some possible solutions.
HKEY_LOCAL_MACHINE\Software\ODBC\ODBCINST.INI\ODBC Drivers
If you’re on a 64-bit machine, but Excel or Power BI Desktop is 32-bit (like the
screenshots below), you can check for the driver in the WOW6432 node instead:
HKEY_LOCAL_MACHINE\Software\WOW6432Node\ODBC\ODBCINST.INI\ODBC Drivers
Note that the driver needs to match the bit version of your Excel or Power BI Desktop. If
you’re using:
32-bit Excel/Power BI Desktop, you'll need the 32-bit ODBC driver (HDBODBC32).
64-bit Excel/Power BI Desktop, you'll need the 64-bit ODBC driver (HDBODBC).
Finally, the driver should also show up as "ODBC DataSources 32-bit" or "ODBC
DataSources 64-bit".
Collect SAP HANA ODBC Driver traces
To capture an SAP HANA trace:
4. Open Power BI, clear the cache, and rerun the scenario.
From the Log File Path in the Tracing tab of the ODBC Data Source
Administrator.
From the HANA trace based on the path configured with the command
hdbodbc_cons32.exe config trace filename.
The trace commands should be run as the user that will be running the Mashup
process that accesses the SAP HANA server.
The trace file path you specify should be writable by the user that runs the Mashup
process.
For example:
To capture non-SSO connections from the gateway, make sure you use the gateway
service user. That is, run the command-line window as the gateway user when you
want to execute the hdbodbc_cons.exe calls. Make sure that the gateway service user
can write to the log file location you specify.
To capture SSO connections from the gateway, use the SSO user.
The user legitimately not having enough privileges on the view they're trying to
access.
Issue: You can't connect to SAP HANA from Power BI Desktop using SAP client 2.0
37.02, but the connection works if you downgrade the client version to 1.00.120.128.
Unfortunately, this is an SAP issue, so you'll need to wait for a fix from SAP.
SharePoint folder
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
7 Note
AAD/OAuth for SharePoint on-premises isn’t supported using the on-premises data
gateway.
Capabilities supported
Folder path
Combine
Combine and load
Combine and transform
Determine the site URL
When you're connecting to a SharePoint site, you'll be asked to enter the site URL. To
find the site URL that contains your SharePoint folder, first open a page in SharePoint.
From a page in SharePoint, you can usually get the site address by selecting Home in
the navigation pane, or the icon for the site at the top. Copy the address from your web
browser's address bar and save for later.
2. Paste the SharePoint site URL you copied in Determine the site URL to the Site URL
text box in the SharePoint folder dialog box. In this example, the site URL is
https://contoso.sharepoint.com/marketing/data. If the site URL you enter is
invalid, a warning icon will appear next to the URL text box.
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate
authentication method. Enter your credentials and choose which level to apply
these settings to. Then select Connect.
4. When you select the SharePoint folder you want to use, the file information about
all of the files in that SharePoint folder is displayed. In addition, file information
about any files in any subfolders is also displayed.
5. Select Combine & Transform Data to combine the data in the files of the selected
SharePoint folder and load the data into the Power Query Editor for editing. Or
select Combine & Load to load the data from all of the files in the SharePoint
folder directly into your app.
7 Note
The Combine & Transform Data and Combine & Load buttons are the easiest ways
to combine data found in the files of the SharePoint folder you specify. You could
also use the Load button or the Transform Data buttons to combine the files as
well, but that requires more manual steps.
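For readers who prefer to start from a blank query, the connection above can be sketched in
M as follows. This is an illustration, not the exact code the dialog generates. The site URL
is the example URL from this article, and the ApiVersion value is an assumption.

```powerquery
let
    // Example site URL from this article; replace with your own site address.
    SiteUrl = "https://contoso.sharepoint.com/marketing/data",
    // SharePoint.Files returns one row per file found on the site,
    // including files in subfolders.
    Files = SharePoint.Files(SiteUrl, [ApiVersion = 15])
in
    Files
```

The resulting table holds the file information described in the steps above, which you can
then combine or transform.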
2. Paste the SharePoint site URL you copied in Determine the site URL to the Site URL
text box in the SharePoint folder dialog box. In this example, the site URL is
https://contoso.sharepoint.com/marketing/data.
3. If the SharePoint folder is on-premises, enter the name of an on-premises data
gateway.
4. Select the authentication kind, and enter any credentials that are required.
5. Select Next.
6. When you select the SharePoint folder you want to use, the file information about
all of the files in that SharePoint folder is displayed. In addition, file information
about any files in any subfolders is also displayed.
7. Select Combine to combine the data in the files of the selected SharePoint folder
and load the data into the Power Query Editor for editing.
7 Note
The Combine button is the easiest way to combine data found in the files of
the SharePoint folder you specify. You could also use the Transform Data
button to combine the files, but that requires more manual steps.
Troubleshooting
Combining files
All of the files in the SharePoint folder you select will be included in the data to be
combined. If you have data files located in a subfolder of the SharePoint folder you
select, all of these files will also be included. To ensure that combining the file data
works properly, make sure that all of the files in the folder and the subfolders have the
same schema.
In some cases, you might have multiple folders on your SharePoint site containing
different types of data. In this case, you'll need to delete the unnecessary files. To delete
these files:
1. In the list of files from the SharePoint folder you chose, select Transform Data.
2. In the Power Query editor, scroll down to find the files you want to keep.
3. In the example shown in the screenshot above, the required files are the last rows
in the table. Select Remove Rows, enter the value of the last row before the files to
keep (in this case 903), and select OK.
4. Once you've removed all the unnecessary files, select Combine Files from the
Home ribbon to combine the data from all of the remaining files.
For more information about combining files, go to Combine files in Power Query.
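The steps above remove rows by position. As an alternative sketch, the same result can
often be reached by filtering on file metadata before combining. The folder name "/Sales/"
and the .csv extension below are hypothetical, chosen only for illustration.

```powerquery
let
    Files = SharePoint.Files(
        "https://contoso.sharepoint.com/marketing/data",
        [ApiVersion = 15]
    ),
    // Keep only files whose folder path contains the hypothetical "/Sales/" folder.
    InFolder = Table.SelectRows(Files, each Text.Contains([Folder Path], "/Sales/")),
    // Keep only CSV files so that every remaining file shares one schema.
    CsvOnly = Table.SelectRows(InFolder, each Text.Lower([Extension]) = ".csv")
    // Positional alternative matching the steps above: Table.Skip(Files, 903)
in
    CsvOnly
```

A metadata filter keeps working when files are added to the site later, whereas a fixed
row count would need to be updated.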
# % $
If these characters are present in the filename, the file owner must rename the file so
that it does NOT contain any of these characters.
Permissions
When requesting document library contents, you should have Read access to the
SharePoint site as well as the document library and any folders leading to the requested
file.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
7 Note
AAD/OAuth for SharePoint on-premises isn’t supported using the on-premises data
gateway.
Capabilities supported
Site URL
2. Paste the SharePoint site URL you copied in Determine the site URL to the Site URL
field in the open dialog box.
If the URL address you enter is invalid, a warning icon will appear next to the
Site URL textbox.
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate
authentication method. Enter your credentials and choose which level to apply these
settings to. Then select Connect.
4. From the Navigator, you can select a location, then either transform the data in
the Power Query editor by selecting Transform Data, or load the data by selecting
Load.
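As a rough sketch, the equivalent connection in a blank query might look like the following.
The site URL is the example root address used in this article, and the ApiVersion value is
an assumption.

```powerquery
let
    // Root address of the site, without subfolders or documents.
    SiteUrl = "https://contoso.sharepoint.com/teams/ObjectModel/",
    // SharePoint.Tables returns one row per list on the site.
    Lists = SharePoint.Tables(SiteUrl, [ApiVersion = 15])
in
    Lists
```

From the returned table, you select the list you want, much as you would in the Navigator.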
Connect to a SharePoint list from Power Query
Online
To connect to a SharePoint list:
2. Paste the SharePoint site URL you copied in Determine the site URL to the Site URL
field in the open dialog box.
4. Select the authentication kind, and enter any credentials that are required.
5. Select Next.
6. From the Navigator, you can select a location, then transform the data in the
Power Query editor by selecting Next.
Troubleshooting
Use root SharePoint address
Make sure you supply the root address of the SharePoint site, without any subfolders or
documents. For example, use a link similar to the following:
https://contoso.sharepoint.com/teams/ObjectModel/
This issue only happens when the Data Type is not explicitly set for a column in the
Query View of Power BI Desktop. You can tell that the data type isn't set by seeing the
"ABC 123" image on the column and "Any" data type in the ribbon as shown below.
The user can force the interpretation to be consistent by explicitly setting the data type
for the column through the Power Query Editor. For example, the following image
shows the column with an explicit Boolean type.
Next steps
Optimize Power Query when expanding table columns
SharePoint Online list
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Site URL
2. Paste the SharePoint site URL you copied in Determine the site URL to the Site URL
field in the open dialog box.
If the URL address you enter is invalid, a warning icon will appear next to the
Site URL textbox.
You can also select either the 1.0 implementation of this connector or the 2.0
implementation. More information: Connect to SharePoint Online list v2.0
Select OK to continue.
3. If this is the first time you've visited this site address, select the appropriate
authentication method. Enter your credentials and choose which level to apply these
settings to. Then select Connect.
4. From the Navigator, you can select a location, then either transform the data in
the Power Query editor by selecting Transform Data, or load the data by selecting
Load.
4. Select the authentication kind, and enter any credentials that are required.
5. Select Next.
6. From the Navigator, you can select a location, then transform the data in the
Power Query editor by selecting Transform data.
With this update to the connector, we're making available two different views for the
same data:
All
Default
The All view includes all user created and system defined columns. You can see what
columns are included in the following screen.
The Default view is what you'll see when looking at the list online in whichever view
you've set as Default in your settings. If you edit this view to add or remove either user
created or system defined columns, or if you create a new view and set it as default,
these changes will propagate through the connector.
7 Note
If you set the default view in your SharePoint site to Calendar view or Board view,
SharePoint only returns the columns shown in the selected view. In this scenario,
Power BI will not retrieve all the columns in the list, even if you choose the All
option. This is by design.
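The implementation and view choices described above correspond to options on the
connection function. The following is a minimal sketch, assuming the Implementation and
ViewMode option names of SharePoint.Tables. The site URL is a placeholder.

```powerquery
let
    Source = SharePoint.Tables(
        "https://contoso.sharepoint.com/teams/ObjectModel/",
        [
            Implementation = "2.0",  // use the v2.0 connector
            ViewMode = "All"         // or "Default" to follow the list's default view
        ]
    )
in
    Source
```

Switching ViewMode to "Default" would return only the columns of the view set as default in
SharePoint.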
Troubleshooting
Timezone issues
When using the SharePoint Online list (v1.0) connector, you may notice that timezone
data doesn't match what you would expect from your browser. The SharePoint web-
based client does a local timezone conversion based on the browser's knowledge of the
user's timezone.
The backend API for SharePoint uses UTC time and sends this UTC time directly to Power
BI. Power BI doesn't convert this UTC time, but reports it to the user.
To convert the time to local time, you must do the same conversion the SharePoint client
does. An example of the column operations that would do this is:
The first operation changes the type to datetimezone , and the second operation
converts it to the computer's local time.
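The two operations described above can be sketched as follows. The sample table stands in
for a SharePoint list, and the "Modified" column name and its value are hypothetical.

```powerquery
let
    // Stand-in for list data; "Modified" holds a UTC timestamp as text.
    Source = #table(
        {"Title", "Modified"},
        {{"Item 1", "2023-07-14T12:00:00Z"}}
    ),
    // First operation: change the column type to datetimezone.
    #"Changed Type" = Table.TransformColumnTypes(
        Source, {{"Modified", type datetimezone}}
    ),
    // Second operation: convert the value to the computer's local time.
    #"Timezone Shifted" = Table.TransformColumns(
        #"Changed Type", {{"Modified", DateTimeZone.ToLocal, type datetimezone}}
    )
in
    #"Timezone Shifted"
```

The result depends on the time zone of the machine that evaluates the query, so a refresh in
a cloud service may differ from a refresh on your desktop.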
The SharePoint Online list v2.0 connector uses a different API than the v1.0 connector
and, as such, is subject to a maximum of 12 join operations per query, as documented in
the SharePoint Online documentation under List view lookup threshold. This issue will
manifest as SharePoint queries failing when more than 12 columns are accessed
simultaneously from a SharePoint list. However, you can work around this situation by
creating a default view with fewer than 12 lookup columns.
Summary
Item Description
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Connect to SingleStore
To connect Microsoft Power BI Desktop to SingleStore DB or Managed Service:
1. In the Home ribbon, from the Get Data list, select More.
2. In the Get Data dialog, select SingleStore Direct Query Connector 1.0.
3. In the SingleStore database dialog box, enter the IP address or hostname of the
SingleStore cluster in Server. In Database, enter the database name.
Under Data Connectivity mode, select the Import or DirectQuery mode, and then
select OK.
4. In the SingleStore Direct Query Connector 1.0 dialog box, in the left pane, select
the authentication type (either Windows or Basic).
7 Note
You need to run Power BI with the user account that maps to the
SingleStore DB user.
For Basic authentication, enter the username and password used to connect
to SingleStore, and then select the Connect button.
5. Once authenticated—for Import or DirectQuery mode—in the Navigator dialog
box, choose the desired tables and select the Load button.
7 Note
Any user that creates a custom SQL report must have only read-only access to the
SingleStore databases.
To create a new custom SQL report:
2. In the Home ribbon, from the Get Data list, select Blank query.
3. In the Power Query Editor dialog, specify the query in the following format:
7 Note
If you're using the server for the first time, select Edit Credentials and enter
the credentials. Go to Step 4 in Connect to SingleStore for more information.
5. If you've worked with the dataset before and it's cached in memory, refresh the
report to reset the local cache. On the Home ribbon, select Refresh.
To update the existing custom SQL reports, select the Refresh button on the Home
ribbon.
Modify Credentials
To modify the credentials used to connect to SingleStore:
1. In the File ribbon, select Options and settings > Data source settings.
2. In the Data source settings dialog, select SingleStore DirectQuery Connector 1.0,
and then select Edit Permissions.
SIS-CC SDMX (Beta)
Article • 07/18/2023
Summary
Item Description
Prerequisites
Before you get started, make sure you've properly configured the URL from the service
provider’s API. The exact process here will depend on the service provider.
Capabilities supported
Import of SDMX-CSV 2.1 format. Other formats aren't supported.
Connection instructions
To connect to SDMX Web Service data:
1. Select Get Data from the Home ribbon in Power BI Desktop. Select All from the
categories on the left, and then select SIS-CC SDMX. Then select Connect.
4. Select Load to import the data into Power BI, or Transform Data to edit the query
in Power Query Editor where you can refine the query before loading into Power
BI.
Next steps
If you want to submit a feature request or contribute to the open-source project, go to
the GitLab project site.
Smartsheet
Article • 07/18/2023
Summary
Item Description
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
Summary
Item Description
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Specify a text value to use as Role name
Relationship columns
Connection timeout in seconds
Command timeout in seconds
Database
Native SQL statement
2. In the Snowflake window that appears, enter the name of your Snowflake server in
Server and the name of your Snowflake computing warehouse in Warehouse.
3. Optionally, enter values in any advanced options that you want to use to modify
the connection query, such as a text value to use as a Role name or a command
timeout. More information: Connect using advanced options
4. Select OK.
7 Note
Once you enter your username and password for a particular Snowflake
server, Power BI Desktop uses those same credentials in subsequent
connection attempts. You can modify those credentials by going to File >
Options and settings > Data source settings. More information: Change the
authentication method
If you want to use the Microsoft account option, the Snowflake Azure Active
Directory (Azure AD) integration must be configured on the Snowflake side. More
information: Power BI SSO to Snowflake - Getting Started
7 Note
Azure Active Directory (Azure AD) Single Sign-On (SSO) only supports
DirectQuery.
2. In the Snowflake dialog that appears, enter the name of the server and warehouse.
3. Enter any values in the advanced options you want to use. If there are any
advanced options not represented in the UI, you can edit them in the Advanced
Editor in Power Query later.
6. In Navigator, select the data you require, then select Transform data to transform
the data in Power Query Editor.
Connect using advanced options
Power Query provides a set of advanced options that you can add to your query if
needed.
The following table lists all of the advanced options you can set in Power Query.
Role name: Specifies the role that the report uses via the driver. This role must be
available to the user, otherwise no role will be set.
Include relationship columns: If checked, includes columns that might have relationships
to other tables. If this box is cleared, you won’t see those columns.
Connection timeout in seconds: Specifies how long to wait for a response when interacting
with the Snowflake service before returning an error. Default is 0 (no timeout).
Command timeout in seconds: Specifies how long to wait for a query to complete before
returning an error. Default is 0 (no timeout).
SQL Statement: For information, go to Import data from a database using native database
query. This option is only available in Power Query Desktop.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Snowflake database.
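In M, these advanced options are passed as an options record. The following is a sketch
only: the server, warehouse, and role names are placeholders, and the option field names
reflect my understanding of Snowflake.Databases rather than anything stated in this article.

```powerquery
let
    Source = Snowflake.Databases(
        "contoso.snowflakecomputing.com",  // placeholder server name
        "COMPUTE_WH",                      // placeholder warehouse name
        [
            Role = "ANALYST",                   // Role name (placeholder)
            CreateNavigationProperties = true,  // Include relationship columns
            ConnectionTimeout = 60,             // Connection timeout in seconds
            CommandTimeout = 600                // Command timeout in seconds
        ]
    )
in
    Source
```

Any option you leave out of the record keeps its default value.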
fix is being investigated and the documentation here will be updated when the fix is
ready.
Additional information
Connect to Snowflake in Power BI Service
Socialbakers (Beta)
Article • 07/14/2023
7 Note
The following connector article is provided by Socialbakers (now Emplifi), the owner
of this connector and a member of the Microsoft Power Query Connector
Certification Program. If you have questions regarding the content of this article or
have changes you would like to see made to this article, visit the Emplifi website
and use the support channels there.
Summary
Item Description
7 Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
To use the Socialbakers (Emplifi) Connector, you must have Socialbakers (Emplifi)
credentials (Token and Secret). Contact the Emplifi Support team to get yours, if you
don't have them. The credentials allow the user to get the data and metrics from the
profiles the user has added to the Emplifi platform.
Capabilities Supported
Import
Authentication
When the connector is started for the first time, authentication is needed. Enter your
Token and Secret in the new modal window. Credentials can be provided to you by the
Emplifi Support team.
Once you sign in, the authentication information is automatically stored by Power BI for
future use. It can be found under File > Options and settings > Data source settings >
Global permissions.
At any time, the permissions can be cleared (or edited) and new credentials can be
entered.
Navigator
Once authenticated, a Navigator window pops up. All possible data sources can be
selected here.
Not all data sources are the same. The differences are described later.
Example usage
1. Choose the Data Source you would like to work with by selecting the checkbox.
7 Note
Not all parameters are explicitly mandatory, but they could be needed for
specific selections. For example, the profile selection is optional, but you still
need to select some profiles to get any data.
4. Once all data and metrics are selected, use the Load button to load the data to the
report. It's also possible to Transform Data before loading it into the report.
Note
It's possible to select more than one data source by checking more of the
boxes, setting their parameters, and then selecting Load.
When selected, the Facebook Ads data source first displays a list of the last 12 months.
By selecting the specific month(s), you're narrowing down all your Facebook Ad
Accounts to the ones that were active in the selected time period.
You can select the specific Ad Accounts in the Parameters section under the "Accounts"
parameter, along with the Campaigns selection.
Troubleshooting
If any error occurs, check the documentation and make sure you're following the
guidelines of the API.
Additional instructions
It's possible to clear the parameter selection by choosing Clear.
If you choose Transform Data, you can see all of the function documentation,
which helps you understand what's going on behind the scenes.
SoftOne BI (Beta)
Article • 07/14/2023
Note
The following connector article is provided by SoftOne, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the SoftOne website and use the support
channels there.
Summary
Item Description
Prerequisites
You'll need to have the Soft1 ERP/CRM or Atlantis ERP product installed with a licensed
SoftOne BI connector module. A web account must be configured in the application
with access to the SoftOne BI Connector service. This account information and your
installation serial number will be validated during authentication by the SoftOne BI
connector.
Capabilities supported
Import
Connection instructions
SoftOne provides many Power BI template files (.pbit), such as Sales & Collections and
Finance, that you can use as is or customize to give your BI project a starting point.
To connect in Power BI Desktop using a new report, follow the steps below. If you're
connecting from a report created using one of the SoftOne BI templates, see Using a
provided template later in this article.
1. Select Get Data > More... > Online Services in Power BI Desktop and search for
SoftOne BI. Select Connect.
2. Select Sign in. An authentication form will display.
3. After signing in with SoftOne Web Services, you can connect to your data store.
Selecting Connect will take you to the navigation table and display the available
tables from the data store from which you may select the data required.
4. In the navigator, you should now see the tables in your data store. Fetching the
tables can take some time.
You must have uploaded the data from your Soft1 or Atlantis installation (per the
product documentation) to see any tables. If you haven't uploaded your data, you
won't see any tables displayed in the Navigation Table.
In this case, you'll need to go back to your application and upload your data.
2. Select Sign in and enter your credentials (Serial number, username, and password).
3. Once you're authenticated, select Connect.
Power BI Desktop will fetch the data from the data store.
4. After the refresh has completed, you're ready to start customizing the report or to
publish it as is to the Power BI Service.
Important
If you're working with more than one Soft1/Atlantis installation, then when
switching between data stores, you must clear the SoftOne BI credentials saved by
Power BI Desktop.
SolarWinds Service Desk (Beta)
Article • 07/14/2023
This connector lets you import incident records from SolarWinds Service Desk to
Microsoft Power BI. You can import records from the past two years. You'll have raw data
on the topics most relevant to your organization, which you can then analyze and review
by viewing it in a variety of formats, such as tables, graphs, and charts.
Summary
Item Description
Authentication Types Supported SolarWinds Service Desk JSON Web Token (JWT)
Prerequisites
Before you can use this connector to get SolarWinds Service Desk data, you must have a
SolarWinds Service Desk user who has set up Token authentication for API integration.
Capabilities Supported
Import
1. In Power BI Desktop, select Get Data from Home. Select Other from the categories
on the left, select SolarWinds Service Desk, and then select Connect.
2. Sign in with the JSON web token you generated as described in prerequisites, and
then select Connect to verify your access to SolarWinds Service Desk.
3. In the Navigator dialog box, select the table you want to import. You can then
either load or transform the data.
You can import only once per day; that is, once every 24 hours. If you attempt to
refresh prior to the allowed 24-hour cycle, you'll receive an error message.
There's no limit on the number of users who can pull data, but each user can
refresh only once every 24 hours.
Only incident data is imported, providing historical details from January 1, 2020 to
date. The incident fields that are imported are limited. If you need to import a field
that's not available, you can request that it be added. If you have any questions
about what can be imported or issues with the Power BI integration, email
Rinat.Gil@solarwinds.com.
Summary
Item Description
Products Excel
Power BI (Datasets)
Power Apps (Dataflows)
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Analysis Services must be installed along with your SQL Server. For information about
installing Analysis Services on your SQL Server, go to Install SQL Server Analysis Services.
This connector article assumes that you've already installed Analysis Services on your
SQL server and have an existing database on the server instance.
Capabilities Supported
Import
Connect live (Power BI Desktop)
Advanced options
MDX or DAX query
Connect to SQL Server Analysis Services database from Power Query Desktop
To make the connection, take the following steps:
1. Select the SQL Server Analysis Services database option in the connector
selection. More information: Where to get data
2. In the SQL Server Analysis Services database dialog that appears, provide the
name of the server and database (optional).
Note
Only Power BI Desktop will display the Import and Connect live options. If
you're connecting using Power BI Desktop, selecting Connect live uses a live
connection to load the connected data directly to Power BI Desktop. In this
case, you can't use Power Query to transform your data before loading the
data to Power BI Desktop. For the purposes of this article, the Import option
is selected. For more information about using a live connection in Power BI
Desktop, go to Connect to Analysis Services tabular data in Power BI
Desktop.
3. Select OK.
4. If you're connecting to this database for the first time, select the authentication
type and input your credentials. Then select Connect.
5. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in the Power
Query editor.
1. Select the SQL Server Analysis Services database option in the connector
selection. More information: Where to get data
2. In the Connect to data source page, provide the name of the server and database
(optional).
4. If you're connecting to this database for the first time, select the authentication
kind and input your credentials.
6. In Navigator, select the data you require, and then select Transform data.
Connect using advanced options
Power Query provides an advanced option that you can add to your query if needed.
Advanced option Description
MDX or DAX statement: Optionally provides a specific MDX or DAX statement to the SQL Server Analysis Services database server to execute.
Once you've entered a value in the advanced option, select OK in Power Query Desktop
or Next in Power Query Online to connect to your SQL Server Analysis Services
database.
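In M, this advanced option corresponds to an options record passed to the connector function. The following is a minimal sketch rather than a definitive query: the server name, database name, and the DAX statement over a table called 'Product' are all hypothetical placeholders. It relies on the Query option of AnalysisServices.Database.

```powerquery
let
    // Hypothetical server and database names; replace them with your own.
    Server = "myserver",
    Database = "AdventureWorksDW",
    // Placeholder DAX statement; an MDX statement can be supplied the same way.
    Statement = "EVALUATE 'Product'",
    // The Query option sends the statement to the Analysis Services server.
    Source = AnalysisServices.Database(Server, Database, [Query = Statement])
in
    Source
```

When a statement is supplied this way, the result of the statement is returned instead of the navigation table.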
See also
Connect to Analysis Services tabular data in Power BI Desktop
Connect to SSAS multidimensional models in Power BI Desktop
Connect to datasets in the Power BI service from Power BI Desktop
SQL Server
Article • 08/03/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
By default, Power BI installs an OLE DB driver for SQL Server. However, for optimal
performance, we recommend that the customer installs the SQL Server Native Client
before using the SQL Server connector. SQL Server Native Client 11.0 and SQL Server
Native Client 10.0 are both supported in the latest version.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
Native SQL statement
Relationship columns
Navigate using full hierarchy
SQL Server failover support
2. In the SQL Server database dialog that appears, provide the name of the server
and database (optional).
3. Select either the Import or DirectQuery data connectivity mode (Power BI Desktop
only).
4. Select OK.
5. If this is the first time you're connecting to this database, select the authentication
type, input your credentials, and select the level to apply the authentication
settings to. Then select Connect.
6. In Navigator, select the database information you want, then either select Load to
load the data or Transform Data to continue transforming the data in Power Query
Editor.
Connect to SQL Server database from Power Query Online
To make the connection, take the following steps:
2. In the SQL Server database dialog that appears, provide the name of the server
and database (optional).
5. If the connection is not encrypted, and the connection dialog contains a Use
Encrypted Connection check box, clear the check box.
7. In Navigator, select the data you require, and then select Transform data.
Advanced option Description
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer. This option is only available in Power Query Desktop.
SQL statement: For information, go to Import data from a database using native database query.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy: If checked, the Navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, Navigator displays only the tables whose columns and rows contain data.
Enable SQL Server Failover support: If checked, when a node in the SQL Server failover group isn't available, Power Query moves from that node to another when failover occurs. If cleared, no failover will occur.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your SQL Server database.
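The advanced options can also be expressed in M as an options record on Sql.Database. The sketch below is illustrative only: the server and database names are hypothetical, and the mapping of each dialog option to a record field (noted in the comments) is an assumption to verify against the query that the dialog generates for you.

```powerquery
let
    Source = Sql.Database(
        "myserver.example.com",  // hypothetical server name
        "SalesDb",               // hypothetical database name
        [
            // Command timeout: 30 minutes instead of the 10-minute default.
            CommandTimeout = #duration(0, 0, 30, 0),
            // Include relationship columns: false leaves those columns out.
            CreateNavigationProperties = false,
            // Navigate using full hierarchy.
            HierarchicalNavigation = true,
            // Enable SQL Server Failover support.
            MultiSubnetFailover = true
        ]
    )
in
    Source
```

The timeout is given as a duration value, so #duration(0, 0, 30, 0) reads as 0 days, 0 hours, 30 minutes, 0 seconds.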
Certificate errors
When using the SQL Server database connector, if encryption is disabled and the SQL
Server certificate isn't trusted on the client (Power BI Desktop or on-premises data
gateway), you'll experience the following error.
A connection was successfully established with the server, but then an error
occurred during the login process. (provider: SSL Provider, error: 0 - The
certificate chain was issued by an authority that is not trusted.
Note that in Power BI service, the Azure Active Directory authentication method shows
up as "OAuth2".
Next steps
Optimize Power Query when expanding table columns
Stripe (Deprecated)
Article • 01/24/2023
Summary
Item Description
Products -
Deprecation
This connector is deprecated and will soon be unsupported. We recommend that you
transition existing connections away from this connector, and that you don't use it for
new connections.
SumTotal
Article • 07/14/2023
Summary
Item Description
Prerequisites
You must have a SumTotal hosted environment with standard permissions to access the
portal, and read permissions to access data in tables.
Capabilities supported
Import
Query Multiple OData endpoints
Advanced
Optionally filter records by RowVersionId parameter to get incremental data
URL. Keep this URL somewhere handy so you can use it later.
Note
The Power Query SumTotal connector currently supports only OData API
endpoints. For more information, go to SumTotal's OData API functionality.
2. In the Get Data dialog box, select Other > SumTotal, and then select Connect.
3. Enter the server URL address of the data you want to load.
Note
You'll be prompted with a script error. Select Yes to load the JS/CSS scripts
that the login form uses.
4. When the table is loaded in Navigator, you'll be presented with the list of OData
API entities that are currently supported by the connector. You can select to load
one or multiple entities.
Note
If this is the first time you're connecting to this site, select Sign in and input your
credentials. Then select Connect.
Summary
Item Description
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Before you can connect to a Sybase database, you need the SAP SQL Anywhere driver
installed on your computer. Select the driver that matches your Excel installation (32-bit
or 64-bit).
Capabilities Supported
Import
Advanced options
Command timeout in minutes
SQL statement
Include relationship columns
Navigate using full hierarchy
Connect to a Sybase database from Power Query Desktop
To make the connection, take the following steps:
1. Select the Sybase database option from Get Data. More information: Where to get
data
2. Specify the Sybase server to connect to in Server and the database where your
data is stored in Database.
3. Select OK.
4. If this is the first time you're connecting to this Sybase server and database, select
the authentication type you want to use, enter your credentials, and then select
Connect. For more information about using and managing authentication, go to
Authentication with a data source.
5. In Navigator, select the data you require, then either select Load to load the data
or Transform Data to transform the data.
1. Select the Sybase database option in the Choose data source page. More
information: Where to get data
2. Specify the Sybase server to connect to in Server and the database where your
data is stored in Database.
Note
You must select an on-premises data gateway for this connector, whether the
Sybase database is on your local network or online.
4. If this is the first time you're connecting to this Sybase server and database, select
the type of credentials for the connection in Authentication kind. Choose Basic if
you plan to use an account that's created in the Sybase database instead of
Windows authentication. For more information about using and managing
authentication, go to Authentication with a data source.
8. In Navigator, select the data you require, then select Transform data to transform
the data in the Power Query editor.
The following table lists all of the advanced options you can set in Power Query.
Advanced option Description
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer.
SQL statement: For information, go to Import data from a database using native database query.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Sybase database.
Teradata database
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Before you can connect to a Teradata database, you need the .NET Data Provider for
Teradata installed on your computer.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced options
Command timeout in minutes
SQL statement
Include relationship columns
Navigate using full hierarchy
1. Select the Teradata database option from Get Data. More information: Where to
get data
4. Select OK.
5. If this is the first time you're connecting to this Teradata database, select the
authentication type you want to use, enter your credentials, and then select
Connect. For more information about using and managing authentication, go to
Authentication with a data source.
6. In Navigator, select the data you require, then either select Load to load the data
or Transform Data to transform the data.
1. Select the Teradata database option in the Choose data source page. More
information: Where to get data
Note
You must select an on-premises data gateway for this connector, whether the
Teradata database is on your local network or online.
4. If this is the first time you're connecting to this Teradata database, select the type
of credentials for the connection in Authentication kind. Choose Basic if you plan
to use an account that's created in the Teradata database instead of Windows
authentication. For more information about using and managing authentication,
go to Authentication with a data source.
The following table lists all of the advanced options you can set in Power Query.
Advanced option Description
Command timeout in minutes: If your connection lasts longer than 10 minutes (the default timeout), you can enter another value in minutes to keep the connection open longer.
SQL statement: For information, go to Import data from a database using native database query.
Include relationship columns: If checked, includes columns that might have relationships to other tables. If this box is cleared, you won’t see those columns.
Navigate using full hierarchy: If checked, the navigator displays the complete hierarchy of tables in the database you're connecting to. If cleared, the navigator displays only the tables whose columns and rows contain data.
Once you've selected the advanced options you require, select OK in Power Query
Desktop or Next in Power Query Online to connect to your Teradata database.
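As a rough illustration of how the command timeout and SQL statement options look in M, the sketch below passes an options record to Teradata.Database. The server name and the native query (including the Sales.Orders table) are hypothetical, and the field names are an assumption to check against the query that the dialog generates.

```powerquery
let
    Source = Teradata.Database(
        "tdserver.example.com",  // hypothetical server name
        [
            // Command timeout: 20 minutes instead of the 10-minute default.
            CommandTimeout = #duration(0, 0, 20, 0),
            // SQL statement: a placeholder native query against a hypothetical table.
            Query = "SELECT TOP 100 * FROM Sales.Orders"
        ]
    )
in
    Source
```

Unlike the SQL Server function, this function takes no separate database argument; the options record follows the server name directly.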
Text/CSV
Article • 07/14/2023
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Import
1. Select the Text/CSV option in Get Data. This action launches a local file browser
where you can select your text file.
Select Open to open the file.
2. From the Navigator, you can either transform the data in the Power Query Editor
by selecting Transform Data, or load the data by selecting Load.
2. In Connection settings, enter a file path to the local text or CSV file you want.
3. Select an on-premises data gateway from Data gateway.
5. Select Next.
6. From the Navigator, select Transform Data to begin transforming the data in the
Power Query Editor.
Text/CSV delimiters
Power Query will treat CSVs as structured files with a comma as a delimiter—a special
case of a text file. If you choose a text file, Power Query will automatically attempt to
determine if it has delimiter separated values, and what that delimiter is. If it can infer a
delimiter, it will automatically treat it as a structured data source.
Unstructured Text
If your text file doesn't have structure, you'll get a single column with a new row per line
encoded in the source text. As a sample for unstructured text, you can consider a
notepad file with the following contents:
Hello world.
This is sample data.
When you load it, you're presented with a navigation screen that loads each of these
lines into their own row.
The only setting you can configure in this dialog is the File Origin dropdown, which lets
you select the character set that was used to generate the file. Currently, the character
set isn't inferred; UTF-8 is inferred only if the file starts with a UTF-8 BOM.
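The unstructured case can be approximated in M as shown below. This is a sketch under stated assumptions: the sample text is created in memory with Text.ToBinary instead of being read from a file, the column name Column1 is illustrative, and code page 65001 (UTF-8) stands in for whatever you would pick in File Origin.

```powerquery
let
    // In-memory stand-in for the sample file; #(lf) is a line feed.
    Raw = Text.ToBinary("Hello world.#(lf)This is sample data.", TextEncoding.Utf8),
    // Split the binary into lines, decoding it as UTF-8 (code page 65001).
    TextLines = Lines.FromBinary(Raw, null, null, 65001),
    // One column, one row per line of text.
    AsTable = Table.FromColumns({TextLines}, {"Column1"})
in
    AsTable
```

To read a real file, Raw would typically come from File.Contents with your own path.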
CSV
You can find a sample CSV file here.
In addition to file origin, CSV also supports specifying the delimiter and how data type
detection will be handled.
Delimiters available include colon, comma, equals sign, semicolon, space, tab, a custom
delimiter (which can be any string), and a fixed width (splitting up text by some standard
number of characters).
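In M, the delimiter and file origin choices appear as fields of the options record passed to Csv.Document. The following minimal sketch uses made-up, semicolon-delimited sample data built in memory; the values and column names are illustrative only.

```powerquery
let
    // Hypothetical semicolon-delimited content, built in memory for the example.
    Raw = Text.ToBinary("Name;Score#(lf)Ada;90#(lf)Grace;95", TextEncoding.Utf8),
    // Delimiter selects the separator; Encoding 65001 is the UTF-8 code page.
    Source = Csv.Document(Raw, [Delimiter = ";", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
    // Use the first row as column names.
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
in
    Promoted
```

Changing the Delimiter value (for example, to "#(tab)" for a tab) is the M counterpart of picking a different entry in the delimiter dropdown.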
The final dropdown lets you select how to handle data type detection. Detection can be
based on the first 200 rows or on the entire data set, or you can turn off automatic data
type detection and let all columns default to 'Text'. Warning: detection based on the
entire data set might make the initial load of the data in the editor slower.
Because inference can be incorrect, it's worth double-checking the settings before loading.
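If you turn off automatic detection, or want to override an incorrect inference, you can set column types explicitly in a later step. The sketch below assumes hypothetical sample data with columns Id and Amount and applies types with Table.TransformColumnTypes; the "en-US" culture is an assumption chosen so that the decimal point is parsed predictably.

```powerquery
let
    // Hypothetical sample content; all values arrive as text.
    Raw = Text.ToBinary("Id,Amount#(lf)1,2.5#(lf)2,4.0"),
    Source = Csv.Document(Raw, [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source),
    // Assign the intended type to each column instead of relying on detection.
    Typed = Table.TransformColumnTypes(
        Promoted,
        {{"Id", Int64.Type}, {"Amount", type number}},
        "en-US"
    )
in
    Typed
```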
Structured Text
When Power Query can detect structure to your text file, it will treat the text file as a
delimiter separated value file, and give you the same options available when opening a
CSV—which is essentially just a file with an extension indicating the delimiter type.
For example, if you save the following example as a text file, it will be read as having a
tab delimiter rather than unstructured text.
Editing Source
When editing the source step, you'll be presented with a slightly different dialog than
when initially loading. Depending on what you are currently treating the file as (that is,
text or csv) you'll be presented with a screen with a variety of dropdowns.
The Line breaks dropdown will allow you to select if you want to apply line breaks that
are inside quotes or not.
For example, if you edit the 'structured' sample provided above, you can add a line
break.
If Line breaks is set to Ignore quoted line breaks, it will load as if there was no line
break (with an extra space).
If Line breaks is set to Apply all line breaks, it will load an extra row, with the content
after the line breaks being the only content in that row (exact output may depend on
structure of the file contents).
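The two Line breaks settings can be compared in M through the QuoteStyle option of Csv.Document. This is a sketch, assuming that Ignore quoted line breaks corresponds to QuoteStyle.Csv and Apply all line breaks corresponds to QuoteStyle.None; the sample content is hypothetical and contains one line break inside a quoted value.

```powerquery
let
    // Sample content; "" inside an M text literal produces a single quote character.
    Raw = Text.ToBinary("Col1,Col2#(lf)1,""first#(lf)second""#(lf)2,plain"),
    // A line break inside quotes doesn't start a new row.
    IgnoreQuoted = Csv.Document(Raw, [Delimiter = ",", QuoteStyle = QuoteStyle.Csv]),
    // Quotes have no special meaning, so every line break starts a new row.
    ApplyAll = Csv.Document(Raw, [Delimiter = ",", QuoteStyle = QuoteStyle.None]),
    // Compare how many rows each setting produces.
    RowCounts = [
        IgnoreQuoted = Table.RowCount(IgnoreQuoted),
        ApplyAll = Table.RowCount(ApplyAll)
    ]
in
    RowCounts
```

Comparing the two row counts is a quick way to check whether quoted line breaks in your own file are being split into extra rows.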
The Open file as dropdown will let you edit what you want to load the file as—
important for troubleshooting. For structured files that aren't technically CSVs (such as a
tab separated value file saved as a text file), you should still have Open file as set to
CSV. This setting also determines which dropdowns are available in the rest of the
dialog.
Text/CSV by Example
Text/CSV By Example in Power Query is a generally available feature in Power BI Desktop
and Power Query Online. When you use the Text/CSV connector, you'll see an option to
Extract Table Using Examples on the bottom-left corner of the navigator.
When you select that button, you’ll be taken into the Extract Table Using Examples
page. On this page, you specify sample output values for the data you’d like to extract
from your Text/CSV file. After you enter the first cell of the column, other cells in the
column are filled out. For the data to be extracted correctly, you may need to enter
more than one cell in the column. If some cells in the column are incorrect, you can fix
the first incorrect cell and the data will be extracted again. Check the data in the first few
cells to ensure that the data has been extracted successfully.
Note
We recommend that you enter the examples in column order. Once the column has
successfully been filled out, create a new column and begin entering examples in
the new column.
Once you’re done constructing that table, you can either select to load or transform the
data. Notice how the resulting queries contain a detailed breakdown of all the steps that
were inferred for the data extraction. These steps are just regular query steps that you
can customize as needed.
Troubleshooting
Loading Files from the Web
If you're requesting text/csv files from the web and also promoting headers, and you’re
retrieving enough files that you need to be concerned with potential throttling, you
should consider wrapping your Web.Contents call with Binary.Buffer(). In this case,
buffering the file before promoting headers will cause the file to only be requested
once.
transport stream." These errors might be caused by the host employing protective
measures and closing a connection which might be temporarily paused, for example,
when waiting on another data source connection for a join or append operation. To
work around these errors, try adding a Binary.Buffer (recommended) or Table.Buffer call,
which will download the file, load it into memory, and immediately close the connection.
This should prevent any pause during download and keep the host from forcibly closing
the connection before the content is retrieved.
The following example illustrates this workaround. This buffering needs to be done
before the resulting table is passed to Table.PromoteHeaders.
Original:
Power Query M
Csv.Document(Web.Contents("https://.../MyFile.csv"))
With Binary.Buffer:
Power Query M
Csv.Document(Binary.Buffer(Web.Contents("https://.../MyFile.csv")))
With Table.Buffer:
Power Query M
Table.Buffer(Csv.Document(Web.Contents("https://.../MyFile.csv")))
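To show where the buffering sits relative to header promotion, here is a fuller sketch of the Binary.Buffer variant as a complete query. The URL is a hypothetical placeholder, and the delimiter and encoding are assumptions about the file.

```powerquery
let
    // Hypothetical URL; replace it with the file you're requesting.
    Url = "https://example.com/MyFile.csv",
    // Buffer the download first so that the file is requested only once.
    Buffered = Binary.Buffer(Web.Contents(Url)),
    // Parse the buffered bytes (comma-delimited, UTF-8 assumed).
    Parsed = Csv.Document(Buffered, [Delimiter = ",", Encoding = 65001]),
    // Promote headers only after the content has been buffered.
    Promoted = Table.PromoteHeaders(Parsed, [PromoteAllScalars = true])
in
    Promoted
```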
TIBCO(R) Data Virtualization
Article • 07/14/2023
Note
The following connector article is provided by TIBCO, the owner of this connector
and a member of the Microsoft Power Query Connector Certification Program. If
you have questions regarding the content of this article or have changes you would
like to see made to this article, visit the TIBCO website and use the support
channels there.
Summary
Item Description
Prerequisites
To access the TIBCO eDelivery site, you must have purchased TIBCO software. There's no
TIBCO license required for the TIBCO(R) Data Virtualization (TDV) software—a TIBCO
customer only needs to have a valid contract in place. If you don't have access, then
you'll need to contact the TIBCO admin in your organization.
The Power BI Connector for TIBCO(R) Data Virtualization must first be downloaded from
https://edelivery.tibco.com and installed on the machine running Power BI Desktop.
The eDelivery site downloads a ZIP file (for example,
TIB_tdv_drivers_<VERSION>_all.zip*.zip where <VERSION>=TDV Version) that contains
an installer program that installs all TDV client drivers, including the Power BI Connector.
Once the connector is installed, configure a data source name (DSN) to specify the
connection properties needed to connect to the TIBCO(R) Data Virtualization server.
Note
The DSN architecture (32-bit or 64-bit) needs to match the architecture of the
product where you intend to use the connector.
Note
Power BI Connector for TIBCO(R) Data Virtualization is the driver used by the
TIBCO(R) Data Virtualization connector to connect Power BI Desktop to TDV.
Capabilities Supported
Import
DirectQuery (Power BI Datasets)
Advanced Connection Properties
Advanced
Native SQL statement
2. In the Power BI Connector for TIBCO(R) Data Virtualization dialog that appears,
provide the Data Source Name.
4. If this is the first time you're connecting to this database, select the authentication
type. If applicable, enter the needed credentials. Then select Connect.
The following table lists all of the advanced options you can set in Power Query
Desktop.
Advanced option Description
SQL statement For information, go to Import data from a database using native database
query.
Once you've selected the advanced options you require, select OK in Power Query
Desktop to connect to your TIBCO(R) Data Virtualization Server.
1. Sign in to your Power BI account, and navigate to the Gateway management page.
2. Add a new data source under the gateway cluster you want to use.
5. Select the option to Use SSO via Kerberos for DirectQuery queries or Use SSO via
Kerberos for DirectQuery and Import queries.
More information: Configure Kerberos-based SSO from Power BI service to on-premises
data sources
Twilio (Deprecated) (Beta)
Article • 01/24/2023
Summary
Item Description
Products -
Deprecation
Summary
Item Description
Prerequisites
You must have a Usercube instance with the PowerBI option.
Capabilities supported
Import
4. Enter the client credentials. The Client Id must be built from the Identifier of an
OpenIdClient element. This element is defined in the configuration of your
Usercube instance. To this identifier, you must concatenate the @ character and the
domain name of the Usercube instance.
5. In Navigator, select the data you require. Then, either select Transform data to
transform the data in the Power Query Editor, or choose Load to load the data in
Power BI.
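The Client Id format described in step 4 can be sketched in M as a simple text concatenation. Both values below are hypothetical placeholders; use the Identifier of your own OpenIdClient element and the domain name of your own Usercube instance.

```powerquery
let
    // Hypothetical Identifier of an OpenIdClient element from your configuration.
    OpenIdClientIdentifier = "PowerBI",
    // Hypothetical domain name of the Usercube instance.
    InstanceDomain = "contoso.usercube.com",
    // Identifier, then the @ character, then the domain name.
    ClientId = OpenIdClientIdentifier & "@" & InstanceDomain
in
    ClientId
```

With these placeholder values, the expression evaluates to PowerBI@contoso.usercube.com.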
Vessel Insight
Article • 07/14/2023
Summary
Item Description
Prerequisites
Before you can sign in to Vessel Insight, you must have an organization account
(username/password) connected to a tenant.
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Other from the
categories on the left, select Vessel Insight, and then select Connect.
2. If this is the first time you're getting data through the Vessel Insight connector, a
third-party notice will be displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
5. In the window that appears, provide your credentials to sign in to your Vessel
Insight account.
If you entered an email address and password, select Continue.
Once the connection is established, you can preview and select data within the
Navigator dialog box to create a single tabular output.
You can select the following options in the navigator:
Advanced: Write custom Time series Query Language (TQL) queries (native). For
advanced Kongsberg users.
Vessel Insight Data (deprecated): Time series data for your fleets in the old asset
hierarchy.
Vessel Insight Data 2.0: Time series data for your fleets in the new asset hierarchy.
Only tags with data will be shown.
Voyage: Voyage history and location data from Automatic Identification System
(AIS).
You can provide any optional input parameters required for the selected items. For more
information about these parameters, go to Optional input parameters.
If you don't input parameters for Vessel Insight Data 2.0, you'll get the latest value by
default.
For Voyage, you need to input IMOs that you want to fetch data for.
You can Load the selected time series data, which brings one table for each selected
time series tag into Power BI Desktop, or you can select Transform Data to edit the
query, which opens the Power Query editor. You can then filter and refine the set of data
you want to use, and then load that refined set of data into Power BI Desktop.
Interval (optional): How you want the data to be aggregated when displayed (1s,
5s, >=30s, 1m, 1h, 1d).
Time (optional): Set the time filter type if you want to filter on time.
Latest: Get latest value only. Returns one value.
Period: Filter on the time range. Requires setting the Start and End date
described below.
Custom: Custom query to filter on the number of values to return.
Start (Time: Period), e.g. 2019-10-08T00:00:00Z (optional): Filter on range by
inserting the start date and time here. Possible to set yesterday and today.
Requires setting Time: Period.
End (Time: Period), e.g. 2019-10-08T01:00:00Z (optional): Filter on range by
inserting the end date and time here. Possible to set today and now. Requires
setting Time: Period.
Custom (Time: Custom), e.g. |> takebefore now 5 (optional): Add a custom query
to filter on the number of values. |> takebefore now 5 means take five values
before the time now. Requires Time: Custom.
When importing aggregated time series, the connector returns avg, min, max, and
count by default.
Voyage
When you import voyage data through the Voyage node, you can limit the amount of
data for the History and Location History tables by setting a set of optional input
parameters.
Comma Separated IMOs: Input one or multiple IMO numbers you want voyage
data for.
Start (Time: Period), e.g. 2019-10-08T00:00:00Z (optional): Filter on range by
inserting the start date and time here. Possible to set yesterday and today.
Requires setting Time: Period.
End (Time: Period), e.g. 2019-10-08T01:00:00Z (optional): Filter on range by
inserting the end date and time here. Possible to set today and now. Requires
setting Time: Period.
There's a general limit of 1 GB of data that can be imported into Power BI, unless the
workspace is in a Power BI Premium capacity. We recommend that you aggregate
and choose a short date range when importing time series data, because the data can
become heavy.
Each time series tag with associated values is outputted in a separate table in
Power BI. If it's necessary to combine tags and values into one table, the tags and
their values need to be merged in the Power Query editor or with TQL queries.
The time series data is currently stored in Couchbase, which might have
weaknesses that impact the Power BI connector.
For more guidelines on accessing Vessel Insight data, go to the Getting started guide.
Recommended content
You might also find the following Vessel Insight information useful:
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Web.BrowserContents
Anonymous
Windows (preview feature)
Basic (preview feature)
Web API (preview feature)
Web.Page
Anonymous
Windows (current user's credentials only)
Web API
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
Web.Page requires Internet Explorer 10.
Capabilities supported
Connecting to a URL
Advanced
Using a combination of text constants and parameters to construct the URL
Specifying a command timeout
Defining HTTP request header parameters (Web.Contents only)
1. Select Get Data > Web in Power BI or From Web in the Data ribbon in Excel.
2. Choose the Basic button and enter a URL address in the text box. For example,
enter
https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States.
If the URL address you enter is invalid, a warning icon will appear next to the
URL textbox.
If you need to construct a more advanced URL before you connect to the website,
go to Load Web data using an advanced URL.
3. Select the authentication method to use for this web site. In this example, select
Anonymous. Then select the level you want to apply these settings to (in this
case, https://en.wikipedia.org/). Then select Connect.
The available authentication methods for this connector are:
Windows: Select this authentication method if the web page requires your
Windows credentials.
Basic: Select this authentication method if the web page requires a basic user
name and password.
Web API: Select this method if the web resource that you’re connecting to
uses an API Key for authentication purposes.
Note
When uploading the report to the Power BI service, only the anonymous,
Windows and basic authentication methods are available.
The level you select for the authentication method determines what part of a URL
will have the authentication method applied to it. If you select the top-level web
address, the authentication method you select here will be used for that URL
address or any subaddress within that address. However, you might not want to
set the top URL address to a specific authentication method because different
subaddresses could require different authentication methods. For example, you
might be accessing two separate folders of a single SharePoint site and want to use
a different Microsoft account to access each one.
Once you've set the authentication method for a specific web site address, you
won't need to select the authentication method for that URL address or any
subaddress again. For example, if you select the https://en.wikipedia.org/
address in this dialog, any web page that begins with this address won't require
that you select the authentication method again.
4. From the Navigator dialog, you can select a table, then either transform the data
in the Power Query editor by selecting Transform Data, or load the data by
selecting Load.
The right side of the Navigator dialog displays the contents of the table you select
to transform or load. If you're uncertain which table contains the data you're
interested in, you can select the Web View tab. The web view lets you see the
entire contents of the web page, and highlights each of the tables that have been
detected on that site. You can select the check box above the highlighted table to
obtain the data from that table.
On the lower left side of the Navigator dialog, you can also select the Add table
using examples button. This selection presents an interactive window where you
can preview the content of the web page and enter sample values of the data you
want to extract. For more information on using this feature, go to Get webpage
data by providing examples.
1. From the Get Data dialog box, select either Web page or Web API.
In most cases, you'll want to select the Web page connector. For security reasons,
you'll need to use an on-premises data gateway with this connector. The Web Page
connector requires a gateway because HTML pages are retrieved using a browser
control, which involves potential security concerns. This concern isn't an issue with
Web API connector, as it doesn't use a browser control.
In some cases, you might want to use a URL that points at either an API or a file
stored on the web. In those scenarios, the Web API connector (or file-specific
connectors) would allow you to move forward without using an on-premises data
gateway.
Also note that if your URL points to a file, you should use the specific file connector
instead of the Web page connector.
2. Enter a URL address in the text box. For this example, enter
https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States.
3. Select the name of your on-premises data gateway.
4. Select the authentication method you'll use to connect to the web page.
Basic: Select this authentication method if the web page requires a basic user
name and password.
5. From the Navigator dialog, you can select a table, then transform the data in the
Power Query Editor by selecting Transform Data.
Depending on how long the POST request takes to process data, you may need to
prolong the time the request continues to stay connected to the web site. The default
timeout for both POST and GET is 100 seconds. If this timeout is too short, you can use
the optional Command timeout in minutes to extend the number of minutes you stay
connected.
You can also add specific request headers to the POST you send to the web site using
the optional HTTP request header parameters drop-down box. The following table
describes the request headers you can select.
Request Header       Description
Accept-Charset       Indicates which character sets are acceptable in the textual response content.
Accept-Encoding      Indicates what response content encodings are acceptable in the response.
Accept-Language      Indicates the set of natural languages that are preferred in the response.
Cache-Control        Indicates the caching policies, specified by directives, in client requests and server responses.
If-Modified-Since    Conditionally determines if the web content has been changed since the date specified in this field. If the content hasn't changed, the server responds with only the headers that have a 304 status code. If the content has changed, the server will return the requested resource along with a status code of 200.
Prefer               Indicates that particular server behaviors are preferred by the client, but aren't required for successful completion of the request.
Referer              Specifies a URI reference for the resource from which the target URI was obtained.
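The timeout and request header settings also have counterparts when a query is written by hand. The following is a minimal sketch, assuming the Timeout and Headers fields of the Web.Contents options record. The URL is a placeholder and the header values are illustrative only.

```powerquery-m
let
    // Placeholder address; replace it with the web resource you want to call
    Url = "https://contoso.com/products/Example_JSON.json",
    Source = Web.Contents(
        Url,
        [
            // Extend the default 100-second timeout to 5 minutes
            Timeout = #duration(0, 0, 5, 0),
            // Request headers; the names come from the table above, the values are examples
            Headers = [
                #"Accept-Language" = "en-US",
                #"Cache-Control" = "no-cache"
            ]
        ]
    )
in
    Source
```

Supplying a Content field in the same options record changes the request from a GET to a POST.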
Access database
CSV document
Excel workbook
JSON
Text file
HTML page
XML tables
PDF
For example, you could use the following steps to import a JSON file on the
https://contoso.com/products web site:
1. From the Get Data dialog box, select the Web connector.
2. Choose the Basic button and enter the address in the URL box, for example:
http://contoso.com/products/Example_JSON.json
3. Select OK.
4. If this is the first time you're visiting this URL, select Anonymous as the
authentication type, and then select Connect.
5. Power Query Editor will now open with the data imported from the JSON file.
Select the View tab in the Power Query Editor, then select Formula Bar to turn on
the formula bar in the editor.
As you can see, the Web connector returns the web contents from the URL you
supplied, and then automatically wraps the web contents in the appropriate
document type specified by the URL ( Json.Document in this example).
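With the formula bar turned on, the Source step for this example would look roughly like the following sketch. The exact text the editor generates can differ.

```powerquery-m
let
    // Web.Contents returns the raw binary; Json.Document parses that binary as JSON
    Source = Json.Document(Web.Contents("http://contoso.com/products/Example_JSON.json"))
in
    Source
```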
See also
Extract data from a Web page by example
Troubleshooting the Power Query Web connector
Get webpage data by providing examples
Article • 01/24/2023
Getting data from a web page lets users easily extract data from web pages. Often,
however, data on Web pages isn't in tidy tables that are easy to extract. Getting data
from such pages can be challenging, even if the data is structured and consistent.
There's a solution. With the Get Data from Web by example feature, you can essentially
show Power Query the data you want to extract by providing one or more examples within
the connector dialog. Power Query gathers other data on the page that matches your
examples. With this solution you can extract all sorts of data from Web pages, including
data found in tables and other non-table data.
Note
If you want to follow along, you can use the Microsoft Store URL that we use in this
article:
https://www.microsoft.com/store/top-paid/games/xbox?category=classics
When you select OK, you're taken to the Navigator dialog box where any autodetected
tables from the Web page are presented. In this example, no
tables were found. Select Add table using examples to provide examples.
Add table using examples presents an interactive window where you can preview the
content of the Web page. Enter sample values of the data you want to extract.
In this example, you'll extract the Name and Price for each of the games on the page.
You can do that by specifying a couple of examples from the page for each column. As
you enter examples, Power Query extracts data that fits the pattern of example entries
using smart data extraction algorithms.
Note
Value suggestions only include values less than or equal to 128 characters in length.
Once you're happy with the data extracted from the Web page, select OK to go to
Power Query Editor. You can then apply more transformations or shape the data, such as
combining this data with other data sources.
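A query built from examples is ordinary M that you can inspect and edit. The following is a rough sketch only, assuming the query combines Web.BrowserContents with Html.Table. The CSS selectors (.product, .name, .price) are hypothetical placeholders, not selectors taken from the Microsoft Store page.

```powerquery-m
let
    // Load the rendered HTML of the page
    Source = Web.BrowserContents("https://www.microsoft.com/store/top-paid/games/xbox?category=classics"),
    // Each pair maps a column name to a CSS selector (the selectors here are hypothetical)
    Extracted = Html.Table(
        Source,
        {{"Name", ".name"}, {"Price", ".price"}},
        [RowSelector = ".product"]
    )
in
    Extracted
```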
See also
Add a column from examples
Shape and combine data
Getting data
Troubleshooting the Power Query Web connector
Troubleshooting the Web connector
Article • 04/11/2023
Web.Contents is used for retrieving web content that doesn't need to be accessed
through a browser, such as CSV files, JSON API results, and so on.
It supports the widest variety of authentication options.
It can be used in cloud environments, such as Power Query Online, without a
gateway.
Web.Page
Web.Page is a legacy function for retrieving web content that needs to be accessed
through a browser, such as HTML pages.
It's built on Internet Explorer. Because of this requirement, it's being replaced in
the UI with Web.BrowserContents . However, Web.Page will continue to be available
at the engine level for backward compatibility.
A gateway is required to use it in cloud environments, such as Power Query Online.
Web.BrowserContents
                                                         Web.Contents   Web.Page   Web.BrowserContents
Non-browser content (.txt/.csv files, JSON, and so on)   x
Requires a gateway in cloud hosts                        N              Y          Y
These kinds of issues are usually due to timing. Pages that load their content
dynamically can sometimes be inconsistent since the content can change after the
browser considers loading complete. Sometimes the web connector downloads the
HTML after all the dynamic content has loaded. Other times the changes are still in
progress when it downloads the HTML, leading to sporadic errors.
How can you tell if a page is dynamic? Usually it's pretty simple. Open the page in a
browser and watch it load. If the content shows up right away, it's a regular HTML page.
If it appears dynamically or changes over time, it's a dynamic page.
If you're using Web.Page and receive a Please specify how to connect error, ensure
that you have Internet Explorer 10 or later installed on the machine that hosts your on-
premises data gateway.
set PQ_WebView2Connector=true
Using Web.Page instead of Web.BrowserContents
In cases where you need to use Web.Page instead of Web.BrowserContents , you can still
manually use Web.Page .
In Power BI Desktop, you can use the older Web.Page function by clearing the Enable
web table inference option:
1. Under the File tab, select Options and settings > Options.
3. Clear the Enable web table inference option, and then select OK.
Note
You can also get a copy of a Web.Page query from Excel. To copy the code from Excel:
You can also manually enter the following code into a blank query. Ensure that you enter
the address of the web page you want to load.
powerquery-m
let
    Source = Web.Page(Web.Contents("<your address here>")),
    Navigation = Source{0}[Data]
in
    Navigation
2. Under the File tab, select Options and settings > Options.
3. In Options, under Global > Security, uncheck Enable certificate revocation check.
4. Select OK.
Important
Be aware that unchecking Enable certificate revocation check will make web
connections less secure.
“We were unable to connect because this credential type isn’t supported for this
resource. Please choose another credential type.”
Contact the service owner. They'll either need to change the authentication
configuration or build a custom connector.
It's not possible to switch Power Query to use HTTP 1.0. Power Query always sends an
Expect:100-continue when there's a body to avoid passing a possibly large payload
when the initial call itself might fail (for example, due to a lack of permissions). Currently,
this behavior can't be changed.
See also
Power Query Web connector
Get webpage data by providing examples
Workforce Dimensions (Beta) (Deprecated)
Article • 01/24/2023
Summary
Item Description
Products -
Deprecation
Summary
Item Description
Products Excel
Power BI (Datasets)
Power BI (Dataflows)
Fabric (Dataflow Gen2)
Power Apps (Dataflows)
Dynamics 365 Customer Insights
Analysis Services
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Capabilities supported
Import
1. Select the XML option in the Get Data selection. This action will launch a local file
browser and allow you to select your XML file.
2. Browse to the directory containing the local XML file you want to load, and then
select Open.
3. In Navigator, select the data you want, and then either select Load to load the
data, or Transform Data to continue transforming the data in Power Query Editor.
5. Select Next.
Loading the XML file will automatically launch the Power Query Editor. From the editor,
you can then transform the data if you want, or you can just save and close to load the
data.
Troubleshooting
Data Structure
Because many XML documents have ragged or nested data, you may have to
do extra data shaping to get the data into a form that's convenient for
analytics. This holds true whether you use the UI accessible Xml.Tables function, or the
Xml.Document function. Depending on your needs, you may find you have to do more or
less data shaping.
XML
<abc>
Hello <i>world</i>
</abc>
Xml.Tables will return the "world" portion but ignore "Hello". Only the element(s) are
returned, not the text. However, Xml.Document will return "Hello <i>world</i>". The
entire inner node is turned to text, and structure isn't preserved.
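To compare the two functions yourself, you can paste a sketch like the following into a blank query. It feeds the same mixed-content sample (with whitespace removed) to both functions and returns a record so the two results can be inspected side by side. The step names are arbitrary.

```powerquery-m
let
    // The mixed-content sample from above, as a text value
    XmlText = "<abc>Hello <i>world</i></abc>",
    // Table-oriented view of the document
    AsTables = Xml.Tables(XmlText),
    // Node-oriented view of the document
    AsDocument = Xml.Document(XmlText)
in
    [Tables = AsTables, Document = AsDocument]
```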
Zendesk (Beta)
Article • 01/24/2023
Summary
Item Description
Prerequisites
Before you can sign in to Zendesk, you must have a Zendesk account
(username/password).
Capabilities Supported
Import
1. Select Get Data from the Home ribbon in Power BI Desktop. Select Online Services
from the categories on the left, select Zendesk (Beta), and then select Connect.
2. If this is the first time you're getting data through the Zendesk connector, a
preview connector notice will be displayed. Select Don't warn me again with this
connector if you don't want this message to be displayed again, and then select
Continue.
3. Enter the Zendesk URL location that you want to access, and then select OK.
4. To sign in to your Zendesk account, select Sign in.
5. In the Zendesk window that appears, provide your credentials to sign in to your
Zendesk account.
6. Select Sign in.
Zendesk returns a 422 error status if the instance returns more than 1000 rows.
Power Query Online limits
Article • 08/30/2023
Power Query Online is integrated into a variety of Microsoft products. Since these
products target different scenarios, they may set different limits for Power Query Online
usage.
Limit types
Hourly Evaluation Count: The maximum number of evaluation requests a user can issue
during any 60-minute period.
Daily Evaluation Time: The net time a user can spend evaluating queries during any
24-hour period.
Concurrent Evaluations: The maximum number of evaluations a user can have running at
any given time.
Authoring limits
Authoring limits are the same across all products. During authoring, query evaluations
return previews that may be subsets of the data. Data is not persisted.
Refresh limits
During refresh (either scheduled or on-demand), query evaluations return complete
results. Data is typically persisted in storage.
Product Integration   Hourly Evaluation Count (#)   Daily Evaluation Time (Hours)   Concurrent Query Evaluations (#)
Dataflow limits
Dataflow is a workload that leverages Power Query Online. Dataflow is integrated into
Power BI, Power Apps, Microsoft Fabric, and Dynamics 365 Customer Insights. A single
dataflow has a limit of 50 tables. If you need more than 50 tables, you can create
multiple dataflows. If you exceed the limit, an error message occurs during publishing
and refreshing.
Common Issues
Article • 12/21/2022
Power Query
Preserving sort
You might assume that if you sort your data, any downstream operations will preserve
the sort order.
For example, if you sort a sales table so that each store's largest sale is shown first, you
might expect that doing a "Remove duplicates" operation will return only the top sale
for each store. And this operation might, in fact, appear to work. However, this behavior
isn't guaranteed.
Because of the way Power Query optimizes certain operations, including skipping them
or offloading them to data sources (which can have their own unique ordering
behavior), sort order isn't guaranteed to be preserved through aggregations (such as
Table.Group ), merges (such as Table.NestedJoin ), or duplicate removal (such as
Table.Distinct ).
There are a number of ways to work around this. Here are a few suggestions:
Perform a sort after applying the downstream operation. For example, when
grouping rows, sort the nested table in each group before applying further steps.
Here's some sample M code that demonstrates this approach:
Table.Group(Sales_SalesPerson, {"TerritoryID"}, {{"SortedRows", each
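A self-contained sketch of the sort-inside-each-group approach follows. The Sales table, its column names, and its values are hypothetical and stand in for a real source.

```powerquery-m
let
    // Hypothetical sample data standing in for a real sales table
    Sales = #table(
        type table [Store = text, Amount = number],
        {{"A", 10}, {"A", 50}, {"B", 30}, {"B", 20}}
    ),
    // Sort inside each group, then take the first row, so the result
    // doesn't rely on any sort applied before the grouping
    Grouped = Table.Group(
        Sales,
        {"Store"},
        {{"TopSale", each Table.First(Table.Sort(_, {{"Amount", Order.Descending}})), type record}}
    ),
    // Pull the amount of each store's top sale out of the nested record
    Expanded = Table.ExpandRecordColumn(Grouped, "TopSale", {"Amount"})
in
    Expanded
```

Because the sort happens within each group, the row that's kept for a store doesn't depend on how earlier steps ordered the table.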
For example, imagine a column that contains integers in the first 200 rows (such as all
zeroes), but contains decimal numbers after row 200. In this case, Power Query will infer
the data type of the column to be Whole Number (Int64.Type). This inference will result
in the decimal portions of any non-integer numbers being truncated.
Or imagine a column that contains textual date values in the first 200 rows, and other
kinds of text values after row 200. In this case, Power Query will infer the data type of
the column to be Date. This inference will result in the non-date text values being
treated as type conversion errors.
Because type detection works on the first 200 rows, but Data Profiling can operate over
the entire dataset, you can consider using the Data Profiling functionality to get an early
indication in the Query Editor about Errors (from type detection or any number of other
reasons) beyond the top N rows.
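One way to avoid depending on the 200-row inference, offered here as a suggestion rather than something this article prescribes, is to set the column type explicitly. A minimal sketch with a hypothetical inline table:

```powerquery-m
let
    // Hypothetical data: the early rows look like integers, a later row doesn't
    Source = #table({"Reading"}, {{"0"}, {"0"}, {"1.5"}}),
    // Declare the type yourself instead of accepting an inferred Int64.Type
    Typed = Table.TransformColumnTypes(Source, {{"Reading", type number}}, "en-US")
in
    Typed
```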
Data source error: Unable to read data from the transport connection: An existing
If you run into this error, it's most likely a networking issue. Generally, the first people to
check with are the owners of the data source you're attempting to connect to. If they
don’t think they’re the ones closing the connection, then it’s possible something along
the way is (for example, a proxy server, intermediate routers/gateways, and so on).
Whether this reproduces with any data or only with larger data sizes, it's likely that
there's a network timeout somewhere on the route. If it's only with larger data,
customers should consult with the data source owner to see if their APIs support
paging, so that they can split their requests into smaller chunks. Failing that, alternative
ways to extract data from the API (following data source best practices) should be
followed.
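If the data source does support paging, the requests can be split up in M. The following sketch shows only the looping pattern. GetPage is a hypothetical stand-in that slices a local list, so a real query would replace its body with a call to the service's paging API.

```powerquery-m
let
    PageSize = 10,
    // Hypothetical stand-in for one paged API call; a real version would call the service
    GetPage = (pageIndex as number) as list =>
        List.FirstN(List.Skip({1..25}, pageIndex * PageSize), PageSize),
    // Request pages one at a time until an empty page comes back
    Pages = List.Generate(
        () => [Index = 0, Rows = GetPage(0)],
        each not List.IsEmpty([Rows]),
        each [Index = [Index] + 1, Rows = GetPage([Index] + 1)],
        each [Rows]
    ),
    // Flatten the list of pages into a single list of rows
    AllRows = List.Combine(Pages)
in
    AllRows
```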
"TLS_RSA_WITH_AES_256_GCM_SHA384”
"TLS_RSA_WITH_AES_128_GCM_SHA256”
"TLS_RSA_WITH_AES_256_CBC_SHA256”
"TLS_RSA_WITH_AES_128_CBC_SHA256”
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"
"TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256"
"TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384"
"TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256"
"TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384"
Cipher suites are used to encrypt messages to secure a network connection between
clients/servers and other servers. We're removing the four TLS_RSA cipher suites
listed above to comply with our current security protocols. Beginning March 1, 2021,
customers can only use our standard cipher suites.
The eight TLS_ECDHE cipher suites listed above are the ones the server you connect
to must support to connect from Power Query Online or Power BI.
In Power Query Desktop (Power BI, Excel), we don’t control your cipher suites. If you're
trying to connect to Power Platform (for example Power Platform Dataflows) or the
Power BI Service, you'll need one of those cipher suites enabled on your OS. You may
either upgrade the Windows version or update the Windows TLS registry to make sure
that your server endpoint supports one of these ciphers.
To verify that your server complies with the security protocol, you can perform a test
using a TLS cipher and scanner tool. One example might be SSLLABS .
Customers must upgrade their servers before March 1, 2021. For more information
about configuring TLS Cipher Suite order, see Manage Transport Layer Security (TLS).
Certificate revocation
An upcoming version of Power BI Desktop will cause SSL connections from
Desktop to fail when any certificates in the SSL chain are missing certificate revocation status.
This is a change from the current state, where revocation only caused connection failure
in the case where the certificate was explicitly revoked. Other certificate issues might
include invalid signatures and certificate expiration.
As there are configurations in which revocation status may be stripped, such as with
corporate proxy servers, we'll be providing another option to ignore certificates that
don't have revocation information. This option will allow situations where revocation
information is stripped in certain cases, but you don't want to lower security entirely, to
continue working.
It isn't recommended, but users will continue to be able to turn off revocation checks
entirely.
The table name has been changed, for example in the data source itself.
The account used to access the table doesn't have sufficient privileges to read the
table.
There may be multiple credentials for a single data source, which isn't supported in
Power BI Service. This error may happen, for example, when the data source is a
cloud data source and multiple accounts are being used to access the data source
at the same time with different credentials. If the data source is on-premises, you'll
need to use the on-premises data gateway.
Here are some common ways to resolve a stack overflow in your M code.
Ensure that your recursive functions actually terminate when the expected end
condition is reached.
Replace recursion with iteration (for example, by using functions such as
List.Transform, List.Generate, or List.Accumulate).
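As an illustration of the second suggestion, the following sketch defines the same calculation twice: once recursively and once with List.Accumulate. The function names are hypothetical, and only the iterative version is invoked.

```powerquery-m
let
    // Recursive version: every level of recursion adds to the call stack
    SumTo = (n as number) as number =>
        if n = 0 then 0 else n + @SumTo(n - 1),
    // Iterative version: List.Accumulate folds over the list without recursion
    SumToIterative = (n as number) as number =>
        List.Accumulate({1..n}, 0, (total, current) => total + current),
    Result = SumToIterative(100000)
in
    Result
```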
Expression.Error: Evaluation ran out of memory and can't continue
"Out of memory" errors (or OOMs) can be caused by doing too many memory intensive
operations against very large tables. For example, the following M code produces an
OOM because it attempts to load a billion rows into memory at once.
Table.Buffer(Table.FromList({1..1000000000}, Splitter.SplitByNothing()))
To resolve out of memory errors, optimize memory intensive operations like sorts, joins,
grouping, and distincts by ensuring they fold to the source, or by removing them
altogether where possible. Sorts, for example, are often unnecessary.
Dataflows
Background
Due to the way that queries are stored in Power Query Online, there are cases where
manually entered M script (generally comments) is lost. The Review Script Changes
pane provides a "diff" experience highlighting the changes, which allows users to
understand what changes are being made. Users can then accept the changes or
rearrange their script to fix it.
There are three notable cases that may cause this experience.
Comments
Comments always have to be inside the let .. in expression, and above a step. Such a
comment is shown in the user interface as a 'Step property'. All other comments are lost.
Comments that are written on the same line as one step, but above another step (for
example, after the comma that trails every step) will be moved down.
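The placement rules can be pictured with a short sketch. The query itself is arbitrary, and the comments describe what the rules above imply for each position.

```powerquery-m
// Outside the let .. in expression: this comment is lost
let
    // Inside the expression and above a step: kept as a property of the Source step
    Source = {1, 2, 3}, // Same line as Source but above Doubled: moved down
    Doubled = List.Transform(Source, each _ * 2)
in
    Doubled
```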
Experience
When you commit a query, Power Query Online will evaluate it to see if the 'stored'
version of the script differs at all from what you have submitted. If it does, it will present
you with a Review script changes dialog box that will allow you to accept or cancel.
If you accept, the changes will be made to your query.
If you cancel, you might rewrite your query to make sure that you move your
comments properly, or rearrange however else you want.
Power Query connector feedback
Article • 01/24/2023
This article describes how to submit feedback for Power Query connectors. It's important
to distinguish between Microsoft-owned connectors and non-Microsoft-owned
connectors, as the support and feedback channels are different.
Microsoft-owned connectors
This section outlines instructions to receive support or submit feedback on Microsoft-
owned connectors.
If you're seeking help with using Microsoft-owned Power Query connectors, visit one of
the following resources.
Community forums for the product you're using Power Query in. For example, for
Power BI, this forum would be the Power BI Community and for Power Platform
dataflows, the forum would be Power Apps Community.
Power Query website resources .
Submitting feedback
To submit feedback about a Microsoft-owned connector, provide the feedback to the
"ideas" forum for the product you're using Power Query connectors in. For example, for
Power BI, visit the Power BI ideas forum . If you have one, you can also provide
feedback directly to your Microsoft account contact.
Non-Microsoft-owned connectors
This section outlines instructions to receive support or submit feedback on non-
Microsoft-owned connectors.
You can also engage the Power Query community resources indicated above for
Microsoft-owned connectors, in case a member of the community can assist.
Submitting feedback
As non-Microsoft-owned connectors are managed and updated by the respective
connector owner, feedback should be sent directly to the connector owner. For example,
to submit feedback about a Contoso-owned connector, you should directly submit
feedback to Contoso.
Capture web requests with Fiddler
Article • 02/16/2022
When diagnosing issues that might occur when Power Query communicates with your
data, you might be asked to supply a Fiddler trace. The information provided by Fiddler
can be of significant use when troubleshooting connectivity issues.
Note
This article assumes that you are already familiar with how Fiddler works in general.
If you don't already have Fiddler installed, download and install Fiddler now. Be sure
to install Fiddler on the system where the issue is occurring.
1. Open Fiddler.
10. In the Fiddler traffic pane, select one of the current traces, and then press Ctrl + X.
This action clears all of the current traces from the traffic pane.
These actions minimize the number of messages we have to dig through, and also help
focus the investigation. They also avoid capturing other potentially sensitive information
that you don't want to share.
If you're only running Power Query and Fiddler, this minimum setup should yield a
sequence of HTTP requests and responses from whatever backend you're
communicating with, for example Power BI service, SharePoint, or Azure. The requests,
responses, headers, response codes, and sometimes the payload will all provide clues
we can use to troubleshoot your issue.
To save the capture session to a log file, select File > Save > All Sessions. You might also
be asked to compress the log file (.zip) before sending it.
4. Select Actions.
7. In Do you want to allow this app to make changes to your device?, select Yes.
10. If the root certificate dialog box appears, close the dialog box without selecting Yes
or No.
See also
Query diagnostics
Power Query feedback
Getting started with Fiddler Classic
Power Query SDK overview
Article • 02/17/2023
The Power Query SDK is a set of tools designed to help you create Power Query
connectors. These connectors are often referred to as custom connectors or Power
Query extensions.
Custom connectors let you create new data sources or customize and extend an existing
source. Common use cases include:
Visual Studio Power Query SDK: Released in 2017 as an extension for Visual
Studio 2017 and 2019.
Visual Studio Code Power Query SDK (Preview): Released in 2022 as the new and
recommended way to create Power Query connectors.
We encourage all developers to install and use the newly released Visual Studio Code
Power Query SDK (Preview) as this version will eventually be the default SDK going
forward.
1. Install the Power Query SDK from the Visual Studio Marketplace.
2. Create a new data connector project.
3. Define your connector logic.
4. Build the project to produce an extension file.
Visual Studio Code Power Query SDK (Preview)
Note
The new Visual Studio Code Power Query SDK is currently in public preview as of
September 2022.
Install the new Visual Studio Code Power Query SDK from the Visual Studio Code
section of the Visual Studio Marketplace . Select Install to install the SDK.
The following sections describe, at a high level, the most common process to create a
Power Query connector using the SDK.
Your connector definition file will start with an empty data source description. You can
learn more about a data source in the context of the Power Query SDK from the article
on handling data access.
Testing
The Power Query SDK provides basic query execution capabilities, allowing you to test
your extension without having to switch over to Power BI Desktop.
The query file can contain a single expression (for example, HelloWorld.Contents()), a
let expression (such as what Power Query would generate), or a section document.
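As an illustration of the first two forms, here is a minimal sketch of a test query file. It assumes the HelloWorld.Contents function from the Hello World sample, which returns the text it is given; the step names Source and Upper are arbitrary. Because a query file holds one expression, the single-expression form is shown as a comment.

```powerquery
// Form 1: a single expression
// HelloWorld.Contents()

// Form 2: a let expression, similar to what Power Query generates
let
    // Call the connector's exported function with an optional message
    Source = HelloWorld.Contents("Hello from a test query"),
    // Apply one ordinary M transformation to the returned text
    Upper = Text.Upper(Source)
in
    Upper
```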
Power Query extensions are bundled in a ZIP file and given a .mez file extension. At
runtime, Power BI Desktop loads extensions from [Documents]\Microsoft Power BI
Desktop\Custom Connectors.
Note
Data Connector projects don't support custom post build steps to copy the
extension file to your [Documents]\Microsoft Power BI Desktop\Custom
Connectors directory. If this is something you want to do, you might want to use a
third party extension.
Extensions are defined within an M section document. A section document has a slightly
different format from the query document(s) generated in Power Query. Code you
import from Power Query typically requires modification to fit into a section document,
but the changes are minor. Section document differences you should be aware of
include:
All functions and variables are local to the section document, unless they're
marked as shared. Shared functions become visible to other queries/functions, and
can be thought of as the exports for your extension (that is, they become callable
from Power Query).
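The difference between local and shared members can be sketched as follows. The section name MyExtension and all member names are hypothetical and exist only for illustration.

```powerquery
section MyExtension;

// Local to the section document: not callable from Power Query
DefaultGreeting = "Hello";

// Local helper function, also private to the section
BuildGreeting = (name as text) as text => DefaultGreeting & ", " & name;

// Marked as shared: exported, so it becomes callable from Power Query
shared MyExtension.Greet = (name as text) as text => BuildGreeting(name);
```

Only MyExtension.Greet would be visible to other queries; DefaultGreeting and BuildGreeting stay internal to the extension.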
Power BI Desktop
Power BI Desktop users can follow the steps below to consume a Power Query custom
connector:
Alternatively, as the owner of the data source and connector, you can submit your
connector to the Power Query Connector Certification program so it ships with Power BI
Desktop on every release.
Note
The Power Query team is working hard towards enabling all Power Query certified
connectors in the Power Query Online experience.
Only Power Query certified connectors are shown in the Power Query Online experience.
To learn more about the Power Query connector certification program, go to Power
Query Connector Certification.
This article focuses on the experience available for the Power Query SDK found in Visual
Studio Code. You can learn more on how to install the Power Query SDK for Visual
Studio from the article on Installing the SDK.
Tip
Before creating an extension project, we recommend that you create a new folder
where you'll store your extension project. During the creation of a new project, if no
folder is selected, the Power Query SDK will help you locate or create a new folder
before creating your extension project.
In Visual Studio Code, the main Explorer pane contains a section named Power Query
SDK. This section has a single button labeled Create an extension project. Select
this button.
This button opens an input field at the top of the Visual Studio Code interface. Enter the
name of your new extension project, and then select Enter.
After a few seconds, your Visual Studio Code window opens the main *.pq file for your
extension project that contains your connector logic. The Power Query SDK
automatically runs some necessary tasks to complete the setup of your workspace. You
can check these tasks in the output console in Visual Studio Code.
The Power Query SDK automatically creates the following set of files:
A settings.json file that dictates specific settings to work with at your workspace
level.
It builds the extension as a .mez file and stores it in a new bin\AnyCPU\Debug
folder.
A set of connector icons as .png files.
A resources.resx file that serves as the main storage for strings that are used in the
extension.
A .pq file that holds the main logic of your extension or connector.
A .query.pq file whose main purpose is to be used as a way to create test queries
that you can later evaluate.
A .proj file that holds information about the extension project.
Once an extension project is recognized by Visual Studio Code, the section for the
Power Query SDK changes its appearance, and now displays a list of tasks you can run
against your new extension project.
Credentials
Important
Before you can evaluate any of your data connector's queries, a set of credentials
must first be created for the extension project.
The Power Query SDK offers multiple tasks through its user interface to allow you to set,
list, and delete credentials from your extension project.
Set credential
The Power Query SDK is primarily driven by tasks that can be triggered through multiple
entry points. Setting a credential can be done in two ways (the other credential tasks can
be done in the same way).
Through the entry in the Power Query SDK section in the explorer pane.
Through the Terminal by selecting the Run Task option and then selecting the Set
credential task.
When you run this task, Visual Studio Code will guide you through a series of prompts
to allow you to set the credential. This series of prompts is predictable and always
consists of the same stages:
For the existing extension project, the authentication method available is anonymous.
Once the authentication is set, a message that confirms a credential has been generated
successfully is displayed at the bottom right corner of the window.
List credentials
Similar to setting a credential, the task to list credentials has two entry points in the
same places: the Power Query SDK section in the Explorer pane and inside the Terminal
menu.
When this task is executed, it showcases the available credentials inside the output
terminal.
Clear ALL credentials
Similar to the previous two tasks, the task to clear all credentials has two entry points in
the same places: the Power Query SDK section in the Explorer pane and inside the
Terminal menu.
This task serves as a way to clear all credentials from your current session when you
need to set a new credential to evaluate your queries.
The informational messages for this task are also shown in the output console.
For this specific connector where the project name was MyConnector, the code looks as
follows:
Power Query M
Power Query M
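The listings themselves aren't reproduced here. As a rough sketch only, assuming the generated project follows the same shape as the Hello World sample shown later in this document, the connector file for a project named MyConnector might resemble the following. The greeting text, Category, and ButtonText values are placeholders, not the template's actual contents.

```powerquery
// MyConnector.pq (sketch; the generated template may differ)
section MyConnector;

[DataSource.Kind="MyConnector", Publish="MyConnector.Publish"]
shared MyConnector.Contents = (optional message as text) =>
    let
        // Fall back to a default when no message is supplied
        _message = if message <> null then message else "(no message)"
    in
        "Hello from MyConnector: " & _message;

// Data source kind record: anonymous is the only authentication method
MyConnector = [
    Authentication = [
        Anonymous = []
    ]
];

// Publish record: placeholder values for the Get Data dialog
MyConnector.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = { "MyConnector", "MyConnector" }
];
```

Under the same assumptions, the accompanying test query file would contain a call such as MyConnector.Contents().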
Right select the file that's in use and select the Evaluate current power query file
option.
Go through the Terminal menu and select the Evaluate current file task.
Use the native Run and Debug option from Visual Studio Code, select the
hyperlink to create a launch.json file, and then evaluate the file.
After evaluating the query, the results are displayed in the console at the bottom of the
window and in a new panel called the result panel on the right.
Output tab: Displays a data preview of the query evaluated. If the data is a table,
it's displayed as grid.
Summary: Displays a summary of the activity that ran the evaluations, along with
the statistics associated with that activity.
DataSource: Displays general information about the data source used for the
evaluation.
To evaluate a different query, you just modify the *.query.pq file, save it, and then run
the evaluation task again with any of the three methods.
Note
The Power Query SDK doesn't manage any kind of caching mechanism for the
evaluations.
To follow along, we recommend downloading the connector projects available from our
DataConnectors repository for the TripPin sample, specifically the sample
9-TestConnection.
To bring the legacy extension project to the new SDK, follow these steps:
1. In Visual Studio Code, select File > Open folder, then navigate to the folder where
your extension project is located.
2. Set up a workspace using the existing folder and its contents using one of the
following two methods:
The Power Query SDK has a mechanism to recognize the contents of your
folder and suggests that you enable the conversion to a new Power Query
SDK workspace.
Run the Setup workspace and the Build Task from the terminal menu. These
will effectively create the .mez file and the settings.json files needed for the
workspace.
The addition of two new folders and files is what transforms the current workspace into
a new Power Query SDK workspace.
Setup workspace
The Setup workspace task creates a settings.json file for your workspace. This file
defines the variables used for evaluations and general settings in that
workspace.
Build an extension file
The build task allows you to create the .mez file for your extension on demand.
The task to run TestConnection enables you to test the handler inside the Power Query
SDK without having to manually try this handler in the Microsoft Cloud.
To run this task, first set a credential for your connector and then run the task either
from the Power Query SDK section in the Explorer or through the list of tasks inside the
terminal menu.
The result of this task is displayed in the output terminal at the bottom of the window.
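For context, a TestConnection handler is declared on the connector's data source kind record. The following is a minimal sketch, assuming a connector named TripPin whose shared TripPin.Contents function takes no required parameters (as in the 9-TestConnection sample mentioned earlier); the Label value is a placeholder.

```powerquery
TripPin = [
    // Returns a list: the name of the function to call, followed by its arguments (none here)
    TestConnection = (dataSourcePath) => { "TripPin.Contents" },
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin"
];
```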
You can use the Power BI Community forum to post general questions around Power
Query, the M language, and custom connector development.
Creating your first connector: Hello World
Article • 12/21/2022
Following the instructions in Installing the Power Query SDK, create a new project called
"HelloWorld" and copy in the following M code, and then follow the rest of the
instructions to be able to open it in Power BI.
A section statement.
A data source function with metadata establishing it as a data source definition
with the Kind HelloWorld and Publish HelloWorld.Publish.
An Authentication record declaring that implicit (anonymous) is the only
authentication type for this source.
A publish record declaring that this connection is in Beta, what text to load from
the resx file, the source image, and the source type image.
A record associating icon sizes with specific PNGs in the build folder.
section HelloWorld;

[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents = (optional message as text) =>
    let
        message = if (message <> null) then message else "Hello world"
    in
        message;

HelloWorld = [
    Authentication = [
        Implicit = []
    ],
    Label = Extension.LoadString("DataSourceLabel")
];

HelloWorld.Publish = [
    Beta = true,
    ButtonText = { Extension.LoadString("FormulaTitle"), Extension.LoadString("FormulaHelp") },
    SourceImage = HelloWorld.Icons,
    SourceTypeImage = HelloWorld.Icons
];

HelloWorld.Icons = [
    Icon16 = { Extension.Contents("HelloWorld16.png"), Extension.Contents("HelloWorld20.png"), Extension.Contents("HelloWorld24.png"), Extension.Contents("HelloWorld32.png") },
    Icon32 = { Extension.Contents("HelloWorld32.png"), Extension.Contents("HelloWorld40.png"), Extension.Contents("HelloWorld48.png"), Extension.Contents("HelloWorld64.png") }
];
Once you've built the file and copied it to the correct directory, following the
instructions in the Installing the Power Query SDK tutorial, open Power BI. You can search for
"hello" to find your connector in the Get Data dialog.
This step will bring up an authentication dialog. Since there are no authentication options
and the function takes no parameters, there are no further steps in these dialogs.
Press Connect and the dialog will tell you that it's a "Preview connector", since Beta is
set to true in the query. Since there's no authentication, the authentication screen will
present a tab for Anonymous authentication with no fields. Press Connect again to
finish.
Finally, the query editor will come up showing what you expect—a function that returns
the text "Hello world".
For the fully implemented sample, see the Hello World Sample in the Data Connectors
sample repo.
TripPin Tutorial
Article • 03/14/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
This tutorial uses a public OData service (TripPin) as a reference source. Although
this lesson requires the use of the M engine's OData functions, subsequent lessons will
use Web.Contents, making it applicable to (most) REST APIs.
Prerequisites
The following applications will be used throughout this tutorial:
Note
You can also start trace logging of your work at any time by enabling diagnostics,
which is described later on in this tutorial. More information: Enabling diagnostics
Parts
Part  Lesson             Details
4     Data Source Paths  How credentials are identified for your data source
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
Create a new Data Connector project using the Visual Studio SDK
Author a base function to pull data from a source
Test your connector in Visual Studio
Register your connector in Power BI Desktop
Open Visual Studio, and create a new Project. Under the Power Query folder, select the
Data Connector project. For this sample, set the project name to TripPin.
Open the TripPin.pq file and paste in the following connector definition.
Power Query M
section TripPin;

[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);
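The listing above references TripPinImpl, TripPin, and TripPin.Publish, which a complete section document also needs to define. The following is a minimal sketch of those members, assuming TripPinImpl simply forwards the URL to OData.Feed (consistent with the folding behavior described later in this lesson). The Label, Category, and ButtonText values are placeholders.

```powerquery
// Implementation behind TripPin.Feed: hand the URL to the built-in OData function
TripPinImpl = (url as text) =>
    let
        source = OData.Feed(url)
    in
        source;

// Data source kind record: anonymous access only
TripPin = [
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin Part 1 - OData"
];

// Publish record: controls how the connector appears in the Get Data dialog
TripPin.Publish = [
    Beta = true,
    Category = "Other",
    ButtonText = { "TripPin OData", "TripPin OData" }
];
```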
Open the TripPin.query.pq file. Replace the current contents with a call to your exported
function.
Power Query M
TripPin.Feed("https://services.odata.org/v4/TripPinService/")
Running your query for the first time results in a credential error. In Power Query, the
hosting application would convert this error into a credential prompt. In Visual Studio,
you'll receive a similar prompt that calls out which data source is missing credentials and
its data source path. Select the shortest of the data source paths
(https://services.odata.org/). This applies your credential to all URLs under this
path.
Select the Anonymous credential type, and then select Set Credential.
Select OK to close the dialog, and then select the Start button once again. You see a
query execution status dialog, and finally a Query Result table showing the data
returned from your query.
You can try out a few different OData URLs in the test file to see how different
results are returned. For example:
https://services.odata.org/v4/TripPinService/Me
https://services.odata.org/v4/TripPinService/GetPersonWithMostFriends()
https://services.odata.org/v4/TripPinService/People
The TripPin.query.pq file can contain single statements, let statements, or full section
documents.
Power Query M
let
    Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/"),
    People = Source{[Name="People"]}[Data],
    SelectColumns = Table.SelectColumns(People, {"UserName", "FirstName", "LastName"})
in
    SelectColumns
Open Fiddler to capture HTTP traffic, and run the query. You should see a few
different requests to services.odata.org, generated by the mashup container process.
You can see that accessing the root URL of the service results in a 302 status and a
redirect to the longer version of the URL. Following redirects is another behavior you get
“for free” from the base library functions.
One thing to note if you look at the URLs is that you can see the query folding that
happened with the SelectColumns statement.
https://services.odata.org/v4/TripPinService/People?$select=UserName%2CFirstName%2CLastName
If you add more transformations to your query, you can see how they impact the
generated URL.
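For example, a row filter could be added to the earlier query, as in the sketch below. The filter value "Scott" is only an illustrative choice. If the step folds, a corresponding filter clause should appear in the request URL captured by Fiddler; the exact URL isn't shown here because it depends on how the engine translates the step.

```powerquery
let
    Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/"),
    People = Source{[Name="People"]}[Data],
    SelectColumns = Table.SelectColumns(People, {"UserName", "FirstName", "LastName"}),
    // Extra step: a row filter that OData.Feed may translate into the request URL
    FilterRows = Table.SelectRows(SelectColumns, each [FirstName] = "Scott")
in
    FilterRows
```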
This behavior is important to note. Even though you did not implement explicit folding
logic, your connector inherits these capabilities from the OData.Feed function. M
statements are composable: filter contexts flow from one function to another
whenever possible. This is similar in concept to the way data source functions used
within your connector inherit their authentication context and credentials. In later
lessons, you'll replace the use of OData.Feed, which has native folding capabilities, with
Web.Contents, which does not. To get the same level of capabilities, you'll need to use
the Table.View interface and implement your own explicit folding logic.
You can locate your extension by typing its name into the search box.
Select the function name, and select Connect. A third-party message appears—select
Continue to continue. The function invocation dialog now appears. Enter the root URL
of the service (https://services.odata.org/v4/TripPinService/), and select OK.
Since this is the first time you are accessing this data source, you'll receive a prompt for
credentials. Check that the shortest URL is selected, and then select Connect.
Notice that instead of getting a simple table of data, the navigator appears. This is
because the OData.Feed function returns a table with special metadata on top of it that
the Power Query experience knows to display as a navigation table. This walkthrough
will cover how you can create and customize your own navigation table in a future
lesson.
Select the Me table, and then select Transform Data. Notice that the columns already
have types assigned (well, most of them). This is another feature of the underlying
OData.Feed function. If you watch the requests in Fiddler, you'll see that you've
fetched the service's $metadata document. The engine's OData implementation does
this automatically to determine the service's schema, data types, and relationships.
Conclusion
This lesson walked you through the creation of a simple connector based on the
OData.Feed library function. As you saw, very little logic is needed to enable a fully
functional connector over the OData base function. Other extensibility-enabled
functions, such as Odbc.DataSource, provide similar capabilities.
In the next lesson, you'll replace the use of OData.Feed with a less capable function—
Web.Contents. Each lesson will implement more connector features, including paging,
metadata/schema detection, and query folding to the OData query syntax, until your
custom connector supports the same range of capabilities as OData.Feed.
Next steps
TripPin Part 2 - Data Connector for a REST Service
TripPin part 2 - Data connector for a REST service
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
Create a base function that calls out to a REST API using Web.Contents
Learn how to set request headers and process a JSON response
Use Power BI Desktop to wrangle the response into a user friendly format
This lesson converts the OData-based connector for the TripPin service (created in the
previous lesson) to a connector that resembles something you'd create for any RESTful
API. OData is a RESTful API, but one with a fixed set of conventions. The advantage of
OData is that it provides a schema, data retrieval protocol, and standard query
language. Taking away the use of OData.Feed will require us to build these capabilities
into the connector ourselves.
Open the TripPin connector project from Part 1 in Visual Studio. Open the Query file and
paste in the following query:
TripPin.Feed("https://services.odata.org/v4/TripPinService/Me")
Open Fiddler and then select the Start button in Visual Studio.
M evaluation is mostly lazy. In most cases, data values are only retrieved/pulled when
they are needed. There are scenarios (like the /Me/BestFriend case) where a value is
pulled eagerly. This tends to occur when type information is needed for a member, and
the engine has no other way to determine the type than to retrieve the value and
inspect it. Making things lazy (that is, avoiding eager pulls) is one of the key aspects to
making an M connector performant.
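Lazy evaluation can be seen in a small standalone query. In the sketch below (variable names are arbitrary), the Unused step would raise an error if it were evaluated, but nothing references it, so the query evaluates to 2 without ever raising the error.

```powerquery
let
    // Never referenced by Result, so this error is never raised
    Unused = error "This value is never evaluated",
    Result = 1 + 1
in
    Result
```

The same principle applies to connector code: a step that makes a web request only costs a round trip if something downstream actually needs its value.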
Note the request headers that were sent along with the requests and the JSON format
of the response of the /Me request.
JSON
{
    "@odata.context": "https://services.odata.org/v4/TripPinService/$metadata#Me",
    "UserName": "aprilcline",
    "FirstName": "April",
    "LastName": "Cline",
    "MiddleName": null,
    "Gender": "Female",
    "Age": null,
    "Emails": [ "April@example.com", "April@contoso.com" ],
    "FavoriteFeature": "Feature1",
    "Features": [ ],
    "AddressInfo": [
        {
            "Address": "P.O. Box 555",
            "City": {
                "Name": "Lander",
                "CountryRegion": "United States",
                "Region": "WY"
            }
        }
    ],
    "HomeAddress": null
}
When the query finishes evaluating, the M Query Output window should show the
Record value for the Me singleton.
If you compare the fields in the output window with the fields returned in the raw JSON
response, you'll notice a mismatch. The query result has additional fields (Friends,
Trips, GetFriendsTrips) that don't appear anywhere in the JSON response. The
OData.Feed function automatically appended these fields to the record based on the
schema returned by $metadata. This is a good example of how a connector might
augment and/or reformat the response from the service to provide a better user
experience.
To be able to make successful web requests to the OData service, however, you'll have
to set some standard OData headers. You'll do this by defining a common set of
headers as a new variable in your connector:
Power Query M
DefaultRequestHeaders = [
    #"Accept" = "application/json;odata.metadata=minimal",  // column name and values only
    #"OData-MaxVersion" = "4.0"                             // we only support v4
];
You'll change your implementation of your TripPin.Feed function so that rather than
using OData.Feed, it uses Web.Contents to make a web request, and parses the result as
a JSON document.
Power Query M
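The listing for this step isn't reproduced here. The following is a minimal sketch of the change, assuming the DefaultRequestHeaders record defined above is passed through the Headers option of Web.Contents and the response body is parsed with Json.Document.

```powerquery
TripPinImpl = (url as text) =>
    let
        // Issue the web request with the common OData headers
        source = Web.Contents(url, [ Headers = DefaultRequestHeaders ]),
        // Parse the binary response as a JSON document
        json = Json.Document(source)
    in
        json;
```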
You can now test this out in Visual Studio using the query file. The result of the /Me
record now resembles the raw JSON that you saw in the Fiddler request.
If you watch Fiddler when running the new function, you'll also notice that the
evaluation now makes a single web request, rather than three. Congratulations—you've
achieved a 300% performance increase! Of course, you've now lost all the type and
schema information, but there's no need to focus on that part just yet.
Update your query to access some of the TripPin Entities/Tables, such as:
https://services.odata.org/v4/TripPinService/Airlines
https://services.odata.org/v4/TripPinService/Airports
https://services.odata.org/v4/TripPinService/Me/Trips
You'll notice that the paths that used to return nicely formatted tables now return a top
level "value" field with an embedded [List]. You'll need to do some transformations on
the result to make it usable for Power BI scenarios.
Authoring transformations in Power Query
While it is certainly possible to author your M transformations by hand, most people
prefer to use Power Query to shape their data. You'll open your extension in Power BI
Desktop and use it to design queries to turn the output into a more user friendly format.
Rebuild your solution, copy the new extension file to your Custom Data Connectors
directory, and relaunch Power BI Desktop.
Start a new Blank Query, and paste the following into the formula bar:
= TripPin.Feed("https://services.odata.org/v4/TripPinService/Airlines")
Manipulate the output until it looks like the original OData feed—a table with two
columns: AirlineCode and Name.
Power Query M
let
    Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airlines"),
    value = Source[value],
    toTable = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    expand = Table.ExpandRecordColumn(toTable, "Column1", {"AirlineCode", "Name"}, {"AirlineCode", "Name"})
in
    expand
Create a new Blank Query. This time, use the TripPin.Feed function to access the
/Airports entity. Apply transforms until you get something similar to the shape shown
below. The matching query can also be found below. Give this query a name
("Airports") as well.
Power Query M
let
    Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airports"),
    value = Source[value],
    #"Converted to Table" = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Name", "IcaoCode", "IataCode", "Location"}, {"Name", "IcaoCode", "IataCode", "Location"}),
    #"Expanded Location" = Table.ExpandRecordColumn(#"Expanded Column1", "Location", {"Address", "Loc", "City"}, {"Address", "Loc", "City"}),
    #"Expanded City" = Table.ExpandRecordColumn(#"Expanded Location", "City", {"Name", "CountryRegion", "Region"}, {"Name.1", "CountryRegion", "Region"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded City", {{"Name.1", "City"}}),
    #"Expanded Loc" = Table.ExpandRecordColumn(#"Renamed Columns", "Loc", {"coordinates"}, {"coordinates"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Loc", "Latitude", each [coordinates]{1}),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Longitude", each [coordinates]{0}),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1", {"coordinates"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns", {{"Name", type text}, {"IcaoCode", type text}, {"IataCode", type text}, {"Address", type text}, {"City", type text}, {"CountryRegion", type text}, {"Region", type text}, {"Latitude", type number}, {"Longitude", type number}})
in
    #"Changed Type"
You can repeat this process for additional paths under the service. Once you're ready,
move onto the next step of creating a (mock) navigation table.
let
    source = #table({"Name", "Data"}, {
        { "Airlines", Airlines },
        { "Airports", Airports }
    })
in
    source
If you have not set your Privacy Levels setting to "Always ignore Privacy level settings"
(also known as "Fast Combine") you'll see a privacy prompt.
Privacy prompts appear when you're combining data from multiple sources and have
not yet specified a privacy level for the source(s). Select the Continue button and set the
privacy level of the top source to Public.
Select Save and your table will appear. While this isn't a navigation table yet, it provides
the basic functionality you need to turn it into one in a subsequent lesson.
Data combination checks do not occur when accessing multiple data sources from
within an extension. Since all data source calls made from within the extension inherit
the same authorization context, it is assumed they are "safe" to combine. Your extension
will always be treated as a single data source when it comes to data combination rules.
Users would still receive the regular privacy prompts when combining your source with
other M sources.
If you run Fiddler and click the Refresh Preview button in the Query Editor, you'll notice
separate web requests for each item in your navigation table. This indicates that an
eager evaluation is occurring, which isn't ideal when building navigation tables with a lot
of elements. Subsequent lessons will show how to build a proper navigation table that
supports lazy evaluation.
Conclusion
This lesson showed you how to build a simple connector for a REST service. In this case,
you turned an existing OData extension into a standard REST extension (using
Web.Contents), but the same concepts apply if you were creating a new extension from
scratch.
In the next lesson, you'll take the queries created in this lesson using Power BI Desktop
and turn them into a true navigation table within the extension.
Next steps
TripPin Part 3 - Navigation Tables
TripPin part 3 - Navigation tables
Article • 04/28/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
This lesson adds a navigation table to the TripPin connector created in the previous
lesson. When your connector used the OData.Feed function (Part 1), you received the
navigation table “for free”, as derived from the OData service’s $metadata document.
When you moved to the Web.Contents function (Part 2), you lost the built-in navigation
table. In this lesson, you'll take a set of fixed queries you created in Power BI Desktop
and add the appropriate metadata for Power Query to pop up the Navigator dialog for
your data source function.
See the Navigation Table documentation for more information about using navigation
tables.
You'll start by copying the queries you wrote in Power BI Desktop (in the previous
lesson) into your connector file. Open the TripPin Visual Studio project, and paste the
Airlines and Airports queries into the TripPin.pq file. You can then turn those queries into
functions that take a single text parameter:
Power Query M
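The listing for this step isn't reproduced here. The following is a sketch of what the Airlines function might look like, assuming the name GetAirlinesTable (hypothetical here) and a service root URL that ends in a trailing slash. A GetAirportsTable function would follow the same pattern, using the steps from the Airports query and url & "Airports".

```powerquery
GetAirlinesTable = (url as text) as table =>
    let
        // Build the entity URL from the service root passed in by the caller
        source = TripPin.Feed(url & "Airlines"),
        value = source[value],
        toTable = Table.FromList(value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        expand = Table.ExpandRecordColumn(toTable, "Column1", {"AirlineCode", "Name"}, {"AirlineCode", "Name"})
    in
        expand;
```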
Next you'll import the mock navigation table query you wrote that creates a fixed table
linking to these data set queries. Call it TripPinNavTable:
Power Query M
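A sketch of that fixed table written as a function is shown below. It assumes helper functions named GetAirlinesTable and GetAirportsTable (hypothetical names) that each take the service URL and return a table.

```powerquery
TripPinNavTable = (url as text) as table =>
    let
        // One row per entity: a display name and the table it resolves to
        source = #table({"Name", "Data"}, {
            { "Airlines", GetAirlinesTable(url) },
            { "Airports", GetAirportsTable(url) }
        })
    in
        source;
```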
Power Query M
[DataSource.Kind="TripPin"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as Uri.Type) as any);

[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = Value.ReplaceType(TripPinNavTable, type function (url as Uri.Type) as any);
Note
Your extension can mark multiple functions as shared, with or without associating
them with a DataSource.Kind. However, when you associate a function with a
specific DataSource.Kind, each function must have the same set of required
parameters, with the same name and type. This is because the data source function
parameters are combined to make a 'key' used for looking up cached credentials.
You can test your TripPin.Contents function using your TripPin.query.pq file. Running
the following test query will give you a credential prompt, and a simple table output.
Power Query M
TripPin.Contents("https://services.odata.org/v4/TripPinService/")
Creating a navigation table
You'll use the handy Table.ToNavigationTable function to format your static table into
something that Power Query will recognize as a navigation table. Since this function is
not part of Power Query's standard library, you'll need to copy its source code into your
.pq file.
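The helper's source isn't reproduced in this section. The version below follows the commonly published form of this helper; treat it as a sketch and compare it against the Navigation Table documentation before relying on it. It works by attaching navigation metadata to the table's type.

```powerquery
Table.ToNavigationTable = (
    table as table,
    keyColumns as list,
    nameColumn as text,
    dataColumn as text,
    itemKindColumn as text,
    itemNameColumn as text,
    isLeafColumn as text
) as table =>
    let
        tableType = Value.Type(table),
        // Add a primary key, then tag the type with the navigation table metadata fields
        newTableType = Type.AddTableKey(tableType, keyColumns, true) meta
        [
            NavigationTable.NameColumn = nameColumn,
            NavigationTable.DataColumn = dataColumn,
            NavigationTable.ItemKindColumn = itemKindColumn,
            Preview.DelayColumn = itemNameColumn,
            NavigationTable.IsLeafColumn = isLeafColumn
        ],
        // Re-type the original table so Power Query recognizes it as a navigation table
        navigationTable = Value.ReplaceType(table, newTableType)
    in
        navigationTable;
```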
With this helper function in place, next update your TripPinNavTable function to add
the navigation table fields.
Power Query M
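The updated listing isn't reproduced here. The following sketch shows how the function might look, assuming the hypothetical GetAirlinesTable and GetAirportsTable helpers and a Table.ToNavigationTable helper that takes the table, key columns, and the names of the name, data, item kind, item name, and leaf columns. The column names ItemKind, ItemName, and IsLeaf are illustrative choices.

```powerquery
TripPinNavTable = (url as text) as table =>
    let
        // Extra columns describe how each row should be shown in the Navigator
        source = #table({"Name", "Data", "ItemKind", "ItemName", "IsLeaf"}, {
            { "Airlines", GetAirlinesTable(url), "Table", "Table", true },
            { "Airports", GetAirportsTable(url), "Table", "Table", true }
        }),
        navTable = Table.ToNavigationTable(source, {"Name"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;
```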
Running your test query again will give you a similar result as last time—with a few
more columns added.
Note
You will not see the Navigator window appear in Visual Studio. The M Query
Output window always displays the underlying table.
If you copy your extension over to your Power BI Desktop custom connector and invoke
the new function from the Get Data dialog, you'll see your navigator appear.
If you right click on the root of the navigation tree and select Edit, you'll see the same
table as you did within Visual Studio.
Conclusion
In this tutorial, you added a Navigation Table to your extension. Navigation Tables are a
key feature that make connectors easier to use. In this example your navigation table
only has a single level, but the Power Query UI supports displaying navigation tables
that have multiple dimensions (even when they are ragged).
Next steps
TripPin Part 4 - Data Source Paths
TripPin part 4 - Data source paths
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
This lesson simplifies the connector built in the previous lesson by removing its required
function parameters, and improving the user experience by moving to a dynamically
generated navigation table.
For an in-depth explanation of how credentials are identified, see the Data Source Paths
section of Handling Authentication.
In the previous lesson you shared two data source functions, both with a single Uri.Type
parameter.
Power Query M
[DataSource.Kind="TripPin"]
shared TripPin.Feed = Value.ReplaceType(TripPinImpl, type function (url as
Uri.Type) as any);
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = Value.ReplaceType(TripPinNavTable, type function
(url as Uri.Type) as any);
The first time you run a query that uses one of the functions, you'll receive a credential
prompt with drop-downs that let you select a path and an authentication type.
If you run the same query again, with the same parameters, the M engine is able to
locate the cached credentials, and no credential prompt is shown. If you modify the url
argument to your function so that the base path no longer matches, a new credential
prompt is displayed for the new path.
You can see any cached credentials on the Credentials table in the M Query Output
window.
Depending on the type of change, modifying the parameters of your function will likely
result in a credential error.
One of the design philosophies of Power Query is to keep the initial data source dialog
as simple as possible. If at all possible, you should provide the user with choices at the
Navigator level, rather than on the connection dialog. If a user-provided value can be
determined programmatically, consider adding it as the top level of your navigation
table rather than a function parameter.
For example, when connecting to a relational database, you might need server,
database, and table names. Once you know the server to connect to, and credentials
have been provided, you could use the database's API to fetch a list of databases, and a
list of tables contained within each database. In this case, to keep your initial connect
dialog as simple as possible, only the server name should be a required parameter—
Database and Table would be levels of your navigation table.
Since the TripPin service has a fixed URL endpoint, you don't need to prompt the user
for any values. You'll remove the url parameter from your function, and define a BaseUrl
variable in your connector.
Power Query M
BaseUrl = "https://services.odata.org/v4/TripPinService/";
[DataSource.Kind="TripPin", Publish="TripPin.Publish"]
shared TripPin.Contents = () => TripPinNavTable(BaseUrl) as table;
You'll keep the TripPin.Feed function, but no longer make it shared, no longer associate
it with a Data Source Kind, and simplify its declaration. From this point on, you'll only
use it internally within this section document.
Power Query M
If you update the TripPin.Contents() call in your TripPin.query.pq file and run it in
Visual Studio, you'll see a new credential prompt. Note that there is now a single Data
Source Path value—TripPin.
Improving the navigation table
In the first tutorial you used the built-in OData functions to connect to the TripPin
service. This gave you a really nice looking navigation table, based on the TripPin service
document, with no additional code on your side. The OData.Feed function automatically
did the hard work for you. Since you're "roughing it" by using Web.Contents rather than
OData.Feed, you'll need to recreate this navigation table yourself.
You're going to make the following changes:
To simplify the example, you'll only expose the three entity sets (Airlines, Airports,
People), which would be exposed as Tables in M, and skip the singleton (Me) which
would be exposed as a Record. You'll skip adding the functions until a later lesson.
Power Query M
RootEntities = {
"Airlines",
"Airports",
"People"
};
You then update your TripPinNavTable function to build the table a column at a time.
The [Data] column for each entity is retrieved by calling TripPin.Feed with the full URL
to the entity.
Power Query M
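A sketch of the column-at-a-time approach, assuming the RootEntities list above plus the TripPin.Feed and Table.ToNavigationTable members defined elsewhere in the section:

```powerquery
TripPinNavTable = (url as text) as table =>
    let
        // One row per entity name
        entitiesAsTable = Table.FromList(RootEntities, Splitter.SplitByNothing()),
        rename = Table.RenameColumns(entitiesAsTable, {{"Column1", "Name"}}),
        // Data is a calculated column: the full entity URL is passed to TripPin.Feed
        withData = Table.AddColumn(rename, "Data",
            each TripPin.Feed(Uri.Combine(url, [Name])), type table),
        // ItemKind and ItemName are fixed text values
        withItemKind = Table.AddColumn(withData, "ItemKind", each "Table", type text),
        withItemName = Table.AddColumn(withItemKind, "ItemName", each "Table", type text),
        // Indicate that the node should not be expandable
        withIsLeaf = Table.AddColumn(withItemName, "IsLeaf", each true, type logical),
        navTable = Table.ToNavigationTable(
            withIsLeaf, {"Name"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
    in
        navTable;
```

Because M evaluates lazily, the TripPin.Feed call for an entity only runs when that entity's Data value is actually requested.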
When dynamically building URL paths, make sure you're clear where your forward
slashes (/) are! Note that Uri.Combine uses the following rules when combining paths:
When the relativeUri parameter starts with a /, it will replace the entire path of
the baseUri parameter
If the relativeUri parameter does not start with a / and baseUri ends with a /, the
path is appended
If the relativeUri parameter does not start with a / and baseUri does not end with
a /, the last segment of the path is replaced
Power Query M
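The rules above can be checked with a small standalone query. The results in the comments are what the three rules predict, not captured output:

```powerquery
let
    base = "https://services.odata.org/v4/TripPinService",
    // baseUri ends with "/", relativeUri has no leading "/": the path is appended
    // expected: https://services.odata.org/v4/TripPinService/Airlines
    appended = Uri.Combine(base & "/", "Airlines"),
    // baseUri has no trailing "/": the last segment (TripPinService) is replaced
    // expected: https://services.odata.org/v4/Airlines
    lastSegmentReplaced = Uri.Combine(base, "Airlines"),
    // relativeUri starts with "/": the entire path of baseUri is replaced
    // expected: https://services.odata.org/Airlines
    pathReplaced = Uri.Combine(base & "/", "/Airlines")
in
    [Appended = appended, LastSegmentReplaced = lastSegmentReplaced, PathReplaced = pathReplaced]
```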
Note
A disadvantage of using a generic approach to process your entities is that you lose
the nice formatting and type information for your entities. A later section in this
tutorial shows how to enforce schema on REST API calls.
Conclusion
In this tutorial, you cleaned up and simplified your connector by fixing your Data Source
Path value, and moving to a more flexible format for your navigation table. After
completing these steps (or using the sample code in this directory), the
TripPin.Contents function returns a navigation table in Power BI Desktop.
Next steps
TripPin Part 5 - Paging
TripPin part 5 - Paging
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
Many REST APIs will return data in "pages", requiring clients to make multiple requests to
stitch the results together. Although there are some common conventions for
pagination (such as RFC 5988 ), it generally varies from API to API. Thankfully, TripPin is
an OData service, and the OData standard defines a way of doing pagination using
odata.nextLink values returned in the body of the response.
To simplify previous iterations of the connector, the TripPin.Feed function was not page
aware. It simply parsed whatever JSON was returned from the request and formatted it
as a table. Those familiar with the OData protocol might have noticed that a number of
incorrect assumptions were made on the format of the response (such as assuming
there is a value field containing an array of records).
In this lesson you'll improve your response handling logic by making it page aware.
Future tutorials will make the page handling logic more robust and able to handle
multiple response formats (including errors from the service).
Note
You do not need to implement your own paging logic with connectors based on
OData.Feed, as it handles it all for you automatically.
Paging checklist
When implementing paging support, you'll need to know the following things about
your API:
The answer to these questions will impact the way you implement your paging logic.
While there is some amount of code reuse across paging implementations (such as the
use of Table.GenerateByPage), most connectors will end up requiring custom logic.
Note
This lesson contains paging logic for an OData service, which follows a specific
format. Check the documentation for your API to determine the changes you'll
need to make in your connector to support its paging format.
JSON
{
"odata.context": "...",
"odata.count": 37,
"value": [
{ },
{ },
{ }
],
"odata.nextLink": "...?$skiptoken=342r89"
}
Some OData services allow clients to supply a max page size preference , but it is up to
the service whether or not to honor it. Power Query should be able to handle responses
of any size, so you don't need to worry about specifying a page size preference—you
can support whatever the service throws at you.
Power Query M
let
source = TripPin.Contents(),
data = source{[Name="People"]}[Data],
withRowCount = Table.AddIndexColumn(data, "Index")
in
withRowCount
Turn on Fiddler, and run the query in Visual Studio. You'll notice that the query returns a
table with 8 rows (index 0 to 7).
If you look at the body of the response from Fiddler, you'll see that it does in fact
contain an @odata.nextLink field, indicating that there are more pages of data available.
JSON
{
"@odata.context": "https://services.odata.org/V4/TripPinService/$metadata#People",
"@odata.nextLink": "https://services.odata.org/v4/TripPinService/People?%24skiptoken=8",
"value": [
{ },
{ },
{ }
]
}
Note
As stated earlier in this tutorial, paging logic will vary between data sources. The
implementation here tries to break up the logic into functions that should be
reusable for sources that use next links returned in the response.
Table.GenerateByPage
To combine the (potentially) multiple pages returned by the source into a single table,
we'll use Table.GenerateByPage. This function takes as its argument a getNextPage
function which should do just what its name suggests: fetch the next page of data.
Table.GenerateByPage will repeatedly call the getNextPage function, each time passing it
the results produced the last time it was called, until it returns null to signal back that
no more pages are available.
Since this function is not part of Power Query's standard library, you'll need to copy its
source code into your .pq file.
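A sketch of the helper, built on List.Generate and modeled on the version commonly used in connector samples. It assumes each page is a table and that all pages share the first page's columns:

```powerquery
Table.GenerateByPage = (getNextPage as function) as table =>
    let
        listOfPages = List.Generate(
            () => getNextPage(null),             // get the first page of data
            (lastPage) => lastPage <> null,      // stop when the function returns null
            (lastPage) => getNextPage(lastPage)  // pass the previous page to the next call
        ),
        // Put the pages in a single-column table so they can be expanded together
        tableOfPages = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
        firstRow = tableOfPages{0}?
    in
        // No pages at all: return an empty table.
        // Otherwise expand the pages and reuse the type of the first page.
        if (firstRow = null) then
            Table.FromRows({})
        else
            Value.ReplaceType(
                Table.ExpandTableColumn(tableOfPages, "Column1", Table.ColumnNames(firstRow[Column1])),
                Value.Type(firstRow[Column1])
            );
```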
Implementing GetAllPagesByNextLink
The body of your GetAllPagesByNextLink function implements the getNextPage function
argument for Table.GenerateByPage. It will call the GetPage function, and retrieve the
URL for the next page of data from the NextLink field of the meta record from the
previous call.
Power Query M
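A sketch of this function, assuming GetPage stores the next page's URL in a NextLink field of the returned table's metadata record (as described in the next section):

```powerquery
GetAllPagesByNextLink = (url as text) as table =>
    Table.GenerateByPage((previous) =>
        let
            // If previous is null, this is the first page, so start from the original url
            nextLink = if (previous = null) then url else Value.Metadata(previous)[NextLink]?,
            // A null NextLink from the previous call means there is no more data
            page = if (nextLink <> null) then GetPage(nextLink) else null
        in
            page
    );
```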
Implementing GetPage
Your GetPage function will use Web.Contents to retrieve a single page of data from the
TripPin service, and convert the response into a table. It passes the response from
Web.Contents to the GetNextLink function to extract the URL of the next page, and sets
it on the meta record of the returned table (page of data).
This implementation is a slightly modified version of the TripPin.Feed call from the
previous tutorials.
Power Query M
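A sketch of GetPage under the same assumptions as the earlier lessons. The DefaultRequestHeaders record is included only so the example is complete, and its exact values are an assumption:

```powerquery
// Assumed request headers; adjust to whatever your connector already defines
DefaultRequestHeaders = [
    #"Accept" = "application/json;odata.metadata=minimal",
    #"OData-MaxVersion" = "4.0"
];

GetPage = (url as text) as table =>
    let
        response = Web.Contents(url, [Headers = DefaultRequestHeaders]),
        body = Json.Document(response),
        // Extract the URL of the next page (null when this is the last page)
        nextLink = GetNextLink(body),
        // Assumes the response has a "value" field containing a list of records
        data = Table.FromRecords(body[value])
    in
        // Attach the next link to the page as metadata; the rows themselves are unchanged
        data meta [NextLink = nextLink];
```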
Implementing GetNextLink
Your GetNextLink function simply checks the body of the response for an
@odata.nextLink field, and returns its value.
Power Query M
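A sketch of GetNextLink, followed by one way the pieces might be wired into the feed function. Record.FieldOrDefault returns null when the field is absent, which is the signal that ends paging:

```powerquery
// Returns the next page URL, or null if the response has no @odata.nextLink field
GetNextLink = (response) as nullable text =>
    Record.FieldOrDefault(response, "@odata.nextLink");

// The feed function then delegates to the page reader
TripPin.Feed = (url as text) as table => GetAllPagesByNextLink(url);
```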
Power Query M
If you re-run the same test query from earlier in the tutorial, you should now see the
page reader in action. You should also see that you have 20 rows in the response rather
than 8.
If you look at the requests in Fiddler, you should now see separate requests for each
page of data.
Note
You'll notice duplicate requests for the first page of data from the service, which is
not ideal. The extra request is a result of the M engine's schema checking behavior.
Ignore this issue for now and resolve it in the next tutorial, where you'll apply an
explicit schema.
Conclusion
This lesson showed you how to implement pagination support for a REST API. While the
logic will likely vary between APIs, the pattern established here should be reusable with
minor modifications.
In the next lesson, you'll look at how to apply an explicit schema to your data, going
beyond the simple text and number data types you get from Json.Document.
Next steps
TripPin Part 6 - Schema
TripPin part 6 - Schema
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
One of the big advantages of an OData service over a standard REST API is its
$metadata definition. The $metadata document describes the data found on this
service, including the schema for all of its Entities (Tables) and Fields (Columns). The
OData.Feed function uses this schema definition to automatically set data type
information, so instead of getting all text and number fields (like you would from
Json.Document), end users will get dates, whole numbers, times, and so on, providing a
better overall user experience.
Many REST APIs don't have a way to programmatically determine their schema. In these
cases, you'll need to include schema definitions within your connector. In this lesson
you'll define a simple, hardcoded schema for each of your tables, and enforce the
schema on the data you read from the service.
Note
The approach described here should work for many REST services. Future lessons
will build upon this approach by recursively enforcing schemas on structured
columns (record, list, table), and provide sample implementations that can
programmatically generate a schema table from CSDL or JSON Schema
documents.
Overall, enforcing a schema on the data returned by your connector has multiple
benefits, such as:
Power Query M
let
source = TripPin.Contents(),
data = source{[Name="Airlines"]}[Data]
in
data
@odata.id
@odata.editLink
AirlineCode
Name
The "@odata.*" columns are part of the OData protocol, and not something you'd want or
need to show to the end users of your connector. AirlineCode and Name are the two
columns you'll want to keep. If you look at the schema of the table (using the handy
Table.Schema function), you can see that all of the columns in the table have a data type
of Any.Type.
Power Query M
let
source = TripPin.Contents(),
data = source{[Name="Airlines"]}[Data]
in
Table.Schema(data)
Table.Schema returns a lot of metadata about the columns in a table, including names,
positions, type information, and many advanced properties, such as Precision, Scale, and
MaxLength. Future lessons will provide design patterns for setting these advanced
properties, but for now you need only concern yourself with the ascribed type
(TypeName), primitive type (Kind), and whether the column value might be null
(IsNullable).
Column  Details
Name    The name of the column. This must match the name in the results returned by the service.
Type    The M data type you're going to set. This can be a primitive type (text, number, datetime, and so on), or an ascribed type (Int64.Type, Currency.Type, and so on).
The hardcoded schema table for the Airlines table will set its AirlineCode and Name
columns to text, and looks like this:
Power Query M
Airlines = #table({"Name", "Type"}, {
{"AirlineCode", type text},
{"Name", type text}
});
The Airports table has four fields you'll want to keep (including one of type record ):
Power Query M
Finally, the People table has seven fields, including lists (Emails, AddressInfo), a nullable
column (Gender), and a column with an ascribed type (Concurrency).
Power Query M
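The two schema tables described above might look like the following sketch. The column names are taken from the public TripPin service and should be verified against the JSON your connector actually receives:

```powerquery
// Four fields, including one structured column of type record
Airports = #table({"Name", "Type"}, {
    {"IcaoCode", type text},
    {"Name", type text},
    {"IataCode", type text},
    {"Location", type record}
});

// Seven fields: two lists, one nullable column, one ascribed type
People = #table({"Name", "Type"}, {
    {"UserName", type text},
    {"FirstName", type text},
    {"LastName", type text},
    {"Emails", type list},
    {"AddressInfo", type list},
    {"Gender", type nullable text},
    {"Concurrency", Int64.Type}
});
```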
Parameter  Type   Description
table      table  The table of data you'll want to enforce your schema on.
schema     table  The schema table to read column information from, with the following type: type table [Name = text, Type = type].
1. Determine if there are any missing columns from the source table.
2. Determine if there are any extra columns.
3. Ignore structured columns (of type list, record, and table), and columns set to
type any.
4. Use Table.TransformColumnTypes to set each column type.
5. Reorder columns based on the order they appear in the schema table.
6. Set the type on the table itself using Value.ReplaceType.
Note
The last step to set the table type will remove the need for the Power Query UI to
infer type information when viewing the results in the query editor. This removes
the double request issue you saw at the end of the previous tutorial.
The following helper code can be copied and pasted into your extension:
Power Query M
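As a simplified illustration of the six steps listed earlier, here is a standalone query with a strict-mode helper. It is a sketch rather than the full helper: it always adds missing columns and removes extra ones, and the sample data is invented for the example.

```powerquery
let
    SchemaTransformTable = (table as table, schema as table) as table =>
        let
            schemaNames = schema[Name],
            foundNames = Table.ColumnNames(table),
            // Step 1: columns named in the schema but missing from the source table
            missingNames = List.RemoveItems(schemaNames, foundNames),
            tmp = Text.NewGuid(),
            added = Table.AddColumn(table, tmp, each []),
            withMissing =
                if List.IsEmpty(missingNames) then table
                else Table.ExpandRecordColumn(added, tmp, missingNames),
            // Steps 2 and 5: drop extra columns and reorder to match the schema table
            reordered = Table.SelectColumns(withMissing, schemaNames),
            // Step 3: skip structured columns and columns typed as any
            isSimple = (t as type) as logical =>
                not (Type.Is(t, type list) or Type.Is(t, type record)
                    or Type.Is(t, type table) or t = type any),
            simpleRows = Table.SelectRows(schema, each isSimple([Type])),
            // Step 4: set each remaining column type
            typed = Table.TransformColumnTypes(reordered, Table.ToRows(simpleRows)),
            // Step 6: set the type on the table itself
            rowType = Type.ForRecord(
                Record.FromList(
                    List.Transform(schema[Type], (t) => [Type = t, Optional = false]),
                    schema[Name]),
                false),
            withType = Value.ReplaceType(typed, type table (rowType))
        in
            withType,

    // Invented sample data that mimics the raw Airlines response
    schema = #table({"Name", "Type"}, {{"AirlineCode", type text}, {"Name", type text}}),
    raw = #table({"@odata.id", "Name", "AirlineCode"}, {{"id1", "American Airlines", "AA"}}),
    result = SchemaTransformTable(raw, schema)
in
    result
```

The result keeps only AirlineCode and Name, in schema order, with both columns typed as text.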
1. Define a master schema table (SchemaTable) that holds all of your schema
definitions.
2. Update the TripPin.Feed, GetPage, and GetAllPagesByNextLink functions to accept a
schema parameter.
3. Enforce your schema in GetPage.
4. Update your navigation table code to wrap each table with a call to a new function
(GetEntity). This will give you more flexibility to manipulate the table definitions
in the future.
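The steps above can be sketched as follows. The example assumes the Airlines, Airports, and People schema tables are defined in the same section, and that TripPin.Feed has already been updated to accept a schema argument:

```powerquery
// Step 1: master lookup table from entity name to schema table
SchemaTable = #table({"Entity", "SchemaTable"}, {
    {"Airlines", Airlines},
    {"Airports", Airports},
    {"People", People}
});

// Look up the schema for an entity, raising a readable error for unknown names
GetSchemaForEntity = (entity as text) as table =>
    try
        SchemaTable{[Entity = entity]}[SchemaTable]
    otherwise
        error "Couldn't find entity: '" & entity & "'";

// Step 4: one function that builds the URL, finds the schema, and fetches the data
GetEntity = (url as text, entity as text) as table =>
    let
        fullUrl = Uri.Combine(url, entity),
        schemaTable = GetSchemaForEntity(entity),
        result = TripPin.Feed(fullUrl, schemaTable)
    in
        result;
```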
Power Query M
you want to) to the paging functions, where it will be applied to the results you get back
from the service.
Power Query M
You'll also update all of the calls to these functions to make sure that you pass the
schema through correctly.
Power Query M
Power Query M
You'll then update your TripPinNavTable function to call GetEntity , rather than making
all of the calls inline. The main advantage to this is that it will let you continue modifying
your entity building code, without having to touch your nav table logic.
Power Query M
Power Query M
let
source = TripPin.Contents(),
data = source{[Name="Airlines"]}[Data]
in
Table.Schema(data)
You now see that your Airlines table only has the two columns you defined in its
schema:
Power Query M
let
source = TripPin.Contents(),
data = source{[Name="People"]}[Data]
in
Table.Schema(data)
You'll see that the ascribed type you used ( Int64.Type ) was also set correctly.
Conclusion
This tutorial provided a sample implementation for enforcing a schema on JSON data
returned from a REST service. While this sample uses a simple hardcoded schema table
format, the approach could be expanded upon by dynamically building a schema table
definition from another source, such as a JSON schema file, or metadata
service/endpoint exposed by the data source.
In addition to modifying column types (and values), your code is also setting the correct
type information on the table itself. Setting this type information benefits performance
when running inside of Power Query, as the user experience always attempts to infer
type information to display the right UI cues to the end user, and the inference calls
can end up triggering additional calls to the underlying data APIs.
If you view the People table using the TripPin connector from the previous lesson, you'll
see that all of the columns have a 'type any' icon (even the columns that contain lists):
Running the same query with the TripPin connector from this lesson, you'll now see that
the type information is displayed correctly.
Next steps
TripPin Part 7 - Advanced Schema with M Types
TripPin part 7 - Advanced schema with
M types
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
In the previous lesson you defined your table schemas using a simple "Schema Table"
system. This schema table approach works for many REST APIs/Data Connectors, but
services that return complex or deeply nested data sets might benefit from the
approach in this tutorial, which leverages the M type system.
1. Copy the common code from the UnitTest sample into your TripPin.query.pq file.
2. Add a section declaration to the top of your TripPin.query.pq file.
3. Create a shared record (called TripPin.UnitTest ).
4. Define a Fact for each test.
5. Call Facts.Summarize() to run all of the tests.
6. Reference the previous call as the shared value to ensure that it gets evaluated
when the project is run in Visual Studio.
Power Query M
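section TripPinUnitTests;
shared TripPin.UnitTest =
[
    // Put any common variables here if you only want them to be evaluated once
    RootTable = TripPin.Contents(),
    Airlines = RootTable{[Name="Airlines"]}[Data],
    Airports = RootTable{[Name="Airports"]}[Data],
    People = RootTable{[Name="People"]}[Data],
    // Fact(<name of the test>, <expected value>, <actual value>)
    facts =
    {
        Fact("Check that we have three entries in our nav table", 3, Table.RowCount(RootTable)),
        Fact("We have Airline data?", true, not Table.IsEmpty(Airlines)),
        Fact("We have People data?", true, not Table.IsEmpty(People)),
        Fact("We have Airport data?", true, not Table.IsEmpty(Airports)),
        Fact("Airline table has the right fields",
            {"AirlineCode", "Name"},
            Record.FieldNames(Type.RecordFields(Type.TableRow(Value.Type(Airlines))))
        )
    },
    report = Facts.Summarize(facts)
][report];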
Selecting run on the project will evaluate all of the Facts, and give you a report output
that looks like this:
Using some principles from test-driven development , you'll now add a test that
currently fails, but will soon be reimplemented and fixed (by the end of this tutorial).
Specifically, you'll add a test that checks one of the nested records (Emails) you get back
in the People entity.
Power Query M
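One way to express such a test is shown below, as an entry to add to the facts list. It assumes the Fact helper from the UnitTest sample and checks that the items of the Emails list are typed as text:

```powerquery
// Fails until the schema is applied recursively, because the list items are still untyped
Fact("Emails is properly typed", type text, Type.ListItem(Value.Type(People{0}[Emails])))
```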
If you run the code again, you should now see that you have a failing test.
Now you just need to implement the functionality to make this work.
In the TripPin case, the data in the People and Airports entities contain structured
columns, and even share a type ( Location ) for representing address information. Rather
than defining Name/Type pairs in a schema table, you'll define each of these entities
using custom M type declarations.
Here is a quick refresher about types in the M language from the Language
Specification:
A type value is a value that classifies other values. A value that is classified by a type
is said to conform to that type. The M type system consists of the following kinds of
types:
Primitive types, which classify primitive values ( binary , date , datetime ,
datetimezone , duration , list , logical , null , number , record , text , time ,
type ) and also include a number of abstract types ( function , table , any , and
none )
Record types, which classify record values based on field names and value
types
List types, which classify lists using a single item base type
Function types, which classify function values based on the types of their
parameters and return values
Table types, which classify table values based on column names, column types,
and keys
Nullable types, which classifies the value null in addition to all the values
classified by a base type
Type types, which classify values that are types
Using the raw JSON output you get (and/or looking up the definitions in the service's
$metadata ), you can define the following record types to represent OData complex
types:
Power Query M
LocationType = type [
Address = text,
City = CityType,
Loc = LocType
];
CityType = type [
CountryRegion = text,
Name = text,
Region = text
];
LocType = type [
#"type" = text,
coordinates = {number},
crs = CrsType
];
CrsType = type [
#"type" = text,
properties = record
];
Note how the LocationType references the CityType and LocType to represent its
structured columns.
For the top level entities (that you want represented as Tables), you define table types:
Power Query M
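A sketch of the three table types, reusing the LocationType record defined above. The field lists are assumptions based on the public TripPin service and the schema tables from the previous lesson:

```powerquery
AirlinesType = type table [
    AirlineCode = text,
    Name = text
];

AirportsType = type table [
    IcaoCode = text,
    Name = text,
    IataCode = text,
    // Structured column that reuses the shared record type
    Location = LocationType
];

PeopleType = type table [
    UserName = text,
    FirstName = text,
    LastName = text,
    // List types are written as {itemType}
    Emails = {text},
    AddressInfo = {nullable LocationType},
    Gender = nullable text,
    Concurrency = Int64.Type
];
```

Unlike the schema table, these declarations can describe the item type of a list and the fields of a nested record, which is what allows the schema to be applied recursively.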
You then update your SchemaTable variable (which you use as a "lookup table" for entity
to type mappings) to use these new type definitions:
Power Query M
argument, and will apply your schema recursively for all nested types. Its signature looks
like this:
Power Query M
Note
For flexibility, the function can be used on tables, as well as lists of records (which is
how tables would be represented in a JSON document).
You then need to update the connector code to change the schema parameter from a
table to a type , and add a call to Table.ChangeType in GetEntity .
Power Query M
GetPage is updated to use the list of fields from the schema (to know the names of what
to expand when you get the results), but leaves the actual schema enforcement to
GetEntity .
Power Query M
Running your unit tests again shows that they are now all passing.
7 Note
At this point, your extension almost has as much "common" code as TripPin connector
code. In the future these common functions will either be part of the built-in standard
function library, or you'll be able to reference them from another extension. For now,
you refactor your code in the following way:
Power Query M
Table.ChangeType = Extension.LoadFunction("Table.ChangeType.pqm");
Table.GenerateByPage = Extension.LoadFunction("Table.GenerateByPage.pqm");
Table.ToNavigationTable =
Extension.LoadFunction("Table.ToNavigationTable.pqm");
Conclusion
This tutorial made a number of improvements to the way you enforce a schema on the
data you get from a REST API. The connector is currently hardcoding its schema
information, which has a performance benefit at runtime, but is unable to adapt to
changes in the service's metadata over time. Future tutorials will move to a purely
dynamic approach that will infer the schema from the service's $metadata document.
In addition to the schema changes, this tutorial added Unit Tests for your code, and
refactored the common helper functions into separate files to improve overall
readability.
Next steps
TripPin Part 8 - Adding Diagnostics
TripPin part 8 - Adding diagnostics
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
Enabling diagnostics
Power Query users can enable trace logging by selecting the checkbox under Options |
Diagnostics.
Once enabled, any subsequent queries will cause the M engine to emit trace information
to log files located in a fixed user directory.
When running M queries from within the Power Query SDK, tracing is enabled at the
project level. On the project properties page, there are three settings related to tracing:
Clear Log—when this is set to true , the log will be reset/cleared when you run
your queries. We recommend you keep this set to true .
Show Engine Traces—this setting controls the output of built-in traces from the M
engine. These traces are generally only useful to members of the Power Query
team, so you'll typically want to keep this set to false .
Show User Traces—this setting controls trace information output by your
connector. You'll want to set this to true .
Once enabled, you'll start seeing log entries in the M Query Output window, under the
Log tab.
Diagnostics.Trace
The Diagnostics.Trace function is used to write messages into the M engine's trace log.
Power Query M
Important
The traceLevel parameter can be one of the following values (in descending order):
TraceLevel.Critical
TraceLevel.Error
TraceLevel.Warning
TraceLevel.Information
TraceLevel.Verbose
When tracing is enabled, the user can select the maximum level of messages they would
like to see. All trace messages of this level and under will be output to the log. For
example, if the user selects the "Warning" level, trace messages of TraceLevel.Warning ,
TraceLevel.Error , and TraceLevel.Critical would appear in the logs.
The message parameter is the actual text that will be output to the trace file. Note that
the text will not contain the value parameter unless you explicitly include it in the text.
The value parameter is what the function will return. When the delayed parameter is
set to true , value will be a zero parameter function that returns the actual value you're
evaluating. When delayed is set to false , value will be the actual value. An example of
how this works can be found below.
Power Query M
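As an illustration, the entity lookup could trace a message before raising its error. This sketch assumes a SchemaTable with Entity and Type columns, as built in the previous lesson:

```powerquery
GetSchemaForEntity = (entity as text) as type =>
    try
        SchemaTable{[Entity = entity]}[Type]
    otherwise
        let
            message = Text.Format("Couldn't find entity: '#{0}'", {entity})
        in
            // delayed = true: the error is wrapped in a zero-parameter function,
            // so it is only raised after the message has been written to the trace
            Diagnostics.Trace(TraceLevel.Error, message, () => error message, true);
```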
You can force an error during evaluation (for test purposes!) by passing an invalid entity
name to the GetEntity function. Here you change the withData line in the
TripPinNavTable function, replacing [Name] with "DoesNotExist" .
Power Query M
Enable tracing for your project, and run your test queries. On the Errors tab you should
see the text of the error you raised:
Also, on the Log tab, you should see the same message. Note that if you use different
values for the message and value parameters, these would be different.
Also note that the Action field of the log message contains the name (Data Source
Kind) of your extension (in this case, Engine/Extension/TripPin ). This makes it easier to
find the messages related to your extension when there are multiple queries involved
and/or system (mashup engine) tracing is enabled.
Delayed evaluation
As an example of how the delayed parameter works, you'll make some modifications
and run the queries again.
First, set the delayed value to false , but leave the value parameter as-is:
Power Query M
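The effect can be reproduced in isolation with a small query. The message text is invented for the example:

```powerquery
let
    message = "Couldn't find entity: 'DoesNotExist'",
    // delayed = false, but value is still a function:
    // Diagnostics.Trace returns the function itself instead of invoking it
    result = Diagnostics.Trace(TraceLevel.Error, message, () => error message, false)
in
    result
```

Here result is a function value, which is why a caller that expects a type or a table reports a conversion error instead of the error you raised.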
When you run the query, you'll receive an error that "We cannot convert a value of type
Function to type Type", and not the actual error you raised. This is because the call is
now returning a function value, rather than the value itself.
Power Query M
Now that you understand the impact of the delayed parameter, be sure to reset
your connector back to a working state before proceeding.
Power Query M
Diagnostics.LogValue
The Diagnostics.LogValue function is a lot like Diagnostics.Trace , and can be used to
output the value of what you're evaluating.
Power Query M
The prefix parameter is prepended to the log message. You'd use this to figure out
which call output the message. The value parameter is what the function will return,
and will also be written to the trace as a text representation of the M value. For example,
if value is equal to a table with columns A and B, the log will contain the equivalent
#table representation: #table({"A", "B"}, {{"row1 A", "row1 B"}, {"row2 A", "row2 B"}})
7 Note
7 Note
As an example, you'll update the TripPin.Feed function to trace the url and schema
arguments passed into the function.
Power Query M
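A sketch of the updated function, assuming the Diagnostics.LogValue helper is loaded in your section and GetAllPagesByNextLink accepts a schema argument:

```powerquery
TripPin.Feed = (url as text, optional schema as type) as table =>
    let
        // LogValue returns its value argument, so _url and _schema carry the originals
        _url = Diagnostics.LogValue("Accessing url", url),
        _schema = Diagnostics.LogValue("Schema type", schema),
        // Use the traced values so the LogValue calls are actually evaluated
        result = GetAllPagesByNextLink(_url, _schema)
    in
        result;
```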
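Note that you have to use the new _url and _schema values in the call to
GetAllPagesByNextLink. If you used the original function parameters, the
Diagnostics.LogValue calls would never actually be evaluated, resulting in no messages
written to the trace.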
When you run your queries, you should now see new messages in the log.
Accessing url:
Schema type:
Note that you see the serialized version of the schema parameter type , rather than what
you'd get when you do a simple Text.FromValue on a type value (which results in
"type").
Diagnostics.LogFailure
The Diagnostics.LogFailure function can be used to wrap function calls, and will only
write to the trace if the function call fails (that is, returns an error ).
Power Query M
Internally, Diagnostics.LogFailure adds a try operator to the function call. If the call
fails, the text value is written to the trace before returning the original error . If the
function call succeeds, the result is returned without writing anything to the trace. Since
M errors don't contain a full stack trace (that is, you typically only see the message of
the error), this can be useful when you want to pinpoint where the error was actually
raised.
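To make the mechanism concrete, here is a hypothetical stand-in built only from try and Diagnostics.Trace. The name LogFailureSketch is invented, and the real helper may format its trace message differently:

```powerquery
let
    // Hypothetical wrapper: run the function, trace the text only on failure
    LogFailureSketch = (text as text, function as function) as any =>
        let
            result = try function()
        in
            if result[HasError] then
                // Write the caller's text to the trace, then re-raise the original error
                Diagnostics.Trace(TraceLevel.Error, text, () => error result[Error], true)
            else
                // Success: return the value without writing anything to the trace
                result[Value],

    ok = LogFailureSketch("Division failed", () => 10 / 2)
in
    ok
```

In the connector, the same pattern would wrap the call inside the withData step, for example `each Diagnostics.LogFailure("Error in GetEntity", () => GetEntity(url, "DoesNotExist"))`.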
As a (poor) example, modify the withData line of the TripPinNavTable function to force
an error once again:
Power Query M
In the trace, you can find the resulting error message containing your text , and the
original error information.
Be sure to reset your function to a working state before proceeding with the next
tutorial.
Conclusion
This brief (but important!) lesson showed you how to make use of the diagnostic helper
functions to log to the Power Query trace files. When used properly, these functions are
extremely useful in debugging issues within your connector.
7 Note
Next steps
TripPin Part 9 - TestConnection
TripPin part 9 - TestConnection
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
Custom connector support was added to the April 2018 release of the personal on-
premises data gateway. This new (preview) functionality allows for Scheduled Refresh of
reports that make use of your custom connector.
This tutorial will cover the process of enabling your connector for refresh, and provide a
quick walkthrough of the steps to configure the gateway. Specifically you'll:
Background
There are three prerequisites for configuring a data source for scheduled refresh using
PowerBI.com:
The data source is supported: This means that the target gateway environment is
aware of all of the functions contained in the query you want to refresh.
Credentials are provided: To present the right credential entry dialog, Power BI
needs to know the supported authentication mechanism for a given data source.
The credentials are valid: After the user provides credentials, they're validated by
calling the data source's TestConnection handler.
The first two items are handled by registering your connector with the gateway. When
the user attempts to configure scheduled refresh in PowerBI.com, the query information
is sent to your personal gateway to determine if any data sources that aren't recognized
by the Power BI service (that is, custom ones that you created) are available there. The
third item is handled by invoking the TestConnection handler defined for your data
source.
Since the TripPin data source function has no required arguments, the implementation
for TestConnection is fairly simple:
Power Query M
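A minimal sketch of such a handler, assuming the data source kind record is named TripPin, the exported data source function is TripPin.Contents, and the connector uses anonymous authentication (adjust these to match your connector). The handler returns a list whose first item is the name of the function to call; because that function has no required arguments, the list contains nothing else.

```powerquery
// Data source kind record (fragment of the connector's .pq section document).
// Assumptions: the record is named TripPin, the data source function is
// TripPin.Contents, and the connector uses anonymous authentication.
TripPin = [
    // Called by the gateway to validate the credentials the user entered.
    TestConnection = (dataSourcePath) => { "TripPin.Contents" },
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin Part 9 - TestConnection"
];
```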
Note
Future versions of the Power Query SDK will provide a way to validate the
TestConnection handler from Visual Studio. Currently, the only mechanism that uses
TestConnection is the on-premises data gateway.
After installation is complete, launch the gateway and sign into Power BI. The sign-in
process will automatically register your gateway with the Power BI services. Once signed
in, perform the following steps:
Add one or more visuals to your report page (optional), and then publish the report to
PowerBI.com.
After publishing, go to PowerBI.com and find the dataset for the report you just
published. Select the ellipsis, and then select Schedule Refresh. Expand the Gateway
connection and Data source credentials sections.
Note
If the dataset configuration page says that the report contains unknown data
sources, your gateway/custom connector might not be configured properly. Go to
the personal gateway configuration UI and make sure that there are no errors next
to the TripPin connector. You may need to restart the gateway (on the Service
Settings tab) to pick up the latest configuration.
Select the Edit credentials link to bring up the authentication dialog, and then select
sign-in.
Note
If you receive an error similar to the one below ("Failed to update data source
credentials"), you most likely have an issue with your TestConnection handler.
After a successful call to TestConnection, the credentials will be accepted. You can now
schedule refresh, or select the dataset ellipsis and then select Refresh Now. You can
select the Refresh history link to view the status of the refresh (which generally takes a
few minutes to get kicked off).
Conclusion
Congratulations! You now have a production-ready custom connector that supports
automated refresh through the Power BI service.
Next steps
TripPin Part 10 - Query Folding
TripPin part 10 - Basic query folding
Article • 02/17/2023
This multi-part tutorial covers the creation of a new data source extension for Power
Query. The tutorial is meant to be done sequentially—each lesson builds on the
connector created in previous lessons, incrementally adding new capabilities to your
connector.
This lesson implements folding for the following OData query options: $top, $skip,
$count, $select, and $orderby.
One of the powerful features of the M language is its ability to push transformation
work to underlying data sources. This capability is referred to as query folding (other
tools and technologies refer to similar functionality as predicate pushdown or query
delegation).
When you create a custom connector that uses an M function with built-in query folding
capabilities, such as OData.Feed or Odbc.DataSource, your connector automatically
inherits this capability.
This tutorial will replicate the built-in query folding behavior for OData by implementing
function handlers for the Table.View function. This part of the tutorial covers some of
the simpler handlers (that is, ones that don't require expression parsing and state
tracking).
To understand more about the query capabilities that an OData service might offer, go
to OData v4 URL Conventions.
Note
As stated above, the OData.Feed function will automatically provide query folding
capabilities. Since the TripPin series is treating the OData service as a regular REST
API, using Web.Contents rather than OData.Feed, you'll need to implement the
query folding handlers yourself. For real world usage, we recommend that you use
OData.Feed whenever possible.
Go to Power Query query folding for more information about query folding.
Using Table.View
The Table.View function allows a custom connector to override default transformation
handlers for your data source. An implementation of Table.View will provide a function
for one or more of the supported handlers. If a handler is unimplemented, or returns an
error during evaluation, the M engine will fall back to its default handler.
When a custom connector uses a function that doesn't support implicit query folding,
such as Web.Contents, default transformation handlers will always be performed locally.
If the REST API you're connecting to supports query parameters as part of the query,
Table.View lets you add optimizations that allow transformation work to be pushed to
the service.
Power Query M
Your implementation will wrap your main data source function. There are two required
handlers for Table.View:
GetType: returns the expected table type of the query result
GetRows: returns the actual table result of your data source function
Power Query M
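A minimal pass-through sketch, assuming GetEntity is your main data source function. Here it's replaced by a hypothetical stub that returns a fixed table, so the expression can be evaluated on its own; the URL and the row value are illustrative only.

```powerquery
let
    // Hypothetical stand-in for the connector's GetEntity function.
    GetEntity = (url as text, entity as text) as table =>
        #table(
            type table [AirlineCode = text, Name = text],
            {{"AA", "American Airlines"}}
        ),

    // Wraps the data source function without adding any folding handlers.
    SimpleView = (url as text, entity as text) as table =>
        Table.View(null, [
            // Reports the table type by inspecting the rows.
            GetType = () => Value.Type(GetRows()),
            // Passes the call straight through to the data source function.
            GetRows = () => GetEntity(url, entity)
        ])
in
    SimpleView("https://services.odata.org/v4/TripPinService/", "Airlines")
```

Because only the two required handlers are present, every transformation applied to the result is still processed locally by the M engine.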
If you re-run the unit tests, you'll see that the behavior of your function hasn't changed.
In this case your Table.View implementation is simply passing through the call to
GetEntity . Since you haven't implemented any transformation handlers (yet), the
original url parameter remains untouched.
Power Query M
//
// Helper functions
//
// Retrieves the cached schema. If this is the first call
// to CalculateSchema, the table type is calculated based on
// the entity name that was passed into the function.
CalculateSchema = (state) as type =>
    if (state[Schema]? = null) then
        GetSchemaForEntity(entity)
    else
        state[Schema],
If you look at the call to Table.View, you'll see an additional wrapper function around
the handlers record— Diagnostics.WrapHandlers . This helper function is found in the
Diagnostics module (that was introduced in the adding diagnostics lesson), and
provides you with a useful way to automatically trace any errors raised by individual
handlers.
The GetType and GetRows functions have been updated to make use of two new helper
functions— CalculateSchema and CalculateUrl . Right now, the implementations of
those functions are fairly straightforward—you'll notice they contain parts of what was
previously done by the GetEntity function.
Finally, you'll notice that you're defining an internal function ( View ) that accepts a state
parameter. As you implement more handlers, they will recursively call the internal View
function, updating and passing along state as they go.
The manual way to validate folding behavior is to watch the URL requests your unit tests
make using a tool like Fiddler. Alternatively, the diagnostic logging you added to
TripPin.Feed will emit the full URL being run, which should include the OData query
parameters that your handlers add.
An automated way to validate query folding is to force your unit test execution to fail if
a query doesn't fully fold. You can do this by opening the project properties, and setting
Error on Folding Failure to True. With this setting enabled, any query that requires local
processing results in the following error:
We couldn't fold the expression to the data source. Please try a simpler expression.
You can test this out by adding a new Fact to your unit test file that contains one or
more table transformations.
Power Query M
Note
The Error on Folding Failure setting is an "all or nothing" approach. If you want to
test queries that aren't designed to fold as part of your unit tests, you'll need to
add some conditional logic to enable/disable tests accordingly.
The remaining sections of this tutorial will each add a new Table.View handler. You'll be
taking a Test Driven Development (TDD) approach, where you first add failing unit
tests, and then implement the M code to resolve them.
Each handler section below will describe the functionality provided by the handler, the
OData equivalent query syntax, the unit tests, and the implementation. Using the
scaffolding code described above, each handler implementation requires two changes:
Adding the handler to Table.View that will update the state record.
Modifying CalculateUrl to retrieve the values from the state and add to the url
and/or query string parameters.
Power Query M
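A sketch of what two such tests might look like, assuming the Fact helper and the Airlines table from the unit testing lesson are in scope in your .query.pq file. The expected row in the first test is taken from the other tests in this lesson; the second test, which takes zero rows, is a hypothetical addition.

```powerquery
// Query folding tests (fragment: add these entries to your existing list of facts).
Fact("Fold $top 1 on Airlines",
    #table( type table [AirlineCode = text, Name = text], {{"AA", "American Airlines"}} ),
    Table.FirstN(Airlines, 1)
),
// Hypothetical edge case: taking zero rows should return an empty table.
Fact("Fold $top 0 on Airlines",
    #table( type table [AirlineCode = text, Name = text], {} ),
    Table.FirstN(Airlines, 0)
),
```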
These tests both use Table.FirstN to limit the result set to the first X rows.
If you run these tests with Error on Folding Failure set to False (the default), the tests
should succeed, but if you run Fiddler (or check the trace logs), you'll notice that the
request you send doesn't contain any OData query parameters.
If you set Error on Folding Failure to True , the tests will fail with the Please try a
simpler expression. error. To fix this, you'll define your first Table.View handler for
OnTake .
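The handler itself can be quite small. The following sketch is self-contained so that it can be evaluated outside a connector: a hypothetical in-memory table (Source) stands in for the remote service, and GetRows applies the stored Top value locally where a real connector would add $top to the request URL.

```powerquery
let
    // Hypothetical stand-in for the remote service.
    Source = #table(
        type table [AirlineCode = text, Name = text],
        {{"AA", "American Airlines"}, {"TK", "Turkish Airlines"}, {"EK", "Emirates"}}
    ),

    View = (state as record) => Table.View(null, [
        GetType = () => Value.Type(Source),

        // A real connector would build a URL from the state and call the service here.
        GetRows = () =>
            if (state[Top]? <> null) then
                Table.FirstN(Source, state[Top])
            else
                Source,

        // OnTake records the requested row count in the state
        // and returns a new view built from the updated state.
        OnTake = (count as number) =>
            let
                newState = state & [ Top = count ]
            in
                @View(newState)
    ])
in
    Table.FirstN(View([]), 2)
```

The @View reference lets the handler call the function it's defined in, which is how state is carried from one handler to the next.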
The CalculateUrl function is updated to extract the Top value from the state record,
and set the right parameter in the query string.
Power Query M
        encodedQueryString = Uri.BuildQueryString(qsWithTop),
        finalUrl = urlWithEntity & "?" & encodedQueryString
    in
        finalUrl
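For reference, a complete version of such a function could look like the following sketch. It assumes the state record carries Url and Entity fields plus an optional Top field; the sample state at the end is illustrative only.

```powerquery
let
    // Assumption: state has Url and Entity fields, and optionally a Top field.
    CalculateUrl = (state as record) as text =>
        let
            urlWithEntity = Uri.Combine(state[Url], state[Entity]),

            // Start with an empty query string record.
            defaultQueryString = [],

            // Uri.BuildQueryString works with text values,
            // so the numeric Top value is converted first.
            qsWithTop =
                if (state[Top]? <> null) then
                    defaultQueryString & [ #"$top" = Number.ToText(state[Top]) ]
                else
                    defaultQueryString,

            encodedQueryString = Uri.BuildQueryString(qsWithTop),
            finalUrl = urlWithEntity & "?" & encodedQueryString
        in
            finalUrl
in
    CalculateUrl([
        Url = "https://services.odata.org/v4/TripPinService/",
        Entity = "Airlines",
        Top = 1
    ])
```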
Rerunning the unit tests, you'll notice that the URL you're accessing now contains the
$top parameter. (Due to URL encoding, $top appears as %24top.)
Unit tests:
Power Query M
// OnSkip
Fact("Fold $skip 14 on Airlines",
    #table( type table [AirlineCode = text, Name = text] , {{"EK", "Emirates"}} ),
    Table.Skip(Airlines, 14)
),
Fact("Fold $skip 0 and $top 1",
    #table( type table [AirlineCode = text, Name = text] , {{"AA", "American Airlines"}} ),
    Table.FirstN(Table.Skip(Airlines, 0), 1)
),
Implementation:
Power Query M
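A sketch of an OnSkip handler, written as a self-contained expression. As an assumption for illustration, a hypothetical in-memory table (Source) replaces the remote service, and GetRows skips rows locally where a real connector would add $skip to the request URL.

```powerquery
let
    // Hypothetical stand-in for the remote service.
    Source = #table(
        type table [AirlineCode = text, Name = text],
        {{"AA", "American Airlines"}, {"TK", "Turkish Airlines"}, {"EK", "Emirates"}}
    ),

    View = (state as record) => Table.View(null, [
        GetType = () => Value.Type(Source),

        GetRows = () =>
            if (state[Skip]? <> null) then
                Table.Skip(Source, state[Skip])
            else
                Source,

        // OnSkip stores the number of rows to skip (expected to be >= 0)
        // and returns a new view built from the updated state.
        OnSkip = (count as number) => @View(state & [ Skip = count ])
    ])
in
    Table.Skip(View([]), 1)
```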
Power Query M
qsWithSkip =
    if (state[Skip]? <> null) then
        qsWithTop & [ #"$skip" = Number.ToText(state[Skip]) ]
    else
        qsWithTop,
In OData terms, this operation will map to the $select query option.
The advantage of folding column selection becomes apparent when you are dealing
with tables with many columns. The $select operator will remove unselected columns
from the result set, resulting in more efficient queries.
Unit tests:
Power Query M
// OnSelectColumns
Fact("Fold $select single column",
    #table( type table [AirlineCode = text] , {{"AA"}} ),
    Table.FirstN(Table.SelectColumns(Airlines, {"AirlineCode"}), 1)
),
Fact("Fold $select multiple column",
    #table( type table [UserName = text, FirstName = text, LastName = text], {{"russellwhyte", "Russell", "Whyte"}}),
    Table.FirstN(Table.SelectColumns(People, {"UserName", "FirstName", "LastName"}), 1)
),
Fact("Fold $select with ignore column",
    #table( type table [AirlineCode = text] , {{"AA"}} ),
    Table.FirstN(Table.SelectColumns(Airlines, {"AirlineCode", "DoesNotExist"}, MissingField.Ignore), 1)
),
The first two tests select different numbers of columns with Table.SelectColumns, and
include a Table.FirstN call to simplify the test case.
Note
If the test were to simply return the column names (using Table.ColumnNames) and
not any data, the request to the OData service would never actually be sent. This is
because the call to GetType will return the schema, which contains all of the
information the M engine needs to calculate the result.
The third test uses the MissingField.Ignore option, which tells the M engine to ignore
any selected columns that don't exist in the result set. The OnSelectColumns handler
doesn't need to worry about this option—the M engine will handle it automatically (that
is, missing columns won't be included in the columns list).
Power Query M
CalculateUrl is updated to retrieve the list of columns from the state, and combine
them into the value of the $select parameter.
Power Query M
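The two pieces of work involved in folding column selection, narrowing the schema and building the $select value, can be sketched as a self-contained expression. The schema and column list below are hypothetical sample inputs; in a connector they would come from the view's state and from the handler's columns argument.

```powerquery
let
    // Hypothetical current schema and requested column list.
    currentSchema = type table [UserName = text, FirstName = text, LastName = text, Gender = text],
    columns = {"UserName", "FirstName", "LastName"},

    // Read the column definitions from the table type.
    rowRecordType = Type.RecordFields(Type.TableRow(currentSchema)),
    existingColumns = Record.FieldNames(rowRecordType),

    // Drop every column that wasn't selected, then rebuild the table type.
    columnsToRemove = List.Difference(existingColumns, columns),
    updatedColumns = Record.RemoveFields(rowRecordType, columnsToRemove),
    newSchema = type table (Type.ForRecord(updatedColumns, false)),

    // Comma-separated value for the $select query option.
    selectValue = Text.Combine(columns, ",")
in
    [ Schema = newSchema, Select = selectValue ]
```

Storing both values in the state keeps GetType and the generated URL consistent with each other.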
Power Query M
Each record passed to the OnSort handler contains a Name field, indicating the name of
the column, and an Order field that is equal to Order.Ascending or Order.Descending.
In OData terms, this operation will map to the $orderby query option. The $orderby
syntax has the column name followed by asc or desc to indicate ascending or
descending order. When sorting on multiple columns, the values are separated with a
comma. Note that if the columns parameter contains more than one item, it's important
to maintain the order in which they appear.
Unit tests:
Power Query M
// OnSort
Fact("Fold $orderby single column",
    #table( type table [AirlineCode = text, Name = text], {{"TK", "Turkish Airlines"}}),
    Table.FirstN(Table.Sort(Airlines, {{"AirlineCode", Order.Descending}}), 1)
),
Fact("Fold $orderby multiple column",
    #table( type table [UserName = text], {{"javieralfred"}}),
    Table.SelectColumns(Table.FirstN(Table.Sort(People, {{"LastName", Order.Ascending}, {"UserName", Order.Descending}}), 1), {"UserName"})
)
Implementation:
Power Query M
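The core of the handler is the conversion from the list of sort records to $orderby text. The following self-contained sketch shows one way to do that; ToOrderBy is a hypothetical helper name, and in a connector the resulting text would be stored in the state (for example, in an OrderBy field) before the view is rebuilt.

```powerquery
let
    // Converts a list of [Name, Order] records to $orderby text,
    // preserving the order in which the records appear.
    ToOrderBy = (order as list) as text =>
        let
            sorting = List.Transform(order, (o) =>
                let
                    column = o[Name],
                    orderText = if (o[Order] = Order.Ascending) then "asc" else "desc"
                in
                    column & " " & orderText
            )
        in
            Text.Combine(sorting, ", "),

    example = ToOrderBy({
        [ Name = "LastName", Order = Order.Ascending ],
        [ Name = "UserName", Order = Order.Descending ]
    })
in
    example  // "LastName asc, UserName desc"
```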
Updates to CalculateUrl :
Power Query M
qsWithOrderBy =
    if (state[OrderBy]? <> null) then
        qsWithSelect & [ #"$orderby" = state[OrderBy] ]
    else
        qsWithSelect,
You have a few options for handling a row count as part of an OData query:
The $count query parameter, which returns the count as a separate field in the
result set.
The /$count path segment, which returns only the total count, as a scalar
value.
The downside to the query parameter approach is that you still need to send the entire
query to the OData service. Since the count comes back inline as part of the result set,
you'll have to process the first page of data from the result set. While this is still more
efficient than reading the entire result set and counting the rows, it's probably still more
work than you want to do.
The advantage of the path segment approach is that you'll only receive a single scalar
value in the result. This makes the entire operation a lot more efficient. However, as
described in the OData specification, the /$count path segment will return an error if
you include other query parameters, such as $top or $skip , which limits its usefulness.
In this tutorial, you'll implement the GetRowCount handler using the path segment
approach. To avoid the errors you'd get if other query parameters are included, you'll
check for other state values, and return an "unimplemented error" ( ... ) if you find any.
Returning any error from a Table.View handler tells the M engine that the operation
can't be folded, and that it should fall back to the default handler instead (which in this case
would be counting the total number of rows).
Power Query M
// GetRowCount
Fact("Fold $count", 15, Table.RowCount(Airlines)),
Since the /$count path segment returns a single value (in text/plain format) rather than
a JSON result set, you'll also have to add a new internal function ( TripPin.Scalar ) for
making the request and handling the result.
Power Query M
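A minimal sketch of such a function follows. It only sets the Accept header; a full connector would probably also merge in its default request headers and log the URL through the diagnostics helpers from the earlier lesson, both of which are omitted here.

```powerquery
// Fragment of the connector's .pq section document.
// Requests a URL and returns the response body as text.
TripPin.Scalar = (url as text) as text =>
    let
        // Ask the service for a plain-text response rather than JSON.
        headers = [ #"Accept" = "text/plain" ],
        response = Web.Contents(url, [ Headers = headers ]),
        toText = Text.FromBinary(response)
    in
        toText;
```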
The implementation will then use this function (if no other query parameters are found
in the state ):
Power Query M
Power Query M
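The two decisions involved, whether the count can be folded and what URL to request, can be sketched as a self-contained expression. CanFoldCount and CountUrl are hypothetical helper names, and the sketch assumes that Url, Entity, and Schema are the only state fields that don't correspond to a query option.

```powerquery
let
    // The count can only be folded when no other query options
    // (such as Top or Skip) have been recorded in the state.
    CanFoldCount = (state as record) as logical =>
        Record.FieldCount(
            Record.RemoveFields(state, {"Url", "Entity", "Schema"}, MissingField.Ignore)
        ) = 0,

    // Appends the /$count path segment after the entity name.
    CountUrl = (state as record) as text =>
        Uri.Combine(state[Url], state[Entity]) & "/$count",

    baseState = [ Url = "https://services.odata.org/v4/TripPinService/", Entity = "Airlines" ]
in
    [
        PlainCount = CanFoldCount(baseState),                  // true
        CountWithTop = CanFoldCount(baseState & [ Top = 3 ]),  // false
        Url = CountUrl(baseState)
    ]
```

In the handler, a false result would lead to the unimplemented error ( ... ), and a true result to requesting the count URL and converting the returned text with Number.FromText.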
To test the fallback case, you'll add another test that forces the error.
First, add a helper method that checks the result of a try operation for a folding error.
Power Query M
// Returns true if there is a folding error, or the original record (for
// logging purposes) if not.
Test.IsFoldingError = (tryResult as record) =>
    if ( tryResult[HasError]? = true and tryResult[Error][Message] = "We couldn't fold the expression to the data source. Please try a simpler expression.") then
        true
    else
        tryResult;
Then add a test that uses both Table.RowCount and Table.FirstN to force the error.
Power Query M
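A sketch of such a test, assuming the Fact helper, the Airlines table, and Test.IsFoldingError are all in scope in your unit test file. The row count of 3 is arbitrary.

```powerquery
// Fragment: add to your existing list of facts.
// Combining a row count with a row limit can't be folded by the
// path segment approach, so a folding error is the expected result.
Fact("Fold $count + $top *error*", true,
    Test.IsFoldingError(try Table.RowCount(Table.FirstN(Airlines, 3)))
),
```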
An important note here is that this test will now return an error if Error on Folding Failure
is set to false, because the Table.RowCount operation will fall back to the local (default)
handler. Running the tests with Error on Folding Failure set to true causes
Table.RowCount to fail, which allows the test to succeed.
Conclusion
Implementing Table.View for your connector adds a significant amount of complexity to
your code. Since the M engine can process all transformations locally, adding
Table.View handlers doesn't enable new scenarios for your users, but will result in more
efficient processing (and potentially, happier users). One of the main advantages of the
Table.View handlers being optional is that it allows you to incrementally add new
functionality without impacting backwards compatibility for your connector.
For most connectors, an important (and basic) handler to implement is OnTake (which
translates to $top in OData), as it limits the number of rows returned. The Power Query
experience will always perform an OnTake of 1000 rows when displaying previews in the
navigator and query editor, so your users might see significant performance
improvements when working with larger data sets.
GitHub Connector Sample
Article • 02/17/2023
The GitHub M extension shows how to add support for an OAuth 2.0 protocol
authentication flow. You can learn more about the specifics of GitHub's authentication
flow on the GitHub Developer site .
Before you get started creating an M extension, you need to register a new app on
GitHub, and replace the client_id and client_secret files with the appropriate values
for your app.
Note about compatibility issues in Visual Studio: The Power Query SDK uses an Internet
Explorer based control to pop up OAuth dialogs. GitHub has deprecated its support for the
version of IE used by this control, which will prevent you from completing the permission
grant for your app if run from within Visual Studio. An alternative is to load the extension
with Power BI Desktop and complete the first OAuth flow there. After your application has
been granted access to your account, subsequent logins will work fine from Visual Studio.
Note
To allow Power BI to obtain and use the access_token, you must specify the redirect
url as https://oauth.powerbi.com/views/oauthredirect.html.
When you specify this URL and GitHub successfully authenticates and grants
permissions, GitHub will redirect to PowerBI's oauthredirect endpoint so that Power BI
can retrieve the access_token and refresh_token.
1. Application name : Enter a name for the application for your M extension.
2. Authorization callback URL : Enter
https://oauth.powerbi.com/views/oauthredirect.html.
3. Scope : In GitHub, set scope to user, repo .
Note
A registered OAuth application is assigned a unique Client ID and Client Secret. The
Client Secret should not be shared. You get the Client ID and Client Secret from the
GitHub application page. Update the files in your Data Connector project with the
Client ID ( client_id file) and Client Secret ( client_secret file).
//
// Data Source definition
//
GithubSample = [
Authentication = [
OAuth = [
StartLogin = StartLogin,
FinishLogin = FinishLogin
]
],
Label = Extension.LoadString("DataSourceLabel")
];
Name | Type | Description
client_id | string | Required. The client ID you received from GitHub when you registered.
redirect_uri | string | The URL in your app where users will be sent after authorization. See details below about redirect urls. For M extensions, the redirect_uri must be "https://oauth.powerbi.com/views/oauthredirect.html".
scope | string | A comma separated list of scopes. If not provided, scope defaults to an empty list of scopes for users that don't have a valid token for the app. For users who do already have a valid token for the app, the user won't be shown the OAuth authorization page with the list of scopes. Instead, this step of the flow will automatically complete with the same scopes that were used last time the user completed the flow.
state | string | An un-guessable random string. It's used to protect against cross-site request forgery attacks.
The following code snippet describes how to implement a StartLogin function to start
the login flow. A StartLogin function takes a resourceUrl , state , and display value. In
the function, create an AuthorizeUrl that concatenates the GitHub authorize URL with
the following parameters:
client_id : You get the client ID after you register your extension with GitHub from
the GitHub application page.
scope : Set scope to " user, repo ". This sets the authorization scope.
If this is the first time the user is logging in with your app (identified by its client_id
value), they'll see a page that asks them to grant access to your app. Subsequent login
attempts will simply ask for their credentials.
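A sketch of a StartLogin function along these lines follows. The client_id value is a placeholder (a real extension would load it from the client_id file), and the window dimensions are hypothetical values chosen for illustration.

```powerquery
let
    // Placeholder: load the real value from your client_id file.
    client_id = "<your client id>",
    redirect_uri = "https://oauth.powerbi.com/views/oauthredirect.html",

    StartLogin = (resourceUrl, state, display) =>
        let
            // Concatenate the GitHub authorize URL with the encoded parameters.
            AuthorizeUrl = "https://github.com/login/oauth/authorize?" & Uri.BuildQueryString([
                client_id = client_id,
                scope = "user, repo",
                state = state,
                redirect_uri = redirect_uri
            ])
        in
            [
                LoginUri = AuthorizeUrl,
                CallbackUri = redirect_uri,
                // Hypothetical dialog size.
                WindowHeight = 720,
                WindowWidth = 1024,
                Context = null
            ]
in
    StartLogin
```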
To get a GitHub access token, you pass the temporary code from the GitHub Authorize
Response. In the TokenMethod function, you formulate a POST request to GitHub's
access_token endpoint ( https://github.com/login/oauth/access_token ). The following
parameters are required for the GitHub endpoint:
Name | Type | Description
client_id | string | Required. The client ID you received from GitHub when you registered.
client_secret | string | Required. The client secret you received from GitHub when you registered.
redirect_uri | string | The URL in your app where users will be sent after authorization. See details below about redirect URLs.
Here are the details of the parameters used for the Web.Contents call.
The JSON response from the service will contain an access_token field. The TokenMethod
method converts the JSON response into an M record using Json.Document, and returns
it to the engine.
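A sketch of a TokenMethod function that follows this description. The client_id and client_secret values are placeholders for what a real extension loads from its client_id and client_secret files; the code parameter is the temporary code from the authorize response.

```powerquery
let
    // Placeholders: load the real values from your client_id and client_secret files.
    client_id = "<your client id>",
    client_secret = "<your client secret>",
    redirect_uri = "https://oauth.powerbi.com/views/oauthredirect.html",

    TokenMethod = (code) =>
        let
            Response = Web.Contents("https://github.com/login/oauth/access_token", [
                // Supplying Content makes Web.Contents send a POST request.
                Content = Text.ToBinary(Uri.BuildQueryString([
                    client_id = client_id,
                    client_secret = client_secret,
                    code = code,
                    redirect_uri = redirect_uri
                ])),
                Headers = [
                    #"Content-type" = "application/x-www-form-urlencoded",
                    // Ask GitHub to return the token as JSON.
                    #"Accept" = "application/json"
                ]
            ]),
            // Convert the JSON response into an M record.
            Parts = Json.Document(Response)
        in
            Parts
in
    TokenMethod
```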
Sample response:
JSON
{
"access_token":"e72e16c7e42f292c6912e7710c838347ae178b4a",
"scope":"user,repo",
"token_type":"bearer"
}
[DataSource.Kind="GithubSample", Publish="GithubSample.UI"]
shared GithubSample.Contents = Value.ReplaceType(Github.Contents, type function (url as Uri.Type) as any);

[DataSource.Kind="GithubSample"]
shared GithubSample.PagedTable = Value.ReplaceType(Github.PagedTable, type function (url as Uri.Type) as nullable table);
The GithubSample.Contents function is also published to the UI (allowing it to appear in
the Get Data dialog). The Value.ReplaceType function is used to set the function
parameter to the Uri.Type ascribed type.
By associating these functions with the GithubSample data source kind, they'll
automatically use the credentials that the user provided. Any M library functions that
have been enabled for extensibility (such as Web.Contents) will automatically inherit
these credentials as well.
For more details on how credentials and authentication work, see Handling
Authentication.
Sample URL
This connector is able to retrieve formatted data from any of the GitHub v3 REST API
endpoints. For example, the query to pull all commits to the Data Connectors repo
would look like this:
GithubSample.Contents("https://api.github.com/repos/microsoft/dataconnectors/commits")
List of Samples
Article • 02/17/2023
We maintain a list of samples on the DataConnectors repo on GitHub. Each of the links
below links to a folder in the sample repository. Generally these folders include a
readme, one or more .pq / .query.pq files, a project file for Visual Studio, and in some
cases icons. To open these files in Visual Studio, make sure you've set up the SDK
properly, and run the .mproj file from the cloned or downloaded folder.
Functionality
Sample | Description | Link
Hello World | This simple sample shows the basic structure of a connector. | GitHub Link
Hello World with Docs | Similar to the Hello World sample, this sample shows how to add documentation to a shared function. | GitHub Link
Navigation Tables | This sample provides two examples of how to create a navigation table for your data connector using the Table.ToNavigationTable function. | GitHub Link
Unit Testing | This sample shows how you can add simple unit testing to your <extension>.query.pq file. | GitHub Link
OAuth
Sample | Description | Link
GitHub | This sample corresponds to the GitHub connector tutorial. | GitHub Link
ODBC
Sample | Description | Link
SQL | This connector sample serves as a template for ODBC connectors. | GitHub Link
Redshift | This connector sample uses the Redshift ODBC driver, and is based on the connector template. | GitHub Link
Hive LLAP | This connector sample uses the Hive ODBC driver, and is based on the connector template. | GitHub Link
Snowflake | This connector sample uses the Snowflake ODBC driver, and is based on the connector template. | GitHub Link
Impala | This connector sample uses the Cloudera Impala ODBC driver, and is based on the connector template. | GitHub Link
Direct Query for SQL | This sample creates an ODBC-based custom connector that enables Direct Query for SQL Server. | GitHub Link
TripPin
Sample | Description | Link
Part 3 | This sample corresponds to TripPin Tutorial Part 3 - Navigation Tables. | GitHub Link
Part 4 | This sample corresponds to TripPin Tutorial Part 4 - Data Source Paths. | GitHub Link
Part 6 | This sample corresponds to TripPin Tutorial Part 6 - Enforcing Schema. | GitHub Link
Part 7 | This sample corresponds to TripPin Tutorial Part 7 - Advanced Schema with M Types. | GitHub Link
Part 8 | This sample corresponds to TripPin Tutorial Part 8 - Adding Diagnostics. | GitHub Link
Part 9 | This sample corresponds to TripPin Tutorial Part 9 - Test Connection. | GitHub Link
Part 10 | This sample corresponds to TripPin Tutorial Part 10 - Basic Query Folding. | GitHub Link
Additional connector functionality
Article • 02/15/2023
Authentication
While implementing authentication is covered in the authentication article, there are
other methods that connector owners might be interested in offering.
Windows authentication
Windows authentication is supported. To enable Windows-based authentication in your
connector, add the following line in the Authentication section of your connector.
Power Query M
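A sketch of that line, shown as it would appear inside the Authentication record of your data source kind. The SupportsAlternateCredentials field is the flag described in the text that follows.

```powerquery
// Fragment of the Authentication record in the connector's data source kind.
Windows = [ SupportsAlternateCredentials = true ]
```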
This change will expose Windows authentication as an option in the Power BI Desktop
authentication experience. The SupportsAlternateCredentials flag will expose the
option to "Connect using alternative credentials". After this flag is enabled, you can
specify explicit Windows account credentials (username and password). You can use this
feature to test impersonation by providing your own account credentials.
Kerberos SSO
Kerberos-based single sign-on is supported in gateway scenarios. The data source must
support Windows authentication. Generally, these scenarios involve Direct Query-based
reports, and a connector based on an ODBC driver. The primary requirements for the
driver are that it can determine Kerberos configuration settings from the current thread
context, and that it supports thread-based user impersonation. The gateway must be
configured to support Kerberos Constrained Delegation (KCD). An example can be
found in the Impala sample connector.
Power BI will send the current user information to the gateway. The gateway will use
Kerberos Constrained Delegation to invoke the query process as the impersonated user.
After making the above changes, the connector owner can test the following scenarios
to validate functionality.
Connector developers can also use this procedure to test their implementation of
Kerberos-based SSO.
1. Set up an on-premises data gateway with single sign-on enabled using instructions
in the Power BI Kerberos SSO documentation article.
2. Validate the setup by testing with SQL Server and Windows accounts. Set up the
SQL Server Kerberos configuration manager. If you can use Kerberos SSO with SQL
Server then your Power BI data gateway is properly set up to enable Kerberos SSO
for other data sources as well.
5. Once you've made the changes to your application, ensure that you can use
impersonation to load and connect to your service through the ODBC driver.
Ensure that data can be retrieved. If you want to use native C or C++ code instead,
you'll need to use LsaLogonUser to retrieve a token with just the username and use
the KERB_S4U_LOGON option.
After this functionality is validated, Microsoft will make a change to thread the UPN
from the Power BI Service down through the gateway. Once at the gateway, it will
essentially act the same way as your test application to retrieve data.
Reach out to your Microsoft contact before starting work to learn how to
request this change.
SAML SSO
SAML-based SSO is often not supported by end data sources and isn't a recommended
approach. If your scenario requires the use of SAML-based SSO, reach out to your
Microsoft contact or visit our documentation to learn more.
For information on how to implement native database query support in your connector,
follow the walkthrough in the Handling native query support article.
Handling authentication
Article • 05/23/2023
Authentication kinds
An extension can support one or more kinds of Authentication. Each authentication kind
is a different type of credential. The authentication UI displayed to end users in Power
Query is driven by the type of credential(s) that an extension supports.
Authentication Kind | Field | Description
OAuth | StartLogin | Function that provides the URL and state information for starting an OAuth flow.
OAuth | Label | (optional) A text value that allows you to override the default label for this AuthenticationKind.
Aad | AuthorizationUri | Text value or function that returns the Azure AD authorization endpoint (example: "https://login.microsoftonline.com/common/oauth2/authorize").
Aad | Resource | Text value or function that returns the Azure AD resource value for your service.
UsernamePassword | UsernameLabel | (optional) A text value to replace the default label for the Username text box on the credentials UI.
UsernamePassword | PasswordLabel | (optional) A text value to replace the default label for the Password text box on the credentials UI.
UsernamePassword | Label | (optional) A text value that allows you to override the default label for this AuthenticationKind.
Windows | UsernameLabel | (optional) A text value to replace the default label for the Username text box on the credentials UI.
Windows | PasswordLabel | (optional) A text value to replace the default label for the Password text box on the credentials UI.
Windows | Label | (optional) A text value that allows you to override the default label for this AuthenticationKind.
Key | KeyLabel | (optional) A text value to replace the default label for the API Key text box on the credentials UI.
Key | Label | (optional) A text value that allows you to override the default label for this AuthenticationKind.
The following sample shows the Authentication record for a connector that supports
OAuth, Key, Windows, Basic (Username and Password), and Anonymous credentials.
Example:
Power Query M
Authentication = [
OAuth = [
StartLogin = StartLogin,
FinishLogin = FinishLogin,
Refresh = Refresh,
Logout = Logout
],
Key = [],
UsernamePassword = [],
Windows = [],
Anonymous = []
]
M data source functions that have been enabled for extensibility automatically inherit
your extension's credential scope. In most cases, you don't need to explicitly access the
current credentials. However, there are exceptions, such as:
Field | Description | Used By
Key | The API key value. Note that the key value is also available in the Password field. By default, the mashup engine inserts this key in an Authorization header as if this value were a basic auth password (with no username). If this behavior isn't what you want, you must specify the ManualCredentials = true option in the options record. | Key
The following code sample accesses the current credential for an API key and uses it to
populate a custom header ( x-APIKey ).
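Example:
Power Query M
// MyConnector.Raw and _url are placeholder names for this example.
MyConnector.Raw = (_url as text) as binary =>
let
    // Read the API key from the credential stored for this data source
    apiKey = Extension.CurrentCredential()[Key],
    headers = [
        #"x-APIKey" = apiKey,
        Accept = "application/vnd.api+json",
        #"Content-Type" = "application/json"
    ],
    // ManualCredentials = true stops the engine from adding its own Authorization header
    request = Web.Contents(_url, [ Headers = headers, ManualCredentials = true ])
in
    request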
There are two sets of OAuth function signatures: the original signature that contains a
minimal number of parameters, and an advanced signature that accepts more
parameters. Most OAuth flows can be implemented using the original signatures. You
can also mix and match signature types in your implementation. The function calls are
matched based on the number of parameters (and their types). The parameter names
aren't taken into consideration.
All signatures accept a clientApplication record value, which is reserved for future
use.
All signatures accept a dataSourcePath (also referred to as resourceUrl in most
samples).
The Refresh function accepts an oldCredential parameter, which is the previous
record returned by your FinishLogin function (or previous call to Refresh ).
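As a rough illustration of the advanced signatures described above, the handlers might be declared as follows. This is a sketch rather than a definitive listing: the parameter names are illustrative (matching is by parameter count and type, not by name), and the bodies are elided with the M not-implemented expression `...`.

```powerquery
// Advanced OAuth signatures (sketch). Bodies are intentionally left unimplemented.
// clientApplication is reserved for future use; dataSourcePath is often called
// resourceUrl in samples.
StartLogin = (clientApplication, dataSourcePath, state, display) => ...;
FinishLogin = (clientApplication, dataSourcePath, context, callbackUri, state) => ...;
// oldCredential is the record previously returned by FinishLogin or Refresh
Refresh = (clientApplication, dataSourcePath, oldCredential) => ...;
Logout = (clientApplication, dataSourcePath, accessToken) => ...;
```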
Note
If you implement your own OAuth flow for Azure AD, users who have enabled
Conditional Access for their tenant might encounter issues when refreshing using
the Power BI service. This won't impact gateway-based refresh, but would impact a
certified connector that supports refresh from the Power BI service. Users might run
into a problem stemming from the connector using a public client application
when configuring web-based credentials through the Power BI service. The access
token generated by this flow will ultimately be used on a different computer (that
is, the Power BI service in an Azure data center, not on the company's network)
than the one used to originally authenticate (that is, the computer of the user who
configures the data source credentials on the company's network). The built-in Aad
type works around this problem by using a different Azure AD client when
configuring credentials in the Power BI service. This option won't be available to
connectors that use the OAuth authentication kind.
Most connectors need to provide values for the AuthorizationUri and Resource fields.
Both fields can be text values, or a single-argument function that returns a text value.
Power Query M
AuthorizationUri = "https://login.microsoftonline.com/common/oauth2/authorize"
Connectors that use a Uri based identifier don't need to provide a Resource value. By
default, the value is equal to the root path of the connector's Uri parameter. If the data
source's Azure AD resource is different than the domain value (for example, it uses a
GUID), then a Resource value needs to be provided.
Power Query M
Authentication = [
Aad = [
AuthorizationUri = "https://login.microsoftonline.com/common/oauth2/authorize",
// Azure AD resource value for your service - Guid or URL
Resource = "77256ee0-fe79-11ea-adc1-0242ac120002"
]
]
In the following case, the data source supports tenant discovery based on OpenID
Connect (OIDC) or similar protocol. This ability allows the connector to determine the
correct Azure AD endpoint to use based on one or more parameters in the data source
path. This dynamic discovery approach allows the connector to support Azure B2B.
Power Query M
// Implement this function to retrieve or calculate the service URL based on
// the data source path parameters
GetServiceRootFromDataSourcePath = (dataSourcePath) as text => ...;
                    Text.Replace(Text.Trim(Text.Trim(Text.AfterDelimiter(authorizationUri, "=")), ","), """", "")
                else
                    error Error.Record("DataSource.Error", "Unexpected WWW-Authenticate header format or value during authentication.", [
                        #"WWW-Authenticate" = wwwAuthenticate
                    ])
        else
            error Error.Unexpected("Unexpected response from server during authentication.");
Authentication = [
    Aad = [
        AuthorizationUri = (dataSourcePath) =>
            GetAuthorizationUrlFromWwwAuthenticate(
                GetServiceRootFromDataSourcePath(dataSourcePath)
            ),
        // Azure AD resource value for your service - Guid or URL
        Resource = "https://myAadResourceValue.com"
    ]
]
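The listing above shows only the tail of the helper that the AuthorizationUri function calls. A minimal sketch of how a complete GetAuthorizationUrlFromWwwAuthenticate might look is given below. It assumes the service answers an unauthenticated request with a 302 or 401 status and a WWW-Authenticate header that carries an authorization_uri value; both assumptions should be checked against your service.

```powerquery
// Sketch only: probe the service root without credentials and extract
// authorization_uri from the WWW-Authenticate challenge.
GetAuthorizationUrlFromWwwAuthenticate = (url as text) as text =>
    let
        // Status codes treated as a challenge (assumption)
        responseCodes = {302, 401},
        endpointResponse = Web.Contents(url, [
            ManualCredentials = true,
            ManualStatusHandling = responseCodes
        ]),
        status = Value.Metadata(endpointResponse)[Response.Status]?
    in
        if List.Contains(responseCodes, status) then
            let
                headers = Record.FieldOrDefault(Value.Metadata(endpointResponse), "Headers", []),
                wwwAuthenticate = Record.FieldOrDefault(headers, "WWW-Authenticate", ""),
                split = Text.Split(Text.Trim(wwwAuthenticate), " "),
                authorizationUri = List.First(List.Select(split, each Text.Contains(_, "authorization_uri=")), null)
            in
                if authorizationUri <> null then
                    // Strip the trailing comma and the surrounding double quotes
                    Text.Replace(Text.Trim(Text.Trim(Text.AfterDelimiter(authorizationUri, "=")), ","), """", "")
                else
                    error Error.Record("DataSource.Error", "Unexpected WWW-Authenticate header format or value during authentication.", [
                        #"WWW-Authenticate" = wwwAuthenticate
                    ])
        else
            error Error.Unexpected("Unexpected response from server during authentication.");
```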
Web.Contents
OData.Feed
Odbc.DataSource
AdoDotNet.DataSource
OleDb.DataSource
Example:
[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents = (optional message as text) =>
let
message = if (message <> null) then message else "Hello world"
in
message;
Each function associated with the same data source must have the same required
function parameters, including name, type, and order. (For purposes of Data Source
Kind, a parameter is not considered required if it is marked optional or if its metadata
contains DataSource.Path = false .)
Functions for a specific Data Source Kind can only use credentials associated with that
Kind. Credentials are identified at runtime by performing a lookup based on the
combination of the function's required parameters. For more information about how
credentials are identified, see Data Source Paths.
Example:
HelloWorld = [
Authentication = [
Implicit = []
],
Label = Extension.LoadString("DataSourceLabel")
];
Properties
The following table lists the fields for your Data Source definition record.
Field | Type | Details
Label | text | (optional) Friendly display name for this extension in credential dialogs.
SupportsEncryption | logical | (optional) When true, the UI will present the option to connect to the data source using an encrypted connection. This is typically used for data sources with a non-encrypted fallback mechanism (generally ODBC or ADO.NET based sources).
Publish to UI
Similar to the Data Source definition record, the Publish record provides the Power
Query UI the information it needs to expose this extension in the Get Data dialog.
Example:
HelloWorld.Publish = [
Beta = true,
ButtonText = { Extension.LoadString("FormulaTitle"),
Extension.LoadString("FormulaHelp") },
SourceImage = HelloWorld.Icons,
SourceTypeImage = HelloWorld.Icons
];
HelloWorld.Icons = [
Icon16 = { Extension.Contents("HelloWorld16.png"),
Extension.Contents("HelloWorld20.png"),
Extension.Contents("HelloWorld24.png"),
Extension.Contents("HelloWorld32.png") },
Icon32 = { Extension.Contents("HelloWorld32.png"),
Extension.Contents("HelloWorld40.png"),
Extension.Contents("HelloWorld48.png"),
Extension.Contents("HelloWorld64.png") }
];
Properties
The following table lists the fields for your Publish record.
Field | Type | Details
ButtonText | list | List of text items. The first item defines the name displayed next to the data source's icon in the Power BI Get Data dialog. The second item (optional) defines the tooltip that is displayed when the user hovers over that name.
Category | text | Where the extension should be displayed in the Get Data dialog. Currently the only category values with special handling are Azure and Database. All other values will end up under the Other category.
Beta | logical | (optional) When set to true, the UI will display a Preview/Beta identifier next to your connector name and a warning dialog that the implementation of the connector is subject to breaking changes.
LearnMoreUrl | text | (optional) URL of a website containing more information about this data source or connector.
Using M's built-in Odbc.DataSource function is the recommended way to create custom
connectors for data sources that have an existing ODBC driver and/or support a SQL
query syntax. Wrapping the Odbc.DataSource function allows your connector to inherit
default query folding behavior based on the capabilities reported by your driver. This
will enable the M engine to generate SQL statements based on filters and other
transformations defined by the user within the Power Query experience, without having
to provide this logic within the connector itself.
Note
Enabling DirectQuery support raises the difficulty and complexity level of your
connector. When DirectQuery is enabled, Power BI prevents the M engine from
compensating for operations that can't be fully pushed to the underlying data
source.
This article assumes familiarity with the creation of a basic custom connector.
Refer to the SqlODBC sample for most of the code examples in the following sections.
Other samples can be found in the ODBC samples directory.
The Odbc.DataSource function provides a default navigation table with all databases,
tables, and views from your system. This function also supports query folding, and
allows for a range of customization options. Most ODBC-based extensions use this
function as their primary extensibility function. The function accepts two arguments—a
connection string, and an options record to provide behavior overrides.
The Odbc.Query function allows you to execute SQL statements through an ODBC
driver. It acts as a passthrough for query execution. Unlike the Odbc.DataSource
function, it doesn't provide query folding functionality, and requires that SQL queries be
provided by the connector (or end user). When building a custom connector, this
function is typically used internally to run queries to retrieve metadata that might not be
exposed through regular ODBC channels. The function accepts two arguments—a
connection string, and a SQL query.
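The difference between the two functions can be sketched as follows. The driver name, server, and SQL statement are placeholders chosen for illustration, not values from any particular connector.

```powerquery
let
    // Placeholder connection string record; driver and server are hypothetical
    connectionString = [Driver = "My ODBC Driver", Server = "localhost"],
    // Navigation table with query folding; the second argument carries behavior overrides
    navigation = Odbc.DataSource(connectionString, [HierarchicalNavigation = true]),
    // Pass-through execution with no folding; the SQL text is supplied by the connector
    probe = Odbc.Query(connectionString, "SELECT 1 AS Probe")
in
    [Navigation = navigation, Probe = probe]
```

In a connector, the Odbc.DataSource result is typically what the data source function returns, while Odbc.Query calls stay internal.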
Although you can define parameters with a fixed number of values (that is, a
dropdown list in the UI), parameters are entered before the user is authenticated.
Any values that can be discovered programmatically after the user is authenticated
(such as catalog or database name) should be selectable through the Navigator.
The default behavior for the Odbc.DataSource function is to return a hierarchical
navigation table consisting of Catalog (Database), Schema, and Table names.
However, this behavior can be overridden within your connector.
If you feel your users will typically know what values to enter for items they would
select from the Navigator (such as the database name), make these parameters
optional. Parameters that can be discovered programmatically shouldn't be made
required.
The last parameter for your function should be an optional record called "options".
This parameter typically allows advanced users to set common ODBC-related
properties (such as CommandTimeout ), set behavior overrides specific to your
connector, and allows for future extensibility without impacting backwards
compatibility for your function.
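A data source function that follows these guidelines might be shaped like the sketch below. The connector name, the driver name, and the decision to forward only CommandTimeout are assumptions made for illustration.

```powerquery
[DataSource.Kind = "MyOdbcConnector"]
shared MyOdbcConnector.Contents = (server as text, optional database as text, optional options as record) =>
    let
        // Only the required parameter (server) contributes to the data source path
        baseConnection = [Driver = "My ODBC Driver", Server = server],
        // database is optional because it can also be picked in the Navigator
        connectionString =
            if database <> null then baseConnection & [Database = database] else baseConnection,
        // Treat a missing options record as empty, then forward the one option this sketch supports
        safeOptions = if options <> null then options else [],
        commandTimeout = Record.FieldOrDefault(safeOptions, "CommandTimeout", null),
        odbcOptions = if commandTimeout <> null then [CommandTimeout = commandTimeout] else []
    in
        Odbc.DataSource(connectionString, odbcOptions);
```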
By default, all required parameters for your data source function are factored into the
Data Source Path value used to identify user credentials.
While the UI for the built-in Odbc.DataSource function provides a dropdown that allows
the user to select a DSN, this functionality isn't available through extensibility. If your
data source configuration is complex enough to require a fully customizable
configuration dialog, we recommend that you require your end users to preconfigure
a system DSN, and have your function take in the DSN name as a text field.
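A DSN-based function can be very small. The sketch below assumes a hypothetical connector kind named MyDsnConnector and passes the DSN name straight through in the connection string record.

```powerquery
[DataSource.Kind = "MyDsnConnector"]
shared MyDsnConnector.Contents = (dsn as text) =>
    // The DSN name is the only required parameter, so it alone forms the data source path
    Odbc.DataSource([DSN = dsn]);
```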
Next steps
Parameters for Odbc.DataSource
Parameters for Odbc.DataSource
Article • 09/01/2022
The supported options record fields fall into two categories: those that are public and
always available, and those that are only available in an extensibility context.
The following table describes the public fields in the options record.
Field Description
CommandTimeout A duration value that controls how long the server-side query is
allowed to run before it's canceled.
Default: 10 minutes
ConnectionTimeout A duration value that controls how long to wait before abandoning
an attempt to make a connection to the server.
Default: 15 seconds
Default: true
HierarchicalNavigation A logical value that sets whether to view the tables grouped by
their schema names. When set to false, tables are displayed in a
flat list under each database.
Default: false
Default: true
The following table describes the options record fields that are only available through
extensibility. Fields that aren't simple literal values are described in later sections.
Field Description
CancelQueryExplicitly A logical value that instructs the M engine to explicitly cancel any
running calls through the ODBC driver before terminating the
connection to the ODBC server.
Default: false
ClientConnectionPooling A logical value that enables client-side connection pooling for the
ODBC driver. Most drivers will want to set this value to true.
Default: false
HideNativeQuery A logical value that controls whether or not the connector shows
generated SQL statements in the Power Query user experience.
This should only be set to true if the back end data source natively
supports SQL-92.
Default: false
Default: false
SQLGetTypeInfo A table or function that returns a table that overrides the type
information returned by SQLGetTypeInfo .
When the value is set to a table, the value completely replaces the
type information reported by the driver. SQLGetTypeInfo won't be
called.
When the value is set to a function, your function will receive the
result of the original call to SQLGetTypeInfo , allowing you to
modify the table.
SQLTables A function that allows you to modify the table metadata returned
by a call to SQLTables .
Default: false
UseEmbeddedDriver (internal use): A logical value that controls whether the ODBC
driver should be loaded from a local directory (using new
functionality defined in the ODBC 4.0 specification). This value is
generally only set by connectors created by Microsoft that ship
with Power Query.
Default: false
Overriding AstVisitor
The AstVisitor field is set through the Odbc.DataSource options record. It's used to
modify SQL statements generated for specific query scenarios.
Note
Drivers that support LIMIT and OFFSET clauses (rather than TOP) will want to
provide a LimitClause override for AstVisitor .
Constant
Providing an override for this value has been deprecated and may be removed from
future implementations.
LimitClause
This field is a function that receives two Int64.Type arguments ( skip , take ), and returns
a record with two text fields ( Text , Location ).
The skip parameter is the number of rows to skip (that is, the argument to OFFSET). If
an offset isn't specified, the skip value will be null. If your driver supports LIMIT, but
doesn't support OFFSET, the LimitClause function should return an unimplemented
error (...) when skip is greater than 0.
The take parameter is the number of rows to take (that is, the argument to LIMIT).
The Text field of the result contains the SQL text to add to the generated query.
The Location field specifies where to insert the clause. The following table describes
supported values.
LIMIT 5
BeforeQuerySpecification: LIMIT clause is put before the generated SQL statement.
LIMIT 5 ROWS
SELECT a, b, c
FROM table
WHERE a > 10
AfterSelect: LIMIT goes after the SELECT statement, and after any modifiers (such as DISTINCT).
SELECT DISTINCT LIMIT 5 a, b, c
FROM table
WHERE a > 10
AfterSelectBeforeModifiers: LIMIT goes after the SELECT statement, but before any modifiers (such as DISTINCT).
SELECT LIMIT 5 DISTINCT a, b, c
FROM table
WHERE a > 10
The following code snippet provides a LimitClause implementation for a driver that
expects a LIMIT clause, with an optional OFFSET, in the following format: [OFFSET
<offset> ROWS] LIMIT <row_count>
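A minimal sketch of such a LimitClause override is shown here. It assumes the clause belongs at the end of the generated statement (the AfterQuerySpecification location) and guards against a null skip value, since skip is null when no offset is requested.

```powerquery
// Sketch of an AstVisitor record; only the LimitClause field is shown.
AstVisitor = [
    LimitClause = (skip, take) =>
        let
            // Emit OFFSET only when a positive offset was requested
            offset = if skip <> null and skip > 0 then Text.Format("OFFSET #{0} ROWS", {skip}) else "",
            limit = if take <> null then Text.Format("LIMIT #{0}", {take}) else ""
        in
            [
                // Trim removes the leading space left behind when there is no OFFSET part
                Text = Text.Trim(Text.Format("#{0} #{1}", {offset, limit})),
                Location = "AfterQuerySpecification"
            ]
]
```

The record is passed to Odbc.DataSource through the AstVisitor field of the options record.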
Overriding SqlCapabilities
Field Details
Default: null
Default: false
SupportsTop A logical value that indicates the driver supports the TOP clause
to limit the number of returned rows.
Default: false
StringLiteralEscapeCharacters A list of text values that specify the character(s) to use when
escaping string literals and LIKE expressions.
Example: {""}
Default: null
SupportsDerivedTable A logical value that indicates the driver supports derived tables
(sub-selects).
SupportsNumericLiterals A logical value that indicates whether the generated SQL should
include numeric literal values. When set to false, numeric
values are always specified using parameter binding.
Default: false
SupportsStringLiterals A logical value that indicates whether the generated SQL should
include string literal values. When set to false, string values are
always specified using parameter binding.
Default: false
SupportsOdbcDateLiterals A logical value that indicates whether the generated SQL should
include date literal values. When set to false, date values are
always specified using parameter binding.
Default: false
SupportsOdbcTimeLiterals A logical value that indicates whether the generated SQL should
include time literal values. When set to false, time values are
always specified using parameter binding.
Default: false
SupportsOdbcTimestampLiterals A logical value that indicates whether the generated SQL should
include timestamp literal values. When set to false, timestamp
values are always specified using parameter binding.
Default: false
Overriding SQLColumns
SQLColumns is a function handler that receives the results of an ODBC call to
SQLColumns. The source parameter contains a table with the data type information. This
override is typically used to fix up data type mismatches between calls to
SQLGetTypeInfo and SQLColumns .
For details of the format of the source table parameter, go to SQLColumns Function.
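As an illustration, a handler that rewrites one reported data type could look like the sketch below. The five-parameter signature is assumed from the SqlODBC sample, and the remapping itself (SQL_WVARCHAR, -9, to SQL_VARCHAR, 12) is a hypothetical fix-up; use whatever mapping your driver actually needs.

```powerquery
// Sketch: source is the table returned by the driver's SQLColumns call.
SQLColumns = (catalogName, schemaName, tableName, columnName, source) =>
    // Replace the DATA_TYPE code -9 (SQL_WVARCHAR) with 12 (SQL_VARCHAR)
    Table.ReplaceValue(source, -9, 12, Replacer.ReplaceValue, {"DATA_TYPE"})
```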
Overriding SQLGetFunctions
This field is used to override SQLGetFunctions values returned by an ODBC driver. It
contains a record whose field names are equal to the FunctionId constants defined for
the ODBC SQLGetFunctions function. Numeric constants for each of these fields can be
found in the ODBC specification .
Field Details
SQL_CONVERT_FUNCTIONS Indicates which function(s) are supported when doing type conversions.
By default, the M Engine attempts to use the CONVERT function. Drivers
that prefer the use of CAST can override this value to report that only
SQL_FN_CVT_CAST (numeric value of 0x2) is supported.
SQL_API_SQLBINDCOL A logical (true/false) value that indicates whether the mashup engine
should use the SQLBindCol API when retrieving data. When set to false,
SQLGetData is used instead.
Default: false
The following code snippet provides an example explicitly telling the M engine to use
CAST rather than CONVERT.
SQLGetFunctions = [
SQL_CONVERT_FUNCTIONS = 0x2 /* SQL_FN_CVT_CAST */
]
Overriding SQLGetInfo
This field is used to override SQLGetInfo values returned by an ODBC driver. It contains
a record whose field names are equal to the InfoType constants defined for the ODBC
SQLGetInfo function. Numeric constants for each of these fields can be found in the
ODBC specification . The full list of InfoTypes that are checked can be found in the
mashup engine trace files.
Field Details
SQL_SQL_CONFORMANCE An integer value that indicates the level of SQL-92 supported by the
driver:
SQL_AF_ALL
SQL_AF_AVG
SQL_AF_COUNT
SQL_AF_DISTINCT
SQL_AF_MAX
SQL_AF_MIN
SQL_AF_SUM
SQL_GROUP_BY An integer value that specifies the relationship between the columns
in the GROUP BY clause and the non-aggregated columns in the
select list:
The following helper function can be used to create bitmask values from a list of integer
values:
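One way to write such a helper is to fold the list with a bitwise OR. The sketch below is self-contained; the sample flag values are arbitrary and only demonstrate the combination.

```powerquery
let
    // Combine a list of integer flag values into a single bitmask
    Flags = (flags as list) as number =>
        List.Accumulate(flags, 0, (mask, flag) => Number.BitwiseOr(mask, flag)),
    // 0x1 | 0x2 | 0x8 = 11
    Example = Flags({0x1, 0x2, 0x8})
in
    Example
```

In a connector, the result would be assigned to a bitmask field of the SQLGetInfo record.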
A fixed table value that contains the same type information as an ODBC call to
SQLGetTypeInfo .
A function that accepts a table argument, and returns a table. The argument
contains the original results of the ODBC call to SQLGetTypeInfo . Your function
implementation can modify or add to this table.
The first approach is used to completely override the values returned by the ODBC
driver. The second approach is used if you want to add to or modify these values.
For details of the format of the types table parameter and expected return value, go to
SQLGetTypeInfo function reference.
SQLGetTypeInfo = #table(
{ "TYPE_NAME", "DATA_TYPE", "COLUMN_SIZE", "LITERAL_PREF",
"LITERAL_SUFFIX", "CREATE_PARAS", "NULLABLE", "CASE_SENSITIVE",
"SEARCHABLE", "UNSIGNED_ATTRIBUTE", "FIXED_PREC_SCALE", "AUTO_UNIQUE_VALUE",
"LOCAL_TYPE_NAME", "MINIMUM_SCALE", "MAXIMUM_SCALE", "SQL_DATA_TYPE",
"SQL_DATETIME_SUB", "NUM_PREC_RADIX", "INTERNAL_PRECISION", "USER_DATA_TYPE"
}, {
The code snippet below shows the definition of a new data source function, creation of
the ConnectionString record, and invocation of the Odbc.DataSource function.
[DataSource.Kind="SqlODBC", Publish="SqlODBC.Publish"]
shared SqlODBC.Contents = (server as text) =>
let
ConnectionString = [
Driver = "SQL Server Native Client 11.0",
Server = server,
MultiSubnetFailover = "Yes",
ApplicationIntent = "ReadOnly",
APP = "PowerBICustomConnector"
],
OdbcDatasource = Odbc.DataSource(ConnectionString)
in
OdbcDatasource;
Next steps
Test and troubleshoot an ODBC-based connector
Test and troubleshoot an ODBC-based
connector
Article • 09/01/2022
While you're building your ODBC-based connector, it's a good idea to occasionally test
and troubleshoot the connector. This section describes how to set up and use some test
and troubleshooting tools.
Here are steps you can take for initial testing in Power BI Desktop:
7. Expressions that fail to fold will result in a warning bar. Note the failure, remove
the step, and move to the next test case. Details about the cause of the failure
should be emitted to the trace logs.
8. Close Power BI Desktop.
9. Copy the trace files to a new directory.
10. Open the trace files in your text editor of choice.
11. Search for OdbcQuery/FoldingWarning entries in the trace files. These entries should
contain more information as to why the query engine believes that query folding
isn't possible for this operation.
Once you have simple queries working, you can then try DirectQuery scenarios (for
example, building reports in the Report Views). The queries generated in DirectQuery
mode are significantly more complex (that is, use of sub-selects, COALESCE statements,
and aggregations).
1. Ensure that your database can support up-conversion to CLOB types when string
concatenation overflow occurs.
2. Set the TolerateConcatOverflow option for Odbc.DataSource to true .
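Step 2 amounts to one extra field in the options record. In this sketch the connection string values are placeholders.

```powerquery
let
    // Placeholder connection string record
    connectionString = [Driver = "My ODBC Driver", Server = "localhost"],
    // Allow folding of string concatenation even when the result may overflow
    source = Odbc.DataSource(connectionString, [TolerateConcatOverflow = true])
in
    source
```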
The M engine identifies a data source using a combination of its Kind and Path. When a
data source is encountered during a query evaluation, the M engine tries to find
matching credentials. If no credentials are found, the engine returns a special error that
results in a credential prompt in Power Query.
The Kind value comes from the Data Source Kind definition.
The Path value is derived from the required parameters of your data source function.
Optional parameters aren't factored into the data source path identifier. As a result, all
data source functions associated with a data source kind must have the same
parameters. There's special handling for functions that have a single parameter of type
Uri.Type . Go to Functions with a Uri parameter for details.
You can see an example of how credentials are stored in the Data source settings dialog
in Power BI Desktop. In this dialog, the Kind is represented by an icon, and the Path
value is displayed as text.
Note
If you change your data source function's required parameters during
development, previously stored credentials no longer work (because the path
values no longer match). You should delete any stored credentials any time you
change your data source function parameters. If incompatible credentials are
found, you might receive an error at runtime.
By default, you can see the actual string value in the Data source settings dialog in
Power BI Desktop, and in the credential prompt. If the Data Source Kind definition has
included a Label value, you'll see the label value instead.
For example, the data source function in the HelloWorldWithDocs sample has the
following signature:
Power Query M
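A signature consistent with the description in this section is sketched here: one required text parameter and one optional parameter. The type of count and the return type are assumptions, and the body is elided with the M not-implemented expression `...`.

```powerquery
// Sketch of the signature only; message is required, count is optional
HelloWorldWithDocs.Contents = (message as text, optional count as number) as table => ...;
```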
The function has a single required parameter ( message ) of type text , and is used to
calculate the data source path. The optional parameter ( count ) is ignored. The path
would be displayed as follows:
Credential prompt
Data source settings UI
When a Label value is defined, the data source path value isn't shown:
Note
We currently recommend that you do not include a Label for your data source if
your function has required parameters, as users won't be able to distinguish
between the different credentials they've entered. We are hoping to improve this in
the future (that is, allowing data connectors to display their own custom data
source paths).
Power Query M
As Uri.Type is an ascribed type rather than a primitive type in the M language, you'll
need to use the Value.ReplaceType function to indicate that your text parameter should
be treated as a Uri.
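The pattern can be sketched as follows. MyConnector and its implementation function are hypothetical names. The implementation takes plain text, and the exported function ascribes Uri.Type to the parameter.

```powerquery
// Implementation works with an ordinary text value
MyConnector.Impl = (url as text) => Web.Contents(url);

[DataSource.Kind = "MyConnector"]
shared MyConnector.Contents = Value.ReplaceType(
    MyConnector.Impl,
    // Ascribe Uri.Type so the engine treats the parameter as a Uri
    type function (url as Uri.Type) as any
);
```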
Power Query M
REST APIs typically have some mechanism to transmit large volumes of records broken
up into pages of results. Power Query has the flexibility to support many different
paging mechanisms. However, since each paging mechanism is different, some amount
of modification of the paging examples is likely to be necessary to fit your situation.
Typical Patterns
The heavy lifting of compiling all page results into a single table is performed by the
Table.GenerateByPage() helper function, which can generally be used with no
modification. The code snippets presented in the Table.GenerateByPage() helper
function section describe how to implement some common paging patterns. Regardless
of pattern, you'll need to understand:
For situations where the data source response isn't presented in a format that Power BI
can consume directly, Power Query can be used to perform a series of transformations.
Static transformations
In most cases, the data is presented in a consistent way by the data source: column
names, data types, and hierarchical structure are consistent for a given endpoint. In this
situation it's appropriate to always apply the same set of transformations to get the data
in a format acceptable to Power BI.
An example of static transformation can be found in the TripPin Part 2 - Data Connector
for a REST Service tutorial when the data source is treated as a standard REST service:
Power Query M
let
Source = TripPin.Feed("https://services.odata.org/v4/TripPinService/Airlines"),
value = Source[value],
toTable = Table.FromList(value, Splitter.SplitByNothing(), null, null,
ExtraValues.Error),
expand = Table.ExpandRecordColumn(toTable, "Column1", {"AirlineCode",
"Name"}, {"AirlineCode", "Name"})
in
expand
It's important to note that a sequence of static transformations of this specificity is
only applicable to a single endpoint. In the example above, this sequence of
transformations will only work if "AirlineCode" and "Name" exist in the REST endpoint
response, since they are hard-coded into the M code. Thus, this sequence of
transformations may not work if you try to hit the /Event endpoint.
This high level of specificity may be necessary for pushing data to a navigation table, but
for more general data access functions it's recommended that you only perform
transformations that are appropriate for all endpoints.
Dynamic Transformations
More complex logic is sometimes needed to convert API responses into stable and
consistent forms appropriate for Power BI data models.
Power Query M
raw = Web.Contents(...),
columns = raw[columns],
columnTitles = List.Transform(columns, each [title]),
columnTitlesWithRowNumber = List.InsertRange(columnTitles, 0,
{"RowNumber"}),
1. First deal with column header information. You can pull the title record of each
column into a List, prepending with a RowNumber column that you know will always
be represented as this first column.
2. Next you can define a function that allows you to parse a row into a List of cell
values. You can again prepend rowNumber information.
3. Apply your RowAsList() function to each of the rows returned in the API response.
4. Convert the List to a table, specifying the column headers.
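The four steps can be tried end to end with the sketch below. It replaces the Web.Contents call with an inline record so that it runs on its own. The shape of that record (columns with a title field, rows with rowNumber and cells) is an assumption based on the fragment above.

```powerquery
let
    // Inline stand-in for a parsed API response (hypothetical data)
    raw = [
        columns = { [title = "Task"], [title = "Owner"] },
        rows = {
            [rowNumber = 1, cells = { [value = "Write spec"], [value = "Ana"] }],
            [rowNumber = 2, cells = { [value = "Review"], [] }]
        }
    ],
    // Step 1: column headers, with RowNumber first
    columnTitles = List.Transform(raw[columns], each [title]),
    columnTitlesWithRowNumber = List.InsertRange(columnTitles, 0, {"RowNumber"}),
    // Step 2: turn one row into a list of cell values; a cell without a value becomes null
    RowAsList = (row) =>
        let
            cellValues = List.Transform(row[cells], each Record.FieldOrDefault(_, "value", null)),
            rowNumberFirst = List.InsertRange(cellValues, 0, {row[rowNumber]})
        in
            rowNumberFirst,
    // Step 3: apply RowAsList to every row
    listOfRows = List.Transform(raw[rows], each RowAsList(_)),
    // Step 4: build the table with the computed headers
    result = Table.FromRows(listOfRows, columnTitlesWithRowNumber)
in
    result
```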
Handling schema
Article • 02/17/2023
Depending on your data source, information about data types and column names may
or may not be provided explicitly. OData REST APIs typically handle this using the
$metadata definition , and the Power Query OData.Feed method automatically handles
parsing this information and applying it to the data returned from an OData source.
Many REST APIs don't have a way to programmatically determine their schema. In these
cases you'll need to include a schema definition in your connector.
Overall, enforcing a schema on the data returned by your connector has multiple
benefits, such as:
Power Query M
let
url = "https://services.odata.org/TripPinWebApiService/Airlines",
source = Json.Document(Web.Contents(url))[value],
asTable = Table.FromRecords(source)
in
asTable
7 Note
TripPin is an OData source, so realistically it would make more sense to simply use
the OData.Feed function's automatic schema handling. In this example you'll be
treating the source as a typical REST API and using Web.Contents to demonstrate
the technique of hardcoding a schema by hand.
You can use the handy Table.Schema function to check the data type of the columns:
Power Query M
let
    url = "https://services.odata.org/TripPinWebApiService/Airlines",
    source = Json.Document(Web.Contents(url))[value],
    asTable = Table.FromRecords(source)
in
    Table.Schema(asTable)
Both AirlineCode and Name are of any type. Table.Schema returns a lot of metadata
about the columns in a table, including names, positions, type information, and many
advanced properties such as Precision, Scale, and MaxLength. For now you should only
concern yourself with the ascribed type ( TypeName ), primitive type ( Kind ), and whether
the column value might be null ( IsNullable ).
Column Details
Name The name of the column. This must match the name in the results returned by the
service.
Type The M data type you're going to set. This can be a primitive type (text, number,
datetime, and so on), or an ascribed type (Int64.Type, Currency.Type, and so on).
The hardcoded schema table for the Airlines table will set its AirlineCode and Name
columns to text and looks like this:
Power Query M
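A schema table matching that description could be written as shown here. This is a sketch built only from the two columns named above; the variable name Airlines is an assumption.

```powerquery
let
    // Two-column schema table: Name holds the column name, Type holds the M type to apply.
    Airlines = #table({"Name", "Type"}, {
        {"AirlineCode", type text},
        {"Name", type text}
    })
in
    Airlines
```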
As you look to some of the other endpoints, consider the following schema tables:
The Airports table has four fields you'll want to keep (including one of type record ):
Power Query M
The People table has seven fields, including list s ( Emails , AddressInfo ), a nullable
column ( Gender ), and a column with an ascribed type ( Concurrency ):
Power Query M
You can put all of these tables into a single master schema table SchemaTable :
Power Query M
Parameter   Type    Description
table       table   The table of data you'll want to enforce your schema on.
schema      table   The schema table to read column info from, with the following type:
                    type table [Name = text, Type = type] .
1. Determine if there are any missing columns from the source table.
2. Determine if there are any extra columns.
3. Ignore structured columns (of type list , record , and table ), and columns set to
type any .
4. Use Table.TransformColumnTypes to set each column type.
5. Reorder columns based on the order they appear in the schema table.
6. Set the type on the table itself using Value.ReplaceType.
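As a rough illustration of steps 1 through 5, the following sketch enforces a schema table on a small in-memory table. EnforceSchema is a hypothetical name and a simplified stand-in, not the helper shipped with the TripPin sample. It assumes missing columns should be added as null and extra columns dropped, and it leaves out step 6 (setting the table type with Value.ReplaceType).

```powerquery
let
    // Hypothetical, simplified helper (not the sample's implementation).
    EnforceSchema = (table as table, schema as table) as table =>
        let
            schemaNames = schema[Name],
            // Step 1: add columns the schema expects but the source lacks, filled with null.
            missingNames = List.RemoveItems(schemaNames, Table.ColumnNames(table)),
            withMissing = List.Accumulate(
                missingNames,
                table,
                (state, name) => Table.AddColumn(state, name, each null)
            ),
            // Step 2: keep only the columns named in the schema.
            selected = Table.SelectColumns(withMissing, schemaNames),
            // Step 3: detect structured types (list, record, table) and type any.
            isStructuredOrAny = (t as type) as logical =>
                let
                    base = Type.NonNullable(t)
                in
                    Type.Is(type any, t)
                        or Type.Is(base, type list)
                        or Type.Is(base, type record)
                        or Type.Is(base, type table),
            primitiveRows = Table.SelectRows(schema, each not isStructuredOrAny([Type])),
            // Step 4: set the type of each remaining column.
            typeTransforms = Table.ToRows(Table.SelectColumns(primitiveRows, {"Name", "Type"})),
            typed = Table.TransformColumnTypes(selected, typeTransforms),
            // Step 5: order the columns as they appear in the schema table.
            reordered = Table.ReorderColumns(typed, schemaNames)
        in
            reordered,
    // Illustrative input: one column is missing from the schema, and column order differs.
    source = Table.FromRecords({[Name = "Contoso Air", AirlineCode = "CA", Extra = 1]}),
    schema = #table({"Name", "Type"}, {{"AirlineCode", type text}, {"Name", type text}})
in
    EnforceSchema(source, schema)
```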
7 Note
The last step to set the table type will remove the need for the Power Query UI to
infer type information when viewing the results in the query editor, which can
sometimes result in a double-call to the API.
Sophisticated approach
The hardcoded implementation discussed above does a good job of making sure that
schemas remain consistent for simple JSON responses, but it's limited to parsing the
first level of the response. Deeply nested data sets would benefit from the following
approach, which takes advantage of M Types.
Here is a quick refresher about types in the M language from the Language Specification:
A type value is a value that classifies other values. A value that is classified by a type
is said to conform to that type. The M type system consists of the following kinds of
types:
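Primitive types, which classify primitive values ( binary , date , datetime ,
datetimezone , duration , list , logical , null , number , record , text , time ,
type ) and also include a number of abstract types ( function , table , any , and
none ).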
Record types, which classify record values based on field names and value
types.
List types, which classify lists using a single item base type.
Function types, which classify function values based on the types of their
parameters and return values.
Table types, which classify table values based on column names, column types,
and keys.
Nullable types, which classify the value null in addition to all the values
classified by a base type.
Type types, which classify values that are types.
Using the raw JSON output you get (and/or by looking up the definitions in the service's
$metadata ), you can define the following record types to represent OData complex
types:
Power Query M
LocationType = type [
    Address = text,
    City = CityType,
    Loc = LocType
];
CityType = type [
    CountryRegion = text,
    Name = text,
    Region = text
];
LocType = type [
    #"type" = text,
    coordinates = {number},
    crs = CrsType
];
CrsType = type [
    #"type" = text,
    properties = record
];
Notice how LocationType references the CityType and LocType to represent its
structured columns.
For the top-level entities that you'll want represented as Tables, you can define table
types:
Power Query M
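For example, a table type for the Airlines entity could look like the following. This is a sketch that reuses the AirlineCode and Name columns described earlier; the name AirlinesType is an assumption. The last step reads the column names back from the type to show that it carries the schema.

```powerquery
let
    // Table type describing the Airlines entity (two text columns).
    AirlinesType = type table [
        AirlineCode = text,
        Name = text
    ],
    // Read the column names back out of the table type.
    columnNames = Record.FieldNames(Type.RecordFields(Type.TableRow(AirlinesType)))
in
    columnNames
```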
You can then update your SchemaTable variable (which you can use as a lookup table for
entity-to-type mappings) to use these new type definitions:
Power Query M
The Table.ChangeType helper function takes an M table type as an argument and will
apply your schema recursively for all nested types. Its signature is:
Power Query M
7 Note
For flexibility, the function can be used on tables as well as lists of records (which is
how tables are represented in a JSON document).
You'll then need to update the connector code to change the schema parameter from a
table to a type , and add a call to Table.ChangeType . Again, the details for doing so are
very implementation-specific and thus not worth going into in detail here. This extended
TripPin connector example demonstrates an end-to-end solution implementing this
more sophisticated approach to handling schema.
Status Code Handling with Web.Contents
Article • 12/21/2022
The Web.Contents function has some built-in functionality for dealing with certain HTTP
status codes. The default behavior can be overridden in your extension using the
ManualStatusHandling field in the options record.
Automatic retry
Web.Contents will automatically retry requests that fail with one of the following status
codes:
Code Status
Requests will be retried up to three times before failing. The engine uses an exponential
back-off algorithm to determine how long to wait until the next retry, unless the
response contains a Retry-after header. When the header is found, the engine will wait
the specified number of seconds before the next retry. The minimum supported wait
time is 0.5 seconds, and the maximum value is 120 seconds.
7 Note
The Retry-after value must be in the delta-seconds format. The HTTP-date format
is currently not supported.
Authentication exceptions
The following status codes will result in a credentials exception, causing an
authentication prompt asking the user to provide credentials (or sign in again in the
case of an expired OAuth token).
Code Status
401 Unauthorized
403 Forbidden
7 Note
Extensions are able to use the ManualStatusHandling option with status codes 401
and 403, which is not something that can be done in Web.Contents calls made
outside of a custom data connector (that is, directly from Power Query).
Redirection
The following status codes will result in an automatic redirect to the URI specified in the
Location header. A missing Location header will result in an error.
Code Status
302 Found
7 Note
Only status code 307 will keep a POST request method. All other redirect status
codes will result in a switch to GET .
Wait-Retry Pattern
Article • 02/17/2023
In some situations, a data source's behavior doesn't match that expected by Power
Query's default HTTP code handling. The examples below show how to work around this
situation.
In this scenario you'll be working with a REST API that occasionally returns a 500 status
code, indicating an internal server error. In these instances, you could wait a few seconds
and retry, potentially a few times before you give up.
ManualStatusHandling
If Web.Contents gets a 500 status code response, it throws a DataSource.Error by
default. You can override this behavior by providing a list of codes as an optional
argument to Web.Contents :
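A minimal sketch of such a call is shown here. The URL is a hypothetical placeholder, and the snippet only demonstrates the shape of the options record and how the status code can be read afterwards.

```powerquery
let
    // Hypothetical endpoint, used only for illustration.
    url = "https://example.com/api/items",
    // Codes listed in ManualStatusHandling are handed back to your code
    // instead of raising DataSource.Error.
    response = Web.Contents(url, [ManualStatusHandling = {500}]),
    // The status code is available in the metadata of the response.
    responseCode = Value.Metadata(response)[Response.Status]
in
    responseCode
```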
By specifying the status codes in this way, Power Query will continue to process the web
response as normal. However, normal response processing is often not appropriate in
these cases. You'll need to understand that an abnormal response code has been
received and perform special logic to handle it. To determine the response code that
was returned from the web service, you can access it from the meta Record that
accompanies the response:
responseCode = Value.Metadata(response)[Response.Status]
Based on whether responseCode is 200 or 500, you can either process the result as
normal, or follow your wait-retry logic that you'll flesh out in the next section.
IsRetry
Power Query has a local cache that stores the results of previous calls to Web.Contents.
When polling the same URL for a new response, or when retrying after an error status,
you'll need to ensure that the query ignores any cached results. You can do this by
including the IsRetry option in the call to the Web.Contents function. In this sample,
we'll set IsRetry to true after the first iteration of the Value.WaitFor loop.
Value.WaitFor
Value.WaitFor() is a standard helper function that can usually be used with no
modification. It works by building a List of retry attempts.
producer Argument
This contains the task to be (possibly) retried. It's represented as a function so that the
iteration number can be used in the producer logic. The expected behavior is that
producer will return null if a retry is determined to be necessary. If anything other than
null is returned by producer , that value is in turn returned by Value.WaitFor .
delay Argument
This contains the logic to execute between retries. It's represented as a function so that
the iteration number can be used in the delay logic. The expected behavior is that
delay returns a Duration.
A maximum number of retries can be set by providing a number to the count argument.
let
    waitForResult = Value.WaitFor(
        (iteration) =>
            let
                result = Web.Contents(url, [ManualStatusHandling = {500}, IsRetry = iteration > 0]),
                status = Value.Metadata(result)[Response.Status],
                actualResult = if status = 500 then null else result
            in
                actualResult,
        (iteration) => #duration(0, 0, 0, Number.Power(2, iteration)),
        5)
in
    if waitForResult = null then
        error "Value.WaitFor() Failed after multiple retry attempts"
    else
        waitForResult
Handling Unit Testing
Article • 02/17/2023
For both simple and complex connectors, adding unit tests is a best practice and highly
recommended.
Unit testing is accomplished in the context of Visual Studio's Power Query SDK. Each
test is defined as a Fact that has a name, an expected value, and an actual value. In
most cases, the "actual value" will be an M expression that tests part of your expression.
Power Query M
section Unittesting;
This unit test code is made up of a number of Facts, and a bunch of common code for
the unit test framework ( ValueToText , Fact , Facts , Facts.Summarize ). The following
code provides an example set of Facts (go to UnitTesting.query.pq for the common
code):
Power Query M
section UnitTestingTests;
shared MyExtension.UnitTest =
[
    // Put any common variables here if you only want them to be evaluated once
Running the sample in Visual Studio will evaluate all of the Facts and give you a visual
summary of the pass rates:
Implementing unit testing early in the connector development process enables you to
follow the principles of test-driven development. Imagine that you need to write a
function called Uri.GetHost that returns only the host data from a URI. You might start
by writing a test case to verify that the function behaves as expected:
Power Query M
An early version of the function might pass some but not all tests:
Power Query M
The final version of the function should pass all unit tests. This also makes it easy to
ensure that future updates to the function do not accidentally remove any of its basic
functionality.
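The test-first workflow described above can be sketched in a single expression. The Fact helper below is a simplified, hypothetical stand-in for the framework's common code, and this version of Uri.GetHost is an illustrative assumption: it always appends the port reported by Uri.Parts.

```powerquery
let
    // Illustrative function under test: scheme, host, and port of a URL.
    Uri.GetHost = (url as text) as text =>
        let
            parts = Uri.Parts(url)
        in
            parts[Scheme] & "://" & parts[Host] & ":" & Text.From(parts[Port]),
    // Hypothetical, simplified Fact helper: records whether expected equals actual.
    Fact = (name as text, expected, actual) as record =>
        [Name = name, Expected = expected, Actual = actual, Passed = (expected = actual)],
    facts = {
        Fact(
            "Default HTTPS port is included",
            "https://bing.com:443",
            Uri.GetHost("https://bing.com/subpath/query?param=1&param2=hello")
        ),
        Fact(
            "Explicit port is preserved",
            "http://example.com:8080",
            Uri.GetHost("http://example.com:8080/path")
        )
    }
in
    // One row per fact, with a Passed column to scan for failures.
    Table.FromRecords(facts)
```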
Helper Functions
Article • 02/17/2023
This topic contains a number of helper functions commonly used in M extensions. These
functions may eventually be moved to the official M library, but for now can be copied
into your extension file code. You shouldn't mark any of these functions as shared
within your extension code.
Navigation Tables
Table.ToNavigationTable
This function adds the table type metadata needed for your extension to return a table
value that Power Query can recognize as a Navigation Tree. See Navigation Tables for
more information.
Power Query M
Table.ToNavigationTable = (
    table as table,
    keyColumns as list,
    nameColumn as text,
    dataColumn as text,
    itemKindColumn as text,
    itemNameColumn as text,
    isLeafColumn as text
) as table =>
    let
        tableType = Value.Type(table),
        newTableType = Type.AddTableKey(tableType, keyColumns, true) meta
        [
            NavigationTable.NameColumn = nameColumn,
            NavigationTable.DataColumn = dataColumn,
            NavigationTable.ItemKindColumn = itemKindColumn,
            Preview.DelayColumn = itemNameColumn,
            NavigationTable.IsLeafColumn = isLeafColumn
        ],
        navigationTable = Value.ReplaceType(table, newTableType)
    in
        navigationTable;
Parameter Details
keyColumns List of column names that act as the primary key for your navigation table.
nameColumn The name of the column that should be used as the display name in the
navigator.
dataColumn The name of the column that contains the Table or Function to display.
itemKindColumn The name of the column to use to determine the type of icon to display.
Valid values for the column are listed in the Handling Navigation article.
itemNameColumn The name of the column to use to determine the type of tooltip to display.
Valid values for the column are Table and Function .
isLeafColumn The name of the column used to determine if this is a leaf node, or if the
node can be expanded to contain another navigation table.
Example usage:
Power Query M
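A call to the helper might look like the following sketch. The helper is repeated inside the let expression only so that the snippet evaluates on its own, and the item names and one-column data tables are illustrative assumptions.

```powerquery
let
    // Copy of the helper shown above, repeated so this snippet is self-contained.
    Table.ToNavigationTable = (
        table as table,
        keyColumns as list,
        nameColumn as text,
        dataColumn as text,
        itemKindColumn as text,
        itemNameColumn as text,
        isLeafColumn as text
    ) as table =>
        let
            tableType = Value.Type(table),
            newTableType = Type.AddTableKey(tableType, keyColumns, true) meta
            [
                NavigationTable.NameColumn = nameColumn,
                NavigationTable.DataColumn = dataColumn,
                NavigationTable.ItemKindColumn = itemKindColumn,
                Preview.DelayColumn = itemNameColumn,
                NavigationTable.IsLeafColumn = isLeafColumn
            ],
            navigationTable = Value.ReplaceType(table, newTableType)
        in
            navigationTable,
    // Illustrative nodes: each row carries its own small data table.
    objects = #table(
        {"Name", "Key", "Data", "ItemKind", "ItemName", "IsLeaf"},
        {
            {"Item1", "item1", #table({"Column1"}, {{"Item1"}}), "Table", "Table", true},
            {"Item2", "item2", #table({"Column1"}, {{"Item2"}}), "Table", "Table", true}
        }
    ),
    navTable = Table.ToNavigationTable(objects, {"Key"}, "Name", "Data", "ItemKind", "ItemName", "IsLeaf")
in
    navTable
```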
URI Manipulation
Uri.FromParts
This function constructs a full URL based on individual fields in the record. It acts as the
reverse of Uri.Parts.
Power Query M
Uri.GetHost
This function returns the scheme, host, and default port (for HTTP/HTTPS) for a given
URL. For example, https://bing.com/subpath/query?param=1&param2=hello would
become https://bing.com:443 .
Power Query M
ValidateUrlScheme
This function checks whether the user entered an HTTPS URL and raises an error if they
didn't. This is required for user-entered URLs for certified connectors.
Power Query M
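One way to write such a check is sketched here. The error message text and the example URL are assumptions. GetContents is a hypothetical data access function, included only to show the helper wrapped around a url parameter.

```powerquery
let
    // Returns the URL unchanged when its scheme is https; otherwise raises an error.
    ValidateUrlScheme = (url as text) as text =>
        if Uri.Parts(url)[Scheme] <> "https" then
            error "Url scheme must be HTTPS"
        else
            url,
    // Hypothetical data access function that validates its url before using it.
    GetContents = (url as text) as binary => Web.Contents(ValidateUrlScheme(url)),
    // Passes validation, so the same URL text is returned.
    checked = ValidateUrlScheme("https://example.com/api")
in
    checked
```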
To apply it, just wrap your url parameter in your data access function.
Power Query M
Retrieving Data
Value.WaitFor
This function is useful when making an asynchronous HTTP request and you need to
poll the server until the request is complete.
Power Query M
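A possible implementation, consistent with the producer , delay, and count arguments described in the Wait-Retry Pattern article, is sketched below. Treat it as an illustration rather than the definitive helper. The usage at the end relies on a made-up producer that returns null for its first two attempts and a zero-length delay so that it evaluates immediately.

```powerquery
let
    Value.WaitFor = (producer as function, interval as function, optional count as number) as any =>
        let
            list = List.Generate(
                // State is {iteration, result}; start at iteration 0 with no result.
                () => {0, null},
                // Keep going while the iteration is set and the retry limit isn't reached.
                (state) => state{0} <> null and (count = null or state{0} < count),
                // Once a non-null result exists, clear the iteration to stop;
                // otherwise wait for the interval and call producer again.
                (state) =>
                    if state{1} <> null then
                        {null, state{1}}
                    else
                        {1 + state{0}, Function.InvokeAfter(() => producer(state{0}), interval(state{0}))},
                // Emit only the result part of each state.
                (state) => state{1}
            )
        in
            List.Last(list),
    // Illustrative producer: returns null (retry) twice, then a value.
    result = Value.WaitFor(
        (iteration) => if iteration < 2 then null else "succeeded on attempt " & Text.From(iteration + 1),
        (iteration) => #duration(0, 0, 0, 0),
        5
    )
in
    result
```

If every attempt returns null, the final list item is also null, which is why callers check the result for null before using it.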
Table.GenerateByPage
This function is used when an API returns data in an incremental/paged format, which is
common for many REST APIs. The getNextPage argument is a function that takes in a
single parameter, which will be the result of the previous call to getNextPage , and should
return a nullable table .
Power Query M
Table.GenerateByPage calls getNextPage repeatedly until it returns null, and collates
all pages into a single table. When the result of the first call to getNextPage is null,
an empty table is returned.
Power Query M
Additional notes:
The getNextPage function will need to retrieve the next page URL (or page number,
or whatever other values are used to implement the paging logic). This is generally
done by adding meta values to the page before returning it.
The columns and table type of the combined table (that is, all pages together) are
derived from the first page of data. The getNextPage function should normalize
each page of data.
The first call to getNextPage receives a null parameter.
getNextPage must return null when there are no pages left.
An example of using this function can be found in the GitHub sample, and the TripPin
paging sample.
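The behavior described in the notes above can be sketched as follows. The body of Table.GenerateByPage shown here is one possible implementation, and GetPage is a hypothetical in-memory stand-in for an API call that stores the page number in the page's metadata.

```powerquery
let
    Table.GenerateByPage = (getNextPage as function) as table =>
        let
            // Call getNextPage until it returns null; the first call receives null.
            listOfPages = List.Generate(
                () => getNextPage(null),
                (lastPage) => lastPage <> null,
                (lastPage) => getNextPage(lastPage)
            ),
            // Put the pages in a one-column table so they can be expanded together.
            tableOfPages = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
            firstRow = tableOfPages{0}?
        in
            if firstRow = null then
                // No pages at all: return an empty table.
                Table.FromRows({})
            else
                // Columns and table type are taken from the first page.
                Value.ReplaceType(
                    Table.ExpandTableColumn(tableOfPages, "Column1", Table.ColumnNames(firstRow[Column1])),
                    Value.Type(firstRow[Column1])
                ),
    // Hypothetical data source with three pages of one row each.
    GetPage = (pageNumber as number) as nullable table =>
        if pageNumber > 3 then
            null
        else
            #table({"Id"}, {{pageNumber * 10}}) meta [Page = pageNumber],
    result = Table.GenerateByPage(
        (previous) =>
            if previous = null then
                GetPage(1)
            else
                GetPage(Value.Metadata(previous)[Page] + 1)
    )
in
    result
```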
Power Query M
SchemaTransformTable
Power Query M
Table.ChangeType
Power Query M
let
    // table should be an actual Table.Type, or a List.Type of Records
    Table.ChangeType = (table, tableType as type) as nullable table =>
        // we only operate on table types
        if (not Type.Is(tableType, type table)) then error "type argument should be a table type" else
        // if we have a null value, just return it
        if (table = null) then table else
        let
            columnsForType = Type.RecordFields(Type.TableRow(tableType)),
            columnsAsTable = Record.ToTable(columnsForType),
            schema = Table.ExpandRecordColumn(columnsAsTable, "Value", {"Type"}, {"Type"}),
            previousMeta = Value.Metadata(tableType),
Power Query will automatically generate an invocation UI for you based on the
arguments for your function. By default, this UI will contain the name of your function,
and an input for each of your parameters.
Similarly, evaluating the name of your function, without specifying parameters, will
display information about it.
You might notice that built-in functions typically provide a better user experience, with
descriptions, tooltips, and even sample values. You can take advantage of this same
mechanism by defining specific meta values on your function type. This topic describes
the meta fields that are used by Power Query, and how you can make use of them in
your extensions.
Function Types
You can provide documentation for your function by defining custom type values. The
process looks like this:
You can find more information about types and metadata values in the M Language
Specification.
Using this approach allows you to supply descriptions and display names for your
function, as well as individual parameters. You can also supply sample values for
parameters, as well as defining a preset list of values (turning the default text box
control into a drop down).
The Power Query experience retrieves documentation from meta values on the type of
your function, using a combination of calls to Value.Type, Type.FunctionParameters, and
Value.Metadata.
Function documentation
The following table lists the Documentation fields that can be set in the metadata for
your function. All fields are optional.
Parameter documentation
The following table lists the Documentation fields that can be set in the metadata for
your function parameters. All fields are optional.
Documentation.AllowedValues list List of valid values for this parameter. Providing this
field will change the input from a textbox to a drop
down list. Note, this doesn't prevent a user from
manually editing the query to supply alternative
values.
Formatting.IsCode boolean Formats the input field for code, commonly with
multi-line inputs. Uses a code-like font rather than
the standard font.
Basic example
The following code snippet (and resulting dialogs) are from the HelloWorldWithDocs
sample.
Power Query M
[DataSource.Kind="HelloWorldWithDocs", Publish="HelloWorldWithDocs.Publish"]
shared HelloWorldWithDocs.Contents = Value.ReplaceType(HelloWorldImpl,
HelloWorldType);
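The two names referenced in that snippet, HelloWorldType and HelloWorldImpl , could be defined along these lines. This is a sketch written as a standalone expression. The captions, descriptions, and default repeat count are illustrative assumptions, not necessarily the values used in the sample.

```powerquery
let
    // Function type carrying documentation metadata for the function and its parameters.
    HelloWorldType = type function (
        message as (type text meta [
            Documentation.FieldCaption = "Message",
            Documentation.FieldDescription = "Text to display",
            Documentation.SampleValues = {"Hello world", "Hola mundo"}
        ]),
        optional count as (type number meta [
            Documentation.FieldCaption = "Count",
            Documentation.FieldDescription = "Number of times to repeat the message",
            Documentation.AllowedValues = {1, 2, 3}
        ]))
        as table meta [
            Documentation.Name = "Hello - Name",
            Documentation.LongDescription = "Hello - Long Description"
        ],
    // Implementation with the same parameter names and kinds as the type above.
    HelloWorldImpl = (message as text, optional count as number) as table =>
        let
            _count = if count <> null then count else 5,
            listOfMessages = List.Repeat({message}, _count),
            table = Table.FromList(listOfMessages, Splitter.SplitByNothing())
        in
            table,
    // Attach the documented type to the implementation.
    documented = Value.ReplaceType(HelloWorldImpl, HelloWorldType)
in
    documented("Hello world", 2)
```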
Function info
Multi-line example
Power Query M
[DataSource.Kind="HelloWorld", Publish="HelloWorld.Publish"]
shared HelloWorld.Contents =
    let
        HelloWorldType = type function (
            message1 as (type text meta [
                Documentation.FieldCaption = "Message 1",
                Documentation.FieldDescription = "Text to display for message 1",
                Documentation.SampleValues = {"Hello world"},
                Formatting.IsMultiLine = true,
                Formatting.IsCode = true
            ]),
            message2 as (type text meta [
                Documentation.FieldCaption = "Message 2",
                Documentation.FieldDescription = "Text to display for message 2",
                Documentation.SampleValues = {"Hola mundo"},
                Formatting.IsMultiLine = true,
                Formatting.IsCode = false
            ])) as text,
        HelloWorldFunction = (message1 as text, message2 as text) as text =>
            message1 & message2
    in
        Value.ReplaceType(HelloWorldFunction, HelloWorldType);
This code (with associated publish information, and so on) results in the following
dialog in Power BI. New lines will be represented in text with '#(lf)', or 'line feed'.
Handling navigation
Article • 02/17/2023
Navigation Tables (or nav tables) are a core part of providing a user-friendly experience
for your connector. The Power Query experience displays them to the user after they've
entered any required parameters for your data source function, and have authenticated
with the data source.
Behind the scenes, a nav table is just a regular M Table value with specific metadata
fields defined on its Type. When your data source function returns a table with these
fields defined, Power Query will display the navigator dialog. You can actually see the
underlying data as a Table value by right-clicking on the root node and selecting Edit.
Table.ToNavigationTable
You can use the Table.ToNavigationTable function to add the table type metadata
needed to create a nav table.
7 Note
You currently need to copy and paste this function into your M extension. In the
future it will likely be moved into the M standard library.
Parameter Details
keyColumns List of column names that act as the primary key for your navigation table.
nameColumn The name of the column that should be used as the display name in the
navigator.
dataColumn The name of the column that contains the Table or Function to display.
itemKindColumn The name of the column to use to determine the type of icon to display. See
below for the list of valid values for the column.
itemNameColumn The name of the column to use to determine the preview behavior. This is
typically set to the same value as itemKind.
isLeafColumn The name of the column used to determine if this is a leaf node, or if the
node can be expanded to contain another navigation table.
Field Parameter
NavigationTable.NameColumn nameColumn
NavigationTable.DataColumn dataColumn
NavigationTable.ItemKindColumn itemKindColumn
NavigationTable.IsLeafColumn isLeafColumn
Preview.DelayColumn itemNameColumn
Feed
Cube
CubeDatabase
CubeView
CubeViewFolder
Database
DatabaseServer
Dimension
Table
Folder
Function
View
Sheet
Subcube
DefinedName
Record
The following screenshot shows the icons for item kinds in Power BI Desktop.
Examples
Power Query M
Power Query M
This code would result in the following Navigator display in Power BI Desktop:
Error handling to ensure a good experience for users that don't have access to
certain endpoints.
Node evaluation is lazy by default; leaf nodes are not evaluated until the parent
node is expanded. Certain implementations of multi-level dynamic nav tables may
result in eager evaluation of the entire tree. Be sure to monitor the number of calls
that Power Query is making as it initially renders the navigation table. For example,
Table.InsertRows is 'lazier' than Table.FromRecords, as it doesn't need to evaluate
its arguments.
Handling Gateway Support
Article • 02/17/2023
Test Connection
Custom Connector support is available in both Personal and Standard modes of the
on-premises data gateway. Both gateway modes support Import. Direct Query is
only supported in Standard mode. OAuth for custom connectors via gateways is
currently supported only for gateway admins, not for other data source users.
To support scheduled refresh through the on-premises data gateway, your connector
must implement a TestConnection handler. The function is called when the user is
configuring credentials for your source, and used to ensure they are valid. The
TestConnection handler is set in the Data Source Kind record, and has the following
signature:
Where dataSourcePath is the Data Source Path value for your function, and the return
value is a list composed of:
The name of the function to call (this function must be marked as #shared , and is
usually your primary data source function).
One or more arguments to pass to your function.
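The shape of the handler and its return value can be sketched with a hypothetical connector. MyConnector , its label, and the example path are assumptions made for illustration. The last step calls the handler by hand to show the list it returns.

```powerquery
let
    // Hypothetical Data Source Kind record.
    MyConnector = [
        // Receives the data source path; returns {function name, arguments...}.
        TestConnection = (dataSourcePath) as list => {"MyConnector.Contents", dataSourcePath},
        Authentication = [
            Anonymous = []
        ],
        Label = "My Connector"
    ],
    // Invoke the handler directly to inspect what the gateway would receive.
    testCall = MyConnector[TestConnection]("https://example.com/api")
in
    testCall
```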
Power Query M
TripPin = [
    TestConnection = (dataSourcePath) => { "TripPin.Contents" },
    Authentication = [
        Anonymous = []
    ],
    Label = "TripPin"
];
Power Query M
GithubSample = [
    TestConnection = (dataSourcePath) => {"GithubSample.Contents", dataSourcePath},
    Authentication = [
        OAuth = [
            StartLogin = StartLogin,
            FinishLogin = FinishLogin,
            Label = Extension.LoadString("AuthenticationLabel")
        ]
    ]
];
DirectSQL = [
    TestConnection = (dataSourcePath) =>
        let
            json = Json.Document(dataSourcePath),
            server = json[server],
            database = json[database]
        in
            { "DirectSQL.Database", server, database },
    Authentication = [
        Windows = [],
        UsernamePassword = []
    ],
    Label = "Direct Query for SQL"
];
Handling Power Query Connector Signing
Article • 12/21/2022
In Power BI, the loading of custom connectors is limited by your choice of security
setting. As a general rule, when the security for loading custom connectors is set to
'Recommended', the custom connectors won't load at all, and you have to lower it to
make them load.
The exception to this is trusted, 'signed connectors'. Signed connectors are a special
format of custom connector, a .pqx instead of .mez file, which has been signed with a
certificate. The signer can provide the user or the user's IT department with a
thumbprint of the signature, which can be put into the registry to securely indicate
trusting a given connector.
The following steps enable you to use a certificate (with an explanation on how to
generate one if you don't have one available) and sign a custom connector with the
'MakePQX' tool.
1. Download MakePQX .
2. Extract the MakePQX folder in the included zip to the target you want.
3. To run it, call MakePQX in the command line. It requires the other libraries in the
folder, so you can't copy just the one executable. Running without any parameters
will return the help information.
Options:
Options Description
Commands:
Command Description
sign Signs an unsigned pqx, or countersigns if pqx is already signed. Use the --replace
option to replace the existing signature.
verify Verify the signature status on a pqx file. Return value will be non-zero if the
signature is invalid.
There are three commands in MakePQX. Use MakePQX [command] --help for more
information about a command.
Pack
The Pack command takes a mez file and packs it into a pqx file, which can be signed.
The pqx file is also able to support some capabilities that will be added in the future.
Options:
Option Description
-t | --target Output file name. Defaults to the same name as the input file.
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX.exe pack -mz "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorld.mez" -t "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx"
Sign
The Sign command signs your pqx file with a certificate, giving it a thumbprint that can
be checked for trust by Power BI clients with the higher security setting. This command
takes a pqx file and returns the same pqx file, signed.
Arguments:
Argument Description
Options:
Option Description
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX sign "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx" --certificate ContosoTestCertificate.pfx --password password
Verify
The Verify command verifies that your module has been properly signed, and shows
the certificate status.
Arguments:
Argument Description
Option Description
Example
C:\Users\cpope\Downloads\MakePQX>MakePQX verify "C:\Users\cpope\OneDrive\Documents\Power BI Desktop\Custom Connectors\HelloWorldSigned.pqx"
{
  "SignatureStatus": "Success",
  "CertificateStatus": [
    {
      "Issuer": "CN=Colin Popell",
      "Thumbprint": "16AF59E4BE5384CD860E230ED4AED474C2A3BC69",
      "Subject": "CN=Colin Popell",
      "NotBefore": "2019-02-14T22:47:42-08:00",
      "NotAfter": "2020-02-14T23:07:42-08:00",
      "Valid": false,
      "Parent": null,
      "Status": "UntrustedRoot"
    }
  ]
}
This article describes how you can enable proxy support in your Power Query custom
connector using the Power Query SDK.
Internet Options
1. From the Windows start menu, search for and open Internet Options.
2. Select the Connections tab.
3. Select LAN settings.
4. In the Proxy server section, configure the proxy server.
The proxy information can be set in the connection string against the connection
parameter, which can differ by connector. Proxy credentials (username and password)
aren't supported. Web.DefaultProxy takes the serverUri/host as its parameter and
returns a record containing the proxy URI in its ProxyUri field. To get the
constituent parts of the URI (for example: scheme, host, port) inside the connector, use
Uri.Parts .
Example usage
Example 1
To use Web.DefaultProxy in the connector code, a boolean type variable can be used to
opt in or out of using this functionality. In this example, Web.DefaultProxy is invoked in
the connector code if the optional boolean parameter UseWebDefaultProxy is set to true
(defaults to false).
Power Query M
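UseWebDefaultProxyOption = options[UseWebDefaultProxy]?,
// Compare with true so that a missing (null) option is treated as false.
ProxyUriRecord = if UseWebDefaultProxyOption = true then Web.DefaultProxy(Host) else null,
// Check for null before counting fields.
ProxyOptions = if ProxyUriRecord <> null and Record.FieldCount(ProxyUriRecord) > 0 then
    [
        Proxy = ProxyUriRecord[ProxyUri]
    ]
else [],
...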
Once the UseWebDefaultProxy is set to true and ProxyUriRecord is fetched, a record can
be created to set the Proxy (configuration parameter supported by the driver, which can
vary) with the ProxyUri field returned by Web.DefaultProxy . It can be named something
like ProxyOptions . This record can then be appended to the base ConnectionString , and
include the proxy details along with it.
Power Query M
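Appending the record can be done with the record merge operator ( & ). In this sketch, the base connection string fields and the proxy value are placeholders chosen for illustration. ProxyOptions stands in for the record built in the previous snippet.

```powerquery
let
    // Hypothetical base connection string record for an ODBC driver.
    BaseConnectionString = [Driver = "Example ODBC Driver", Server = "example-host"],
    // Stand-in for the ProxyOptions record built from Web.DefaultProxy.
    ProxyOptions = [Proxy = "http://proxy.example.com:8080"],
    // Record merge: fields from ProxyOptions are added to the base record.
    ConnectionString = BaseConnectionString & ProxyOptions
in
    ConnectionString
```

When ProxyOptions is an empty record, the merge leaves the base connection string unchanged, so the same expression works whether or not the proxy option is enabled.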
Example 2
If there are multiple configuration parameters used by the driver for setting the proxy
details (like host and port details being handled separately), Uri.Parts can be used.
Power Query M
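UseWebDefaultProxyOption = options[UseWebDefaultProxy]?,
// Compare with true so that a missing (null) option is treated as false.
ProxyRecord = if UseWebDefaultProxyOption = true then Web.DefaultProxy(Host) else null,
// Uri.Parts expects the URI text, so pass the ProxyUri field rather than the whole record.
ProxyUri = if ProxyRecord <> null then ProxyRecord[ProxyUri]? else null,
UriRecord = if ProxyUri <> null then Uri.Parts(ProxyUri) else null,
ProxyOptions = if UriRecord <> null then
    [
        ProxyHost = UriRecord[Scheme] & "://" & UriRecord[Host],
        ProxyPort = UriRecord[Port]
    ]
else [],
...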
Native query support in Power Query custom connectors
Article • 10/02/2023
7 Note
This article covers advanced topics around the implementation of native query
support for custom connectors, as well as query folding on top of them. This article
assumes you already have a working knowledge of these concepts.
To learn more about Power Query custom connectors, go to Power Query SDK
Overview.
In Power Query, you're able to execute custom native queries against your data source
to retrieve the data that you're looking for. You can also enable the capability to
maintain query folding throughout this process and subsequent transformation
processes done inside of Power Query.
The goal of this article is to showcase how you can implement such capability for your
custom connector.
Prerequisites
This article uses, as a starting point, a sample that uses the SQL ODBC driver for its
data source. The implementation of the native query capability is currently only
supported for ODBC connectors that adhere to the SQL-92 standard.
The sample connector uses the SQL Server Native Client 11.0 driver. Make sure that you
have this driver installed to follow along with this tutorial.
You can also view the finished version of the sample connector from the Finish folder
in the GitHub Repository.
SqlCapabilities = Diagnostics.LogValue("SqlCapabilities_Options",
    defaultConfig[SqlCapabilities] & [
        // Place custom overrides here
        // The values below are required for the SQL Native Client ODBC driver, but might
        // not be required for your data source.
        SupportsTop = false,
        SupportsDerivedTable = true,
        Sql92Conformance = 8 /* SQL_SC_SQL92_FULL */,
        GroupByCapabilities = 4 /* SQL_GB_NO_RELATION */,
        FractionalSecondsScale = 3,
        Sql92Translation = "PassThrough"
    ]),
Make sure that this field appears in your connector before moving forward. If not, you'll
face warnings and errors later on when it comes down to using a capability that isn't
supported because it isn't declared by the connector.
Build the connector file (as .mez or .pqx) and load it into Power BI Desktop for manual
testing and to define the target for your native query.
7 Note
For this article, we'll be using the AdventureWorks2019 sample database. But you
can follow along with any SQL Server database of your choice and make the
necessary changes when it comes down to the specifics of the database chosen.
The way native query support will be implemented in this article is that the user will be
requested to enter three values:
Server name
Database name
Native query at the database level
Now inside Power BI Desktop, go to the Get Data experience and find the connector
with the name SqlODBC Sample.
For the connector dialog, enter the parameters for your server and your database name.
Then select OK.
A new navigator window appears. In Navigator, you can view the native navigation
behavior from the SQL driver that displays the hierarchical view of the server and the
databases within it. Right-click the AdventureWorks2019 database, then select
Transform Data.
This selection brings you to the Power Query editor and a preview of what's effectively
the target of your native query since all native queries should run at the database level.
Inspect the formula bar of the last step to better understand how your connector should
navigate to the target of your native queries before executing them. In this case the
formula bar displays the following information:
= Source{[Name="AdventureWorks2019",Kind="Database"]}[Data]
Source is the name of the previous step that, in this case, is simply the published
function of your connector with the parameters passed. The list and the record inside of
it just help navigate a table to a specific row. The row is defined by the criteria from the
record, where the field Name has to be equal to AdventureWorks2019 and the Kind
field has to be equal to Database. Once the row is located, the [Data] outside of the list
{} lets Power Query access the value inside the Data field, which in this case is a table.
You can go back to the previous step (Source) to better understand this navigation.
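The row lookup followed by field access can be tried in isolation. The following is a minimal sketch that uses a hand-built table as a stand-in for the navigation table returned by the connector; the second row and the contents of the Data column are made up for illustration.

```powerquery
let
    // Hypothetical stand-in for the navigation table that the connector returns
    Source = #table(
        {"Name", "Kind", "Data"},
        {
            {"AdventureWorks2019", "Database", #table({"Id"}, {{1}})},
            {"master", "Database", #table({"Id"}, {{2}})}
        }
    ),
    // {[...]} selects the single row that matches the record criteria,
    // and [Data] then reads the Data field of that row
    Target = Source{[Name = "AdventureWorks2019", Kind = "Database"]}[Data]
in
    Target
```

The lookup raises an error if no row, or more than one row, matches the criteria, which is why both Name and Kind are supplied.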
Test native query
With the target now identified, create a custom step after the navigation step by
selecting the fx icon in the formula bar.
Replace the formula inside the formula bar with the following formula, and then select
Enter.
Power Query M
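The formula isn't reproduced in this text. As a sketch only, a custom step of roughly the following shape would exercise the native query path. It assumes that the navigation step is named AdventureWorks2019_Database, and the SQL statement is an illustrative query against the Person.Address table of the sample database.

```powerquery
// Assumption: AdventureWorks2019_Database is the name of the navigation step
// Assumption: the SQL text is an illustrative query, not the one from the original article
= Value.NativeQuery(
    AdventureWorks2019_Database,
    "SELECT TOP (10) * FROM [Person].[Address]"
)
```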
After you apply this change, a warning should appear underneath the formula bar
requesting permission to run the native query against your data source.
Select Edit Permission. A new Native Database Query dialog is displayed that warns
you about the risks of running native queries. In this case, we know that this
SQL statement is safe, so select Run to execute the command.
After you run your query, a preview of your query appears in the Power Query editor.
This preview validates that your connector is capable of running native queries.
Implement native query logic in your connector
With the information gathered from the previous sections, the goal now is to translate
such information into code for your connector.
The way that you can accomplish this translation is by adding a new
NativeQueryProperties record field to your connector's Publish record, which in this
case is the SqlODBC.Publish record. The NativeQueryProperties record plays a crucial
role in defining how the connector will interact with the Value.NativeQuery function.
NavigationSteps
Your navigation steps can be categorized into two groups. The first contains those
values that are entered by the end-user, such as the name of the server or the database,
in this case. The second contains those values that are derived by the specific connector
implementation, such as the name of fields that aren't displayed to the user during the
get data experience. These fields could include Name, Kind, Data, and others depending
on your connector implementation.
For this case, there was only one navigation step that consisted of two fields:
Name: This field is the name of the database that was passed by the end-user. In
this case, it was AdventureWorks2019, but this field should always be passed as-is
from what the end-user entered during the get data experience.
Kind: This field is information that isn't visible to the end-user and is specific to the
connector or driver implementation. In this case, this value identifies what type of
object should be accessed. For this implementation, this field will be a fixed value
that consists of the string Database.
Such information will be translated to the following code. This code should be added as
a new field to your SqlODBC.Publish record.
Power Query M
NativeQueryProperties = [
    NavigationSteps = {
        [
            Indices = {
                [
                    FieldDisplayName = "database",
                    IndexName = "Name"
                ],
                [
                    ConstantValue = "Database",
                    IndexName = "Kind"
                ]
            },
            FieldAccess = "Data"
        ]
    }
]
Important
The names of the fields are case sensitive and must be used as shown in the sample
above. All information passed to the fields, whether ConstantValue, IndexName, or
FieldDisplayName, must be derived from the connector's M code.
For values that will be passed from what the user entered, you can use the pair
FieldDisplayName and IndexName. For values that are fixed or predefined and can't be
passed by the end-user, you can use the pair ConstantValue and IndexName. In this
sense, the NavigationSteps record consists of two fields:
Indices: Defines what fields and what values to use to navigate to the record that
contains the target for the Value.NativeQuery function.
FieldAccess: Defines what field holds the target, which is commonly a table.
DefaultOptions
The DefaultOptions field lets you pass optional parameters to the Value.NativeQuery
function when using the native query capability for your connector.
To preserve query folding after a native query, and assuming that your connector has
query folding capabilities, you can use the following sample code for
EnableFolding = true.
Power Query M
NativeQueryProperties = [
    NavigationSteps = {
        [
            Indices = {
                [
                    FieldDisplayName = "database",
                    IndexName = "Name"
                ],
                [
                    ConstantValue = "Database",
                    IndexName = "Kind"
                ]
            },
            FieldAccess = "Data"
        ]
    },
    DefaultOptions = [
        EnableFolding = true
    ]
]
With these changes in place, build the connector and load it into Power BI Desktop for
testing and validation.
Note
If your connector has query folding capabilities and has explicitly defined
EnableFolding=true as part of the optional record for Value.NativeQuery, then you
can further test your connector in the Power Query editor by checking whether further
transforms fold back to the source.
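One way to run such a check by hand is sketched below. It assumes that the sample connector exposes a data source function named SqlODBC.Contents that takes a server and a database; both the function name and its signature are assumptions, as are the server name, the SQL text, and the filter value. The EnableFolding option is passed in the fourth argument of Value.NativeQuery.

```powerquery
let
    // Assumption: the connector's data source function and its parameters
    Source = SqlODBC.Contents("localhost", "AdventureWorks2019"),
    // Navigate to the database, the target of the native query
    Database = Source{[Name = "AdventureWorks2019", Kind = "Database"]}[Data],
    // Run the native query and ask the engine to keep folding later steps
    Native = Value.NativeQuery(
        Database,
        "SELECT * FROM [Person].[Address]",
        null,
        [EnableFolding = true]
    ),
    // A later transform: if folding works, this filter is sent to the source
    Filtered = Table.SelectRows(Native, each [City] = "Seattle")
in
    Filtered
```

If the filter folds, the statement sent to the source should include the filter condition instead of the rows being filtered locally.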
Versioning
Article • 12/21/2022
Power Query M
[Version = "1.0.0"]
section MyConnector;
The first number is the "major" version, which will indicate breaking changes. This
number should be incremented whenever users will be required to potentially rebuild
reports due to massive connector rearchitecture or removal of features.
The second number is the "minor" version, which indicates addition of functionality.
These will generally not be breaking, but might cause peripheral side effects. This
number should be incremented whenever functionality is added to the connector.
The final number is the "patch" version, which indicates minor tweaks and fixes to
connectors. This is the version that will change the most often, and should be
incremented whenever you release small tweaks of a connector to the public.
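As an illustration of the three parts, the following sketch bumps one part of a version string. BumpVersion is a hypothetical helper, not part of the SDK, and resetting the lower parts to zero on a major or minor bump is a common convention that this article doesn't mandate.

```powerquery
let
    // Hypothetical helper: increments one part of a "major.minor.patch" string
    BumpVersion = (version as text, part as text) as text =>
        let
            parts = List.Transform(Text.Split(version, "."), Number.FromText),
            major = parts{0},
            minor = parts{1},
            patch = parts{2},
            // Assumption: lower parts reset to zero when a higher part changes
            bumped =
                if part = "major" then {major + 1, 0, 0}
                else if part = "minor" then {major, minor + 1, 0}
                else if part = "patch" then {major, minor, patch + 1}
                else error "part must be major, minor, or patch"
        in
            Text.Combine(List.Transform(bumped, Text.From), ".")
in
    {
        BumpVersion("1.0.0", "patch"),  // small fix
        BumpVersion("1.0.1", "minor"),  // added functionality
        BumpVersion("1.1.0", "major")   // breaking change
    }
```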
Power Query Connector Certification
Article • 02/17/2023
Note
This article describes the requirements and process to submit a Power Query
custom connector for certification. Read the entire article closely before starting the
certification process.
Introduction
With the Power Query SDK, everyone is empowered to create a custom Power Query
connector to connect to a data source from Power Query. Currently, custom connectors
are only supported in Power BI datasets (Power BI Desktop and Power BI service), and
require the use of an on-premises data gateway to refresh through Power BI service.
Custom connectors need to be individually distributed by the developer.
Data source owners who develop a custom connector for their data source might want
to distribute their custom connector more broadly to Power Query users. Once a custom
connector has been created, used, and validated by end users, the data source owner
can submit it for Microsoft certification.
Certifying a Power Query custom connector makes the connector available publicly, out-
of-box, within Power BI datasets (Power BI Desktop and Power BI service), Power BI
dataflows, and Power BI datamarts. Certified connectors are supported in PowerBI.com
and all versions of Power BI Premium.
Certified by Microsoft
Distributed by Microsoft
We work with partners to try to make sure that they have support in maintenance, but
customer issues with the connector itself will be directed to the partner developer.
Certified connector and custom connector differences
Certified connectors are bundled out-of-box in Power BI Desktop, and deployed to
Power BI Service, Power BI dataflows, and Power BI datamarts. Custom connectors are
only supported in Power BI datasets and need to be loaded in Power BI Desktop, as
described in Loading your extension in Power BI Desktop. Both certified and custom
connectors can be refreshed through Power BI Desktop, or through Power BI Service by
using an on-premises data gateway, provided that they implement a TestConnection. The
on-premises data gateway is required for custom connectors.
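A TestConnection handler is declared in the connector's data source kind record. The fragment below is a minimal sketch for a hypothetical connector named MyConnector whose data source function, MyConnector.Contents, takes a single required parameter; the names, label, and authentication kind are placeholders.

```powerquery
// Data source kind record for a hypothetical connector (section member fragment)
MyConnector = [
    // Returns the name of the data source function to call, followed by its
    // arguments, so that the gateway can verify the stored credentials
    TestConnection = (dataSourcePath) => { "MyConnector.Contents", dataSourcePath },
    Authentication = [
        UsernamePassword = []
    ],
    Label = "My Connector"
];
```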
Both custom and certified connectors with extra components (for example, ODBC driver)
need the extra component to be installed on the end user machine and require the on-
premises data gateway, unless the extra component is deployed to Power BI cloud.
Currently, we aren't certifying and deploying any new extra components to Power BI
cloud, so the certification of connectors with a dependency on an extra component
won't remove the on-premises data gateway requirement.
From a user's perspective, users need to use the thumbprint from the developer to
securely trust and load the custom connector for use. Alternatively, users can opt to
lower their security settings to allow loading of code not certified by Microsoft or
another developer, but this isn't recommended.
Certification Overview
Prerequisites
To ensure the best experience for our customers, we only consider connectors that meet
a set of prerequisites for certification:
The developer must provide an estimate for usage. We suggest that developers of
connectors for very boutique products use our connector self-signing capabilities
to provide them directly to the customer.
The connector must be already made available to customers directly to fulfill a user
need or business scenario. This can be done using a Private Preview program by
distributing the completed connector directly to end users and organizations
through self-signing. Each user or organization should be able to provide feedback
and validation that there's a business need for the connector and that the
connector is working successfully to fulfill their business requirements.
Technical Review: finalization of the connector files, passing Microsoft review and
certification. This review must occur by the 15th of the month before the targeted
Power BI Desktop release.
For example, for the April Power BI Desktop release, the deadline would be
March 15th.
Due to the complexity of the technical reviews and potential delays, rearchitecture, and
testing issues, we highly recommend submitting early with a long lead time for the
initial release and certification. If you feel like your connector is important to deliver to
a few customers with minimal overhead, we recommend self-signing and providing it
that way.
Certification Requirements
We have a certain set of requirements for certification. We recognize that not every
developer can meet these requirements, and we're hoping to introduce a feature set
that will handle developer needs in short order.
Testing instructions
Provide any documentation on how to use the connector and test its
functionality.
The FunctionName should make sense for the domain (for example "Contents",
"Tables", "Document", "Databases", and so on).
Security
There are specific security considerations that your connector must handle.
If Extension.CurrentCredentials() is used:
Is the usage required? If so, where do the credentials get sent to?
Are the requests guaranteed to be made through HTTPS?
You can use the HTTPS enforcement helper function.
If the credentials are sent using Web.Contents() via GET:
Can it be turned into a POST?
If GET is required, the connector MUST use the CredentialQueryString record
in the Web.Contents() options record to pass in sensitive credentials.
If Expression.Evaluate() is used:
Validate where the expression is coming from and what it is (that is, can
dynamically construct calls to Extension.CurrentCredentials() and so on).
The Expression should not be user provided nor take user input.
The Expression should not be dynamic (that is, retrieved from a web call).
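Two of these checks can be sketched in M. The example assumes a helper named ValidateUrlScheme for the HTTPS check and a hypothetical GetData function whose apiKey argument stands in for a value read from the stored credential; the api_key query parameter name is also made up.

```powerquery
let
    // Rejects any URL whose scheme isn't HTTPS before it's used in a request
    ValidateUrlScheme = (url as text) as text =>
        if Uri.Parts(url)[Scheme] <> "https" then
            error "Url scheme must be HTTPS"
        else
            url,
    // Hypothetical request helper: the sensitive value is passed through
    // CredentialQueryString instead of being concatenated into the URL
    GetData = (url as text, apiKey as text) as binary =>
        Web.Contents(
            ValidateUrlScheme(url),
            [CredentialQueryString = [api_key = apiKey]]
        )
in
    // Only the validation helper is exercised here; GetData is shown for shape
    ValidateUrlScheme("https://example.com/api")
```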
Ensure that your connector is code complete and has been tested in both authoring in
Power BI Desktop, and refreshing and consumption in Power BI Service. Ensure you have
tested full end-to-end refresh in Power BI Service through the use of an on-premises
data gateway.
To get started, complete our registration form, and a Microsoft contact will reach out
to begin the process.
Introduction
This article provides instructions for how to submit your Power Query custom connector
for certification. Don't submit your connector for certification unless you've been
explicitly directed to by your Microsoft contact.
Prerequisites
After you've been approved for certification, ensure that your connector meets the
certification requirements and follows all feature, style, and security guidelines. Then
prepare the submission artifacts.
Initial Submission
1. Navigate to ISV Studio and sign in with your work Microsoft account. Personal
accounts aren't supported in this experience.
2. Select the Connector certification tab on the left to launch the Connector
Certification Portal experience.
5. Upload your .mez file and complete the form with information on your connector.
Submit the form to finish the connector submission process. Once submitted, you
can use the Activity Control experience on the right to communicate with your
Microsoft contact.
6. Read the guidelines for providing documentation for your custom connector.
Create a Markdown (.md) file following the custom connector documentation
guidelines, using examples from existing documentation if needed. This step is
crucial to ensure users know how to use your connector. Once you have the pull
request for the public documentation available, email the pull request link to your
Microsoft contact.
Note that we need you to complete all the steps in order to move forward with
certification. If you would like to add teammates to manage your connector, let your
Microsoft contact know.
After your connector code review is complete, you'll need to submit a demo video to us
outlining the following scenarios:
Updates
Updates to your connector submission can be made at any time, except when your
connector is in the process of production deployment. When you're submitting an
update, ensure that you submit an update to your existing submission, rather than
creating a new submission.
3. For an update to a certified connector, select the link to submit a new version in
the panel on the right, on top of the existing connector versions. For an update to
an existing connector version undergoing certification, select the most recent
connector version and on the bottom left, select the Submit an update button.
4. You can upload a new version of artifacts and complete the submission form again.
5. After submitting the connector form, in the Activity Control chat feature on the
right, submit a short changelog explaining the connector update. This information
should be public and written in a customer-facing way, as it will be included
verbatim in the next Power BI Desktop blog update.
Once you've finished designing your Power Query custom connector, you'll need to
submit an article that provides instructions on how to use your connector for
publication on Microsoft Learn. This article discusses the layout of such an article and
how to format the text of your article.
Article layout
This section describes the general layout of the Power Query connector articles. Your
custom connector article should follow this general layout.
Support note
Right after the title of the article, insert the following note.
Note
The following connector article is provided by <company name>, the owner of this
connector and a member of the Microsoft Power Query Connector Certification
Program. If you have questions regarding the content of this article or have
changes you would like to see made to this article, visit the <company name>
website and use the support channels there.
Summary table
After the support note, provide a summary table that contains the following
information:
Note
Some capabilities may be present in one product but not others due to
deployment schedules and host-specific capabilities.
Prerequisites
If your custom connector requires that other applications be installed on the system
running your connector or requires that a set-up procedure be done before using your
custom connector, you must include a Prerequisites section that describes these
installation and set-up procedures. This section will also include any information about
setting up various versions of your connector (if applicable).
Capabilities supported
This section should contain a list of the capabilities supported by your custom
connector. These capabilities are usually a bulleted list that indicates if the connector
supports Import and DirectQuery modes, and also any advanced options that are
available in the initial dialog box that appears after the user selects your connector in
Get data.
Connection instructions
This section contains the procedures required to connect to data. If your custom
connector is only used in Power Query Desktop, only one procedure is required.
However, if your custom connector is used on both Power Query Desktop and Power
Query Online, you must supply a separate procedure in separate sections for each
instance. That is, if your custom connector is only used by Power Query Desktop, you'll
have one procedure starting with a second order heading and a single step-by-step
procedure. If your custom connector is used by both Power Query Desktop and Power
Query Online, you'll have two procedures. Each procedure starts with a second order
heading, and contains a separate step-by-step procedure under each heading. For
examples of each of these types of procedures, go to Example connector articles.
The procedure is made up of a numbered list that includes each step required to fill in
the information needed to provide a normal connection (not requiring advanced options)
to the data.
Note
All new certified connectors are required to support Power BI dataflows, and must
contain sections for connecting to your data in both Power Query Desktop and
Power Query Online.
Troubleshooting (optional)
If you know of any common errors that may occur with your custom connector, you can
add a troubleshooting section to describe ways to either fix the error, or work around
the error. This section can also include information on any known limitations of your
connector or the retrieval of data. You can also include any known issues with using your
connector to connect to data.
Your article should be made available on GitHub under the connectors folder in the
Power Query documentation repo:
https://github.com/MicrosoftDocs/powerquery-docs/tree/master/powerquery-docs/connectors.
Ensure that you also add a link to your article in the list of connectors, referencing
the correct logo image uploaded to the /connectors/media/index folder. Lastly, ensure
that you add a link to your article in the table of contents file (TOC.yml). Certified
connectors will only be available under Power BI (Datasets).
See our contributor guide on how you can contribute to our repo.
The article should be formatted and submitted as a Markdown file. It should use the
Microsoft style for describing procedures and the UI layout.
Here are some answers to common questions that might occur while developing
custom Power Query connectors.
General
Is it possible to show a warning if...?
Outside of documented warning patterns, we currently don't provide a way to return
out-of-band warnings, such as a large table or large metadata warning.
Troubleshooting
The custom connector I've been developing works fine in Power BI Desktop. But when I try to run it in Power BI service, I can't set credentials or configure the data source. What's wrong?
There could be several reasons why you're seeing this behavior. Some common errors
that might occur while running the connector on Power BI service are:
Before you begin troubleshooting this behavior, first collect a copy of the custom
connector (.pq or .mez file). If you have a .mez file, rename the file to .zip and extract the
.pq file.
1. Open the custom connector file (.pq) in a text editor of your choice.
3. If the connector uses OAuth, check for the state parameter. A common cause of
service-only failures is a missing state parameter in the connector's StartLogin
implementation. This parameter isn't used in Power BI Desktop, but is required in
the Power BI service. The state parameter must be passed into the call to
Uri.BuildQueryString. The following example demonstrates the correct
implementation of state.
Power Query M
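The example code isn't reproduced in this text. As a sketch of the pattern described in step 3, a StartLogin implementation that forwards state might look like the following. The client ID, the authorize endpoint, the redirect URI, the window size, and the remaining query parameters are placeholder assumptions.

```powerquery
let
    // Hypothetical values; a real connector defines its own
    client_id = "my-client-id",
    redirect_uri = "https://oauth.powerbi.com/views/oauthredirect.html",
    authorize_uri = "https://example.com/oauth2/authorize",

    StartLogin = (resourceUrl, state, display) =>
        let
            authorizeUrl = authorize_uri & "?" & Uri.BuildQueryString([
                response_type = "code",
                client_id = client_id,
                redirect_uri = redirect_uri,
                // The state value supplied by the host is forwarded unchanged
                state = state
            ])
        in
            [
                LoginUri = authorizeUrl,
                CallbackUri = redirect_uri,
                WindowHeight = 720,
                WindowWidth = 1024,
                Context = null
            ]
in
    // Builds the login URL for a sample state value
    StartLogin("https://example.com", "abc123", "popup")[LoginUri]
```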
Additional information
The Power Query M function reference includes articles for each of the over 700
functions. These reference articles are auto-generated from in-product help. To learn
more about functions and how they work in an expression, go to Understanding Power
Query M functions.
Functions by category
Accessing data functions
Binary functions
Combiner functions
Comparer functions
Date functions
DateTime functions
DateTimeZone functions
Duration functions
Error handling
Expression functions
Function values
List functions
Lines functions
Logical functions
Number functions
Record functions
Replacer functions
Splitter functions
Table functions
Text functions
Time functions
Type functions
Uri functions
Value functions
Introduction
Lexical Structure
Basic Concepts
Values
Types
Operators
Let
Conditionals
Functions
Error Handling
Sections
Consolidated Grammar
The Power Query M Formula Language is a useful and expressive data mashup
language. But it does have some limitations. For example, there is no strong
enforcement of the type system. In some cases, a more rigorous validation is needed.
Fortunately, M provides a built-in library with support for types to make stronger
validation feasible.
By exploring the M type system more carefully, many of these issues can be clarified,
and developers will be empowered to craft the solutions they need.
Knowledge of predicate calculus and naïve set theory should be adequate to understand
the notation used.
PRELIMINARIES
(1) B := { true; false }
B is the typical set of Boolean values
(3) P := ⟨B, T⟩
P is the set of function parameters. Each one is possibly optional, and has a type.
Parameter names are irrelevant.
(5) P* := ⋃_{0≤i≤∞} P^i
P* is the set of all possible sequences of function parameters, from length 0 on up.
(6) F := ⟨B, N, T⟩
F is the set of all record fields. Each field is possibly optional, has a name, and a type.
(7) F^n := ∏_{0≤i≤n} F
F^n is the set of all sets of n record fields.
(9) C := ⟨N, T⟩
C is the set of column types, for tables. Each column has a name and a type.
M TYPES
(12) T_F := ⟨P, P*⟩
A Function Type consists of a return type, and an ordered list of zero-or-more function
parameters.
(13) T_L := 〖T〗
A List type is indicated by a given type (called the "item type") wrapped in curly braces.
Since curly braces are used in the metalanguage, 〖 〗 brackets are used in this document.
(17) T_T := C*
A Table Type is an ordered sequence of zero-or-more column types, where there are no
name collisions.
(18) T_P := { any; none; null; logical; number; time; date; datetime; datetimezone; duration;
text; binary; type; list; record; table; function; anynonnull }
A Primitive Type is one from this list of M keywords.
(19) T_N := { t_n, u ∈ T | t_n = u+null } = nullable t
Any type can additionally be marked as being nullable, by using the "nullable" keyword.
(20) T := T_F ∪ T_L ∪ T_R ∪ T_T ∪ T_P ∪ T_N
The set of all M types is the union of these six sets of types:
Function Types, List Types, Record Types, Table Types, Primitive Types, and Nullable Types.
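For orientation, each of the six sets has a counterpart in M type syntax. The following sketch builds one example of each; the parameter, field, and column names are arbitrary.

```powerquery
let
    // Function type: parameters (one optional) and a return type
    FunctionType = type function (x as number, optional y as text) as logical,
    // List type: an item type wrapped in curly braces
    ListType = type {number},
    // Record type: named fields, one of them optional
    RecordType = type [Name = text, optional Age = number],
    // Table type: an ordered sequence of named, typed columns
    TableType = type table [Id = number, Label = text],
    // Primitive type: one of the M keywords
    PrimitiveType = type datetimezone,
    // Nullable type: any type marked with the nullable keyword
    NullableType = type nullable number
in
    [
        Function = FunctionType,
        List = ListType,
        Record = RecordType,
        Table = TableType,
        Primitive = PrimitiveType,
        Nullable = NullableType
    ]
```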
FUNCTIONS
One function needs to be defined: NonNullable : T ← T
This function takes a type, and returns a type that is equivalent except it does not
conform with the null value.
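The standard library function Type.NonNullable appears to play this role. A minimal sketch, using Type.IsNullable to observe the effect:

```powerquery
let
    // A nullable type and its non-nullable counterpart
    t = type nullable text,
    nn = Type.NonNullable(t)
in
    [
        // Nullability before and after applying Type.NonNullable
        NullableBefore = Type.IsNullable(t),
        NullableAfter = Type.IsNullable(nn)
    ]
```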
IDENTITIES
Some identities are needed to define some special cases, and may also help elucidate
the above.
TYPE COMPATIBILITY
As defined elsewhere, an M type is compatible with another M type if and only if all
values that conform to the first type also conform to the second type.
This section defines a compatibility relation that does not depend on conforming values
and is based on the properties of the types themselves. It is anticipated that this relation, as
defined in this document, is completely equivalent to the original semantic definition.
(28) t ≤ t
This relation is reflexive.
(29) t_a ≤ t_b ∧ t_b ≤ t_c → t_a ≤ t_c
This relation is transitive.
(32) null ≤ t ∈ T_N
The primitive type null is compatible with all nullable types.
(33) t ∉ T_N ≤ anynonnull
All nonnullable types are compatible with anynonnull.
(34) NonNullable(t) ≤ t
A NonNullable type is compatible with the nullable equivalent.
(35) t ∈ T_F → t ≤ function
All function types are compatible with function.
(36) t ∈ T_L → t ≤ list
All list types are compatible with list.
(37) t ∈ T_R → t ≤ record
All record types are compatible with record.
(38) t ∈ T_T → t ≤ table
All table types are compatible with table.
(39) t_a ≤ t_b ↔ 〖t_a〗 ≤ 〖t_b〗
A list type is compatible with another list type if the item types are compatible, and
vice versa.
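Some of these rules can be probed with the standard library function Type.Is, which reports whether the first type is compatible with the second. The sketch below is an experiment, not a proof: if the library follows the rules above, each field is expected to evaluate to true. The field names are arbitrary.

```powerquery
let
    nullableNumber = type nullable number,
    // Rule (32): null is compatible with a nullable type
    rule32 = Type.Is(type null, nullableNumber),
    // Rule (34): NonNullable(t) is compatible with t
    rule34 = Type.Is(Type.NonNullable(nullableNumber), nullableNumber),
    // Rule (35): a function type is compatible with function
    rule35 = Type.Is(type function (x as number) as number, type function),
    // Rule (36): a list type is compatible with list
    rule36 = Type.Is(type {number}, type list)
in
    [Rule32 = rule32, Rule34 = rule34, Rule35 = rule35, Rule36 = rule36]
```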
REFERENCES
Microsoft Corporation (2015 August)
Microsoft Power Query for Excel Formula Language Specification [PDF]
Retrieved from https://msdn.microsoft.com/library/mt807488.aspx