
Education - Power Bi

This document discusses data analysis and its importance for businesses. It explains that data analysis involves identifying, cleaning, transforming and modeling data to discover useful insights and tell stories with data. It then describes the main categories of data analysis: descriptive, diagnostic, predictive, prescriptive and cognitive. The document provides an example of how a retail business could use descriptive and diagnostic analytics on purchase data to inform product decisions.

Uploaded by

ashwinikr2011
Copyright
© All Rights Reserved

Get started with Microsoft data analytics

Businesses need data analysis more than ever. In this learning path, you will learn
about the life and journey of a data analyst, the skills, tasks, and processes they go
through in order to tell a story with data so trusted business decisions can be made.
You will learn how the suite of Power BI tools and services is used by a data analyst
to tell a compelling story through reports and dashboards, and the need for true BI
in the enterprise.

This learning path can help you prepare for the Microsoft Certified: Data Analyst
Associate certification.

This learning path helps prepare you for Exam PL-300: Microsoft Power BI Data
Analyst.

Introduction

As a data analyst, you are on a journey. Think about all the data that is being
generated each day and that is available in an organization, from transactional data
in a traditional database, telemetry data from services that you use, to signals that
you get from different areas like social media.

For example, today's retail businesses collect and store massive amounts of data that
track the items you browsed and purchased, the pages you've visited on their site,
the aisles you purchase products from, your spending habits, and much more.

With data and information as the most strategic asset of a business, the underlying
challenge that organizations have today is understanding and using their data to
positively affect change within the business. Businesses continue to struggle to use
their data in a meaningful and productive way, which impacts their ability to act.

A retail business should be able to use their vast amounts of data and information in
such a way that impacts the business, including:

 Tracking inventory
 Identifying purchase habits
 Detecting user trends and patterns
 Recommending purchases
 Determining price optimizations
 Identifying and stopping fraud

Additionally, you might be looking for daily or monthly sales patterns. Common
comparisons that you might want to examine include day-over-day, week-over-week,
and month-over-month, so that you can compare current sales with sales in the
previous period or, for example, with the same week last year.
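
The period-over-period comparisons described above can be sketched in a few lines of Python. This is only an illustration, not how Power BI computes them: the `period_over_period` helper and the daily sales figures are made up for the example.

```python
from datetime import date, timedelta

def period_over_period(sales_by_day, day, days_back):
    """Return (current, previous, pct_change) for `day` versus `days_back` days earlier."""
    current = sales_by_day[day]
    previous = sales_by_day.get(day - timedelta(days=days_back))
    if previous is None or previous == 0:
        return current, previous, None  # no usable baseline to compare against
    return current, previous, (current - previous) / previous * 100

# Hypothetical daily sales totals: 100 on Jan 1, rising by 10 each day.
start = date(2024, 1, 1)
sales_by_day = {start + timedelta(days=i): 100 + 10 * i for i in range(14)}

today = date(2024, 1, 14)
day_over_day = period_over_period(sales_by_day, today, 1)    # versus yesterday
week_over_week = period_over_period(sales_by_day, today, 7)  # versus same weekday last week
```

The same helper would cover a same-week-last-year comparison by passing `days_back=364`, assuming the data reaches back that far.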

The key to unlocking this data is being able to tell a story with it. In today's highly
competitive and fast-paced business world, crafting reports that tell that story is
what helps business leaders take action on the data. Business decision makers
depend on an accurate story to drive better business decisions. The faster a business
can make precise decisions, the more competitive they will be and the better
advantage they will have. Without the story, it is difficult to understand what the data
is trying to tell you.

However, having data alone is not enough. You need to be able to act on the data to
affect change within the business. That action could involve reallocating resources
within the business to accommodate a need, or it could be identifying a failing
campaign and knowing when to change course. These situations are where telling a
story with your data is important.

The underlying challenge that businesses face today is understanding and using their
data in such a way that impacts their business and ultimately their bottom line. You
need to be able to look at the data and facilitate trusted business decisions. Then,
you need the ability to look at metrics and clearly understand the meaning behind
those metrics.

This requirement might seem daunting, but it's a task that you can accomplish. Your
first step is to partner with data experts within your organization, such as data
engineers and data scientists, to help get the data that you need to tell that story.
Ask these experts to participate in that data journey with you.

Your journey of telling a story with data also ties into building that data culture
within your organization. While telling the story is important, where that story is told
is also crucial, ensuring that the story is told to the right people. Also, make sure that
people can discover the story, that they know where to find it, and that it is part of
the regular interactions.

Data analysis exists to help overcome these challenges and pain points, ultimately
assisting businesses in finding insights and uncovering hidden value in troves of data
through storytelling. As you read on, you will learn how to use and apply analytical
skills to go beyond a single report and help impact and influence your organization
by telling stories with data and driving that data culture.
Overview of data analysis

Before data can be used to tell a story, it must be run through a process that makes it
usable in the story. Data analysis is the process of identifying, cleaning, transforming,
and modeling data to discover meaningful and useful information. The data is then
crafted into a story through reports for analysis to support the critical decision-
making process.

As the world becomes more data-driven, storytelling through data analysis is
becoming a vital component and aspect of large and small businesses. It is the
reason that organizations continue to hire data analysts.

Data-driven businesses make decisions based on the story that their data tells. Yet in
today's data-driven world, data is often not used to its full potential, a challenge that
most businesses face. Data analysis is, and should be, a critical aspect of all
organizations to help determine the impact to their business, including evaluating
customer sentiment, performing market and product research, and identifying trends
or other data insights.

While the process of data analysis focuses on the tasks of cleaning, modeling, and
visualizing data, the concept of data analysis and its importance to business should
not be understated. To analyze data, core components of analytics are divided into
the following categories:

 Descriptive
 Diagnostic
 Predictive
 Prescriptive
 Cognitive
Descriptive analytics
Descriptive analytics help answer questions about what has happened based on
historical data. Descriptive analytics techniques summarize large datasets to describe
outcomes to stakeholders.

By developing key performance indicators (KPIs), these strategies can help track the
success or failure of key objectives. Metrics such as return on investment (ROI) are
used in many industries, and specialized metrics are developed to track performance
in specific industries.

An example of descriptive analytics is generating reports to provide a view of an
organization's sales and financial data.
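
As a rough illustration of the KPI and ROI metrics mentioned above, the following Python sketch summarizes historical campaign data. The campaign names, figures, revenue target, and the `roi` and `kpi_status` helpers are all hypothetical; ROI is taken here as (gain - cost) / cost.

```python
def roi(gain, cost):
    """Return on investment as a fraction: (gain - cost) / cost."""
    if cost == 0:
        raise ValueError("cost must be non-zero")
    return (gain - cost) / cost

def kpi_status(actual, target):
    """Label a KPI by comparing the actual value with its target."""
    return "on track" if actual >= target else "behind"

# Hypothetical historical campaign results.
campaigns = [
    {"name": "spring", "cost": 2000.0, "revenue": 5000.0},
    {"name": "summer", "cost": 4000.0, "revenue": 3000.0},
]

roi_by_campaign = {c["name"]: roi(c["revenue"], c["cost"]) for c in campaigns}
total_revenue = sum(c["revenue"] for c in campaigns)
revenue_kpi = kpi_status(total_revenue, target=7500.0)  # target is an assumed figure
```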
Diagnostic analytics
Diagnostic analytics help answer questions about why events happened. Diagnostic
analytics techniques supplement basic descriptive analytics, and they use the findings
from descriptive analytics to discover the cause of these events. Then, performance
indicators are further investigated to discover why these events improved or became
worse. Generally, this process occurs in three steps:

1. Identify anomalies in the data. These anomalies might be unexpected changes
in a metric or a particular market.

2. Collect data that's related to these anomalies.

3. Use statistical techniques to discover relationships and trends that explain these
anomalies.
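
The three steps above can be sketched with Python's standard library. This is a minimal illustration under assumed data: the weekly figures are invented, a z-score threshold is only one possible way to flag anomalies, and Pearson correlation stands in for the "statistical techniques" of step 3.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Step 1: return indexes whose z-score magnitude exceeds `threshold`."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

def pearson(xs, ys):
    """Step 3: Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical weekly sales, with one unusual week.
weekly_sales = [100, 102, 98, 101, 99, 100, 180, 101]
# Step 2: related data collected for the same weeks (assumed ad spend).
ad_spend = [10, 11, 9, 10, 10, 10, 40, 10]

anomalies = find_anomalies(weekly_sales)   # indexes of unusual weeks
r = pearson(weekly_sales, ad_spend)        # strength of the linear relationship
```

A high correlation suggests, but does not prove, that the related factor explains the anomaly.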

Predictive analytics
Predictive analytics help answer questions about what will happen in the future.
Predictive analytics techniques use historical data to identify trends and determine if
they're likely to recur. Predictive analytical tools provide valuable insight into what
might happen in the future. Techniques include a variety of statistical and machine
learning techniques such as neural networks, decision trees, and regression.
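
Of the techniques named above, regression is the simplest to sketch. The example below fits an ordinary least-squares line to made-up monthly unit sales and extrapolates one month ahead; the data and the `fit_line` helper are assumptions for illustration, and real forecasts would usually need more data and validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical history: units sold in months 1 through 5.
months = [1, 2, 3, 4, 5]
units = [110, 120, 130, 140, 150]

slope, intercept = fit_line(months, units)
forecast_month_6 = slope * 6 + intercept  # extrapolate the trend one month ahead
```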
Prescriptive analytics
Prescriptive analytics help answer questions about which actions should be taken to
achieve a goal or target. By using insights from prescriptive analytics, organizations
can make data-driven decisions. This technique allows businesses to make informed
decisions in the face of uncertainty. Prescriptive analytics techniques rely on machine
learning as one of the strategies to find patterns in large datasets. By analyzing past
decisions and events, organizations can estimate the likelihood of different
outcomes.
Cognitive analytics
Cognitive analytics attempt to draw inferences from existing data and patterns,
derive conclusions based on existing knowledge bases, and then add these findings
back into the knowledge base for future inferences, a self-learning feedback loop.
Cognitive analytics help you learn what might happen if circumstances change and
determine how you might handle these situations.
Inferences aren't structured queries based on a rules database; rather, they're
unstructured hypotheses that are gathered from several sources and expressed with
varying degrees of confidence. Effective cognitive analytics depend on machine
learning algorithms, and will use several natural language processing concepts to
make sense of previously untapped data sources, such as call center conversation
logs and product reviews.
Example
By enabling reporting and data visualizations, a retail business uses descriptive
analytics to look at patterns of purchases from previous years to determine what
products might be popular next year. The company might also look at supporting
data to understand why a particular product was popular and if that trend is
continuing, which will help them determine whether to continue stocking that
product.
A business might determine that a certain product was popular over a specific
timeframe. Then, they can use this analysis to determine whether certain marketing
efforts or online social activities contributed to the sales increase.
An underlying facet of data analysis is that a business needs to trust its data. As a
practice, the data analysis process will capture data from trusted sources and shape it
into something that is consumable, meaningful, and easily understood to help with
the decision-making process. Data analysis enables businesses to fully understand
their data through data-driven processes and decisions, allowing them to be
confident in their decisions.
As the amount of data grows, so does the need for data analysts. A data analyst
knows how to organize information and distill it into something relevant and
comprehensible. A data analyst knows how to gather the right data and what to do
with it, in other words, making sense of the data in your data overload.
Roles in data

Telling a story with the data is a journey that usually doesn't start with you. The data
must come from somewhere. Getting that data into a place that is usable by you
takes effort that is likely out of your scope, especially in consideration of the
enterprise.

Today's applications and projects can be large and intricate, often involving the use
of skills and knowledge from numerous individuals. Each person brings a unique
talent and expertise, sharing in the effort of working together and coordinating tasks
and responsibilities to see a project through from concept to production.

In the recent past, roles such as business analysts and business intelligence
developers were the standard for data processing and understanding. However,
excessive expansion of the size and different types of data has caused these roles to
evolve into more specialized sets of skills that modernize and streamline the
processes of data engineering and analysis.

The following sections highlight these different roles in data and the specific
responsibility in the overall spectrum of data discovery and understanding:

 Business analyst

 Data analyst

 Data engineer

 Data scientist

 Database administrator
Business analyst
While some similarities exist between a data analyst and business analyst, the key
differentiator between the two roles is what they do with data. A business analyst is
closer to the business and is a specialist in interpreting the data that comes from the
visualization. Often, the roles of data analyst and business analyst could be the
responsibility of a single person.
Data analyst
A data analyst enables businesses to maximize the value of their data assets through
visualization and reporting tools such as Microsoft Power BI. Data analysts are
responsible for profiling, cleaning, and transforming data. Their responsibilities also
include designing and building scalable and effective data models, and enabling and
implementing the advanced analytics capabilities into reports for analysis. A data
analyst works with the pertinent stakeholders to identify appropriate and necessary
data and reporting requirements, and then they are tasked with turning raw data into
relevant and meaningful insights.
A data analyst is also responsible for the management of Power BI assets, including
reports, dashboards, workspaces, and the underlying datasets that are used in the
reports. They are tasked with implementing and configuring proper security
procedures, in conjunction with stakeholder requirements, to ensure the safekeeping
of all Power BI assets and their data.
Data analysts work with data engineers to determine and locate appropriate data
sources that meet stakeholder requirements. Additionally, data analysts work with
the data engineer and database administrator to ensure that the analyst has proper
access to the needed data sources. The data analyst also works with the data
engineer to identify new processes or improve existing processes for collecting data
for analysis.
Data engineer
Data engineers provision and set up data platform technologies that are on-premises
and in the cloud. They manage and secure the flow of structured and unstructured
data from multiple sources. The data platforms that they use can include relational
databases, nonrelational databases, data streams, and file stores. Data engineers also
ensure that data services securely and seamlessly integrate across data platforms.
Primary responsibilities of data engineers include the use of on-premises and cloud
data services and tools to ingest, egress, and transform data from multiple sources.
Data engineers collaborate with business stakeholders to identify and meet data
requirements. They design and implement solutions.
While some alignment might exist in the tasks and responsibilities of a data engineer
and a database administrator, a data engineer's scope of work goes well beyond
looking after a database and the server where it's hosted and likely doesn't include
the overall operational data management.
A data engineer adds tremendous value to business intelligence and data science
projects. When the data engineer brings data together, often described as data
wrangling, projects move faster because data scientists can focus on their own areas
of work.
As a data analyst, you would work closely with a data engineer to make sure that
you can access the variety of structured and unstructured data sources. The data
engineer will support you in optimizing data models, which are typically served from
a modern data warehouse or data lake.
Both database administrators and business intelligence professionals can transition
to a data engineer role; they need to learn the tools and technology that are used to
process large amounts of data.
Data scientist
Data scientists perform advanced analytics to extract value from data. Their work can
vary from descriptive analytics to predictive analytics. Descriptive analytics evaluate
data through a process known as exploratory data analysis (EDA). Predictive analytics
are used in machine learning to apply modeling techniques that can detect
anomalies or patterns. These analytics are important parts of forecast models.
Descriptive and predictive analytics are only partial aspects of data scientists' work.
Some data scientists might work in the realm of deep learning, performing iterative
experiments to solve a complex data problem by using customized algorithms.
Anecdotal evidence suggests that most of the work in a data science project is spent
on data wrangling and feature engineering. Data scientists can speed up the
experimentation process when data engineers use their skills to successfully wrangle
data.
On the surface, it might seem that a data scientist and data analyst are far apart in
the work that they do, but this conjecture is untrue. A data scientist looks at data to
determine the questions that need answers and will often devise a hypothesis or an
experiment and then turn to the data analyst to assist with the data visualization and
reporting.
Database administrator
A database administrator implements and manages the operational aspects of cloud-
native and hybrid data platform solutions that are built on Microsoft Azure data
services and Microsoft SQL Server. A database administrator is responsible for the
overall availability and consistent performance and optimizations of the database
solutions. They work with stakeholders to identify and implement the policies, tools,
and processes for data backup and recovery plans.
The role of a database administrator is different from the role of a data engineer. A
database administrator monitors and manages the overall health of a database and
the hardware that it resides on, whereas a data engineer is involved in the process of
data wrangling, in other words, ingesting, transforming, validating, and cleaning data
to meet business needs and requirements.
The database administrator is also responsible for managing the overall security of
the data, granting and restricting user access and privileges to the data as
determined by business needs and requirements.

Tasks of a data analyst



A data analyst fills one of several critical roles in an organization that help uncover
and make sense of information to keep the company balanced and operating
efficiently. Therefore, it's vital that a data analyst clearly understands their
responsibilities and the tasks that are performed on a near-daily basis. Data analysts
are essential in helping organizations gain valuable insights into the expanse of data
that they have, and they work closely with others in the organization to help reveal
valuable information.
The data analysis process involves five key areas that you'll engage in: prepare,
model, visualize, analyze, and manage.

Prepare
As a data analyst, you'll likely divide most of your time between the prepare and
model tasks. Deficient or incorrect data can have a major impact that results in
invalid reports, a loss of trust, and a negative effect on business decisions, which can
lead to loss in revenue, a negative business impact, and more.
Before a report can be created, data must be prepared. Data preparation is the
process of profiling, cleaning, and transforming your data to get it ready to model
and visualize.
Data preparation is the process of taking raw data and turning it into information
that is trusted and understandable. It involves, among other things, ensuring the
integrity of the data, correcting wrong or inaccurate data, identifying missing data,
converting data from one structure to another or from one type to another, or even a
task as simple as making data more readable.
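
To make these preparation tasks concrete, here is a small Python sketch that trims and standardizes text, converts types, and handles missing values. In Power BI this work is normally done in Power Query rather than in code like this; the field names, sample rows, and the `clean_row` helper are hypothetical.

```python
def clean_row(raw):
    """Return a cleaned copy of one raw record, or None if it cannot be repaired."""
    product = (raw.get("product") or "").strip().title()  # make text readable
    if not product:
        return None  # missing product name: the row cannot be used
    try:
        quantity = int(raw.get("quantity"))  # convert text to a whole number
    except (TypeError, ValueError):
        return None  # wrong or inaccurate quantity
    price_text = (raw.get("price") or "").replace("$", "").strip()
    try:
        price = float(price_text) if price_text else None  # None marks missing data
    except ValueError:
        price = None
    return {"product": product, "quantity": quantity, "price": price}

# Hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"product": "  widget ", "quantity": "3", "price": "$4.50"},
    {"product": "", "quantity": "1", "price": "2.00"},
    {"product": "gadget", "quantity": "two", "price": "9.99"},
    {"product": "gizmo", "quantity": "5", "price": ""},
]

cleaned = [row for row in map(clean_row, raw_rows) if row is not None]
missing_price = [row["product"] for row in cleaned if row["price"] is None]
```

Whether an unusable row should be dropped, as here, or flagged for review is a decision that depends on the reporting requirements.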
Data preparation also involves understanding how you're going to get and connect
to the data and the performance implications of the decisions. When connecting to
data, you need to make decisions to ensure that models and reports meet, and
perform to, acknowledged requirements and expectations.
Privacy and security assurances are also important. These assurances can include
anonymizing data to avoid oversharing or preventing people from seeing personally
identifiable information when it isn't needed. Alternatively, helping to ensure privacy
and security can involve removing that data completely if it doesn't fit in with the
story that you're trying to shape.
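
One possible way to apply these privacy assurances is sketched below in Python: fields that the story doesn't need are removed, and an identifier is replaced with a salted hash. The field names, the salt, and the `anonymize` helper are assumptions for illustration. Note that salted hashing is pseudonymization rather than full anonymization, so it may not satisfy every privacy requirement.

```python
import hashlib

def pseudonymize(value, salt):
    """Replace a value with a short, stable token derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]

def anonymize(record, salt, drop=("email", "phone"), hash_fields=("customer_id",)):
    """Drop fields that aren't needed and pseudonymize identifying ones."""
    safe = {}
    for key, value in record.items():
        if key in drop:
            continue  # remove the data completely
        safe[key] = pseudonymize(str(value), salt) if key in hash_fields else value
    return safe

# Hypothetical customer record.
record = {"customer_id": 1042, "email": "a@example.com",
          "phone": "555-0100", "basket_total": 58.20}
safe_record = anonymize(record, salt="demo-salt")
```

Because the token is stable for a given salt, purchases by the same customer can still be grouped without exposing who the customer is.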
Data preparation can often be a lengthy process. Data analysts follow a series of
steps and methods to prepare data for placement into a proper context and state
that eliminate poor data quality and allow it to be turned into valuable insights.
Model
When the data is in a proper state, it's ready to be modeled. Data modeling is the
process of determining how your tables are related to each other. This process is
done by defining and creating relationships between the tables. From that point, you
can enhance the model by defining metrics and adding custom calculations to enrich
your data.
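
The idea of related tables plus a custom calculation can be illustrated outside Power BI with a short Python sketch. In Power BI itself, relationships are defined in the model and calculations are typically written as DAX measures; the tables, keys, and the `revenue_by_category` function below are hypothetical stand-ins.

```python
# "Products" table, keyed by product_id (the "one" side of the relationship).
products = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Books"},
}

# "Sales" table: each row points at one product (the "many" side).
sales = [
    {"product_id": 1, "quantity": 3, "unit_price": 4.5},
    {"product_id": 2, "quantity": 1, "unit_price": 10.0},
    {"product_id": 3, "quantity": 2, "unit_price": 5.0},
    {"product_id": 1, "quantity": 2, "unit_price": 4.5},
]

def revenue_by_category(sales, products):
    """Custom calculation: total revenue per category, following the relationship."""
    totals = {}
    for sale in sales:
        product = products[sale["product_id"]]          # follow sales -> products
        revenue = sale["quantity"] * sale["unit_price"]
        totals[product["category"]] = totals.get(product["category"], 0.0) + revenue
    return totals

category_revenue = revenue_by_category(sales, products)
```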
Creating an effective and proper data model is a critical step in helping organizations
understand and gain valuable insights into the data. An effective data model makes
reports more accurate, allows the data to be explored faster and more efficiently,
decreases time for the report writing process, and simplifies future report
maintenance.
The model is another critical component that has a direct effect on the performance
of your report and overall data analysis. A poorly designed model can have a
drastically negative impact on the general accuracy and performance of your report.
Conversely, a well-designed model with well-prepared data will ensure a properly
efficient and trusted report. This notion is more prevalent when you are working with
data at scale.
From a Power BI perspective, if your report is performing slowly, or your refreshes are
taking a long time, you will likely need to revisit the data preparation and modeling
tasks to optimize your report.
The process of preparing data and modeling data is an iterative process. Data
preparation is the first task in data analysis. Understanding and preparing your data
before you model it will make the modeling step much easier.
Visualize
The visualization task is where you get to bring your data to life. The ultimate goal of
the visualize task is to solve business problems. A well-designed report should tell a
compelling story about that data, which will enable business decision makers to
quickly gain needed insights. By using appropriate visualizations and interactions,
you can provide an effective report that guides the reader through the content
quickly and efficiently, therefore allowing the reader to follow a narrative into the
data.
The reports that are created during the visualization task help businesses and
decision makers understand what that data means so that accurate and vital
decisions can be made. Reports drive the overall actions, decisions, and behaviors of
an organization that is trusting and relying on the information that is discovered in
the data.
The business might communicate that they need all data points on a given report to
help them make decisions. As a data analyst, you should take the time to fully
understand the problem that the business is trying to solve. Determine whether all
their data points are necessary because too much data can make detecting key
points difficult. Having a small and concise data story can help find insights quickly.
With the built-in AI capabilities in Power BI, data analysts can build powerful reports,
without writing any code, that enable users to get insights and answers and find
actionable objectives. The AI capabilities in Power BI, such as the built-in AI visuals,
enable the discovering of data by asking questions, using the Quick Insights feature,
or creating machine learning models directly within Power BI.
An important aspect of visualizing data is designing and creating reports for
accessibility. As you build reports, it is important to think about people who will be
accessing and reading the reports. Reports should be designed with accessibility in
mind from the outset so that no special modifications are needed in the future.
Many components of your report will help with storytelling. From a color scheme
that is complementary and accessible, to fonts and sizing, to picking the right visuals
for what is being displayed, they all come together to tell that story.
Analyze
The analyze task is the important step of understanding and interpreting the
information that is displayed on the report. In your role as a data analyst, you should
understand the analytical capabilities of Power BI and use those capabilities to find
insights, identify patterns and trends, predict outcomes, and then communicate
those insights in a way that everyone can understand.
Advanced analytics enables businesses and organizations to ultimately drive better
decisions throughout the business and create actionable insights and meaningful
results. With advanced analytics, organizations can drill into the data to predict future
patterns and trends, identify activities and behaviors, and enable businesses to ask
the appropriate questions about their data.
Previously, analyzing data was a difficult and intricate process that was typically
performed by data engineers or data scientists. Today, Power BI makes data analysis
accessible, which simplifies the data analysis process. Users can quickly gain insights
into their data by using visuals and metrics directly from their desktop and then
publish those insights to dashboards so that others can find needed information.
This feature is another area where AI integrations within Power BI can take your
analysis to the next level. Integrations with Azure machine learning, cognitive
services, and built-in AI visuals will help to enrich your data and analysis.
Manage
Power BI consists of many components, including reports, dashboards, workspaces,
datasets, and more. As a data analyst, you are responsible for the management of
these Power BI assets, overseeing the sharing and distribution of items, such as
reports and dashboards, and ensuring the security of Power BI assets.
Apps can be a valuable distribution method for your content and allow easier
management for large audiences. This feature also allows you to have custom
navigation experiences and link to other assets within your organization to
complement your reports.
The management of your content helps to foster collaboration between teams and
individuals. Sharing and discovery of your content is important for the right people
to get the answers that they need. It is also important to help ensure that items are
secure. You want to make sure that the right people have access and that you are not
leaking data past the correct stakeholders.
Proper management can also help reduce data silos within your organization.
Duplicated data is harder to manage and can introduce data latency when
resources are overused. Power BI helps reduce data silos with the use of shared
datasets, and it allows you to reuse data that you have prepared and modeled. For
key business data, endorsing a dataset as certified can help to ensure trust in that
data.
The management of Power BI assets helps reduce the duplication of efforts and helps
ensure security of the data.

Summary
In this module, you learned that the role of data analyst is vital to the success of an
organization. Additionally, the tasks that data analysts perform help ensure that the
business decisions are based on trusted data. You also learned about the different
roles in data and how the people in these roles work closely with a data analyst to
deliver valuable insights into a business's data assets.
Check your knowledge

Answer the following questions to see what you've learned.


1. Which data role enables advanced analytics capabilities specifically through reports and
visualizations?

Data scientist

Data engineer

Data analyst

2. Which data analyst task has a critical performance impact on reporting and data analysis?

Model

Analyze

Visualize

3. Which one of the following options is the most important key benefit of data analysis?

Decisive analytics

Informed business decisions

Complex reports
Introduction

Microsoft Power BI is a collection of software services, apps, and connectors that
work together to turn your unrelated sources of data into coherent, visually
immersive, and interactive insights. Whether your data is a simple Microsoft Excel
workbook, or a collection of cloud-based and on-premises hybrid data
warehouses, Power BI lets you easily connect to your data sources, visualize (or
discover) what's important, and share that with anyone or everyone you want.

Power BI can be simple and fast, capable of creating quick insights from an Excel
workbook or a local database. But Power BI is also robust and enterprise-grade,
ready not only for extensive modeling and real-time analytics, but also for custom
development. Therefore, it can be your personal report and visualization tool, but can
also serve as the analytics and decision engine behind group projects, divisions, or
entire corporations.
If you're a beginner with Power BI, this module will get you going. If you're a Power
BI veteran, this module will tie concepts together and fill in the gaps.

The parts of Power BI


Power BI consists of a Microsoft Windows desktop application called Power BI
Desktop, an online SaaS (Software as a Service) service called the Power BI service,
and mobile Power BI apps that are available on any device, with native mobile BI
apps for Windows, iOS, and Android.
These three elements—Desktop, the service, and Mobile apps—are designed to let
people create, share, and consume business insights in the way that serves them, or
their role, most effectively.

How Power BI matches your role


How you use Power BI might depend on your role on a project or a team. And other
people, in other roles, might use Power BI differently, which is just fine.
For example, you might view reports and dashboards in the Power BI service, and
that might be all you do with Power BI. But your number-crunching, business-report-
creating coworker might make extensive use of Power BI Desktop (and publish
Power BI Desktop reports to the Power BI service, which you then use to view them).
And another coworker, in sales, might mainly use her Power BI phone app to monitor
progress on her sales quotas and drill into new sales lead details.
You also might use each element of Power BI at different times, depending on what
you're trying to achieve, or what your role is for a given project or effort.
Perhaps you view inventory and manufacturing progress in a real-time dashboard in
the service, and also use Power BI Desktop to create reports for your own team
about customer engagement statistics. How you use Power BI can depend on which
feature or service of Power BI is the best tool for your situation. But each part of
Power BI is available to you, which is why it's so flexible and compelling.
We discuss these three elements—Desktop, the service, and Mobile apps—in more
detail later. In upcoming units and modules, we'll also create reports in Power BI
Desktop, share them in the service, and eventually drill into them on our mobile
device.
Download Power BI Desktop

You can download Power BI Desktop from the web or as an app from the Microsoft
Store on the Windows tab.

Download strategy     Link             Notes
Windows Store app     Windows Store    Will automatically stay updated
Download from web     Download .msi    Must manually update periodically

Sign in to Power BI service

Before you can sign in to Power BI, you'll need an account. To get a free trial, go
to app.powerbi.com and sign up with your email address.

For detailed steps on setting up an account, see Sign in to Power BI service.

The flow of work in Power BI

A common flow of work in Power BI begins in Power BI Desktop, where a report is
created. That report is then published to the Power BI service and finally shared, so
that users of Power BI Mobile apps can consume the information.

It doesn't always happen that way, and that's okay. But we'll use that flow to help you
learn the different parts of Power BI and how they complement each other.

Okay, now that we have an overview of this module, what Power BI is, and its three
main elements, let's take a look at what it's like to use Power BI.
Use Power BI

Now that we've introduced the basics of Microsoft Power BI, let's jump into some
hands-on experiences and a guided tour.

The activities and analyses that you'll learn with Power BI generally follow a common
flow. The common flow of activity looks like this:

1. Bring data into Power BI Desktop, and create a report.
2. Publish to the Power BI service, where you can create new visualizations or
build dashboards.
3. Share dashboards with others, especially people who are on the go.
4. View and interact with shared dashboards and reports in Power BI Mobile apps.

As mentioned earlier, you might spend all your time in the Power BI service, viewing
visuals and reports that have been created by others. And that's fine. Someone else
on your team might spend their time in Power BI Desktop, which is fine too. To help
you understand the full continuum of Power BI and what it can do, we'll show you all
of it. Then you can decide how to use it to your best advantage.

So, let's jump in and step through the experience. Your first order of business is to
learn the basic building blocks of Power BI, which will provide a solid basis for
turning data into cool reports and visuals.
Building blocks of Power BI

Everything you do in Microsoft Power BI can be broken down into a few
basic building blocks. After you understand these building blocks, you can expand
on each of them and begin creating elaborate and complex reports. After all, even
seemingly complex things are built from basic building blocks. For example,
buildings are created with wood, steel, concrete and glass, and cars are made from
metal, fabric, and rubber. Of course, buildings and cars can also be basic or
elaborate, depending on how those basic building blocks are arranged.
Let's take a look at these basic building blocks, discuss some simple things that can
be built with them, and then get a glimpse into how complex things can also be
created.
Here are the basic building blocks in Power BI:
• Visualizations
• Datasets
• Reports
• Dashboards
• Tiles

Visualizations
A visualization (sometimes also referred to as a visual) is a visual representation of
data, like a chart, a color-coded map, or other interesting things you can create to
represent your data visually. Power BI has all sorts of visualization types, and more
are coming all the time. The following image shows a collection of different
visualizations that were created in Power BI.
Visualizations can be simple, like a single number that represents something
significant, or they can be visually complex, like a gradient-colored map that shows
voter sentiment about a certain social issue or concern. The goal of a visual is to
present data in a way that provides context and insights, both of which would
probably be difficult to discern from a raw table of numbers or text.

Datasets
A dataset is a collection of data that Power BI uses to create its visualizations.
You can have a simple dataset that's based on a single table from a Microsoft Excel
workbook, similar to what's shown in the following image.

Datasets can also be a combination of many different sources, which you can filter
and combine to provide a unique collection of data (a dataset) for use in Power BI.
For example, you can create a dataset from three database fields, one website table,
an Excel table, and online results of an email marketing campaign. That unique
combination is still considered a single dataset, even though it was pulled together
from many different sources.
Filtering data before bringing it into Power BI lets you focus on the data that matters
to you. For example, you can filter your contact database so that only customers who
received emails from the marketing campaign are included in the dataset. You can
then create visuals based on that subset (the filtered collection) of customers who
were included in the campaign. Filtering helps you focus your data—and your efforts.
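
The filtering idea above can be sketched in a few lines of Python. This is only an illustration of the concept, not how Power BI performs filtering; the contact records and the `in_campaign` field are hypothetical.

```python
# Hypothetical contact records; the field names are illustrative only.
contacts = [
    {"name": "Ana", "email": "ana@example.com", "in_campaign": True},
    {"name": "Ben", "email": "ben@example.com", "in_campaign": False},
    {"name": "Chi", "email": "chi@example.com", "in_campaign": True},
]


def filter_campaign_recipients(rows):
    """Keep only the contacts who received the campaign email."""
    return [row for row in rows if row["in_campaign"]]


# The filtered subset is what would become the dataset.
campaign_dataset = filter_campaign_recipients(contacts)
print([row["name"] for row in campaign_dataset])
```

Any visual built on `campaign_dataset` would then reflect only the campaign recipients, while the full contact list stays untouched at the source.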
An important and enabling part of Power BI is the multitude of data connectors that
are included. Whether the data you want is in Excel or a Microsoft SQL Server
database, in Azure or Oracle, or in a service like Facebook, Salesforce, or MailChimp,
Power BI has built-in data connectors that let you easily connect to that data, filter it
if necessary, and bring it into your dataset.
After you have a dataset, you can begin creating visualizations that show different
portions of it in different ways, and gain insights based on what you see. That's
where reports come in.

Reports
In Power BI, a report is a collection of visualizations that appear together on one or
more pages. Just like any other report you might create for a sales presentation or
write for a school assignment, a report in Power BI is a collection of items that are
related to each other. The following image shows a report in Power BI Desktop—in
this case, it's the second page in a five-page report. You can also create reports in
the Power BI service.

Reports let you create many visualizations, on multiple pages if necessary, and let
you arrange those visualizations in whatever way best tells your story.
You might have a report about quarterly sales, product growth in a particular
segment, or migration patterns of polar bears. Whatever your subject, reports let you
gather and organize your visualizations onto one page (or more).
Dashboards
When you're ready to share a report, or a collection of visualizations, you create
a dashboard. Much like the dashboard in a car, a Power BI dashboard is a collection
of visuals that you can share with others. Often, it's a selected group of visuals that
provide quick insight into the data or story you're trying to present.
A dashboard must fit on a single page, often called a canvas (the canvas is the blank
backdrop in Power BI Desktop or the service, where you put visualizations). Think of
it like the canvas that an artist or painter uses—a workspace where you create,
combine, and rework interesting and compelling visuals. You can share dashboards
with other users or groups, who can then interact with your dashboards when they're
in the Power BI service or on their mobile device.

Tiles
In Power BI, a tile is a single visualization on a dashboard. It's the rectangular box
that holds an individual visual. In the following image, you see one tile, which is also
surrounded by other tiles.

When you're creating a dashboard in Power BI, you can move or arrange tiles
however you want. You can make them bigger, change their height or width, and
snuggle them up to other tiles.
When you're viewing, or consuming, a dashboard or report—which means you're not
the creator or owner, but the report or dashboard has been shared with you—you
can interact with it, but you can't change the size of the tiles or their arrangement.
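
One way to picture how the five building blocks relate is as a small data model. The sketch below is a rough mental model written in Python; the class and field names are assumptions made for illustration and do not correspond to any Power BI API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """A collection of data that visualizations draw from."""
    name: str
    rows: List[dict]


@dataclass
class Visualization:
    """A visual representation of data from one dataset."""
    title: str
    dataset: Dataset


@dataclass
class Report:
    """Visualizations arranged on one or more pages."""
    name: str
    pages: List[List[Visualization]] = field(default_factory=list)


@dataclass
class Tile:
    """A single visualization placed on a dashboard."""
    visualization: Visualization


@dataclass
class Dashboard:
    """A single-page collection of tiles that can be shared."""
    name: str
    tiles: List[Tile] = field(default_factory=list)


# Hypothetical example: one dataset feeds one visual, which appears
# both on a report page and as a tile on a dashboard.
sales = Dataset(name="Sales", rows=[{"region": "West", "amount": 120}])
chart = Visualization(title="Sales by region", dataset=sales)
report = Report(name="Quarterly sales", pages=[[chart]])
dashboard = Dashboard(name="Sales overview", tiles=[Tile(visualization=chart)])
```

The point of the sketch is the direction of the relationships: datasets feed visualizations, reports group visualizations into pages, and dashboards hold tiles that each wrap one visualization.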

All together now

Those are the basics of Power BI and its building blocks. Let's take a moment to
review.

Power BI is a collection of services, apps, and connectors that lets you connect to
your data, wherever it happens to reside, filter it if necessary, and then bring it into
Power BI to create compelling visualizations that you can share with others.

Now that you've learned about the handful of basic building blocks of Power BI, it
should be clear that you can create datasets that make sense to you and create
visually compelling reports that tell your story. Stories told with Power BI don't have
to be complex, or complicated, to be compelling.

For some people, using a single Excel table in a dataset and then sharing a
dashboard with their team will be an incredibly valuable way to use Power BI.

For others, the value of Power BI will be in using real-time Azure SQL Data
Warehouse tables that combine with other databases and real-time sources to build
a moment-by-moment dataset.

For both groups, the process is the same: create datasets, build compelling visuals,
and share them with others. And the result is also the same for both groups: harness
your ever-expanding world of data, and turn it into actionable insights.

Whether your data insights require straightforward or complex datasets, Power BI
helps you get started quickly and can expand with your needs to be as complex as
your world of data requires. And because Power BI is a Microsoft product, you can
count on it being robust, extensible, Microsoft Office–friendly, and enterprise-ready.

Now let's see how this works. We'll start by taking a quick look at the Power BI
service.
Tour and use the Power BI service

As we learned in the previous unit, the common flow of work in Microsoft Power BI is
to create a report in Power BI Desktop, publish it to the Power BI service, and then
share it with others, so that they can view it in the service or on a mobile app.

But because some people begin in the Power BI service, let's take a quick look at that
first, and learn about an easy and popular way to quickly create visuals in Power
BI: apps.

An app is a collection of preset, ready-made visuals and reports that are shared with
an entire organization. Using an app is like microwaving a TV dinner or ordering a
fast-food value meal: you just have to press a few buttons or make a few comments,
and you're quickly served a collection of entrees designed to go together, all
presented in a tidy, ready-to-consume package.

So, let's take a quick look at apps, the service, and how it works. We'll go into more
detail about apps (and the service) in upcoming modules, but you can think of this as
a taste to whet your appetite. You can sign in to the service
at https://powerbi.microsoft.com.

Create out-of-box dashboards with cloud services

With Power BI, connecting to data is easy. From the Power BI service, you can just
select the Get Data button in the lower-left corner of the home page.
The canvas (the area in the center of the Power BI service) shows you the available
sources of data in the Power BI service. In addition to common data sources like
Microsoft Excel files, databases, or Microsoft Azure data, Power BI can just as easily
connect to a whole assortment of software services (also called SaaS providers or
cloud services): Salesforce, Facebook, Google Analytics, and more.
For these software services, the Power BI service provides a collection of ready-
made visuals that are pre-arranged on dashboards and reports for your organization.
This collection of visuals is called an app. Apps get you up and running quickly, with
data and dashboards that your organization has created for you. For example, when
you use the GitHub app, Power BI connects to your GitHub account (after you
provide your credentials) and then populates a predefined collection of visuals and
dashboards in Power BI.

There are apps for all sorts of online services. The following image shows a page of
apps that are available for different online services, in alphabetical order. This page is
shown when you select the Get button in the Services box (shown in the previous
image). As you can see from the following image, there are many apps to choose
from.

For our purposes, we'll choose GitHub. Note that the GitHub app requires Power BI
Pro. GitHub is an application for online source control. When you select the Get it
now button in the box for the GitHub app, the Connect to GitHub dialog box
appears. Note that GitHub does not support Internet Explorer, so make sure you are
working in another browser.
After you enter the information and credentials for the GitHub app, installation of the
app begins.

After the data is loaded, the predefined GitHub app dashboard appears.
In addition to the app dashboard, the report that was generated (as part of the
GitHub app) and used to create the dashboard is available, as is the dataset (the
collection of data pulled from GitHub) that was created during data import and used
to create the GitHub report.
You can select any of the visuals and interact with them. If you click on a section in
one visual, all the other visuals on the page will filter accordingly. For example, when
you select MIHART in the donut chart on the Top 100 Contributors report, the
other visuals on the page adjust to reflect that selection.
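
The cross-filtering behavior described above can be approximated in Python. This is a conceptual sketch only: the commit records and the second contributor name are hypothetical, and the real GitHub app's dataset may look quite different.

```python
from collections import Counter

# Hypothetical commit records standing in for the GitHub app's dataset.
commits = [
    {"contributor": "MIHART", "month": "Jan"},
    {"contributor": "MIHART", "month": "Feb"},
    {"contributor": "OTHERDEV", "month": "Jan"},
]


def commits_per_month(rows, selected_contributor=None):
    """Count commits by month, optionally limited to one contributor."""
    if selected_contributor is not None:
        rows = [r for r in rows if r["contributor"] == selected_contributor]
    return dict(Counter(r["month"] for r in rows))


# With nothing selected, a "commits per month" visual shows every row.
all_counts = commits_per_month(commits)

# Selecting MIHART in one visual re-runs the other visuals' aggregations
# over the matching rows only.
filtered_counts = commits_per_month(commits, "MIHART")
```

Here `all_counts` covers all three commits, while `filtered_counts` reflects only the two rows that belong to the selected contributor.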

Update data in the Power BI service

You can also choose to update the dataset for an app, or other data that you use in
Power BI. To set update settings, select the schedule update icon for the dataset to
update, and then use the menu that appears. You can also select the update icon
(the circle with an arrow) next to the schedule update icon to update the dataset
immediately.

The Datasets tab is selected on the Settings page that appears. In the right pane,
select the arrow next to Scheduled refresh to expand that section.
The Settings dialog box appears on the canvas, letting you set the update settings
that meet your needs.
That's enough for our quick look at the Power BI service. There are many more things
you can do with the service, and we'll cover these later in this module and in
upcoming modules. Also, remember that there are many types of data you can
connect to, and all sorts of apps, with more of both coming all the time.
Check your knowledge

Knowledge check: Get started with Power BI


1. What is the common flow of activity in Power BI?

Create a report in Power BI mobile, share it to the Power BI Desktop, view and interact in
the Power BI service.

Create a report in the Power BI service, share it to Power BI mobile, interact with it in
Power BI Desktop.

Bring data into Power BI Desktop and create a report, share it to the Power BI service,
view and interact with reports and dashboards in the service and Power BI mobile.

Bring data into Power BI mobile, create a report, then share it to Power BI Desktop.

2. Which of the following are building blocks of Power BI?

Tiles, dashboards, databases, mobile devices.

Visualizations, datasets, reports, dashboards, tiles.

Visual Studio, C#, and JSON files.

3. A collection of ready-made visuals, pre-arranged in dashboards and reports is called what
in Power BI?

The canvas.

Scheduled refresh.

An app.
Knowledge check: answers and feedback

1. What is the common flow of activity in Power BI?

Incorrect: Create a report in the Power BI service, share it to Power BI mobile, interact
with it in Power BI Desktop.
The common flow of activity in Power BI doesn't start with creating a report in the
Power BI service. Commonly, the flow begins by creating a report in Power BI
Desktop.

Correct: Bring data into Power BI Desktop and create a report, share it to the Power BI
service, view and interact with reports and dashboards in the service and Power BI
mobile.
The Power BI service lets you view and interact with reports and dashboards,
but doesn't let you shape data.

2. Which of the following are building blocks of Power BI?

Correct: Visualizations, datasets, reports, dashboards, tiles.
Building blocks for Power BI are visualizations, datasets, reports, dashboards, tiles.

3. A collection of ready-made visuals, pre-arranged in dashboards and reports is called what
in Power BI?

Incorrect: Scheduled refresh.
Scheduled refresh lets you automatically update data for datasets, but is not a
collection of ready-made visuals.

Correct: An app.
An app is a collection of ready-made visuals, pre-arranged in dashboards and
reports. You can get apps that connect to many online services from AppSource.
Summary

Let's do a quick review of what we covered in this module.

Microsoft Power BI is a collection of software services, apps, and connectors that
work together to turn your data into interactive insights. You can use data from
single basic sources, like a Microsoft Excel workbook, or pull in data from multiple
databases and cloud sources to create complex datasets and reports. Power BI can
be as straightforward as you want or as enterprise-ready as your complex global
business requires.

Power BI consists of three main elements (Power BI Desktop, the Power BI service,
and Power BI Mobile), which work together to let you create, interact with, share,
and consume your data the way you want.

We also discussed the basic building blocks in Power BI:

• Visualizations - A visual representation of data, sometimes just called visuals
• Datasets - A collection of data that Power BI uses to create visualizations
• Reports - A collection of visuals from a dataset, spanning one or more pages
• Dashboards - A single-page collection of visuals built from a report
• Tiles - A single visualization on a report or dashboard

In the Power BI service, we installed an app in just a few clicks. That app, a
ready-made collection of visuals and reports, let us easily connect to a software
service to populate the app and bring that data to life.

Finally, we set up a refresh schedule for our data, so that we know the data will be
fresh when we go back to the Power BI service.
Next steps

Congratulations! You've finished the first module of the learning path for Power BI.
You now have a firm foundation of knowledge for when you move on to the next
module, which walks through the steps to create your first report.

We mentioned this before, but it's worth restating: this learning path builds your
knowledge by following the common flow of work in Power BI:

• Bring data into Power BI Desktop, and create a report.
• Publish to the Power BI service, where you create new visualizations or build
dashboards.
• Share your dashboards with others, especially people who are on the go.
• View and interact with shared dashboards and reports in Power BI Mobile apps.

You might not do all that work yourself—some people will only view dashboards that
were created by someone else, and they'll just use the service. That's fine, and we'll
soon have a module dedicated to showing how you can easily navigate and use
the Power BI service to view and interact with reports and apps.

But the next module follows the flow of work in Power BI, showing you how to create
a report and publish it to the Power BI service. You'll learn how those reports and
dashboards are created and how they connect to the data. You might even decide
to create a report or dashboard of your own.

See you in the next module!


Get data in Power BI

You will learn how to retrieve data from a wide variety of data sources, including
Microsoft Excel, relational databases, and NoSQL data stores. You will also learn how
to improve performance while retrieving data.

Learning objectives
By the end of this module, you’ll be able to:
• Identify and connect to a data source
• Get data from a relational database, like Microsoft SQL Server
• Get data from a file, like Microsoft Excel
• Get data from applications
• Get data from Azure Analysis Services
• Select a storage mode
• Fix performance issues
• Resolve data import errors
Introduction

Like most of us, you work for a company where you're required to build Microsoft
Power BI reports. The data resides in several different databases and files. These data
repositories differ from each other (some are in Microsoft SQL Server, some
are in Microsoft Excel), but all the data is related.

In this module’s scenario, you work for Tailwind Traders. You’ve been tasked by
senior leadership to create a suite of reports that are dependent on data in several
different locations. The database that tracks sales transactions is in SQL Server, a
relational database that contains what items each customer bought and when. It also
tracks which employee made the sale, along with the employee name and employee
ID. However, that database doesn’t contain the employee’s hire date, their title, or
who their manager is. For that information, you need to access files that Human
Resources keeps in Excel. You've been consistently requesting that they use an SQL
database, but they haven't yet had the chance to implement it.

When an item ships, the shipment is recorded in the warehousing application, which
is new to the company. The developers chose to store data in Cosmos DB, as a set of
JSON documents.

Tailwind Traders has an application that helps with financial projections, so that they
can predict what their sales will be in future months and years, based on past trends.
Those projections are stored in Microsoft Azure Analysis Services. Here’s a view of
the many data sources you're asked to combine data from.

Before you can create reports, you must first extract data from the various data
sources. Interacting with SQL Server is different from interacting with Excel, so you
should learn the nuances of both systems. After gaining an understanding of the
systems, you can use Power Query to help you clean the data, for example by renaming
columns, replacing values, removing errors, and combining query results. Power Query
is also available in Excel.
After the data has been cleaned and organized, you're ready to build reports in
Power BI. Finally, you'll publish your combined dataset and reports to the Power BI
service. From there, other people can use your dataset and build their own reports or
they can use the reports you’ve already built. Additionally, if someone else built a
dataset you'd like to use, you can build reports from that too!
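
The cleaning steps named above (renaming columns, replacing values, removing errors, and combining query results) can be sketched in plain Python. Power Query has its own editor and formula language, so this is only an analogy; the column names, the sample rows, and the use of -1 as a placeholder value are assumptions made for the example.

```python
def rename_columns(rows, mapping):
    """Rename keys in every row according to mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def replace_values(rows, column, old, new):
    """Replace one specific value in one column."""
    return [
        {**row, column: new if row[column] == old else row[column]}
        for row in rows
    ]


def remove_errors(rows, column):
    """Drop rows whose value in the column is missing (None)."""
    return [row for row in rows if row[column] is not None]


# Two hypothetical query results with terse column names.
sales_q1 = [{"emp_id": 1, "amt": 100.0}, {"emp_id": 2, "amt": None}]
sales_q2 = [{"emp_id": 1, "amt": 250.0}, {"emp_id": 3, "amt": -1}]

combined = sales_q1 + sales_q2  # combine query results
combined = rename_columns(combined, {"emp_id": "EmployeeID", "amt": "SalesAmount"})
combined = replace_values(combined, "SalesAmount", -1, None)  # treat -1 as missing
combined = remove_errors(combined, "SalesAmount")
```

After these steps, `combined` holds only the rows with a usable sales amount, under the friendlier column names.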

This module will focus on the first step of getting the data from the different data
sources and importing it into Power BI by using Power Query.

Get data from files

Organizations often export and store data in files. One possible file format is a flat
file. A flat file has only one data table, and every row of data has the same
structure. The file doesn't contain hierarchies. You're likely familiar with the
most common types of flat files: comma-separated values (.csv) files,
delimited text (.txt) files, and fixed-width files. Other file types include the
output files of applications, such as Microsoft Excel workbooks (.xlsx).
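
To make the idea of a flat file concrete, here is a minimal Python sketch that reads comma-separated text and confirms that every row shares the header's structure. The sample content and column names are hypothetical; a real file would be opened from disk instead of from a string.

```python
import csv
import io

# Hypothetical CSV content standing in for a file on disk.
csv_text = "EmployeeID,Name,HireDate\n1,Ana,2019-04-01\n2,Ben,2021-09-15\n"

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)
columns = reader.fieldnames

# In a flat file, every row has exactly the fields named in the header.
is_flat = all(set(row) == set(columns) for row in rows)
```

The result is a single table: one list of rows, each with the same three fields and no nested hierarchy.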

Power BI Desktop allows you to get data from many types of files. You can find a list
of the available options when you use the Get data feature in Power BI Desktop. The
following sections explain how you can import data from an Excel file that is stored
on a local computer.

Scenario

The Human Resources (HR) team at Tailwind Traders has prepared a flat file that
contains some of your organization's employee data, such as employee name, hire
date, position, and manager. They've requested that you build Power BI reports by
using this data, and data that is located in several other data sources.

Flat file location

The first step is to determine which file location you want to use to export and store
your data.

Your Excel files might exist in one of the following locations:

• Local - You can import data from a local file into Power BI. The file isn't moved
into Power BI, and a link doesn't remain to it. Instead, a new dataset is created
in Power BI, and data from the Excel file is loaded into it. Accordingly, changes
to the original Excel file aren't reflected in your Power BI dataset. You can use
local data import for data that doesn't change.

• OneDrive for Business - You can pull data from OneDrive for Business into
Power BI. This method is effective in keeping an Excel file and your dataset,
reports, and dashboards in Power BI synchronized. Power BI connects regularly
to your file on OneDrive. If any changes are found, your dataset, reports, and
dashboards are automatically updated in Power BI.

• OneDrive - Personal - You can use data from files on a personal OneDrive
account, and get many of the same benefits that you would with OneDrive for
Business. However, you'll need to sign in with your personal OneDrive account,
and select the Keep me signed in option. Check with your system
administrator to determine whether this type of connection is allowed in your
organization.

• SharePoint - Team Sites - Saving your Power BI Desktop files to SharePoint
Team Sites is similar to saving to OneDrive for Business. The main difference is
how you connect to the file from Power BI. You can specify a URL or connect to
the root folder.

Using a cloud option such as OneDrive or SharePoint Team Sites is the most effective
way to keep your file and your dataset, reports, and dashboards in Power BI in sync.
However, if your data doesn't change regularly, saving files on a local computer is a
suitable option.
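
The difference between a one-time local import and a connection that is checked regularly can be pictured with a small Python sketch. The `ImportedDataset` class is hypothetical and only models the behavior described above: data is copied at load time, and later changes to the source show up only after a refresh.

```python
import copy


class ImportedDataset:
    """Holds a copy of the source rows, taken at load or refresh time."""

    def __init__(self, source_rows):
        self._source = source_rows
        self.rows = copy.deepcopy(source_rows)

    def refresh(self):
        """Re-read the source, as a synchronized connection would do."""
        self.rows = copy.deepcopy(self._source)


workbook_rows = [{"Name": "Ana", "Title": "Analyst"}]
dataset = ImportedDataset(workbook_rows)

# The source workbook changes after the import.
workbook_rows[0]["Title"] = "Senior Analyst"

title_before_refresh = dataset.rows[0]["Title"]  # still the imported copy
dataset.refresh()
title_after_refresh = dataset.rows[0]["Title"]   # now matches the source
```

A local import behaves like the state before `refresh()` is called, while the OneDrive and SharePoint options behave as if `refresh()` ran on a regular basis.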
Connect to data in a file

In Power BI, on the Home tab, select Get data. In the list that displays, select the
option that you require, such as Text/CSV or XML. For this example, you'll
select Excel.

Tip

The Home tab contains quick access data source options, such as Excel, next to
the Get data button.

Depending on your selection, you need to find and open your data source. You
might be prompted to sign in to a service, such as OneDrive, to authenticate your
request. In this example, you'll open the Employee Data Excel workbook that is
stored on the Desktop. (Remember, no files are provided for practice; these are
hypothetical steps.)

Select the file data to import

After the file has connected to Power BI Desktop, the Navigator window opens. This
window shows you the data that is available in your data source (the Excel file in this
example). You can select a table or entity to preview its contents, to ensure that the
correct data is loaded into the Power BI model.
Select the check box(es) of the table(s) that you want to bring in to Power BI. This
selection activates the Load and Transform Data buttons as shown in the following
image.

Now you can select the Load button to automatically load your data into the Power
BI model or select the Transform Data button to launch the Power Query Editor,
where you can review and clean your data before loading it into the Power BI model.

We often recommend that you transform data, but that process will be discussed
later in this module. For this example, you can select Load.

Change the source file

You might have to change the location of a source file for a data source during
development, or if a file storage location changes. To keep your reports up to date,
you'll need to update your file connection paths in Power BI.

Power Query provides several ways for you to accomplish this task, so that you can
make this type of change when needed:

1. Data source settings
2. Query settings
3. Advanced Editor

Warning
If you are changing a file path, make sure that you reconnect to the same file with
the same file structure. Any structural changes to a file, such as deleting or renaming
columns in the source file, will break the reporting model.

For example, try changing the data source file path in the data source settings.
Select Data source settings in Power Query. In the Data source settings window,
select your file and then select Change Source. Update the File path or use
the Browse option to locate your file, select OK, and then select Close.
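
Before pointing a query at a new file path, it can help to confirm that the new file has the same structure as the old one. The following Python sketch illustrates that check on CSV text rather than on an Excel workbook, purely to stay self-contained; the helper names and sample data are hypothetical, and Power BI does not run this check for you.

```python
import csv
import io


def header_of(csv_text):
    """Return the column names from the first row of CSV text."""
    return next(csv.reader(io.StringIO(csv_text)))


def same_structure(old_csv_text, new_csv_text):
    """True when both files expose the same columns in the same order."""
    return header_of(old_csv_text) == header_of(new_csv_text)


original = "EmployeeID,Name,HireDate\n1,Ana,2019-04-01\n"
moved_copy = "EmployeeID,Name,HireDate\n1,Ana,2019-04-01\n2,Ben,2021-09-15\n"
renamed_column = "EmployeeID,FullName,HireDate\n1,Ana,2019-04-01\n"

safe_to_switch = same_structure(original, moved_copy)
breaks_model = not same_structure(original, renamed_column)
```

Extra rows in the relocated file are fine, but a renamed or deleted column is the kind of structural change that the warning above describes.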
Get data from relational data sources

If your organization uses a relational database for sales, you can use Power BI
Desktop to connect directly to the database instead of using exported flat files.

Connecting Power BI to your database will help you to monitor the progress of your
business and identify trends, so you can forecast sales figures, plan budgets and set
performance indicators and targets. Power BI Desktop can connect to many
relational databases that are either in the cloud or on-premises.

Scenario

The Sales team at Tailwind Traders has requested that you connect to the
organization's on-premises SQL Server database and get the sales data into Power BI
Desktop so you can build sales reports.

Connect to data in a relational database

You can use the Get data feature in Power BI Desktop and select the applicable
option for your relational database. For this example, you would select the SQL
Server option, as shown in the following screenshot.

Tip
Next to the Get data button are quick access data source options, such as SQL
Server.

Your next step is to enter your database server name and a database name in
the SQL Server database window. The two options in data connectivity mode
are: Import (selected by default, recommended) and DirectQuery. In most cases, you
select Import. Other advanced options are also available in the SQL Server
database window, but you can ignore them for now.
After you've added your server and database names, you'll be prompted to sign in
with a username and password. You'll have three sign-in options:

• Windows - Use your Windows account (Azure Active Directory credentials).

• Database - Use your database credentials. For instance, SQL Server has its own
sign-in and authentication system that is sometimes used. If the database
administrator gave you a unique sign-in to the database, you might need to
enter those credentials on the Database tab.

• Microsoft account - Use your Microsoft account credentials. This option is
often used for Azure services.

Select a sign-in option, enter your username and password, and then select Connect.
Select data to import

After the database has been connected to Power BI Desktop, the Navigator window
displays the data that is available in your data source (the SQL database in this
example). You can select a table or entity to preview its contents and make sure that
the correct data will be loaded into the Power BI model.

Select the check box(es) of the table(s) that you want to bring in to Power BI
Desktop, and then select either the Load or Transform Data option.

• Load - Automatically load your data into a Power BI model in its current state.

• Transform Data - Open your data in Microsoft Power Query, where you can
perform actions such as deleting unnecessary rows or columns, grouping your
data, removing errors, and many other data quality tasks.

Import data by writing an SQL query

Another way you can import data is to write an SQL query to specify only the tables
and columns that you need.

To write your SQL query, on the SQL Server database window, enter your server and
database names, and then select the arrow next to Advanced options to expand this
section and view your options. In the SQL statement box, write your query
statement, and then select OK. In this example, you'll use the SELECT SQL statement
to load the ID, NAME and SALESAMOUNT columns from the SALES table.

Change data source settings

After you create a data source connection and load data into Power BI Desktop, you
can return and change your connection settings at any time. This action is often
required due to a security policy within the organization, for example, when the
password needs to be updated every 90 days. You can change the data source, edit
permissions or clear permissions.

On the Home tab, select Transform data, and then select the Data source
settings option.

From the list of data sources that displays, select the data source that you want to
update. Then, you can right-click that data source to view the available update
options or you can use the update option buttons on the lower left of the window.
Select the update option that you need, change the settings as required, and then
apply your changes.

You can also change your data source settings from within Power Query. Select the
table, and then select the Data source settings option on the Home ribbon.
Alternatively, you can go to the Query Settings panel on the right side of the screen
and select the settings icon next to Source (or double-click Source). In the window
that displays, update the server and database details, and then select OK.

After you have made the changes, select Close and Apply to apply those changes to
your data source settings.

Write an SQL statement

As previously mentioned, you can import data into your Power BI model by using an
SQL query. SQL stands for Structured Query Language and is a standardized
programming language that is used to manage relational databases and perform
various data management operations.

Consider the scenario where your database has a large table that is made up of
sales data over several years. Sales data from 2009 isn't relevant to the report that
you're creating. This situation is where SQL is beneficial because it allows you to load
only the required set of data by specifying exact columns and rows in your SQL
statement and then importing them into your data model. You can also join different
tables, run specific calculations, create logical statements, and filter data in your SQL
query.

The following example shows a simple query where the ID, NAME and
SALESAMOUNT are selected from the SALES table.

The SQL query starts with a SELECT statement, which allows you to choose the specific
fields that you want to pull from your database. In this example, you want to load the
ID, NAME, and SALESAMOUNT columns.

SELECT
    ID
    , NAME
    , SALESAMOUNT
FROM

FROM specifies the name of the table that you want to pull the data from. In this
case, it's the SALES table. The following example is the full SQL query:

SELECT
    ID
    , NAME
    , SALESAMOUNT
FROM
    SALES

When using an SQL query to import data, try to avoid using the wildcard character (*)
in your query. If you use the wildcard character (*) in your SELECT statement, you
import every column from the specified table, including columns that you don't need.

The following example shows the query using the wildcard character.

SELECT *
FROM
    SALES

The wildcard character (*) will import all columns within the Sales table. This method
isn't recommended because it will lead to redundant data in your data model, which
will cause performance issues and require extra steps to normalize your data for
reporting.
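As a rough illustration of the difference, the following sketch uses Python's built-in sqlite3 module as a stand-in for the SQL Server source. The table layout, including the extra REGION and NOTES columns, is invented for the example and is not part of the Tailwind Traders scenario.

```python
import sqlite3

# In-memory SQLite database standing in for the source system (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES (ID INTEGER, NAME TEXT, SALESAMOUNT REAL, "
    "REGION TEXT, NOTES TEXT)"  # REGION and NOTES are hypothetical extras
)
conn.execute("INSERT INTO SALES VALUES (1, 'Contoso', 250.0, 'West', 'n/a')")

# Wildcard query: every column comes back, whether the report needs it or not.
wildcard = conn.execute("SELECT * FROM SALES")
wildcard_columns = [col[0] for col in wildcard.description]

# Explicit query: only the three columns that the report uses.
explicit = conn.execute("SELECT ID, NAME, SALESAMOUNT FROM SALES")
explicit_columns = [col[0] for col in explicit.description]

print(wildcard_columns)
print(explicit_columns)
```

The explicit column list keeps the imported model narrow even if someone later adds columns to the source table.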

All queries should also have a WHERE clause. This clause filters the rows so that you
retrieve only the records that you want. In this example, if you want to get sales
data from January 1, 2020, onward, add a WHERE clause. The evolved query would look
like the following example.

SELECT
    ID
    , NAME
    , SALESAMOUNT
FROM
    SALES
WHERE
    OrderDate >= '2020-01-01'
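To see the row filter at work, here is a small sketch that again uses sqlite3 in place of SQL Server. The sample rows are made up, and the OrderDate values are stored as ISO 8601 text, which is an assumption of this example rather than a statement about the real SALES table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES (ID INTEGER, NAME TEXT, SALESAMOUNT REAL, OrderDate TEXT)"
)
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?, ?)",
    [
        (1, "Contoso", 100.0, "2009-06-15"),
        (2, "Fabrikam", 200.0, "2019-12-31"),
        (3, "Northwind", 300.0, "2020-01-01"),
        (4, "Contoso", 400.0, "2021-03-09"),
    ],
)

# ISO 8601 text dates sort chronologically, so a plain comparison works here.
# The cutoff is passed as a parameter instead of being pasted into the SQL text.
recent = conn.execute(
    "SELECT ID, NAME, SALESAMOUNT FROM SALES WHERE OrderDate >= ? ORDER BY ID",
    ("2020-01-01",),
).fetchall()

print(recent)
```

Writing the date in year-month-day order avoids the day/month ambiguity that a literal such as 1/1/2020 can carry on servers with different regional settings.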

It's a best practice to avoid doing this directly in Power BI. Instead, consider writing a
query like this in a view. A view is an object in a relational database, similar to a table.
Views have rows and columns, and can contain almost every operator in the SQL
language. If Power BI uses a view, when it retrieves data, it participates in query
folding, a feature of Power Query. Query folding will be explained later, but in short,
Power Query will optimize data retrieval according to how the data is being used
later.
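The view approach can be sketched as follows, with sqlite3 standing in for the relational source. The view name vw_RecentSales and the sample rows are hypothetical; in practice a database administrator would usually create the view on the server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES (ID INTEGER, NAME TEXT, SALESAMOUNT REAL, OrderDate TEXT)"
)
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?, ?)",
    [
        (1, "Contoso", 100.0, "2009-06-15"),
        (2, "Fabrikam", 300.0, "2020-04-02"),
    ],
)

# The filtering logic lives in the database as a view (hypothetical name).
conn.execute(
    "CREATE VIEW vw_RecentSales AS "
    "SELECT ID, NAME, SALESAMOUNT FROM SALES WHERE OrderDate >= '2020-01-01'"
)

# The reporting tool then reads the view like a table, with no custom SQL of its own.
rows = conn.execute(
    "SELECT ID, NAME, SALESAMOUNT FROM vw_RecentSales ORDER BY ID"
).fetchall()

print(rows)
```

Keeping the query in a view means the filter can be changed in one place on the server without editing the report.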
Get data from a NoSQL database

 5 minutes

Some organizations don't use a relational database but instead use
a NoSQL database. A NoSQL database (also referred to as non-SQL, not only SQL
or non-relational) is a flexible type of database that does not use tables to store data.

Scenario

Software developers at Tailwind Traders created an application to manage shipping
and tracking products from their warehouses that uses Cosmos DB, a NoSQL
database, as the data repository. This application uses Cosmos DB to store JSON
documents, which are open standard file formats that are primarily used to transmit
data between a server and web application. You need to import this data into a
Power BI data model for reporting.

Connect to a NoSQL database (Azure Cosmos DB)

In this scenario, you will use the Get data feature in Power BI Desktop. However, this
time you will select the More... option to locate and connect to the type of database
that you use. In this example, you will select the Azure category, select Azure
Cosmos DB, and then select Connect.

On the Preview Connector window, select Continue and then enter your database
credentials. In this example, on the Azure Cosmos DB window, you can enter the
database details. You can specify the Azure Cosmos DB account endpoint URL that
you want to get the data from (you can get the URL from the Keys blade of your
Azure portal). Alternatively, you can enter the database name, collection name or use
the navigator to select the database and collection to identify the data source.

If you are connecting to an endpoint for the first time, as you are in this example,
make sure that you enter your account key. You can find this key in the Primary
Key box in the Read-only Keys blade of your Azure portal.

Import a JSON file

JSON type records must be extracted and normalized before you can report on them,
so you need to transform the data before loading it into Power BI Desktop.

After you have connected to the database account, the Navigator window opens,
showing a list of databases under that account. Select the table that you want to
import. In this example, you will select the Product table. The preview pane only
shows Record items because all records in the document are represented as a
Record type in Power BI.

Select the Edit button to open the records in Power Query.

In Power Query, select the Expander button to the right side of
the Column1 header, which will display the context menu with a list of fields. Select
the fields that you want to load into Power BI Desktop, clear the Use original
column name as prefix checkbox, and then select OK.

Review the selected data to ensure that you are satisfied with it, then select Close &
Apply to load the data into Power BI Desktop.

The data now resembles a table with rows and columns. Data from Cosmos DB can
now be related to data from other data sources and can eventually be used in a
Power BI report.
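The expand step is easier to picture with a small sketch of what normalizing records involves. The following Python example flattens two invented product documents into rows and columns. The field names and the flatten helper are assumptions for illustration, not the Cosmos DB connector's actual behavior.

```python
import json

# Two hypothetical product documents, as a document store might return them.
raw_documents = """
[
  {"id": "1", "name": "Bike helmet", "warehouse": {"city": "Seattle", "bin": "A4"}},
  {"id": "2", "name": "Water bottle", "warehouse": {"city": "Portland"}}
]
"""

def flatten(record, parent_key=""):
    """Expand nested records into a single level of column names."""
    row = {}
    for key, value in record.items():
        column = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            row.update(flatten(value, column))
        else:
            row[column] = value
    return row

rows = [flatten(doc) for doc in json.loads(raw_documents)]

# Every row gets the same columns; fields missing from a document become None.
columns = sorted({column for row in rows for column in row})
table = [[row.get(column) for column in columns] for row in rows]

print(columns)
for line in table:
    print(line)
```

Documents in the same collection can have different fields, which is why the second row ends up with an empty value for the bin column.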
Get data from online services

 5 minutes

To support their daily operations, organizations frequently use a range of software
applications, such as SharePoint, OneDrive, Dynamics 365, Google Analytics and so
on. These applications produce their own data. Power BI can combine the data from
multiple applications to produce more meaningful insights and reports.

Scenario

Tailwind Traders uses SharePoint to collaborate and store sales data. It's the start of
the new financial year and the sales managers want to enter new goals for the sales
team. The form that the leadership uses exists in SharePoint. You're required to
establish a connection to this data within Power BI Desktop, so that the sales goals
can be used alongside other sales data to determine the health of the sales pipeline.

The following sections examine how to use the Power BI Desktop Get Data feature
to connect to data sources that are produced by external applications. To illustrate
this process, we've provided an example that shows how to connect to a SharePoint
site and import data from an online list.

Connect to data in an application

When connecting to data in an application, you would begin in the same way as you
would when connecting to the other data sources: by selecting the Get data feature
in Power BI Desktop. Then, select the option that you need from the Online
Services category. In this example, you select SharePoint Online List.

After you've selected Connect, you'll be asked for your SharePoint URL. This URL is
the one that you use to sign into your SharePoint site through a web browser. You
can copy the URL from your SharePoint site and paste it into the connection window
in Power BI. You don't need to enter your full URL file path; you only need to load
your site URL because, when you're connected, you can select the specific list that
you want to load. Depending on the URL that you copied, you might need to delete
the last part of your URL, as illustrated in the following image.

After you've entered your URL, select OK. Power BI needs to authorize the
connection to SharePoint, so sign in with your Microsoft account and then
select Connect.
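Trimming a copied page URL down to the site URL can be sketched in a few lines of Python. The site_url helper, the example address, and the assumption that sites live under /sites/<name> or /teams/<name> are all illustrative; check the layout of your own tenant before relying on a rule like this.

```python
from urllib.parse import urlsplit, urlunsplit

def site_url(copied_url):
    """Trim a copied SharePoint page URL to the site root (sketch only)."""
    parts = urlsplit(copied_url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: sites live under /sites/<name> or /teams/<name>.
    if len(segments) >= 2 and segments[0].lower() in ("sites", "teams"):
        path = "/" + "/".join(segments[:2])
    else:
        path = ""  # otherwise fall back to the root site
    # Drop the query string and fragment along with the rest of the path.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))

copied = (
    "https://contoso.sharepoint.com/sites/Sales/Lists/SalesGoals/"
    "AllItems.aspx?viewid=123"
)
print(site_url(copied))
```

The part that is removed is the path to the individual list and view, which you choose later in the Navigator window.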

Choose the application data to import

After Power BI has made the connection with SharePoint, the Navigator window
appears, as it does when you connect to other data sources. The window displays the
tables and entities within your SharePoint site. Select the list that you want to load
into Power BI Desktop. Similar to when you import from other data sources, you have
the option to automatically load your data into a Power BI model or launch the Power
Query Editor to transform your data before loading it.

For this example, you select the Load option.


Select a storage mode

 6 minutes

The most popular way to use data in Power BI is to import it into a Power BI dataset.
Importing the data means that the data is stored in the Power BI file and gets
published along with the Power BI reports. This process helps make it easier for you
to interact directly with your data. However, this approach might not work for all
organizations.

To continue with the scenario, you're building Power BI reports for the Sales
department at Tailwind Traders, where importing the data isn't an ideal method. The
first task you need to accomplish is to create your datasets in Power BI so you can
build visuals and other report elements. The Sales department has many different
datasets of varying sizes. For security reasons, you aren't allowed to import local
copies of the data into your reports, so directly importing data is no longer an
option. Therefore, you need to create a direct connection to the Sales department’s
data source. The following section describes how you can ensure that these business
requirements are satisfied when you're importing data into Power BI.

However, sometimes there may be security requirements around your data that
make it impossible to directly import a copy. Or your datasets may simply be too
large and would take too long to load into Power BI, and you want to avoid creating
a performance bottleneck. Power BI solves these problems by using the DirectQuery
storage mode, which allows you to query the data in the data source directly and not
import a copy into Power BI. DirectQuery is useful because it ensures you're always
viewing the most recent version of the data.

You can choose from three storage modes:

 Import
 DirectQuery
 Dual (Composite)

You can access storage modes by switching to the Model view, selecting a data
table, and in the resulting Properties pane, selecting the mode that you want to
use from the Storage mode drop-down list, as shown in the following visual.

Let’s take a closer look at the different types of storage modes.

Import mode

The Import mode allows you to create a local Power BI copy of your datasets from
your data source. You can use all Power BI service features with this storage mode,
including Q&A and Quick Insights. Data refreshes can be scheduled or on-demand.
Import mode is the default for creating new Power BI reports.

DirectQuery mode

The DirectQuery option is useful when you don't want to save local copies of your
data because your data won't be cached. Instead, you can query the specific tables
that you'll need by using native Power BI queries, and the required data will be
retrieved from the underlying data source. Essentially, you're creating a direct
connection to the data source. Using this model ensures that you're always viewing
the most up-to-date data, and that all security requirements are satisfied.
Additionally, this mode is suited for when you have large datasets to pull data from.
Instead of slowing down performance by having to load large amounts of data into
Power BI, you can use DirectQuery to create a connection to the source, solving data
latency issues as well.
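The practical difference between the two modes can be shown with a toy model. In this sketch, sqlite3 plays the role of the source database, a Python list plays the role of the imported copy, and the function names are invented. It shows only the freshness trade-off, not how Power BI stores or queries data internally.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE SALES (ID INTEGER, SALESAMOUNT REAL)")
source.execute("INSERT INTO SALES VALUES (1, 100.0)")

def total_from_source():
    """DirectQuery-style: ask the source every time a value is needed."""
    return source.execute("SELECT SUM(SALESAMOUNT) FROM SALES").fetchone()[0]

# Import-style: copy the rows once and answer from the local copy.
imported_rows = source.execute("SELECT ID, SALESAMOUNT FROM SALES").fetchall()

def total_from_import():
    return sum(amount for _, amount in imported_rows)

# A new sale arrives in the source after the import has run.
source.execute("INSERT INTO SALES VALUES (2, 50.0)")

print(total_from_import())  # reflects the data as of the last refresh
print(total_from_source())  # reflects the new row
```

The imported total stays at its old value until the copy is refreshed, while the direct query pays for freshness with a round trip to the source on every request.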

Dual (Composite) mode

In Dual mode, you can identify some data to be directly imported and other data that
must be queried. Any table that is brought in to your report is a product of both
Import and DirectQuery modes. Using the Dual mode allows Power BI to choose the
most efficient form of data retrieval.

For more information regarding Storage Modes, refer to Storage Modes.


Get data from Azure Analysis Services

 5 minutes

Azure Analysis Services is a fully managed platform as a service (PaaS) that provides
enterprise-grade data models in the cloud. You can use advanced mashup and
modeling features to combine data from multiple data sources, define metrics, and
secure your data in a single, trusted tabular semantic data model. The data model
provides an easier and faster way for users to perform ad hoc data analysis using
tools like Power BI.

To resume the scenario, Tailwind Traders uses Azure Analysis Services to store
financial projection data. You’ve been asked to compare this data with actual sales
data in a different database. Getting data from Azure Analysis Services server is
similar to getting data from SQL Server, in that you can:

 Authenticate to the server.
 Pick the model you want to use.
 Select which tables you need.

Notable differences between Azure Analysis Services and SQL Server are:

 Analysis Services models have calculations already created.
 If you don’t need an entire table, you can query the data directly. Instead of using
Transact-SQL (T-SQL) to query the data, like you would in SQL Server, you can use
Multidimensional Expressions (MDX) or Data Analysis Expressions (DAX).

Connect to data in Azure Analysis Services 

As previously mentioned, you use the Get data feature in Power BI Desktop. When
you select Analysis Services, you're prompted for the server address and the
database name with two options: Import and Connect live.

Connect live is an option for Azure Analysis Services. Azure Analysis Services uses
the tabular model and DAX to build calculations, similar to Power BI. These models
are compatible with one another. Using the Connect live option helps you keep the
data and DAX calculations in their original location, without having to import them
all into Power BI. Azure Analysis Services can have a fast refresh schedule, which
means that when data is refreshed in the service, Power BI reports will immediately
be updated, without the need to initiate a Power BI refresh schedule. This process
can improve the timeliness of the data in your report.

Similar to a relational database, you can choose the tables that you want to use. If
you want to directly query the Azure Analysis Services model, you can use DAX or
MDX.

You'll likely import the data directly into Power BI. An acceptable alternative is to
import all other data that you want (from Excel, SQL Server, and so on) into the Azure
Analysis Services model and then use a live connection. This approach simplifies your
solution by keeping the data modeling and DAX measures in one place.

For more information on connecting Power BI to Azure Analysis Services,
see Connect with Power BI documentation.

Fix performance issues

 10 minutes

Occasionally, organizations will need to address performance issues when running
reports. Power BI provides the Performance Analyzer tool to help fix problems and
streamline the process.

Consider the scenario where you're building reports for the Sales team in your
organization. You’ve connected to your data, which is in several tables within the Sales
team’s SQL database, by creating a data connection to the database through
DirectQuery. When you create preliminary visuals and filters, you notice that some
tables are queried faster than others, and some filters are taking longer to process
compared to others.

Optimize performance in Power Query

The performance in Power Query depends on the performance at the data source
level. Power Query offers a wide variety of data sources, and the
performance tuning techniques for each source are equally varied. For instance, if you
extract data from a Microsoft SQL Server, you should follow the performance tuning
guidelines for the product. Good SQL Server performance tuning techniques include
index creation, hardware upgrades, execution plan tuning, and data compression.
These topics are beyond the scope of this unit and are mentioned only as an example:
the better you know your data source, the more benefit you get when using Power BI
and Power Query.

Power Query takes advantage of good performance at the data source through a
technique called Query Folding.

Query folding

The query folding within Power Query Editor helps you increase the performance of
your Power BI reports. Query folding is the process by which the transformations and
edits that you make in Power Query Editor are simultaneously tracked as native
queries, or simple SELECT SQL statements, while you're actively making
transformations. The reason for implementing this process is to ensure that these
transformations can take place in the original data source server and don't
overwhelm Power BI computing resources.

You can use Power Query to load data into Power BI. Then use Power Query Editor to
transform your data, such as renaming or deleting columns, appending, parsing,
filtering, or grouping your data.

Consider a scenario where you’ve renamed a few columns in the Sales data and
merged a city and state column together in the “city state” format. Meanwhile, the
query folding feature tracks those changes in native queries. Then, when you load
your data, the transformations take place independently in the original source, which
ensures that performance is optimized in Power BI.

The benefits to query folding include:

 More efficiency in data refreshes and incremental refreshes. When you
import data tables by using query folding, Power BI is better able to allocate
resources and refresh the data faster because Power BI doesn't have to run
through each transformation locally.

 Automatic compatibility with DirectQuery and Dual storage modes. All
DirectQuery and Dual storage mode data sources must have the back-end
server processing abilities to create a direct connection, which means that query
folding is an automatic capability that you can use. If all transformations can be
reduced to a single SELECT statement, then query folding can occur.

The following scenario shows query folding in action. In this scenario, you apply a set
of queries to multiple tables. After you add a new data source by using Power Query,
and you're directed to the Power Query Editor, you go to the Query Settings pane
and right-click the last applied step, as shown in the following figure.

If the View Native Query option isn't available (not displayed in bold type), then
query folding isn't possible for this step, and you'll have to work backward in
the Applied Steps area until you reach the step in which View Native Query is
available (displays in bold type). This process will reveal the native query that is used
to transform the dataset.

Native queries aren't possible for the following transformations:

 Adding an index column
 Merging and appending columns of different tables with two different sources
 Changing the data type of a column

A good guideline to remember is that if you can translate a transformation into
a SELECT SQL statement, which includes operators and clauses such as GROUP BY,
ORDER BY, WHERE, UNION ALL, and JOIN, you can use query folding.
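The guideline can be made concrete with a toy model of folding. The sketch below is not how Power Query works internally. The step names and the fold_steps function are invented, and the model knows only three foldable step types. It illustrates the idea that leading steps collapse into one SELECT statement and that the first step without a SQL equivalent ends the folding.

```python
def fold_steps(table, steps):
    """Fold leading steps into one SELECT; return (sql, steps left to run locally)."""
    columns, filters, order = "*", [], None
    for index, (kind, argument) in enumerate(steps):
        if kind == "select_columns":
            columns = ", ".join(argument)
        elif kind == "filter_rows":
            filters.append(argument)
        elif kind == "sort_rows":
            order = argument
        else:
            # First step with no SQL equivalent in this toy model: folding stops.
            remaining = steps[index:]
            break
    else:
        remaining = []
    sql = f"SELECT {columns} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if order:
        sql += f" ORDER BY {order}"
    return sql, remaining

steps = [
    ("select_columns", ["ID", "NAME", "SALESAMOUNT"]),
    ("filter_rows", "SALESAMOUNT > 100"),
    ("sort_rows", "SALESAMOUNT DESC"),
    ("add_index_column", "Index"),   # treated as not foldable in this sketch
    ("filter_rows", "Index <= 10"),  # runs locally because folding already stopped
]

sql, local_steps = fold_steps("SALES", steps)
print(sql)
print(local_steps)
```

Everything after the step that breaks folding runs locally, even steps that could have folded on their own, which is one reason to place foldable steps as early as possible.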

While query folding is one option to optimize performance when retrieving,
importing, and preparing data, another option is query diagnostics.

Query diagnostics

Another tool that you can use to study query performance is query diagnostics. You
can determine what bottlenecks may exist while loading and transforming your data,
refreshing your data in Power Query, running SQL statements in Query Editor, and so
on.

To access query diagnostics in Power Query Editor, go to the Tools tab on the ribbon.
When you're ready to begin transforming your data or making other edits in Power
Query Editor, select Start Diagnostics in the Session Diagnostics section. When
you're finished, make sure that you select Stop Diagnostics.

Selecting Diagnose Step shows you the length of time that it takes to run that step,
as shown in the following image. This selection can tell you if a step takes longer to
complete than others, which then serves as a starting point for further investigation.
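The idea behind step-level timing can be sketched outside Power BI. The following Python example times three made-up transformation steps and reports the slowest one. The step names mirror typical Applied Steps entries, but the run_with_timings helper and the data are assumptions for illustration only.

```python
import time

def run_with_timings(rows, steps):
    """Apply each named step in order and record how long it took."""
    timings = []
    for name, step in steps:
        started = time.perf_counter()
        rows = step(rows)
        timings.append((name, time.perf_counter() - started))
    return rows, timings

# Hypothetical transformation steps over (city, state, amount) tuples.
steps = [
    ("Filtered Rows", lambda rows: [r for r in rows if r[2] > 0]),
    ("Merged Columns", lambda rows: [(f"{r[0]} {r[1]}", r[2]) for r in rows]),
    ("Sorted Rows", lambda rows: sorted(rows, key=lambda r: r[1])),
]

data = [("Seattle", "WA", 120.0), ("Portland", "OR", 0.0), ("Boise", "ID", 75.5)]
result, timings = run_with_timings(data, steps)

slowest = max(timings, key=lambda item: item[1])
for name, seconds in timings:
    print(f"{name}: {seconds:.6f}s")
print("Slowest step:", slowest[0])
```

As with the diagnostics output, the slowest step is only a starting point: it shows where to look, not why the step is slow.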

This tool is useful when you want to analyze performance on the Power Query side
for tasks such as loading datasets, running data refreshes, or running other
transformative tasks.

Other techniques to optimize performance

Other ways to optimize query performance in Power BI include:

 Process as much data as possible in the original data source. Power Query
and Power Query Editor allow you to process the data; however, the processing
power that is required to complete this task might lower performance in other
areas of your reports. Generally, a good practice is to process as much as
possible in the native data source.

 Use native SQL queries. When using DirectQuery for SQL databases, such as
the case for our scenario, make sure that you aren't pulling data from stored
procedures or common table expressions (CTEs).

 Separate date and time, if bound together. If any of your tables have
columns that combine date and time, make sure that you separate them into
distinct columns before importing them into Power BI. This approach will
increase compression abilities.
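The date and time advice rests on a general property of columnar storage: columns with fewer distinct values tend to compress better. The sketch below counts distinct values for an invented series of timestamps, one every 30 minutes over 30 days. The data is hypothetical, and the actual compression gain depends on the storage engine.

```python
from datetime import datetime, timedelta

# Hypothetical data: one order every 30 minutes for 30 days.
start = datetime(2020, 1, 1, 0, 0, 0)
stamps = [start + timedelta(minutes=30 * i) for i in range(48 * 30)]

# One combined column: every timestamp is its own distinct value.
combined = set(stamps)

# Two separate columns: far fewer distinct values in each.
dates = {stamp.date() for stamp in stamps}
times = {stamp.time() for stamp in stamps}

print(len(combined), len(dates), len(times))
```

Splitting the column turns 1,440 distinct values into 30 dates and 48 times, and the gap widens as the date range grows.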

For more information, refer to Query Folding Guidance and Query Folding.


Resolve data import errors

 7 minutes

While importing data into Power BI, you may encounter errors resulting from factors
such as:

 Power BI imports from numerous data sources.
 Each data source might have dozens (and sometimes hundreds) of different error
messages.
 Other components can cause errors, such as hard drives, networks, software services,
and operating systems.
 Data often can't comply with any specific schema.

The following sections cover some of the more common error messages that you
might encounter in Power BI.

Query timeout expired

Relational source systems often have many people who are concurrently using the
same data in the same database. Some relational systems and their administrators
seek to limit a user from monopolizing all hardware resources by setting a query
timeout. These timeouts can be configured for any timespan, from as little as five
seconds to as much as 30 minutes or more.

For instance, if you’re pulling data from your organization’s SQL Server, you might
see the error shown in the following figure.

Power BI Query Error: Timeout expired

This error indicates that you’ve pulled too much data according to your
organization’s policies. Administrators incorporate this policy to avoid slowing down
a different application or suite of applications that might also be using that database.

You can resolve this error by pulling fewer columns or rows from a single table. While
you're writing SQL statements, it might be a common practice to include groupings
and aggregations. You can also join multiple tables in a single SQL statement.
Additionally, you can perform complicated subqueries and nested queries in a single
statement. These complexities add to the query processing requirements of the
relational system and can greatly increase the time that the query takes to run.

If you need the rows, columns, and complexity, consider taking small chunks of data
and then bringing them back together by using Power Query. For instance, you can
retrieve half the columns in one query and the other half in a different query. Power
Query can merge those two queries back together after you're finished.
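The split-and-merge approach can be sketched as follows, with sqlite3 standing in for the source and plain Python doing the job that a merge step in Power Query would do. The table layout and sample rows are invented. The important detail is that both queries include the key column (ID here) so that the halves can be matched again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SALES (ID INTEGER PRIMARY KEY, NAME TEXT, SALESAMOUNT REAL, "
    "REGION TEXT, OrderDate TEXT)"
)
conn.executemany(
    "INSERT INTO SALES VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Contoso", 250.0, "West", "2020-02-01"),
        (2, "Fabrikam", 125.5, "East", "2020-02-03"),
    ],
)

# Two narrower queries; each repeats the key column so the halves can be rejoined.
first_half = conn.execute(
    "SELECT ID, NAME, SALESAMOUNT FROM SALES ORDER BY ID"
).fetchall()
second_half = conn.execute(
    "SELECT ID, REGION, OrderDate FROM SALES ORDER BY ID"
).fetchall()

# Merge on ID: look up the second half of each row by its key.
lookup = {row[0]: row[1:] for row in second_half}
merged = [row + lookup[row[0]] for row in first_half]

for row in merged:
    print(row)
```

Each query is lighter for the source to run, at the cost of doing the merge work on the Power BI side.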

We couldn't find any data formatted as a table

Occasionally, you may encounter the “We couldn’t find any data formatted as a
table” error while importing data from Microsoft Excel. Fortunately, this error is
self-explanatory. Power BI expects to find data formatted as a table from Excel. The error
event tells you the resolution. Perform the following steps to resolve the issue:

1. Open your Excel workbook, and highlight the data that you want to import.

2. Press the Ctrl+T keyboard shortcut. The first row will likely be your column
headers.

3. Verify that the column headers reflect how you want to name your columns.
Then, try to import data from Excel again. This time, it should work.

Couldn't find file

While importing data from a file, you may get the "Couldn't find file" error.

Usually, this error is caused by the file moving locations or the permissions to the file
changing. If the cause is the former, you need to find the file and change the source
settings.

1. Open Power Query by selecting the Transform Data button in Power BI.

2. Highlight the query that is creating the error.

3. On the right, under Query Settings, select the gear icon next to Source.

4. Change the file location to the new location.

Data type errors

Sometimes, when you import data into Power BI, the columns appear blank. This
situation happens because of an error in interpreting the data type in Power BI. The
resolution to this error is unique to the data source. For instance, if you're importing
data from SQL Server and see blank columns, you could try to convert to the correct
data type in the query.

Instead of using this query:

SELECT CustomerPostalCode FROM Sales.Customers

Use this query:

SELECT CAST(CustomerPostalCode AS varchar(10)) AS CustomerPostalCode FROM Sales.Customers

By specifying the correct type at the data source, you eliminate many of these
common data source errors.
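The effect of the cast can be seen in a small sketch. It uses sqlite3 rather than SQL Server, so the table is named Customers without the Sales schema, and it assumes for illustration that the postal codes were stored as numbers at the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table in which postal codes were stored as numbers.
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER, CustomerPostalCode INTEGER)"
)
conn.execute("INSERT INTO Customers VALUES (1, 98052)")

# Without a cast, the client receives whatever type the source column has.
raw = conn.execute("SELECT CustomerPostalCode FROM Customers").fetchone()[0]

# With a cast, the query hands over text, and the alias keeps the column name.
cast = conn.execute(
    "SELECT CAST(CustomerPostalCode AS varchar(10)) AS CustomerPostalCode "
    "FROM Customers"
).fetchone()[0]

print(type(raw).__name__, raw)
print(type(cast).__name__, cast)
```

Stating the type in the query removes the guesswork on the client side, which is the point of the CAST in the example above.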

You may encounter different types of errors in Power BI that are caused by the
diverse data source systems where your data resides.

If you experience an error not covered here, you can search Microsoft documentation for
the error message and the resolution.

Exercise - Prepare data in Power BI Desktop
 45 minutes
This unit includes a lab to complete.

Use the free resources provided in the lab to complete the exercises in this unit. You
will not be charged.

Microsoft provides this lab experience and related content for educational purposes.
All presented information is owned by Microsoft and intended solely for learning
about the covered products and services in this Microsoft Learn module.


The estimated time to complete the exercise is 45 minutes.

 Note

A virtual machine containing the client tools you need is provided, along with the
exercise instructions. Use the button above to launch the virtual machine.

A limited number of concurrent sessions are available - if the hosted environment is
unavailable, try again later.

Alternatively, you can use these setup instructions to create your own lab
environment, then follow these exercise instructions.
