Power BI

power bi

Uploaded by

pepote21
Introduction to Power BI


Agenda
1. Introduction to Power BI
▹ Components
▹ Architecture
▹ Product Portfolio
▹ Life Hack: Guide to install Pro
2. Desktop Features
3. Power BI Services and Integration with Various Apps
4. Power Query Editor: The Heart of Power BI
5. Understanding DAX
6. Power BI Functions
7. Power BI Visuals
8. Power BI Charts
9. Power BI KPIs
10. Administration Options
11. Data Visualization
12. Exploratory Data Analysis
13. Project: Subscriber Churn
Power BI
▸ Business Analytics Solution that lets you visualize the data
▸ Share insights to stakeholders and business owners
▸ Components
▹ Power BI Desktop
▹ Power BI service (SaaS –Software as a Service)
▹ Power BI Mobile Apps
▸ Common Workflow
▹ Begins by connecting to data sources and building a report in Power BI Desktop
▹ Publish report from Power BI Desktop to the Power BI service
▹ Share it to end users with the Power BI Service
▹ Mobile Devices can view and interact with the report
Architecture
Cloud Services
▸ Out of the box SaaS content packs
▸ Real time dashboards & interactive reports
▸ Natural Language query
▸ Custom visualizations
▸ Native Office 365 integration
Clients
▸ Mobile, Web, Excel, Cortana
On Premises
Product Portfolio
▸ Desktop (Author)
▹ Free data analysis and reporting authoring tool
▸ Service (Share & Collaborate)
▹ Cloud based modern business analytics solution
▸ Premium (Large Scale Deployments)
▹ Dedicated capacity for increased performance
▸ Report Server (Share & Collaborate)
▹ On-premises report server
▸ Embedded (App Dev)
▹ Visual analytics embedded in your applications
Life Hack
▸ Download Power BI Desktop:
▹ https://powerbi.microsoft.com/en-us/downloads/
▸ How to sign up for Power BI without a work email?
▹ Use incognito browser
▹ Log in to office.com
▹ Enterprise 🡪 Plans & Pricing 🡪 E3 or E5 Account
▹ Try for free!
2. Desktop Features
Power BI Desktop
Get Data 🡪 Analyze 🡪 Visualize 🡪 Publish 🡪 Collaborate
Power BI Desktop
Get Data
Easily connect, clean, and mashup data

▸ Connect to 80+ data sources, both on-premises and cloud


▸ Shape, transform, and clean data for analysis
▸ Live connectivity to on-premises and cloud data sources
▸ Extend with custom data connectors for any data source
▸ Prep your data using the familiar Power Query experience on the web
▸ Get started quickly with a common data model
▸ Extend self-service prep to Azure Data Lake Storage
Power BI Desktop
Analyze
Build powerful models and flexible measures

▸ Automatically create model when connecting to data


▸ High performance, in-memory engine
▸ Point and click analysis with Quick measures, clustering & binning
▸ Create powerful measures with familiar DAX (Data Analysis Expressions)
formulas
Power BI Desktop
Visualize
Create stunning interactive reports

▸ Author reports using 150+ visuals via a drag-drop canvas


▸ Explore data across multiple interactive visualizations
▸ Provide insights in the context of the business with Custom Visuals
▸ Visualize your data story with bookmarks and custom navigation
Power BI Desktop
Publish
Share insights with others

▸ Publish directly to the cloud or on-premises


▸ Automatic data refresh, so the reports are always up to date
▸ Package your reports in apps for easy consumption and control
▸ Manage analytics content with admin and governance tools
Power BI Desktop
Collaborate
Empower your organization with self-service analytics
3. Power BI Services & Integration with other Apps
Power BI Services
Data Sources
▸ Secure, live connection to the data sources on-premises and in the cloud
▸ Connect, Access, Publish
Power BI Gateways
▸ Keep data anywhere
▸ Keep data fresh
Diagram labels: Power BI Desktop, Your Organization’s Data
Integration with Power BI
Deliver insights through other services
1. Collaborate and share insights with teams in your organization using existing services
2. Fully interactive reports integrated into the service
Excel and Power BI
Easily aggregate objects from multiple Excel files on the same dashboard in Power BI
▸ Analyze in Excel
▹ Use Excel to view and interact with a dataset you have in Power BI
▸ Import Excel data into Power BI
▹ Connect to the data in your workbook so you can create Power BI reports and dashboards
▸ Upload your Excel file to Power BI
▹ Bring your Excel file into Power BI to view and interact with it just as you would in Excel Online. Pin ranges to Dashboards
4. Power Query Editor: The Heart of Power BI
User Experience
▸ Power Query Editor represents
the user interface
▸ Modify or Add Queries
▸ Manage Queries by grouping or
adding descriptions to query
steps
▸ Visualize queries and their
structure
▸ Five Distinct Components
Data Profiling Tools
▸ Provide new and intuitive ways
to clean, transform, and
understand data
▸ Includes:
▹ Column Quality
▹ Column Distribution
▹ Column Profile
Group By Dialog
Set the Group By operation to:
▸ Group by the Geography
▸ Count the number of supplier rows
per Geography
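
The Group By operation above can be sketched outside Power Query as a simple count per key. This is only an illustration in Python, not Power Query's own M code; the supplier rows and the column names are made up for the example.

```python
from collections import Counter

# Hypothetical supplier rows; names and values are assumptions for illustration.
suppliers = [
    {"Supplier": "A", "Geography": "Europe"},
    {"Supplier": "B", "Geography": "Asia"},
    {"Supplier": "C", "Geography": "Europe"},
]


def count_rows_by(rows, key):
    """Group rows by `key` and count the rows in each group."""
    return dict(Counter(row[key] for row in rows))


supplier_counts = count_rows_by(suppliers, "Geography")
print(supplier_counts)
```

Each distinct Geography becomes one output row, and the count column holds the number of supplier rows that fell into it.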
Applied Steps
▸ Every step performed in Power BI is logged under Applied Steps
▸ Steps can be added or deleted at any time during the process
Appending vs Merging
▸ Merging
▹ When you have one or more
columns that you’d like to
add to another query
▸ Appending
▹ When you have additional
rows of data that you’d like
to add to an existing query
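
A minimal Python sketch of the difference, assuming small hypothetical tables: appending stacks rows, merging adds columns by matching on a key. The helper names `append_queries` and `merge_queries` are invented for this example and are not Power Query functions.

```python
def append_queries(first, second):
    """Appending: stack the rows of two queries that share the same columns."""
    return first + second


def merge_queries(left, right, key):
    """Merging (left outer join): add the columns of `right` to matching rows of `left`.

    Assumes `key` is unique in `right`; unmatched left rows are kept unchanged.
    """
    lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = lookup.get(row[key], {})
        extra = {col: val for col, val in match.items() if col != key}
        merged.append({**row, **extra})
    return merged


# Hypothetical data for illustration only.
orders_2019 = [{"orderid": 1, "custid": "C1"}]
orders_2020 = [{"orderid": 2, "custid": "C2"}]
customers = [{"custid": "C1", "country": "USA"}]

all_orders = append_queries(orders_2019, orders_2020)      # more rows
with_country = merge_queries(all_orders, customers, "custid")  # more columns
print(with_country)
```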
5. Understanding Data Analysis Expressions
What do you need to know?
▸ Contexts
▹ Row Context
▹ Filter Context
▸ Formatting
▸ Best Practice
▸ X vs non-X functions
▸ Time Intelligence functions
▸ Functions
▹ SUM
▹ AVERAGE
▹ MIN
▹ MAX
▹ COUNT
▹ COUNTROWS
▹ CALCULATE
▹ FILTER, etc.
DAX: Data Analysis Expressions
▸ Two Business Logics
▹ Measures
▹ Calculated Columns
▸ Difference?
▹ Context of Evaluation
▹ Measures: evaluated in the context of the cell evaluated in a report or in a DAX query
▹ Calculated Columns: computed at the row level within the table they belong to
Measures
▸ Represents a single value per data model
▸ Computed at run time
▸ Dynamic results, based on filters
▸ Filter Context
▸ Not attached to any specific table
TotalQuantity := SUM(Sales[Quantity])

Calculated Columns
▸ Represents a single value per row
▸ Computed when the data is loaded or refreshed, then stored in the model
▸ Dynamic results, based on rows
▸ Row Context
Tenure_Months := Churn[Tenure]*12
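
To make the contrast concrete, here is a rough Python analogy (not DAX): the calculated column is computed once per row and stored with the table, while the measure is recomputed on demand for whichever rows the current filters leave visible. The sample rows and the function name `total_tenure_months` are assumptions for illustration.

```python
# Hypothetical Churn table; values are made up for illustration.
churn = [
    {"Customer": "C1", "Geography": "France", "Tenure": 2},
    {"Customer": "C2", "Geography": "Spain", "Tenure": 5},
    {"Customer": "C3", "Geography": "France", "Tenure": 1},
]

# Calculated column: evaluated row by row (row context) and stored in the table.
for row in churn:
    row["Tenure_Months"] = row["Tenure"] * 12


def total_tenure_months(rows, **filters):
    """Measure: evaluated at query time over the rows that pass the filters."""
    visible = [r for r in rows if all(r[col] == val for col, val in filters.items())]
    return sum(r["Tenure_Months"] for r in visible)


print(total_tenure_months(churn))                      # no filter applied
print(total_tenure_months(churn, Geography="France"))  # filtered by a slicer
```

The stored column takes space in the model; the measure takes none but costs CPU each time a visual asks for it.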


Implicit Measures
If we place a column (including a calculated column) in the Values well of a visual, Power BI aggregates it automatically, which creates an implicit measure.
▸ For example:
▹ If we have columns such as:
▹ Tenure in years,
▹ Monthly average usage
▹ Goal: to compute the total usage for that customer
Churn[Tenure_Months] = Churn[Tenure]*12
▹ Total usage would be:
Churn[Total Usage] = Churn[Tenure_Months] * Churn[Monthly_Average_Usage]
▹ A change in the primitive column (Tenure) propagates to the Total Usage column
DAX is great at two things
in particular
Aggregations & Filtering

Aggregations: Combining a group of values into one value

Examples: Sum, Average, Min, Max, Distinct Count


Power BI: Example
▸ Let’s do a SUM(Column). How do we check that it is correct?
▸ Compare the result with the equivalent query on the source database, e.g. SELECT SUM(quantity) FROM tablename;
6. Power BI Functions
Power BI: Functions
SUM
AVERAGE
MIN
MAX
COUNT
COUNTROWS
DATEDIFF
DATEADD
Average and Datediff
Probation Period = DATEDIFF(column1, column2, DAY)
Average = AVERAGE(column)
Calculated Table
Dates = CALENDAR(start_date, end_date)
▸ Creates a dates table with one date per day between the specified start and end dates (both inclusive)
▸ Also creates a Date Hierarchy
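
As a rough Python analogy of what such a dates table contains (this is not DAX, and the function names are invented for the sketch): one entry per day between two inclusive bounds, plus the year, quarter, month and day parts that a date hierarchy exposes.

```python
from datetime import date, timedelta


def calendar(start, end):
    """Return one date per day from `start` to `end`, both inclusive."""
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]


def date_hierarchy(day):
    """Split a date into the usual hierarchy levels."""
    return {
        "Year": day.year,
        "Quarter": (day.month - 1) // 3 + 1,
        "Month": day.month,
        "Day": day.day,
    }


dates = calendar(date(2020, 1, 1), date(2020, 1, 31))
print(len(dates), date_hierarchy(dates[0]))
```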
Contexts
▸ Two different contexts: 1. Row context, 2. Filter context
Row Context
▸ We've been using row context for all our calculated columns so far; let's revisit our first DAX

Tenure in Years = ROUND(Churn_Modelling[Tenure]/12,2)

▸ Notice we expect a value per row in a table
▸ This runs at import and gets stored
▸ Might increase file size
Filter Context
▸ Easy to show with measures
Calculate: Breaking out of the filter context
Total Sales - Beverages =
CALCULATE(
    SUM('Sales OrderDetails'[Order Line Total]),
    'Production Categories'[categoryname] = "Beverages"
)
Filter
Number of US Orders =
CALCULATE(
    COUNT('Sales OrderDetails'[orderid]),
    FILTER('Sales Customers', 'Sales Customers'[country] = "USA")
)

Number of Orders = COUNT('Sales Orders'[orderid])
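
The idea of filter context, and of CALCULATE replacing part of it, can be sketched in Python. This is an analogy only: `evaluate` and `calculate` are invented helper names, the filter context is modelled as a plain dictionary of column values, and the rows are made up.

```python
# Hypothetical order lines; values are made up for illustration.
order_details = [
    {"orderid": 1, "category": "Beverages", "line_total": 100.0},
    {"orderid": 2, "category": "Condiments", "line_total": 40.0},
    {"orderid": 3, "category": "Beverages", "line_total": 60.0},
]


def total_sales(rows):
    """The measure: sum of line totals over whatever rows it is given."""
    return sum(r["line_total"] for r in rows)


def evaluate(measure, table, context):
    """Evaluate a measure over the rows allowed by the current filter context."""
    rows = [r for r in table if all(r[col] == val for col, val in context.items())]
    return measure(rows)


def calculate(measure, table, context, **overrides):
    """Evaluate a measure after replacing filters on the overridden columns."""
    return evaluate(measure, table, {**context, **overrides})


cell_context = {"category": "Condiments"}  # e.g. the row of a matrix visual
plain = evaluate(total_sales, order_details, cell_context)
beverages = calculate(total_sales, order_details, cell_context, category="Beverages")
print(plain, beverages)
```

The plain measure follows the cell it sits in, while the overridden version always reports Beverages, whichever category the cell belongs to.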
Variables
VAR myVar = 1
RETURN myVar + 25
▸ VAR: data type (variable)
▸ myVar: variable name
▸ 1: variable value
If-Else and Nested If Blocks
▸ Similar in concept to other programming languages.

Age_Bins = IF(Churn_Modelling[Age]>=60, "Above 60", "Below 60")
Time Intelligence Functions
▸ Enables user to manipulate data using time periods
such as years, quarters, months, and days
▸ Creating calculations over those time periods
▸ Most common time periods:
▹ Year-to-Date
▹ Quarter-to-Date
▹ Month-to-Date
▹ Last Year
▹ Full Year
▹ Rolling 12 Months
Time Intelligence: TOTALYTD
YTD Total Sales =
TOTALYTD(
    SUM('Sales OrderDetails'[Order Line Total]),
    Dates[Date].[Date]
)
Time Intelligence: PREVIOUSMONTH
Total Sales Previous Month =
CALCULATE(
    SUM('Sales OrderDetails'[Order Line Total]),
    PREVIOUSMONTH(Dates[Date])
)
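
What these two calculations return can be illustrated with a small Python sketch. It is an analogy, not DAX: the sales rows are invented, and `as_of` stands in for the last date visible in the current filter context.

```python
from datetime import date

# Hypothetical (order date, line total) pairs for illustration.
sales = [
    (date(2019, 12, 15), 50.0),
    (date(2020, 1, 10), 100.0),
    (date(2020, 2, 5), 200.0),
    (date(2020, 3, 20), 300.0),
]


def total_ytd(rows, as_of):
    """Sum everything from 1 January of as_of's year up to and including as_of."""
    return sum(amount for day, amount in rows
               if day.year == as_of.year and day <= as_of)


def total_previous_month(rows, as_of):
    """Sum everything in the calendar month before as_of's month."""
    if as_of.month > 1:
        year, month = as_of.year, as_of.month - 1
    else:
        year, month = as_of.year - 1, 12  # January rolls back to December
    return sum(amount for day, amount in rows
               if (day.year, day.month) == (year, month))


print(total_ytd(sales, date(2020, 2, 29)))
print(total_previous_month(sales, date(2020, 3, 1)))
```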
non-X vs X functions (SUM vs SUMX)

SUM is an aggregator function. It works like a measure, calculating based on the current filter context.

SUMX is an iterator function. It works row by row. SUMX has awareness of rows in a table, and can reference the intersection of each row with any columns in the table.
non-X vs X functions (SUM vs SUMX): An Example
Total Sales SUMX =
SUMX(
    'Sales OrderDetails',
    'Sales OrderDetails'[qty] * 'Sales OrderDetails'[unitprice]
)

Total Sales = SUM('Sales OrderDetails'[Order Line Total])
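
The same distinction in a Python analogy (not DAX): the aggregator needs a column that already exists, while the iterator evaluates an expression for each row and then sums the results. The rows below are made up, with `line_total` assumed to equal qty times unitprice.

```python
# Hypothetical order lines; line_total is assumed to equal qty * unitprice.
order_details = [
    {"qty": 2, "unitprice": 10.0, "line_total": 20.0},
    {"qty": 1, "unitprice": 5.5, "line_total": 5.5},
    {"qty": 3, "unitprice": 4.0, "line_total": 12.0},
]


def sum_column(rows, column):
    """Aggregator: add up an existing column."""
    return sum(row[column] for row in rows)


def sumx(rows, expression):
    """Iterator: evaluate `expression` for each row, then add up the results."""
    return sum(expression(row) for row in rows)


total_sales = sum_column(order_details, "line_total")
total_sales_sumx = sumx(order_details, lambda row: row["qty"] * row["unitprice"])
print(total_sales, total_sales_sumx)
```

With the iterator, the stored Order Line Total column is not needed at all, which can save model size at the cost of more work at query time.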
AVERAGE, AVERAGEA, AVERAGEX
▸ AVERAGE → Returns the arithmetic mean of the numbers in a column
▸ AVERAGEA → Also handles non-numeric values: text counts as 0, TRUE as 1, FALSE as 0
▸ AVERAGEX → Evaluates an expression for each row of a table, then averages the results
▹ Also an iterator function
▹ Works row by row
▹ Has awareness of rows in a table
▹ Can reference the intersection of each row with any columns in the table
Tool Tips and Drill Throughs
Best Practice:
Organize your Code
▸ Create a separate table for measures
▸ Limit Visuals: visuals interact with each other, so a page with many visuals can take a long time to refresh. Tooltips and drill through can be used instead.
▸ Do as much of the required data processing as possible in the original source
▸ Certified Visuals are recommended
▸ Use a lighter background
Data Types
▸ Numeric
▸ String
▸ Bool
▸ DateTime

▸ If a function is expecting a numeric value but gets a string, it won't work. Clean up the model and watch it start working!
▸ Correct data types use less space and memory in your model
▸ Correct data types improve performance
Relationship
Manipulating the Relationship
Total Sales By Ship Year =
CALCULATE(
    SUM('Sales OrderDetails'[Order Line Total]),
    USERELATIONSHIP('Sales Orders'[shippeddate], Dates[Date])
)
7. Power BI Visuals
Building Blocks of Power BI
▸ Visualizations: A visual representation of data is called a visualization. For example, a chart or a graph can be used to represent data visually.
▸ Datasets: A dataset is a collection of data or information.
▸ Reports: A collection of visualizations that appear together on one or more pages. It is a collection of items that have a common motive.
▸ Dashboards: A single page interface that uses the most important elements of a report to tell a story.
▸ Tiles: A tile is a single visualization found in a report or on a dashboard.
8. Power BI Charts
Different Charts in Power BI
1. Bar, Column, Line & Area Charts
2. Combination Charts
3. Pie Charts, Doughnut Charts
4. Maps, Funnel Charts
5. Gauge, Cards, Tables & Matrices

9. Key Performance Indicators
What is a KPI?
▸ A key performance indicator (KPI) is a visual cue that communicates the amount of progress made toward a target.
When should we use KPIs?
▸ Target
▸ Trend
Requirements for KPIs
▸ Base measure
▸ Target measure
▸ Threshold
KPI Visualizations
10. Edit Interactions
11. Formatting Options
12. Security in Power BI
13. Administration Options
Different Roles in Power BI
1. Admin
2. Member
3. Contributor
4. Viewer
Link: https://docs.microsoft.com/en-us/power-bi/collaborate-share/service-roles-new-workspaces
14. Data Visualization
“Data visualization helps to bridge the gap between numbers and words”
- Brie E. Anderson, Digital Marketer and Data Scientist at BEAST Analytics
Data Visualization
Giving visual context to information to help identify and infer trends, patterns, and outliers in data sets
A picture is worth a thousand words
A complex idea can be conveyed with just a single still image, which makes it possible to absorb large amounts of data quickly
Importance of Data Visualization
▸ Data is only useful if we can learn from it
▸ It delivers data with efficiency, clarity and effectiveness
▸ Can identify patterns, e.g.
▹ Correlations
▹ Trends over time
▹ Frequency
▸ Analyze large data sets and have data-driven decision management
Data Visualization Techniques
▸ Histograms: Measure frequency distribution of data
▸ Area / Bar Charts: Represent no. of observations for different categories
▸ Pie Charts: Represent the percentage of data by each category
▸ Pair-Plots: Bivariate distribution of datasets. Shows the pairwise relationship between variables
▸ Heatmaps: Light/warm colours to indicate low- and high-value points. Humans interpret colour better than numbers
▸ Fever Chart: Time-series chart for change of data over a period of time
12. Exploratory Data Analysis
What is EDA?
▸ Deals with the process of performing initial investigations on data with
the help of summary statistics and graphical representations
▹ To discover patterns
▹ Spot anomalies
▹ Test hypotheses
▹ Check assumptions
Outline of Performing EDA
1. What question(s) are you trying to solve (or prove wrong)?
2. What kind of data do you have and how do you treat different
types?
3. What’s missing from the data and how do you deal with it?
4. Where are the outliers and why should you care about them?
5. How can you add, change or remove features to get more out
of your data?
Steps Involved in EDA
▸ Data Sourcing
▸ Data Cleaning
▸ Numerical Data Analysis
▸ Categorical Data Analysis
▸ Derived Metrics
Data Cleaning: Handling Missing Values
1. Delete rows/columns
▹ Rows: can be deleted if they have an insignificant no. of missing values
▹ Columns: can be deleted if they have >75% missing values
2. Replace with mean/median/mode
▹ Can be used on an independent variable when it is numerical
▹ Categorical features: Apply mode method
3. Algorithm Imputation
▹ Machine learning algorithms e.g. KNN, Naïve Bayes, Random Forest
4. Predicting the missing values
▹ Training set: Data set with no missing values
▹ Testing set: Data set with missing values
▹ Target variable: Missing values
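
The first two options above can be sketched with the Python standard library. This is a minimal illustration under assumptions: missing values are represented as `None`, the sample columns are invented, and the helper names are not from any library.

```python
from statistics import mean, median, mode


def missing_share(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def impute(values, strategy):
    """Replace missing entries with strategy(present values), e.g. mean or mode."""
    present = [v for v in values if v is not None]
    fill = strategy(present)
    return [fill if v is None else v for v in values]


# Hypothetical columns for illustration.
ages = [20, None, 40, 30]                         # numerical: mean or median
geography = ["France", None, "Spain", "France"]   # categorical: mode
mostly_missing = [None, None, None, None, 5]      # candidate for deletion

ages_filled = impute(ages, mean)
geography_filled = impute(geography, mode)
drop_column = missing_share(mostly_missing) > 0.75  # the >75% rule from the slide
print(ages_filled, geography_filled, drop_column)
```

Passing `median` instead of `mean` is the usual choice when the column contains outliers, since the median is less affected by them.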
Types of Data
▸ Qualitative: A variable to describe the quality of the population
▹ Nominal
▹ Ordinal
▸ Quantitative: A variable to quantify the population
▹ Discrete
▹ Continuous

Qualitative
▸ Nominal
▹ Represents qualitative info without order
▹ Value represents discrete units
▹ e.g. Gender: M/F
▸ Ordinal
▹ Represents qualitative info with order
▹ Indicates measurement classifications which are different and can be ranked
▹ e.g. Economic Status: Low/Medium/High

Quantitative
▸ Discrete
▹ Only takes counted values, not decimal values
▹ e.g. Number of students in a class
▸ Continuous
▹ Numbers within a range of values
▹ e.g. Height

Derived Metrics
Create a new variable from the existing variables to get insightful information from the data
▸ Feature Binning
▸ Feature Encoding
▸ From Domain Knowledge
▸ Calculated from Data
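
Two of these derived metrics, binning and encoding, are easy to sketch in Python. The binning rule reuses the 60-year cut-off from the Age_Bins example earlier in the deck; the one-hot scheme and the function names are assumptions chosen for illustration, and other encodings exist.

```python
def bin_age(age):
    """Feature binning: turn a numeric age into a two-level category."""
    return "Above 60" if age >= 60 else "Below 60"


def one_hot(values):
    """Feature encoding: one 0/1 indicator per distinct category."""
    categories = sorted(set(values))
    return [{cat: int(value == cat) for cat in categories} for value in values]


age_bins = [bin_age(age) for age in [25, 60, 71]]
encoded = one_hot(["France", "Spain", "France"])
print(age_bins)
print(encoded)
```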
Handle Missing Values
Delete Rows/Columns
This is the method most commonly used to handle missing values. Rows can be deleted if they have an insignificant number of missing values. Columns can be deleted if they have more than 75% missing values.

Replacing with mean/median/mode
This method can be used on an independent variable when it is numerical. On categorical features we apply the mode method to fill the missing values.

Algorithm Imputation
Some machine learning algorithms can handle missing values in the dataset, such as KNN, Naïve Bayes, and Random Forest.

Predicting the missing values
A prediction model is one of the more advanced methods to handle missing values. In this method the rows with no missing values become the training set, the rows with missing values become the test set, and the missing values are treated as the target variable.
Feature Scaling Technique

Standardization (Standard Scaler)
Standard scaler ensures that for each feature the mean is zero and the standard deviation is 1, bringing all features to the same magnitude. In simple words, standardization rescales your features based on the standard normal distribution.

Normalization (Min-Max Scaler)
Normalization rescales your features to the range 0 to 1.
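
Both scalers reduce to one line of arithmetic per value, sketched here with the Python standard library. The population standard deviation is assumed for the standard scaler, and neither function guards against a constant feature, which would cause a division by zero.

```python
from statistics import mean, pstdev


def standardize(values):
    """Standard scaler: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Min-max scaler: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


tenure = [10, 20, 30]  # hypothetical feature values
z_scores = standardize(tenure)
scaled = min_max(tenure)
print(z_scores, scaled)
```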
Outlier Treatment
Outliers are the most extreme values in the data. They are abnormal observations that deviate from the norm and do not fit the normal behaviour of the data.

Detect outliers using the following methods:
1. Boxplot
2. Histogram
3. Scatter plot
4. Z-score
5. Interquartile range (values more than 1.5 times the IQR beyond the quartiles)

Handle outliers using the following methods:
1. Remove the outliers.
2. Replace outliers with suitable values by using the following methods:
• Quantile method
• Interquartile range
3. Use ML models which are not sensitive to outliers, like KNN, Decision Tree, SVM, Naïve Bayes, Ensemble methods
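
The two numeric detection rules can be sketched with the Python standard library. The readings are invented, the 1.5 multiplier is the one quoted on the slide, and the z-score cut-off is an assumption: 3 is a common default, but in a very small sample no point can reach it, so the usage below passes 2.

```python
from statistics import mean, pstdev, quantiles


def iqr_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Flag values whose distance from the mean exceeds threshold std devs."""
    mu, sigma = mean(values), pstdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


readings = [10, 12, 11, 13, 12, 11, 10, 12, 100]  # hypothetical data
print(iqr_outliers(readings))
print(zscore_outliers(readings, threshold=2.0))
```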
Handle Invalid Values
• Encode Unicode properly: In case the data is being read as junk characters, try to change the encoding, e.g. CP1252 instead of UTF-8.

• Convert incorrect data types: Correct the incorrect data types to the correct data types for ease of analysis. E.g. if numeric values are stored as strings, it would not be possible to calculate metrics such as mean, median, etc. Some of the common data type corrections are: string to number: "12,300" to "12300"; string to date: "2013-Aug" to "2013/08"; number to string: "PIN Code 110001" to "110001"; etc.

• Correct values that go beyond range: If some of the values are beyond the logical range, e.g. a temperature less than -273 °C (0 K), you would need to correct them as required. A close look would help you check if there is scope for correction, or if the value needs to be removed.

• Correct wrong structure: Values that don’t follow a defined structure can be removed. E.g. in a data set containing pin codes of Indian cities, a pin code of 12 digits would be an invalid value and needs to be removed. Similarly, a phone number of 12 digits would be an invalid value.
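
A few of these corrections, sketched in Python with the examples quoted above. The helper names are invented, and the six-digit pattern for pin codes is an assumption based on the "110001" example.

```python
import re


def parse_number(text):
    """String to number: drop thousands separators, e.g. "12,300" to 12300."""
    return int(text.replace(",", ""))


def extract_pin(text):
    """Return a standalone six-digit pin code from the text, or None."""
    match = re.search(r"\b\d{6}\b", text)
    return match.group(0) if match else None


def is_valid_celsius(temperature):
    """Range check: nothing can be colder than absolute zero."""
    return temperature >= -273.15


# Encoding: the same bytes read correctly only with the right codec.
raw = "Naïve".encode("cp1252")
decoded = raw.decode("cp1252")

print(parse_number("12,300"), extract_pin("PIN Code 110001"), decoded)
```

A 12-digit string yields `None` from `extract_pin`, which marks it as a value to review or remove rather than silently truncating it.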
Analysis
EDA revolves around these four concepts:
▸ Univariate
▸ Bivariate
▸ Correlation
▸ Outliers

Use Cases
EDA is important in every business problem; it is a crucial first step in the data analysis process. Some of the use cases where we use EDA are:
• Cancer Data Analysis: In this data set we have to predict who is suffering from cancer and who is not.
• Fraud Data Analysis in E-commerce Transactions: In this dataset we have to detect fraud in an e-commerce transaction.
13. Project: Subscriber Churn
Subscriber churn can take different forms, not just exit from the base
Different Churn Scenarios
▸ Tariff Plan Churn (e.g. €50 to €30 monthly)
▸ Service Churn (e.g. Weekly/Monthly Subscription)
▸ Product Churn (e.g. Postpaid to Prepaid)
▸ Usage Churn (e.g. Inactive or Zero Usage)
▸ Each of these leads to Subscriber Churn (e.g. Port Out to Competition): Provider A 🡪 Provider B
Decision Cycle of a Subscriber: Changes as per needs and/or experiences
I am a Mobile customer …
▸ … and I haven’t thought about churning …
▹ … because it’s too complex or I don’t have time or it’s not worth it: Inert Subscriber (1)
▹ … because my operator is the best: Unconditionally Loyal (2)
▸ … and I have thought about churning …
▹ … but I am locked in by my contract: Locked In Subscriber (3)
▹ … and I am not locked in by my contract …
Four Churn Segments
▸ … and I have decided to stay: Conditionally Loyal (1)
▸ … and I have decided to leave …
▹ … because I found a better offer: Conditional Churner (2)
▹ … because my needs have changed: Lifestyle Migrator (3)
▹ … because I’m not satisfied: Unsatisfied Churner (4)
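
One reading of this decision cycle can be written as a small Python function. It is a sketch under assumptions: the branch order follows the slide as interpreted here, and the argument names and reason strings are invented for the example.

```python
LEAVING_REASONS = {
    "better offer": "Conditional Churner",
    "needs changed": "Lifestyle Migrator",
    "not satisfied": "Unsatisfied Churner",
}


def churn_segment(considered_churning, locked_in=False,
                  decided_to_leave=False, reason=None):
    """Map a subscriber's answers to one of the segments on the slide."""
    if not considered_churning:
        if reason == "operator is the best":
            return "Unconditionally Loyal"
        return "Inert Subscriber"  # too complex, no time, or not worth it
    if locked_in:
        return "Locked In Subscriber"
    if not decided_to_leave:
        return "Conditionally Loyal"
    if reason not in LEAVING_REASONS:
        raise ValueError(f"unknown reason for leaving: {reason!r}")
    return LEAVING_REASONS[reason]


print(churn_segment(True, decided_to_leave=True, reason="better offer"))
```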
Four Churn Segments: Loyalty drivers for each segment
Conditionally Loyal (1) and Conditional Churner (2)
▸ Frequently re-evaluates purchase decisions
▸ Choose brand on (semi) rational basis
▸ Likes to change to test new products
Lifestyle Migrator (3)
▸ Uncontrollable change in needs / usage behavior
▸ Involuntary Churn
Unsatisfied Churner (4)
▸ Unsatisfied subscriber

Loyalty Drivers
Key drivers that influence churn
1. Handset Loss/Upgrade
2. Cost of Service / Competitor pricing
3. Network Quality
4. Customer Care Quality
Key drivers for subscriber loyalty
1. Offers and services
2. Price
3. Quality of products and services
4. Quality of customer service
5. Length of contract period
6. Perception of telecom brand
7. Marketing programs and campaigns
High level Overview of a Data Science led approach to Manage Churn

Capture & Analyze
Data sources: Acquisition Data, Tariff & Usage Data, Customer level Data, Recharge Data, Monthly data Snapshot
Flow: Extract 🡪 Aggregate, Clean, Transform 🡪 Ready to use data
▸ Business Understanding
▸ Identify data requirements and explore data availability
▸ Request and extract data required to build a model
▸ Aggregate, Clean and Standardize data in desired format for model

Report & Predict
Flow: Discover Relationships, Churn Drivers, Trends, Customer Profiles 🡪 Predictive Model Output
▸ Business Analysis of standardized data
▸ Predictive model design
▸ Development and Implementation of Predictive model

Engage & Act
Flow: Segmented Targeting, Subscriber Monitoring 🡪 Targeted Marketing, Actionable Insights
▸ List of churn drivers / KPIs for tracking and monitoring
▸ A generated list of recommended subscribers for targeted churn campaigns
▸ Recommendations on monthly churn initiatives
Thank you!
Any questions?
Resources
▸ Power BI Documentation
▹ https://docs.microsoft.com/en-us/power-bi/
▸ Power BI Guided Learning
▹ https://docs.microsoft.com/en-us/power-bi/guided-learning/
▹ https://www.youtube.com/playlist?list=PL1N57mwBHtN0JFoKSR0n-tBkUJHeMP2cP
▸ Power BI White Paper
▹ https://docs.microsoft.com/en-us/power-bi/guidance/whitepapers
▸ Power BI Blogs
▹ https://powerbi.microsoft.com/en-us/blog/
