Manual Transforming Data Into Intelligence With Power BI and AI Integration
MICROSOFT EXCEL: INTRO TO POWER QUERY, POWER PIVOT & DAX
With MS Excel and Power BI
Two-Full-Day Training Program
Organized by NIBAF
Microsoft Excel is one of the world's most widely used and versatile tools for
business analysis, reporting and strategy. Deep practical knowledge of Excel
can greatly increase your productivity at work.
This training focuses on making you highly proficient in Excel's Business
Intelligence tools, which provide an expansive set of features for extracting
information from complex data and creating dynamic reports and presentations.
In this training, you will learn how to connect Excel to different data sources,
compile huge data sets and create a data model with Power Pivot, create a
visually appealing Power View sheet, and use Power Map to integrate the data
with Bing Maps.
COURSE OVERVIEW
1 The “Power” Excel Landscape
• Power Query/Power Pivot workflow and key benefits vs. “traditional” Excel
2 Power Query
• Types of data connectors, query editing tools, loading options, etc.
3 Data Modeling
• Excel Data Model interface, normalization, table relationships, hierarchies, etc.
6 Final Project
• Super Store Sales Data
VERSIONS &
COMPATIBILITY
IMPORTANT NOTE: Power Pivot is currently not available for Mac,
and is only available in certain versions of Excel for Windows/PC
For a full, current list of compatible versions, visit support.office.com (or Google “Where is Power Pivot?”):
https://support.office.com/en-us/article/Where-is-Power-Pivot-aa64e217-4b6e-410b-8337-20b87e1c2a4b (or use: bit.ly/2yd80rd)
Other considerations:
• Power Pivot works best with 64-bit Excel, which can access more processing power and memory (not critical)
• Note: make sure you’re running a 64-bit operating system and that you’ve updated Office to the 64-bit version
• Power Pivot menus, features and tools have evolved over time; what you see on your screen may differ
from what you see on mine, but the fundamental skills and concepts covered are universally applicable
• Even if you have a compatible version of Excel, you may need to enable the Power Pivot or Power
Query plug-ins to access the tools in this course (File > Options > Add-Ins > Manage: COM Add-Ins)
COURSE RATINGS &
REVIEWS
• Throughout the course, we’ll be using sample data from a fictitious supermarket
chain called “FoodMart”*
• In addition to daily transactional records from 1997-1998, our data set
includes information about products, customers, stores, and regions
• All files are available for download in the course resources section of your
course dashboard (Course Dashboard > Course Content > All Resources)
Transactions: transaction_date, stock_date, product_id, customer_id, store_id, quantity
Returns: return_date, product_id, store_id, quantity
Customer Lookup: customer_id, customer_acct_num, first_name, last_name, customer_address, etc.
Calendar Lookup: date, month_num, quarter, year, weekday_num, etc.
Product Lookup: product_id, product_brand, product_name, product_sku, product_retail_price, etc.
Store Lookup: store_id, region_id, store_type, store_name, store_street_address, etc.
Region Lookup: region_id, sales_district, sales_region
*This data is provided by Microsoft for informational purposes only as an aid to illustrate a concept. These samples are provided “as is” without warranty of any kind. The example companies, organizations, products, domain names,
e-mail addresses, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, person, place, or event is intended or should be inferred.
LET’S DO THIS.
INTRO TO “POWER EXCEL”
THE “POWER EXCEL”
WORKFLOW
These are Excel’s Business Intelligence tools, all of which are available directly in Excel
(provided you have a compatible version); no additional software is required!
RAW DATA
• Flat files (csv, txt), Excel tables, databases (SQL, Azure), folders, streaming sources, web data, etc.
POWER QUERY (aka “Get & Transform”)
• Connect to sources, import data, and apply shaping and transformation tools (ETL)
DATA MODEL
• Create table relationships, add calculated columns, define hierarchies and perspectives, etc.
POWER PIVOT & DAX
• Explore and analyze the entire data model, and create powerful measures using Data Analysis Expressions (DAX)
“THE BEST THING TO HAPPEN TO EXCEL IN 20
YEARS”
• Import and analyze MILLIONS of rows of data in Excel
• Access data from virtually anywhere (database tables, flat files, cloud services, folders, etc.)
Connector menus: From File, From Database, From Azure, From Online Services, From Other Sources
THE QUERY EDITOR
Screen areas: Query Editing Tools, Formula Bar (this is “M” code), Table Name (name your table!), Data Preview, Applied Steps
Access the Query Editor by creating a new query and choosing the “Edit” option, or by launching
the Workbook Queries pane (Data > Show Queries) and right-clicking an existing query to edit
QUERY EDITOR
TOOLS
The HOME tab includes general settings and common table transformation tools
The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc.)
The ADD COLUMN tools create new columns based on conditional rules, text operations, calculations, dates, etc.
DATA LOADING
OPTIONS
When you load data from Power Query, you have several options:
• Table
• Stores the data in a new or existing worksheet
• Requires relatively small data sets (<1 million rows)
• Connection Only
• Saves the data connection settings and applied steps
• Data does not load to a worksheet
Tip: Right-click the column header to access…
TEXT-SPECIFIC TOOLS
Date & Time tools are relatively straightforward, and include the following options:
• Age: Difference between the current time and the date in each row
• Date Only: Removes the time component of a date/time field
• Year/Month/Quarter/Week/Day: Extracts individual components from a date
field (Time-specific options include Hour, Minute, Second, etc.)
• Earliest/Latest: Evaluates the earliest or latest date from a column as a single value
(can only be accessed from the “Transform” menu)
Note: You will almost always want to perform these operations from the “Add Column” menu to
build out new fields, rather than transforming an individual date/time column
CREATING A BASIC CALENDAR
TABLE
2) In the formula bar, generate a starting date by entering a “literal” (1/1/2013 shown below):
3) Click the fX icon to add a new custom step, and enter the following formula exactly as shown:
4) Convert the resulting list into a Table (List Tools > To Table) and format the column as a Date
5) Add calculated Date columns (Year, Month, Week, etc.) as necessary using the Add Column tools
ADDING AN INDEX COLUMN
Note that we lose any field not specified in the Group By settings
PIVOTING &
UNPIVOTING
“Pivoting” is a fancy way to describe the process of turning distinct row
values into columns (“pivoting”) or turning columns into rows (“unpivoting”)
PRO TIP:
Use the “From Folder” query option to automatically append all files from within the same folder
POWER QUERY BEST
PRACTICES
Give your queries clear and intuitive names, before loading the data
• Define names immediately; updating query & table names later can be a
headache, especially if you’ve already referenced them in calculated measures
• Don’t use spaces in table names (otherwise you have to surround them with single quotes)
When working with large tables, only load the data you need
• Don’t include hourly data when you only need daily, or product-level transactions
when you only care about store-level performance; extra data will only slow you down
DATA MODELING
MEET EXCEL’S DATA
MODEL
The Data Model provides simple and intuitive tools for building
relational databases directly in Excel. With the data model you can:
• Manage massive datasets that can’t fit into worksheets
• Create table relationships to blend data across multiple sources
• Define custom hierarchies and perspectives
In a normalized database, each table should serve a distinct and specific purpose (i.e. product information, calendar
fields, transaction records, customer attributes, etc.)
This Calendar Lookup table provides additional attributes about each date (month, year, weekday, quarter,
etc.)
This Product Lookup table provides additional attributes about each product (brand, product name, sku, price,
etc.)
This Data Table contains “quantity” values, and connects
to lookup tables via the “date” and “product_id” columns
PRIMARY & FOREIGN
KEYS
Original Fact Table fields Attributes from Calendar Lookup table Attributes from Product Lookup table
Tip: Always drag relationships from the Data table to the Lookup tables
*Note: In Excel 2010/2013 the diagram view looks a bit different, and arrows point in the opposite direction by default
CONNECTING LOOKUPS TO
LOOKUPS
PRO TIP:
Models with multiple related lookup tables are called “snowflake” schemas
Models with a single table for each lookup or dimension are called “star” schemas
ACTIVE VS. INACTIVE RELATIONSHIPS
To make a connection active or inactive, double-click the connection and check the box, or
right-click the relationship line itself (Note: you must deactivate one before activating another!)
RELATIONSHIP
CARDINALITY
Cardinality refers to the uniqueness of values in a column
In Power Pivot, all relationships in a data model should
follow a “one-to-many” cardinality
• Each column (or “key”) used to join tables can only have one
instance of each unique value in the lookup table (these are
the primary keys), but may have many instances of each
unique value in the data table (these are the foreign keys)
In this case we’re joining the Calendar_Lookup table to the FoodMart_Transactions data table
using the date column as our key
There is only one instance of each date in the lookup table (noted by the “1”), but many instances of
each date in the data table (noted by the asterisk “*”), since multiple transactions occur each day
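Before relying on a column as the “one” side of a relationship, it can help to confirm that it really holds one instance of each value. The measures below are a minimal sketch of that check, assuming the Calendar_Lookup[date] column from the course data set; the measure names are hypothetical.

```dax
-- Hypothetical measure names; Calendar_Lookup[date] is the key column from the course data set
Calendar Row Count := COUNTROWS ( Calendar_Lookup )
Calendar Distinct Dates := DISTINCTCOUNT ( Calendar_Lookup[date] )

-- Returns "Unique" when every date appears exactly once (a valid primary key)
Calendar Key Check :=
IF ( [Calendar Row Count] = [Calendar Distinct Dates], "Unique", "Has duplicates" )
```

If the check reports duplicates, the column cannot serve as the primary key of a one-to-many relationship until the lookup table is cleaned up.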
*Note: In Excel 2010/2013 the diagram view looks a bit different, and arrows point in the opposite direction by default
BAD CARDINALITY: MANY-TO-MANY
• If we try to connect these tables using the product_id field, we’ll have a many-to-many
relationship since there are multiple instances of each ID in both tables
• Even if we could create this relationship in Power Pivot, how would you know which product
was actually sold on each date – Cream Soda or Diet Cream Soda?
BAD CARDINALITY: ONE-TO-ONE
• In this case, connecting the tables above using the product_id field creates a one-to-one
relationship, since each ID only appears once in each table
• Unlike many-to-many, there is nothing illegal about this relationship; it’s just inefficient
PRO TIP:
Always hide the foreign key columns in your data tables to prevent users from accidentally filtering on them!
DEFINING HIERARCHIES
Hierarchies are groups of nested columns that reflect multiple levels of granularity
• For example, a “Geography” hierarchy might include Country, State, and City columns
• Each hierarchy is treated as a single item in PivotTables and PivotCharts, allowing users to “drill
up” and “drill down” through different levels of the hierarchy in a meaningful way
NO MORE “CALCULATED
FIELDS”
Oh rats, where are my calculated fields??
2) Adding Measures
PRO TIP:
Use measures to create values that users can explore with a pivot (Power Pivot version of a “Calculated Field”)
CREATING IMPLICIT
MEASURES
STEP 1: Check the box next to a value field in a data
table, or manually drag it into the “Values” box
PRO TIP:
AutoSum is a nice way to get comfortable with basic DAX and quickly add measures;
just don’t rely on them when things start to get more complicated!
CREATING EXPLICIT MEASURES (POWER
PIVOT)
Each measure is assigned to a table and given a measure name (as well as an optional description)
The Formula pane contains the actual DAX code, as well as options to browse the formula library or check syntax
This cell does NOT add up the values above it (it’s an island, remember?)
• Total rows represent a lack of filters; since this cell does not have a customer_city coordinate,
it evaluates the Total Quantity measure across the entire, unfiltered Customer_Lookup table
FILTER CONTEXT
EXAMPLES
Cell coordinates:
• Calendar_Lookup[Year] = 1997
• Customer_Lookup[customer_country] = “USA”
• Customer_Lookup[customer_city] = “Altadena”
Cell coordinates:
• Calendar_Lookup[Year] = 1998
• Calendar_Lookup[Quarter] = 1
• Customer_Lookup[customer_country] = “USA”
Cell coordinates:
• Store_Lookup[store_country] = “Canada”
• Product_Lookup[product_brand] = “Amigo”
Cell coordinates:
• Customer_Lookup[customer_country] = “USA”
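One way to read a set of cell coordinates is as a list of filters applied before the measure is evaluated. The sketch below writes the first set of coordinates out explicitly with CALCULATE (covered later in this manual). It assumes a [Total Quantity] measure already exists in the model; the new measure name is hypothetical.

```dax
-- Hypothetical measure; assumes [Total Quantity] is already defined in the model
Altadena 1997 Quantity :=
CALCULATE (
    [Total Quantity],
    Calendar_Lookup[Year] = 1997,
    Customer_Lookup[customer_country] = "USA",
    Customer_Lookup[customer_city] = "Altadena"
)
```

With no other filters in play, this should return the same value as the PivotTable cell at those coordinates.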
STEP-BY-STEP MEASURE
CALCULATION
How exactly is this measure calculated?
• REMEMBER: This all happens instantly behind the scenes, every time a measure cell calculates
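1) The filter context is detected: Store_Lookup[store_country] = “USA”
2) The filter is applied to the Store_Lookup table and follows the one-to-many (1 to *) relationships
to the FoodMart_Transactions and FoodMart Returns tables, which are filtered to USA rows
3) The measure is evaluated against the filtered table: Sum of Transactions[quantity] = 555,899
when store_country = “USA”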
RECAP: CALCULATED COLUMNS VS.
MEASURES
CALCULATED COLUMNS
• Evaluated in the context of each row of the table to which it belongs (has row context)
• Appends static values to each row in a table and stores them in the model, increasing file size
• Only recalculated on data source refresh or changes to component columns
• Primarily used as rows, columns, slicers or filters
MEASURES
• Evaluated in the context of each cell of the PivotTable in which it is displayed (has filter context)
• Does not create new data in the tables themselves, and does not increase file size
• Recalculated in response to any change in the PivotTable view
• Can only be used as PivotTable values
*Note: Calculated columns CAN be placed in the values area of a pivot, but you can (and should) use a measure instead
POWER PIVOT BEST
PRACTICES
Avoid using implicit measures whenever possible
• Implicit measures are limited in functionality and restricted to the pivot in
which they were created; explicit measures are more portable and powerful
• Calculated columns don’t always use functions, but measures do:
• In a calculated column, =Transactions[quantity] returns the value from
the quantity column in each row (since it evaluates for each row)
• In a measure, a bare column reference like =Transactions[quantity] can’t be
evaluated as a single value in a pivot (you need some sort of aggregation)
PRO TIP:
For column references, use the fully qualified name (i.e. Table[Column])
For measure references, just use the measure name (i.e. [Measure])
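The naming conventions above can be sketched as follows. The table and column names come from the course data set (assuming the Returns table is loaded under that name); the measure names are illustrative only.

```dax
-- Calculated column in the Transactions table: a bare column reference works
-- because it is evaluated row by row
= Transactions[quantity]

-- Measures: column references are fully qualified and wrapped in an aggregation
Total Quantity := SUM ( Transactions[quantity] )
Quantity Returned := SUM ( Returns[quantity] )

-- Measure references use just the measure name in square brackets
Return Rate := DIVIDE ( [Quantity Returned], [Total Quantity] )
```

Keeping the two reference styles distinct makes it easy to tell at a glance whether a formula is reading a column or reusing another measure.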
DAX
OPERATORS
Arithmetic Comparison
Meaning Example Meaning Example
Operator Operator
Text/Logical Operator: Meaning. Example
• &: Concatenates two values to produce one text string. Example: [City] & " " & [State]
• &&: Creates an AND condition between two logical expressions. Example: ([State]="MA") && ([Quantity]>10)
• || (double pipe): Creates an OR condition between two logical expressions. Example: ([State]="MA") || ([State]="CT")
• IN: Creates a logical OR condition based on a given list (using curly brackets). Example: 'Store Lookup'[State] IN { "MA", "CT", "NY" }
*Head to www.msdn.microsoft.com for more information about DAX syntax, operators, troubleshooting, etc.
COMMON FUNCTION
CATEGORIES
MATH & STATS Functions
Basic aggregation functions as well as “iterators” evaluated at the row-level
• Common Examples: SUM, AVERAGE, MAX/MIN, DIVIDE, COUNT/COUNTA, COUNTROWS, DISTINCTCOUNT
• Iterator Functions: SUMX, AVERAGEX, MAXX/MINX, RANKX, COUNTX
LOGICAL Functions
Functions for returning information about values in a given conditional expression
• Common Examples: IF, IFERROR, AND, OR, NOT, SWITCH, TRUE, FALSE
TEXT Functions
Functions to manipulate text strings or control formats for dates, times or numbers
• Common Examples: CONCATENATE, FORMAT, LEFT/MID/RIGHT, UPPER/LOWER, PROPER, LEN, SEARCH/FIND, REPLACE, REPT, SUBSTITUTE, TRIM, UNICHAR
FILTER Functions
Lookup functions based on related tables and filtering functions for dynamic calculations
• Common Examples: CALCULATE, FILTER, ALL, ALLEXCEPT, RELATED, RELATEDTABLE, DISTINCT, VALUES, EARLIER/EARLIEST, HASONEVALUE, HASONEFILTER, ISFILTERED, USERELATIONSHIP
DATE & TIME Functions
Basic date and time functions as well as advanced time intelligence operations
• Common Examples: DATEDIFF, YEARFRAC, YEAR/MONTH/DAY, HOUR/MINUTE/SECOND, TODAY/NOW, WEEKDAY/WEEKNUM
• Time Intelligence Functions: DATESYTD, DATESQTD, DATESMTD, DATEADD, DATESINPERIOD
*Note: This is NOT a comprehensive list (does not include trigonometry functions, parent/child functions, information functions, or other less common functions)
BASIC MATH & STATS
FUNCTIONS
SUM() Evaluates the sum of a column =SUM(<column>)
PRO TIP:
Even though it might seem unnecessary, creating measures for even simple calculations (like the sum of a column)
allows you to use those measures within other calculations, anywhere in the workbook
COUNT, COUNTA, DISTINCTCOUNT &
COUNTROWS
COUNTROWS() Counts the number of rows in the specified table, or a table defined by an expression =COUNTROWS(<table>)
Examples:
• Count of all rows in the Transactions table
• Count of non-empty cells in the recyclable column
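The counting examples described above might look like the measures below. This is a sketch only: the measure names are hypothetical, and it assumes the recyclable column lives in the Product_Lookup table, which the text does not state.

```dax
-- Count of all rows in the Transactions table
Total Transactions := COUNTROWS ( Transactions )

-- Count of non-empty cells in the recyclable column
-- (assumption: the column belongs to Product_Lookup)
Recyclable Products := COUNTA ( Product_Lookup[recyclable] )

-- Count of unique products that appear in the Transactions table
Unique Products Sold := DISTINCTCOUNT ( Transactions[product_id] )
```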
SWITCH & SWITCH(TRUE)
Expression: Any DAX expression that returns a single scalar value, evaluated multiple times (for each row/constant)
Examples:
• Calendar_Lookup[month_num]
• Product_Lookup[product_brand]
Value/result pairs: List of values produced by the expression, each paired with a result to return for rows/cases that match
Examples:
=SWITCH(Calendar_Lookup[month_num],
    1, "January",
    2, "February",
    etc…
Else: Value returned if the expression doesn’t match any value argument
PRO TIP:
Use the SWITCH(TRUE() combo to generate results based on Boolean (True/False) expressions (instead
of those pesky nested IF statements!)
=SWITCH(TRUE(),
    [retail_price]<5, "Low Price",
    AND([retail_price]>=5, [retail_price]<20), "Med Price",
    AND([retail_price]>=20, [retail_price]<50), "High Price",
    "Premium Price")
SWITCH & SWITCH(TRUE)
(EXAMPLES)
Switch quarter 1 with “Q1”, quarter 2 with “Q2”, quarter 3 = “Q3”, else “Q4”
Extract characters from the left of the customer_address column, up to the space
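The two captions above describe calculated columns. A hedged sketch of each is shown below, assuming the quarter column stores the numbers 1 to 4 and that both columns use the names from the course data set.

```dax
-- Calculated column in Calendar_Lookup: label each quarter, defaulting to "Q4"
= SWITCH ( Calendar_Lookup[quarter], 1, "Q1", 2, "Q2", 3, "Q3", "Q4" )

-- Calculated column in Customer_Lookup: text to the left of the first space
-- The 4th SEARCH argument is returned when no space is found, so the whole
-- address is kept instead of raising an error
= LEFT (
    Customer_Lookup[customer_address],
    SEARCH (
        " ",
        Customer_Lookup[customer_address],
        1,
        LEN ( Customer_Lookup[customer_address] ) + 1
    ) - 1
)
```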
CALCULATE
PRO TIP:
CALCULATE works just like SUMIF or COUNTIF, except it can evaluate measures based on ANY sort of
calculation (not just a sum, count, etc); it may help to think of it like “CALCULATEIF”
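As a minimal illustration of the “CALCULATEIF” idea, the measure below evaluates an existing measure for USA stores only. It assumes a [Total Transactions] measure is already defined; the new measure name is hypothetical.

```dax
-- Hypothetical measure; assumes [Total Transactions] already exists in the model
USA Transactions :=
CALCULATE (
    [Total Transactions],
    Store_Lookup[store_country] = "USA"
)
```

Because the filter argument replaces any existing filter on store_country, this measure returns the USA value even in a PivotTable row for another country.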
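CALCULATE (EXAMPLE)
The PivotTable filter context is Store_Lookup[store_country] = “MEXICO”, but CALCULATE filters the
Store_Lookup table to “USA” instead. That filter follows the one-to-many (1 to *) relationships to the
Transactions and FoodMart Returns tables, so the measure returns Total Transactions where
store_country = “USA” = 180,823
CALCULATE CHANGES THE FILTER CONTEXT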
FILTER table argument
Examples:
• Store_Lookup
• Product_Lookup
FILTER expression argument (to be evaluated for each row of the table)
Examples:
• Store_Lookup[store_country]="USA"
• Calendar[Year]=1998
• [retail_price]>AVERAGE([retail_price])
Since FILTER returns a table (as opposed to a scalar), it’s almost always used as an input to other
functions, like enabling more complex filtering options within a CALCULATE function (or passing a
filtered table to an iterator like SUMX)
PRO TIP:
Since FILTER iterates through each row in a table, it can be slow and processor-intensive; never use FILTER
when a normal CALCULATE function will accomplish the same thing!
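A case where FILTER is typically worth its cost is a condition that compares each row against an aggregate, which a simple column = value filter cannot express. The measure below is a sketch of that case, assuming a [Total Transactions] measure and the product_retail_price column from the course data set; the measure name is hypothetical.

```dax
-- Hypothetical measure; assumes [Total Transactions] already exists in the model
Above Average Price Transactions :=
CALCULATE (
    [Total Transactions],
    FILTER (
        Product_Lookup,
        -- Each product row is compared with the average price of the
        -- products visible in the current filter context
        Product_Lookup[product_retail_price]
            > AVERAGE ( Product_Lookup[product_retail_price] )
    )
)
```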
PRO TIP: FILTERING WITH DISCONNECTED SLICERS (PART 1)
STEP 1: Create an Excel table containing a list of values to use as thresholds or parameters
STEP 3: Make sure that your table loaded, and is NOT connected to any other table in the model
Examples:
• Calculate Total Transactions only for cases where the product price is below a selected threshold
• Calculate Total Revenue, but only for USA stores
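The disconnected-slicer pattern described above can be sketched with two measures. The parameter table Price_Threshold and its price_threshold column are hypothetical names for the Excel table created in Step 1; [Total Transactions] is assumed to exist already.

```dax
-- Reads the value picked in the slicer; with nothing selected it falls back
-- to the largest threshold in the (hypothetical) Price_Threshold table
Selected Price Threshold := MAX ( Price_Threshold[price_threshold] )

-- Counts transactions only for products priced below the selected threshold
Transactions Under Threshold :=
CALCULATE (
    [Total Transactions],
    FILTER (
        Product_Lookup,
        Product_Lookup[product_retail_price] < [Selected Price Threshold]
    )
)
```

Because the parameter table has no relationships, the slicer cannot filter the model directly; it only changes the value that the measure reads.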
ALL
ALL() Returns all rows in a table, or all values in a column, ignoring any filters that have been applied
First argument: The table or column that you want to clear filters on
Examples:
• Transactions
• Product_Lookup[product_brand]
Additional arguments: List of columns that you want to clear filters on (optional)
Notes:
• If your first parameter is a table, you can’t specify additional columns
• All columns must include the table name, and come from the same table
Examples:
• Customer_Lookup[customer_city], Customer_Lookup[customer_country]
• Product_Lookup[product_name]
ALL
PRO TIP:
ALL is like the opposite of FILTER; instead of adding filter context, ALL removes filter context. This is often used when
you need unfiltered values that won’t be skewed by the PivotTable layout (i.e. Category sales as % of Total)
ALL (EXAMPLE)
• In this example, we use ALL to calculate total transactions across all rows in
the Transactions table, ignoring any filter context from the PivotTable
• By dividing the original [Total Transactions] measure (which responds to PivotTable filter context
as expected) by the new [All Transactions] measure, we can correctly calculate the percentage of
the total no matter how the PivotTable is filtered
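The percent-of-total calculation described above might be written as follows. This is a sketch that assumes [Total Transactions] is already defined; the second measure name is hypothetical.

```dax
-- Total transactions across the whole table, ignoring PivotTable filters
All Transactions := CALCULATE ( [Total Transactions], ALL ( Transactions ) )

-- Share of the overall total for whatever the current cell is filtered to
Pct of All Transactions := DIVIDE ( [Total Transactions], [All Transactions] )
```

DIVIDE is used instead of the / operator so that a zero or blank denominator returns a blank rather than an error.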
RELATED
RELATED() Returns related values in each row of a table using relationships with other tables
=RELATED(<column>)
<column>: The column that contains the values you want to retrieve
Examples:
• Product_Lookup[product_brand]
• Store_Lookup[store_country]
HEY THIS IS IMPORTANT!
RELATED works almost exactly like a VLOOKUP function – it uses the relationship between tables
(defined by primary and foreign keys) to pull values from one table into a new column of another.
Since this function requires row context, it can only be used as a calculated column or as part of an
iterator function that cycles through all rows in a table (FILTER, SUMX, MAXX, etc.)
PRO TIP:
Avoid using RELATED to create redundant calculated columns unless you absolutely need them, since those
extra columns increase file size; instead, use RELATED inside an iterator function like FILTER or SUMX within a measure
RELATED
(EXAMPLES)
Retrieve the retail price from the Product_Lookup table and append it to the Transactions table
Multiply the quantity in each row of the Transactions table with the
related retail price from the Product_Lookup table, and sum the results
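The two examples described above could be sketched like this, using the column names from the course data set. The measure name is illustrative.

```dax
-- Calculated column in Transactions: pull the retail price from Product_Lookup
= RELATED ( Product_Lookup[product_retail_price] )

-- Measure: multiply quantity by the related retail price row by row, then sum
Total Revenue :=
SUMX (
    Transactions,
    Transactions[quantity] * RELATED ( Product_Lookup[product_retail_price] )
)
```

The measure version avoids storing an extra column in the model, which is the approach the tip above recommends.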
ITERATOR (“X”)
FUNCTIONS
Iterator (or “X”) functions allow you to loop through the same calculation or expression on
each row of a table, and then apply some sort of aggregation to the results (SUM, MAX, etc.)
=SUMX(<table>, <expression>)
SUMX: Aggregation to apply to calculated rows*
Examples:
• SUMX
• COUNTX
• AVERAGEX
• RANKX
• MAXX/MINX
<table>: Table in which the expression will be evaluated
Examples:
• Transactions
• FILTER(Transactions, RELATED(Store_Lookup[country])="USA")
<expression>: Expression to be evaluated for each row of the given table
Examples:
• [Total Transactions]
• Transactions[price] * Transactions[quantity]
PRO TIP:
Imagine the function adding a temporary new column to the table, calculating the value in each row
(based on the expression) and then applying the aggregation to that new column (like SUMPRODUCT)
*In this example we’re looking at SUMX, but all “X” functions follow a similar syntax
ITERATOR (“X”) FUNCTIONS
(EXAMPLES)
• Multiply quantity and retail price for each row in the Transactions table, and sum the results
• Calculate the rank of each product brand, based on total revenue
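The ranking example might be written as below. This sketch assumes a [Total Revenue] measure already exists in the model and that product_brand is placed on the PivotTable rows; the measure name is hypothetical.

```dax
-- Ranks the current brand against all brands by revenue (highest revenue = 1)
Brand Revenue Rank :=
RANKX (
    ALL ( Product_Lookup[product_brand] ),
    [Total Revenue]
)
```

ALL is needed here so that each brand is compared against the full list of brands rather than only the one visible in its own row.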
BASIC DATE & TIME
FUNCTIONS
DAY/MONTH/YEAR() Returns the day of the month (1-31), month of the year (1-12), or year of a given date =DAY/MONTH/YEAR(<date>)
Calculate the end date of the month, for each row in the Calendar_Lookup table
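A calculated column for the end-of-month example could look like this, assuming the date column in Calendar_Lookup is stored as a Date data type.

```dax
-- Calculated column in Calendar_Lookup: last day of the month for each date
-- The second argument (0) means "the same month"; 1 would be the next month
= EOMONTH ( Calendar_Lookup[date], 0 )
```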
TIME INTELLIGENCE FORMULAS
Time Intelligence functions allow you to easily calculate common time comparisons:
PRO TIP:
To calculate a moving average, use the running total calculation above and divide by the # of intervals!
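Some common time comparisons can be sketched with the measures below. They assume a [Total Quantity] measure and a Calendar_Lookup table with one row per date; the measure names and the 10-day window are illustrative choices only.

```dax
-- Year-to-date total
Quantity YTD :=
CALCULATE ( [Total Quantity], DATESYTD ( Calendar_Lookup[date] ) )

-- Same measure shifted back one month
Quantity Previous Month :=
CALCULATE ( [Total Quantity], DATEADD ( Calendar_Lookup[date], -1, MONTH ) )

-- Running total over the 10 days ending on the latest visible date
Quantity 10 Day Running Total :=
CALCULATE (
    [Total Quantity],
    DATESINPERIOD ( Calendar_Lookup[date], MAX ( Calendar_Lookup[date] ), -10, DAY )
)

-- Moving average: running total divided by the number of intervals
Quantity 10 Day Moving Average := DIVIDE ( [Quantity 10 Day Running Total], 10 )
```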
SPEED & PERFORMANCE
CONSIDERATIONS
Avoid using unnecessary slicers, or consider disabling cross-filtering
• When you use multiple slicers, they “cross-filter” by default; in other words, options in Slicer B
are automatically grayed out if they aren’t relevant given a selected value in Slicer A
• To disable, select Slicer Tools > Slicer Settings and uncheck “Visually indicate items with no data”
Available within Excel:
2 Spreadsheet-based dashboards built with CUBE functions
• Use CUBE functions to pull values from the data model for custom Excel reports (no pivots)
Standalone product (desktop + online):
4 Microsoft Power BI
• Brand new (free!) self-service BI product for loading, shaping, modeling, and visualizing data
SNEAK PEEK: POWERBI
Ratings and reviews mean the world to me, so please share feedback!
• Feel free to post to the Q&A section or message me directly if you need any support, or if
there’s anything I can do to improve your course experience!
THANK YOU!