
Daily Running Notes_Power Query

Important Notes on Power Query

Uploaded by

arun.moto2022

Requiremen

7/25/2022
7/26/2022
7/27/2022
7/28/2022
7/29/2022
8/1/2022

8/2/2022
8/3/2022
Everyone uses the same dataset created in the cloud
PBI Dataset--> Get Data--> Power BI Dataset--> Choose the dataset

After extraction, what do we need to do?


In real-time, we should verify column headings first because Power BI Query n
column headings and column by column data
When do we go for Transformation?
A) Column by column data not available
B) Need additional operations which are not available at source level and requ
analytical purpose
Ex: Sorting, Grouping, Pivoting, Add columns, Filter rows, Duplicate
Note: We don't create aggregate and analytical objects in this area, for t
to Power View and must use DAX language

What is the component / concept suitable in Power BI? How do we go


Power Query is the concept, two ways we go
1. While extracting [beside load button, transform button] 2. After loading the
menu->Transform Data]
Note: Earlier name for Transform Data is "Edit Query"
Power Query Brief Points:
Extraction: Two ways
1. New Source : To bring data from new source which is not used
2. Recent Sources: To bring data from existing sources used in the machine
reports
Note: You need not establish connection, you can directly c
bring.
Transformation: Nearly 8 places we do transformations
1. Queries 2. Query 3. Field 4. Multi-Field 5. Filter 6. Menu options 7. S
Languages and Code

b) We perform operations graphically (More and more), if they are not sufficie
for Coding. Majorly mashup script, minorly R and Python.

c) For each query system generates a script with all the operations p
[starting from reading]
This is visible in Advanced Editor.
Ex: A sample script
let
    Source = Operation,
    Step1 = Source + Operation,
    Step2 = Step1 + Operation,
    ---
    LastStep = Step2 + Operation
in
    LastStep
Script Explanation:
a) Each script starts with "let" and ends with "in"
b) Each step is separated from the next step by a comma
c) "in" is followed by the last step
d) In each step, we take the previous step + the current operation
Step [system defined] = Transform [logical] = Variable [technical]
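The let / in layout above can be imitated in Python as a rough sketch. The names rows, run_query and the step_* variables are made up for illustration; the point is only that each step is a named value built from the previous step, and the last step is what the query returns.

```python
# Each "step" is a named value computed from the previous step,
# mirroring the let ... in layout of a query script.
rows = [
    {"student": "A", "fee": 1000},
    {"student": "B", "fee": None},
    {"student": "C", "fee": 3000},
]

def run_query(source):
    # let
    step_source = source
    # Step1 builds on Source: keep rows that have a fee
    step_filtered = [r for r in step_source if r["fee"] is not None]
    # Step2 builds on Step1: sort by fee, highest first
    step_sorted = sorted(step_filtered, key=lambda r: r["fee"], reverse=True)
    # in: the last step is the result of the whole query
    return step_sorted

result = run_query(rows)
```

The source rows are left untouched; every step produces a new value, which is why a step can be inserted, changed or removed without rewriting the others.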

How do we add, modify or delete steps?


Two ways
a) Advanced Editor and Scripting method b) View Menu->Query Setti
Graphical way->Insert Step After

1) Queries Properties: managing queries [no rows and columns]
Remarks / comments / annotations / description: Wherever dealin
available w
appropriate information, so that
either your team or others when they join they will understand the o
easily.
Note: Queries, Groups, Query, Column, Step etc. all these areas having descri
Move to groups, move up, move down helps to see the queries proper
understanding.
Note: No impact to visuals level
Copy & paste, duplicate, and reference differences:
a) Copy and Paste:
1. Legacy process [MS very popular] 2. Two independent operatio
"an Independent" copy of the query, so we can make our own change
will not take parent changes.
Note: After paste, it will bring all transforms and exact situation of p
b) Duplicate [copy + paste]: Similar to copy and paste, whereas it is la
c) Reference:
2. Creates "a dependent" query, so it always refers to the parent and gets information
3. Dependent query level changes not imposed on parent
Note: If you want to get parent features+ you want to have your feat
for Reference.
a) If the query to be used in the visual and data is required b) If sour
required
Disable Load:
a) The query not required to use as it is having wrong data / in the cu
not required.
Note: Disables object in Italic font at Power query and No show at Po
Query Properties [rows and columns]
a) Individual Query rows and columns
Remove / Keep rows [top, bottom or required]
Add Columns [ 5 ways]
b) Query with another query rows and columns
Merge: Getting columns from another table in to the main query
condition
Append: Multiple queries rows appended

Bar colors below column


a)Red: Errors b) Grey: Blank values c) Green: Correct Values
There are 5 ways of adding
with 5 different mechanisms
a) Add index column: adding column with sequence numbers
1. Add Index column starting with "0"
2. Add index column starting with "1"
3. Add index column starting with "Custom"
Note: a) If sequence column not available b) You want to add surroga
column
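The three index column options can be sketched in Python as below. This is only an illustration: add_index_column and the sample cities table are hypothetical names, and "Custom" is assumed to mean a chosen starting value and increment.

```python
def add_index_column(rows, start=0, increment=1, name="Index"):
    """Return new rows with a sequence-number column appended."""
    return [
        {**row, name: start + i * increment}
        for i, row in enumerate(rows)
    ]

cities = [{"city": "Pune"}, {"city": "Delhi"}, {"city": "Chennai"}]

from_zero = add_index_column(cities)                        # starting with 0
from_one = add_index_column(cities, start=1)                # starting with 1
custom = add_index_column(cities, start=100, increment=10)  # custom start and step
```

An index added this way can serve as a surrogate key when the source has no sequence column of its own.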

b) Add conditional Column: Adds a column based conditions


Ex: Based on tax amount value showing a rating
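A conditional column behaves like an if / else-if chain evaluated row by row. The Python sketch below shows the idea for the tax rating example; the thresholds (1000 and 500), the rating labels and the sample sales rows are assumptions, since the notes do not give them.

```python
def tax_rating(tax_amount):
    # Hypothetical thresholds; conditions are checked top to bottom
    # and the first match wins, like the clauses of a conditional column.
    if tax_amount is None:
        return "Unknown"
    if tax_amount >= 1000:
        return "High"
    if tax_amount >= 500:
        return "Medium"
    return "Low"  # the final "else" value

sales = [
    {"order": 1, "tax": 1200},
    {"order": 2, "tax": 650},
    {"order": 3, "tax": 80},
]
rated = [{**row, "rating": tax_rating(row["tax"])} for row in sales]
```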

c) Add Column from Examples [Flashy column concept]


Helps in two ways without writing code / expression
a) Create copy of the column
b) Create a column with built-in transforms
d) Add Custom Column
This requires expression to perform operations. Mashup language k
required.

Function: Reusable set of operations.


All languages contain rich set of functions to perform operations.
In tools we perform those operations graphically, if we are not able t
required graphical options we write our own functions.
In Power BI, by using mashup script or R and Python languages funct

Ex: Increment 10 percent TA, 20 percent DA, 30 percent HRA and the
R and Python Coding:
a) Install R or Python in your machine and ensure the paths accessib
environment variables area
b) Go to R and Python, and Write Code
Working on Multiple Queries:
Two ways
a) Combining rows [ Append option in Power Query, Set theory Union
Language]
b) Combining columns [Merge option in Power Query, Joins in SQL La
a) Combining rows [ Append option in Power Query, Set theory Union
Language]
1) Same structured queries you want to append [combine rows]
Note: Number of columns, Order of data types, and column names m
2) If Structure not matching…then system create new columns
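As a rough Python sketch of the append behaviour: rows are stacked, columns are matched by name, and a column missing from one query is created and filled with null (None here). append_queries and the jan / feb tables are hypothetical names used only for this illustration.

```python
def append_queries(*tables):
    """Stack the rows of all tables; columns are matched by name."""
    columns = []
    for table in tables:
        for row in table:
            for col in row:
                if col not in columns:
                    columns.append(col)
    # A column that a row does not have is filled with None (null).
    return [
        {col: row.get(col) for col in columns}
        for table in tables
        for row in table
    ]

jan = [{"id": 1, "fee": 100}]
feb = [{"id": 2, "fee": 200, "mode": "Online"}]  # extra column "mode"
combined = append_queries(jan, feb)
```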

null Handling: [NULL is the indication in SQL Server]


null->unknown->undefined
Three ways to handle
a) Replace with a user defined value [single column, multiple column
b) Upper null values replace with first not null value (Fill up)
c) Lower null values replace with last not null value (Fill down)
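The three null handling options can be sketched in Python on a single column of values. The function names and the sample column are illustrative only; None stands in for null.

```python
def replace_nulls(values, replacement):
    """a) Replace every null with a user-defined value."""
    return [replacement if v is None else v for v in values]

def fill_down(values):
    """c) Copy the last non-null value into the nulls below it."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def fill_up(values):
    """b) Copy the next non-null value into the nulls above it."""
    return fill_down(values[::-1])[::-1]

column = [None, "Hyd", None, None, "Pune", None]
replaced = replace_nulls(column, "NA")
down = fill_down(column)
up = fill_up(column)
```

Note that a leading null survives fill down and a trailing null survives fill up, because there is no value to copy from.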

a) Reverse rows: last row become first row


b) Transpose rows: Rows into columns, columns into rows [blind oper
Ex: 3 columns and 10 rows I want 3 rows and 10 columns [we can p
c) Pivot:
a) Denormalized data into Normalized Data [Duplicates into Non du
b) You want to reduce rows and you want aggregate values instea
rows
Note: We can't predict count because of aggregation
d) UnPivot:
a) Normalized data into De-Normalized Data [Non Duplicates to du
b) You want to have history of rows and you want have your own a
analytical operations
Ex: 3 columns and 2 rows..then 3 * 2 =6 rows [ we can predict coun
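The row counts quoted for transpose and unpivot can be checked with a small Python sketch. The helper names and the marks table are hypothetical; the unpivot example assumes three value columns plus one identifier column (student) that is kept as it is.

```python
def transpose(rows):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*rows)]

def unpivot(rows, id_column):
    """Turn every non-id column into an (attribute, value) row."""
    return [
        {id_column: row[id_column], "attribute": col, "value": val}
        for row in rows
        for col, val in row.items()
        if col != id_column
    ]

grid = [[r * 3 + c for c in range(3)] for r in range(10)]  # 10 rows x 3 columns
flipped = transpose(grid)                                    # 3 rows x 10 columns

marks = [
    {"student": "A", "maths": 80, "physics": 70, "chemistry": 60},
    {"student": "B", "maths": 90, "physics": 85, "chemistry": 75},
]
long_rows = unpivot(marks, "student")  # 3 unpivoted columns * 2 rows
```

Pivot goes the other way and aggregates, so its output row count depends on how many distinct combinations the data holds.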
Aggregate Tables Creation: Possible using Group By option in Power
Ex: Location and Mode wise sum of actual fee, avg of discount fee, an
students
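A Group By on Location and Mode could be sketched in Python as follows. The column names and sample rows are invented, and the third aggregate is assumed to be a count of students because the example line in the notes is cut off.

```python
from collections import defaultdict

def group_by_location_mode(rows):
    """Aggregate rows per (location, mode) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["location"], row["mode"])].append(row)
    return [
        {
            "location": location,
            "mode": mode,
            "sum_actual_fee": sum(r["actual_fee"] for r in members),
            "avg_discount_fee": sum(r["discount_fee"] for r in members) / len(members),
            "students": len(members),  # assumed aggregate: row count per group
        }
        for (location, mode), members in groups.items()
    ]

students = [
    {"location": "Hyd", "mode": "Online", "actual_fee": 1000, "discount_fee": 100},
    {"location": "Hyd", "mode": "Online", "actual_fee": 2000, "discount_fee": 300},
    {"location": "Pune", "mode": "Class", "actual_fee": 1500, "discount_fee": 0},
]
summary = group_by_location_mode(students)
```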
Merge / Join concept in Power Query [Joins concept in SQL]
There are six merge options available
a) Inner join: Get matched rows
b) Left Outer Join: Matched rows + Left hand side unmatched rows
c) Right Outer Join: Matched rows+ Right Hand side unmatched rows
d) Full Outer Join: Matched rows+ Left hand side unmatched+ right h
unmatched rows
e) Left Anti: Unmatched data from left
f) Right Anti: Unmatched from right
Fuzzy search: Similarity search, similarity based on a threshold valu
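The six join kinds differ only in which of three row sets they keep: matched rows, left-only rows and right-only rows. The Python sketch below illustrates this on an exact key match (fuzzy matching is not covered); merge and the sample tables are hypothetical names, and unmatched rows simply lack the other table's columns instead of showing null.

```python
def merge(left, right, key, kind="inner"):
    """Join two lists of dicts on `key` using one of six join kinds."""
    right_by_key = {}
    for rrow in right:
        right_by_key.setdefault(rrow[key], []).append(rrow)
    left_keys = {lrow[key] for lrow in left}

    matched = [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right_by_key.get(lrow[key], [])
    ]
    left_only = [lrow for lrow in left if lrow[key] not in right_by_key]
    right_only = [rrow for rrow in right if rrow[key] not in left_keys]

    return {
        "inner": matched,
        "left_outer": matched + left_only,
        "right_outer": matched + right_only,
        "full_outer": matched + left_only + right_only,
        "left_anti": left_only,
        "right_anti": right_only,
    }[kind]

students = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}, {"id": 3, "name": "C"}]
fees = [{"id": 2, "fee": 200}, {"id": 3, "fee": 300}, {"id": 4, "fee": 400}]
inner_rows = merge(students, fees, "id", "inner")
```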

When do we go for Joins / Merge concept in Power Query?


To bring columns from another query [multiple queries columns comb

Data Types:
Type of data assigned to the column
Advantages:
a) Position at reporting changes [ Numeric and Dates : Right -->Left, Text: Left-->Right]
b) Type of transforms or operations changes[Numeric: Numeric operations, Text: Text Ope
Operations etc.]
Note: Database level data type plays vital role because it controls storage and also many d

Power BI Data types in brief:


a) Whole number: represented with 123 [Non decimal values]. 19 digit number
b) Decimal number: represented with 1.2 :15 digit number
c) Fixed Decimal number: represented with 1.2: 19 digit number
d) Text : represented with ABC. 256 Mega Characters
e) Date: represented with calendar symbol : Date in 10 chars [yyyy-mm-dd]
f) Time: represented with clock image : Time in 8 chars[HH:MI:SS]
g) Boolean : True and False
h) Timestamp : Date and Time [YYYY-MM-DD HH:MI:SS.NNNNNN]
Ex: 2022-01-01 11:23:45.234234 [Timestamp(6)]
Note: Based on milliseconds the timestamp is recognized.
Two NN -> Timestamp(2) Four NNNN -> Timestamp(4)
NNNNNN -> Timestamp(6) -> Recommended in real-time
i) Locale : Country and Code Page
Etc.

Text Transforms: Upper, Lower, Trim,Clean,Parse JSON, Parse XML, P


Numeric Transforms: Math, Stat etc.
Date Transforms: Year, Month, Quarter, Day etc. [values, names, par
Multi Column Transforms:
a) If all are numeric: Merge[concatenate], Sum, Product [multiplicatio
b) if at least one textual column available : Only Merge
How many ways we create duplicate columns in Power query?
Three ways
a) Column->right click->duplicate column b) Add Custom Column c) A
from examples
Parse JSON: You have rows information using JSON in column cells, y
to read and split the row into multiple columns.
Parse XML: You have rows information using XML in column cells, you
to read and split the row into multiple columns.

Split options-multiple ways--all completed

Individual Column Properties: Generic properties common to all type


Drilldown: Converting a query into list. Means you want to have singl
values by replacing existing table with column, this concept is helpful
Add as New Query: Creating a list from the query. Means you want to
column of values without replacing existing table with column, this c
helpful

Menu options: Completed [Support, About, Power BI Developer]


Tools Menu: 2021 added to provide diagnostics to the queries or que
Diagnostics: Statistics about the internal processing of operations
How many places?
Session statistics [to see the process for set of operations across que
Step diagnostics (individual step execution process)
How is it helpful?
A) To monitor internal process
B) In case of performance issues, to identify them for appropriate ac
(bottleneck resolution)
We collect statistics?
A) Aggregated: High Level Summarized (less statements, less memor
stats, less time)
B) Detailed: Very low level (more statements, more memory to store,
C) Performance Counters [ Memory, CPU, Tree internal process etc.]
D) Partitioning information
Note: In real-time unnecessarily we will not turn on (performance iss
you feel something is not
correct or taking more time / memory / or any…start with aggregate
not clear, then go for detailed statistics

View Menu:
Query dependencies: Highly helpful for technical people to understan
dependencies in the Power Query Area.
A) From which we got the queries
B) What all are the functions or parameters used
C) Joined queries etc.

Unique: Only one time value available


Distinct: One or more time available
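The difference between the two counts is easy to see in a short Python sketch. The function names and the sample modes column are illustrative only.

```python
from collections import Counter

def distinct_values(values):
    """Every different value, listed once, however often it occurs."""
    return list(dict.fromkeys(values))

def unique_values(values):
    """Only the values that occur exactly one time."""
    counts = Counter(values)
    return [v for v in counts if counts[v] == 1]

modes = ["Online", "Class", "Online", "Hybrid"]
distinct = distinct_values(modes)
unique = unique_values(modes)
```

Here "Online" is distinct but not unique, because it appears twice.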
Add Column: Perform operations and adds new column
Transform Menu: Perform transformations and may or may not add c
DataSource Settings
Simply source settings, we can see, modify connections and permissi
to have good knowledge.
File paths:
Two types
a) Absolute Path: Starts with C / D / E etc. drives
b) Network Path[shared link]: Starts with servername/domainname/ip
Note: Network paths only recommended to access machi
and files in real-time.
You want to connect to remote machine data..network path...
Practical way to create: Go to folder-->right click properties-->Sharin
share-->mention who you want to share [a specific user / everyone]
Real-time :
a) Power BI Desktop level Datasource settings will help to see and ch
b) Power BI Service Level, Dataset-->Settings will help see and chang
Parameters.

What is parameter?
User specified / chosen value.
Why we require?
Better user interaction
How many types user specify values?
This can also be called as types of parameters, three types available
a) User Enter Value: ANY PARAMETER
b) User select a value from drop down list of fixed values: LIST PARA
c) User select a value from drop down list of variable values: QUERY

How do we operate?
Two step process
a) Create parameter with type, data type and value
b) Filter where you want user specified data
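The two-step process can be sketched in Python as below. The Parameter class, filter_rows and the sample students table are hypothetical; an empty allowed list stands for an ANY parameter and a fixed tuple for a LIST parameter. A QUERY parameter would take its allowed values from another query's column instead.

```python
from dataclasses import dataclass

@dataclass
class Parameter:
    """A named, user-specified value; `allowed` is empty for an ANY parameter."""
    name: str
    value: object
    allowed: tuple = ()

    def set(self, value):
        # A LIST parameter only accepts values from its fixed list.
        if self.allowed and value not in self.allowed:
            raise ValueError(f"{value!r} is not allowed for {self.name}")
        self.value = value

def filter_rows(rows, column, parameter):
    """Step b: keep only the rows that match the parameter's current value."""
    return [row for row in rows if row[column] == parameter.value]

students = [
    {"studentid": 1, "location": "Hyd"},
    {"studentid": 2, "location": "Pune"},
    {"studentid": 3, "location": "Hyd"},
]

# Step a: create the parameter with its type and current value.
location = Parameter("Location", "Hyd", allowed=("Hyd", "Pune"))
hyd_students = filter_rows(students, "location", location)

# Changing the parameter value changes what the filter returns.
location.set("Pune")
pune_students = filter_rows(students, "location", location)
```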

Based on studentid entered, location (fixed values)and course(chang


Do we use parameters for user interaction in real-time?
No, we mostly use parameters for Data Source Settings and Permissi
Note: a) We have slicers, filters, what-if parameters, edit interactions
user interaction
b) In MSBI-SSRS, parameters play vital role.
Step properties: Completed
Native Query: Enabled when we work with databases

Database concepts will begin [next two work sheets]


eration and a mashup terminology
e final result
perator in SQL]

ffix, First, Last, Split etc.


tion, table or page should display
How many places data stored in real-
Two places
a) File System: In the form of files and folders
Adv: Quick storage
Disadv: Not easy to manage [adding, modifying, removing etc.]
Ex: You have a flat file with 1 lakh records. Adding, modifying cells, and removing rows is very tough.
b) Database: Scientists designed it to overcome the above file issues
Database: Base for the data storage; data is stored in the form of structured objects called tables.
Table: Contains a set of columns [at least one column required]

Database Management System[DBMS
Managing tables and other objects is called DBMS. Multiple types are available based on operations.
A) RDBMS [Relational Database Management System]
B) ORDBMS
C) Simple DBMS
D) Hierarchical DBMS etc.
Note: Relational DBMS is recommended for DWH and Analytics projects since the relationships lead to good modeling and accurate data

RDBMS Vendors?
a) Oracle b) SQL Server c) Teradata etc.

SQL Server?
a) SQL Server originated at Sybase Corporation [1987]. Initially it was on Unix and OS/2.
b) Microsoft took over development in the mid-1990s, after its partnership with Sybase ended…later on it is on Windows OS.
c) Year of release and database releases are different
SQL Server releases: SQL Server 1987…2016,2017,2019,2022 etc
SQL Server Databases: 1.0.....17.x

How do we work with SQL Server?
a) Instance required [Server installation required]
b) Tool required to work with [SSMS, Azure Data Studio, WinSQL, SQL Developer, TOAD etc.]
c) One language [T-SQL: Transact-SQL] is required.
